Title:
A cooccurrence-based thesaurus and two applications to information retrieval
Authors:
Source:
Intelligent multimedia information retrieval systems and management (New York NY, October 11-12, 1994). :266-274
Publisher Information:
Paris: CID, 1994.
Publication Year:
1994
Physical Description:
print, 20 ref
Original Material:
INIST-CNRS
Subject Terms:
Documentation, Exact sciences and technology, Sciences and techniques of general use, Information science. Documentation, Information processing and retrieval, Logical and linguistic tools, Terminology. Lexicons. Thesaurus, Information and communication sciences, Documentation data processing, Lexicography, Lexical analysis, Application, Collection, Construction, Test, Information extraction, Automatic indexing, Vector method, Document retrieval, Search system, Text, Thesaurus, Document processing, Cooccurrence analysis, Test collection, Cooccurrence information, Topic oriented, TIPSTER collection, Context vector
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Xerox Palo Alto Research Center, Palo Alto CA 94304, United States
Rights:
Copyright 1995 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Sciences of information and communication. Documentation
FRANCIS
Accession Number:
edscal.3553780
Database:
PASCAL Archive
Further Information
This paper presents a new method for computing a thesaurus from a text corpus. Each word is represented as a vector in a multi-dimensional space that captures cooccurrence information. Words are defined to be similar if they have similar cooccurrence patterns. Two different methods for using these thesaurus vectors in information retrieval are shown to significantly improve performance on the ARPA Tipster evaluation corpus compared to a tf.idf baseline.
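The general idea in the abstract, words as cooccurrence vectors compared by the similarity of their patterns, can be illustrated with a minimal sketch. This is not the paper's implementation: the record gives no details on window size, weighting, or dimensionality, so the window of two tokens, the raw counts, the cosine measure, the toy corpus, and the function names `cooccurrence_vectors`, `cosine`, and `nearest` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector counting its neighbours.

    The window size is an illustrative assumption, not taken from the paper.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(word, vectors, k=3):
    """Return the k words whose cooccurrence patterns are closest to `word`."""
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(other, score) for score, other in scored[:k]]


# Toy corpus (hypothetical data, for illustration only).
corpus = [
    "the car drove down the road".split(),
    "the truck drove down the road".split(),
    "the cat sat on the mat".split(),
]
vectors = cooccurrence_vectors(corpus)
car_truck = cosine(vectors["car"], vectors["truck"])
car_cat = cosine(vectors["car"], vectors["cat"])
```

In this toy corpus "car" and "truck" appear in identical contexts, so their vectors score higher than "car" and "cat", which share only the neighbour "the". A thesaurus entry for a word would then be the output of `nearest` for that word.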