Result: Latent semantic indexing (LSI) : TREC-3 report

Title:

Latent semantic indexing (LSI) : TREC-3 report

Authors:

DUMAIS, S. T

Source:

TREC-3: text retrieval conference. NIST special publication. (500225):219-230

Publisher Information:

Gaithersburg, MD: National Institute of Standards and Technology, 1995.

Publication Year:

1995

Physical Description:

print, 22 ref

Original Material:

INIST-CNRS

Subject Terms:

Science technology, industry, Exact sciences and technology, Sciences and techniques of general use, Information science. Documentation, Information retrieval systems. Information and document management system, Information and communication sciences, Document retrieval system. Document and information management system, Documentation data processing, Case study, Performance evaluation, Automatic indexing, Search result, Document retrieval, Search system, Research method, TREC-3

Document Type:

Conference Paper

File Description:

text

Language:

English

Author Affiliations:

Bellcore, Morristown NJ 07960, United States

ISSN:

1048-776X

Access URL:

http://pascal-francis.inist.fr/vibad/index.php?action=search&terms=2484609

Rights:

Copyright 1997 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.

Notes:

Sciences of information and communication. Documentation

FRANCIS

Accession Number:

edscal.2484609

Database:

PASCAL Archive

Further Information

This paper reports on recent developments of the Latent Semantic Indexing (LSI) retrieval method for TREC-3. LSI uses a reduced-dimension vector space to represent words and documents. An important aspect of this representation is that the association between terms is automatically captured, explicitly represented, and used to improve retrieval. We used LSI for both TREC-3 routing and adhoc tasks. For the routing tasks an LSI space was constructed using the training documents. We compared profiles constructed using just the topic words (no training) with profiles constructed using the average of relevant documents (no use of the topic words). Not surprisingly, the centroid of the relevant documents was 30% better than the topic words. This simple feedback method was quite good compared to the routing performance of other systems. Various combinations of information from the topic words and relevant documents provide small additional improvements in performance. For the adhoc task we compared LSI to keyword vector matching (i.e. using no dimension reduction). Small advantages were obtained for LSI even with the long TREC topic statements.
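The ideas in the abstract can be illustrated with a small sketch. It is not the system described in the paper: the toy term-by-document matrix, the choice of k = 2, the use of NumPy's SVD, and all function names (`lsi_basis`, `project`, `cosine`) are assumptions made for illustration. It shows a reduced-dimension space built from a term-by-document matrix, a comparison of keyword vector matching with matching in the reduced space, and a routing-style profile formed as the centroid of documents assumed to be relevant.

```python
import numpy as np

# Toy term-by-document count matrix (rows: terms, columns: documents).
# Terms, documents, and counts are invented for illustration only.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 1, 0],  # car
    [0, 1, 0, 1, 0],  # automobile
    [1, 1, 0, 0, 0],  # engine
    [0, 0, 1, 0, 1],  # flower
    [0, 0, 1, 0, 0],  # petal
], dtype=float)


def lsi_basis(term_doc, k):
    """Return the k leading left singular vectors (terms x k)."""
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k]


def project(term_vectors, U_k):
    """Map term-space vectors (one per column) into the k-dimensional space."""
    return U_k.T @ term_vectors


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


U_k = lsi_basis(A, k=2)      # assumed dimensionality for this toy example
docs_k = project(A, U_k)     # shape (k, n_docs)

# Query containing only "automobile"; document 0 contains "car" and "engine".
query = np.zeros(len(terms))
query[terms.index("automobile")] = 1.0
keyword_sim = cosine(query, A[:, 0])                 # full term space
lsi_sim = cosine(project(query, U_k), docs_k[:, 0])  # reduced space

# Routing-style profile: centroid of documents assumed relevant (0 and 1).
profile = docs_k[:, [0, 1]].mean(axis=1)
routing_scores = [cosine(profile, docs_k[:, j]) for j in range(A.shape[1])]
```

In this toy case the keyword cosine between the query and document 0 is zero because they share no term, while the reduced-space cosine is close to 1, since "automobile" co-occurs with "car" and "engine" elsewhere in the collection. The centroid profile likewise scores the remaining vehicle document far above the two flower documents.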
