Copyright 1995 INIST-CNRS. CC BY 4.0. Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Sciences of information and communication. Documentation
FRANCIS
Accession Number:
edscal.3601537
Database:
PASCAL Archive
Further Information
In this paper we describe an information retrieval system in which advanced natural language processing techniques are used to enhance the effectiveness of term-based document retrieval. The backbone of our system is a traditional statistical engine that builds inverted index files from pre-processed documents, and then searches and ranks the documents in response to user queries. Natural language processing is used to (a) preprocess the documents in order to extract content-carrying terms, (b) discover inter-term dependencies and build a conceptual hierarchy specific to the database domain, and (c) process the user's natural language requests into effective search queries. Over the course of the Text REtrieval Conferences TREC-1 and TREC-2, our system has evolved from a scaled-up prototype, originally tested on such collections as CACM-3204 and Cranfield, to its present form, which can be effectively used to process hundreds of millions of words of unrestricted text.
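As an illustration of the statistical backbone described in the abstract, the following minimal sketch builds an inverted index from pre-processed documents and ranks them against a query. It is not the authors' engine: the tf-idf weighting, the regex tokenizer standing in for the natural language preprocessing step, and the names `tokenize`, `build_index`, and `search` are all assumptions made for this example, and the three sample documents are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic runs (a stand-in for NLP term extraction)."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by summed tf-idf over the query terms (assumed weighting)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term absent from the collection
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "natural language processing for document retrieval",
    "d2": "statistical ranking of documents",
    "d3": "language models and parsing",
}
index = build_index(docs)
results = search(index, len(docs), "language retrieval")
```

Here `results` lists only the documents that share at least one term with the query, ordered by score. In a system like the one described, the tokenizer would be replaced by linguistic preprocessing that extracts content-carrying terms, and the query would first be derived from the user's natural language request.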