Result: FASTR: a unification-based front-end to automatic indexing

Title:
FASTR: a unification-based front-end to automatic indexing
Authors:
Source:
Intelligent multimedia information retrieval systems and management (New York NY, October 11-12, 1994). :34-47
Publisher Information:
Paris: CID, 1994.
Publication Year:
1994
Physical Description:
print, 26 ref
Original Material:
INIST-CNRS
Subject Terms:
Documentation, Sciences exactes et technologie, Exact sciences and technology, Sciences et techniques communes, Sciences and techniques of general use, Sciences de l'information. Documentation, Information science. Documentation, Traitement et recherche de l'information, Information processing and retrieval, Structure et analyse des documents et de l'information, Information and document structure and analysis, Analyse des contenus, Content analysis, Indexation. Classification. Résumé. Synthèses, Indexing. Classification. Abstracting. Syntheses, Sciences de l'information et de la communication, Information and communication sciences, Traitement et recherche d'information, Informatique documentaire, Documentation data processing, Información documental, Linguistique mathématique, Computational linguistics, Linguística matemática, Recherche documentaire, Document retrieval, Recuperación documental, Terminologie, Terminology, Terminología, Analyse morphologique, Morphological analysis, Análisis morfológico, Analyse syntaxique, Syntactic analysis, Análisis sintáxico, Anglais, English, Inglés, Dictionnaire automatique, Automatic dictionary, Diccionario automático, Efficacité, Efficiency, Eficacia, Essai, Test, Ensayo, Extraction, Extracción, Grammaire formelle, Formal grammar, Gramática formal, Grande dimension, Large dimension, Gran dimensión, Indexation automatique, Automatic indexing, Indización automática, Littérature scientifique, Scientific literature, Literatura científica, Modèle linguistique, Linguistic model, Modelo linguístico, Mot clé, Keyword, Palabra clave, Métallurgie, Metallurgy, Metalurgia, Résumé, Abstract, Resumen, Temps traitement, Processing time, Tiempo proceso, Texte intégral, Full text, Texto completo, Vocabulaire contrôlé, Controlled vocabulary, Vocabulario controlado, FAST (Fast term Recognizer), Grammaire unification, Unification grammar, CD93-3:DLJ:cf FR.:cf Trad. 
aut, METAL corpus, Métarègle, Metarule, Variation linguistique, Linguistic variation
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
IUT/Inst. rech. informatique, 44041 Nantes, France
Rights:
Copyright 1995 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Sciences of information and communication. Documentation

FRANCIS
Accession Number:
edscal.3553687
Database:
PASCAL Archive

Further Information

Most natural language processing approaches to full-text information retrieval index documents by the occurrences of the controlled terms they contain. An important problem with this approach is that terms accept numerous variations, which can cause many relevant documents not to be retrieved. In this paper, we present a linguistic analysis of the observed variations and a three-tier constraint-based formalism for representing them. This technique has been implemented as FASTR, a natural language processing tool that extracts terms and their variants from full-text documents. We justify the choice of a unification-based formalism by its expressivity and by the addition of conceptual and computational devices which make the parser computationally tractable. Contrary to the generally accepted idea, high-quality natural language processing through unification and industrial requirements can fit together, provided that the application is carefully designed to control and minimize data accesses and computation times. The effectiveness of FASTR for extracting correct occurrences is supported by experiments on two English corpora of scientific abstracts and a list of 71,623 controlled terms.
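
The retrieval problem raised in the abstract can be illustrated with a small Python sketch. It is not FASTR's unification-based formalism, which this record does not describe in detail; it is a much simpler stand-in that contrasts verbatim term matching with a matcher that tolerates a few inserted words. The term, the example sentence, the function names, and the `max_inserted` threshold are all hypothetical and chosen only for illustration.

```python
import re


def exact_match(term, text):
    """Return True if the term occurs verbatim in the text (case-insensitive)."""
    return term.lower() in text.lower()


def insertion_variant_match(term, text, max_inserted=2):
    """Return True if the words of the term occur in order in the text,
    with at most max_inserted extra words between consecutive term words.

    This is a toy approximation of variant recognition (an assumption made
    for this sketch), not the method described in the paper.
    """
    words = [re.escape(w) for w in term.lower().split()]
    # Between two term words: up to max_inserted intervening words, then a separator.
    gap = r"(?:\W+\w+){0,%d}?\W+" % max_inserted
    pattern = r"\b" + gap.join(words) + r"\b"
    return re.search(pattern, text.lower()) is not None


# Hypothetical controlled term and sentence containing a coordination variant.
term = "heat treatment"
text = "The alloy was hardened by heat and pressure treatment."

print(exact_match(term, text))              # verbatim lookup misses the variant
print(insertion_variant_match(term, text))  # tolerant lookup finds it
```

A purely string-based matcher like this one over-generates as soon as the window grows, which is one reason to constrain variants linguistically rather than by distance alone.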