Result: FASTR: a unification-based front-end to automatic indexing
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
FRANCIS
Further information
Most natural language processing approaches to full-text information retrieval index documents by the occurrences of the controlled terms they contain. An important problem with this approach is that terms accept numerous variations, which can cause many relevant documents not to be retrieved. In this paper, we present a linguistic analysis of the observed variations and a three-tier constraint-based formalism for representing them. This technique has been implemented as FASTR, a natural language processing tool that extracts terms and their variants from full-text documents. We justify the choice of a unification-based formalism by its expressivity and by the addition of conceptual and computational devices which make the parser computationally tractable. Contrary to the generally accepted idea, high-quality natural language processing through unification and industrial requirements can fit together, provided that the application is carefully designed to control and minimize data accesses and computation times. The effectiveness of FASTR for extracting correct occurrences is supported by experiments on two English corpora of scientific abstracts and a list of 71,623 controlled terms.