Result: Hierarchical logical structure extraction of book documents by analyzing tables of contents
Title:
Hierarchical logical structure extraction of book documents by analyzing tables of contents
Authors:
Source:
Document recognition and retrieval XI (San Jose CA, 21-22 January 2004). SPIE proceedings series. 5296:6-13
Publisher Information:
Bellingham WA: SPIE, 2004.
Publication Year:
2004
Physical Description:
print, 5 ref
Original Material:
INIST-CNRS
Subject Terms:
Documentation, Electronics, Optics, Physics, Telecommunications, Exact sciences and technology, Sciences and techniques of general use, Information science. Documentation, Information processing and retrieval, Information and document structure and analysis, Content analysis, Indexing. Classification. Abstracting. Syntheses, Information and communication sciences, Automation, Extraction, Book, Table of contents, Information processing, Digitized document, Logical structure
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Dept. of Electronic Engineering, Tsinghua University, State Key Laboratory of Intelligent Technology and Systems, 100084, Beijing, China
Rights:
Copyright 2004 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Sciences of information and communication. Documentation
FRANCIS
Accession Number:
edscal.16075666
Database:
PASCAL Archive
Further Information
Extracting the logical structure of book documents is important for the automatic construction of electronic document databases. A book's tables of contents play an important role in representing its overall logical structure and reference information. This paper proposes a new method that extracts the hierarchical logical structure of book documents, together with the reference information, by combining the spatial and semantic information of the tables of contents. Experimental results on various book documents demonstrate the effectiveness and robustness of the proposed approach.
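
The record does not describe the paper's algorithm, so the following is only an illustrative sketch of the general idea of combining a spatial cue with a semantic cue when reading a table of contents. It is not the authors' method. The names `TocEntry`, `parse_line`, `build_tree` and the `indent_width` parameter are hypothetical, and the input is assumed to be already-recognized plain-text lines in which section numbers such as "1.2" (semantic cue) or leading indentation (spatial cue) signal depth, and a trailing number is the page reference.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TocEntry:
    title: str
    page: Optional[int]   # reference information (page number), if present
    level: int            # depth in the hierarchy; 1 = top level
    children: List["TocEntry"] = field(default_factory=list)


# Trailing page number, separated from the title by spaces or dot leaders.
PAGE_RE = re.compile(r"^(.*?)[\s.]+(\d+)$")
# Leading section number such as "2" or "1.3.2" (assumed numbering style).
NUM_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(.*)$")


def parse_line(line: str, indent_width: int = 2) -> TocEntry:
    """Turn one table-of-contents line into an entry with a level and page."""
    indent = len(line) - len(line.lstrip(" "))
    body = line.strip()

    page = None
    m = PAGE_RE.match(body)
    if m:
        body, page = m.group(1), int(m.group(2))

    m = NUM_RE.match(body)
    if m:
        # Semantic cue: "1.2" has two components, so it sits at level 2.
        level = m.group(1).count(".") + 1
        title = m.group(2)
    else:
        # Spatial cue: fall back to indentation when there is no numbering.
        level = indent // indent_width + 1
        title = body
    return TocEntry(title=title, page=page, level=level)


def build_tree(lines: List[str], indent_width: int = 2) -> TocEntry:
    """Nest entries under the closest preceding entry with a smaller level."""
    root = TocEntry(title="ROOT", page=None, level=0)
    stack = [root]
    for line in lines:
        if not line.strip():
            continue
        entry = parse_line(line, indent_width)
        while stack[-1].level >= entry.level:
            stack.pop()
        stack[-1].children.append(entry)
        stack.append(entry)
    return root
```

Calling `build_tree` on a list of lines returns a synthetic root whose `children` are the top-level entries. Mixing numbering-derived and indentation-derived levels in one tree is a simplification made for this sketch; real scanned tables of contents would need layout analysis and OCR error handling that are out of scope here.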