Center for Advanced Computer Studies, University of Louisiana, Lafayette, LA 70504-4330, United States; Division of Information Science, Sookmyung Women's University, Seoul, 710, Korea, Republic of
Copyright 2003 INIST-CNRS. CC BY 4.0. Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Sciences of information and communication. Documentation
FRANCIS
Accession Number:
edscal.14573776
Database:
PASCAL Archive
Further Information
XML (eXtensible Markup Language) is a new standard for exchanging and representing information on the Internet. Documents can be represented hierarchically as XML elements and made available for sophisticated content-based retrieval. For fast retrieval, XML documents may be indexed. Typical indexing techniques, however, are not satisfactory for multi-dimensional and irregularly hierarchical XML documents. In this paper, we propose a scalable bitmap indexing technique that can index not only document-path-content (or -word) information but also additional information such as the occurrence and reference/de-reference information of words and paths, or multimedia features in digital libraries. Queries over XML document collections can be expressed as combinations of primitive operations such as slice, project, and dice. Bit-wise operations are performed efficiently in bitmap indexes. We also define a notion of distance in bitmap indexes suitable for sophisticated or approximate (proximity-based) retrieval. Experiments show that the bitmap-based index for multiple features of XML documents can be constructed efficiently, and that distance operations can be performed more efficiently with the BitCube than with other alternatives.
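
The idea of a bitmap index over document, path, and word dimensions can be illustrated with a small sketch. The code below is not the authors' implementation: the class name `BitCubeSketch`, its method names, the interpretation of slice, project, and dice, and the use of Hamming distance as the distance measure are all assumptions made for illustration. Each document is stored as one Python integer used as a bit vector, with one bit per (path, word) pair.

```python
class BitCubeSketch:
    """Toy 3-D bitmap index over (document, path, word). Hypothetical, for illustration."""

    def __init__(self, paths, words):
        self.paths = list(paths)
        self.words = list(words)
        self.rows = {}  # document id -> int used as a bit vector

    def _bit(self, path, word):
        # Bit position for one (path, word) cell of the cube.
        pos = self.paths.index(path) * len(self.words) + self.words.index(word)
        return 1 << pos

    def _mask(self, paths, words):
        # Bit mask covering the sub-cube paths x words.
        mask = 0
        for p in paths:
            for w in words:
                mask |= self._bit(p, w)
        return mask

    def add(self, doc, path, word):
        # Record that `word` occurs under `path` in `doc`.
        self.rows[doc] = self.rows.get(doc, 0) | self._bit(path, word)

    def slice_path(self, path):
        # Assumed "slice": fix the path dimension, return documents using it.
        mask = self._mask([path], self.words)
        return sorted(d for d, row in self.rows.items() if row & mask)

    def dice(self, paths, words):
        # Assumed "dice": restrict to a sub-cube of paths and words.
        mask = self._mask(paths, words)
        return sorted(d for d, row in self.rows.items() if row & mask)

    def project_paths(self, doc):
        # Assumed "project": collapse the word dimension for one document.
        row = self.rows.get(doc, 0)
        return [p for p in self.paths if row & self._mask([p], self.words)]

    def distance(self, a, b):
        # Hamming distance (assumed measure): number of differing bits.
        return bin(self.rows.get(a, 0) ^ self.rows.get(b, 0)).count("1")


cube = BitCubeSketch(paths=["book/title", "book/author"],
                     words=["xml", "index", "yoon"])
cube.add("d1", "book/title", "xml")
cube.add("d1", "book/title", "index")
cube.add("d2", "book/title", "xml")
cube.add("d2", "book/author", "yoon")

print(cube.slice_path("book/author"))        # documents with any word under this path
print(cube.dice(["book/title"], ["index"]))  # documents matching the sub-cube
print(cube.project_paths("d2"))              # paths present in d2
print(cube.distance("d1", "d2"))             # bit-level dissimilarity
```

Every query in this sketch reduces to AND, OR, and XOR on integers, which is the property that makes bitmap indexes attractive for such workloads. A real index would need compression and a sparse representation to scale.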