Result: Using Coollists to index HTML documents in the Web
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
FRANCIS
Further information
This paper suggests a partial solution (limited to HTML documents) to the Web-indexing problem using Coollists. Roughly, a Coollist is equivalent to a Hotlist in Mosaic, except that it automatically records the titles of all visited HTML documents by default. Thus, in theory, maintaining a merged list of everybody's Coollists should eventually produce a complete index of all the HTML files in the Web. In practice, even if transferring everybody's Coollists to a single site were feasible, the growth and rate of change of the Web lead us to question whether the archie metaphor, in which every index server maintains all the know-wheres, could be applied to the rest of the Web. The new metaphor we suggest is a library metaphor. Let each organization maintain the merged Coollists of its individuals. If some organization has surplus computing resources, let it maintain a merged list of other merged lists. This way, individuals are likely to find documents of interest within their own organization. Organizations, however, have characteristics, just as libraries have specialities; individuals will therefore find other interesting documents at neighboring sites. Bigger libraries carry more books. Likewise, there will be sites that merge many merged lists together, which will be useful for blind keyword searching of the titles. For our current implementation of a Coollist, we take advantage of the CERN proxy-cache server to collect the indices of all the visited HTML files. People at three of the 19 plants within the company tried the merged list of Coollists and found it almost indispensable. People who used to save almost every URL they visited, and those who wanted a comprehensive list of URLs, found it particularly useful. In the paper, we describe the result of our experiment in detail and also point out how our approach might solve the scalability problem of other Web indexing solutions.
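The library metaphor described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Entry` record, the `merge_coollists` and `search_titles` functions, the rule that the first title seen for a URL wins, and the example URLs are all assumptions made for illustration. The abstract only states that Coollists record visited HTML document titles and that merged lists can themselves be merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical Coollist record: a visited HTML document's title and URL."""
    title: str
    url: str


def merge_coollists(*coollists):
    """Merge Coollists (or already merged lists), keeping one entry per URL."""
    merged = {}
    for coollist in coollists:
        for entry in coollist:
            # Assumption: the first title recorded for a URL is kept.
            merged.setdefault(entry.url, entry)
    # Sort by title so the merged list is stable and easy to browse.
    return sorted(merged.values(), key=lambda e: e.title.lower())


def search_titles(merged, keyword):
    """Blind keyword search: case-insensitive substring match on titles."""
    needle = keyword.lower()
    return [e for e in merged if needle in e.title.lower()]


# Two individuals in one organization (made-up example data).
alice = [
    Entry("CERN httpd Guide", "http://example.org/httpd"),
    Entry("Mosaic Home Page", "http://example.org/mosaic"),
]
bob = [
    Entry("Mosaic Home Page", "http://example.org/mosaic"),
    Entry("Archie Servers", "http://example.com/archie"),
]

# Each organization maintains the merged Coollists of its individuals.
org_a = merge_coollists(alice, bob)
org_b = merge_coollists([Entry("Library Catalog", "http://example.net/catalog")])

# A site with surplus resources maintains a merged list of merged lists.
regional = merge_coollists(org_a, org_b)

hits = search_titles(regional, "mosaic")
```

Because a merged list has the same shape as a single Coollist, the same function serves at every level of the hierarchy, which is the property the library metaphor relies on.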