Result: Wordrank-based lexical signatures for finding lost or related web pages
Title:
Wordrank-based lexical signatures for finding lost or related web pages
Authors:
Source:
Frontiers of WWW research and development (APWeb 2006)0APWeb 2006. :843-849
Publisher Information:
Berlin: Springer, 2006.
Publication Year:
2006
Physical Description:
print, 7 ref 1
Original Material:
INIST-CNRS
Subject Terms:
Computer science, Informatique, Sciences exactes et technologie, Exact sciences and technology, Sciences appliquees, Applied sciences, Informatique; automatique theorique; systemes, Computer science; control theory; systems, Logiciel, Software, Systèmes informatiques et systèmes répartis. Interface utilisateur, Computer systems and distributed systems. User interface, Organisation des mémoires. Traitement des données, Memory organisation. Data processing, Systèmes d'information. Bases de données, Information systems. Data bases, Intelligence artificielle, Artificial intelligence, Reconnaissance et synthèse de la parole et du son. Linguistique, Speech and sound recognition and synthesis. Linguistics, Internet, Lien hypertexte, Hyperlink, Enlace hipertexto, Linguistique, Linguistics, Linguística, Réseau web, World wide web, Red WWW, Sémantique, Semantics, Semántica
Document Type:
Conference
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Institute of Computer Science and Technology, Peking University, Beijing 100871, China
ISSN:
0302-9743
Rights:
Copyright 2007 INIST-CNRS
CC BY 4.0
Sauf mention contraire ci-dessus, le contenu de cette notice bibliographique peut être utilisé dans le cadre d’une licence CC BY 4.0 Inist-CNRS / Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS / A menos que se haya señalado antes, el contenido de este registro bibliográfico puede ser utilizado al amparo de una licencia CC BY 4.0 Inist-CNRS
CC BY 4.0
Sauf mention contraire ci-dessus, le contenu de cette notice bibliographique peut être utilisé dans le cadre d’une licence CC BY 4.0 Inist-CNRS / Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS / A menos que se haya señalado antes, el contenido de este registro bibliográfico puede ser utilizado al amparo de una licencia CC BY 4.0 Inist-CNRS
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.19151515
Database:
PASCAL Archive
Further Information
A lexical signature of a web page consists of several key words carefully chosen from the web page and is used to generate robust hyperlink to find the web page when its URL fails. In this paper, we propose a novel method based on WordRank to compute lexical signatures, which can take into account the semantic relatedness between words and choose the most representative and salient words as lexical signature. Experiments show that the DF-based lexical signatures are best at uniquely identifying web pages, and hybrid lexical signatures are good candidates for retrieving the desired web pages, while WordRank-based lexical signatures are best for retrieving highly relevant web pages when the desired web page cannot be extracted.