Result: A novel contextual topic model for multi-document summarization

Title:
A novel contextual topic model for multi-document summarization
Source:
Expert systems with applications. 42(3):1340-1352
Publisher Information:
Amsterdam: Elsevier, 2015.
Publication Year:
2015
Physical Description:
print, 3/4 p
Original Material:
INIST-CNRS
Subject Terms:
Computer science, Informatique, Sciences exactes et technologie, Exact sciences and technology, Sciences et techniques communes, Sciences and techniques of general use, Mathematiques, Mathematics, Probabilités et statistiques, Probability and statistics, Statistiques, Statistics, Inférence linéaire, régression, Linear inference, regression, Sciences appliquees, Applied sciences, Informatique; automatique theorique; systemes, Computer science; control theory; systems, Logiciel, Software, Organisation des mémoires. Traitement des données, Memory organisation. Data processing, Traitement des données. Listes et chaînes de caractères, Data processing. List processing. Character string processing, Intelligence artificielle, Artificial intelligence, Reconnaissance et synthèse de la parole et du son. Linguistique, Speech and sound recognition and synthesis. Linguistics, Analyse quantitative, Quantitative analysis, Análisis cuantitativo, Estimation Bayes, Bayes estimation, Estimación Bayes, Extraction connaissances, Knowledge extraction, Extracción conocimiento, Information numérique, Digital information, Información numérica, Information utile, Useful information, Información útil, Langage naturel, Natural language, Lenguaje natural, Linguistique, Linguistics, Linguística, Modélisation, Modeling, Modelización, Phrase, Sentence, Frase, Résultat expérimental, Experimental result, Resultado experimental, Résumé, Abstract, Resumen, Surcharge, Overload, Sobrecarga, Système hiérarchisé, Hierarchical system, Sistema jerarquizado, Sûreté fonctionnement, Dependability, Seguridad funcionamiento, Traitement langage, Language processing, Tratamiento lenguaje, Modèle n gramme, N gram model, Modelo n grama, Contextual topic, Hierarchical topic model, Multi-document summarization
Document Type:
Journal Article
File Description:
text
Language:
English
Author Affiliations:
School of Computing, University of Eastern Finland, P.O. Box 111, 80101 Joensuu, Finland
School of Computing and Information Systems, Athabasca University, 1 University Drive, Athabasca, Alberta T9S 3A3, Canada
Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan, Province of China
ISSN:
0957-4174
Rights:
Copyright 2015 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Computer science; theoretical automation; systems

Mathematics
Accession Number:
edscal.28928458
Database:
PASCAL Archive

Further Information

Information overload has become a serious problem in the digital age. It hinders the understanding of useful information. Alleviating this problem is a main concern of research in natural language processing, especially multi-document summarization. To help judge the importance of similar sentences in multi-document summarization, this study proposes a novel approach based on recent hierarchical Bayesian topic models. The proposed model incorporates n-grams into hierarchical latent topics to capture the word dependencies that appear in the local context of a word. Quantitative and qualitative evaluation results show that the model outperforms both hLDA and LDA in document modeling. In addition, experimental results demonstrate that a summarization system implementing this model significantly improves performance and is comparable to state-of-the-art summarization systems.
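
The idea of capturing word dependencies in the local context of a word can be illustrated with plain bigram counts. The following is a minimal sketch only, not the model described in the abstract: it has no topics and no hierarchy, and the function names, the toy sentences, and the add-alpha smoothing are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def bigram_context_counts(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Pair every token with the token immediately before it.
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts


def conditional_prob(counts, prev, word, vocab_size, alpha=1.0):
    """Estimate P(word | prev) with add-alpha smoothing (illustrative choice)."""
    total = sum(counts[prev].values())
    return (counts[prev][word] + alpha) / (total + alpha * vocab_size)


# Toy corpus, invented for this example.
sentences = [
    "topic models summarize documents",
    "topic models capture word dependencies",
    "summarization systems rank sentences",
]
counts = bigram_context_counts(sentences)
vocab = {w for s in sentences for w in s.lower().split()}

# A word seen after "topic" gets more mass than one never seen there.
p_models = conditional_prob(counts, "topic", "models", len(vocab))
p_rank = conditional_prob(counts, "topic", "rank", len(vocab))
print(p_models > p_rank)
```

A unigram (bag-of-words) model would assign "models" the same probability regardless of the preceding word; conditioning on the previous token is the simplest form of the local context that an n-gram extension of a topic model draws on.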