Title:
How to improve robustness in Kohonen maps and display additional information in Factorial Analysis: Application to text mining : Advances in Self-Organizing Maps
Source:
Neurocomputing (Amsterdam). 147:120-135
Publisher Information:
Amsterdam: Elsevier, 2015.
Publication Year:
2015
Physical Description:
print, 29 ref
Original Material:
INIST-CNRS
Subject Terms:
Cognition, Computer science, Informatique, Sciences exactes et technologie, Exact sciences and technology, Sciences appliquees, Applied sciences, Informatique; automatique theorique; systemes, Computer science; control theory; systems, Logiciel, Software, Systèmes informatiques et systèmes répartis. Interface utilisateur, Computer systems and distributed systems. User interface, Organisation des mémoires. Traitement des données, Memory organisation. Data processing, Systèmes d'information. Bases de données, Information systems. Data bases, Intelligence artificielle, Artificial intelligence, Reconnaissance et synthèse de la parole et du son. Linguistique, Speech and sound recognition and synthesis. Linguistics, Connexionnisme. Réseaux neuronaux, Connectionism. Neural networks, Algorithme Kohonen, Kohonen algorithm, Algoritmo Kohonen, Algorithmique, Algorithmics, Algorítmica, Analyse conceptuelle, Conceptual analysis, Análisis conceptual, Analyse donnée, Data analysis, Análisis datos, Approche probabiliste, Probabilistic approach, Enfoque probabilista, Arithmétique, Arithmetics, Aritmética, Autoorganisation, Self organization, Autoorganización, Classification, Clasificación, Fouille donnée, Data mining, Busca dato, Histoire, History, Historia, Interface utilisateur, User interface, Interfase usuario, Non déterminisme, Non determinism, No determinismo, Présentation information, Information layout, Presentación información, Robustesse, Robustness, Robustez, Réseau neuronal, Neural network, Red neuronal, Texte, Text, Texto, Théorie graphe, Graph theory, Teoría grafo, Visualisation, Visualization, Visualización, Vocabulaire, Vocabulary, Vocabulario, Analyse texte, Text analysis, Análisis de textos, Factorial Analysis, Graphs, Kohonen maps, Middle ages scientific literature, Text mining
Document Type:
Academic journal Article
File Description:
text
Language:
English
Author Affiliations:
SAMM - Université Paris 1 Panthéon-Sorbonne 90, rue de Tolbiac, 75013 Paris, France
PIREH-LAMOP - Université Paris 1 Panthéon-Sorbonne 1, rue Victor Cousin, Paris, France
ISSN:
0925-2312
Rights:
Copyright 2015 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.28836737
Database:
PASCAL Archive

Further Information

This article is an extended version of a paper presented at the WSOM'2012 conference (Bourgeois et al., 2012 [1]). We present a combination of factorial projections, the SOM algorithm and graph techniques applied to a text mining problem. The corpus contains eight medieval manuscripts that were used to teach arithmetic techniques to merchants. Among data analysis techniques, those used in lexicometry (such as Factorial Analysis) highlight the discrepancies between manuscripts, because they focus on the deviation from independence between words and manuscripts. However, we also want to discover and characterize the vocabulary common to the whole corpus. Using the properties of stochastic Kohonen maps, which define the neighborhood between inputs in a non-deterministic way, we highlight the words that seem to play a special role in the vocabulary. We call these words fickle and use them to improve both the robustness of Kohonen maps and the significance of the FCA visualization. Finally, we use graph algorithms to exploit this fickleness for the classification of words.
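
The idea of fickle words can be illustrated with a small sketch. It is one plausible reading of the abstract, not the authors' procedure: a tiny one-dimensional stochastic SOM is trained several times with different seeds, the frequency with which each pair of inputs ends up in neighboring units is recorded, and an input is flagged as fickle when most of its pairs are neither almost always nor almost never neighbors. The function names, the map size, and the thresholds `low`, `high` and `min_share` are all assumptions made for illustration; the abstract does not state the decision rule the article uses.

```python
import numpy as np

def train_som(data, n_units=5, n_iter=500, seed=0):
    """Train a tiny 1-D stochastic SOM and return its codebook (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Initialise code vectors from randomly drawn inputs.
    codebook = data[rng.choice(len(data), size=n_units, replace=True)].astype(float)
    positions = np.arange(n_units)
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac) + 0.01                  # decaying learning rate
        radius = max(n_units / 2 * (1.0 - frac), 0.5)   # decaying neighborhood radius
        x = data[rng.integers(len(data))]               # random input: source of non-determinism
        bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
        h = np.exp(-((positions - bmu) ** 2) / (2 * radius ** 2))
        codebook += lr * h[:, None] * (x - codebook)
    return codebook

def best_units(data, codebook):
    """Index of the closest code vector for every input."""
    d = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def neighbor_frequency(data, n_runs=30, n_units=5, radius=1):
    """Fraction of runs in which each pair of inputs lands in neighboring units."""
    n = len(data)
    counts = np.zeros((n, n))
    for seed in range(n_runs):
        units = best_units(data, train_som(data, n_units=n_units, seed=seed))
        counts += np.abs(units[:, None] - units[None, :]) <= radius
    return counts / n_runs

def fickle_items(freq, low=0.25, high=0.75, min_share=0.5):
    """Indices of inputs whose pairs are mostly unstable (hypothetical thresholds)."""
    n = len(freq)
    unstable = (freq > low) & (freq < high)
    np.fill_diagonal(unstable, False)
    share = unstable.sum(axis=1) / (n - 1)
    return np.flatnonzero(share >= min_share)

# Usage on random data standing in for word profiles across eight manuscripts.
profiles = np.random.default_rng(42).random((20, 8))
freq = neighbor_frequency(profiles, n_runs=10)
print(fickle_items(freq))
```

Fixed cut-offs keep the sketch short; a statistical test of each pair's neighbor frequency against its chance level would be a more principled way to decide which pairs are unstable.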