Result: Graph-based contributions to machine-learning ; Contributions à base de graphes à l'apprentissage automatique

Title:
Graph-based contributions to machine-learning ; Contributions à base de graphes à l'apprentissage automatique
Authors:
Contributors:
Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom Paris (IMT)-Télécom Paris, Institut Mines-Télécom Paris (IMT)-Institut Polytechnique de Paris (IP Paris)-Institut Polytechnique de Paris (IP Paris), Institut Polytechnique de Paris, Thomas Bonald
Source:
https://theses.hal.science/tel-03634148 ; Data Structures and Algorithms [cs.DS]. Institut Polytechnique de Paris, 2022. English. ⟨NNT : 2022IPPAT010⟩.
Publisher Information:
CCSD
Publication Year:
2022
Document Type:
Dissertation doctoral or postdoctoral thesis
Language:
English
Relation:
NNT: 2022IPPAT010
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edsbas.FD0697D7
Database:
BASE

Further information

A graph is a mathematical object that makes it possible to represent relationships (called edges) between entities (called nodes). Graphs have long been a focal point in a number of problems ranging from work by Euler to PageRank and shortest-path problems. In more recent times, graphs have been used for machine learning.

With the advent of social networks and the world-wide web, more and more datasets can be represented using graphs. These graphs are ever larger, sometimes with billions of edges and billions of nodes. Designing efficient algorithms for analyzing those datasets has thus proven necessary. This thesis reviews the state of the art and introduces new algorithms for the clustering and the embedding of the nodes of massive graphs. Furthermore, in order to facilitate the handling of large graphs and to apply the techniques under study, we introduce Scikit-network, a free and open-source Python library which was developed during the thesis. Many tasks, such as the classification or the ranking of the nodes using centrality measures, can be carried out with Scikit-network.

We also tackle the problem of labeling data. Supervised machine learning techniques require labeled data to be trained. The quality of this labeled data has a heavy influence on the quality of the predictions of those techniques once trained. However, building this data cannot be achieved through the sole use of machines and requires human intervention. We study the data labeling problem in a graph-based setting, and we aim to describe the solutions that require as little human intervention as possible. We characterize those solutions and illustrate how they can be applied in real use cases.
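
The abstract mentions ranking nodes with centrality measures such as PageRank. As a rough illustration only, the following pure-Python sketch computes PageRank scores by power iteration on a tiny directed graph. It is not taken from the thesis and does not use the Scikit-network API; the function name `pagerank`, its parameters, and the example edge list are all hypothetical.

```python
def pagerank(edges, n_nodes, damping=0.85, n_iter=100, tol=1e-10):
    """Hypothetical helper: PageRank scores by power iteration.

    edges is a list of (source, target) pairs over nodes 0..n_nodes-1.
    """
    # Build the list of out-neighbors of each node.
    out_neighbors = [[] for _ in range(n_nodes)]
    for source, target in edges:
        out_neighbors[source].append(target)

    # Start from the uniform distribution.
    scores = [1.0 / n_nodes] * n_nodes
    for _ in range(n_iter):
        # Score held by nodes without out-links is spread uniformly.
        dangling = sum(scores[i] for i in range(n_nodes) if not out_neighbors[i])
        base = (1.0 - damping) / n_nodes + damping * dangling / n_nodes
        new_scores = [base] * n_nodes
        # Each node passes a damped, equal share of its score to its out-neighbors.
        for i in range(n_nodes):
            if out_neighbors[i]:
                share = damping * scores[i] / len(out_neighbors[i])
                for j in out_neighbors[i]:
                    new_scores[j] += share
        # Stop once the scores no longer change noticeably.
        diff = sum(abs(a - b) for a, b in zip(scores, new_scores))
        scores = new_scores
        if diff < tol:
            break
    return scores


# Assumed toy graph: a cycle 0 -> 1 -> 2 -> 0, plus node 3 pointing to node 2.
edges = [(0, 1), (1, 2), (2, 0), (3, 2)]
scores = pagerank(edges, n_nodes=4)
ranking = sorted(range(4), key=lambda i: scores[i], reverse=True)
```

In this toy graph, node 3 has no incoming edge and therefore keeps only the uniform restart share, while node 2 collects score from two nodes. For graphs with millions of edges, a sparse-matrix implementation such as the one a dedicated library provides would be the practical choice.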