Title:
Online Graph Partition for Distributed Dynamic GNN Training
Authors:
Contributors:
Hong Kong RGC
Source:
ACM Transactions on Modeling and Performance Evaluation of Computing Systems; volume 10, issue 4, pages 1-23; ISSN 2376-3639, 2376-3647
Publisher Information:
Association for Computing Machinery (ACM)
Publication Year:
2025
Document Type:
Journal article
Language:
English
DOI:
10.1145/3760772
Accession Number:
edsbas.7A3479D6
Database:
BASE

Further Information

Graph Neural Networks (GNNs) have become increasingly popular for their ability to learn the complex features of graph-structured data effectively. However, many real-world graphs are dynamic, changing over time in both structure and features. A large dynamic graph is commonly stored across distributed graph stores and learned through distributed GNN training. Classical graph partition algorithms focus on partition balance and cross-partition edge reduction, which do not serve the needs of distributed dynamic GNN learning well. We propose DistDy, a novel online graph partition framework tailored for distributed dynamic GNN learning, aiming to minimize dynamic graph storage overhead and inter-server communication. We design distributed additive storage to store changes to the large dynamic graph, and decide the graph partition (i.e., where changes are stored) on the fly by formulating it as a communication utility maximization problem. We propose an efficient online graph partition algorithm that computes near-optimal partition strategies according to refined resource prices and additive storage rewards, achieving a proven competitive ratio. Experiments on various real-world and synthetic dynamic graph datasets show that DistDy achieves 92.2% storage savings and up to 1.39× speed-up in distributed GNN training compared to representative graph partition algorithms.
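
The abstract describes a price-based online partition rule: each incoming graph change is placed on a server by trading a communication utility (a reward for co-locating related changes) against a resource price that rises as a server's storage fills. The paper's exact formulation is not reproduced in this record; below is a minimal Python sketch of what such a price-and-reward online assignment could look like. All names and constants (NUM_SERVERS, CAPACITY, CHANGE_SIZE, comm_utility, the exponential price function) are illustrative assumptions, not DistDy's actual algorithm.

import math
from collections import defaultdict

NUM_SERVERS = 4
CAPACITY = 1000   # per-server storage capacity (hypothetical units)
CHANGE_SIZE = 1   # storage cost of one graph change (hypothetical)

def resource_price(used, capacity):
    # Exponential pricing, a standard device in online primal-dual
    # algorithms (assumed here, not taken from the paper): the price
    # stays near zero while a server is empty and grows steeply as
    # it fills, discouraging overloaded placements.
    return math.e ** (used / capacity) - 1

def comm_utility(change, server, placement):
    # Reward for co-locating a change with its endpoints' earlier
    # changes, so GNN training needs fewer cross-server reads.
    u, v = change
    return sum(1 for node in (u, v) if placement.get(node) == server)

def assign(change, used, placement):
    # Greedy online rule: place the change on the server that
    # maximizes utility minus price at the moment it arrives.
    best, best_score = None, -math.inf
    for s in range(NUM_SERVERS):
        score = (comm_utility(change, s, placement)
                 - resource_price(used[s], CAPACITY) * CHANGE_SIZE)
        if score > best_score:
            best, best_score = s, score
    used[best] += CHANGE_SIZE
    u, v = change
    placement.setdefault(u, best)
    placement.setdefault(v, best)
    return best

if __name__ == "__main__":
    used = defaultdict(int)
    placement = {}  # node -> server holding its first recorded change
    stream = [(1, 2), (2, 3), (1, 3), (4, 5), (2, 4)]
    for edge in stream:
        print(edge, "->", assign(edge, used, placement))

Under this kind of rule, early placements are driven by the co-location reward, and the rising price gradually shifts weight toward load balance as capacity is consumed, which is how price-based online algorithms typically obtain competitive-ratio guarantees.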