Literature by the same author
plus at Google Scholar

Bibliografische Daten exportieren
 

2ps : High-quality edge partitioning with two-phase streaming

Title data

Mayer, Ruben ; Orujzade, Kamil ; Jacobsen, Hans-Arno:
2ps : High-quality edge partitioning with two-phase streaming.
arXiv , 2020
DOI: https://doi.org/10.48550/arXiv.2001.07086

Abstract in another language

Graph partitioning is an important preprocessing step to distributed graph processing. In edge partitioning, the edge set of a given graph is split into k equally-sized partitions, such that the replication of vertices across partitions is minimized. Streaming is a viable approach to partition graphs that exceed the memory capacities of a single server. The graph is ingested as a stream of edges, and one edge at a time is immediately and irrevocably assigned to a partition based on a scoring function. However, streaming partitioning suffers from the uninformed assignment problem: At the time of partitioning early edges in the stream, there is no information available about the rest of the edges. As a consequence, edge assignments are often driven by balancing considerations, and the achieved replication factor is comparably high. In this paper, we propose 2PS, a novel two-phase streaming algorithm for high-quality edge partitioning. In the first phase, vertices are separated into clusters by a lightweight streaming clustering algorithm. In the second phase, the graph is re-streamed and edge partitioning is performed while taking into account the clustering of the vertices from the first phase. Our evaluations show that 2PS can achieve a replication factor that is comparable to heavy-weight random access partitioners while inducing orders of magnitude lower memory overhead.

Further data

Item Type: Preprint, postprint
Institutions of the University: Faculties
Faculties > Faculty of Mathematics, Physics und Computer Science
Faculties > Faculty of Mathematics, Physics und Computer Science > Department of Computer Science > Chair Data Systems
Faculties > Faculty of Mathematics, Physics und Computer Science > Department of Computer Science > Chair Data Systems > Chair Data Systems - Univ.-Prof. Dr. Ruben Mayer
Faculties > Faculty of Mathematics, Physics und Computer Science > Department of Computer Science
Result of work at the UBT: No
DDC Subjects: 000 Computer Science, information, general works > 004 Computer science
Date Deposited: 25 Apr 2023 10:15
Last Modified: 05 Feb 2024 07:48
URI: https://eref.uni-bayreuth.de/id/eprint/76037