Literatur vom gleichen Autor/der gleichen Autor*in
plus bei Google Scholar

Bibliografische Daten exportieren
 

2ps : High-quality edge partitioning with two-phase streaming

Titelangaben

Mayer, Ruben ; Orujzade, Kamil ; Jacobsen, Hans-Arno:
2ps : High-quality edge partitioning with two-phase streaming.
arXiv , 2020
DOI: https://doi.org/10.48550/arXiv.2001.07086

Abstract

Graph partitioning is an important preprocessing step to distributed graph processing. In edge partitioning, the edge set of a given graph is split into k equally-sized partitions, such that the replication of vertices across partitions is minimized. Streaming is a viable approach to partition graphs that exceed the memory capacities of a single server. The graph is ingested as a stream of edges, and one edge at a time is immediately and irrevocably assigned to a partition based on a scoring function. However, streaming partitioning suffers from the uninformed assignment problem: At the time of partitioning early edges in the stream, there is no information available about the rest of the edges. As a consequence, edge assignments are often driven by balancing considerations, and the achieved replication factor is comparably high. In this paper, we propose 2PS, a novel two-phase streaming algorithm for high-quality edge partitioning. In the first phase, vertices are separated into clusters by a lightweight streaming clustering algorithm. In the second phase, the graph is re-streamed and edge partitioning is performed while taking into account the clustering of the vertices from the first phase. Our evaluations show that 2PS can achieve a replication factor that is comparable to heavy-weight random access partitioners while inducing orders of magnitude lower memory overhead.

Weitere Angaben

Publikationsform: Preprint, Postprint
Institutionen der Universität: Fakultäten
Fakultäten > Fakultät für Mathematik, Physik und Informatik
Fakultäten > Fakultät für Mathematik, Physik und Informatik > Institut für Informatik > Lehrstuhl Data Systems
Fakultäten > Fakultät für Mathematik, Physik und Informatik > Institut für Informatik > Lehrstuhl Data Systems > Lehrstuhl Data Systems - Univ.-Prof. Dr. Ruben Mayer
Fakultäten > Fakultät für Mathematik, Physik und Informatik > Institut für Informatik
Titel an der UBT entstanden: Nein
Themengebiete aus DDC: 000 Informatik,Informationswissenschaft, allgemeine Werke > 004 Informatik
Eingestellt am: 25 Apr 2023 10:15
Letzte Änderung: 05 Feb 2024 07:48
URI: https://eref.uni-bayreuth.de/id/eprint/76037