Adaptive Partitioning for Large-Scale Dynamic Graphs
Luis Vaquero*, Félix Cuadrado*, Dionysios Logothetis**, Claudio Martella***
*Queen Mary University of London, **Telefónica I+D, ***VU University Amsterdam
[email protected], [email protected], [email protected], [email protected]

© 2013 by the Association for Computing Machinery, Inc. Copyright (ACM). Permission to make digital or hard copies of portions of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page in print or the first screen in digital media. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. SoCC’13, 1–3 Oct. 2013, Santa Clara, California, USA. ACM 978-1-4503-2428-1. http://dx.doi.org/10.1145/2523616.2525943

Abstract

Mining large-scale graphs is increasingly important, as it provides a powerful way of extracting useful information from real-world data. Processing that volume of information efficiently requires partitioning the graph across multiple nodes in a distributed system. However, traversing edges that span partitions incurs a significant performance penalty because of the additional cost of inter-partition communication. Minimising the number of cut edges between partitions reduces the cost of communication between neighbouring vertices, while balanced partitions are required for load balancing [2].

These large graphs represent real-world information, which is inherently dynamic. Recent systems such as Kineograph [1] can process changing graphs, but they do not consider the impact of dynamism on graph partitioning. To illustrate this impact, we built a call graph from mobile Call Detail Record (CDR) data, with a sliding window defining the creation and removal of nodes and edges. The graph was partitioned using three different techniques: modulo hash (HSH), the most popular partitioning technique because it scales well and produces balanced partitions [2]; a state-of-the-art streaming partitioning technique (deterministic greedy, DGT) [3]; and our adaptive repartitioning heuristic (ADP). Figure 1 shows the evolution of partitioning quality, expressed as the ratio of edges that cut across different partitions. While a good partitioning strategy significantly improves the initial ratio of cuts, the quality of the partitioning degrades over time, resulting in a higher communication penalty.

[Plot omitted: ratio of cuts (y axis, 0.4 to 0.8) against time in days (x axis, 0 to 30) for HSH, DGT and ADP.]

Figure 1: Evolution of the ratio of cuts over time on a dynamic graph generated by processing CDR calls on a sliding window.
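The metric plotted in Figure 1 can be illustrated with a minimal Python sketch. It assumes integer vertex ids and an undirected edge list; the helper names `hash_partition` and `ratio_of_cuts` and the toy graph are hypothetical and are not taken from the authors' system.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[int, int]


def hash_partition(vertex: int, num_partitions: int) -> int:
    """Modulo hash (HSH): map an integer vertex id to a partition."""
    return vertex % num_partitions


def ratio_of_cuts(edges: Iterable[Edge], assignment: Dict[int, int]) -> float:
    """Fraction of edges whose endpoints lie in different partitions."""
    edge_list = list(edges)
    if not edge_list:
        return 0.0
    cut = sum(1 for u, v in edge_list if assignment[u] != assignment[v])
    return cut / len(edge_list)


# Toy graph (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Modulo hash ignores graph structure, so it cuts most edges.
hsh = {v: hash_partition(v, 2) for v in range(6)}

# A structure-aware assignment keeps each triangle together.
by_cluster = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(ratio_of_cuts(edges, hsh))         # 5 of 7 edges are cut
print(ratio_of_cuts(edges, by_cluster))  # 1 of 7 edges is cut
```

Both assignments are perfectly balanced (three vertices per partition), which shows why balance alone says little about communication cost.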

Preventing this performance degradation with current approaches would require a full repartitioning of the graph, which can be extremely costly for large-scale graphs and introduces downtime. While this problem does not deeply affect batch processing systems, it can greatly reduce the throughput and increase the latency of graph processing systems that require fast response times. We propose an adaptive approach in which the partitioning is optimised with every change to the graph, while the computation is executing. We improve graph partitioning in a scalable manner by applying a local decision heuristic based on decentralised, iterative vertex migration. The heuristic [4] migrates vertices between partitions to minimise the number of cut edges while keeping partitions balanced as the structure changes at run time. We tested this approach in a system that processes dynamic graphs and adapts to graph changes by applying the iterative vertex migration algorithm. Although continuous migration adds overhead to the computation, we observed in several experiments that the total execution time was reduced by over 50%. A more detailed analysis of the system and experiments is available in [4].
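The general idea of iterative vertex migration can be sketched in a few lines of Python. This is a simplified, sequential, single-process illustration and not the authors' decentralised algorithm, whose details are in [4]: the function name `migrate_vertices`, the hard `capacity` limit per partition, the tie-breaking rule and the iteration cap are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def migrate_vertices(
    edges: Iterable[Tuple[int, int]],
    assignment: Dict[int, int],
    capacity: int,
    max_iters: int = 10,
) -> Dict[int, int]:
    """Move each vertex towards the partition holding most of its neighbours.

    A move happens only if it strictly increases the number of co-located
    neighbours and the target partition is below `capacity` (assumed limit).
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    assignment = dict(assignment)          # work on a copy
    sizes = Counter(assignment.values())   # current partition sizes

    for _ in range(max_iters):
        moved = False
        for v in sorted(assignment):
            current = assignment[v]
            counts = Counter(assignment[n] for n in neighbours[v])
            if not counts:
                continue  # isolated vertex: nothing to gain by moving
            # Partition with the most neighbours; ties go to the lowest id.
            best, best_count = max(
                counts.items(), key=lambda kv: (kv[1], -kv[0])
            )
            if (
                best != current
                and best_count > counts.get(current, 0)
                and sizes[best] < capacity
            ):
                assignment[v] = best
                sizes[current] -= 1
                sizes[best] += 1
                moved = True
        if not moved:
            break  # converged: no vertex wants to move
    return assignment


# Toy graph (an assumption): two triangles joined by one edge,
# initially placed by modulo hash over two partitions.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
initial = {v: v % 2 for v in range(6)}
result = migrate_vertices(edges, initial, capacity=4)
print(result)
```

The capacity is set slightly above the ideal size of three vertices per partition; with no slack at all, a hard limit would block every move in this sequential version. On the toy graph the sketch converges in two passes, ending with each triangle in its own partition.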

References

[1] R. Cheng, J. Hong, A. Kyrola, Y. Miao, X. Weng, M. Wu, F. Yang, L. Zhou, F. Zhao, and E. Chen. Kineograph: taking the pulse of a fast-changing and connected world. In EuroSys, 2012.
[2] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In PODC, 2009.
[3] I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD, 2012.
[4] L. Vaquero, F. Cuadrado, D. Logothetis, and C. Martella. xDGP: A dynamic graph processing system with adaptive partitioning. arXiv:1309.1049, 2013. http://arxiv.org/abs/1309.1049
