Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs
Hyokun Yun (1), S.V.N. Vishwanathan (1,2)
(1) Department of Statistics, (2) Department of Computer Science, Purdue University
The Quilting Algorithm[3]
Abstract
• Question: How can graphs be sampled efficiently from the Multiplicative Attribute Graphs Model (MAGM)?
Performance
Step 1. Partition the nodes so that each node has a unique attribute configuration within its partition
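One simple way to build such a partition is to place the k-th node seen with a given attribute configuration into the k-th partition. The sketch below assumes attributes are given as tuples of 0/1 values; the function name `partition_nodes` and the example data are illustrative and not taken from the poster.

```python
from collections import defaultdict


def partition_nodes(attributes):
    """Split nodes into partitions with no repeated configuration inside any partition.

    attributes: list of tuples of 0/1 values, one tuple per node.
    Returns a list of partitions, each a list of node indices.
    """
    seen = defaultdict(int)  # configuration -> number of nodes seen with it so far
    partitions = []
    for node, config in enumerate(attributes):
        k = seen[config]     # this node is the k-th (0-based) with this configuration
        seen[config] += 1
        if k == len(partitions):
            partitions.append([])
        partitions[k].append(node)
    return partitions


# Hypothetical example: 5 nodes, 2 binary attributes each.
attributes = [(0, 1), (1, 1), (0, 1), (0, 0), (0, 1)]
partitions = partition_nodes(attributes)
print(partitions)  # [[0, 1, 3], [2], [4]]
```

With this rule the number of partitions equals the largest number of nodes that share one configuration.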
• Our Answer: Quilting Algorithm
  – The first sub-quadratic sampling algorithm for MAGM
  – Time complexity: $O((\log_2 n)^3 |E|)$ under mild conditions
• Question: How many partitions?
  – With high probability, $O(\log_2 n)$ under mild conditions
  – Therefore quilting $O((\log_2 n)^2)$ KPGM graphs suffices
  – Theoretically bounded by $O(\log_2 n)$ under mild conditions
  – Empirical simulations confirm the theory:
[Figure: size of the partition B vs. number of nodes n (up to 10^6), comparing the observed value with log2(n).]
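A quick simulation can give a feel for the partition count B. The sketch below assumes d = ceil(log2 n) binary attributes per node, each set to 1 independently with probability `mu = 0.5`; these settings and the name `partition_count` are assumptions for illustration, not the configuration behind the figure. B is computed as the largest number of nodes sharing one attribute configuration, which is the smallest number of partitions in which every node can have a unique configuration.

```python
import math
import random
from collections import Counter


def partition_count(n, mu=0.5, seed=0):
    """Smallest number of partitions needed for n randomly attributed nodes."""
    rng = random.Random(seed)
    d = max(1, math.ceil(math.log2(n)))  # assumed number of attributes
    counts = Counter(
        tuple(1 if rng.random() < mu else 0 for _ in range(d))
        for _ in range(n)
    )
    return max(counts.values())


for n in (2 ** 8, 2 ** 10, 2 ** 12):
    # Print n, the simulated B, and log2(n) for comparison.
    print(n, partition_count(n), math.log2(n))
```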
    ∗ n: number of nodes in the graph
    ∗ |E|: number of edges
  – Exploits the close connection between the Kronecker Product Graphs Model (KPGM) and MAGM
    ∗ Sampling a graph from KPGM can be done very efficiently
    ∗ We prove theoretically that it suffices to sample a small number of KPGM graphs and quilt them
  – Can sample a graph with 8 million nodes and 20 billion edges in under 6 hours
Step 2. Use this partition to divide the edge probability matrix Q into blocks Q(k,l), e.g. Q(1,1), Q(1,2), Q(2,1), Q(2,2)
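To make Step 2 concrete, the sketch below builds the blocks Q(k,l) for a toy example. It assumes the usual MAGM form in which the edge probability between two nodes is the product, over attributes, of an entry of Θ selected by the sender's and receiver's attribute values, with a single 2 × 2 matrix Θ shared by all attributes. The Θ values, attribute vectors, partition, and function names are made up for illustration.

```python
def edge_probability(theta, a_i, a_j):
    """Multiply theta[sender value][receiver value] over all attributes."""
    p = 1.0
    for s, r in zip(a_i, a_j):
        p *= theta[s][r]
    return p


def q_block(theta, attributes, senders, receivers):
    """Block of Q with rows from one partition and columns from another."""
    return [
        [edge_probability(theta, attributes[i], attributes[j]) for j in receivers]
        for i in senders
    ]


theta = [[0.9, 0.5], [0.5, 0.1]]                       # illustrative values
attributes = [(0, 1), (1, 1), (0, 1), (0, 0), (0, 1)]  # hypothetical nodes
partitions = [[0, 1, 3], [2], [4]]                     # unique configurations in each

# Blocks are keyed 1-based, (k, l), to match the Q(k,l) notation.
blocks = {
    (k + 1, l + 1): q_block(theta, attributes, d_k, d_l)
    for k, d_k in enumerate(partitions)
    for l, d_l in enumerate(partitions)
}
print(len(blocks))                                  # 9
print(len(blocks[(1, 2)]), len(blocks[(1, 2)][0]))  # 3 1
```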
Kronecker Power of a Matrix
• Let Θ be a 2 × 2 matrix
• The Kronecker power of the matrix exhibits a fractal structure
[Figure: Θ and its Kronecker powers Θ[2] and Θ[3].]
Experiments
• Choose two parameter values from the literature (Θ1 and Θ2)
• Increase the number of nodes n to evaluate scalability (each setting repeated 10 times)
• Size of the graph vs. total running time
• KPGM[1] uses this Kronecker Power of a matrix to define a graph model
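As a small illustration of the Kronecker power, the sketch below computes Θ[d] by repeated Kronecker products using plain Python lists. The Θ values are illustrative and not taken from the poster, and the helper names `kron` and `kron_power` are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def kron_power(theta, d):
    """d-th Kronecker power of theta (d >= 1)."""
    result = theta
    for _ in range(d - 1):
        result = kron(result, theta)
    return result


theta = [[0.9, 0.5], [0.5, 0.1]]  # illustrative 2 x 2 matrix
power = kron_power(theta, 3)
print(len(power), len(power[0]))  # 8 8
```

Each entry of the d-th power is a product of d entries of Θ, so a 2 × 2 matrix yields a 2^d × 2^d matrix of edge probabilities.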
Multiplicative Attribute Graphs Model (MAGM)[2]
Step 3. After permutation, each Q(k,l) becomes a submatrix of the Kronecker power matrix. That is, it corresponds to a subgraph of a KPGM graph, for which an efficient $O(\log_2(n)\,|E|)$ sampling method exists[1].
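The claim in Step 3 can be checked numerically on a toy example. The sketch assumes the MAGM edge probability is the product, over attributes, of an entry of Θ selected by the two nodes' attribute values, with a single shared Θ. It reads each attribute configuration as a binary number to index the rows and columns of the Kronecker power. All names and values are illustrative.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def kron_power(theta, d):
    """d-th Kronecker power of theta (d >= 1)."""
    result = theta
    for _ in range(d - 1):
        result = kron(result, theta)
    return result


def config_index(config):
    """Read a 0/1 attribute configuration as a binary number (first bit most significant)."""
    idx = 0
    for bit in config:
        idx = 2 * idx + bit
    return idx


def edge_probability(theta, a_i, a_j):
    """Assumed MAGM edge probability with one shared theta."""
    p = 1.0
    for s, r in zip(a_i, a_j):
        p *= theta[s][r]
    return p


theta = [[0.9, 0.5], [0.5, 0.1]]               # illustrative values
partition = {0: (0, 1), 1: (1, 1), 3: (0, 0)}  # node -> configuration, all distinct
power = kron_power(theta, 2)

for a_i in partition.values():
    for a_j in partition.values():
        q_ij = edge_probability(theta, a_i, a_j)
        p_ij = power[config_index(a_i)][config_index(a_j)]
        assert abs(q_ij - p_ij) < 1e-12
print("block matches a submatrix of the Kronecker power")
```

Because configurations within a partition are distinct, the selected rows and columns are distinct as well, so the block is a genuine submatrix up to a reordering of its rows and columns.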
[Figure: running time per edge vs. number of nodes (n) for Θ1 and Θ2, comparing Quilting and Naive sampling.]
• Size of the graph vs. Running time per edge
Step 4. We sample a graph for each Q(k,l) and quilt these pieces together to form the final graph
[Figure: blocks of Q are permuted and sampled, giving the pieces A(1,1), A(1,2), A(2,1), A(2,2) of the final graph.]
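The four steps can be put together in a toy end-to-end sketch. This is not the authors' implementation: the fast KPGM sampler of [1] is replaced here by a naive entry-by-entry draw over the full Kronecker power matrix, so the code shows only the quilting logic and none of the speed-up. It also assumes a single 2 × 2 matrix Θ shared by all attributes; the function names and inputs are hypothetical.

```python
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def kron_power(theta, d):
    """d-th Kronecker power of theta (d >= 1)."""
    result = theta
    for _ in range(d - 1):
        result = kron(result, theta)
    return result


def config_index(config):
    """Read a 0/1 attribute configuration as a binary number."""
    idx = 0
    for bit in config:
        idx = 2 * idx + bit
    return idx


def quilt(theta, attributes, seed=0):
    """Return a set of directed edges (i, j) over the nodes in `attributes`."""
    rng = random.Random(seed)
    d = len(attributes[0])
    power = kron_power(theta, d)
    size = 2 ** d

    # Step 1: the k-th node seen with a configuration goes to partition k.
    # Each partition maps a configuration index to the node holding it.
    seen = {}
    partitions = []
    for node, config in enumerate(attributes):
        k = seen.get(config, 0)
        seen[config] = k + 1
        if k == len(partitions):
            partitions.append({})
        partitions[k][config_index(config)] = node

    # Steps 2-4: for every pair of partitions, draw a fresh KPGM graph
    # (naively here), keep the edges whose endpoints correspond to nodes in
    # the two partitions, and add them to the final graph.
    edges = set()
    for senders in partitions:
        for receivers in partitions:
            for u in range(size):
                for v in range(size):
                    if rng.random() < power[u][v]:
                        if u in senders and v in receivers:
                            edges.add((senders[u], receivers[v]))
    return edges


attributes = [(0, 1), (1, 1), (0, 1), (0, 0), (0, 1)]  # hypothetical nodes
edges = quilt([[0.9, 0.5], [0.5, 0.1]], attributes)
print(len(edges))
```

Every ordered pair of nodes falls into exactly one pair of partitions, so each potential edge is drawn exactly once with its own probability.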
[Figure: total running time (ms) vs. number of nodes (n) for Θ1 and Θ2, comparing Quilting and Naive sampling.]
• Generalization of KPGM[1] with increased modeling power (e.g., power-law degree distribution)
• d attributes characterize the nodes in the graph: each node either possesses or lacks (0 or 1) each attribute
• The probability of an edge between two nodes is determined by their attributes
  – The effect of each attribute is multiplicative → Multiplicative Attribute Graphs
  – We call Q the edge probability matrix
• Naïvely sampling each entry of the matrix takes $O(n^2)$ time!
[Figure: attributes of senders, attributes of receivers, and the adjacency matrix.]
References
[1] J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos and Z. Ghahramani, Kronecker Graphs: An Approach to Modeling Networks. Journal of Machine Learning Research, 11(3), 2010.
[2] M. Kim and J. Leskovec, Multiplicative Attribute Graph Model of Real-World Networks. Algorithms and Models for the Web-Graph, 62-73, 2010.
[3] H. Yun and S.V.N. Vishwanathan, Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs. Under review.