Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs

Hyokun Yun¹, S.V.N. Vishwanathan¹,²

¹Department of Statistics, ²Department of Computer Science, Purdue University

Abstract

Question: How can we efficiently sample graphs from the Multiplicative Attribute Graphs Model (MAGM)?

• Our Answer: Quilting Algorithm
  – The first sub-quadratic sampling algorithm for MAGM
  – Time complexity: O((log2(n))^3 |E|) under mild conditions
    ∗ n: number of nodes in the graph
    ∗ |E|: number of edges
  – Exploits the close connection between the Kronecker Product Graphs Model (KPGM) and MAGM
    ∗ Sampling a graph from KPGM can be done very efficiently
    ∗ We prove theoretically that it suffices to sample a small number of KPGM graphs and quilt them
  – Can sample a graph with 8 million nodes and 20 billion edges in under 6 hours

Performance

• Size of the partition B
  – Theoretically bounded by O(log2(n)) under mild conditions
  – Empirical simulations confirm the theory

[Figure: size of the partition B vs. number of nodes n (up to 1·10^6); observed values compared with log2(n)]

The Quilting Algorithm[3]

Step 1. Partition the nodes such that each node has a unique attribute configuration within its partition.

• Question: How many partitions?
  – With high probability, O(log2(n)) under mild conditions
  – Therefore quilting O((log2(n))^2) KPGM graphs suffices

Step 2. Use this partition to divide the edge probability matrix Q

[Figure: the edge probability matrix Q divided into blocks Q(1,1), Q(1,2), Q(2,1), Q(2,2)]

Experiments

• Choose two parameter values from the literature (Θ1 and Θ2)
• Increase the number of nodes n to evaluate scalability (repeated 10 times each)
• Size of the graph vs. total running time

Kronecker Power of a Matrix

• Let Θ be a 2 × 2 matrix
• The Kronecker power of the matrix shows a fractal structure

[Figure: Θ and its Kronecker powers Θ[2] and Θ[3]]

• KPGM[1] uses this Kronecker power of a matrix to define a graph model
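A minimal sketch of the Kronecker power, using NumPy's `np.kron`. The entries of `theta` and the helper name `kronecker_power` are illustrative assumptions; no numeric values for Θ are given here.

```python
import numpy as np

def kronecker_power(theta, k):
    """Return the k-th Kronecker power of a square matrix (k >= 1)."""
    result = theta
    for _ in range(k - 1):
        result = np.kron(result, theta)
    return result

# Hypothetical 2 x 2 matrix; the entries are made up for illustration.
theta = np.array([[0.9, 0.5],
                  [0.5, 0.2]])

theta_2 = kronecker_power(theta, 2)   # 4 x 4 matrix
theta_3 = kronecker_power(theta, 3)   # 8 x 8 matrix

# Fractal structure: every 4 x 4 block of theta_3 is theta_2 scaled by one
# entry of theta. Here we look at the top-left block.
top_left = theta_3[:4, :4]
print(np.allclose(top_left, theta[0, 0] * theta_2))
```

In KPGM[1], entry (i, j) of such a power is read as the probability of an edge from node i to node j.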

Multiplicative Attribute Graphs Model (MAGM)[2]


Step 3. When permuted, each Q(k,l) matrix becomes a submatrix of the Kronecker power matrix. That is, the graph it defines is a subgraph of a KPGM graph, for which an efficient O(log2(n) |E|) sampling method exists[1].
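The connection that Step 3 relies on can be checked numerically on a toy case. The sketch below assumes the same 2 × 2 matrix Θ is used for every attribute and encodes each node's attribute configuration as an integer (first attribute as the most significant bit). The values of `theta`, `n`, and `d` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 3                                  # hypothetical: 6 nodes, 3 binary attributes
theta = np.array([[0.9, 0.5],
                  [0.5, 0.2]])               # hypothetical entries
attrs = rng.integers(0, 2, size=(n, d))      # attrs[i, k] is 0 or 1

# MAGM edge probabilities: multiply the effect of each attribute.
Q = np.ones((n, n))
for k in range(d):
    a = attrs[:, k]
    Q *= theta[a[:, None], a[None, :]]

# d-th Kronecker power of theta (size 2^d x 2^d).
power = theta
for _ in range(d - 1):
    power = np.kron(power, theta)

# Integer code of each configuration, first attribute most significant.
codes = attrs @ (2 ** np.arange(d - 1, -1, -1))

# Q is obtained by picking rows and columns of the Kronecker power.
Q_from_kron = power[np.ix_(codes, codes)]
print(np.allclose(Q, Q_from_kron))
```

Nodes that share a configuration select the same row and column of the Kronecker power. This is why Step 1 keeps configurations unique within each partition: only then is a block Q(k,l) a genuine (permuted) submatrix.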

[Figure: a block Q(k,l) is permuted and then sampled to obtain the adjacency block A(k,l)]

[Figure: total running time (ms) vs. number of nodes n (up to 8·10^6), for Θ1 and Θ2, comparing Quilting with Naive sampling]

• Size of the graph vs. running time per edge

[Figure: running time per edge vs. number of nodes n (up to 8·10^6), for Θ1 and Θ2, comparing Quilting with Naive sampling]

Step 4. We sample a graph for each Q(k,l) and quilt these pieces together to form the final graph.
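Steps 1 to 4 can be sketched end to end as follows. This is an illustration under stated assumptions, not the authors' implementation. The greedy partition rule is only one simple way to satisfy Step 1. Plain entry-wise Bernoulli sampling stands in for the efficient KPGM sampler of [1], and the full Kronecker power is materialized, so the sketch does not achieve the sub-quadratic running time. The names `partition_nodes` and `quilt` and all numeric values are hypothetical.

```python
from collections import defaultdict
import numpy as np

def partition_nodes(configs):
    """Step 1: the c-th node seen with a given configuration goes to
    partition c, so no configuration repeats inside a partition."""
    seen = defaultdict(int)
    parts = []
    for node, config in enumerate(configs):
        c = seen[int(config)]
        seen[int(config)] += 1
        if c == len(parts):
            parts.append([])
        parts[c].append(node)
    return parts

def quilt(theta, attrs, rng):
    """Sample an adjacency matrix block by block and quilt the blocks."""
    n, d = attrs.shape
    # Integer code of each attribute configuration.
    configs = attrs @ (2 ** np.arange(d - 1, -1, -1))
    # d-th Kronecker power (materialized here only for illustration).
    power = theta
    for _ in range(d - 1):
        power = np.kron(power, theta)
    parts = partition_nodes(configs)
    A = np.zeros((n, n), dtype=np.int8)
    for rows in parts:                       # Step 2: one block per pair of partitions
        for cols in parts:
            # Step 3: the block is a submatrix of the Kronecker power.
            block = power[np.ix_(configs[rows], configs[cols])]
            # Step 4: sample the block (Bernoulli stand-in) and quilt it in.
            A[np.ix_(rows, cols)] = rng.random(block.shape) < block
    return A, parts

rng = np.random.default_rng(0)
theta = np.array([[0.9, 0.5],
                  [0.5, 0.2]])               # hypothetical entries
attrs = rng.integers(0, 2, size=(8, 2))      # 8 nodes, 2 binary attributes
A, parts = quilt(theta, attrs, rng)
print(len(parts), A.shape)
```

With B partitions the double loop visits B^2 blocks, which matches the count of KPGM graphs to be quilted.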

[Figure: adjacency matrix annotated with attributes of senders and attributes of receivers]

• MAGM generalizes KPGM[1] and has increased modeling power (e.g., power-law degree distributions)
• d attributes characterize the nodes in the graph: each node either possesses or lacks (1 or 0) each attribute
• The probability of an edge between two nodes is determined by their attributes
  – The effect of each attribute is multiplicative → Multiplicative Attribute Graphs
  – We call Q the edge probability matrix
• Naïvely sampling each entry of the matrix takes O(n^2) time!

References

[1] J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani. Kronecker Graphs: An Approach to Modeling Networks. Journal of Machine Learning Research, 11(3), 2010.
[2] M. Kim and J. Leskovec. Multiplicative Attribute Graph Model of Real-World Networks. Algorithms and Models for the Web-Graph, 62-73, 2010.
[3] H. Yun and S.V.N. Vishwanathan. Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs. Under review.
