Temporal Clustering in Time-varying Networks with Time Series Analysis

Kun Tu, Bruno Ribeiro, Ananthram Swami, Don Towsley
UMass Amherst, Purdue University, Army Research Lab
{kuntu, towsley}@cs.umass.edu, {ribeiro}@cs.purdue.edu, {a.swami}@ieee.org

Abstract

Detecting and tracking evolving communities in temporal networks is a key aspect of network analysis. Observing detailed changes in a community over time requires analyzing networks at small time scales, which introduces two challenges: (a) the adjacency matrix of a community in a network snapshot may be too sparse for community detection; and (b) tracking evolving communities and their lifetimes is difficult. Our work proposes a community detection framework to address these time scale challenges. For the time-dependent aspect of the communities, we propose a time series segmentation algorithm to detect their formations, dissolutions, and lifetimes. We use synthetic networks and real-world datasets to test our method against state-of-the-art methods. The results show that our proposed approach achieves more accurate fine-grained community detection than competing methods.

1 Introduction

In the real world, relations between individuals change over time. When these relations are modeled as a time-varying network, groups of closely related individuals can be detected as communities from the network structure. Tracking changes in these communities helps us understand how relations evolve. Detecting communities in temporal networks requires both identifying clusters and detecting their lifetimes.

There is a significant body of work on clustering temporal networks. One approach identifies clusters in each network snapshot and reconciles the clusters obtained at neighboring snapshots [3, 2, 11, 8]. A second approach applies stochastic block models (SBMs) to detect communities at each time step [4, 15, 7], looking for changes in the SBMs between snapshots to find temporal clusters. A third approach applies tensor decomposition to the stacked adjacency matrices of network snapshots [5, 16, 13, 9]; clusters and their lifetimes are detected from the decomposition results.

However, these approaches do not address all of the following challenges: (a) communities are difficult to detect at a fine time scale because they are sparsely connected, yet analysis at a coarse time scale can lose detailed changes in communities; (b) an individual in the real world may belong to multiple communities at the same time; (c) the number of communities in a time-varying network is difficult to determine, which can significantly affect clustering results; and (d) state-of-the-art algorithms sacrifice accuracy by limiting the number of communities to keep computation tractable [6], because a real-world temporal network can be large in both size and time dimension, resulting in a large number of temporal clusters.

Contributions Motivated by previous work, we propose a temporal clustering (TC) method that models a node's membership in multiple communities as well as time-dependent edge-generation probabilities between node pairs inside a community.
Assuming the rate at which an edge appears/disappears between two nodes changes slowly at a fine time granularity, we propose an adaptive threshold on the edge-generation rate to detect community lifetimes and a bottom-up time series segmentation algorithm to detect community formation/dissolution. Experiments show that our method outperforms baseline methods in both community detection and lifetime detection.

31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

2 Temporal Clustering Model

2.1 Generative Model for Temporal Network

We model a time-varying network $G = \{G_t(V, E_t) \mid t = 1, \dots, T\}$ as a mixture of $R$ (unobservable) generative models $\{X^{(r)}\}_{r=1}^{R}$, where $G_t$ is a network snapshot at time $t$, $V$ is a fixed set of nodes, and $E_t$ is the set of edges at time $t$. We assume that a node $i \in V$ belongs to $X^{(r)}$ according to a Bernoulli random variable with parameter $P(i \in X^{(r)}) = a_{ir}$. At time step $t$, the $r$-th generative model $X^{(r)}$ is expected to generate $a_{ir} a_{jr} \lambda^{(r)}(t)$ edges between two nodes $i, j$, where $\lambda^{(r)}(t)$ is defined as the expected edge-generation rate of $X^{(r)}$ at time $t$. $\lambda^{(r)}(t)$ changes over time and can be interpreted as an activity time series.

Optimization Problem The temporal clustering problem is to find $R$ network generative models $X^{(r)}$, their edge-generation rates $\{\lambda^{(r)}(t)\}$, and the probability that node $i$ belongs to the generative model, $a_{ir} \in [0, 1]$, for $r = 1, \dots, R$, $t = 1, \dots, T$, and $i = 1, \dots, |V|$, to minimize the objective:

$$
\sum_{i,j \in V} \sum_{t} \Big( X_{ijt} - \sum_{r=1}^{R} a_{ir} a_{jr} \lambda^{(r)}(t) \Big)^2 \tag{1}
$$

$$
\text{s.t.} \quad 0 \le a_{ir} \le 1 \ \text{ for } i = 1, \dots, |V|;\ r = 1, \dots, R; \qquad \lambda^{(r)}(t) \ge 0 \ \text{ for } t = 1, \dots, T.
$$
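As an illustration of Eq. (1), the following minimal sketch evaluates the objective with plain nested loops that mirror the sums term by term. It assumes that `X[i][j][t]` holds the observed number of edges between nodes `i` and `j` at time `t`; the function name `tc_objective` and the nested-list layout are hypothetical choices made for this example only.

```python
def tc_objective(X, A, lam):
    """Squared reconstruction error of Eq. (1).

    X[i][j][t] : observed edge count between nodes i and j at time t (assumed layout)
    A[i][r]    : membership probability a_ir of node i in generative model r
    lam[r][t]  : edge-generation rate lambda^(r)(t) of model r at time t
    """
    n, R, T = len(A), len(lam), len(lam[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            for t in range(T):
                # Expected edges between i and j at time t, summed over all models.
                expected = sum(A[i][r] * A[j][r] * lam[r][t] for r in range(R))
                total += (X[i][j][t] - expected) ** 2
    return total


def is_feasible(A, lam):
    """Check the constraints 0 <= a_ir <= 1 and lambda^(r)(t) >= 0."""
    return (all(0.0 <= a <= 1.0 for row in A for a in row)
            and all(rate >= 0.0 for series in lam for rate in series))
```

A tensor generated exactly from `A` and `lam` gives an objective of zero, so the value can be read as the part of the observed network that the mixture of models fails to explain.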

We denote $\hat{X}^{(r)} = \{A_r, \vec{\lambda}_r\}$ as the $r$-th generative model learned from an algorithm, where $\vec{\lambda}_r = [\lambda_{1,r}, \dots, \lambda_{T,r}]$ is a sequence of $T$ samples of the edge-generation rate $\lambda^{(r)}(t)$.

Detecting Communities We allow $X^{(r)}$ to include one or more clusters. The similarity between two nodes $i$ and $j$ in $X^{(r)}$ is defined as:

$$
s^{(r)}_{i,j} = a_{ir} a_{jr} \tag{2}
$$

Clusters across different generative models $X^{(r)}, X^{(s)}$ are allowed to overlap: suppose $C^{(r)}_m$ is the $m$-th cluster obtained from $X^{(r)}$ and $C^{(s)}_n$ is the $n$-th cluster obtained from $X^{(s)}$; then $|C^{(r)}_m \cap C^{(s)}_n| \ge 0$ for $r \ne s$.
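The similarity in Eq. (2) is an outer product of the membership vector of one model with itself. A minimal sketch, assuming the memberships of model $r$ are given as a plain list `a_r` (a hypothetical name):

```python
def similarity_matrix(a_r):
    """Pairwise similarity s_ij = a_ir * a_jr of Eq. (2) for one generative model.

    a_r[i] is the membership probability of node i in the model.
    """
    return [[a_i * a_j for a_j in a_r] for a_i in a_r]
```

Two nodes are similar only when both have a high membership in the model; a node with membership zero has similarity zero to every node.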

3 Detecting and Tracking Communities

3.1 Learning Generative Model

We apply the alternating least squares (ALS) algorithm because of its good performance and speed [14]. We select the number of generative models, $R$, according to core consistency [1], which has been shown to be effective with ALS for tensor decomposition. We use $\vec{\lambda}_r$ to estimate the edge-generation rate $\lambda^{(r)}(t)$ and learn a piecewise linear function of time $t$ from $\vec{\lambda}_r$. The time windows in which $\lambda^{(r)}(t)$ increases/decreases are considered the periods in which a community forms/dissolves.
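To make the alternating scheme concrete, here is a minimal NumPy sketch of a three-way factorization fitted by alternating least squares. It is not the solver of [14]: it keeps two separate node factors `A` and `B` instead of tying them, and it enforces non-negativity by clipping after each unconstrained least-squares update, which is a simple heuristic. The function names and the final rescaling step are assumptions made for illustration.

```python
import numpy as np


def cp_als_nonneg(X, R, n_iter=100, seed=0):
    """Fit X[i, j, t] ~ sum_r A[i, r] * B[j, r] * C[t, r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    n, _, T = X.shape
    # Strictly positive random initialization of the three factor matrices.
    A, B, C = (rng.random((d, R)) + 0.1 for d in (n, n, T))
    for _ in range(n_iter):
        # Each update solves the least-squares problem for one factor with the
        # other two fixed (normal equations), then clips negatives to zero.
        A = np.clip(np.einsum("ijt,jr,tr->ir", X, B, C)
                    @ np.linalg.pinv((B.T @ B) * (C.T @ C)), 0.0, None)
        B = np.clip(np.einsum("ijt,ir,tr->jr", X, A, C)
                    @ np.linalg.pinv((A.T @ A) * (C.T @ C)), 0.0, None)
        C = np.clip(np.einsum("ijt,ir,jr->tr", X, A, B)
                    @ np.linalg.pinv((A.T @ A) * (B.T @ B)), 0.0, None)
    # Rescale so every column of A and B has maximum 1 (memberships in [0, 1])
    # and move the scale into the time factor C.
    for F in (A, B):
        scale = F.max(axis=0)
        scale[scale == 0] = 1.0
        F /= scale
        C *= scale
    return A, B, C


def reconstruct(A, B, C):
    """Rebuild the tensor from its factors."""
    return np.einsum("ir,jr,tr->ijt", A, B, C)
```

For a symmetric input tensor the two node factors typically end up close to each other, and column $r$ of `C` plays the role of the sampled rate sequence $\vec{\lambda}_r$.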

3.2 Clustering in Generative Models

Clustering We identify communities within each model $\hat{X}^{(r)}$ using the K-means algorithm with the similarity defined in Eq. (2) to detect $K^{(r)}$ clusters, where $K^{(r)}$ is determined using the Silhouette criterion (details in supplementary material [12]).

Ranking Clusters We propose the average similarity ordering (SO) score to rank all communities $C^{(r)}_m$ (for $m = 1, \dots, K^{(r)}$, $r = 1, \dots, R$) in descending order to select communities of interest:

$$
SO^{(r)}_m = \frac{\sum_{i,j \in C^{(r)}_m} a_{ir} a_{jr}}{|C^{(r)}_m|^2} \int_0^T \lambda^{(r)}(z)\, dz \approx \frac{\sum_{i,j \in C^{(r)}_m} a_{ir} a_{jr}}{|C^{(r)}_m|^2} \Big( \sum_{t=1}^{T} \hat{\lambda}_{t,r} \Big), \tag{3}
$$

where $\hat{\lambda}_{t,r}$ is an estimate of the edge-generation rate obtained from $\hat{X}^{(r)}$. The SO score represents, on average, the number of edges generated by a node pair in a community over time. Intuitively, densely connected communities are significant and provide useful information for analysis.
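The discrete approximation on the right-hand side of Eq. (3) can be sketched as follows. The names `so_score` and `rank_communities`, and the representation of a community as a list of node indices, are hypothetical.

```python
def so_score(members, a_r, lam_hat_r):
    """Average similarity ordering (SO) score of one community, discrete form of Eq. (3).

    members   : node indices of community C_m^(r)
    a_r       : a_r[i] is the membership a_ir of node i in model r
    lam_hat_r : estimated rates [lam_hat_{1,r}, ..., lam_hat_{T,r}]
    """
    # The double sum runs over all ordered pairs (i, j), including i == j,
    # matching the |C|^2 normalization.
    pair_sum = sum(a_r[i] * a_r[j] for i in members for j in members)
    return pair_sum / len(members) ** 2 * sum(lam_hat_r)


def rank_communities(communities, a_r, lam_hat_r):
    """Sort communities of one model by SO score, highest first."""
    return sorted(communities, key=lambda c: so_score(c, a_r, lam_hat_r), reverse=True)
```

A community whose members all have high membership values and whose model is active for a long time receives a high score and is listed first.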

3.3 Detecting Lifetime

We consider the edge-generation rates $\vec{\lambda}_r$ from $X^{(r)}$ as a time series. To describe the change in a community, we define its lifetime as the time intervals in which the edge-generation rate is larger than a threshold. The community formation/dissolution period is defined as the time intervals in which the edge-generation rate is increasing/decreasing.

Formation/Dissolution of Clusters To detect community formation/dissolution, we propose a bottom-up algorithm that partitions $\vec{\lambda}_r$ into $D$ segments to generate a piecewise linear function $f_D(t)$ (Alg. 1).

Algorithm 1: Time Mode Segmentation Algorithm
Data: Time series Y = [y_1, ..., y_T]
Result: Segmentation S
S = [[1], [2], ..., [T]];
Ŷ = SlideWindowFilter(Y);
Δ = Y − Ŷ;
δ̄ = mean(Δ); σ̂ = standardError(Δ);
max_error = δ̄ + σ̂;
for i = 1 : T − 1 do
    merge_cost(i) = LinearRegressionError(Y[i : i+1]);
while min(merge_cost) ≤ max_error do
    i = indexOf(min(merge_cost));
    S(i) = merge(S(i), S(i+1));
    delete(S(i+1));
    merge_cost(i) = LinearRegressionError(merge(S(i), S(i+1)));
    merge_cost(i−1) = LinearRegressionError(merge(S(i−1), S(i)));
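A runnable sketch of the bottom-up idea in Algorithm 1 is given below, under several assumptions that the pseudocode leaves open: the sliding window filter is taken to be a centered moving average with a window of 5, the regression error is the sum of squared residuals of a least-squares line, `standardError` is read as the standard error of the mean, and absolute deviations |Y − Ŷ| are used so that the threshold is never negative. For brevity, all merge costs are recomputed in every iteration instead of updating only the neighbors of the merged segment. All function names are hypothetical.

```python
from statistics import mean, stdev


def moving_average(y, window=5):
    """Centered moving average; the window is truncated at the boundaries."""
    half = window // 2
    return [mean(y[max(0, k - half):k + half + 1]) for k in range(len(y))]


def linreg_error(y, start, end):
    """Sum of squared residuals of a least-squares line fitted to y[start:end]."""
    n = end - start
    if n <= 2:
        return 0.0  # a line fits one or two points exactly
    xs = range(start, end)
    ys = y[start:end]
    x_bar = (start + end - 1) / 2
    y_bar = mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(xs, ys))
    slope = sxy / sxx
    return sum((v - (y_bar + slope * (x - x_bar))) ** 2 for x, v in zip(xs, ys))


def bottom_up_segments(y, window=5):
    """Merge adjacent segments while the cheapest merge stays below the threshold.

    Returns a list of half-open index ranges (start, end), 0-based.
    """
    smooth = moving_average(y, window)
    dev = [abs(v - s) for v, s in zip(y, smooth)]
    # Threshold: mean deviation plus its standard error (assumed reading).
    max_error = mean(dev) + stdev(dev) / len(dev) ** 0.5
    segments = [(k, k + 1) for k in range(len(y))]
    while len(segments) > 1:
        costs = [linreg_error(y, segments[k][0], segments[k + 1][1])
                 for k in range(len(segments) - 1)]
        k = min(range(len(costs)), key=costs.__getitem__)
        if costs[k] > max_error:
            break
        segments[k:k + 2] = [(segments[k][0], segments[k + 1][1])]
    return segments
```

On a series that rises linearly and then falls linearly, the sketch returns two segments that meet near the peak; the sign of the fitted slope in each segment then indicates whether the community is forming or dissolving.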

Lifetime Threshold To detect the lifetime of a cluster, we first define the "average edge-generating rate for community $C^{(r)}_m$ at $t$":

$$
\lambda^{(r)}_m(t) = \frac{1}{|C^{(r)}_m|^2} \sum_{i,j \in C^{(r)}_m} a_{ir} a_{jr} \lambda^{(r)}(t).
$$

Similarly, we define the "average edge-generating rate for network $G$ at $t$" as

$$
\lambda(t) = \frac{1}{|V|^2} \sum_{r=1}^{R} \sum_{i,j \in V} a_{ir} a_{jr} \lambda^{(r)}(t).
$$

Intuitively, our model determines that a community $C^{(r)}_m$ appears at time $t$ if a node pair in $C^{(r)}_m$, on average, is expected to generate more edges than a randomly selected node pair from the network. Thus, $\lambda(t)$ serves as a threshold to decide the lifetime of a cluster. We define the lifetime of $C^{(r)}_m$ as:

$$
L^{(r)}_m = \{ t \mid \lambda^{(r)}_m(t) > \lambda(t) \} \tag{4}
$$

where $\lambda(t)$ can be easily obtained by applying sliding window filtering to the sequence $\big[ \sum_{i,j=1}^{|V|} X_{ijt} / |V|^2 \big]$ for $t = 1, \dots, T$.
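The lifetime rule of Eq. (4) can be sketched directly from the two model-based rate definitions. The sketch below computes the threshold $\lambda(t)$ from the learned factors rather than from filtered observations, uses 0-based time indices, and relies on the identity $\sum_{i,j} a_{ir} a_{jr} = (\sum_i a_{ir})^2$; the function names are hypothetical.

```python
def pair_mass(a_r, nodes):
    """sum over i, j in nodes of a_ir * a_jr, computed as (sum_i a_ir) ** 2."""
    return sum(a_r[i] for i in nodes) ** 2


def community_lifetime(members, r, A, lam):
    """Time steps at which community `members` of model r exceeds the network rate.

    A[i][q]   : membership of node i in model q
    lam[q][t] : edge-generation rate of model q at time t
    """
    n, R, T = len(A), len(lam), len(lam[0])
    columns = list(zip(*A))  # columns[q][i] == A[i][q]
    inside = pair_mass(columns[r], members) / len(members) ** 2
    lifetime = []
    for t in range(T):
        community_rate = inside * lam[r][t]
        network_rate = sum(pair_mass(columns[q], range(n)) * lam[q][t]
                           for q in range(R)) / n ** 2
        if community_rate > network_rate:
            lifetime.append(t)
    return lifetime
```

Because the comparison is strict, time steps at which the model is inactive (rate zero) are never part of a lifetime.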

4 Experiments and Evaluations

We evaluate our temporal clustering (TC) method by comparing it with two baseline methods: evolutionary clustering (EC) [13] and PARAFAC with binary classification (BC) [5]. Precision, recall, F1 measure, and PR curves are used to evaluate the performance of cluster detection, cluster member detection, and lifetime detection in time-varying undirected networks constructed from synthetic datasets, the Lakehurst dataset [3], and the Enron dataset [10]. We also examine how time granularity, denoted by $w$, affects the performance of different methods. In particular, we aggregate $w$ consecutive snapshots by summing the number of edges between each pair of nodes. We present part of the results in this section (details are in [12] and code is available at https://github.com/submissionCode/Temporal-Clustering).
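The aggregation by granularity $w$ and the set-based metrics can be sketched as below. Two assumptions are made: a trailing window with fewer than $w$ snapshots is kept as a shorter window, and precision, recall, and F1 are computed on plain sets of items (for example, the member nodes of a community or the time steps of a lifetime). How predicted and ground-truth communities are matched is not specified here, and the function names are hypothetical.

```python
def aggregate_snapshots(snapshots, w):
    """Sum every w consecutive adjacency matrices.

    snapshots[t][i][j] is the number of edges between nodes i and j at time t.
    """
    n = len(snapshots[0])
    aggregated = []
    for start in range(0, len(snapshots), w):
        window = snapshots[start:start + w]
        aggregated.append([[sum(s[i][j] for s in window) for j in range(n)]
                           for i in range(n)])
    return aggregated


def precision_recall_f1(predicted, truth):
    """Precision, recall, and F1 of a predicted set against a ground-truth set."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```

Larger values of `w` yield denser snapshots but fewer time steps, which is the trade-off examined in the experiments.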

[Figure 1: three panels plotting F-1 Measure against Granularity w (logarithmic axis, 10^0 to 10^3) for TC.R1, TC.R2, TC.R3, BC.R1, BC.R2, BC.R3, and EC: (a) Community Detection, (b) Community Member Detection, (c) Lifetime Detection.]

Figure 1: F1 Measures for Detection of Communities, Community Members, and Lifetimes. K is the number of ground truth communities in a network; temporal clustering (TC) and BC are computed with rank R1 = K, R2 = 0.8K, and R3 = 0.6K. TC outperforms BC and EC with small R. Detection of communities and community members by TC and BC is robust to changes in w. Lifetime accuracy decreases as w increases.

4.1 Evaluation with Data

Synthetic networks We generate 5040 synthetic temporal networks with sizes ranging from 100 to 500 and number of snapshots T ∈ [1000, 4000]. Community sizes range from 8 to 80 and the number of communities ranges from 10 to 400. Fig. 1 illustrates the F1 measure for the experiments.

Lakehurst Data The Lakehurst data set from the US Army Research Lab contains a three-hour (10800 seconds) trace of 70 vehicles (ground and airborne) [3]. 64 vehicles are split into 9 platoons, moving from one checkpoint to another. Platoons have intersecting paths and sometimes form a larger cluster, resulting in a total of 19 communities. The community size ranges from 5 to 25. Vehicles in a platoon move together and are within 150 meters of each other for 99% of the time. We create a time-varying network by adding an edge between two vehicles within 150 meters of each other and construct a 70 × 70 × 10800 tensor. We set R = 10, 15, 20, 25, and 30. Recalls of ground truth communities and nodes are shown in Table 1.

Table 1: Community Recall and Member Recall for Lakehurst Data

Method                 R      Community Recall   Member Recall
Temporal Clustering    10     0.368              0.430
Temporal Clustering    15     0.421              0.541
Temporal Clustering    20     0.579              0.841
Temporal Clustering    25     0.895              0.875
Temporal Clustering    30     1.0                0.904
PARAFAC w/ BC          10     0.319              0.403
PARAFAC w/ BC          15     0.421              0.532
PARAFAC w/ BC          20     0.579              0.841
PARAFAC w/ BC          25     0.895              0.862
PARAFAC w/ BC          30     1.0                0.904
EC                     N/A    0.474              0.275
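The proximity-based network construction described for the Lakehurst data can be sketched as follows. The trace format (one list of planar `(x, y)` positions in meters per second) and the inclusive distance test are assumptions made for this example; the actual trace format is not described here, and the function names are hypothetical.

```python
import math


def proximity_snapshot(positions, radius=150.0):
    """Symmetric 0/1 adjacency matrix linking vehicles within `radius` meters."""
    n = len(positions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj


def proximity_tensor(trace, radius=150.0):
    """One adjacency matrix per time step; trace[t] lists all vehicle positions at t."""
    return [proximity_snapshot(positions, radius) for positions in trace]
```

With 70 vehicles and 10800 one-second time steps, this yields the 70 × 70 × 10800 layout mentioned above, stored as a list of matrices.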

Enron Email Data The Enron email dataset by Priebe et al. [10] contains 184 unique email addresses and 125,409 messages dated from November 1998 to June 2002. We construct a 184 × 184 × 31,592 tensor with a time granularity of one hour. One interesting discovery from our method is the difference in email exchange behavior between communities. Fig. 2 shows the average weekly email exchange rate of two groups.

[Figure 2: two panels plotting the email-generating rate and the threshold (×10^-3) over the days of the week, Mon through Sun: (a) Avg Weekly Email Exchange Rate of Community 1, (b) Avg Weekly Email Exchange Rate of Community 5.]

Figure 2: Different Email Exchanging Behavior between Two Identified Communities. The piecewise linear function for the average weekly email exchange rate of a community consisting of CEOs and vice presidents (a) is significantly larger on weekends than that of a community consisting of a president, director and employees (b). The lifetime of the first community even extends to weekends according to our threshold.
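The hourly binning used to build the Enron tensor can be sketched with a sparse dictionary instead of a dense array. The message format `(sender, receiver, sent_at)`, the choice of the first hour `start`, and the treatment of each message as an undirected edge are assumptions made for illustration; the function name is hypothetical.

```python
from datetime import datetime


def hourly_edge_counts(messages, start):
    """Count messages per (hour index, unordered address pair).

    messages : iterable of (sender, receiver, sent_at) with sent_at a datetime
    start    : datetime that marks the beginning of hour 0
    """
    counts = {}
    for sender, receiver, sent_at in messages:
        if sender == receiver:
            continue  # a self-addressed message adds no edge
        hour = int((sent_at - start).total_seconds() // 3600)
        u, v = sorted((sender, receiver))
        counts[(hour, u, v)] = counts.get((hour, u, v), 0) + 1
    return counts
```

A dense 184 × 184 × 31,592 array would hold more than a billion entries, so storing only the non-zero counts is a natural choice for data of this shape.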

5 Conclusion

We proposed a temporal clustering method based on a network generative model to detect clusters, together with a bottom-up algorithm for time series segmentation to track cluster lifetimes. Experiments show that our method has advantages over EC and PARAFAC decomposition with BC when snapshots are too sparse to provide cluster structure or lifetime information.

References

[1] Rasmus Bro and Henk A. L. Kiers. A new efficient method for determining the number of components in PARAFAC models. Journal of Chemometrics, 17(5):274–286, 2003.
[2] Deepayan Chakrabarti, Ravi Kumar, and Andrew Tomkins. Evolutionary clustering. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 554–560. ACM, 2006.
[3] Yung-Chih Chen, Elisha Rosensweig, Jim Kurose, and Don Towsley. Group detection in mobility traces. In Proceedings of the 6th International Wireless Communications and Mobile Computing Conference, pages 875–879. ACM, 2010.
[4] Wenjie Fu, Le Song, and Eric P. Xing. Dynamic mixed membership blockmodel for evolving networks. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pages 329–336, New York, NY, USA, 2009. ACM.
[5] Laetitia Gauvin, André Panisson, and Ciro Cattuto. Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach. PLoS ONE, 9(1):e86028, 2014.
[6] Inah Jeon, Evangelos E. Papalexakis, Christos Faloutsos, Lee Sael, and U Kang. Mining billion-scale tensors: algorithms and discoveries. The VLDB Journal, 25(4):519–544, 2016.
[7] Stefano Leonardi, Aris Anagnostopoulos, Jakub Lacki, Silvio Lattanzi, and Mohammad Mahdian. Community detection on evolving graphs. In Advances in Neural Information Processing Systems, pages 3522–3530, 2016.
[8] Yu-Ru Lin, Yun Chi, Shenghuo Zhu, Hari Sundaram, and Belle L. Tseng. FacetNet: a framework for analyzing communities and their evolutions in dynamic networks. In Proceedings of the 17th International Conference on World Wide Web, pages 685–694. ACM, 2008.
[9] Evangelos E. Papalexakis, Konstantinos Pelechrinis, and Christos Faloutsos. Location based social network analysis using tensors and signal processing tools. In Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2015 IEEE 6th International Workshop on, pages 93–96. IEEE, 2015.
[10] Carey E. Priebe, John M. Conroy, David J. Marchette, and Youngser Park. Scan statistics on Enron graphs. Computational and Mathematical Organization Theory, 11(3):229–247, 2005.
[11] Lei Tang, Huan Liu, Jianping Zhang, and Zohreh Nazeri. Community evolution in dynamic multi-mode networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 677–685. ACM, 2008.
[12] Kun Tu, Bruno Ribeiro, Ananthram Swami, and Don Towsley. Temporal clustering in dynamic networks with tensor decomposition. arXiv preprint arXiv:1605.08074, 2016.
[13] Kevin S. Xu, Mark Kliger, and Alfred O. Hero. Evolutionary spectral clustering with adaptive forgetting factor. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on, pages 2174–2177. IEEE, 2010.
[14] Yangyang Xu and Wotao Yin. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on Imaging Sciences, 6(3):1758–1789, 2013.
[15] Tianbao Yang, Yun Chi, Shenghuo Zhu, Yihong Gong, and Rong Jin. Detecting communities and their evolutions in dynamic social networks: a Bayesian approach. Machine Learning, 82(2):157–189, 2011.
[16] Wenchao Yu, Charu C. Aggarwal, and Wei Wang. Temporally factorized network modeling for evolutionary network analysis. In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pages 455–464. ACM, 2017.
