Spectral Embedded Clustering

Feiping Nie (1,2), Dong Xu (2), Ivor Wai-Hung Tsang (2), and Changshui Zhang (1)
(1) State Key Laboratory of Intelligent Technology and Systems, Department of Automation, Tsinghua University, Beijing 100080, China
(2) School of Computer Engineering, Nanyang Technological University, Singapore
[email protected]; [email protected]; [email protected]; [email protected]

Theorem 1. If $\mathrm{rank}(S_b) = c - 1$ and $\mathrm{rank}(S_t) = \mathrm{rank}(S_w) + \mathrm{rank}(S_b)$, then the true cluster assignment matrix can be represented by a low-dimensional linear mapping of the data; that is, there exist $W \in \mathbb{R}^{d \times c}$ and $b \in \mathbb{R}^{c \times 1}$ such that $Y = X^T W + \mathbf{1}_n b^T$.
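For readability, the relations that the derivation below relies on can be summarized as follows (the full definitions of $S_t$, $S_w$, $S_b$ and of the matrix $G$ are given in the main text; they are restated here only as the assumptions the proof uses, with $X \in \mathbb{R}^{d \times n}$ assumed centered so that $X\mathbf{1}_n = 0$):

$$S_t = X X^T, \qquad S_b = X G G^T X^T, \qquad S_w = S_t - S_b.$$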

A Proof of Theorem 1

Suppose the eigenvalue decomposition of $S_t$ is
$$S_t = [U_1, U_0] \begin{bmatrix} \Lambda_t^2 & 0 \\ 0 & 0 \end{bmatrix} [U_1, U_0]^T.$$
Let $B = \Lambda_t^{-1} U_1^T S_b U_1 \Lambda_t^{-1}$. Suppose the eigenvalue decomposition of $B$ is $B = V_b \Lambda_b V_b^T$, and let $P = [U_1 \Lambda_t^{-1} V_b, U_0]$. According to [Ye, 2007], if $\mathrm{rank}(S_t) = \mathrm{rank}(S_w) + \mathrm{rank}(S_b)$ holds, then
$$P^T S_t P = \begin{bmatrix} I_t & 0 \\ 0 & 0 \end{bmatrix} = D_t \quad\text{and}\quad P^T S_b P = \begin{bmatrix} I_b & 0 \\ 0 & 0 \end{bmatrix} = D_b,$$
where $I_t \in \mathbb{R}^{r_t \times r_t}$ and $I_b \in \mathbb{R}^{r_b \times r_b}$ are identity matrices, $r_t$ is the rank of $S_t$, $r_b$ is the rank of $S_b$, and $r_b \le r_t$. According to [Ye, 2005], $S_t^+ = P D_t P^T$. Then we have
$$S_w S_t^+ S_b = S_t S_t^+ S_b - S_b S_t^+ S_b = P^{-T} P^T S_t S_t^+ S_b P P^{-1} - P^{-T} P^T S_b S_t^+ S_b P P^{-1} = P^{-T} D_t D_t D_b P^{-1} - P^{-T} D_b D_t D_b P^{-1} = 0.$$
Note that $S_b = X G G^T X^T$. Therefore, we have
$$S_w S_t^+ X G G^T X^T = 0 \;\Rightarrow\; S_w S_t^+ X G (S_w S_t^+ X G)^T = 0 \;\Rightarrow\; S_w S_t^+ X G = 0 \;\Rightarrow\; S_w S_t^+ X Y = 0.$$
Define $W_0 = S_t^+ X Y$; then $S_w W_0 = 0$. Therefore, $W_0$ lies in the null space of $S_w$, and all the data belonging to the same class are projected onto the same point under the projection $W_0$ [Cevikalp et al., 2005]. Thus we have
$$\forall i,\;\; y_i = [\underbrace{0, \dots, 0}_{j-1}, 1, \underbrace{0, \dots, 0}_{c-j}]^T \;\Rightarrow\; x_i^T W_0 = \bar{x}_j^T W_0, \qquad (1)$$
where $y_i^T$ is the $i$-th row of the true cluster assignment matrix $Y$ and $\bar{x}_j$ is the mean of the data belonging to class $j$.
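A brief justification of the cited null-space property (our own restatement, assuming the usual within-class scatter definition $S_w = \sum_{j=1}^{c} \sum_{x_i \in \pi_j} (x_i - \bar{x}_j)(x_i - \bar{x}_j)^T$, where $\pi_j$ denotes class $j$): for any column $w$ of $W_0$, $S_w W_0 = 0$ gives

$$0 = w^T S_w w = \sum_{j=1}^{c} \sum_{x_i \in \pi_j} \big(w^T (x_i - \bar{x}_j)\big)^2 \;\Rightarrow\; w^T x_i = w^T \bar{x}_j \;\text{ for all } x_i \in \pi_j,$$

which, applied column-wise to $W_0$, is exactly the implication in (1).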

Denote $\bar{X}_c = [\bar{x}_1, \dots, \bar{x}_c]$. Note that $\bar{X}_c = X Y \Sigma$, where $\Sigma \in \mathbb{R}^{c \times c}$ is a diagonal matrix whose $i$-th diagonal element is $1/n_i$, and $n_i$ is the number of data belonging to class $i$. Then
$$\mathrm{rank}(\bar{X}_c^T W_0) = \mathrm{rank}(\Sigma Y^T X^T (X X^T)^+ X Y) = \mathrm{rank}((X X^T)^+ X Y) = \mathrm{rank}(S_b) = c - 1.$$
Denote $Q = \bar{X}_c^T W_0 + \mathbf{1}_c \mathbf{1}_c^T$. Note that $Y \mathbf{1}_c = \mathbf{1}_n$ and $X \mathbf{1}_n = 0$, so $\bar{X}_c^T W_0 \mathbf{1}_c = 0$ and $\mathbf{1}_c^T \Sigma^{-1} \bar{X}_c^T W_0 = 0$. Thus the vector $\mathbf{1}_c$ is linearly independent of the rows and of the columns of $\bar{X}_c^T W_0$. Suppose the full-rank decomposition of $\bar{X}_c^T W_0$ is $\bar{X}_c^T W_0 = Q_1 Q_2^T$, where $Q_1, Q_2 \in \mathbb{R}^{c \times (c-1)}$ are column full-rank matrices. Then $Q = Q_1 Q_2^T + \mathbf{1}_c \mathbf{1}_c^T = [Q_1, \mathbf{1}_c][Q_2, \mathbf{1}_c]^T$. As $[Q_1, \mathbf{1}_c]$ and $[Q_2, \mathbf{1}_c]$ are both full-rank matrices, $Q$ is invertible. Hence we have
$$Q Q^{-1} = I \;\Rightarrow\; (\bar{X}_c^T W_0 + \mathbf{1}_c \mathbf{1}_c^T) Q^{-1} = I \;\Rightarrow\; \bar{X}_c^T W_0 Q^{-1} + \mathbf{1}_c (Q^{-T} \mathbf{1}_c)^T = I.$$
Let $W = W_0 Q^{-1}$ and $b = Q^{-T} \mathbf{1}_c$. Then $\bar{X}_c^T W + \mathbf{1}_c b^T = I$. According to (1), for any $x_i$ belonging to class $j$ we have $x_i^T W + b^T = \bar{x}_j^T W + b^T$, which is the $j$-th row of $\bar{X}_c^T W + \mathbf{1}_c b^T = I$, i.e., $y_i^T$; stacking these rows over all $i$ gives $X^T W + \mathbf{1}_n b^T = Y$. Therefore, if $\mathrm{rank}(S_b) = c - 1$ and $\mathrm{rank}(S_t) = \mathrm{rank}(S_w) + \mathrm{rank}(S_b)$, there exist $W \in \mathbb{R}^{d \times c}$ and $b \in \mathbb{R}^{c \times 1}$ such that $Y = X^T W + \mathbf{1}_n b^T$. $\Box$
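As a sanity check of the theorem (not part of the original proof), the construction $W = S_t^+ X Y Q^{-1}$, $b = Q^{-T} \mathbf{1}_c$ can be verified numerically. The following NumPy sketch assumes randomly generated centered data with $d > n$, for which the two rank conditions hold generically; all variable names are illustrative.

```python
import numpy as np

# Illustrative check of Theorem 1: for generic centered data with d > n,
# rank(St) = rank(Sw) + rank(Sb) and rank(Sb) = c - 1 hold, so Y should
# equal X^T W + 1_n b^T with W and b built as in the proof.
rng = np.random.default_rng(0)
n, d, c = 30, 100, 3
labels = np.repeat(np.arange(c), n // c)       # 10 points per class
Y = np.eye(c)[labels]                          # n x c cluster assignment matrix
X = rng.standard_normal((d, n))
X -= X.mean(axis=1, keepdims=True)             # center the data: X @ 1_n = 0

St = X @ X.T                                   # total scatter of centered data
W0 = np.linalg.pinv(St) @ X @ Y                # W0 = St^+ X Y
n_j = Y.sum(axis=0)                            # class sizes
Xc = (X @ Y) / n_j                             # class means, column j is the mean of class j
ones_c = np.ones((c, 1))

Q = Xc.T @ W0 + ones_c @ ones_c.T              # Q = Xc^T W0 + 1_c 1_c^T
W = W0 @ np.linalg.inv(Q)                      # W = W0 Q^{-1}
b = np.linalg.solve(Q.T, ones_c)               # b = Q^{-T} 1_c

ones_n = np.ones((n, 1))
print(np.allclose(X.T @ W + ones_n @ b.T, Y))  # expected output: True
```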

References

[Cevikalp et al., 2005] Hakan Cevikalp, Marian Neamtu, Mitch Wilkes, and Atalay Barkana. Discriminative common vectors for face recognition. IEEE Trans. Pattern Anal. Mach. Intell., 27(1):4–13, 2005.
[Ye, 2005] Jieping Ye. Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems. Journal of Machine Learning Research, 6:483–502, 2005.
[Ye, 2007] Jieping Ye. Least squares linear discriminant analysis. In ICML, pages 1087–1093, 2007.
