Generalized compressive sensing matching pursuit algorithm

Nam H. Nguyen, Sang Chin and Trac D. Tran

In this short note, we present a generalized greedy approach, in particular a generalization of the well-known compressive sensing matching pursuit (CoSaMP) algorithm [2], that handles a general loss function $L(x)$. The problem is to minimize a loss function subject to a sparsity constraint:

$$ \min_{x} \; L(x) \quad \text{subject to} \quad \|x\|_0 \le k. \qquad (1) $$

This formulation finds applications in graphical model selection and sparse covariance matrix estimation. In the case of linear regression and compressed sensing, where the observation vector is $y = Ax^\star + \nu$ with sensing matrix $A \in \mathbb{R}^{n \times p}$, the objective function is often quadratic: $L(x) = \|y - Ax\|_2^2$. The algorithm is presented below. It differs from CoSaMP [2] only in step 3, where instead of taking the correlation between $A$ and the error residual, we take the gradient of the loss function; when the loss function is quadratic, the two algorithms coincide. In this short note, we show that the CoSaMP algorithm still enjoys efficient convergence for a general loss function.

1: Initialization: $x^0$ and $t = 1$
2: while not converged do
3:   $u = \nabla L(x^{t-1})$
4:   $\Gamma = \operatorname{supp}(u_{2k})$
5:   $\hat\Gamma = \Gamma \cup \Lambda$
6:   $b = \operatorname{argmin}_x L(x)$ subject to $\operatorname{supp}(x) \subseteq \hat\Gamma$
7:   $\Lambda = \operatorname{supp}(b_k)$
8:   $x^t_\Lambda = b_\Lambda$, $x^t_{\Lambda^c} = 0$
9:   $t = t + 1$
10: end while
11: Output: $\hat{x} = x^t$

Algorithm 1: Generalized CoSaMP algorithm
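To make the procedure concrete, here is a minimal Python sketch of Algorithm 1. It is an editorial illustration rather than the authors' implementation: the helper names (`largest_support`, `quadratic_oracles`), the fixed iteration count used in place of the convergence test, and the least-squares solver for the restricted minimization in the quadratic case are all assumptions.

```python
import numpy as np

def largest_support(v, s):
    """Return the indices of the s largest-magnitude entries of v."""
    return np.argsort(np.abs(v))[::-1][:s]

def generalized_cosamp(grad_L, argmin_on_support, p, k, n_iter=50):
    """Minimal sketch of Algorithm 1 (generalized CoSaMP).

    grad_L(x)            : gradient of the loss L at x
    argmin_on_support(S) : solves min_x L(x) subject to supp(x) contained in S
    p, k                 : ambient dimension and target sparsity
    A fixed iteration count stands in for the "while not converged" test.
    """
    x = np.zeros(p)
    Lam = np.array([], dtype=int)              # current support estimate
    for _ in range(n_iter):
        u = grad_L(x)                          # step 3: gradient replaces A^T(residual)
        Gamma = largest_support(u, 2 * k)      # step 4: 2k largest gradient entries
        Gamma_hat = np.union1d(Gamma, Lam)     # step 5: merge with previous support
        b = argmin_on_support(Gamma_hat)       # step 6: restricted minimization
        Lam = largest_support(b, k)            # step 7: prune to the k largest entries
        x = np.zeros(p)
        x[Lam] = b[Lam]                        # step 8: k-sparse iterate
    return x

def quadratic_oracles(A, y):
    """Oracles for the quadratic loss L(x) = ||y - Ax||_2^2 (compressed sensing case)."""
    grad_L = lambda x: -2.0 * A.T @ (y - A @ x)
    def argmin_on_support(S):
        S = np.asarray(S, dtype=int)
        b = np.zeros(A.shape[1])
        sol, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)
        b[S] = sol
        return b
    return grad_L, argmin_on_support

# Usage sketch: x_hat = generalized_cosamp(*quadratic_oracles(A, y), p=A.shape[1], k=k)
```

For the quadratic loss the gradient is $-2A^*(y - Ax)$, so the selection step reduces to correlating the columns of $A$ with the residual, which is exactly the proxy step of the original CoSaMP.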

To show the convergence of the algorithm, assumptions on the loss function are required. These assumptions, called restricted strong convexity (RSC) [3] and restricted strong smoothness (RSS) [?], are defined as follows.

Definition 1 (Restricted strong convexity (RSC)). The loss function $L$ satisfies restricted strong convexity with parameter $\rho^-_k$ if

$$ L(x + h) - L(x) - \langle \nabla L(x), h \rangle \ge \rho^-_k \|h\|_2^2 \qquad (2) $$

for every $h \in \mathbb{R}^p$ obeying $|\operatorname{supp}(h)| \le k$.

We also specify another notion, restricted strong smoothness.

Definition 2 (Restricted strong smoothness (RSS)). The loss function $L$ satisfies restricted strong smoothness with parameter $\rho^+_k$ if

$$ L(x + h) - L(x) - \langle \nabla L(x), h \rangle \le \rho^+_k \|h\|_2^2 \qquad (3) $$

for every $h \in \mathbb{R}^p$ obeying $|\operatorname{supp}(h)| \le k$.

The RSC and RSS conditions imposed on the loss function are central to showing the convergence of the algorithm. In fact, when the loss function is quadratic, these two assumptions reduce to the well-known restricted isometry property (RIP). Greedy algorithms such as CoSaMP [2] and ADMiRA [1] require $\delta_{4k} \le 0.1$ and $\delta_{4r} < 0.05$, respectively, to guarantee exponential convergence.
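To see concretely why the quadratic loss brings these two conditions back to the RIP (a short editorial verification, assuming $L(x) = \|y - Ax\|_2^2$ with gradient $\nabla L(x) = -2A^*(y - Ax)$): expanding the quadratic gives

$$ L(x + h) - L(x) - \langle \nabla L(x), h \rangle = \|Ah\|_2^2, $$

so if $A$ satisfies the RIP, $(1 - \delta_k)\|h\|_2^2 \le \|Ah\|_2^2 \le (1 + \delta_k)\|h\|_2^2$ for all $k$-sparse $h$, then Definitions 1 and 2 hold with $\rho^-_k = 1 - \delta_k$ and $\rho^+_k = 1 + \delta_k$.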



Theorem 1. Denote $\lambda = \|\nabla L(x^\star)\|_\infty$. The error at the $t$-th iteration is bounded by

$$ \|x^t - x^\star\|_2 \le \gamma \|x^{t-1} - x^\star\|_2 + \lambda\sqrt{k}\left( \frac{\sqrt{2}+1}{\rho^-_{4k}} + \frac{3}{\sqrt{2\rho^-_{4k}\rho^+_k}} \right), \qquad (4) $$

where $\gamma$ is defined as $\gamma := 2\sqrt{\dfrac{\rho^+_k(\rho^+_{2k} - \rho^-_{2k})}{2\rho^-_{4k}\rho^-_{4k}}}$.

The following corollary is a consequence of Theorem 1.

Corollary 1. Denote by $x^0$ the initialization. With $\lambda$ and $\gamma$ defined in Theorem 1, we have

$$ \|x^t - x^\star\|_2 \le \gamma^t \|x^0 - x^\star\|_2 + \frac{\lambda\sqrt{k}}{1-\gamma}\left( \frac{\sqrt{2}+1}{\rho^-_{4k}} + \frac{3}{\sqrt{2\rho^-_{4k}\rho^+_k}} \right). $$

In particular, assuming that $\gamma \le 0.5$, then for any target accuracy $\epsilon > 0$, after $t = \log\left( \|x^0 - x^\star\|_2 / \epsilon \right)$ iterations

$$ \|x^t - x^\star\|_2 \le 2\max\left\{ \epsilon, \; c\lambda\sqrt{k} \right\}, $$

where $c = 2\left( \dfrac{\sqrt{2}+1}{\rho^-_{4k}} + \dfrac{3}{\sqrt{2\rho^-_{4k}\rho^+_k}} \right)$.
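For completeness, the first bound in Corollary 1 is obtained by unrolling the recursion of Theorem 1 and summing the geometric series (this intermediate step is spelled out here editorially):

$$ \|x^t - x^\star\|_2 \le \gamma^t \|x^0 - x^\star\|_2 + \tau \sum_{i=0}^{t-1} \gamma^i \le \gamma^t \|x^0 - x^\star\|_2 + \frac{\tau}{1-\gamma}, \qquad \text{where } \tau := \lambda\sqrt{k}\left( \frac{\sqrt{2}+1}{\rho^-_{4k}} + \frac{3}{\sqrt{2\rho^-_{4k}\rho^+_k}} \right) $$

is the additive term in (4). When $\gamma \le 0.5$, we further have $\tau/(1-\gamma) \le c\lambda\sqrt{k}$ and $\gamma^t \|x^0 - x^\star\|_2 \le \epsilon$ for the stated number of iterations, which gives the second bound.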

Remark.

1) When $L(x)$ is the quadratic loss, $\nabla L(x^\star) = 2A^*(Ax^\star - y) = -2A^*\nu$, so, up to constant factors, $\|\nabla L(x^\star)\|_\infty = \|A^*\nu\|_\infty$. For statistical noise $\nu \sim \mathcal{N}(0, \sigma^2 I)$, one can easily see that $\|A^*\nu\|_\infty \lesssim \sqrt{\frac{\log p}{n}}$, assuming the columns of $A$ are normalized to be unit-normed (see the numerical sketch after these remarks). Thus,

$$ \|x^t - x^\star\|_2 \le 2\max\left\{ \epsilon, \; c\sqrt{\frac{k\log p}{n}} \right\}. $$

This error bound is similar to what we can obtain via the Lasso program.

2) If $\min_{i \in T} |x^\star_i| > 2\max\{\epsilon, \, c\lambda\sqrt{k}\}$, then after $t$ iterations, $x^t$ has the same support as the original solution $x^\star$.

3) In compressed sensing, the measurement vector is $y = Ax$ and the loss function is often quadratic. An important assumption frequently made on the sensing matrix $A$ is the restricted isometry property (RIP), defined via the smallest constant $\delta_k$ such that $(1 - \delta_k)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_k)\|x\|_2^2$ for all vectors $x$ satisfying $|\operatorname{supp}(x)| \le k$. As shown in the original paper of Needell and Tropp [2], for the CoSaMP algorithm to work it is required that $\delta_{4k} \le 0.1$. If we substitute $\rho^-_k = 1 - \delta_k$ and $\rho^+_k = 1 + \delta_k$, then the condition $\gamma \le 0.5$ in Corollary 1 can be interpreted as $\delta_{4k} \le 0.1$.
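As a quick numerical illustration of the noise level $\lambda$ discussed in Remark 1 (an editorial sketch: the dimensions, the random Gaussian $A$, and the noise scale $\sigma = 1/\sqrt{n}$ chosen so that the $\sqrt{\log p / n}$ rate appears are assumptions, not taken from the note):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 1000                       # assumed problem sizes
sigma = 1.0 / np.sqrt(n)               # assumed noise scale, chosen to expose the sqrt(log p / n) rate

# Sensing matrix with unit-norm columns, as assumed in Remark 1.
A = rng.standard_normal((n, p))
A /= np.linalg.norm(A, axis=0)

nu = sigma * rng.standard_normal(n)    # nu ~ N(0, sigma^2 I)

# For the quadratic loss, lambda = ||grad L(x_star)||_inf equals ||A^T nu||_inf up to a constant factor.
lam = np.abs(A.T @ nu).max()
print(f"||A^T nu||_inf      ~= {lam:.4f}")
print(f"sqrt(2 log p / n)   ~= {np.sqrt(2 * np.log(p) / n):.4f}")
```

The printed values are typically of the same order, matching the scaling claimed in Remark 1.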


For the rest of this section we focus on proving Theorem 1. We need the following two lemmas, whose proofs are deferred to the end of the section.

Lemma 1. We have

$$ \|b - x^\star\|_2 \le \sqrt{\frac{\rho^+_k}{\rho^-_{4k}}}\, \|x^\star_{\hat\Gamma^c}\|_2 + \lambda\sqrt{k}\left( \frac{1}{\rho^-_{4k}} + \frac{3}{\sqrt{2\rho^-_{4k}\rho^+_k}} \right). $$

Lemma 2. Denote by $R$ the support of the vector $x^{t-1} - x^\star$. Then

$$ \|(x^{t-1} - x^\star)_{R\setminus\Gamma}\|_2 \le \sqrt{\frac{\rho^+_{2k} - \rho^-_{2k}}{2\rho^-_{4k}}}\, \|x^{t-1} - x^\star\|_2 + \frac{\lambda\sqrt{2k}}{\rho^-_{4k}}. $$

With these two lemmas at hand, we now prove the main theorem. We have

$$ \|x^t - x^\star\|_2 \le \|b - x^\star\|_2 + \|b - x^t\|_2 \le 2\|b - x^\star\|_2, \qquad (5) $$

where the last inequality is due to the construction of $x^t$: since $x^t$ keeps the $k$ largest entries of $b$, it is the best $k$-sparse approximation of $b$, and $x^\star$ is itself $k$-sparse, so

$$ \|x^t - b\|_2 \le \|x^\star - b\|_2. $$

Applying Lemma 1, it now suffices to upper bound $\|x^\star_{\hat\Gamma^c}\|_2$. From the algorithm procedure, the support of $x^{t-1}$ is contained in the set $\hat\Gamma$. Thus $x^{t-1}_{\hat\Gamma^c} = 0$, and we have

$$ \|x^\star_{\hat\Gamma^c}\|_2 = \|(x^\star - x^{t-1})_{\hat\Gamma^c}\|_2 \le \|(x^\star - x^{t-1})_{\Gamma^c}\|_2 = \|(x^\star - x^{t-1})_{R\setminus\Gamma}\|_2, \qquad (6) $$

where $R$ denotes the support of $x^{t-1} - x^\star$. Applying Lemma 2 completes the proof of Theorem 1.

Proof of Lemma 1. By RSC we have

$$ \rho^-_{4k}\|b - x^\star\|_2^2 + \langle \nabla L(x^\star), b - x^\star \rangle \le L(b) - L(x^\star) \le L(x^\star_{\hat\Gamma}) - L(x^\star), $$

where the second inequality holds because $x^\star_{\hat\Gamma}$ is feasible for the restricted minimization defining $b$. Moreover,

$$ L(x^\star_{\hat\Gamma}) - L(x^\star) = \Big[ L(x^\star_{\hat\Gamma}) - L(x^\star) - \langle \nabla L(x^\star), x^\star_{\hat\Gamma} - x^\star \rangle \Big] + \langle \nabla L(x^\star), x^\star_{\hat\Gamma} - x^\star \rangle $$
$$ \le \rho^+_k \|x^\star_{\hat\Gamma} - x^\star\|_2^2 + \|\nabla L(x^\star)\|_\infty \|x^\star_{\hat\Gamma} - x^\star\|_1 $$
$$ = \rho^+_k \|x^\star_{\hat\Gamma^c}\|_2^2 + \lambda\sqrt{k}\, \|x^\star_{\hat\Gamma^c}\|_2 $$
$$ = \rho^+_k \left( \|x^\star_{\hat\Gamma^c}\|_2 + \frac{\lambda\sqrt{k}}{2\rho^+_k} \right)^2 - \frac{\lambda^2 k}{4\rho^+_k}, $$

where the first inequality uses RSS. The left-hand side is lower bounded by

$$ \rho^-_{4k}\|b - x^\star\|_2^2 + \langle \nabla L(x^\star), b - x^\star \rangle \ge \rho^-_{4k}\|b - x^\star\|_2^2 - \|\nabla L(x^\star)\|_\infty \|b - x^\star\|_1 $$
$$ \ge \rho^-_{4k}\|b - x^\star\|_2^2 - \lambda\sqrt{4k}\, \|b - x^\star\|_2 $$
$$ = \rho^-_{4k}\left( \|b - x^\star\|_2 - \frac{\lambda\sqrt{4k}}{2\rho^-_{4k}} \right)^2 - \frac{\lambda^2 k}{\rho^-_{4k}}. $$

Combining these pieces together, after some simple algebra, we conclude that

$$ \|b - x^\star\|_2 \le \sqrt{\frac{\rho^+_k}{\rho^-_{4k}}}\, \|x^\star_{\hat\Gamma^c}\|_2 + \lambda\sqrt{k}\left( \frac{1}{\rho^-_{4k}} + \frac{3}{\sqrt{2\rho^-_{4k}\rho^+_k}} \right), \qquad (7) $$

as claimed.

Proof of Lemma 2. Denote $\Delta = x^\star - x^{t-1}$ and notice that $\operatorname{supp}(\Delta) = R$. By RSC we have

$$ L(x^{t-1} + \Delta) - L(x^{t-1}) - \rho^-_{2k}\|\Delta\|_2^2 \ge \langle \nabla L(x^{t-1}), \Delta \rangle $$
$$ = \langle \nabla_{R\cap\Gamma} L(x^{t-1}), \Delta_{R\cap\Gamma} \rangle + \langle \nabla_{R\setminus\Gamma} L(x^{t-1}), \Delta_{R\setminus\Gamma} \rangle $$
$$ \ge \langle \nabla_{R\cap\Gamma} L(x^{t-1}), \Delta_{R\cap\Gamma} \rangle - \|\nabla_{R\setminus\Gamma} L(x^{t-1})\|_2\, \|\Delta_{R\setminus\Gamma}\|_2. $$

It is clear from the construction of the subset $\Gamma$ that $\|\nabla_{R\setminus\Gamma} L(x^{t-1})\|_2 \le \|\nabla_{\Gamma\setminus R} L(x^{t-1})\|_2$. In addition, the cardinality of the subset $R$ is at most $2k$ while $|\Gamma| = 2k$; thus $|R\setminus\Gamma| \le |\Gamma\setminus R|$. Now we define a vector $g \in \mathbb{R}^p$ whose entries outside the set $\Gamma\setminus R$ are zero and

$$ g_{\Gamma\setminus R} = -\|\Delta_{R\setminus\Gamma}\|_2\, \frac{\nabla_{\Gamma\setminus R} L(x^{t-1})}{\|\nabla_{\Gamma\setminus R} L(x^{t-1})\|_2}. $$

From this construction, we have $\|g_{\Gamma\setminus R}\|_2 = \|\Delta_{R\setminus\Gamma}\|_2$. Thus,

$$ -\|\nabla_{R\setminus\Gamma} L(x^{t-1})\|_2\, \|\Delta_{R\setminus\Gamma}\|_2 \ge -\|\nabla_{\Gamma\setminus R} L(x^{t-1})\|_2\, \|\Delta_{R\setminus\Gamma}\|_2 = \langle \nabla_{\Gamma\setminus R} L(x^{t-1}), g_{\Gamma\setminus R} \rangle, $$

where the second identity follows from the construction of $g_{\Gamma\setminus R}$. Combining these pieces allows us to bound

$$ L(x^{t-1} + \Delta) - L(x^{t-1}) - \rho^-_{2k}\|\Delta\|_2^2 \ge \langle \nabla_{R\cap\Gamma} L(x^{t-1}), \Delta_{R\cap\Gamma} \rangle + \langle \nabla_{\Gamma\setminus R} L(x^{t-1}), g_{\Gamma\setminus R} \rangle = \langle \nabla_{\Gamma} L(x^{t-1}), z_{\Gamma} \rangle = \langle \nabla L(x^{t-1}), z \rangle, $$

where $z$ is a sparse vector whose support is contained in $\Gamma$ and $z_{\Gamma} = [\Delta^T_{R\cap\Gamma},\; g^T_{\Gamma\setminus R}]^T$. Furthermore, by RSS, the right-hand side is lower bounded by

$$ \langle \nabla L(x^{t-1}), z \rangle \ge L(x^{t-1} + z) - L(x^{t-1}) - \rho^+_{2k}\|z\|_2^2. $$

Combining these two pieces, we get

$$ \rho^+_{2k}\|z\|_2^2 - \rho^-_{2k}\|\Delta\|_2^2 \ge L(x^{t-1} + z) - L(x^{t-1} + \Delta) = L(x^{t-1} + z) - L(x^\star). $$

It is clear from the construction of $z$ that $\|z\|_2 = \|\Delta\|_2$; thus the left-hand side equals $(\rho^+_{2k} - \rho^-_{2k})\|\Delta\|_2^2$. The right-hand side can be lower bounded by RSC:

$$ L(x^{t-1} + z) - L(x^\star) \ge \langle \nabla L(x^\star), z - \Delta \rangle + \rho^-_{4k}\|z - \Delta\|_2^2 $$
$$ \ge -\|\nabla L(x^\star)\|_\infty \|z - \Delta\|_1 + \rho^-_{4k}\|z - \Delta\|_2^2 $$
$$ \ge -\lambda\sqrt{4k}\, \|z - \Delta\|_2 + \rho^-_{4k}\|z - \Delta\|_2^2 $$
$$ = \rho^-_{4k}\left( \|z - \Delta\|_2 - \frac{\lambda\sqrt{4k}}{2\rho^-_{4k}} \right)^2 - \frac{\lambda^2 k}{\rho^-_{4k}}. $$

Therefore, we have

$$ (\rho^+_{2k} - \rho^-_{2k})\|\Delta\|_2^2 \ge \rho^-_{4k}\left( \|z - \Delta\|_2 - \frac{\lambda\sqrt{4k}}{2\rho^-_{4k}} \right)^2 - \frac{\lambda^2 k}{\rho^-_{4k}}. $$

We notice that by the constructions of the vectors $g$ and $z$,

$$ \|z - \Delta\|_2^2 = \|\Delta_{R\setminus\Gamma}\|_2^2 + \|g_{\Gamma\setminus R}\|_2^2 = 2\|\Delta_{R\setminus\Gamma}\|_2^2. \qquad (8) $$

Therefore, we conclude that

$$ \|\Delta_{R\setminus\Gamma}\|_2 \le \sqrt{\frac{\rho^+_{2k} - \rho^-_{2k}}{2\rho^-_{4k}}}\, \|\Delta\|_2 + \frac{\lambda\sqrt{2k}}{\rho^-_{4k}}, \qquad (9) $$

which establishes Lemma 2.

References

[1] K. Lee and Y. Bresler. ADMiRA: Atomic decomposition for minimum rank approximation. IEEE Trans. Inf. Theory, 56(9):4402-4416, 2010.

[2] D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmon. Anal., 26:301-321, 2008.

[3] S. Negahban, P. Ravikumar, M. J. Wainwright, and B. Yu. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. In Advances in Neural Information Processing Systems (NIPS), Vancouver, BC, Canada, Dec. 2009.

