Appl. Comput. Harmon. Anal. 30 (2011) 402–406


Letter to the Editor

The null space property for sparse recovery from multiple measurement vectors

Ming-Jun Lai 1, Yang Liu *

Department of Mathematics, The University of Georgia, Athens, GA 30602, United States

Article history: Available online 10 November 2010. Communicated by Naoki Saito on 3 January 2010.

Abstract

We prove a null space property for the uniqueness of the sparse solution vectors recovered from a minimization in the $\ell_p$ quasi-norm subject to multiple systems of linear equations, where $p\in(0,1]$. Furthermore, we show that this null space property is equivalent to the null space property for the standard $\ell_p$ minimization subject to a single linear system. This answers the questions raised in Foucart and Gribonval (2010) [17]. © 2010 Elsevier Inc. All rights reserved.

Keywords: Sparse recovery; Optimization

1. Introduction

Recently, one of the central problems in compressed sensing, the recovery of sparse solutions of under-determined linear systems, has been extended to the recovery of sparse solution vectors from multiple measurement vectors (MMV). That is, letting $A$ be a sensing matrix of size $m\times N$ with $m\ll N$, and given multiple measurement vectors $b^{(k)}$, $k=1,\dots,r$, we look for solution vectors $x^{(k)}$, $k=1,\dots,r$, such that

$$Ax^{(k)} = b^{(k)}, \quad k=1,\dots,r \tag{1}$$

and the vectors $x^{(k)}$, $k=1,\dots,r$, are jointly sparse, i.e., they have nonzero entries at the same locations and have as few nonzero entries as possible. Such problems arise in source localization (cf. [24]), neuromagnetic imaging (cf. [12]), and equalization of sparse communication channels (cf. [13,15]). A popular approach to finding the sparse solutions for multiple measurement vectors is to solve the following optimization problem:

$$\operatorname*{minimize}_{\substack{x^{(k)}\in\mathbb{R}^N\\ k=1,\dots,r}} \left\{ \left( \sum_{j=1}^{N} \big\| (x_{1,j},\dots,x_{r,j}) \big\|_q^p \right)^{1/p} : \text{subject to } Ax^{(k)} = b^{(k)},\ k=1,\dots,r \right\} \tag{2}$$

for $q\ge 1$ and $p\ge 1$, where $x^{(k)} = (x_{k,1},\dots,x_{k,N})^T$ for all $k=1,\dots,r$, and $\|(x_1,\dots,x_r)\|_q = (\sum_{j=1}^{r} |x_j|^q)^{1/q}$ is the standard $\ell_q$ norm. Clearly, this is a generalization of the standard $\ell_1$ minimization approach for the sparse solution. That is, when $r=1$, one finds the sparse solution $x$ by solving the following minimization problem:

$$\operatorname*{minimize}_{x\in\mathbb{R}^N} \big\{ \|x\|_1 : \text{subject to } Ax = b \big\}, \tag{3}$$

* Corresponding author. E-mail addresses: [email protected] (M.-J. Lai), [email protected] (Y. Liu).
1 This author is partly supported by the National Science Foundation under grant DMS-0713807.

1063-5203/$ – see front matter © 2010 Elsevier Inc. All rights reserved.
doi:10.1016/j.acha.2010.11.002
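For concreteness, the $\ell_1$ problem (3) can be solved as a linear program by splitting $x = u - v$ with $u, v \ge 0$, so that $\|x\|_1 = \mathbf{1}^T(u+v)$ at the optimum. The following sketch is our own illustration (not part of the paper); the toy matrix and data are made up, and SciPy's `linprog` is used as the LP solver:

```python
import numpy as np
from scipy.optimize import linprog

def l1_minimize(A, b):
    """Solve min ||x||_1 subject to Ax = b via the LP reformulation
    x = u - v with u, v >= 0, minimizing 1^T u + 1^T v."""
    m, N = A.shape
    c = np.ones(2 * N)            # objective: sum(u) + sum(v)
    A_eq = np.hstack([A, -A])     # equality constraint: A u - A v = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    uv = res.x
    return uv[:N] - uv[N:]

# A toy instance: a 1-sparse x0 and its measurements (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 10))
x0 = np.zeros(10)
x0[3] = 2.0
b = A @ x0
x = l1_minimize(A, b)
# The minimizer is feasible and its l1 norm does not exceed that of x0.
print(np.allclose(A @ x, b, atol=1e-6), np.sum(np.abs(x)) <= np.sum(np.abs(x0)) + 1e-6)
```

Whether the minimizer actually coincides with $x_0$ is exactly what the null space property characterizes.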


where $\|x\|_1 = \sum_{j=1}^{N} |x_j|$ is the standard $\ell_1$ norm. The minimization problem (3) has been actively studied recently; see, e.g., [5–9,3,16,19,25] and the references therein. In the literature there are also several studies of various combinations of $p\ge 1$ and $q\ge 1$ in (2); see, e.g., [13,11,24,27–29]. In particular, the well-known null space property (cf. [14] and [20]) for the standard $\ell_1$ minimization has been extended to the setting (2) of multiple measurement vectors. In [4], the following result is proved.

Theorem 1.1. Let $A$ be a real matrix of size $m\times N$ and let $S\subset\{1,2,\dots,N\}$ be a fixed index set. Denote by $S^c$ the complement of $S$ in $\{1,2,\dots,N\}$. Let $\|\cdot\|$ be any norm. Then all $x^{(k)}$ with support in $S$ for $k=1,\dots,r$ can be uniquely recovered using the following

$$\operatorname*{minimize}_{\substack{x^{(k)}\in\mathbb{R}^N\\ k=1,\dots,r}} \left\{ \sum_{j=1}^{N} \big\| (x_{1,j},\dots,x_{r,j}) \big\| : \text{subject to } Ax^{(k)} = b^{(k)},\ k=1,\dots,r \right\} \tag{4}$$

if and only if all vectors $(u^{(1)},\dots,u^{(r)}) \in (\mathcal{N}(A))^r \setminus \{(0,0,\dots,0)\}$ satisfy the following:

$$\sum_{j\in S} \big\| (u_{1,j},\dots,u_{r,j}) \big\| < \sum_{j\in S^c} \big\| (u_{1,j},\dots,u_{r,j}) \big\|, \tag{5}$$

where $\mathcal{N}(A)$ stands for the null space of $A$.

In [17], Foucart and Gribonval studied the MMV setting with $r=2$, $q=2$ and $p=1$. They gave another nice interpretation of the MMV problem: when $r=2$, one can view the sparse solutions $x^{(1)}$ and $x^{(2)}$ as the two components of a complex solution $y = x^{(1)} + i x^{(2)}$ of $Ay=c$ with $c = b^{(1)} + i b^{(2)}$. They then recognized that the null space property for $Ay=c$ with $y$ a complex vector is the same as the null space property for $Ax=b$ with $x$ a real vector. That is, they proved the following

Theorem 1.2. Let $A$ be a real matrix of size $m\times N$ and let $S\subset\{1,\dots,N\}$ be the support of the sparse vector $y$. The complex null space property: for any $u\in\mathcal{N}(A)$, $w\in\mathcal{N}(A)$ with $(u,w)\neq(0,0)$,

$$\sum_{j\in S} \sqrt{u_j^2 + w_j^2} < \sum_{j\in S^c} \sqrt{u_j^2 + w_j^2}, \tag{6}$$

where $u = (u_1,u_2,\dots,u_N)^T$ and $w = (w_1,w_2,\dots,w_N)^T$, is equivalent to the following standard null space property: for any $u$ in the null space $\mathcal{N}(A)$ with $u\neq 0$,

$$\sum_{j\in S} |u_j| < \sum_{j\in S^c} |u_j|. \tag{7}$$
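To make (7) concrete: when $A$ has a one-dimensional null space, the standard null space property can be checked exactly on a single spanning vector. The sketch below is our own illustration (the matrix is made up), using 0-based indexing for the support set $S$:

```python
import numpy as np

# A 3x4 matrix whose null space is spanned by n = (1, 2, 2, 2):
# each row is orthogonal to n, and rank(A) = 3.
A = np.array([[2., -1., 0., 0.],
              [2., 0., -1., 0.],
              [2., 0., 0., -1.]])
n = np.array([1., 2., 2., 2.])
assert np.allclose(A @ n, 0)

def nsp_holds(u, S):
    """Check the null space property (7): sum_{j in S} |u_j| < sum_{j not in S} |u_j|."""
    S = set(S)
    inside = sum(abs(u[j]) for j in S)
    outside = sum(abs(u[j]) for j in range(len(u)) if j not in S)
    return inside < outside

# Since N(A) = span{n}, checking u = n (up to scaling) is exhaustive.
print(nsp_holds(n, S=[0]))        # True:  |1| < |2| + |2| + |2|
print(nsp_holds(n, S=[1]))        # True:  |2| < |1| + |2| + |2|
print(nsp_holds(n, S=[1, 2, 3]))  # False: |2| + |2| + |2| < |1| fails
```

So 1-sparse vectors supported on $\{0\}$ or $\{1\}$ are uniquely recovered by (3) for this $A$, while supports of size 3 are not.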

Furthermore, the researchers in [17] raised two questions: one is how to extend their result from $r=2$ to any $r\ge 3$, and the other is what happens when $q=2$ and $p<1$. These questions motivate us to study joint sparse solution recovery. The study of the $\ell_1$ minimization (3) has been generalized to the following $\ell_p$ setting:

$$\min_{x\in\mathbb{R}^N} \big\{ \|x\|_p^p : Ax = b \big\} \tag{8}$$

for a fixed number $p\in(0,1]$ (see, for instance, [20,9] and [18]), in which $\|x\|_p = (\sum_{j=1}^{N} |x_j|^p)^{1/p}$ is the standard $\ell_p$ quasi-norm when $p\in(0,1)$. Therefore, we may consider a joint recovery from multiple measurement vectors via

$$\operatorname*{minimize} \left\{ \sum_{j=1}^{N} \Big( \sqrt{x_{1,j}^2 + \cdots + x_{r,j}^2} \Big)^p : \text{subject to } Ax^{(1)} = b^{(1)},\ \dots,\ Ax^{(r)} = b^{(r)} \right\} \tag{9}$$

for a given $0<p\le 1$, where $x^{(k)} = (x_{k,1},\dots,x_{k,N})^T \in \mathbb{R}^N$ for all $k=1,\dots,r$; this is precisely (2) with $q=2$. Note that

as $p\to 0^+$ we have $(\sqrt{x_{1,j}^2+\cdots+x_{r,j}^2}\,)^p \to 1$ whenever any of $x_{1,j},\dots,x_{r,j}$ is nonzero, and hence

$$\sum_{j=1}^{N} \Big( \sqrt{x_{1,j}^2 + \cdots + x_{r,j}^2} \Big)^p \to s,$$

which is the joint sparsity of the solution vectors $x^{(k)}$, $k=1,\dots,r$. Thus the minimization in (9) makes sense. In fact, the minimization (9) has one advantage over the minimization (2) with $p=q=1$: fewer measurements are needed for exact recovery by the $\ell_p$ minimization with $p<1$ than by the standard $\ell_1$ convex minimization. Indeed, in [9] Chartrand demonstrated this fact by numerical examples with Gaussian random matrices, and in [10] Chartrand and Staneva showed in theory that the $\ell_p$ minimization (8) can recover the exact sparse solution with fewer measurements when


$p\to 0^+$ than the standard convex $\ell_1$ minimization. It is well known that the $\ell_1$ minimization is convex and is equivalent to a standard linear programming problem, for which there are two mature computational approaches: the interior point method and the simplex method, which requires polynomial time for most practical problems (cf. [26]). However, when $p<1$ the $\ell_p$ minimization is nonconvex, and its computation is not as well understood as that of the $\ell_1$ minimization. Nevertheless, one can find local minimizers of certain regularized constrained or unconstrained $\ell_p$ minimizations as in [18] and [22], and some local minimizers have been empirically observed to be global minimizers; see the numerical experiments in [18] and [22]. However, the computation takes more time than that of the $\ell_1$ minimization. Thus, if an efficient computational algorithm is found, the $\ell_p$ minimization for sparse solutions does provide a better sparse recovery than the $\ell_1$ minimization for practical applications. In this paper we mainly prove the following

Theorem 1.3. Let $A$ be a real matrix of size $m\times N$ and let $S\subset\{1,2,\dots,N\}$ be a fixed index set. Fix $p\in(0,1]$ and $r\ge 1$. Then the following conditions are equivalent:

(a) All $x^{(k)}$ with support in $S$ for $k=1,\dots,r$ can be uniquely recovered using (9);

(b) For all vectors $(u^{(1)},\dots,u^{(r)}) \in (\mathcal{N}(A))^r \setminus \{(0,0,\dots,0)\}$,

$$\sum_{j\in S} \Big( \sqrt{u_{1,j}^2 + \cdots + u_{r,j}^2} \Big)^p < \sum_{j\in S^c} \Big( \sqrt{u_{1,j}^2 + \cdots + u_{r,j}^2} \Big)^p; \tag{10}$$

(c) For all vectors $z\in\mathcal{N}(A)$ with $z\neq 0$,

$$\sum_{j\in S} |z_j|^p < \sum_{j\in S^c} |z_j|^p, \tag{11}$$

where $z = (z_1,\dots,z_N)^T \in \mathbb{R}^N$.

That is, it is enough to check (11) for all $z\in\mathcal{N}(A)$ in order to establish the uniqueness of the joint sparse solution vectors. This significantly reduces the complexity of verifying (10). Also, our results extend Theorem 1.1 from the norm setting there to the quasi-norm setting. These results completely answer the questions raised in [17].

2. The proof of the main theorem

To show the equivalences in Theorem 1.3, we divide the proof into three parts: we prove (1) (b) ⇒ (a); (2) (a) ⇒ (c); (3) (c) ⇒ (b).

The first part, (b) ⇒ (a), shows that (10) is a sufficient condition for the uniqueness of the joint sparse solution vectors. The proof is a straightforward generalization of the arguments in [21]; we spell out the details as follows. Let $x^{(k)}$, $k=1,\dots,r$, be the joint sparse solution vectors of the minimization (9), with the assumption that the support of each $x^{(k)}$ is contained in $S$. For any vectors $u^{(1)},\dots,u^{(r)}$ in $\mathcal{N}(A)$ that are not simultaneously zero, we easily have, for $0<p\le 1$,

$$\sum_{j\in S} \big\| (x_{1,j},\dots,x_{r,j}) \big\|_2^p \le \sum_{j\in S} \big\| (u_{1,j},\dots,u_{r,j}) \big\|_2^p + \sum_{j\in S} \big\| (x_{1,j}+u_{1,j},\dots,x_{r,j}+u_{r,j}) \big\|_2^p. \tag{12}$$

By the property (10), we have

$$\sum_{j\in S} \big\| (x_{1,j},\dots,x_{r,j}) \big\|_2^p < \sum_{j\in S^c} \big\| (u_{1,j},\dots,u_{r,j}) \big\|_2^p + \sum_{j\in S} \big\| (x_{1,j}+u_{1,j},\dots,x_{r,j}+u_{r,j}) \big\|_2^p. \tag{13}$$

But the support of the vectors $x^{(k)}$, $k=1,\dots,r$, is contained in $S$. Hence,

$$\begin{aligned}
\sum_{j=1}^{N} \big\| (x_{1,j},\dots,x_{r,j}) \big\|_2^p &= \sum_{j\in S} \big\| (x_{1,j},\dots,x_{r,j}) \big\|_2^p \\
&< \sum_{j\in S^c} \big\| (u_{1,j},\dots,u_{r,j}) \big\|_2^p + \sum_{j\in S} \big\| (x_{1,j}+u_{1,j},\dots,x_{r,j}+u_{r,j}) \big\|_2^p \\
&= \sum_{j\in S^c} \big\| (x_{1,j}+u_{1,j},\dots,x_{r,j}+u_{r,j}) \big\|_2^p + \sum_{j\in S} \big\| (x_{1,j}+u_{1,j},\dots,x_{r,j}+u_{r,j}) \big\|_2^p,
\end{aligned}$$

where the last equality holds because $x_{k,j}=0$ for all $j\in S^c$. So $x^{(k)}$, $k=1,\dots,r$, are the unique solution to the minimization problem (9). □
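Two elementary facts used above can be spot-checked numerically: the $p$-triangle inequality $\|x_j\|_2^p \le \|u_j\|_2^p + \|x_j+u_j\|_2^p$ behind (12), and the $p\to 0^+$ limit of the objective of (9) noted in the introduction. The sketch below is our own illustration; the arrays are made up:

```python
import numpy as np

rng = np.random.default_rng(1)

def joint_obj(X, p):
    """Objective of (9): sum_j (sqrt(x_{1,j}^2 + ... + x_{r,j}^2))^p,
    where X is the r x N matrix whose k-th row is x^{(k)}."""
    return np.sum(np.linalg.norm(X, axis=0) ** p)

# Spot-check the p-triangle inequality used in (12), columnwise:
# ||x||_2^p <= ||u||_2^p + ||x + u||_2^p for 0 < p <= 1.
for _ in range(1000):
    p = rng.uniform(0.05, 1.0)
    x = rng.standard_normal(4)
    u = rng.standard_normal(4)
    lhs = np.linalg.norm(x) ** p
    rhs = np.linalg.norm(u) ** p + np.linalg.norm(x + u) ** p
    assert lhs <= rhs + 1e-12

# As p -> 0+, joint_obj approaches the number of nonzero columns
# (the joint sparsity s). Here two columns are nonzero.
X = np.array([[1., 0., 3., 0.],
              [2., 0., 0., 0.]])
print(round(joint_obj(X, 0.001), 2))  # -> 2.0
```

The inequality follows from the ordinary triangle inequality together with the subadditivity of $t\mapsto t^p$ on $[0,\infty)$ for $0<p\le 1$.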


The second part is to show that (a) implies (c). Assume to the contrary that there exists some $z\in\mathcal{N}(A)$ with $z\neq 0$ and

$$\sum_{j\in S} |z_j|^p \ge \sum_{j\in S^c} |z_j|^p. \tag{14}$$

We can choose $x^{(1)}\in\mathbb{R}^N$ such that the entries of $x^{(1)}$ restricted to $S$ are equal to those of $z$ and the remaining entries are zero, and we set $x^{(k)}=0$ for $k=2,\dots,r$. Then, for the multiple measurement vectors $b^{(k)} := Ax^{(k)}$, $k=1,\dots,r$,

$$Ax^{(1)} = Ax^{(1)} - Az = A\big(x^{(1)} - z\big)$$

and, by (14),

$$\begin{aligned}
\sum_{j=1}^{N} \big\| (x_{1,j},0,\dots,0) \big\|_2^p &= \sum_{j\in S} \big\| (x_{1,j},0,\dots,0) \big\|_2^p = \sum_{j\in S} \big\| (z_j,0,\dots,0) \big\|_2^p \\
&\ge \sum_{j\in S^c} \big\| (z_j,0,\dots,0) \big\|_2^p = \sum_{j=1}^{N} \big\| (x_{1,j}-z_j,0,\dots,0) \big\|_2^p,
\end{aligned}$$

which contradicts the uniqueness of the recovery from the new measurement vectors $b^{(k)}$, $k=1,\dots,r$. This finishes the proof of the part that (a) implies (c).

The third part is to show that (c) implies (b). Let us start with a useful lemma. Let $\mathbb{S}^{r-1}$ be the unit sphere in $\mathbb{R}^r$. From the perspective of integral geometry, we know that $\int_{\mathbb{S}^{r-1}} |\langle \cdot,\xi\rangle|^p \, d\xi$ is a rotation invariant function of vectors in $\mathbb{R}^r$ (cf. [1,2] and [23]). In fact, we have

Lemma 2.1. Let $r$ be an integer not less than 2. Then for any $p>0$,

$$\int_{\mathbb{S}^{r-1}} \big|\langle v,\xi\rangle\big|^p \, d\xi = C$$

for all $v\in\mathbb{S}^{r-1}$, where $C>0$ is a constant depending only on $p$ and $r$.

Proof. Let $U$ be an orthogonal transformation of $\mathbb{R}^r$. Then for any $v\in\mathbb{S}^{r-1}$, the unit sphere in $\mathbb{R}^r$, we have

$$\big\langle U(v), \xi \big\rangle = \big\langle v, U^{-1}(\xi) \big\rangle \tag{15}$$

for all $\xi\in\mathbb{S}^{r-1}$. Also, we know that

$$\mathbb{S}^{r-1} = \big\{ U(v) : U \in O(r) \big\}, \tag{16}$$

where $O(r)$ denotes the set of all $r\times r$ orthogonal matrices. By a change of variables, using the fact that $|\det(U^{-1})| = 1$, we get

$$\int_{\mathbb{S}^{r-1}} \big|\langle U(v),\xi\rangle\big|^p \, d\xi = \int_{\mathbb{S}^{r-1}} \big|\langle v,U^{-1}(\xi)\rangle\big|^p \, d\xi = \int_{\mathbb{S}^{r-1}} \big|\langle v,U^{-1}(\xi)\rangle\big|^p \, dU^{-1}(\xi) = \int_{\mathbb{S}^{r-1}} \big|\langle v,\xi\rangle\big|^p \, d\xi \tag{17}$$

for all $U\in O(r)$. Thus we see that $\int_{\mathbb{S}^{r-1}} |\langle v,\xi\rangle|^p \, d\xi \equiv C$ for some $C>0$ and for all $v\in\mathbb{S}^{r-1}$. □
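Lemma 2.1 can be spot-checked by Monte Carlo integration over the sphere: normalizing Gaussian vectors gives uniform samples on $\mathbb{S}^{r-1}$, and the empirical average of $|\langle v,\xi\rangle|^p$ should be (nearly) the same for different unit vectors $v$. A small sketch with illustrative parameters of our own choosing:

```python
import numpy as np

def sphere_average(v, p, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the average of |<v, xi>|^p over the unit
    sphere, i.e. the integral in Lemma 2.1 divided by the surface area."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((n_samples, len(v)))
    xi /= np.linalg.norm(xi, axis=1, keepdims=True)  # uniform on the sphere
    return np.mean(np.abs(xi @ v) ** p)

r, p = 3, 0.5
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)  # a different unit vector
c1 = sphere_average(v1, p)
c2 = sphere_average(v2, p, seed=1)
# Rotation invariance: both estimates approximate the same constant.
print(abs(c1 - c2) / c1 < 0.02)
```

The agreement up to sampling error is exactly the rotation invariance established in the proof above.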

Next we need a comparison result for matrices $B\in\mathbb{R}^{r\times N}$ with any integer $r\ge 2$.

Lemma 2.2. Let $S\subset\{1,2,\dots,N\}$ be an index set with $|S| = s$. Given $0<p\le 1$ and a matrix $B = [b_{ij}]_{1\le i\le r,\ 1\le j\le N} \in \mathbb{R}^{r\times N}$, if

$$\big\| (x_1,x_2,\dots,x_r) B_S \big\|_p^p < \big\| (x_1,x_2,\dots,x_r) B_{S^c} \big\|_p^p \tag{18}$$

for all $(x_1,\dots,x_r) \in \mathbb{R}^r \setminus \{(0,\dots,0)\}$, where $B_S$ denotes the submatrix of $B$ consisting of the columns indexed by $S$, then

$$\sum_{j\in S} \Big( \sqrt{b_{1,j}^2 + \cdots + b_{r,j}^2} \Big)^p < \sum_{j\in S^c} \Big( \sqrt{b_{1,j}^2 + \cdots + b_{r,j}^2} \Big)^p. \tag{19}$$

Proof. Let us rewrite (18) as follows:

$$\sum_{j\in S} |b_{1,j}x_1 + \cdots + b_{r,j}x_r|^p < \sum_{j\in S^c} |b_{1,j}x_1 + \cdots + b_{r,j}x_r|^p \tag{20}$$


for all $(x_1,x_2,\dots,x_r) \in \mathbb{R}^r \setminus \{(0,0,\dots,0)\}$. Normalizing $(b_{1,j},\dots,b_{r,j})$, we let

$$v_j := \frac{1}{\sqrt{b_{1,j}^2+\cdots+b_{r,j}^2}} \, (b_{1,j},\dots,b_{r,j}).$$

Then we have

$$\sum_{j\in S} \Big( \sqrt{b_{1,j}^2+\cdots+b_{r,j}^2} \Big)^p \big|\langle v_j,\xi\rangle\big|^p < \sum_{j\in S^c} \Big( \sqrt{b_{1,j}^2+\cdots+b_{r,j}^2} \Big)^p \big|\langle v_j,\xi\rangle\big|^p \tag{21}$$

for all vectors $\xi = (x_1,x_2,\dots,x_r) \in \mathbb{R}^r \setminus \{(0,0,\dots,0)\}$. Integrating (21) over the unit $(r-1)$-sphere $\mathbb{S}^{r-1}$, we have

$$\sum_{j\in S} \Big( \sqrt{b_{1,j}^2+\cdots+b_{r,j}^2} \Big)^p \int_{\mathbb{S}^{r-1}} \big|\langle v_j,\xi\rangle\big|^p \, d\xi < \sum_{j\in S^c} \Big( \sqrt{b_{1,j}^2+\cdots+b_{r,j}^2} \Big)^p \int_{\mathbb{S}^{r-1}} \big|\langle v_j,\xi\rangle\big|^p \, d\xi. \tag{22}$$

By Lemma 2.1, $\int_{\mathbb{S}^{r-1}} |\langle v_j,\xi\rangle|^p \, d\xi$ is a positive constant independent of $j$. Therefore, (19) follows. □
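Lemma 2.2 can be illustrated numerically: for a concrete $B$ and $S$ chosen by us for illustration, we spot-check the hypothesis (18) on sampled directions (a necessary check only, since the lemma assumes (18) for all nonzero $x$) and confirm the conclusion (19):

```python
import numpy as np

p = 0.5
# Our toy B (r = 2, N = 4) and index set S = {0} (0-based indexing).
B = np.array([[0.1, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
S = [0]
Sc = [1, 2, 3]

def lp_p(v, p):
    """The l_p quasi-norm raised to the p-th power: sum_j |v_j|^p."""
    return np.sum(np.abs(v) ** p)

# Spot-check hypothesis (18) on many sampled directions x.
rng = np.random.default_rng(2)
for _ in range(1000):
    x = rng.standard_normal(2)
    assert lp_p(x @ B[:, S], p) < lp_p(x @ B[:, Sc], p)

# Conclusion (19): comparison of the (Euclidean column norms)^p.
col = np.linalg.norm(B, axis=0) ** p
print(np.sum(col[S]) < np.sum(col[Sc]))
```

The integration step in the proof is what turns the directionwise comparison (20) into the column-norm comparison (19).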

We are now ready to prove the third part of Theorem 1.3 for $p\in(0,1)$. Now assume that we have (11) and let $r\ge 2$. For any $(u^{(1)},u^{(2)},\dots,u^{(r)})$ of vectors in $(\mathcal{N}(A))^r \setminus \{(0,0,\dots,0)\}$, we let $B = [u^{(1)},\dots,u^{(r)}]^T$ be a matrix in $\mathbb{R}^{r\times N}$, where $0$ stands for a zero vector of size $N\times 1$. For any $(x_1,x_2,\dots,x_r) \in \mathbb{R}^r \setminus \{(0,0,\dots,0)\}$, $z = (x_1,x_2,\dots,x_r) B$ is in $\mathcal{N}(A) \setminus \{(0,0,\dots,0)\}$. The null space property (11) for $z$ implies (18) for all $(x_1,x_2,\dots,x_r) \in \mathbb{R}^r \setminus \{(0,0,\dots,0)\}$, and the conclusion (19) of Lemma 2.2 is the null space property (10) for $r\ge 2$. The above discussion shows that (c) implies (b) in Theorem 1.3.

These three parts furnish a proof of Theorem 1.3.

Acknowledgments

We would like to thank the reviewers for the valuable comments that have helped significantly improve the presentation of this paper.

References

[1] S. Alesker, Continuous rotation invariant valuations on convex sets, Ann. of Math. (2) 149 (1999) 977–1005.
[2] S. Alesker, J. Bernstein, Range characterization of the cosine transform on higher Grassmannians, Adv. Math. 184 (2004) 367–379.
[3] A.M. Bruckstein, D.L. Donoho, M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Rev. 51 (2009) 34–81.
[4] E. van den Berg, M.P. Friedlander, Joint-sparse recovery from multiple measurements, arXiv:0904.2051v1, 2009.
[5] E.J. Candès, Compressive sampling, in: International Congress of Mathematicians, vol. III, Eur. Math. Soc., Zürich, 2006, pp. 1433–1452.
[6] E.J. Candès, The restricted isometry property and its implications for compressed sensing, C. R. Acad. Sci. Paris, Ser. I 346 (2008) 589–592.
[7] E.J. Candès, J.K. Romberg, T. Tao, Stable signal recovery from incomplete and inaccurate measurements, Comm. Pure Appl. Math. 59 (2006) 1207–1223.
[8] E.J. Candès, T. Tao, Near-optimal signal recovery from random projections: universal encoding strategies, IEEE Trans. Inform. Theory 52 (12) (2006) 5406–5425.
[9] R. Chartrand, Exact reconstruction of sparse signals via nonconvex minimization, IEEE Signal Process. Lett. 14 (2007) 707–710.
[10] R. Chartrand, V. Staneva, Restricted isometry properties and nonconvex compressive sensing, Inverse Problems 24 (3) (2008), doi:10.1088/0266-5611/24/3/035020.
[11] J. Chen, X. Huo, Theoretical results on sparse representations of multiple-measurement vectors, IEEE Trans. Signal Process. 54 (12) (2006) 4634–4643.
[12] S.F. Cotter, B.D. Rao, Sparse channel estimation via matching pursuit with application to equalization, IEEE Trans. Commun. 50 (3) (2002) 374–377.
[13] S.F. Cotter, B.D. Rao, K. Engan, K. Kreutz-Delgado, Sparse solutions to linear inverse problems with multiple measurement vectors, IEEE Trans. Signal Process. 53 (7) (2005) 2477–2488.
[14] D.L. Donoho, M. Elad, Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell_1$ minimization, Proc. Natl. Acad. Sci. USA 100 (5) (2003) 2197–2202.
[15] I.J. Fevrier, S.B. Gelfand, M.P. Fitz, Reduced complexity decision feedback equalization for multipath channels with large delay spreads, IEEE Trans. Commun. 47 (6) (1999) 927–937.
[16] S. Foucart, A note on guaranteed sparse recovery via $\ell_1$-minimization, Appl. Comput. Harmon. Anal. 29 (2010) 97–103.
[17] S. Foucart, R. Gribonval, Real vs. complex null space properties for sparse vector recovery, C. R. Math. Acad. Sci. Paris 348 (15–16) (2010) 863–865.
[18] S. Foucart, M.-J. Lai, Sparsest solutions of underdetermined linear systems via $\ell_q$-minimization for 0 < q ≤ 1, Appl. Comput. Harmon. Anal. 26 (2009) 395–407.
[19] S. Foucart, M.-J. Lai, Sparse recovery with pre-Gaussian random matrices, Studia Math. 200 (2010) 91–102.
[20] R. Gribonval, M. Nielsen, Sparse representations in unions of bases, IEEE Trans. Inform. Theory 49 (12) (2003) 3320–3325.
[21] R. Gribonval, M. Nielsen, Highly sparse representations from dictionaries are unique and independent of the sparseness measure, Appl. Comput. Harmon. Anal. 22 (2007) 335–355.
[22] M.-J. Lai, J. Wang, An unconstrained $\ell_q$ minimization for sparse solution of underdetermined linear systems, 2010, submitted for publication.
[23] Y. Lonke, Derivatives of the $L_p$-cosine transform, Adv. Math. 176 (2003) 175–186.
[24] D. Malioutov, M. Cetin, A.S. Willsky, A sparse signal reconstruction perspective for source localization with sensor arrays, IEEE Trans. Signal Process. 53 (2005) 3010–3022.
[25] M. Rudelson, R. Vershynin, Non-asymptotic theory of random matrices: extreme singular values, in: Proceedings of the International Congress of Mathematicians, Hyderabad, India, 2010.
[26] D.A. Spielman, S.H. Teng, Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time, J. ACM 51 (2004) 385–463.
[27] J.A. Tropp, Recovery of short, complex linear combinations via $\ell_1$ minimization, IEEE Trans. Inform. Theory 51 (4) (2005) 1568–1570.
[28] J.A. Tropp, Algorithms for simultaneous sparse approximation. Part II: Convex relaxation, Signal Process. 86 (2006) 589–602.
[29] J.A. Tropp, A.C. Gilbert, M.J. Strauss, Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit, Signal Process. 86 (2006) 572–588.
