The Approximate Rank of a Matrix and its Algorithmic Applications

Noga Alon* - Schools of Mathematics and Computer Science, Tel Aviv University, Tel Aviv 69978, Israel - nogaa@tau.ac.il
Troy Lee - Centre for Quantum Technologies, Singapore - troyjlee@gmail.com
Adi Shraibman - School of Computer Science, Academic College of Tel-Aviv Yaffo, Israel - adish@mta.ac.il
Santosh Vempala† - School of Computer Science, Georgia Tech, Atlanta 30332 - vempala@gatech.edu

* Research supported in part by an ERC Advanced grant, by a USA-Israeli BSF grant, and by the Hermann Minkowski Minerva Center for Geometry in Tel Aviv University.
† Supported in part by NSF awards CCF-0915903 and CCF-1217793.

ABSTRACT

We study the ε-rank of a real matrix A, defined for any ε > 0 as the minimum rank over matrices that approximate every entry of A to within an additive ε. This parameter is connected to other notions of approximate rank and is motivated by problems from various topics including communication complexity, combinatorial optimization, game theory, computational geometry and learning theory. Here we give bounds on the ε-rank and use them for algorithmic applications. Our main algorithmic results are (a) polynomial-time additive approximation schemes for Nash equilibria for 2-player games when the payoff matrices are positive semidefinite or have logarithmic rank, and (b) an additive PTAS for the densest subgraph problem for similar classes of weighted graphs. We use combinatorial, geometric and spectral techniques; our main new tool is an algorithm for efficiently covering a convex body with translates of another convex body.

General Terms: Algorithms, Theory

Keywords: Approximate rank, Nash equilibria, covering number, convex body

1. INTRODUCTION

1.1 Background

A large body of work in theoretical computer science deals with various ways of approximating a matrix by a simpler one. The motivation from the design of approximation algorithms is clear. When the input to a computational problem is a matrix (that may represent a weighted graph, a payoff matrix in a two-person game or a weighted constraint satisfaction problem), the hope is that it is easier to solve or approximately solve the computational problem for the approximating matrix, which is simpler. If the notion of approximation is suitable for the problem at hand, then the solution will be an approximate solution for the original input matrix as well.

A typical example of this reasoning is the application of cut decomposition and that of regular decomposition of matrices. The cut-norm of a matrix A with a set of rows R and a set of columns C is

    max_{S⊆R, T⊆C} | Σ_{i∈S, j∈T} A_ij |.

A cut matrix B(S,T;r) is a matrix B for which B_ij = r if i ∈ S, j ∈ T and B_ij = 0 otherwise. Frieze and Kannan showed that any n by m matrix B with entries in [−1,1] can be approximated by a sum of at most 1/ε² cut matrices, in the sense that the cut norm of the difference between B and this sum is at most εmn. Such an approximation can be found efficiently and leads to several approximation algorithms for dense graphs—see [20]. Similar approximations of matrices can be given using variants of the regularity lemma of Szemerédi.
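For intuition, the definitions above can be checked directly on tiny matrices. The following Python sketch is ours (not from [20] or from this paper): it evaluates the cut norm by brute force over all subset pairs and builds a cut matrix B(S,T;r); both helpers are purely illustrative and exponential-time.

import itertools
import numpy as np

def cut_norm_brute_force(A):
    """Exact cut norm of a small matrix: max over S, T of |sum_{i in S, j in T} A_ij|.

    Enumerates all 2^n * 2^m pairs of subsets, so it is only usable on tiny
    matrices; it is meant to unpack the definition, not to be efficient.
    """
    n, m = A.shape
    best = 0.0
    for S in itertools.product([0, 1], repeat=n):        # 0/1 indicator of S
        for T in itertools.product([0, 1], repeat=m):    # 0/1 indicator of T
            best = max(best, abs(np.array(S) @ A @ np.array(T)))
    return best

def cut_matrix(n, m, S, T, r):
    """The cut matrix B(S, T; r): equal to r on S x T and 0 elsewhere."""
    B = np.zeros((n, m))
    B[np.ix_(list(S), list(T))] = r
    return B

A = np.array([[1.0, -1.0, 0.5],
              [-0.5, 1.0, -1.0]])
print(cut_norm_brute_force(A))
print(cut_matrix(2, 3, [0], [1, 2], 0.5))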
These regularity-based approximations provide a more powerful approximation at the expense of increasing the complexity of the approximating matrix, and supply approximation algorithms for additional problems; see [6, 19, 7].

All these methods, however, deal with global properties, as the approximations obtained by all these variants of the regularity lemma are not sensitive to local changes in the matrix. In particular, these methods cannot provide approximate solutions to problems like that of finding an approximate Nash equilibrium in a two-person game, or that of approximating the maximum possible density of a subgraph on, say, √n vertices in a given n-vertex weighted graph. Motivated by applications of this type, it is natural to consider a stronger notion of approximation of a matrix: an approximation in the infinity norm by a matrix of low rank. This motivates the following definition.

Definition 1.1. For a real n×n matrix A, the ε-rank of A is defined as follows:

    ε-rank(A) = min{ rank(B) : B ∈ ℝ^{n×n}, ‖A − B‖_∞ ≤ ε }.

We will usually assume that the matrix A has entries in [−1,1], but the definition holds for any real matrix. Define the density norm of a matrix A to be

    den(A) = max_{x,y ∈ ℝ^n_+} |x^T A y| / (‖x‖_1 ‖y‖_1).

It is easy to verify that the following definition of ε-rank is equivalent to Definition 1.1:

    ε-rank(A) = min{ rank(B) : B ∈ ℝ^{n×n}, den(A − B) ≤ ε }.
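As a deliberately naive illustration of Definition 1.1 (not a method used in the paper), the sketch below reports a crude upper bound on the ε-rank: the smallest r at which the best rank-r Frobenius-norm approximation, taken from a truncated SVD, already meets the entrywise budget. The truncated SVD does not optimize the ∞-norm error, so the true ε-rank can be far smaller; the helper name is ours.

import numpy as np

def eps_rank_upper_bound(A, eps):
    """Crude upper bound on eps-rank(A) (illustrative only).

    Returns the smallest r for which the best rank-r Frobenius-norm
    approximation (truncated SVD) already satisfies ||A - B||_inf <= eps.
    Since the SVD does not optimize the entrywise error, the true eps-rank
    can be much smaller than the value returned here.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    for r in range(len(s) + 1):
        B = (U[:, :r] * s[:r]) @ Vt[:r, :]
        if np.max(np.abs(A - B)) <= eps:
            return r
    return len(s)

rng = np.random.default_rng(0)
A = np.clip(rng.normal(size=(50, 50)) / 3, -1.0, 1.0)
print(eps_rank_upper_bound(A, eps=0.25))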
The investigation of notions of simple matrices that approximate given ones is motivated not only by algorithmic applications, but by applications in complexity theory as well. Following Valiant [45], call a matrix A (r,s)-rigid if for any matrix B of the same dimensions as A and rank at most r, A − B contains a row with at least s nonzero entries. Here the notion of a simple matrix is thus a matrix of low rank, and the notion of approximation is to allow a limited number of changes in each row. Valiant proved that if an n by n matrix is (Ω(n), n^{Ω(1)})-rigid, then there is no arithmetic circuit of linear size and logarithmic depth that computes Ax for any given input x. Therefore, the main problem in this context (which is still wide open) is to give an explicit construction of such a rigid matrix.

Another notion that has received a considerable amount of attention is the sign-rank of a real matrix. For a matrix A, rank_±(A) is defined as the minimum rank over matrices each of whose entries has the same sign as the corresponding entry in the original matrix. The notion of approximation here refers to keeping the signs of the entries, while the simplicity of the approximating matrix is measured by its rank. The sign rank has played a useful role in the study of the unbounded error communication complexity of Boolean functions (see, e.g., [5, 21] and their references), in establishing lower bounds in learning theory, and in providing lower bounds for the size of threshold-of-majority circuits computing a function in AC0 (see [37]). It is clear that for −1,1 matrices, or matrices with entries of absolute value exceeding ε, ε-rank(A) ≥ rank_±(A), and simple examples show that in many cases the ε-rank is far larger than the sign-rank.

Another line of work on ε-rank is motivated by communication complexity [17, 29, 36]. The ε-error private-coin communication complexity of a Boolean function is bounded from below by the logarithm of the ε-rank of the corresponding communication matrix (see [28]). In addition, the ε-error private-coin quantum communication complexity of a Boolean function is bounded from below by log(ε-rank(A))/2, where A is the communication matrix of the function. Using this connection, it has been proven that for fixed ε the 2^n by 2^n set-disjointness matrix has ε-rank 2^{Θ(√n)}, using the quantum protocol of Aaronson and Ambainis [1] for disjointness, and Razborov's lower bound for the problem (see [36], where there is a lower bound for the trace-norm of any matrix that ε-approximates the disjointness matrix restricted to the sets of size n/4). In another work, Lee and Shraibman [29] have given algorithmic bounds on ε-rank via the γ2 norm. The γ2-norm of a real matrix A, denoted by γ2(A), is the minimum possible value of the product c(X)c(Y), where c(Z) is the maximum ℓ2-norm of a column of a matrix Z, and the minimum is taken over all factorizations of A of the form A = X^T Y. For a sign matrix A and for α ≥ 1, let γ2^α(A) denote the minimum possible value of γ2(B), where B ranges over all matrices of the same dimension as A that satisfy 1 ≤ A_ij · B_ij ≤ α for all admissible i,j. Let rank_α(A) denote the minimum possible rank of such a matrix B. Note that for α = 1 + ε this is (roughly) the ε/2-rank of A. In [29] it is shown that rank_α(A) and γ2^α(A) are polynomially related for any sign matrix A, up to a poly-logarithmic factor in the dimension of A. Since γ2^α(A) can be computed efficiently using semi-definite programming, this provides a (rough) approximation algorithm for the ε-rank of a given n×n sign matrix.
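The semidefinite program alluded to here is short to state. The sketch below computes the plain γ2 norm through its standard SDP formulation, assuming the cvxpy package and its default SDP solver; the approximate variant γ2^α used in [29] would additionally constrain the off-diagonal block entrywise against the sign matrix, which this sketch omits.

import numpy as np
import cvxpy as cp

def gamma_2(A):
    """The gamma_2 norm via a standard SDP formulation (a sketch, not the paper's code).

    Minimize t such that some symmetric PSD matrix Z whose off-diagonal block
    equals A has all diagonal entries at most t; at a balanced optimum t equals
    min c(X)c(Y) over factorizations A = X^T Y.
    """
    n, m = A.shape
    Z = cp.Variable((n + m, n + m), symmetric=True)
    t = cp.Variable()
    constraints = [Z >> 0, Z[:n, n:] == A, cp.diag(Z) <= t]
    cp.Problem(cp.Minimize(t), constraints).solve()
    return t.value

A = np.sign(np.random.default_rng(1).normal(size=(6, 6)))   # a small sign matrix
print(gamma_2(A))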
For the special case of the n by n identity matrix, the ε-rank has been studied and has provided several applications. In [2] it is shown that it is at least Ω(log n / (ε² log(1/ε))) and at most O(log n / ε²). This is used in [2] to derive several applications in geometry, coding theory, extremal finite set theory and the study of sample spaces supporting nearly independent random variables. See also [13] for a more recent application of the lower bound (for the special case ε < 1/√n) in combinatorial geometry and in the study of locally correctable codes over real and complex numbers.

The notion of the ε-rank of a matrix is also related to learning and to computational geometry. Indeed, the problem of computing the ε-rank of a given n by n matrix A is equivalent to the geometric problem of finding the minimum possible dimension of a linear subspace of ℝ^n that intersects the aligned cubes of edge length 2ε centered at the columns of A. In learning, the problem of learning with margins, the fat-shattering dimension of a family of functions, and the problem of learning functions approximately are all related to this notion (see, e.g., [4] and the references therein).

The focus of our paper is algorithmic applications of ε-rank; we state our results in the next section.

1.2 Results

We begin with bounds on the ε-rank. A well known result of Forster [21] asserts that the sign-rank of any n by n Hadamard matrix H is at least Ω(√n). This clearly implies the same lower bound for the ε-rank of any such matrix for any ε < 1. Using the approximate γ2 norm, Linial et al. [30] further show that ε-rank(H) ≥ (1−2ε)n. The following gives a slightly stronger estimate, which is tight.

Theorem 1.2. For any n × n Hadamard matrix H and any 0 < ε < 1, ε-rank(H) ≥ (1−ε²)n.

Next we show a lower bound on the approximate rank of a random d-regular graph. Let A_G denote the adjacency matrix of a graph G, and let Ā_G denote the "signed" adjacency matrix, where the (i,j) entry is 1 for an edge and −1 for a non-edge. We show, in fact, a stronger statement: for a random d-regular graph the sign rank of Ā_G is Ω(d).

A closely related result was shown by Linial and Shraibman [31]. Similarly to the sign rank, define γ2^∞(A) as the minimum γ2 norm of a matrix that has the same sign pattern as A and all entries at least 1 in magnitude. It is known that rank_±(A) = O(γ2^∞(A)² log(mn)) [14] and also that the sign rank can be exponentially smaller than γ2^∞ [16, 42]. Linial and Shraibman show that γ2^∞(A_G) = Ω(√d) for a random d-regular graph, which also implies an Ω(d) lower bound on the ε-approximate rank for any constant ε < 1/2. Our proof is different from these γ2 techniques and relies, as in previous lower bounds on the sign rank [5], on Warren's theorem from real algebraic geometry [47].

Theorem 1.3. For almost all d-regular graphs G on n vertices, rank_±(Ā_G) = Ω(d), where Ā_G is the signed adjacency matrix of G.

By "almost all" we mean that the fraction of d-regular graphs on n vertices for which the statement holds tends to 1 as n tends to infinity. Note that as Ā_G = 2A_G − J, where J is the all ones matrix, this result also implies that ε-rank(A_G) = Ω(d) for any ε < 1/2. The lower bound here is tight for ε-rank up to a log n factor: it follows from results in [30, 29] that for any fixed ε bounded away from zero the ε-rank of A_G for every d-regular graph on n vertices is O(d log n). The lower bound is tight for the sign rank, as a result of [5] implies that the sign rank is bounded by the maximal number of sign changes in each row.

The ε-rank of any positive semidefinite matrix can be bounded from above as stated in the next theorem. The theorem follows via a direct application of the Johnson–Lindenstrauss lemma [26].

Theorem 1.4. For a symmetric positive semi-definite n×n matrix A with |A_ij| ≤ 1, we have

    ε-rank(A) ≤ 9 log n / (ε² − ε³).

Note that this is nearly tight, by the above mentioned lower bound for the ε-rank of the identity matrix.

The last theorem can be extended to linear combinations of positive semi-definite (= PSD) matrices.

Corollary 1.5. Let A = Σ_{i=1}^m α_i B_i, where the α_i are scalars with |α_i| ≤ 1 and the B_i are n×n PSD matrices with entries at most 1 in magnitude. Then

    ε-rank(A) ≤ C m² log n / ε²

for an absolute constant C.
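The Johnson–Lindenstrauss argument referred to above is easy to see numerically. The sketch below is ours, for illustration only; the constant c = 12 is an arbitrary demo choice rather than the theorem's 9/(ε² − ε³). It factors a PSD matrix as a Gram matrix, projects the Gram vectors with a random Gaussian map, and checks the entrywise error of the resulting low-rank Gram matrix.

import numpy as np

def jl_low_rank_psd(A, eps, c=12, seed=0):
    """Johnson-Lindenstrauss style sketch behind Theorem 1.4 (illustrative only).

    Writes the PSD matrix A as a Gram matrix A = V^T V, projects the Gram
    vectors with a random Gaussian map to k = ceil(c * log n / eps^2)
    dimensions, and returns the Gram matrix of the projected vectors, whose
    rank is at most k.  With suitable constants every entry should be preserved
    to within an additive eps with high probability.
    """
    n = A.shape[0]
    w, U = np.linalg.eigh(A)                      # A is assumed PSD
    V = (U * np.sqrt(np.clip(w, 0.0, None))).T    # columns are Gram vectors, A = V^T V
    k = int(np.ceil(c * np.log(n) / eps ** 2))
    G = np.random.default_rng(seed).normal(size=(k, n)) / np.sqrt(k)
    W = G @ V                                     # projected Gram vectors
    return W.T @ W

n, eps = 1000, 0.5
A = np.eye(n)                                     # PSD with |A_ij| <= 1
B = jl_low_rank_psd(A, eps)
print(np.linalg.matrix_rank(B), np.max(np.abs(A - B)))   # rank <= 332 << 1000; error should be below eps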
The results above provide several algorithmic applications. Our first application is finding approximate Nash equilibria in 2-player games. Lipton et al. [32] showed that an ε-Nash equilibrium for any 2-player game can be computed in time n^{O(log n/ε²)}, and it has been an important open question to determine whether this problem has a PTAS (i.e., an algorithm of complexity n^{f(ε)}). Our result establishes a PTAS when A+B is PSD or when A+B has ε-rank O(log n) (and we are given an ε-approximating matrix C of A+B with rank O(log n)), where A and B are the payoff matrices of the game. Note that the special case when A+B = 0 corresponds to zero-sum games, a class for which the exact Nash equilibrium can be computed using linear programming. The setting of A+B having constant rank was investigated by Kannan and Theobald [27], who gave a PTAS for the case when the rank is a constant. Their algorithm has running time n^{poly(d,1/ε)}; ours has complexity [O(1/ε)]^d poly(n), giving a PTAS for ε-rank d = O(log n).

Theorem 1.6. Let A, B ∈ [−1,1]^{n×n} be the payoff matrices of a 2-player game. If A+B is positive semidefinite, then an ε-Nash equilibrium of the game can be computed by a Las Vegas randomized algorithm using poly(n) space and expected time

    n^{O(log(1/ε)/ε²)},

i.e., there is a PTAS to compute an ε-Nash equilibrium.

The above theorem can be recovered by a similar algorithm using the γ2-norm approach. The next theorem is more general (the γ2 approach seems to achieve only a weaker bound of (n/ε)^{O(d)} here).

Theorem 1.7. Let A, B ∈ [−1,1]^{n×n} be the payoff matrices of a 2-player game. Suppose (ε/2)-rank(A+B) = d, and suppose we have a matrix C of rank d satisfying ‖A+B−C‖_∞ ≤ ε/2. Then an ε-Nash equilibrium of the game can be computed by a Las Vegas randomized algorithm using poly(n) space and expected time

    (1/ε)^{O(d)} poly(n).

Note that, in particular, if the rank of A+B is d ≤ O(log n) we can simply take C = A+B and get a polynomial time algorithm.

Our second application is to finding an approximately densest (bipartite) subgraph, a problem that has thus far evaded a PTAS even in the dense setting. In fact, there are hardness results indicating that even the dense case is hard to approximate to within any constant factor; see [3]. Here we observe that we can efficiently get a good additive approximation for the special case that the input matrix has a small ε-rank (assuming we are given an approximating matrix of low rank). For a matrix A with entries in [0,1] and subsets S,T of rows and columns, let A_{S,T} be the submatrix induced by S and T. The density of the submatrix A_{S,T} is

    density(A_{S,T}) = ( Σ_{i∈S, j∈T} A_ij ) / (|S||T|),

the average of the entries of the submatrix.

Theorem 1.8. Let A be an n×n real matrix with entries in [0,1]. Then for any integer 1 ≤ k ≤ n, there is a Las Vegas randomized algorithm to find subsets S,T of rows and columns with |S| = |T| = k such that

    density(A_{S,T}) ≥ max_{|U|=|V|=k} density(A_{U,V}) − ε.

Its expected time complexity is bounded by n^{O(log(1/ε)/ε²)} if A is PSD, and by (1/ε)^{O(d)} poly(n) where d = (ε/2)-rank(A) (and we are given an ε/2-approximation of rank d), and its space complexity is poly(n).

Note that this is a bipartite version of the usual densest subgraph problem, in which the objective is to find the density of the densest subgraph on (say) 2k vertices in a given (possibly weighted) input graph. It is easy to see that the answers to these two problems can differ by at most a factor of 2, and as the best known polynomial time approximation algorithm for this problem, given in [15], only provides an O(n^{1/4+o(1)})-approximation for an n-vertex graph, this bipartite version also appears to be very difficult for general graphs.
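For very small matrices the objective in Theorem 1.8 can be evaluated exactly by exhaustive search, which is a convenient baseline when experimenting with the definitions. The sketch below is ours and is not the Las Vegas algorithm of the theorem; it is exponential in k and the dimensions.

import itertools
import numpy as np

def density(A, S, T):
    """density(A_{S,T}): the average of the entries of the submatrix induced by S, T."""
    return A[np.ix_(list(S), list(T))].mean()

def densest_k_by_k(A, k):
    """Exhaustive search for the densest k x k submatrix of A.

    A baseline for tiny matrices only, illustrating the objective of
    Theorem 1.8; it is not the algorithm of the paper.
    """
    n, m = A.shape
    best = (-1.0, None, None)
    for S in itertools.combinations(range(n), k):
        for T in itertools.combinations(range(m), k):
            d = density(A, S, T)
            if d > best[0]:
                best = (d, S, T)
    return best

A = np.random.default_rng(2).random((8, 8))
print(densest_k_by_k(A, k=3))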
Our results for general matrices are based on an algorithm to efficiently find a near-optimal cover of a convex body A by translates of another convex body B. We state this result here as it seems to be of independent interest. Let N(A,B) denote the minimum number of translates of B required to cover A.

Theorem 1.9. For any two centrally symmetric convex bodies A, B in ℝ^d, a cover of A using translates of B of size N(A,B) 2^{O(d)} can be enumerated by a Las Vegas randomized algorithm with expected running time N(A,B) 2^{O(d)}, using space poly(d).

In essence, this theorem allows us to find and use covers of size (1/ε)^{O(d)} rather than (d/ε)^{O(d)} that can be constructed more easily. Although we do not prove it here, we expect that the theorem can be extended to asymmetric convex bodies.

1.3 Organization

The rest of the paper is organized as follows. In Section 2 we describe lower and upper bounds for the ε-rank of a matrix, presenting the proofs of Theorems 1.2, 1.3, 1.4 and Corollary 1.5. In Section 3 we describe efficient constructions of ε-nets which are required to derive the algorithmic applications, and prove Theorem 1.9. Section 4 contains the algorithmic applications, including the proofs of Theorems 1.6, 1.7 and 1.8. The final Section 5 contains some concluding remarks, open problems and plans for future extensions.

2. BOUNDS ON ε-RANK

2.1 Lower bounds

We start with the proof of Theorem 1.2, which is based on some spectral techniques. For a symmetric m by m matrix A, let λ_1(A) ≥ λ_2(A) ≥ ... ≥ λ_m(A) denote its eigenvalues, ordered as above. We need the following simple lemma.

Lemma 2.1. (i) Let A be an n by n real matrix; then the 2n eigenvalues of the symmetric 2n by 2n matrix

    B = [ 0    A ]
        [ A^T  0 ]                                             (1)

appear in pairs λ and −λ.
(ii) For any two symmetric m by m matrices B and C and all admissible values of i and j,

    λ_{i+j−1}(B+C) ≤ λ_i(B) + λ_j(C).

Proof. (i) Let (x,y) be the eigenvector of an eigenvalue λ of B, where x and y are real vectors of length n. Then Ay = λx and A^T x = λy. It is easy to check that the vector (x,−y) is an eigenvector of the eigenvalue −λ of B. This proves part (i).

(ii) The proof is similar to that of the Weyl inequalities (c.f., e.g., [23]) and follows from the variational characterization of the eigenvalues. It is easy and well known that

    λ_i(B) = min_{U, dim(U)=m−i+1} max_{x∈U, ‖x‖=1} x^T B x,

where the minimum is taken over all subspaces U of dimension m−i+1. Similarly,

    λ_j(C) = min_{W, dim(W)=m−j+1} max_{x∈W, ‖x‖=1} x^T C x.

Put V = U ∩ W. Clearly, the dimension of V is at least m−i−j+2, and for any x ∈ V with ‖x‖ = 1,

    x^T (B+C) x = x^T B x + x^T C x ≤ λ_i(B) + λ_j(C).

Therefore λ_{i+j−1}(B+C), which is equal to

    min_{Z, dim(Z)=m−i−j+2} max_{x∈Z, ‖x‖=1} x^T (B+C) x,

is at most λ_i(B) + λ_j(C). ✷
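Both parts of Lemma 2.1 are easy to sanity-check numerically. The following sketch (ours, for illustration) verifies the ±λ pairing of part (i), which identifies the eigenvalues of the block matrix with plus/minus the singular values of A, and tests the inequality of part (ii) on random symmetric matrices.

import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.normal(size=(n, n))

# Part (i): the eigenvalues of [[0, A], [A^T, 0]] come in pairs lambda, -lambda;
# in fact they are plus/minus the singular values of A.
B = np.block([[np.zeros((n, n)), A], [A.T, np.zeros((n, n))]])
eig = np.sort(np.linalg.eigvalsh(B))
print(np.allclose(eig, -eig[::-1]))
print(np.allclose(np.sort(np.abs(eig))[::2],
                  np.sort(np.linalg.svd(A, compute_uv=False))))

# Part (ii): lambda_{i+j-1}(B + C) <= lambda_i(B) + lambda_j(C) for symmetric
# matrices, with eigenvalues in decreasing order and 1-based indices.
def eigs_desc(M):
    return np.sort(np.linalg.eigvalsh(M))[::-1]

S1 = rng.normal(size=(2 * n, 2 * n)); S1 = (S1 + S1.T) / 2
S2 = rng.normal(size=(2 * n, 2 * n)); S2 = (S2 + S2.T) / 2
lB, lC, lBC = eigs_desc(S1), eigs_desc(S2), eigs_desc(S1 + S2)
print(all(lBC[i + j] <= lB[i] + lC[j] + 1e-9
          for i in range(2 * n) for j in range(2 * n) if i + j < 2 * n))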
Proof (of Theorem 1.2). Let E be an n by n matrix with ‖E‖_∞ ≤ ε such that the rank of H+E is the ε-rank of H. Let B and C be the following two symmetric 2n by 2n matrices:

    B = [ 0    H ]                                             (2)
        [ H^T  0 ]

    C = [ 0    E ]                                             (3)
        [ E^T  0 ]

Since H is a Hadamard matrix, B^T B = B² is n times the 2n by 2n identity matrix. It follows that λ_i(B)² = n for all i, and by Lemma 2.1, part (i), exactly n eigenvalues of B are √n and exactly n are −√n. In particular λ_{n+1}(B) = −√n.

The absolute value of every entry of E is at most ε, and thus the square of the Frobenius norm of C is at most 2n²ε². As this is the trace of C^T C, that is, the sum of squares of the eigenvalues of C, it follows that C has at most 2nε² eigenvalues of absolute value at least √n. By Lemma 2.1, part (i), this implies that there are at most ε²n eigenvalues of C of value at least √n, and thus λ_{⌊ε²n⌋+1}(C) < √n. By Lemma 2.1, part (ii), we conclude that λ_{n+⌊ε²n⌋+1}(B+C) < 0. Therefore B+C has at least n − ⌊ε²n⌋ negative eigenvalues and hence also at least n − ⌊ε²n⌋ positive eigenvalues. Therefore its rank, which is exactly twice the rank of H+E, is at least 2(n − ⌊ε²n⌋), completing the proof. ✷

Remark: By the last theorem, if ε < 1/√n then the ε-rank of any n by n Hadamard matrix is n. This is tight in the sense that for any n which is a power of 4 there is an n by n Hadamard matrix with ε-rank n−1 for ε = 1/√n. Indeed, the matrix

    H1 = [ +1  +1  +1  −1 ]
         [ +1  −1  +1  +1 ]                                    (4)
         [ +1  +1  −1  +1 ]
         [ +1  −1  −1  −1 ]

is a 4 by 4 Hadamard matrix in which the sum of the elements in every row is of absolute value 2. Thus, the tensor product of k copies of this matrix is a Hadamard matrix of order n = 4^k in which the sum of the elements in every row is in absolute value 2^k. We can thus either add or subtract 1/2^k = 1/√n to each element of this matrix and get a matrix in which the sum of the elements in every row is 0, that is, a singular matrix.

In a similar vein, we can show that the ε-rank of any n-by-n matrix A with entries in [−1,1] is at most n−1 for ε = 6/√n. Indeed, for any such A = (a_ij) there are, by the main result of [44], signs δ_j ∈ {−1,1} so that for every i, |Σ_{j=1}^n a_ij δ_j| ≤ 6√n. Fix such δ_j and define, for each i,