Author | Valiant, Gregory |
Source | ACM Digital Library |
Content type | Text |
Publisher | Association for Computing Machinery (ACM) |
File Format | |
Copyright Year | ©2015 |
Language | English |
Subject Domain (in DDC) | Computer science, information & general works ♦ Data processing & computer science |
Subject Keyword | Correlations ♦ Approximate closest pair ♦ Asymmetric embeddings ♦ Learning juntas ♦ Locality sensitive hashing ♦ Metric embedding ♦ Nearest neighbor ♦ Parity with noise |
Abstract | Given a set of $\textit{n}$ $\textit{d}-dimensional$ Boolean vectors with the promise that the vectors are chosen uniformly at random with the exception of two vectors that have Pearson correlation coefficient ρ (Hamming distance $\textit{d}ċ$ 1™ρ&frac;2), how quickly can one find the two correlated vectors? We present an algorithm which, for any constant &eps;>0, and constant ρ>0, runs in expected time $O(n^{5&mins;ω&frac;4&mins;ω+&eps;}$ +nd) < $O(n^{1.62}$ +nd), where ω < 2.4 is the exponent of matrix multiplication. This is the first subquadratic--time algorithm for this problem for which ρ does not appear in the exponent of $\textit{n},$ and improves upon $O(n^{2&mins;O(ρ)}),$ given by Paturi et al. [1989], the Locality Sensitive Hashing approach of Motwani [1998] and the Bucketing Codes approach of Dubiner [2008]. Applications and extensions of this basic algorithm yield significantly improved algorithms for several other problems. Approximate Closest Pair. For any sufficiently small constant &eps;>0, given $\textit{n}$ $\textit{d}-dimensional$ vectors, there exists an algorithm that returns a pair of vectors whose Euclidean (or Hamming) distance differs from that of the closest pair by a factor of at most 1+&eps;, and runs in time $O(n^{2&mins;Θ(&sqrt;&eps;})).$ The best previous algorithms (including Locality Sensitive Hashing) have runtime $O(n^{2&mins;O(&eps;)}).$ Learning Sparse Parities with Noise. Given samples from an instance of the learning parities with noise problem where each example has length $\textit{n},$ the true parity set has size at most $\textit{k}$ « $\textit{n},$ and the noise rate is η, there exists an algorithm that identifies the set of $\textit{k}$ indices in time $\textit{n}ω+&eps;&frac;3$ k $\textit{poly(1&frac;1&mins;2η)}$ < $n^{0.8k}$ poly(1&frac;1&mins;2 η). This is the first algorithm with no dependence on η in the exponent of $\textit{n},$ aside from the trivial $\textit{O}&big;(n$ &choose; k)&big; ≈ $O(n^{k})$ brute-force algorithm, and for large noise rates (η > 0.4), improves upon the results of Grigorescu et al. [2011] that give a runtime of $\textit{n}(1+(2$ $η)}^{2}$ + $\textit{o}(1))k&frac;2$ $\textit{poly}(1&frac;1&mins;2η).$ Learning $\textit{k}-Juntas$ with Noise. Given uniformly random length $\textit{n}$ Boolean vectors, together with a label, which is some function of just $\textit{k}$ « $\textit{n}$ of the bits, perturbed by noise rate η, return the set of relevant indices. Leveraging the reduction of Feldman et al. [2009], our result for learning $\textit{k}-parities$ implies an algorithm for this problem with runtime $\textit{n}ω+&eps;&frac;3$ k $\textit{poly}(1&frac;1&mins;2η)$ < $n^{0.8k}$ $\textit{poly}(1&frac;1&mins;2$ η), which is the first runtime for this problem of the form $n^{ck}$ with an absolute constant $\textit{c}$ < 1. Learning $\textit{k}-Juntas$ without Noise. Given uniformly random length $\textit{n}$ Boolean vectors, together with a label, which is some function of $\textit{k}$ « $\textit{n}$ of the bits, return the set of relevant indices. Using a modification of the algorithm of Mossel et al. [2004], and employing our algorithm for learning sparse parities with noise via the reduction of Feldman et al. [2009], we obtain an algorithm for this problem with runtime $\textit{n}ω+$ &eps;&frac;4 k $\textit{poly(n)}$ < $n^{0.6k}$ $\textit{poly(n)},$ which improves on the previous best of $n^{ω+1&frac;ωk}$ ≈ $n^{0.7k}$ $\textit{poly(n)}$ of Mossel et al. [2004]. |
ISSN | 00045411 |
Age Range | 18 to 22 years ♦ above 22 year |
Educational Use | Research |
Education Level | UG and PG |
Learning Resource Type | Article |
Publisher Date | 2015-05-06 |
Publisher Place | New York |
e-ISSN | 1557735X |
Journal | Journal of the ACM (JACM) |
Volume Number | 62 |
Issue Number | 2 |
Page Count | 45 |
Starting Page | 1 |
Ending Page | 45 |
National Digital Library of India (NDLI) is a virtual repository of learning resources which is not just a repository with search/browse facilities but provides a host of services for the learner community. It is sponsored and mentored by Ministry of Education, Government of India, through its National Mission on Education through Information and Communication Technology (NMEICT). Filtered and federated searching is employed to facilitate focused searching so that learners can find the right resource with least effort and in minimum time. NDLI provides user group-specific services such as Examination Preparatory for School and College students and job aspirants. Services for Researchers and general learners are also provided. NDLI is designed to hold content of any language and provides interface support for 10 most widely used Indian languages. It is built to provide support for all academic levels including researchers and life-long learners, all disciplines, all popular forms of access devices and differently-abled learners. It is designed to enable people to learn and prepare from best practices from all over the world and to facilitate researchers to perform inter-linked exploration from multiple sources. It is developed, operated and maintained from Indian Institute of Technology Kharagpur.
NDLI is a conglomeration of freely available or institutionally contributed or donated or publisher managed contents. Almost all these contents are hosted and accessed from respective sources. The responsibility for authenticity, relevance, completeness, accuracy, reliability and suitability of these contents rests with the respective organization and NDLI has no responsibility or liability for these. Every effort is made to keep the NDLI portal up and running smoothly unless there are some unavoidable technical issues.
Ministry of Education, through its National Mission on Education through Information and Communication Technology (NMEICT), has sponsored and funded the National Digital Library of India (NDLI) project.
Phone: +91-3222-282435
For any issue or feedback, please write to ndl-support@iitkgp.ac.in