Access Restriction

Author Chierichetti, Flavio ♦ Kumar, Ravi
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Copyright Year ©2015
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Locality sensitive hashing ♦ Set similarities
Abstract Locality sensitive hashing (LSH) is a key algorithmic tool that is widely used both in theory and practice. An important goal in the study of LSH is to understand which similarity functions admit an LSH, that is, are LSHable. In this article, we focus on the class of transformations such that given any similarity that is LSHable, the transformed similarity will continue to be LSHable. We show a tight characterization of all such LSH-preserving transformations: they are precisely the probability generating functions, up to scaling. As a concrete application of this result, we study which set similarity measures are LSHable. We obtain a complete characterization of similarity measures between two sets $\textit{A}$ and $\textit{B}$ that are ratios of two linear functions of $∣\textit{A}∩$ $\textit{B}∣,$ $∣\textit{A}▵\textit{B}∣,$ $∣\textit{A}∪\textit{B}∣:$ such a measure is LSHable if and only if its corresponding distance is a metric. This result generalizes the well-known LSH for the Jaccard set similarity, namely, the minwise-independent permutations, and obtains LSHs for many set similarity measures that are used in practice. Using our main result, we obtain a similar characterization for set similarities involving radicals.
ISSN 00045411
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2015-11-02
Publisher Place New York
e-ISSN 1557735X
Journal Journal of the ACM (JACM)
Volume Number 62
Issue Number 5
Page Count 25
Starting Page 1
Ending Page 25

Open content in new tab

   Open content in new tab
Source: ACM Digital Library