Speeding up spatial approximation search in metric spacesSpeeding up spatial approximation search in metric spaces

Access Restriction
Subscribed

 Author Figueroa, Karina ♦ Chavez, Edgar ♦ Navarro, Gonzalo ♦ Paredes, Rodrigo Source ACM Digital Library Content type Text Publisher Association for Computing Machinery (ACM) File Format PDF Copyright Year ©2009 Language English
 Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data Subject Keyword AESA ♦ Proximity or similarity searching Abstract Proximity searching consists of retrieving from a database those elements that are similar to a query object. The usual model for proximity searching is a metric space where the distance, which models the proximity, is expensive to compute. An index uses precomputed distances to speedup query processing. Among all the known indices, the baseline for performance for about 20 years has been AESA. This index uses an iterative procedure, where at each iteration it first chooses the next promising element (“pivot”) to compare to the query, and then it discards database elements that can be proved not relevant to the query using the pivot. The next pivot in AESA is chosen as the one minimizing the sum of lower bounds to the distance to the query proved by previous pivots. In this article, we introduce the new index $\textit{iAESA},$ which establishes a new performance baseline for metric space searching. The difference with AESA is the method to select the next pivot. In iAESA, each candidate sorts previous pivots by closeness to it, and chooses the next pivot as the candidate whose order is most similar to that of the query. We also propose a modification to AESA-like algorithms to turn them into probabilistic algorithms. Our empirical results confirm a consistent improvement in query performance. For example, we perform as few as 60% of the distance evaluations of AESA in a database of documents, a very important and difficult real-life instance of the problem. For the probabilistic algorithm, we perform in a database of faces up to 40% of the comparisons made by the best alternative algorithm to retrieve the same percentage of the correct answer. Based on the empirical results, we conjecture that the new probabilistic AESA-like algorithms will become, as AESA had been for exact algorithms, a reference point establishing, in practice, a lower bound on how good a probabilistic proximity search algorithm can be. ISSN 10846654 Age Range 18 to 22 years ♦ above 22 year Educational Use Research Education Level UG and PG Learning Resource Type Article Publisher Date 2010-01-05 Publisher Place New York e-ISSN 10846654 Journal Journal of Experimental Algorithmics (JEA) Volume Number 14 Page Count 16 Starting Page 3.6 Ending Page 3.21

Open content in new tab

Source: ACM Digital Library