### On nonmetric similarity search problems in complex domains

Access Restriction
Subscribed

| Field | Value |
| --- | --- |
| Author | Skopal, Tom ♦ Bustos, Benjamin |
| Source | ACM Digital Library |
| Content type | Text |
| Publisher | Association for Computing Machinery (ACM) |
| File Format | PDF |
| Copyright Year | ©2011 |
| Language | English |
| Subject Domain (in DDC) | Computer science, information & general works ♦ Data processing & computer science |
| Subject Keyword | Similarity retrieval ♦ Approximate and exact search ♦ Nonmetric distances ♦ Similarity measuring |

Abstract

The task of similarity search is widely used in various areas of computing, including multimedia databases, data mining, bioinformatics, and social networks. In fact, retrieval of semantically unstructured data entities requires a form of aggregated qualification that selects entities relevant to a query. A popular type of such a mechanism is similarity querying. For a long time, database-oriented applications of similarity search restricted the definition of similarity to metric distances. Due to its topological properties, metric similarity can be effectively used to index a database, which can then be queried efficiently by so-called metric access methods. However, as data entities across various domains have grown more complex, many similarities have appeared in recent years that are *not* metrics; we call them *nonmetric* similarity functions. In this article we survey domains employing nonmetric functions for effective similarity search, and methods for efficient nonmetric similarity search. First, we show that the ongoing research in many of these domains requires complex representations of data entities. At the same time, such complex representations also allow us to model complex and computationally expensive similarity functions (often represented by various matching algorithms). However, the more complex a similarity function is, the more likely it is to be nonmetric. Second, we review state-of-the-art techniques for efficient (fast) nonmetric similarity search, concerning both exact and approximate search. Finally, we discuss some open problems and possible future research trends.

| Field | Value |
| --- | --- |
| ISSN | 0360-0300 |
| e-ISSN | 1557-7341 |
| Journal | ACM Computing Surveys (CSUR) |
| Volume Number | 43 |
| Issue Number | 4 |
| Page Count | 50 |
| Starting Page | 1 |
| Ending Page | 50 |
| Publisher Date | 2011-10-01 |
| Publisher Place | New York |
| Age Range | 18 to 22 years ♦ above 22 years |
| Educational Use | Research |
| Education Level | UG and PG |
| Learning Resource Type | Article |
