Approximate Nearest Neighbor (ANN)

* [[Embedding]]:  [[Agents#AI-Powered Search|Search]] ... [[Clustering]] ... [[Recommendation]] ... [[Anomaly Detection]] ... [[Classification]] ... [[Dimensional Reduction]] ... [[...find outliers]]
 
* [https://www.activeloop.ai/resources/glossary/approximate-nearest-neighbors-ann/ Approximate Nearest Neighbor (ANN) | ActiveLoop]
 
* [https://towardsdatascience.com/comprehensive-guide-to-approximate-nearest-neighbors-algorithms-8b94f057d6b6 Comprehensive Guide to Approximate Nearest Neighbor (ANN) | Towards Data Science]
  
 
Revision as of 12:52, 17 August 2023


Approximate Nearest Neighbor (ANN) is a technique for efficiently finding the closest points in high-dimensional spaces. It is often used in similarity search, where the goal is to find the items most similar to a given query item. ANN algorithms speed up the search by preprocessing the data into an efficient index, typically built with tree-based data structures, neighborhood graphs, hashing methods, or quantization.
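One of the hashing methods mentioned above can be sketched in a few lines. The following is a minimal, illustrative random-hyperplane LSH index (all names and parameters here are invented for the example, not taken from any particular library): vectors that land in the same hash bucket tend to be close in cosine distance, so a query compares against only its own bucket instead of the whole dataset.

```python
import random

random.seed(0)

DIM, N_PLANES = 8, 4
# Random hyperplanes through the origin; each contributes one bit of the hash.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_PLANES)]

def lsh_key(v):
    # One bit per hyperplane: which side of the plane the vector lies on.
    return tuple(int(sum(p * x for p, x in zip(plane, v)) >= 0)
                 for plane in planes)

def build_index(vectors):
    # Preprocess the data into buckets keyed by hash signature.
    index = {}
    for i, v in enumerate(vectors):
        index.setdefault(lsh_key(v), []).append(i)
    return index

def query(index, vectors, q):
    # Scan only the query's bucket -- a small fraction of all vectors.
    candidates = index.get(lsh_key(q), [])
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(candidates, key=lambda i: dist2(vectors[i], q), default=None)

data = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(100)]
idx = build_index(data)
nearest = query(idx, data, data[0])  # a point always shares its own bucket
```

A real LSH deployment would use several independent hash tables and probe neighboring buckets to raise recall; this single-table version only shows the bucketing idea.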

The main advantage of ANN is that it can significantly reduce retrieval cost, because a query only needs to access the small fraction of documents selected by the index. However, accuracy is not guaranteed; it depends on the quality of the index and the similarity measure used. In many cases an approximate nearest neighbor is almost as good as the exact one, especially when the distance measure accurately captures the user's notion of quality.

ANN algorithms are allowed to return points whose distance from the query is at most c times the distance from the query to its true nearest points, for some approximation factor c > 1. Proximity graph methods such as HNSW (Hierarchical Navigable Small World) are considered the current state of the art for approximate nearest neighbor search. Other ANN techniques include KD-trees, Locality-Sensitive Hashing (LSH), and Product Quantization (PQ).
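The c-approximation guarantee above can be checked empirically. This toy sketch (the "approximate" search here is just a random subsample scan, chosen for brevity, not one of the real algorithms named above) computes the achieved factor c as the ratio between the approximate answer's distance and the exact nearest neighbor's distance:

```python
import math
import random

random.seed(1)

def dist(a, b):
    # Euclidean distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

points = [[random.uniform(0, 1) for _ in range(3)] for _ in range(200)]
q = [0.5, 0.5, 0.5]

# Exact nearest neighbor: brute-force scan of all 200 points.
exact = min(points, key=lambda p: dist(q, p))

# Toy "approximate" search: scan only a random sample of 50 points.
sample = random.sample(points, 50)
approx = min(sample, key=lambda p: dist(q, p))

# Achieved approximation factor; always >= 1, since exact is the true minimum.
c = dist(q, approx) / dist(q, exact)
```

An ANN method with guarantee c promises that this ratio never exceeds c; the subsample scan above offers no such bound and is only meant to make the definition concrete.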

ANN is widely used in various applications, such as image and video retrieval, [[Recommendation|recommendation systems]], and Natural Language Processing (NLP).