Build a k-Nearest Neighbor (k-NN) similarity search engine with Amazon Elasticsearch Service

Posted on: Mar 3, 2020

Amazon Elasticsearch Service now offers k-Nearest Neighbor (k-NN) search, which can enhance similarity search use cases such as product recommendations, fraud detection, and image, video, and semantic document retrieval. Built on the lightweight and efficient Non-Metric Space Library (NMSLIB), k-NN enables high-scale, low-latency nearest neighbor search on billions of documents across thousands of dimensions with the same ease as running any regular Elasticsearch query.

Given a space of data points, the k-NN plugin finds the k data points closest to a query data point. A new field type for k-NN enables you to seamlessly integrate k-NN search with Elasticsearch's extensive features, such as aggregations and filtering, to further improve the precision of search results. Elasticsearch's distributed architecture enables the k-NN plugin to ingest and process large datasets and support incremental updates, delivering a highly performant similarity search engine with fast inference.
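To make the idea of "the k closest data points" concrete, here is a minimal sketch of an exact, brute-force nearest neighbor search in Python. It is an illustration only: the k-NN plugin relies on NMSLIB rather than a linear scan, and the function name `k_nearest`, the sample points, and the choice of Euclidean distance are assumptions made for this example.

```python
import heapq
import math

def k_nearest(points, query, k):
    """Return the k points closest to `query` by Euclidean distance.

    Exact brute-force scan, shown only to illustrate the concept;
    this is not how the k-NN plugin searches internally.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

# Hypothetical 2-dimensional data points.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]

print(k_nearest(points, (1.2, 1.2), k=2))  # [(1.0, 1.0), (2.0, 2.0)]
```

A linear scan like this compares the query against every point, which is why a specialized library becomes necessary at the scale of billions of documents.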

k-NN similarity search is powered by Open Distro for Elasticsearch, an Apache 2.0-licensed distribution of Elasticsearch. To learn more about Open Distro for Elasticsearch and its k-NN plugin, visit the project website.
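As a rough orientation, the sketch below builds the kind of request bodies used with the Open Distro k-NN plugin: an index body that enables k-NN and declares a `knn_vector` field, and a search body that asks for the k nearest neighbors of a query vector. The helper names, the field name `my_vector`, and the vector values are hypothetical, and the exact settings should be checked against the plugin documentation for your Elasticsearch version.

```python
import json

def build_knn_index_body(field, dimension):
    """Index body enabling k-NN with one knn_vector field (field name is hypothetical)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension}
            }
        },
    }

def build_knn_query_body(field, vector, k):
    """Search body requesting the k nearest neighbors of `vector`."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }

index_body = build_knn_index_body("my_vector", dimension=2)
query_body = build_knn_query_body("my_vector", [2.0, 3.0], k=2)

# These JSON documents would be sent to the domain endpoint when creating
# the index and when searching it, respectively.
print(json.dumps(index_body, indent=2))
print(json.dumps(query_body, indent=2))
```

Because the search body is an ordinary Elasticsearch query, it can be sent through the same clients and tooling used for any other search request.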

k-NN similarity search is available on domains running Elasticsearch 7.1 and higher. To learn more, see the documentation.

k-NN similarity search is now available for Amazon Elasticsearch Service domains across 22 regions globally: US East (N. Virginia, Ohio), US West (Oregon, N. California), AWS GovCloud (US-Gov-East, US-Gov-West), Canada (Central), South America (Sao Paulo), EU (Ireland, London, Frankfurt, Paris, Stockholm), Asia Pacific (Singapore, Sydney, Tokyo, Seoul, Mumbai, Hong Kong), AWS China (Beijing, operated by Sinnet), AWS China (Ningxia, operated by NWCD), and Middle East (Bahrain). For more information about Amazon Elasticsearch Service availability, refer to the AWS Region Table.