AWS Open Source Blog

Category: Analytics

Open Distro for Elasticsearch logo.

Introducing real-time anomaly detection in Open Distro for Elasticsearch

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. Visit the website to learn more. There is an enormous increase in real-time streaming applications across a wide range of industries such as finance, health, information technology, retail, and the Internet of Things (IoT). Organizations depend on log analytics solutions to detect aberrations […]


Gearing up for re:Invent 2019 with Open Distro for Elasticsearch sessions

re:Invent 2019 has a new track this year and it’s all about Open Source! There are lots of great sessions coming up on Open Distro for Elasticsearch and its components such as Alerting, Security, and Performance Analyzer. Join in to learn more and participate in hands-on workshops! Keep a lookout for our sessions on machine […]


Deploying Spark jobs on Amazon EKS

UPDATE, March 2021: This blog post describes how to deploy self-managed Apache Spark jobs on Amazon EKS. AWS now provides a fully managed service with Amazon EMR on Amazon EKS. This new deployment option allows customers to automate the provisioning and management of Spark on Amazon EKS, and benefit from advanced features such as Amazon […]

Lucene in action Amazon search screenshot.

What Amazon gets by giving back to Apache Lucene

At pretty much any scale, search is hard. It becomes dramatically harder, however, when searching at Amazon scale: think billions of products, complicated by millions of sellers constantly changing those products on a daily basis, with hundreds of millions of customers searching through that inventory at all hours. Although Amazon has powered its product […]

The Okta login screen for logging in to Open Distro for Elasticsearch Kibana

Add Single Sign-On (SSO) to Open Distro for Elasticsearch Kibana using SAML and Okta

Open Distro for Elasticsearch Security implements the web browser single sign-on (SSO) profile of the SAML 2.0 protocol. This enables you to configure federated access with any SAML 2.0 compliant identity provider (IdP). In a prior post, I discussed setting up SAML-based SSO using Microsoft Active Directory Federation Services (ADFS). In this post, I’ll cover […]

Diagram of six Elasticsearch nodes with three indexes showing uneven, skewed CPU, RAM, JVM, and I/O usage.

Demystifying Elasticsearch shard allocation

At the core of OpenSearch’s ability to provide a seamless scaling experience lies its ability to distribute its workload across machines. This is achieved via sharding. When you create an index, you set a primary and replica shard count for that index. Elasticsearch distributes your data and requests across those shards, and the shards across your […]

Open Distro for Elasticsearch logo.

Open Distro for Elasticsearch 1.1.0 released

We are happy to announce that Open Distro for Elasticsearch 1.1.0 is now available for download! Version 1.1.0 includes the upstream open source versions of Elasticsearch 7.1.1, Kibana 7.1.1, and the latest updates for alerting, SQL, security, performance analyzer, and Kibana plugins, as well as the SQL JDBC driver. You can find details on enhancements, […]

An Open Distro for Elasticsearch cluster with balanced resource usage

Use Elasticsearch’s _rollover API for efficient storage distribution

Many Open Distro for Elasticsearch users manage data life cycle in their clusters by creating an index based on a standard time period, usually one index per day. This pattern has many advantages: ingest tools like Logstash support index rollover out of the box; defining a retention window is straightforward; and deleting old data is […]


Add Single Sign-On to Open Distro for Elasticsearch Kibana Using SAML and ADFS

Open Distro for Elasticsearch Security (Open Distro Security) comes with authentication and access control out of the box. Prior posts have discussed LDAP integration with Open Distro for Elasticsearch and JSON Web Token authentication with Open Distro for Elasticsearch. Security Assertion Markup Language 2.0 (SAML) is an open standard for exchanging identity and security information […]

Diagram showing where PartiQL fits with other data sources.

Announcing PartiQL: One query language for all your data

Data is being gathered and created at rates unprecedented in history. Much of this data is intended to drive business outcomes but, according to the Harvard Business Review, “…on average, less than half of an organization’s structured data is actively used in making decisions…” The root of the problem is that data is typically spread […]
