AWS Database Blog

Retain more for less with UltraWarm for Amazon Elasticsearch Service

Machine-generated data powers solutions and causes problems. It's indispensable for identifying operational issues in modern software applications, yet you need flexible, scalable tools like Amazon Elasticsearch Service to analyze it in real time. This log data is so valuable that you don't want to remove it from hot storage, but there's so much of it that you have to. Log analytics carries an inherent tension: you weigh potential value against definite costs.

We know that logs from the past few days are valuable, so paying for hot storage, which offers the best indexing and search performance, makes sense. Logs from six weeks or six months ago will probably still be valuable, but how valuable, and for how long? In the meantime, is the cost of keeping them in hot storage justifiable?

UltraWarm, a new storage tier for Amazon Elasticsearch Service, removes this tension, enabling you to dramatically extend your data retention period and reduce costs by up to 90% compared to hot storage. Best of all, the interactive analytics experience remains. Query your warm indices just like any other index, or use them to build Kibana dashboards.

How it works

Traditional warm storage solutions have limitations. An open source Elasticsearch cluster might use dense-storage D2 instances, but adding those nodes maintains the same fundamental Elasticsearch architecture. You still have to account for operating system overhead, disk watermarks, and index replicas. Rather than paying for what you use, you’re still paying for what you provision.

UltraWarm is different. It uses a combination of Amazon S3 and nodes powered by the AWS Nitro System to provide a hot-like experience for aggregations and visualizations. S3 provides durable, low-cost storage, removing the need for replicas. S3 also abstracts away the notion of overhead, so each UltraWarm node can use 100% of its available storage. These nodes include query processing optimizations and a sophisticated caching solution that pre-fetches data. Compared to traditional warm storage solutions, UltraWarm’s performance is generally similar or superior.

For a concrete pricing example, consider three ultrawarm1.large nodes. As with all nodes, you pay an hourly rate for each. In this case, that rate is $2.680 per hour. Together, these nodes can address up to 60 TiB of storage on S3 at $0.024 per GiB per month, but if you only store 10 TiB of data, you're only billed for those 10 TiB. If you use all 60 TiB, the grand total for a month of nodes and storage is approximately $7,344.
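The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration, not an official pricing calculator: the function name is hypothetical, and the 730-hour month is an assumption (the post doesn't state it, but it reproduces the total quoted above).

```python
# Assumption: a "month" is 730 hours; the post does not state this.
HOURS_PER_MONTH = 730
GIB_PER_TIB = 1024


def ultrawarm_monthly_cost(nodes, hourly_rate, stored_tib, storage_rate_per_gib):
    """Return the monthly cost of UltraWarm nodes plus the S3 storage actually used."""
    node_cost = nodes * hourly_rate * HOURS_PER_MONTH
    storage_cost = stored_tib * GIB_PER_TIB * storage_rate_per_gib
    return node_cost + storage_cost


# Rates taken from the example above: three nodes at $2.680 per hour,
# storage at $0.024 per GiB per month.
full = ultrawarm_monthly_cost(3, 2.680, 60, 0.024)
partial = ultrawarm_monthly_cost(3, 2.680, 10, 0.024)

print(f"60 TiB stored: about ${round(full):,} per month")
print(f"10 TiB stored: about ${round(partial):,} per month")
```

Because storage is billed on what you store, only the second term changes between the two calls. The node cost is the same either way.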

For comparison, you need four i3.16xlarge nodes to reach just shy of 60 TiB of storage. But not all of those 60 TiB are usable. After factoring in overhead, you're left with 75% of the original space. Replicas then cut that number in half. To reach approximately 60 TiB of usable storage for primary shards, you actually need 11 nodes, each of which costs $7.987 per hour. Whether you use that space or not, the grand total for a month is $64,136.

True, the i3.16xlarge nodes offer superior performance, but the UltraWarm solution saves you nearly 90%. For read-only data that you access less frequently, UltraWarm storage offers massive savings while still keeping your data available and ready-to-query.
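The hot-storage side of the comparison can be checked the same way. The sketch below makes several assumptions: raw capacity is rounded to 15 TiB per node (from "four nodes for just shy of 60 TiB"), each primary shard has one replica, and a month is 730 hours. The function and constant names are illustrative only.

```python
import math

# Assumptions, derived from the figures in the text rather than from a spec sheet.
HOURS_PER_MONTH = 730          # not stated in the post; reproduces its totals
RAW_TIB_PER_NODE = 60 / 4      # four i3.16xlarge nodes for roughly 60 TiB
USABLE_FRACTION = 0.75         # space left after overhead
COPIES_PER_SHARD = 2           # one primary plus one replica


def hot_nodes_needed(primary_tib):
    """Return how many hot nodes are needed to hold primary_tib of primary shards."""
    usable_per_node = RAW_TIB_PER_NODE * USABLE_FRACTION / COPIES_PER_SHARD
    return math.ceil(primary_tib / usable_per_node)


nodes = hot_nodes_needed(60)
hot_cost = nodes * 7.987 * HOURS_PER_MONTH

ultrawarm_cost = 7344          # monthly UltraWarm total quoted in the text
savings = 1 - ultrawarm_cost / hot_cost

print(f"Hot nodes needed: {nodes}")
print(f"Hot storage per month: about ${round(hot_cost):,}")
print(f"Savings with UltraWarm: {savings:.1%}")
```

Under these assumptions the script arrives at 11 nodes and a saving of a little under 89%, in line with the "nearly 90%" figure.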

Get started

Despite the numerous under-the-hood innovations, getting started is easy. To migrate an index from hot to warm storage, make a single request to the REST API with no complicated parameters or request body:

POST _ultrawarm/migration/<my-index>/_warm

The index remains available throughout the migration process, with no downtime. If you ever need to write to the index again, UltraWarm includes its own automated snapshot repository from which you can restore the index to hot storage.
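If you script your migrations, the request shown above might be built with Python's standard library along these lines. This is a minimal sketch: the domain endpoint and index name are placeholders, the helper name is hypothetical, and authentication is left out entirely. Depending on your domain's access policy, requests may need to be signed or carry credentials.

```python
from urllib.parse import quote
from urllib.request import Request


def build_warm_migration_request(endpoint, index):
    """Build (but do not send) the POST request that migrates an index to warm storage."""
    path = f"/_ultrawarm/migration/{quote(index, safe='')}/_warm"
    # No request body is needed; the index name in the path is the only parameter.
    return Request(endpoint.rstrip("/") + path, method="POST")


# Placeholder endpoint and index name, for illustration only.
req = build_warm_migration_request("https://my-domain.example.com", "my-index")
print(req.get_method(), req.full_url)

# To actually send it, you could pass req to urllib.request.urlopen(),
# after adding whatever authentication your domain requires.
```

Keeping the request construction separate from sending makes it easy to check the URL before touching a real domain.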

UltraWarm is currently in public preview in three Regions: US East (N. Virginia), US East (Ohio), and US West (Oregon). To create a new domain and see if UltraWarm fits your use case, see the Amazon Elasticsearch Service Developer Guide.

About the Author

Andrew Etter is a Senior Technical Writer with Amazon Web Services and the bestselling author of Modern Technical Writing: An Introduction to Software Documentation.