AWS Partner Network (APN) Blog

ScyllaDB on AWS is a NoSQL Database Built for Gigabyte-to-Petabyte Scale

By Peter Corless, Director of Technical Advocacy – ScyllaDB

ScyllaDB is an AWS Partner and a database for data-intensive applications that require high throughput and low latency. It was designed to take advantage of modern, high-performance cloud servers, including the new Amazon EC2 I4i series based on Intel Ice Lake x86 processors and the Is4gen and Im4gn series, which use Arm-based Graviton2 processors.

ScyllaDB is a wide-column NoSQL database fully compatible with Apache Cassandra and Amazon DynamoDB. As a DynamoDB-compatible database, ScyllaDB is AWS Outposts Ready for on-premises deployments.

ScyllaDB’s NoSQL database is built with deep architectural advancements such as a highly asynchronous, shared-nothing, shard-per-core design that enables teams to harness the ever-increasing computing power of modern infrastructure, thus eliminating barriers to scale as data grows.

NVMe is the Key

First, let’s look at why ScyllaDB recommends I-series Amazon Elastic Compute Cloud (Amazon EC2) instances. ScyllaDB’s consistent, single-digit millisecond P99 latencies are achieved with directly attached storage, such as the NVMe disks provided in the I3, I3en, and I4i series Intel-based servers, and the Is4gen and Im4gn series Graviton2-powered servers.

ScyllaDB is designed around a shard-per-core architecture. This pairs up specific vCPUs with specific allocations of directly attached NVMe storage. It uses an optimized CPU scheduler and IO scheduler which have undergone continuous innovation over more than six years of development.
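
As a rough illustration of the shard-per-core idea, the following Python sketch maps each partition key to exactly one shard, where one shard stands for one vCPU and its slice of storage. This is a simplified assumption and not ScyllaDB’s actual sharding algorithm: the SHA-256 hash, the `shard_for_key` helper, and the shard count of 16 are all hypothetical choices made for the example.

```python
import hashlib
from collections import Counter


def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Return the shard that owns a key (hypothetical helper, illustration only)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Hash the key to a 64-bit token, then reduce it to a shard index.
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return token % num_shards


# Assumption: 16 shards, standing in for a 16-vCPU instance.
NUM_SHARDS = 16
keys = [f"user:{i}" for i in range(10_000)]
load = Counter(shard_for_key(k, NUM_SHARDS) for k in keys)

# Each key is owned by exactly one shard, so cores do not contend for the same data.
for shard in sorted(load):
    print(f"shard {shard:2d}: {load[shard]} keys")
```

Because the mapping is deterministic, a request for a given key can always be routed to the same core, which is the property a shard-per-core design relies on.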

ScyllaDB was written to run close to the metal. It is NUMA-aware and takes full advantage of the architectural underpinnings of EC2 servers, such as the new Nitro architecture.

Efficient Utilization = Better TCO + ROI

Other wide-column NoSQL databases can also scale out to hundreds or thousands of instances, and this scalability has been a strong reason to adopt them in the past. However, it can also be seen as a design inefficiency: their architectural limitations preclude them from taking full advantage of denser EC2 instances, which offer 32, 64, or 128 vCPUs and, at the high end, 30 to 60 terabytes of NVMe storage per instance.

As an example, another NoSQL database recommends keeping node sizes to only eight vCPUs and to less than two terabytes of storage; it’s only capable of horizontal (scale out) and not vertical (scale up) scalability.

ScyllaDB was designed to match other systems’ scale-out capabilities while also being able to scale up, efficiently utilizing all of the high-performance vCPUs, RAM, storage, and network IO that current and future EC2 instances can offer.

By getting maximum utility out of each instance, users can provision and manage smaller clusters. These are less taxing to administer and present a smaller attack surface to secure.
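
The effect of node density on cluster size can be sketched with simple arithmetic. The example below is a back-of-the-envelope estimate under stated assumptions: the 100 TB dataset and the replication factor of 3 are hypothetical, the `nodes_needed` helper is introduced only for illustration, and the calculation ignores compaction headroom as well as CPU and network limits. The per-node capacities of 2 TB and 30 TB are the figures quoted above.

```python
import math


def nodes_needed(dataset_tb: float, replication_factor: int, tb_per_node: float) -> int:
    """Estimate node count from storage alone (hypothetical helper)."""
    if dataset_tb < 0 or replication_factor <= 0 or tb_per_node <= 0:
        raise ValueError("invalid sizing inputs")
    total_tb = dataset_tb * replication_factor
    return math.ceil(total_tb / tb_per_node)


DATASET_TB = 100        # assumption: size of the unreplicated dataset
REPLICATION_FACTOR = 3  # assumption: three copies of every row

small_nodes = nodes_needed(DATASET_TB, REPLICATION_FACTOR, tb_per_node=2)
dense_nodes = nodes_needed(DATASET_TB, REPLICATION_FACTOR, tb_per_node=30)

print(f"2 TB per node:  {small_nodes} nodes")
print(f"30 TB per node: {dense_nodes} nodes")
```

Under these assumptions the same data needs 150 small nodes or 10 dense ones. A real sizing exercise would also account for throughput and free-space requirements.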

In real-world case studies, ScyllaDB often delivers performance equal to or better than other databases on far smaller infrastructure, resulting in savings of 2x to 10x over prior solutions.

For example, Comcast lowered its P99 long-tail latencies by 95%, saved over 60% on its total cost of ownership (TCO), and reduced its footprint from 962 nodes to just 78 ScyllaDB nodes. Brazilian food delivery app iFood cut its costs by 90% while reducing latencies from 80 milliseconds to three milliseconds.
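
The node-count and latency figures above can be turned into percentages with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; the `percent_reduction` helper is a name introduced for this example.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a value dropped from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


# Comcast: node count before and after, as quoted in the text.
comcast_node_cut = percent_reduction(962, 78)  # about 91.9% fewer nodes
comcast_node_ratio = 962 / 78                  # about 12.3 old nodes per new node

# iFood: latency in milliseconds before and after, as quoted in the text.
ifood_latency_cut = percent_reduction(80, 3)   # 96.25% lower latency

print(f"Comcast node reduction:  {comcast_node_cut:.1f}% ({comcast_node_ratio:.1f}x)")
print(f"iFood latency reduction: {ifood_latency_cut:.2f}%")
```

Node count and cost are separate measures: the text reports Comcast’s TCO saving as over 60%, which is not the same figure as the reduction in node count.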

ScyllaDB on the I4i Series

Results from benchmark tests of ScyllaDB running on I4i family instances exceeded expectations.

“When we tested I4i instances, we observed up to 2.7x increase in throughput per vCPU compared to I3 instances for reads,” says Avi Kivity, Chief Technology Officer and Co-Founder at ScyllaDB. “With an even mix of reads and writes, we observed 2.2x higher throughput per vCPU, with a 40% reduction in average latency than I3 instances. We are excited for the incredible performance and value these new instances will enable for our customers going forward.”


Figure 1 – Throughput per server, I3 vs. I4i.

The graph above shows operations per second (OPS) throughput results on i4i.16xlarge vs. i3.16xlarge (both with 64 vCPUs) with 50% reads and 50% writes (higher is better).

Meanwhile, the graph below shows P99 latency on i4i.16xlarge vs. i3.16xlarge (both 64-vCPU servers) with 50% reads and 50% writes, measured at 50% of maximum throughput (lower is better).


Figure 2 – P99 latency in ms vs. instance type.
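
For readers less familiar with the P99 metric, the following Python sketch shows one common way to compute it, the nearest-rank method. It is not the method used in the benchmark above, which the text does not specify. The `percentile_nearest_rank` helper is a hypothetical name, and the latency samples are synthetic values made up for the example, not benchmark data.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Synthetic latencies in milliseconds: mostly fast, with a small slow tail.
latencies_ms = [2.0] * 97 + [5.0] * 2 + [40.0]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile_nearest_rank(latencies_ms, 50)
p99_ms = percentile_nearest_rank(latencies_ms, 99)

print(f"mean={mean_ms:.2f} ms  P50={p50_ms} ms  P99={p99_ms} ms")
```

In this synthetic data the median is 2 ms while the P99 is 5 ms, which is why tail percentiles are reported separately from averages: they expose the slow requests that a mean or median hides.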

With the I4i series now available, ScyllaDB has updated its guidance for all AWS customers: if I4i instances are available in your Region, use them for ScyllaDB.

I4i instances provide superior performance, in terms of both throughput and latency, over the previous generation of EC2 instances.

ScyllaDB on the Graviton2 Series

Some users prefer the Arm-based Graviton2 instances for their price-performance advantage, and ScyllaDB is an AWS Graviton Service Ready partner. In particular, ScyllaDB is optimized to run on the storage-oriented Is4gen and memory-oriented Im4gn series.

In tests against the existing I3en, the Graviton2 instances performed better for both memory-intensive operations with a high cache hit rate and disk-intensive operations with a low cache hit rate (both reads and writes).


Figure 3 – Performance of ScyllaDB on the Graviton2-based is4gen.8xlarge (right) was superior to the Intel-based i3en.6xlarge (left), and can store twice as much data (30 TB vs. 15 TB).

The is4gen.8xlarge delivered performance equal to or better than the i3en.6xlarge while storing twice as much data: 30 TB on the is4gen.8xlarge vs. 15 TB on the i3en.6xlarge. This makes the Graviton2-based system cheaper to operate on a per-terabyte-per-hour basis while remaining capable of serving the same or bigger workloads.
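
The per-terabyte-per-hour comparison can be expressed as a small calculation. In the sketch below, the storage sizes (30 TB and 15 TB) come from the text, while the hourly prices are placeholder values and not real AWS pricing; the helper names are also hypothetical. The point is the break-even relationship: with twice the storage, the larger instance stays cheaper per terabyte-hour as long as its hourly price is below twice that of the smaller one.

```python
def cost_per_tb_hour(hourly_price: float, storage_tb: float) -> float:
    """Hourly instance price divided by its storage capacity in TB."""
    if storage_tb <= 0:
        raise ValueError("storage_tb must be positive")
    return hourly_price / storage_tb


def break_even_price(reference_price: float, reference_tb: float, candidate_tb: float) -> float:
    """Hourly price at which the candidate matches the reference per TB-hour."""
    if reference_tb <= 0:
        raise ValueError("reference_tb must be positive")
    return reference_price * candidate_tb / reference_tb


# Placeholder prices in arbitrary units per hour (NOT real AWS prices).
I3EN_PRICE = 1.0
IS4GEN_PRICE = 1.5

i3en_cost = cost_per_tb_hour(I3EN_PRICE, storage_tb=15)      # storage size from the text
is4gen_cost = cost_per_tb_hour(IS4GEN_PRICE, storage_tb=30)  # storage size from the text
limit = break_even_price(I3EN_PRICE, reference_tb=15, candidate_tb=30)

print(f"i3en.6xlarge:   {i3en_cost:.4f} per TB-hour")
print(f"is4gen.8xlarge: {is4gen_cost:.4f} per TB-hour")
print(f"is4gen stays cheaper per TB-hour below an hourly price of {limit:.2f}")
```

To apply this to a real decision, substitute current on-demand or reserved prices for the two instance types in your Region.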

ScyllaDB on AWS Marketplace

AWS customers can get started with ScyllaDB via AWS Marketplace with two available offerings:

ScyllaDB Cloud is a great option for lean organizations that don’t want to handle backend administration of cloud services. All real-time monitoring, upgrades, backups, restores, and repairs are handled by the ScyllaDB team.

ScyllaDB Enterprise is for teams who have the interest and wherewithal to manage their own clusters.

Conclusion

AWS and ScyllaDB continue to innovate at a relentless pace because of the needs and plans of forward-thinking companies. Other organizations looking to embrace a database built to manage the speed and scale of data generated in this next tech cycle can find ScyllaDB Cloud and ScyllaDB Enterprise on AWS Marketplace.

For more information, contact ScyllaDB at info@scylladb.com.

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.



ScyllaDB – AWS Partner Spotlight

ScyllaDB is an AWS Partner and NoSQL database that enables real-time applications that run at global scale. ScyllaDB’s fully managed cloud service, ScyllaDB Cloud, handles petabytes of data with ease, performing millions of operations per second with predictable low latencies and global high availability.
