AWS Cloud Financial Management

Singular improves price-to-performance ratio by 35% in just a few days with AWS Graviton2

Singular is a marketing data platform that empowers businesses by driving growth with unified marketing data, intelligent insights, and automation. It has over 950 virtual servers, 8 Kubernetes clusters, and 5 petabytes of raw data in its AWS environments. To deliver the most accurate, timely, and actionable cross-platform analytics to its customers, Singular processes 15 billion real-time events and 5 million batch processing jobs per day across 3 AWS Regions and additional on-premises resources.

Following the launch of AWS Graviton2 processors, Singular decided to migrate all relevant workloads to optimize its price-to-performance ratio.

Using the right modernization mental model

“After a few experimentations, looking at available resources from AWS and consulting with AWS experts, we started to build a robust, replicable, and scalable way to do so.” –Ofir Nir, Head of DevOps

Singular’s mental model is a blueprint that empowers engineers to take action, which addresses a top concern for FinOps practitioners according to the State of FinOps 2022 report. Designed with migration at scale in mind from the start, the blueprint can be applied whenever a team considers migrating to a new service or infrastructure component (e.g., Amazon EBS gp3 volumes, AWS Nitro SSD, or Amazon EC2 M6i instances):

  1. Identify relevant workloads to migrate (many workloads don’t require significant changes).
  2. Create the building blocks like relevant AMIs, build systems, monitoring agents, and runtime environments.
  3. Prepare a comparison infrastructure to ensure you can measure the effectiveness of the change.
  4. Test your workloads progressively, starting with staging environments, then production-like ones, and ultimately migrating to actual production environments.
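Step 1 of this blueprint can be sketched as a small triage over a workload inventory. This is a minimal illustration, not Singular's tooling: the `Workload` class, the `triage` function, and every workload and dependency name below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """One entry in a hypothetical workload inventory."""
    name: str
    # Maps each dependency (OS image, runtime, library) to whether an
    # arm64 build is known to exist. All values here are made up.
    arm64_ready: dict[str, bool] = field(default_factory=dict)


def triage(workloads: list[Workload]) -> dict[str, list[str]]:
    """Return, per workload, the dependencies still blocking an arm64 move."""
    return {
        w.name: sorted(dep for dep, ok in w.arm64_ready.items() if not ok)
        for w in workloads
    }


# Hypothetical inventory: one workload with a blocker, one without.
inventory = [
    Workload("query-cluster", {"os-image": True, "jvm": True, "native-lib-x": False}),
    Workload("batch-jobs", {"os-image": True, "python": True}),
]

blockers = triage(inventory)
# Workloads with no blocking dependency are candidates to migrate first.
ready = [name for name, deps in blockers.items() if not deps]
```

Workloads with an empty blocker list can move straight to building blocks and testing, while the rest need an upgrade or a workaround first.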

Kickstarting the project

To kickstart the project with stakeholder engagement and momentum, Singular created a hackathon project within its Optimizing API service team. Existing servers were tested first, starting with the Apache Druid cluster (100 Amazon EC2 instances of the ‘R’ and ‘M’ families, hosting stateful applications). The JVM-based (Java virtual machine) environment required only small adjustments: using the ARM AMI for Ubuntu 20, applying a workaround for missing libraries, and then using progressive deployment for testing.

The results showed 15-20% lower load average and 20-25% faster queries for Singular’s Apache Druid application. This led to the decision to adopt Graviton at scale, and reinvest the savings into new capabilities.

Measuring success

Migrating to Graviton2 was straightforward for most workloads, but it required consistent application of the mental model for migrations described earlier:

  1. Singular listed out all workload dependencies and their availability for Graviton/arm64 (e.g. OS and version, libraries, frameworks, and runtimes used).
  2. It upgraded dependencies to their latest versions to ensure compatibility (these are generally available in Linux distribution repositories, Amazon ECR, Docker Hub, GitHub, etc.).
  3. Then, Singular carried out A/B testing to measure the change in performance (load testing with production-like traffic, and measuring both ‘load average’ and ‘query times’).
  4. Finally, it deployed workloads to production in a progressive manner (using a combination of ‘Canary’ and ‘Rolling’ deployment techniques).
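The A/B comparison in step 3 boils down to a relative change between two sets of samples. The sketch below assumes the two metrics named above have already been collected for an x86 baseline and an arm64 candidate; the sample values, the dictionary keys, and the choice of mean for load average and median for query time are illustrative assumptions, not Singular's method or data.

```python
from statistics import mean, median


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline in percent (negative = lower)."""
    return (candidate - baseline) / baseline * 100.0


def compare(baseline: dict[str, list[float]],
            candidate: dict[str, list[float]]) -> dict[str, float]:
    """Summarize both metrics as a percent change against the baseline."""
    return {
        "load_average_pct": percent_change(
            mean(baseline["load_average"]), mean(candidate["load_average"])
        ),
        "query_time_pct": percent_change(
            median(baseline["query_ms"]), median(candidate["query_ms"])
        ),
    }


# Made-up samples for illustration only.
x86 = {"load_average": [8.0, 10.0, 12.0], "query_ms": [400.0, 500.0, 600.0]}
arm64 = {"load_average": [6.0, 8.0, 10.0], "query_ms": [300.0, 400.0, 500.0]}

result = compare(x86, arm64)
```

Both groups should receive the same production-like traffic, otherwise the percent change reflects the load difference rather than the processor.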

In just a few days, Singular completed the migration of its Apache Druid workloads (~100 Amazon EC2 instances, in sizes from large to 24xlarge), improving its price-to-performance ratio by an average of 35%.
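The article does not say how the 35% figure was calculated. One common way to define a price-to-performance gain is performance per unit of cost, which combines a cost change and a performance change into one number. The function name and the input ratios below are assumptions for illustration and are not Singular's measurements.

```python
def price_performance_gain(cost_ratio: float, performance_ratio: float) -> float:
    """Fractional gain in performance per dollar.

    cost_ratio: new cost / old cost for the same capacity (0.8 = 20% cheaper).
    performance_ratio: new throughput / old throughput (1.2 = 20% faster).
    """
    if cost_ratio <= 0 or performance_ratio <= 0:
        raise ValueError("ratios must be positive")
    return performance_ratio / cost_ratio - 1.0


# Illustrative inputs only: 20% cheaper and 20% faster.
gain = price_performance_gain(cost_ratio=0.8, performance_ratio=1.2)
```

Because the two effects multiply, the combined gain is larger than either input on its own.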

Singular measured outcomes and assessed return on investment by tracking cost at KPI, unit, and customer granularity. It did so with a combination of in-house tools, AWS native tools (AWS Cost and Usage Report and AWS Cost Explorer), and third-party tools (Finout.io) that tracked the impact on ‘Cost Per Customer’, ‘Cost Per Product/Feature’, and ‘Cost Per Business Context’.
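Unit metrics such as ‘Cost Per Customer’ are, at their core, cost rows grouped by a business tag. The sketch below shows that grouping on a simplified, hypothetical row layout; the field names (`customer`, `feature`, `cost_usd`) and the values are invented and do not mirror the actual AWS Cost and Usage Report schema or Singular's data.

```python
from collections import defaultdict


def cost_per_unit(rows: list[dict], key: str) -> dict[str, float]:
    """Sum cost by a business tag such as 'customer' or 'feature'."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["cost_usd"]
    return dict(totals)


# Hypothetical, already-tagged cost rows.
rows = [
    {"customer": "acme", "feature": "reports", "cost_usd": 120.0},
    {"customer": "acme", "feature": "exports", "cost_usd": 30.0},
    {"customer": "globex", "feature": "reports", "cost_usd": 50.0},
]

by_customer = cost_per_unit(rows, "customer")
by_feature = cost_per_unit(rows, "feature")
```

Comparing these totals before and after a migration shows which customers or features actually benefited from it.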

Singular gauged the effectiveness of the initiative by comparing the number of good workload candidates with the number actually modernized by the end of the project. The key success factor was having the FinOps team provide all the context needed for engineers to take action, including key areas such as performance impact, dependencies, and step-by-step instructions.
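That effectiveness measure can be expressed as a simple ratio of modernized workloads to good candidates. The function and workload names below are hypothetical and serve only to make the definition concrete.

```python
def modernization_rate(candidates: set[str], modernized: set[str]) -> float:
    """Share of good candidates that were actually modernized."""
    if not candidates:
        raise ValueError("no candidate workloads")
    # Only count modernized workloads that were on the candidate list.
    return len(candidates & modernized) / len(candidates)


# Hypothetical workload names.
candidates = {"query-cluster", "outbound-processing", "batch-jobs", "api"}
modernized = {"query-cluster", "outbound-processing", "legacy-tool"}

rate = modernization_rate(candidates, modernized)
```

Intersecting the two sets keeps workloads that were migrated without being on the candidate list from inflating the rate.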

Looking into the future

Following the successful deployment of Graviton2 to its Apache Druid cluster, and then to its Realtime Outbound Processing environment, Singular plans to have 60% of its workloads running on Graviton2 by the end of 2022. According to Ofir, Singular’s Python/PostgreSQL workloads are the next target environment to migrate.

🎞️Hear more about Singular’s journey to cost and performance optimization.

About Singular

Singular is a marketing data platform that empowers businesses by driving growth with unified marketing data, intelligent insights, and automation. It captures, analyzes, and refines billions of data points to deliver the most accurate, timely, and actionable cross-platform analytics to its customers.

Benefits Singular achieved using AWS

  • Migrated Apache Druid workloads to AWS Graviton2 processors to modernize its infrastructure and improve efficiency
  • Improved price-to-performance ratio by an average of 35%
  • Used AWS native tools to track, understand, and report KPIs aligned with business objectives
  • Measured 15-20% lower load average and 20-25% faster queries after deploying AWS Graviton2 to its Apache Druid and Realtime Outbound Processing environments