Druva Controls Costs with Highly Secure Scaling Strategy on AWS
2021
From Monolith to Microservices
“The speed at which we can launch instances and the flexibility of the overall compute infrastructure on AWS are astounding.”
Kiran Chitnis
Senior Director of Cloud Operations, Druva
Compliance with Regulatory Guidelines
Druva began its containerization journey in 2018 to dynamically scale its services and greatly reduce single points of failure. As its volume of microservices grew, the company began evaluating container orchestration services, both cloud-based and open-source, to ease the administrative burden on its operations team.
Most of Druva’s customers are large organizations: Fortune 500 enterprises such as big pharma corporations, U.S. government entities, and public agencies. Druva chose Amazon Elastic Container Service (Amazon ECS) for its global availability and its compliance with the Federal Risk and Authorization Management Program (FedRAMP). Compliance with FedRAMP, as well as SOC 2, the Health Insurance Portability and Accountability Act (HIPAA), and the Payment Card Industry Data Security Standard (PCI DSS), is a vital first step in obtaining and retaining high-profile customers.
“Our customers recognize AWS compliance and security certifications, which makes our lives much easier because we don’t need to pursue time-consuming, costly third-party assessments,” says Kiran Chitnis, senior director of cloud operations at Druva.
Global Footprint Ensures Data Residency and Low Latency
Containers have proven an efficient means for scaling the enterprise. As of 2021, Druva is active in 18 AWS Regions including AWS GovCloud (US), up from 11 regions when it started containerizing in 2018. Expanding its footprint and activating new AWS Regions has allowed Druva to onboard new customers in countries such as South Africa and Sweden while satisfying data residency and latency requirements. Low latency is critical for Druva’s customers when backing up large volumes of high-value data.
Druva’s customers can also maintain a lower recovery time objective (RTO) and recovery point objective (RPO) thanks to the scalability that containers offer. Previously, a customer’s backup job that was rerouted to an AWS Region already at maximum server capacity would go into a queue, driving up the RPO until more resources became available. Now, with containerization, Druva seldom faces a scenario where a customer’s backup job has to wait because its infrastructure is saturated. “With automation now in place, we have seamless elasticity to grow and the orchestration capabilities to dynamically spin up resources in real time for varying workloads,” says Chitnis.
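The queueing effect described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Druva’s scheduler: the function `max_wait_ticks`, the arrival pattern, and both capacity models are hypothetical and chosen only to show why a fixed capacity ceiling makes jobs wait during a burst while elastic capacity does not.

```python
from collections import deque


def max_wait_ticks(arrivals, capacity_for):
    """Return the longest time (in ticks) any job waited before starting.

    arrivals: number of backup jobs arriving at each tick (hypothetical data).
    capacity_for: function mapping the current backlog size to the number of
        jobs that can start this tick (hypothetical capacity model; it must
        return a positive number or the backlog never drains).
    """
    queue = deque()  # arrival tick of each waiting job, oldest first
    worst = 0
    tick = 0
    # Keep ticking after arrivals stop until the backlog is drained.
    while tick < len(arrivals) or queue:
        if tick < len(arrivals):
            queue.extend([tick] * arrivals[tick])
        slots = capacity_for(len(queue))
        for _ in range(min(slots, len(queue))):
            worst = max(worst, tick - queue.popleft())
        tick += 1
    return worst


# A burst of jobs in the middle of an otherwise quiet period.
arrivals = [2, 2, 10, 10, 2, 2]

# Fixed capacity: at most 4 jobs can start per tick, however long the queue.
fixed = max_wait_ticks(arrivals, lambda backlog: 4)

# Elastic capacity: capacity grows to match the backlog.
elastic = max_wait_ticks(arrivals, lambda backlog: max(4, backlog))

print("worst wait with fixed capacity:", fixed)
print("worst wait with elastic capacity:", elastic)
```

With this arrival pattern, the fixed-capacity run reports a worst-case wait of 3 ticks and the elastic run reports 0. A job that waits in the queue delays the next recovery point, which is how saturation translates into a higher RPO.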
Compute Cost Reduction with 99.5% Uptime
In the first three years after containerizing, Druva nearly quadrupled the amount of data under management, from 45 PB to 175 PB. But by leveraging the out-of-the-box functionality of Amazon ECS, the business increased its operations headcount by only two during this period.
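The growth figures above can be checked with a few lines of arithmetic. The 45 PB and 175 PB values and the three-year span come from the text; treating the growth as a smooth compound annual rate is an assumption made only for illustration, since the source does not say how the growth was distributed over time.

```python
start_pb = 45   # data under management at the start (from the text)
end_pb = 175    # data under management three years later (from the text)
years = 3

# Overall growth multiple: "nearly quadrupled" means just under 4x.
growth_factor = end_pb / start_pb

# Assumption: growth compounds evenly each year.
annual_rate = growth_factor ** (1 / years) - 1

print(f"growth factor: {growth_factor:.2f}x")
print(f"implied compound annual growth: {annual_rate:.1%}")
```

The growth factor comes out at about 3.89x, which matches "nearly quadrupled", and the implied compound rate is roughly 57 percent per year under the even-growth assumption.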
Costs have also decreased through the use of Amazon EC2 Spot Instances in conjunction with Amazon ECS. Before 2020, the company relied on a mix of up to 80 percent Amazon EC2 Reserved Instances, 10 percent On-Demand Instances, and very few Spot Instances.
The balance has now shifted so that Spot Instances handle most workloads, with Reserved Instances deployed when Spot Instances become unavailable. The company’s On-Demand Amazon EC2 instance usage has dropped to zero. As of 2021, Druva projects savings of 20–25 percent on its monthly Amazon EC2 computing bill. Critically, the shift hasn’t affected system uptime: Druva has maintained 99.5 percent availability for over 10 years on AWS.
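The Spot-first placement with a Reserved fallback can be sketched as a simple allocation rule. This is an illustrative sketch only: `place_tasks` and its parameters are hypothetical names, the capacity numbers are made up, and the code is neither an Amazon ECS API nor a description of how Druva actually configures its clusters.

```python
def place_tasks(task_count, spot_available, reserved_available):
    """Split tasks across purchase options, preferring Spot capacity.

    All three arguments are counts of task slots (hypothetical units).
    Tasks go to Spot first, then to Reserved capacity when Spot runs out.
    On-Demand is deliberately not used, mirroring the mix described above.
    """
    if min(task_count, spot_available, reserved_available) < 0:
        raise ValueError("counts must be non-negative")
    on_spot = min(task_count, spot_available)
    on_reserved = min(task_count - on_spot, reserved_available)
    unplaced = task_count - on_spot - on_reserved
    return {"spot": on_spot, "reserved": on_reserved, "unplaced": unplaced}


# Plenty of Spot capacity: everything lands on Spot.
print(place_tasks(10, spot_available=10, reserved_available=5))

# Spot capacity shrinks: the remainder falls back to Reserved capacity.
print(place_tasks(10, spot_available=3, reserved_available=5))
```

In the second call, 3 tasks run on Spot, 5 fall back to Reserved capacity, and 2 remain unplaced. A real deployment would need a policy for that unplaced remainder, which this sketch leaves out.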
Speedier Deployment with Native Integrations
Druva recently onboarded its largest customer to date, the Port of New Orleans, with 20,000 account users and about 4 PB of data to secure. Accelerated software delivery on AWS enabled the company to onboard this customer in just three weeks with no manual intervention.
Druva’s customers are also saving time with its modern microservices architecture. In the case of the Port of New Orleans, backups that used to take a day now take just 30 minutes or less. “The move to Druva gave us the opportunity to consolidate data that was previously dispersed throughout the enterprise and off-site. We have the ability to restore and backup data within seconds and continue to meet data requirements in a timely manner,” says David Cordell, chief information officer at the Port of New Orleans.
Shifting Focus to Cloud Economics and New Technologies
The Druva platform’s high degree of automation allows the company’s operations team to build their skill sets in cloud economics instead of simply performing maintenance tasks. They can also more effectively identify and remove bottlenecks related to capacity constraints to further optimize their cloud architecture.
Druva is also investing more in R&D with the cost savings it has achieved. It’s giving engineers the freedom to experiment with new technology, such as using telemetry data to develop and train machine learning (ML) models with Amazon SageMaker, to better understand how customers use Druva. Engineering teams are starting to build business intelligence dashboards using tools including Amazon Athena and Amazon QuickSight for ML-powered insights. Chitnis concludes, “AWS makes it straightforward for us to take on any new projects and scale them for our global customer base.”
Learn More
To learn more, visit aws.amazon.com/ec2/spot
About Druva
Druva offers cloud-based data protection and backup services to 4,000 organizations across the globe. Its customers include Fortune 500 companies, governments, and public agencies.
Benefits of AWS
- Scales from managing 45 PB of data to 175 PB in 3 years
- Onboards customers with 20,000 users in 3 weeks with no manual intervention
- Saves 20–25% on monthly computing costs
- Maintains 99.5% availability for over 10 years
- Adheres to FedRAMP, SOC 2, HIPAA, and other compliance controls
- Ensures low latency and data residency compliance
- Automates software delivery so engineers can focus on innovation
AWS Services Used
Amazon EC2 Spot Instances
Amazon EC2 Spot Instances let you take advantage of unused EC2 capacity in the AWS cloud.
Amazon Elastic Container Service
Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service that helps you easily deploy, manage, and scale containerized applications.
Amazon QuickSight
Amazon QuickSight is a scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service built for the cloud.
Amazon Athena
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.