VMware Carbon Black Secures Millions of Endpoints by Analyzing Over 1 PB of Data Per Day with Amazon Kinesis Data Streams
Learn how VMware used Amazon Kinesis Data Streams to build a scalable, cost-effective data streaming solution to enhance their security products, simplify operations, and accelerate customer growth.
VMware Carbon Black analyzes over a trillion events every day as it strives to protect its customers from malicious online behavior. To ingest data from millions of distributed edge devices efficiently at scale, the team needed to transition from a solution that built workflows for each customer individually to a scalable, less costly alternative.
VMware Carbon Black looked to Amazon Web Services (AWS) for a managed service and began to implement Amazon Kinesis Data Streams, a serverless streaming data service that makes it simple to capture, process, and store data streams at any scale. Using Amazon Kinesis Data Streams, VMware supports real-time analytics by collecting data that becomes available within milliseconds. The service has no upfront cost; users pay only for the resources they use. With it, VMware Carbon Black processes more than 1 PB of streaming data per day while reducing operating and infrastructure costs.
Opportunity | Finding a Multitenant Solution on AWS
Founded in 2002, Carbon Black was acquired in October 2019 by multicloud services provider VMware—a company that delivers solutions for cloud computing, app modernization, networking, security, and digital workplaces. “You can think of our system as one big search box—we have around eight million individual devices belonging to different individual customers sending us events,” says Stoyan Dimkov, staff engineer and software architect at VMware Carbon Black. “We have thousands of customers sending this data across different regions of the globe.”
The VMware Carbon Black team started to refine its data analytics architecture while the company was still Carbon Black. The company had an on-premises product for automated detection and response for threats, such as ones to network security. First, the team migrated this on-premises product to the cloud, using an instance or cluster of instances for each individual customer. But after a year, the use of the product by smaller customers increased the overall cost per customer. In short, the infrastructure and operational costs were increasing faster than usage at scale. The team soon realized that it needed a more manageable, efficient solution. “It became obvious quite quickly that the solution was unwieldy,” says Corey Leopold, staff engineer at VMware. “We were creating more and more instances for a growing number of customers, and costs were increasing.”
To achieve the necessary elasticity and efficiency, VMware Carbon Black needed a multitenant solution that could support multiple smaller customers on a single resource and avoid wasted capacity. “Cost is driven by the smallest customer,” says Leopold. “Under our previous solution, the minimum cost threshold for every customer was relatively high.” By adopting Amazon Kinesis Data Streams, VMware Carbon Black built a serverless solution with the flexibility to provision only the resources it needed for each customer while reducing the maintenance load on its small team.
Solution | Achieving Scale Reliably and Efficiently
The team started implementing the serverless solution on AWS in late 2017 and quickly launched an early-access program to begin migrating customers to the new architecture. It spent the next year refining its use of Amazon Kinesis Data Streams so that the solution would stream data in a compact, efficient way to help control and optimize costs. Within 1 year, the team had migrated all customers and was processing more data at a lower cost than the previous single-tenant-per-customer solution.
VMware Carbon Black also needed a solution that could scale up quickly without overprovisioning. To meet this requirement efficiently, the team measures each customer’s incoming data on Amazon Kinesis Data Streams so that it can scale each data stream proactively in response to customers’ traffic spikes. “When individual customers’ data increases or decreases, we can use the elasticity of Amazon Kinesis Data Streams to scale compute up or down to process data reliably while effectively managing our cost,” says Dimkov. “This is why Amazon Kinesis Data Streams is a good fit.” By routing incoming data by customer, VMware Carbon Black can also isolate and manage unexpected data influxes without broadly affecting its customer base.
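The proactive scaling described above can be illustrated with a minimal sketch. It is not VMware Carbon Black's implementation: the function names, the customer IDs, and the 25 percent headroom are hypothetical, and the per-shard write limit of roughly 1 MB per second is an assumption based on the documented limit for provisioned-mode streams.

```python
import math

# Assumption: a provisioned-mode shard accepts roughly 1 MB/s of writes.
SHARD_WRITE_BYTES_PER_S = 1_000_000


def target_shard_count(observed_bytes_per_s, headroom=0.25,
                       shard_limit=SHARD_WRITE_BYTES_PER_S, min_shards=1):
    """Return the shard count needed for the measured ingest rate plus headroom."""
    needed = observed_bytes_per_s * (1 + headroom) / shard_limit
    return max(min_shards, math.ceil(needed))


def plan_scaling(ingest_by_customer, headroom=0.25):
    """Map each customer's measured ingest rate (bytes/s) to a target shard count."""
    return {
        customer: target_shard_count(rate, headroom=headroom)
        for customer, rate in ingest_by_customer.items()
    }


# Hypothetical measurements for two customers, in bytes per second.
plan = plan_scaling({"customer-a": 800_000, "customer-b": 4_000_000})
print(plan)
```

Keeping the plan per customer mirrors the idea of routing data by customer: a spike from one customer raises only that customer's target, so the rest of the customer base is unaffected.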
The solution on Amazon Kinesis Data Streams writes more than 1 PB of data per day and reads up to 3 PB per day globally. Previously, the team at VMware Carbon Black had limited visibility into the aggregate throughput of all customer instances. By scaling Amazon Kinesis Data Streams in response to actual data consumption, VMware Carbon Black can continue to grow flexibly without rearchitecting its solution, which optimizes engineering resources. “We have grown our customer base multiple folds in last few years, and the volume of traffic we capture from individual endpoints evolves alongside our product,” says Dimkov. VMware Carbon Black has also delivered customer satisfaction with its new solution. “We’ve migrated all customers, including our largest customers, seamlessly to the solution, and they’ve been happy with the consistent real-time insights at scale,” says Leopold.
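To put the daily volumes in perspective, the short calculation below converts them to average sustained throughput. It assumes decimal units (1 PB = 10^15 bytes) and an even spread over the day; real traffic is likely burstier, so peak rates would be higher than these averages.

```python
BYTES_PER_PB = 10**15      # assumption: decimal petabyte
BYTES_PER_GB = 10**9       # assumption: decimal gigabyte
SECONDS_PER_DAY = 86_400


def pb_per_day_to_gb_per_s(pb_per_day):
    """Convert a daily volume in PB to an average rate in GB per second."""
    return pb_per_day * BYTES_PER_PB / SECONDS_PER_DAY / BYTES_PER_GB


write_gb_s = pb_per_day_to_gb_per_s(1)  # more than 1 PB written per day
read_gb_s = pb_per_day_to_gb_per_s(3)   # up to 3 PB read per day
print(f"average write rate: {write_gb_s:.1f} GB/s")
print(f"average read rate:  {read_gb_s:.1f} GB/s")
```

Under these assumptions, the write volume averages about 11.6 GB per second and the read volume about 34.7 GB per second.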
The solution also uses Amazon Simple Storage Service (Amazon S3)—an object storage service that offers industry-leading scalability, data availability, security, and performance—to store VMware Carbon Black’s data consistently and reliably. The overall setup delivers cost savings by reducing data redundancy on VMware Carbon Black’s live compute storage. “The previous system effectively limited the amount of data a customer could have indexed by the size of the provisioned instances,” says Leopold. “Now, due to the Amazon S3 storage and dynamic hosting of indexes, every customer is granted 30 days’ worth of searchable data, no matter how much data they send into the system.”
With its new solution, VMware Carbon Black has improved query performance significantly and delivered more scalable search capabilities. “We start with billions of events going into our streams, which [are] essentially aggregated into sets of up to 50 GB that are then migrated to Amazon S3,” says Dimkov. “After we have processed the ingestion, we can build our security products on top of that and make it searchable.”
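The aggregation step Dimkov describes, grouping streamed events into size-capped sets before they are stored, can be sketched as a simple batching generator. This is an illustrative assumption about how such grouping might work, not the team's actual pipeline; the function name and the tiny 8-byte cap in the usage example are hypothetical stand-ins for the 50 GB limit mentioned in the quote.

```python
from typing import Iterable, Iterator, List


def batch_by_size(events: Iterable[bytes],
                  max_batch_bytes: int) -> Iterator[List[bytes]]:
    """Group events into batches whose total size stays within max_batch_bytes.

    An event larger than the cap is emitted as a batch of its own.
    """
    batch: List[bytes] = []
    size = 0
    for event in events:
        # Close the current batch if adding this event would exceed the cap.
        if batch and size + len(event) > max_batch_bytes:
            yield batch
            batch, size = [], 0
        batch.append(event)
        size += len(event)
    if batch:
        yield batch


# Usage with a tiny cap so the grouping is easy to follow.
batches = list(batch_by_size([b"aaaa", b"bbbb", b"cc", b"dddddd"], 8))
print(batches)
```

In a real pipeline each completed batch would then be written out as one object, for example to Amazon S3, which is where the size cap keeps individual objects manageable.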
Outcome | Enhancing Operations and Supporting Customers in the Cloud
Using managed services from AWS reduces VMware Carbon Black’s management load and helps the team dedicate more resources to supporting its internally managed services in production. “The biggest advantage is the managed nature of our solution on AWS,” says Dimkov. “This has shaped our architecture and helped us shift complexity elsewhere.” By focusing on its expertise while AWS manages data streaming, VMware Carbon Black efficiently scales to meet evolving customer traffic and needs so that millions of endpoints continue to be secure in real time.
VMware Carbon Black plans to continue enhancing the scaling and efficiency of its solution using Amazon Kinesis Data Streams. It is also transitioning more products to the solution while developing full feature parity in support of its broader cloud-adoption and growth goals. “We want to grow fast and build more products and do it all in the cloud,” says Dimkov.
VMware has been developing virtualization software since 1998. Headquartered in Palo Alto, California, the company is known for its app modernization, cloud, networking, security, and digital workspace offerings.
AWS Services Used
Amazon Kinesis Data Streams
Amazon Kinesis Data Streams is a serverless streaming data service that makes it easy to capture, process, and store data streams at any scale.
Amazon S3
Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance.