AWS Storage Blog

How Cohesity uses Amazon EBS direct APIs to accelerate application backup and recovery times

When backing up applications, AWS Backup and Restore Partners seek methods that minimize complexity and reduce costs for their customers. Most backup applications protect Amazon Elastic Block Store (EBS) volumes using EBS snapshots as part of their Amazon EC2 protection features. For backups with long-term retention, backup applications offer additional streaming backup capabilities that store backup data on Amazon Simple Storage Service (S3) and Amazon S3 Glacier storage classes for cost-effective long-term storage. Technologies such as compression and deduplication are also used to reduce the backup storage footprint. To support streaming backup capabilities, backup application workflows use temporary EBS volumes to read data from EBS snapshots, resulting in operational complexity and management overhead for customers. In addition, for incremental backups, backup vendors determine the changed blocks by comparing each block in the snapshot with the data stored in the backup repository. This further increases backup times and results in longer application recovery point objectives (RPOs).

AWS introduced Amazon EBS direct APIs to provide a set of primitives that simplify this backup and recovery workflow for Amazon EBS volumes. These APIs can read data from and write data to Amazon EBS snapshots directly, eliminating the need for temporary Amazon EBS volumes. They also provide a list of blocks that have changed with respect to the last snapshot, which makes incremental backups faster and removes the need for backup applications to determine changed blocks via worker instances. As a result, integrating with Amazon EBS direct APIs helps backup applications reduce the overall complexity of the solution and deliver better backup and recovery performance for applications.

In this blog post, we discuss the challenges that backup applications face when protecting Amazon EBS volumes, and how Amazon EBS direct APIs help solve those challenges. We then walk you through the steps of Cohesity’s DataProtect Amazon EBS volume protection feature that is built on Amazon EBS direct APIs. Cohesity, an AWS Advanced Technology Partner, has a portfolio that encompasses multiple as-a-Service offerings, including Backup as a Service (BaaS), which is available in AWS Marketplace as Cohesity DataProtect.

The challenges with protecting Amazon EBS volumes

Figure 1 below highlights the typical backup workflow that backup applications follow. In step 1, the backup application, running either directly in your AWS account or in a separate service account, creates a snapshot of the Amazon EBS volume attached to the Amazon EC2 instance whose application needs to be protected. The backup application then creates an EBS volume from the EBS snapshot, spins up a temporary worker instance, and attaches the volume to that instance, as shown in steps 2 and 3. Next, the temporary instance reads the data from the attached volume, processes it, and stores it in the backup repository, typically hosted on Amazon S3 (step 4).

Figure 1: Backup application workflow for Amazon EBS volume backup

For full backups, the temporary instance reads all occupied file system blocks on the volume, whereas for incremental backups, it needs to know which blocks have changed since the last backup. To determine which blocks have changed, the application reads all the occupied blocks on the volume, calculates a checksum for each block, and compares it with the checksum stored for that block in the backup repository from the last backup. Because the entire volume has to be read every time an incremental backup runs, and a checksum has to be calculated and compared for each block, backups take longer to complete. Reading the data and processing checksums also requires temporary Amazon EBS volumes and worker instances, which increases the overall cost and complexity of the backup solution.
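To make the cost of this approach concrete, here is a minimal Python sketch of checksum-based change detection. The block size, the choice of SHA-256, and the function names are assumptions for illustration only; they are not taken from any particular backup product.

```python
import hashlib

BLOCK_SIZE = 512 * 1024  # assumed block size, for illustration only


def block_checksums(volume_bytes, block_size=BLOCK_SIZE):
    """Return a SHA-256 hex digest for every block of the volume image."""
    return [
        hashlib.sha256(volume_bytes[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(volume_bytes), block_size)
    ]


def changed_block_indexes(volume_bytes, previous_checksums, block_size=BLOCK_SIZE):
    """Hash the whole volume, then compare each block with the last backup."""
    current = block_checksums(volume_bytes, block_size)
    return [
        index
        for index, digest in enumerate(current)
        if index >= len(previous_checksums) or digest != previous_checksums[index]
    ]
```

Note that `changed_block_indexes` has to read and hash every block even when only a few of them changed, which is exactly the overhead described above.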

This process is reversed for restores. The Amazon EBS volume to be restored is first mounted on a temporary worker instance, in the target Availability Zone of the application that must be restored. Backup data is then read from the backup repository and written back to this volume, after which the volume is detached and re-attached to the application Amazon EC2 instance. This process leads to increased infrastructure costs and complexity.

Amazon EBS direct APIs simplify and accelerate the backup workflow

Amazon EBS direct APIs are designed to solve challenges related to backup and recovery performance and to simplify the overall architecture of backup applications. For full backups, the backup application first creates a snapshot of the volume of the application to be protected and uses the ListSnapshotBlocks API to determine the list of occupied blocks on the volume from the snapshot. The application then uses the GetSnapshotBlock API to read the contents of those blocks and stores the blocks in the backup repository.
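The full backup flow can be sketched in Python roughly as follows. The function assumes that `ebs` behaves like the boto3 client returned by `boto3.client("ebs")`; `store_block` is a hypothetical callback standing in for whatever writes data to the backup repository.

```python
def backup_full_snapshot(ebs, snapshot_id, store_block):
    """Copy every occupied block of a snapshot into a backup repository.

    `ebs` is expected to behave like boto3.client("ebs"); `store_block` is a
    hypothetical callback that receives (block_index, data).
    """
    copied = 0
    kwargs = {"SnapshotId": snapshot_id}
    while True:
        # ListSnapshotBlocks is paginated; follow NextToken until it runs out.
        page = ebs.list_snapshot_blocks(**kwargs)
        for block in page.get("Blocks", []):
            response = ebs.get_snapshot_block(
                SnapshotId=snapshot_id,
                BlockIndex=block["BlockIndex"],
                BlockToken=block["BlockToken"],
            )
            # BlockData is a stream; read() returns the block's bytes.
            store_block(block["BlockIndex"], response["BlockData"].read())
            copied += 1
        token = page.get("NextToken")
        if not token:
            return copied
        kwargs["NextToken"] = token
```

With real credentials in place, a call might look like `backup_full_snapshot(boto3.client("ebs"), snapshot_id, store_block)`. A production implementation would also need retries and handling for block tokens that expire, which this sketch leaves out.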

For incremental backups, the ListChangedBlocks API is used to determine the changed blocks on the snapshot, with respect to the last snapshot, and these blocks are directly backed up as above. No temporary Amazon EBS volumes or worker instances are required for this method.
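As a rough sketch, listing the changed blocks between two snapshots could look like the following in Python. Again, `ebs` is assumed to behave like `boto3.client("ebs")`. Skipping entries that carry no `SecondBlockToken` is a defensive assumption on our part, not documented backup-vendor behavior.

```python
def changed_blocks(ebs, previous_snapshot_id, current_snapshot_id):
    """Yield (block_index, block_token) for blocks that differ between snapshots.

    The yielded token refers to the block in the current (second) snapshot,
    so it can be passed to GetSnapshotBlock for that snapshot.
    """
    kwargs = {
        "FirstSnapshotId": previous_snapshot_id,
        "SecondSnapshotId": current_snapshot_id,
    }
    while True:
        page = ebs.list_changed_blocks(**kwargs)
        for block in page.get("ChangedBlocks", []):
            token = block.get("SecondBlockToken")
            if token is not None:  # assumption: nothing to read without a token
                yield block["BlockIndex"], token
        next_token = page.get("NextToken")
        if not next_token:
            return
        kwargs["NextToken"] = next_token
```

Only the blocks yielded here need to be read and stored, so the amount of data read for an incremental backup scales with the amount of change rather than with the size of the volume.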

For the restore workflow, the backup application simply creates a new Amazon EBS snapshot and uses the PutSnapshotBlock API to restore the backup data from the backup repository. The backup application then creates an Amazon EBS volume from the snapshot and attaches it directly to the Amazon EC2 instance on which the application is being restored. Similar to the backup process, no temporary worker instances are required.
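A minimal Python sketch of the restore path is shown below, assuming `ebs` behaves like `boto3.client("ebs")`. The `blocks` iterable of `(block_index, data)` pairs is a hypothetical stand-in for data fetched from the backup repository. The EBS direct APIs work with 512 KiB blocks and expect a Base64-encoded SHA-256 checksum with each block written.

```python
import base64
import hashlib

BLOCK_SIZE = 512 * 1024  # EBS direct APIs operate on 512 KiB blocks


def restore_to_snapshot(ebs, volume_size_gib, blocks):
    """Write backed-up blocks into a new snapshot and seal it.

    `ebs` is expected to behave like boto3.client("ebs"); `blocks` is a
    hypothetical iterable of (block_index, data) pairs from the repository.
    """
    snapshot_id = ebs.start_snapshot(VolumeSize=volume_size_gib)["SnapshotId"]
    written = 0
    for block_index, data in blocks:
        if len(data) != BLOCK_SIZE:
            raise ValueError("each block must be exactly 512 KiB")
        checksum = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
        ebs.put_snapshot_block(
            SnapshotId=snapshot_id,
            BlockIndex=block_index,
            BlockData=data,
            DataLength=len(data),
            Checksum=checksum,
            ChecksumAlgorithm="SHA256",
        )
        written += 1
    # The snapshot stays pending until it is sealed with the block count.
    ebs.complete_snapshot(SnapshotId=snapshot_id, ChangedBlocksCount=written)
    return snapshot_id
```

The returned snapshot ID can then be used to create the Amazon EBS volume once the snapshot reaches the completed state. Error handling and cleanup of partially written snapshots are omitted here for brevity.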

Using Amazon EBS direct APIs is fast because they directly provide the list of changes between snapshots, so the backup application does not have to compute the changes itself. And because they do not need temporary worker instances or Amazon EBS volumes (possibly in each Availability Zone), Amazon EBS direct APIs often result in lower cost and complexity.

In addition, Amazon EBS direct APIs support reading and writing data at 500 MiBps from a snapshot, which leads to faster backup and recovery times. This simplified process helps customers achieve better recovery time objectives (RTOs) and recovery point objectives (RPOs) for their applications.
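For a back-of-the-envelope feel for what that throughput means, the Python snippet below computes a best-case transfer time. The 100 GiB and 5 GiB figures are made-up examples, and the estimate assumes the 500 MiBps rate is sustained for the entire transfer, which real workloads may not achieve.

```python
MIB_PER_GIB = 1024


def min_transfer_seconds(data_gib, throughput_mib_per_s=500):
    """Best-case transfer time, assuming the throughput is fully sustained."""
    return data_gib * MIB_PER_GIB / throughput_mib_per_s


full = min_transfer_seconds(100)       # hypothetical: 100 GiB of occupied blocks
incremental = min_transfer_seconds(5)  # hypothetical: 5 GiB of changed blocks
print(f"full: {full:.1f} s, incremental: {incremental:.1f} s")
```

Under these assumptions the full backup needs at least about 205 seconds of transfer time, while the incremental backup needs roughly 10 seconds, which is why reading only the changed blocks matters so much for backup windows.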

With that background, let’s focus on how Cohesity’s DataProtect as a Service uses Amazon EBS direct APIs for protecting Amazon EBS volumes.

Cohesity DataProtect as a Service – Amazon EBS volume backup workflow

Cohesity’s DataProtect as a Service runs in Cohesity’s AWS accounts and accesses AWS resources in customer accounts through cross-account AWS Identity and Access Management (IAM) roles. To protect Amazon EBS volumes, Cohesity deploys a SaaS connector, a lightweight agent that runs on Amazon EC2 instances in the customer’s AWS account. Figure 2 below highlights the backup workflow.

Figure 2: Cohesity DataProtect as a Service EBS volume backup workflow
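Before walking through the steps, here is a generic Python sketch of the cross-account access pattern mentioned above. It is not Cohesity's implementation: the function name and session name are hypothetical, and `sts` is assumed to behave like `boto3.client("sts")`.

```python
def assume_customer_role(sts, role_arn, session_name="backup-session"):
    """Return temporary credentials as keyword arguments for a boto3 client.

    `sts` is expected to behave like boto3.client("sts"); `role_arn` is the
    cross-account role the customer has granted to the backup service.
    """
    response = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

The returned dictionary can be unpacked into a client constructor, for example `boto3.client("ebs", **assume_customer_role(boto3.client("sts"), role_arn))`, so that subsequent calls run with the permissions of the customer-side role.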

Step 1:

The Cohesity SaaS connector creates Amazon EBS snapshots of the Amazon EBS volumes of the application that needs to be protected.

Step 2:

For full backups, it uses the ListSnapshotBlocks API to get a list of all occupied blocks on the volume. For incremental backups, it uses the ListChangedBlocks API to get the list of changed blocks since the previous snapshot.

Step 3:

Once the SaaS connector has the list of blocks that need to be backed up, it uses the GetSnapshotBlock API to read the data for those blocks.

For optimal read speed, the SaaS connector creates multiple subtasks to read data in parallel, with each subtask reading 32 blocks at a time from the snapshot. The data is compressed and deduplicated before being sent to Cohesity’s DataProtect as a Service. This creates an air-gapped copy of the backup data.
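The parallel read pattern in step 3 can be approximated with a thread pool, as in the Python sketch below. This is an illustration of the idea rather than Cohesity's code: `read_block` is a hypothetical callable (for example, a thin wrapper around GetSnapshotBlock), the worker count is an arbitrary assumption, and compression and deduplication are not shown.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 32  # blocks per subtask, as described above


def batches(items, size=BATCH_SIZE):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def read_blocks_in_parallel(block_refs, read_block, max_workers=8):
    """Read blocks concurrently, one subtask per batch of block references.

    `read_block` is a hypothetical callable that takes one block reference
    (for example a (block_index, block_token) pair) and returns its bytes.
    """
    def run_subtask(batch):
        return [(ref, read_block(ref)) for ref in batch]

    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns batch results in submission order.
        for batch_result in pool.map(run_subtask, batches(block_refs)):
            results.extend(batch_result)
    return results
```

Threads are a reasonable fit here because each subtask spends most of its time waiting on network calls. The same batching approach applies to writes during a restore.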

Cohesity DataProtect as a Service Amazon EBS volume restore workflow

Cohesity’s DataProtect as a Service can restore complete Amazon EC2 instances or selected Amazon EBS volumes. The EBS volume restore workflow of Cohesity DataProtect as a Service is shown in Figure 3 below.

Figure 3: Cohesity DataProtect as a Service’s EBS volume restore workflow

For restoring data back to Amazon EBS volumes, Cohesity’s SaaS connector reverses the workflow followed for backups, as described in the following steps:

Step 1:

Cohesity’s SaaS Connector creates a new Amazon EBS snapshot by calling the StartSnapshot API.

Step 2:

The SaaS connector accesses the backup data from Cohesity’s DataProtect as a Service and writes it back to this newly created snapshot via the PutSnapshotBlock API. For optimal performance, the SaaS connector creates multiple restore subtasks, with each subtask running concurrently and writing 32 blocks at a time to the snapshot. After the data has been copied to the snapshot, the CompleteSnapshot API is called to complete the snapshot.

Step 3:

Once the data is restored, for snapshots representing boot volumes, the SaaS connector creates an Amazon Machine Image (AMI) from this snapshot. AMIs provide all the information required to launch Amazon EC2 instances. If the snapshots are for data volumes, Amazon EBS volumes are created from these snapshots and attached to Amazon EC2 instances.
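Step 3 can be sketched in Python as two small helpers, assuming `ec2` behaves like `boto3.client("ec2")`. The device names, architecture, and virtualization type are assumptions for illustration and would need to match the instance being restored.

```python
def register_boot_image(ec2, snapshot_id, name, root_device="/dev/xvda"):
    """Register an AMI whose root volume is created from the restored snapshot."""
    response = ec2.register_image(
        Name=name,
        Architecture="x86_64",       # assumption: adjust to the source instance
        VirtualizationType="hvm",
        RootDeviceName=root_device,
        BlockDeviceMappings=[
            {"DeviceName": root_device, "Ebs": {"SnapshotId": snapshot_id}}
        ],
    )
    return response["ImageId"]


def attach_data_volume(ec2, snapshot_id, availability_zone, instance_id,
                       device="/dev/sdf"):
    """Create a volume from the restored snapshot and attach it to an instance."""
    volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=availability_zone
    )["VolumeId"]
    # The volume must reach the "available" state before it can be attached.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(Device=device, InstanceId=instance_id, VolumeId=volume_id)
    return volume_id
```

The volume has to be created in the same Availability Zone as the target instance, which is why the zone is passed explicitly.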

Conclusion

Amazon EBS direct APIs provide an efficient mechanism for backing up and recovering Amazon EBS volumes. These APIs form the foundation that powers Cohesity DataProtect as a Service’s Amazon EC2 and EBS data protection feature. For backups, the APIs provide a simple way to read the data directly from snapshots with high performance and determine the changes between snapshots, eliminating the need for using temporary Amazon EBS volumes or worker instances. The benefit is a faster and simpler backup and recovery process.

For restores, backup data is written directly to snapshots with high performance, from which Amazon EBS volumes are created and attached to application instances, bypassing the need for using temporary worker instances. This simplifies the overall process, reduces cost, and improves backup and recovery times for applications running on AWS.

For more information about Cohesity’s DataProtect (as a Service), please visit cohesity.com.

This blog post was co-written by AWS and Cohesity, an AWS Advanced Technology Partner.

Girish Chanchlani

Girish Chanchlani is a Principal Partner Solutions Architect at AWS and is a member of the Amazon Partner Network (APN) team that works closely with ISV Storage Partners. Prior to AWS, his experience includes working for data and storage management companies as a Product Manager covering File Systems, NAS, Media Management, and Data Protection Appliance solutions.

Edwin Galang

Edwin Galang is a Cloud Solutions Architect at Cohesity. He has 20+ years of experience in UNIX, hypervisors, storage, and data protection. He has worked as a system and SAN administrator, a professional services technical consultant, a pre-sales SME for data protection, and a product manager. Today he is focused on building solutions with public and private cloud providers with Cohesity.

AJ Park

AJ Park is a product manager on the Amazon EBS Snapshots team at Amazon Web Services (AWS). AJ is passionate about data protection and storage and has been innovating in this area for 20+ years as a software developer and a product manager.