SAP on AWS: Build for availability and reliability
In the words of Amazon CEO Andy Jassy, ‘there is no compression algorithm for experience’. With over 5,000 SAP customers on AWS, AWS has become a platform for innovation for SAP workloads. Following our working backwards leadership principle, AWS has built several tools and services to help SAP customers build robust, reliable, and scalable SAP systems in AWS Regions across the world. In this blog, we will discuss various AWS services for building reliable SAP systems on the platform.
A robust backup policy is at the center of an enterprise’s business continuity and disaster recovery (DR) strategies. When migrating to AWS, customers can adopt new tools and services available on AWS, to simplify their SAP applications’ recovery steps from various availability events.
When planning a backup policy for your applications, consider a combination of file backups and storage snapshots to minimize recovery time objective (RTO). Mission-critical applications need protection from events that occur within an AWS Availability Zone (AZ), as well as events that can affect an entire Region. Before we go any further, it’s important to understand the system design principles for SAP systems on AWS that are detailed in the technical document, “Architecture Guidance for Availability and Reliability of SAP on AWS”, as we will expand on that topic in this blog post.
Building a backup policy on AWS
Below are some of the key services and features that can be used to build a backup policy for SAP applications and databases.
Backup of HANA databases
AWS Backint Agent for SAP HANA backs up SAP HANA databases directly to Amazon Simple Storage Service (S3) buckets and restores them using SAP management tools such as HANA Cockpit. Optionally, you can use SAP Backint Agent for Amazon S3, as per SAP Note 2935898. You can also add a storage snapshot based backup and recovery strategy to the backup policy for your SAP HANA databases. When deploying HANA using AWS Launch Wizard for SAP, you can choose to install the Backint agent to integrate backups with Amazon S3.
Backup of AnyDB databases
AnyDB (non-HANA) databases running SAP can be backed up to files on an Amazon Elastic Block Store (EBS) volume, which serves as a staging area for your backups. Once the backups are finished, the files can be uploaded to S3 using the AWS CLI. The process can be automated using features of AWS Systems Manager (SSM).
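As a minimal sketch of the upload step, the helper below assembles an `aws s3 cp --recursive` command for a staging directory. The function name, staging path, bucket name, and key prefix are hypothetical placeholders, not values from this post. The sketch only builds the argument list and does not run it.

```python
from __future__ import annotations

import shlex


def build_s3_upload_command(backup_dir: str, bucket: str, prefix: str) -> list[str]:
    """Return the argument list for uploading a backup staging directory to S3.

    backup_dir, bucket, and prefix are illustrative inputs; adjust them to
    your own staging file system and bucket layout.
    """
    if not bucket:
        raise ValueError("bucket name must not be empty")
    key_prefix = prefix.strip("/")
    # Build s3://bucket/prefix/ and avoid a double slash when prefix is empty.
    destination = f"s3://{bucket}/{key_prefix}/" if key_prefix else f"s3://{bucket}/"
    return ["aws", "s3", "cp", backup_dir, destination, "--recursive"]


if __name__ == "__main__":
    cmd = build_s3_upload_command("/backup/stage", "example-sap-backups", "anydb/full")
    # Print a shell-safe rendering of the command for review.
    print(shlex.join(cmd))
```

The returned list could be handed to `subprocess.run` from a scheduled job or wrapped in an SSM document once the database backup has completed.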
Alternative approaches to backup non-HANA databases, using database and AWS native features, include:
Customers running SAP with Oracle Database on AWS can use the Oracle Secure Backup (OSB) Cloud Module to integrate Oracle backups with Amazon S3. This feature is available with Oracle Database 9i Release 2 or later. When using RMAN, multiple backup channels can be used to improve performance.
Customers running SAP workloads with Oracle can also leverage AWS-native EBS multi-volume crash-consistent snapshots to perform backup and recovery for Oracle databases. This procedure saves storage cost because snapshots are incremental, and Amazon EBS Fast Snapshot Restore improves restore time. Please look at Kellogg’s customer story for the benefits of this approach.
SAP ASE Database:
Customers running SAP workloads with SAP ASE (Adaptive Server Enterprise) database use Amazon S3 as their backup storage. This solution requires AWS File Gateway, which asynchronously transfers data to Amazon S3 over an HTTPS connection. SAP has also performed a detailed analysis and shared a solution with configuration steps for this approach.
As with other databases, SAP ASE can also use Amazon EBS snapshots for backup and restore operations. Refer to this blog for detailed steps to perform and automate backup/restore operations on SAP ASE database using Amazon EBS snapshots.
Microsoft SQL Server Database:
MS SQL Server running on Microsoft Windows can use the VSS (Volume Shadow Copy Service) feature to perform consistent database backups. VSS is also integrated with AWS Backup, which makes backup and restore operations easier to administer. Please refer to the blog for detailed configuration and testing steps.
Backup using third party tools
Many third-party enterprise backup tools can also read and write backups to Amazon S3 and integrate with the SAP Backint interface for supported backup methods. If you are already using a tool and would like to use it on AWS, please check with the vendor for integration with S3. We also have solutions offered by our partners such as Linke Emory Cloud Backup, where SAP HANA, Oracle, and SAP ASE database backups can be stored in and recovered directly from Amazon S3. Partners like Commvault, Veritas, N2WS, Actifio and others are also able to write SAP HANA and AnyDB database backups directly into Amazon S3 buckets. The solutions offered by these ISVs may provide additional features such as de-duplication, encryption, and compression, and may save you money by reducing or removing the need for EBS storage for backup staging.
Amazon S3 Replication
Amazon S3 Replication can replicate all or a subset of your SAP database backups stored in an S3 bucket to a separate S3 bucket for alternative recovery purposes, such as Disaster Recovery. S3 Cross-Region Replication adds the capability to replicate these files to a different AWS Region. With S3 Replication Time Control, you can replicate your S3 objects within a predictable time frame, backed by a Service Level Agreement from AWS. When replicating backup files in S3 to a DR Region, you can choose the S3 Standard-Infrequent Access (S3 Standard-IA) tier as the destination storage class to store your backups there at a lower cost.
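To make the replication options concrete, here is a hedged sketch of a replication configuration with S3 Replication Time Control enabled and S3 Standard-IA as the destination storage class. The dictionary follows the shape accepted by the S3 PutBucketReplication API. The function name, rule ID, role ARN, bucket ARN, and prefix are hypothetical placeholders, and the sketch assumes versioning is already enabled on both buckets.

```python
from __future__ import annotations

import json


def build_replication_config(role_arn: str, dest_bucket_arn: str, prefix: str) -> dict:
    """Build a replication configuration for backups under a key prefix.

    All argument values are illustrative; the IAM role must allow S3 to
    replicate objects on your behalf.
    """
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "sap-backups-to-dr",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                # Replicate only the subset of objects under this prefix.
                "Filter": {"Prefix": prefix},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": dest_bucket_arn,
                    # Store the DR copy in the lower-cost Standard-IA tier.
                    "StorageClass": "STANDARD_IA",
                    # Replication Time Control uses a 15-minute window.
                    "ReplicationTime": {"Status": "Enabled", "Time": {"Minutes": 15}},
                    "Metrics": {"Status": "Enabled", "EventThreshold": {"Minutes": 15}},
                },
            }
        ],
    }


if __name__ == "__main__":
    config = build_replication_config(
        "arn:aws:iam::111122223333:role/example-replication-role",
        "arn:aws:s3:::example-sap-backups-dr",
        "hana/",
    )
    print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and applied to the source bucket with the AWS CLI or an SDK call of your choice.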
AWS Backup provides a control plane to manage backups for services such as EBS, EFS, EC2, DynamoDB, Aurora, and AWS Storage Gateway. An AWS Backup plan can also copy the backups to another Region for Disaster Recovery. SAP applications often use services such as EBS, EFS, and EC2, which can be backed up using AWS Backup.
Database backups via Amazon Elastic Block Store (EBS) snapshots
Databases can be backed up using snapshots. Snapshots are incremental, meaning only the blocks on the device that have changed after your most recent snapshot are saved. Snapshots are also suited to backing up SAP file systems such as /usr/sap/* and /sapmnt/*. When using EBS snapshots to back up databases, make sure the database is in “backup mode”, or shut down your database, before a snapshot is triggered, so that the snapshot is consistent.
Database “backup mode” pauses I/O operations to the storage area so that the snapshot is application consistent. Most modern databases, including HANA, provide a “backup mode” option. Note that when you run your database on LVM striped volumes, snapshots must be initiated on all EBS volumes in the volume group. Amazon EBS snapshots can be scheduled using AWS Backup.
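The requirement that every EBS volume in a striped volume group is snapshotted can be checked with a few lines of code. The sketch below is illustrative only: the function name and the volume and snapshot IDs are made up, and the mapping of snapshot ID to source volume ID is assumed to come from whatever inventory or API call you use.

```python
from __future__ import annotations


def missing_snapshot_volumes(volume_group: set[str], snapshots: dict[str, str]) -> set[str]:
    """Return the volumes in the group that have no snapshot in this set.

    volume_group: EBS volume IDs backing the LVM volume group.
    snapshots: mapping of snapshot ID -> source volume ID for one backup run.
    """
    return set(volume_group) - set(snapshots.values())


if __name__ == "__main__":
    # Hypothetical volume group striped across three EBS volumes.
    group = {"vol-aaa", "vol-bbb", "vol-ccc"}
    # Hypothetical backup run that missed one volume.
    run = {"snap-111": "vol-aaa", "snap-222": "vol-bbb"}
    missing = missing_snapshot_volumes(group, run)
    if missing:
        print("Incomplete snapshot set, missing:", sorted(missing))
```

A snapshot set with any missing volume cannot be used to rebuild the striped volume group, so a check like this is worth running before the database leaves backup mode.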
Amazon Elastic File System (EFS) backups
Amazon EFS file systems are used for hosting saptrans and sapglobal files. These files are shared across multiple EC2 instances running SAP. Amazon EFS can be backed up using AWS Backup, either on a schedule or on demand. Using AWS Backup, you can also replicate your backups across Regions to meet your DR requirements. Additionally, SAP customers replicate their EFS file systems to their DR Region using AWS DataSync.
Amazon AMI backups
AMI (Amazon Machine Image) backups provide a fully recoverable copy of your entire EC2 instance, including all EBS volumes. This can be used for a quick rollback of a change made to the entire database or application. For instance, when you are applying an operating system, database, or application patch, an AMI backup provides a solid rollback option to recover from failure. You can use AWS Backup to schedule AMI backups periodically, or simply create an on-demand backup.
CloudEndure Disaster Recovery
CloudEndure continuously replicates your machines (including operating system, system state configuration, databases, applications, and files) into a low-cost staging area in your target AWS account and preferred Region. This enables minimal Recovery Point Objectives (RPO) for all applications and databases running on supported operating systems. CloudEndure is also widely adopted for replicating SAP applications across regions.
EC2 Auto recovery
You can automatically recover an impaired instance using an Amazon CloudWatch alarm, which executes a recovery action. EC2 instances running SAP applications can take advantage of the EC2 auto recovery feature to be highly available within an Availability Zone. AWS recommends enabling this feature to lower the RTO from failures.
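As an illustration, the parameters of such an alarm can be sketched as below. The dictionary keys match the CloudWatch PutMetricAlarm API, the alarm watches the `StatusCheckFailed_System` metric, and the action ARN is the EC2 recover action. The function name, alarm name, period, number of evaluation periods, and instance ID are assumptions chosen for the example, not recommendations from this post.

```python
from __future__ import annotations


def build_recovery_alarm(instance_id: str, region: str) -> dict:
    """Return PutMetricAlarm parameters that trigger EC2 auto recovery.

    The period and evaluation settings are illustrative; tune them to your
    own tolerance for transient status check failures.
    """
    return {
        "AlarmName": f"sap-auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        # System status checks cover underlying host problems.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in action that recovers the instance onto healthy hardware.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


if __name__ == "__main__":
    params = build_recovery_alarm("i-0123456789abcdef0", "eu-central-1")
    print(params["AlarmActions"][0])
```

With an SDK such as boto3, these parameters could be passed as keyword arguments to the CloudWatch `put_metric_alarm` call, once per SAP host.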
Let’s take a look at the architecture below to understand how the various AWS services discussed so far can be used for building reliable SAP solutions on AWS.
1. AWS Backint Agent backing up an SAP HANA database to an S3 bucket in Region1
2. Backup bucket A replicated to another Region using the Amazon S3 Cross-Region Replication feature
3. Amazon services EC2(AMI), EBS, EFS snapshots replicated across Regions
4. CloudEndure replicating SAP application server to another Region
5. Database replicated to DR Region using log replication
6. Amazon CloudWatch alarm for auto recovery enabled on all hosts for protection from component failures
7. Database high availability cluster across Availability Zones
8. Application high availability cluster across Availability Zones
Using your backup policy to recover from events
Now that we know the tools that help us build reliable backup policies, let’s take a look at service events and how to recover from each.
Scenario 1: Amazon EC2 event
These are considered as events within the AWS Availability Zone arising out of common issues with networking, power, software bugs at hypervisor layer etc. EC2 auto recovery provides a robust solution to recover from such events. SAP HANA database provides auto-restart function, which simply starts the database upon intended/unintended host reboots. Applications and databases that do not provide this feature, may have to rely on bootstrapping with a script written in shell.
Though auto recovery provides a quick means of recovering from various events, certain applications such as SAP S/4HANA may need to operate at a near-zero recovery time objective (RTO). This can be achieved with an SAP native high availability configuration using Pacemaker solutions offered by SUSE or Red Hat, or with other third-party clustering software.
Scenario 2: Amazon EBS events
Scenario 2a: Independent or single EBS volume event
SAP Databases: an Amazon EBS volume can be recovered from an EBS snapshot of that volume. This recovery is consistent only if the snapshot was invoked after the database was held in backup mode. However, when recovering from a snapshot that was taken without invoking backup mode, you can achieve consistency by applying database log files, as you will be working from a known database checkpoint. One example of this approach is described in the blog post, “How to use snapshots for SAP HANA database to create an automated recovery procedure”, in the AWS for SAP blogs.
SAP Applications: Amazon EBS snapshots are a frequently used method to meet the RPO requirements of SAP application servers. As application servers do not store data that requires long-term persistence, failed volumes can be recovered from EBS snapshots made from the application server’s EBS volumes.
Scenario 2b: EBS volume in a volume group
SAP databases are often installed on a volume group, using the operating system’s logical volume manager (LVM), which is striped across multiple volumes for performance. During an event affecting an EBS volume in a volume group, you have to rebuild the volume group and use a roll-forward recovery approach: restore the database from a known good backup, then apply database logs to bring the database back up to the desired point in time prior to the failure.
An alternative approach may be used by following the steps in Scenario 2a, for using EBS snapshots to backup and restore the database data.
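The roll-forward approach described in this scenario amounts to choosing a base backup and the logs that follow it. The sketch below models that choice in a simplified way, assuming backups and log backups are identified only by their completion timestamps. The function name and timestamps are hypothetical, and a real recovery is driven by the database's own tools and backup catalog.

```python
from __future__ import annotations

from datetime import datetime


def plan_roll_forward(
    backups: list[datetime], logs: list[datetime], target: datetime
) -> tuple[datetime, list[datetime]]:
    """Pick the base backup and the log backups to apply for a recovery.

    Returns the latest full backup taken at or before the target time, and
    the log backups completed after that backup and up to the target time,
    in the order they should be applied.
    """
    candidates = [b for b in backups if b <= target]
    if not candidates:
        raise ValueError("no full backup exists at or before the target time")
    base = max(candidates)
    to_apply = sorted(log for log in logs if base < log <= target)
    return base, to_apply


if __name__ == "__main__":
    full_backups = [datetime(2024, 1, 1), datetime(2024, 1, 2)]
    log_backups = [
        datetime(2024, 1, 1, 12),
        datetime(2024, 1, 2, 6),
        datetime(2024, 1, 2, 12),
    ]
    # Recover to a point shortly before a hypothetical failure.
    base, logs_to_apply = plan_roll_forward(full_backups, log_backups, datetime(2024, 1, 2, 8))
    print("Restore from:", base)
    print("Then apply logs:", logs_to_apply)
```

The shorter the interval between log backups, the closer the recovered database can get to the moment of failure.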
Scenario 3: Availability Zone events
Availability zone failure can impact SAP applications and databases. Mission critical production SAP applications and databases are deployed with highly available architecture across AZs to mitigate events within an AWS Availability Zone. You can further lower your RTO by clustering SAP applications across AZs using third party products.
Non-production SAP instances are often deployed in a single AZ. Such applications rely on AMI backups, file backups, and EBS snapshots. During an event within an AWS Availability Zone, you could recover the impaired EC2 instances from a recent AMI and recover your database to a known good point in one of the available AZs within the Region.
Scenario 4: Regional events
Amazon AMIs, Amazon EBS snapshots, Amazon S3 buckets with database backup files, and Amazon EFS file systems can be replicated across Regions for Disaster Recovery. Tools such as CloudEndure can also be used to replicate EC2 instances, or even on-premises servers, across Regions, which helps in building reliable DR strategies at low cost. Customers often run tabletop DR tests to exercise how well they are prepared for an event, and continually update their DR strategies.
To recover databases from regional events, launch an EC2 instance using the latest AMI of the database instance, and recover the database using its backup. To achieve this, you need to ensure AMIs and database backups are copied across the regions during normal operations. Features such as S3 Replication, AWS DataSync, and AWS Backup can help achieve replication of data across the regions.
For low recovery time and recovery point objectives (RTO/RPO), databases can be replicated across Regions using native database replication technologies. During a regional event, the replicated database in the DR Region can take over as the primary. This approach can provide you the lowest RPO possible.
Snapshots provide a second layer of protection for reliability. You can back up the data on your Amazon EBS volumes by taking point-in-time snapshots.
SAP application servers can be backed up using AMIs. These AMIs can be copied to the designated DR Region, using either AWS Backup or the AMI copy feature. During a DR event, use the latest AMI copy to launch your application server EC2 instances. An AMI copy based DR strategy provides a low-cost DR approach.
Alternatively, CloudEndure Disaster Recovery can be used for replicating SAP application servers across regions. This approach can ensure the replicated VMs are current, in terms of operating system patches, configuration, and SAP kernel version, ready to go online during an event.
In this blog, we discussed various tools available to back up your SAP resources on AWS. We also looked at several availability events and how to recover from each of them. I hope this will help you develop resilient business continuity and disaster recovery strategies for your SAP workloads on AWS. Let us know if you have any comments or questions. We value your feedback.