How Atos Transformed a Customer’s SAP Landscape with Minimal Downtime and Low Risk
By Mark Ross, Lead Cloud Architect at Atos
By Yusuf Okul, Partner Solutions Architect at AWS
Like many SAP customers, one of Atos’ customers had a large, on-premises environment, encompassing a combination of virtual and physical servers and traditional storage.
To support future business objectives, they sought to update their SAP applications, replace Oracle databases with HANA, update their security posture, and remove on-premises provisioning constraints around changes and new environments.
For a global business, arranging downtime on the mission-critical SAP landscape can be challenging. A solution was required that allowed the customer to implement all of the necessary changes within a single, short outage window.
This needed to be achieved whilst migrating multiple terabytes of data to Amazon Web Services (AWS) over limited AWS Direct Connect bandwidth, replacing the operating system and database layers, encrypting all of the interfaces, and upgrading the SAP applications.
An AWS Partner since 2013 with a 35-year history of delivering SAP solutions to customers, Atos was perfectly placed to help the customer to achieve their goals. Atos is an AWS Advanced Consulting Partner with the AWS Migration Competency, and is a member of the AWS Managed Service Provider (MSP) and AWS Well-Architected Partner Programs.
In this post, we will describe the migration process used to move the customer from their on-premises estate to AWS. We will also describe the target architecture used within AWS, along with the benefits the customer has achieved both during and post migration.
Migration Architecture Overview
A traditional migration and upgrade approach directly from on-premises to AWS wasn't feasible given the customer's volume of data, the number of systems to upgrade, and the available Direct Connect bandwidth.
Benefitting from the increased agility of AWS, Atos was able to undertake a proof of concept (POC) to develop the migration approach for the customer, and then iterated it into three variations depending on the source systems.
The migration approach used the following principles:
- Undertake as much of the upgrade as possible without downtime on the on-premises systems (SAP uptime phase).
- Replicate the systems into AWS in advance of the migration, and maintain replication through the SAP uptime phase by copying the changes to avoid bandwidth limitations.
- Launch replica systems in AWS to run both the downtime phase of the upgrade and the migration to the target systems, gaining additional system performance and low-latency connectivity as the data is streamed into the new systems.
- Undertake migration of dev and test systems in advance to build confidence in the process and allow full regression testing, due to the amount of change happening within the single outage window.
- Undertake “dress rehearsals” of the production upgrade and migration to further build confidence, find issues, and refine the timings in the plan.
- Use newly built systems in AWS as the target, with optimized images such as Red Hat Enterprise Linux for SAP with HA and Update Services via AWS Marketplace.
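The bandwidth reasoning behind replicating in advance and then copying only the changes can be illustrated with a rough calculation. The following is a minimal Python sketch; the data volume, link speed, usable share of the link, and daily change rate are hypothetical figures chosen for illustration, not values from the customer's project.

```python
def transfer_hours(data_tb: float, link_gbps: float, usable_fraction: float = 0.5) -> float:
    """Hours needed to copy data_tb terabytes over a link of link_gbps gigabits per second.

    usable_fraction is the share of the link the migration may consume
    (an assumption; the remainder is left for normal business traffic).
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    bits = data_tb * 1e12 * 8                    # decimal terabytes -> bits
    rate = link_gbps * 1e9 * usable_fraction     # usable bits per second
    return bits / rate / 3600


# Hypothetical figures, for illustration only.
full_copy = transfer_hours(data_tb=20, link_gbps=1)      # initial replication
daily_delta = transfer_hours(data_tb=0.2, link_gbps=1)   # one day of changes

print(f"Full copy: {full_copy:.1f} h, daily delta: {daily_delta:.1f} h")
```

Under these assumed numbers the initial copy takes several days, far longer than a short outage window, while a day of changes fits in under an hour. That gap is why the bulk transfer is done ahead of time and only the deltas are carried during the uptime phase.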
Through the POC phase, Atos found the appropriate approach was to use CloudEndure Migration for replication of the majority of the systems. However, due to the size and number of disks associated with the largest SAP systems, Atos used a combination of CloudEndure and a migration process from one of their partners for CRM and ECC.
Finally, the POC showed that CloudEndure wasn’t an appropriate solution for replicating SAP HANA appliances, so temporary legacy HANA systems were built as the intermediate step for these.
The following diagram gives an overview of the migration architecture for the solution.
Figure 1 – Migration architecture.
Target Architecture Overview
The target architecture was designed in line with the AWS Well-Architected Framework. The solution has development, test, and production environments, with separation across accounts and virtual private clouds (VPCs).
The solution is built in code, allowing for version-controlled changes, and is managed via CI/CD. AWS native services are used wherever possible; for example, on-premises Network File System (NFS) servers were replaced with Amazon Elastic File System (Amazon EFS), while tools like AWS CloudTrail, AWS Config, and Amazon CloudWatch are used for logging, change detection, and monitoring.
AWS Systems Manager is used for remote access and patching, and AWS Backup is used for snapshot backups. These services were augmented with additional third-party products such as Datadog, ServiceNow for ITSM, and Dell NetWorker for SAP HANA databases and logs.
The operating systems utilize AWS images, augmented with additional configuration and agents as part of our Amazon Machine Image (AMI) baking process, which layers in anti-virus and monitoring agents, as well as hardening in line with Center for Internet Security (CIS) benchmarks. Additional configuration management tasks are performed in code using Ansible.
The production environment is built to be highly available within a region. SAP application servers are deployed across multiple AWS Availability Zones (AZs), with a sufficient number of additional application servers deployed but powered off, ready to be started should an AZ failure occur.
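One way to reason about how many powered-off application servers count as "sufficient" is sketched below in Python. This is an illustrative sizing rule, not the one used in the project: it assumes the active servers of a failed AZ are spread evenly over the surviving AZs, and the AZ names and server counts are hypothetical.

```python
import math


def standby_per_az(active_per_az: dict[str, int]) -> dict[str, int]:
    """Powered-off servers each AZ needs so that any single AZ can be lost.

    Assumption: when an AZ fails, its active servers are replaced by
    standby servers spread evenly (rounded up) over the surviving AZs.
    """
    if len(active_per_az) < 2:
        raise ValueError("need at least two Availability Zones")
    standby = {az: 0 for az in active_per_az}
    for failed_az, lost in active_per_az.items():
        survivors = [az for az in active_per_az if az != failed_az]
        share = math.ceil(lost / len(survivors))
        for az in survivors:
            # Size each AZ for the worst single-AZ failure it may have to absorb.
            standby[az] = max(standby[az], share)
    return standby


# Hypothetical layout: four active application servers in each of two AZs.
print(standby_per_az({"az-a": 4, "az-b": 4}))
```

With two AZs, each AZ has to hold enough standby servers to replace the other AZ entirely; with three or more, the spare capacity per AZ shrinks because the load is shared among the survivors.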
SAP database servers are active/passive across Availability Zones, using RHEL Pacemaker clustering to handle the failover between nodes in the different AZs. An Overlay IP address ensures that clients both inside and outside the VPC stay connected to the database after a failover, whether the failover is part of a planned activity (system patching, for example) or a reaction to a failure event like the loss of a server or Availability Zone.
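The Overlay IP mechanism can be sketched in a few lines of Python. An Overlay IP is an address outside the VPC CIDR range that is reached through a /32 route in the VPC route tables; failing over means repointing that route at the node that should now be active. In a Pacemaker cluster this is normally done by a cluster resource agent rather than custom code, so the function below is only an illustration of the idea. It calls `replace_route`, which exists on the boto3 EC2 client, but the route table IDs, IP address, and instance ID are hypothetical placeholders, and a small stand-in client is used so the sketch runs without AWS access.

```python
def point_overlay_ip_at(ec2_client, route_table_ids, overlay_ip, instance_id):
    """Repoint the Overlay IP's /32 route at the node that should be active."""
    cidr = f"{overlay_ip}/32"
    for route_table_id in route_table_ids:
        # Every route table used by clients must carry the updated route.
        ec2_client.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock=cidr,
            InstanceId=instance_id,
        )
    return cidr


class RecordingClient:
    """Stand-in for boto3.client("ec2"); records routes instead of calling AWS."""

    def __init__(self):
        self.routes = {}

    def replace_route(self, RouteTableId, DestinationCidrBlock, InstanceId):
        self.routes[(RouteTableId, DestinationCidrBlock)] = InstanceId


# Hypothetical identifiers, for illustration only.
client = RecordingClient()
point_overlay_ip_at(client, ["rtb-primary", "rtb-secondary"], "192.168.10.10", "i-node-b")
print(client.routes)
```

Because clients always connect to the same Overlay IP, nothing on the client side changes when the database moves between nodes; only the route target does.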
AWS native services (with availability built in) supplemented this configuration, with the use of AWS WAF, Elastic Load Balancing, Amazon FSx, and Amazon EFS ensuring the solution can withstand an AZ failure and continue to operate.
As part of the Operational Acceptance Test (OAT) phase of the project, a full disaster recovery (DR) test simulating the loss of an AZ was undertaken to prove service could continue uninterrupted.
This architecture keeps the solution highly available across AZs at all times, and each part of the architecture is regularly exercised as systems are patched and rebooted and failovers occur seamlessly.
This provides significant additional benefit over the on-premises passive DR architecture, which would be tested once per year. Downtime is also reduced through techniques such as SAP Rolling Kernel Switch (RKS), and the service continues to operate when components are taken out of service to be patched.
Figure 2 – Target architecture.
Atos achieved excellent results for the customer by combining a build approach that adheres to AWS best practices with their migration approach.
Results achieved for the customer included:
- Increased agility post-migration to AWS. Atos provided the ability to alter capacity requirements and spin up new environments, including increasing the ECC system from 6TB of memory to 9TB of memory in a fraction of the time, and with a fraction of the risk, associated with a similar upgrade on-premises.
- Downtime kept within the anticipated single window for the migration and upgrades.
- All versions of SAP applications in scope upgraded (ECC, BW, CRM, Portal, Business Objects, Solution Manager).
- All operating systems updated.
- Oracle databases replaced with SAP HANA.
- Migration to AWS achieved without saturation of the limited Direct Connect bandwidth.
- Significant improvement in security posture—interface encryption, alignment to CIS security benchmarks, successful security audit via IT Health Check.
- Increased compliance and simplified operations.
- Seamless switch from on-site, co-located working to 100 percent remote working as COVID-19 lockdown was initiated in the United Kingdom, with collaboration between Atos, the customer, AWS, and Atos partners.
With the migration processes used by Atos, it was possible to achieve a significant amount of change for the customer with minimal downtime, and at low risk.
With the agility possible using AWS, Atos was able to utilize an approach that changed all layers of the solution in a controlled, planned, and tested way, with rollback possible should any issues occur. To shorten the upgrade timeline, additional computing power could be used on demand to complete the upgrade, and then removed once no longer required.
The new solution is based on AWS best practices, has been subject to rigorous testing (including for security posture), and is one of the largest SAP ECC on HANA migrations to AWS in Europe.
The customer is now positioned to undertake, with increased agility, the change program that the migration and upgrades have unlocked. The ability to spin up new systems to accommodate development projects in parallel, rather than being environment constrained on-premises, is a big plus.
In addition, the customer has already benefitted from reduced upgrade times, increasing the size of the ECC system’s HANA footprint in a timeframe that would have been impossible on-premises.
Atos – AWS Partner Spotlight
Atos is an AWS Advanced Consulting Partner and a leader in digital services that believes bringing together people, business, and technology is the way forward.