Automating the update process of a clustered SAP HANA DB using nZDT and Ansible
SAP HANA is the de facto database for new SAP deployments and will become the only choice in the near future, as SAP ends general support for all non-HANA based systems by 2027. Patching databases in a consistent and automated manner is key to reducing TCO, especially for customers who operate a large number of HANA instances. HANA System Replication (HSR) is often enabled between HANA nodes; combined with a Pacemaker cluster, this gives customers a highly available architecture for their databases. When the database is clustered in this fashion, you can benefit from the nZDT (near-Zero Downtime) method of patching the HANA software. In this blog post, we explain how to carry out nZDT patching of clustered HANA nodes and demonstrate a sample Ansible playbook that automates the entire process on Red Hat Enterprise Linux (RHEL) based systems.
The database remains available on at least one node throughout almost the entire patching process. The patching procedure for non-clustered systems is fairly well documented on the SAP help site, but only limited information is available on how to perform nZDT patching when the HANA nodes are clustered.
- A working HANA HSR cluster pair (an easy and fully automated way to deploy a working SAP HANA cluster is AWS Launch Wizard, which can complete a HANA installation in a few hours. For a full explanation, see the Launch Wizard User Guide)
- Required OS patches (if any) have been applied prior to HANA patching. You can get more information from the following notes.
You need "Red Hat Enterprise Linux for SAP Solutions" (for BYOS) or "Red Hat Enterprise Linux for SAP with High Availability and Update Services" (from AWS Marketplace). For supported operating systems, refer to OSS Note 1631106. You may also consult the Red Hat Enterprise Linux for SAP Solutions subscription knowledgebase article.
- You will need the root password and it should be the same on both nodes – contact your organization’s Linux admins if you need help with this process.
- You will need the SYSTEM account password for the SYSTEMDB and TENANT – contact your organization’s DB admins if you need help with this process.
- You have a working Ansible infrastructure available and configured to run playbooks on HANA nodes.
- You have the desired HANA patch software package on an S3 bucket or staged on the file system (the automation can source it from both)
Download SAP HANA patch software from SAP Marketplace Software Download Center. (SAP Marketplace account required to access download area)
- Amazon Elastic Compute Cloud (EC2) has appropriate Identity and Access Management (IAM) roles to access Amazon Simple Storage Service (S3) bucket in case the patch file is sourced from a bucket.
The general process to patch clustered HANA nodes with the nZDT method is listed below. For this example, we assume node 1 is currently the primary node and node 2 is the secondary node.
Caution: the sequence of steps is important.
- Put cluster node 2 in standby mode
Enabling standby mode means the node can no longer host resources. The cluster attempts to move any resources currently active on the node to another node, constraints permitting. In this case, the HANA instance on node 2 holds the Secondary role and the only other cluster node, node 1, already hosts the Primary instance, so no resources move to node 1.
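On RHEL with the pcs command-line tool, this is a single command. As a sketch (the node name hananode2 is a placeholder for your actual secondary node):

```shell
# Put the secondary cluster node into standby (node name is a placeholder).
# On older pcs versions the subcommand is "pcs cluster standby" instead.
pcs node standby hananode2
```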
- Put the cluster into maintenance mode
Putting the entire cluster into maintenance mode ensures that the cluster does not manage any cluster resources. This is essential because the HANA service may be stopped intermittently during patching, and the cluster would otherwise interfere.
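With pcs, maintenance mode is a cluster-wide property. A minimal sketch:

```shell
# Put the whole cluster into maintenance mode so it stops managing resources
pcs property set maintenance-mode=true
# "pcs status" should now show the resources as unmanaged
pcs status
```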
- Update the HANA software on node 2
Patching HANA on the Secondary node is the core step that updates the database software version. Throughout the patching process, the secondary node is unavailable for certain periods of time, but the primary node operates as usual and continues to serve the SAP application and its users.
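A typical manual update on the secondary node looks roughly like the following. This is a sketch only: the SAR file name, paths, and SID are illustrative, and a real hdblcm run prompts for (or reads) the required passwords, which the playbook supplies automatically.

```shell
# Extract the downloaded patch archive with SAPCAR (file name is illustrative)
cd /tmp/hanapatch
./SAPCAR -xvf IMDB_SERVER20_064_0-80002031.SAR
# Run the update with hdblcm in batch (unattended) mode
cd SAP_HANA_DATABASE
./hdblcm --action=update --sid=HDB --batch
```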
- Take node 2 out of standby mode
In this step, cluster node 2 is made available to the cluster again and can accept resources.
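Sketched with pcs (again, hananode2 is a placeholder node name):

```shell
# Bring the secondary node back from standby.
# On older pcs versions the subcommand is "pcs cluster unstandby".
pcs node unstandby hananode2
```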
- Turn off maintenance mode for the cluster
When maintenance mode is disabled, the cluster automatically re-establishes HSR between the Primary and Secondary nodes. Note that SAP supports running the Secondary node on a higher patch level than the Primary node. The Secondary node then syncs up with the Primary node.
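Turning maintenance mode off mirrors the earlier step:

```shell
# Disable maintenance mode; the cluster resumes managing resources
# and re-establishes HANA System Replication
pcs property set maintenance-mode=false
```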
- Validate that replication has resumed and that it is healthy
At this phase, we have to wait until the Secondary node is fully in sync with the Primary node again and is ready to take over (status SOK).
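One way to wait for this, assuming the SAPHanaSR resource agent package is installed on the cluster nodes, is to poll the cluster attributes until the sync state reports SOK:

```shell
# Poll the SAPHanaSR attributes until the secondary reports sync state SOK
until SAPHanaSR-showAttr | grep -q 'SOK'; do
    echo "waiting for replication to reach SOK ..."
    sleep 30
done
```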
- Put node 1 in standby mode
In standby mode, the Primary cluster node can no longer host any resources, which triggers the takeover of the Primary HANA role from node 1 to node 2. The cluster demotes the HANA instance on node 1 and promotes the instance on node 2. It also moves the overlay IP to node 2 and modifies the route table. This ensures that after the takeover, the SAP system and users can continue to connect to the HANA database.
- Wait for takeover to complete
The takeover completes within a short time. Before patching node 1, we have to make sure the HANA instance on node 2 has fully assumed the Primary role.
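These two steps can be sketched as follows; hananode1 is a placeholder node name, and hdbadm is the &lt;sid&gt;adm user for the example SID HDB:

```shell
# Put the current primary into standby; this triggers the takeover to node 2
pcs node standby hananode1
# On node 2, confirm the takeover as the <sid>adm user:
# the output should report "mode: primary"
su - hdbadm -c "hdbnsutil -sr_state"
```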
- Put the cluster into maintenance mode
To prevent the cluster from interfering with the patching process, the entire cluster again needs to be put into maintenance mode.
- Patch the HANA software on node 1
At this stage, the HANA DB is running on node 2 as Primary and accepts connections from the SAP system as well as from users. The HANA instance on node 1 can now be patched.
- Take node 1 out of standby
Once patching is completed on node 1, the cluster node can be enabled again by taking it out of standby.
- Turn off maintenance mode for the cluster
When the cluster comes out of maintenance mode, it ensures the HANA instance on cluster node 1 is started. Since the HANA instance on cluster node 2 currently holds the Primary role, the instance on node 1 is started in the Secondary role, and replication from node 2 to node 1 begins.
- Clear the cluster resources
During maintenance activities, errors or alerts may accumulate in the cluster framework. These have to be cleaned up to give the cluster a fresh start.
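With pcs, the cleanup is a single command:

```shell
# Clear failed actions and alerts collected during the maintenance window
pcs resource cleanup
```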
To summarize, during the patching process the database remains accessible on at least one node at any given time, with the exception of the brief outage when the takeover occurs. Note that the primary and secondary roles will have switched places by the end. This is normal and does not affect the operation of the database. You can switch back to the original topology at a convenient time.
The sample Ansible playbook that automates the entire process is located in this public GitHub repo.
Preparing to run the Ansible playbook
- Download the target HANA patch SAR file from SAP Marketplace and place it in an S3 bucket or somewhere on the file system of the servers. Make sure the bucket or directory contains no files other than the single SAR patch file, for example only the SAR file for the HANA SPS05 rev. 64 patch.
- Clone the repo to the Ansible controller server
One way to clone a repo is to use the git clone command – see the reference section for git commands. git needs to be installed first – see the instructions on how to install it on Linux.
- Change directory to the cloned repo and create an inventory file containing a group named "SAP_<SID>_hana_ha", with the two HANA node IPs in that group. For example, if the HANA SID is "HDB", node 1's IP is 10.20.30.40, and node 2's IP is 10.20.30.50, the content of the inventory file should look something similar:
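For instance, the inventory for this example could be created as follows (INI-style inventory; the file name "myinventory" is just an example):

```shell
# Create an INI-style Ansible inventory with the required host group
cat > myinventory <<'EOF'
[SAP_HDB_hana_ha]
10.20.30.40
10.20.30.50
EOF
```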
- The automation needs various credentials to be able to run the HANA patching tool, hdblcm. As a security best practice, passwords should not be stored in plain text in variable files or anywhere else. Ansible provides the ansible-vault tool to encrypt sensitive information. The HANA patching Ansible playbook expects a vault file, called "passvault.yml", that contains the following credentials…
root password – variable name: ROOTPWD
<sid>adm password – variable name: SIDADMPWD
SYSTEM @ tenant password – variable name: SYSTEMTNTPWD
SYSTEM @ SYSTEMDB password – variable name: SYSTEMDBPWD
Adding these credentials to the "passvault.yml" file stores each variable name together with its encrypted value.
For example, to add the encrypted root password to the passvault.yml file run:
ansible-vault encrypt_string 'theactualpassword' --name 'ROOTPWD' | tee -a passvault.yml
As another example, to add the encrypted password of SYSTEM user in SYSTEM DB to the passvault.yml file run:
ansible-vault encrypt_string 'somepassword' --name 'SYSTEMDBPWD' | tee -a passvault.yml
Use the same vault password when encrypting all of the password variables. Once all the encrypted passwords are added, make sure each one starts on a new line. After adding all required passwords, the file should look something like this:
[root@ip-***-***-***-*** sap-hana-update-cluster-nzdt]# cat passvault.yml
SYSTEMDBPWD: !vault |
SYSTEMTNTPWD: !vault |
ROOTPWD: !vault |
SIDADMPWD: !vault |
Once the file is set up, carry on with the next steps.
Running the playbook
Switch directory to the cloned repo root and issue the following command:
ansible-playbook -i <inventoryfile> --ask-vault-pass -e "SID=<SID>" -e "MEDIASRC=<s3/fs>" -e "MEDIALOC=<locationofSARfile>" ./patch_sap_hana.yml
For example, if the inventory file is "myinventory", the HANA DB SID is "HDB", and the media source is the S3 bucket "s3://hanapatch/", use the following syntax:
ansible-playbook -i myinventory --ask-vault-pass -e "SID=HDB" -e "MEDIASRC=s3" -e "MEDIALOC=s3://hanapatch/" ./patch_sap_hana.yml
As another example, if the inventory file is "myinventory", the HANA DB SID is "HDB", and the media source is the file system with the SAR file in /tmp/hanapatch/, use the following syntax:
ansible-playbook -i myinventory --ask-vault-pass -e "SID=HDB" -e "MEDIASRC=fs" -e "MEDIALOC=/tmp/hanapatch/" ./patch_sap_hana.yml
The automation patches the nodes in the sequence discussed earlier. Please note that by the end of the patching process, the roles of the nodes will have swapped. The original roles can be restored at any later time. To verify that patching worked, you can find the new patched version of each HANA node at the tail end of the Ansible logs.
The password template file is cleaned up automatically after a successful run of the playbook.
The patch software remains available after the playbook has run, in case it is needed again. If it is no longer required, we recommend archiving or simply deleting it.
Besides the cost of the two HANA nodes, the automation may need a small Ansible control node, running on an Amazon EC2 instance.
The HANA patch software needs to be stored in an S3 bucket. A typical patch file is about 3–4 GB, which equates to only a few dollars per year ($0.023 per GB-month).
To learn more about SAP HANA HSR concepts follow the SAP official help documentation.
Find more answers to common questions to HANA HSR in SAP Note 1999880 – FAQ: SAP HANA System Replication.
To learn more about SAP pacemaker clusters for HANA on RHEL read the official SAP HANA on AWS guide.
Take advantage of the AWS Free Tier services to help with learning about using Ansible with SAP on AWS, at a minimal cost. You can set up an EC2 instance using AWS Free Tier, with Amazon Linux 2. This instance can be used to run an Ansible control node.
To learn more about Ansible modules and coding techniques, read the official Ansible documentation.
In this post, we discussed what the nZDT patching process looks like for clustered HANA nodes, and provided a sample Ansible playbook demonstrating how to automate the process.
Please note that AWS also supports automating HANA patching with AWS Systems Manager documents (SSM documents); however, clustered nodes are not yet supported there.
For SLES-based systems, the concept is the same. Replace the pcs commands with their respective crm equivalents in the code, or use the YaST module to aid the process.
Call to Action
Get started and try to deploy a new HANA cluster using AWS Launch Wizard for SAP. Make sure to familiarize yourself with the online documentation first.
Install Ansible on a Free Tier Amazon EC2 instance and clone the sample code from our public repo.
Verify the roles of the HANA cluster nodes, and check the HANA DB version before patching.
Prepare the inventory and passvault.yml files, and launch the patching playbook.
Verify again the roles of the HANA cluster nodes, and check the HANA DB version after patching. Which node is the primary now?
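The verification before and after patching can be sketched with the following commands (hdbadm is the &lt;sid&gt;adm user for the example SID HDB; adjust for your SID):

```shell
# Show cluster status, including which node holds the promoted (Primary) HANA resource
pcs status
# Show the installed HANA version as the <sid>adm user
su - hdbadm -c "HDB version"
```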
Make sure to remove the resources after testing to avoid unnecessary costs.
Updating SAP HANA Systems with SAP HANA System Replication official SAP pages.
Use SAP HANA System Replication for Near Zero Downtime Upgrades.
SLES nZDT patching of HANA cluster with YAST module.
Git Clone command reference.