Microsoft Workloads on AWS

Provisioning SQL Server on Amazon EC2 with AWS Lambda, AWS Step Functions, and AWS Systems Manager

Customers often provision Microsoft SQL Server on top of Amazon Elastic Compute Cloud (Amazon EC2) to take advantage of advanced customization capabilities, such as putting temp_db on ultra-low latency storage, storing SQL Server data files across multiple Amazon Elastic Block Store (Amazon EBS) volumes in a RAID array, or other operating system customizations. Starting with SQL Server 2017, customers also have the option to deploy SQL Server on Amazon EC2 using Amazon Linux.

Automating the process of deploying SQL Server on Amazon EC2 not only simplifies the deployment, but also gives customers the ability to spin up additional SQL Server instances for additional capacity or spin down instances when not in use to save money. Provisioning SQL Server is a multi-step process that spans infrastructure provisioning, operating system configuration, and application configuration.

Optum (part of UnitedHealth Group) was able to automate this process of provisioning SQL Server on Linux utilizing Amazon EC2, AWS Systems Manager, AWS Step Functions, AWS Lambda, and Amazon Simple Storage Service (Amazon S3). Optum uses this automated provisioning to spin up additional SQL Server instances for better isolation between various batch processes. Optum also uses this process to spin up ad hoc servers for troubleshooting and provide additional capacity during annual benchmarks and monthly runs.

Overview of the Solution

In this blog post, we will provide you with the code you need to get started with the automated provisioning of your own SQL Server instances. After you have built this automation, you may want to schedule or trigger the build of new SQL Server instances using Step Functions, as discussed in the AWS documentation.

This solution relies on the following AWS services:

  • Amazon EC2 – for the SQL Server instance itself
  • AWS Step Functions – to orchestrate the provisioning process
  • AWS Systems Manager – to run commands on the SQL Server EC2 instance, for example, restoring a backup after the instance is provisioned
  • AWS Lambda – to call AWS APIs to provision the EC2 instance, run Systems Manager commands, and wait for those commands to finish
  • Amazon S3 – to store artifacts (shell scripts) and database backups

Architecture Overview

Figure 1. Architecture of the SQL Server provisioning process

Process Flow

  1. The Step Functions state machine starts – either invoked manually or by another supported invocation method.
  2. Step Functions launches the first Lambda function {create_instance}, which starts and configures an EC2 Linux instance with SQL Server. The Lambda function returns the EC2 instance ID to the Step Functions state machine for use in the invocation of the next Lambda function in the workflow.
  3. Step Functions launches the second Lambda function {run_sql_command} to run commands on the EC2 instance using Systems Manager. These commands download a database backup from Amazon S3 and restore it. This Lambda function returns the Systems Manager command invocation ID to the Step Functions state machine for use in the invocation of the next Lambda function in the workflow.
  4. Step Functions launches the third Lambda function {wait_sql_command} to wait for the Systems Manager commands to finish.
  5. The Step Functions state machine completes.

Figure 2. Step functions process flow

Prerequisites

For this walkthrough, you should have the following:

  • Provisioned AWS account
  • Amazon Virtual Private Cloud (Amazon VPC), subnet, and security group for the EC2 instance providing internet connectivity to download database backup and deployment packages
  • AWS Identity and Access Management (IAM) role with appropriate permissions to access Amazon S3 and Systems Manager Parameter Store
  • AWS IAM role for Step Functions with permissions to run the Lambda functions
  • AWS IAM role for the Lambda functions with permissions to create EC2 instances, access the Systems Manager Parameter Store (read/write), and run Systems Manager commands (send/run commands)

Note: please ensure you build your IAM roles using least privilege.

 

Implementation

We will provide the code for Step Functions and Lambda functions, which should allow you to launch your own SQL Server on Amazon EC2 Linux instance from an Amazon S3 backup.

Lambda Function 1 {create_instance.py} Timeout – 5 minutes, Python 3.8

Let’s start with the Lambda function for provisioning the Amazon EC2 SQL Server instance. This Lambda function first creates the EC2 instance from an AWS license-included SQL Server Amazon Machine Image (AMI) in the specified VPC and subnet. The EC2 instance created by this Lambda function will have a root partition and 5 additional gp3 Amazon EBS volumes, of which 4 will be used for SQL Server data and 1 for SQL Server logs. An EC2 user data script runs when the instance launches and downloads shell scripts from Amazon S3 to complete the SQL Server configuration with the following actions:

  • Initialize the SQL Server “sa” account with a random password and store this password in the Systems Manager Parameter Store.
  • Configure the SQL Server “maxdop” parameter dynamically based on the number of vCPUs and NUMA settings.
  • Configure OS-level settings based on SQL Server best practices for Linux deployment.
  • Create the following mount points: “/var/opt/mssql/tempdb0” through “/var/opt/mssql/tempdb7” on the instance store volume, “/var/opt/mssql/ebs-data” on the EBS RAID array, and “/var/opt/mssql/ebs-logs” on the EBS log volume.
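The “maxdop” calculation mentioned above can be sketched in a few lines. The min(8, logical processors per NUMA node) rule below is a common Microsoft guideline; the function name and the exact rule applied by the bootstrap scripts are assumptions for illustration:

```python
def suggested_maxdop(logical_processors: int, numa_nodes: int = 1) -> int:
    """Cap parallelism at the logical processors in one NUMA node, up to 8."""
    per_node = max(1, logical_processors // max(1, numa_nodes))
    return min(8, per_node)

# For example, a 4-vCPU instance with a single NUMA node:
print(suggested_maxdop(4))                 # 4
print(suggested_maxdop(32, numa_nodes=2))  # 8
```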

On success, the Lambda function returns the instance ID of the EC2 instance, which can be used by the next Lambda function in the step function.

Note that the Lambda functions will return the original event and add the new output to the original event to facilitate the process flow in Step Functions.

The Lambda function accepts the following inputs from the step function:

  • name: name of the EC2 instance to create
  • dataVolumeSize: size of each of the 4 EBS data volumes (a 20 GB volume size yields 80 GB in total)
  • dataVolumeIops: IOPS for the 4 EBS data volumes (default 3000)
  • dataVolumeThroughput: throughput for the 4 EBS data volumes (default 125 MB/sec)
  • logVolumeSize: size for the EBS log volume
  • logVolumeIops: IOPS for the EBS log volume (default 3000)
  • logVolumeThroughput: throughput for the EBS log volume (default 125 MB/sec)
  • instanceType: EC2 instance type, e.g., t3.xlarge
  • ami: AMI ID to use (should be based on an Amazon Linux SQL Server, license-included, base AMI)
  • securityGroupId: security group ID to associate with the instance
  • subnetId: subnet ID to associate with the instance
    • The instance will need to be in a subnet with internet connectivity to download packages
  • iamRole: name of the IAM role
    • Needs access to the Amazon S3 bucket where you have stored the scripts and SQL Server backup
  • keyName: name of the key pair used to SSH into the instance
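Given these inputs, the core of {create_instance} can be sketched as below. The boto3 calls are the standard EC2 API, but the device names, gp3 defaults, helper name, and event pass-through pattern are illustrative assumptions rather than the exact production code, and the user data bootstrap is omitted:

```python
def build_block_device_mappings(event):
    """Four gp3 data volumes plus one gp3 log volume, as described above.
    Device names and default IOPS/throughput are assumptions."""
    mappings = []
    for device in ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]:
        mappings.append({
            "DeviceName": device,
            "Ebs": {
                "VolumeType": "gp3",
                "VolumeSize": event["dataVolumeSize"],
                "Iops": event.get("dataVolumeIops", 3000),
                "Throughput": event.get("dataVolumeThroughput", 125),
                "DeleteOnTermination": True,
            },
        })
    mappings.append({
        "DeviceName": "/dev/sdf",
        "Ebs": {
            "VolumeType": "gp3",
            "VolumeSize": event["logVolumeSize"],
            "Iops": event.get("logVolumeIops", 3000),
            "Throughput": event.get("logVolumeThroughput", 125),
            "DeleteOnTermination": True,
        },
    })
    return mappings

def handler(event, context):
    import boto3  # imported lazily so the helper above is testable offline
    ec2 = boto3.client("ec2")
    response = ec2.run_instances(
        ImageId=event["ami"],
        InstanceType=event["instanceType"],
        MinCount=1,
        MaxCount=1,
        SubnetId=event["subnetId"],
        SecurityGroupIds=[event["securityGroupId"]],
        IamInstanceProfile={"Name": event["iamRole"]},
        KeyName=event["keyName"],
        BlockDeviceMappings=build_block_device_mappings(event),
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": event["name"]}],
        }],
        # UserData (omitted here) would download and run the bootstrap
        # shell scripts from Amazon S3, as described above.
    )
    # Return the original event plus the new instance ID for the next step.
    event["instanceId"] = response["Instances"][0]["InstanceId"]
    return event
```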

Lambda Function 2 {run_sql_command.py} Timeout – 5 minutes, Python 3.8

The second Lambda function calls the Systems Manager Run Command to run shell commands on the SQL Server EC2 instance. The shell commands will download a database backup from Amazon S3 and restore it.

This Lambda function expects these inputs:

  • name: same name of instance provided to the first Lambda function
  • instanceId: EC2 instance ID created in the first Lambda function
  • db_name: SQL Server database name to be restored
  • s3_url: S3 location of SQL Server backup file

The Lambda function returns the original event with the addition of the Systems Manager command invocation ID as “commandId”.
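A minimal sketch of {run_sql_command} follows. The send_command call with the AWS-RunShellScript document is the documented Systems Manager API; the shell commands, Parameter Store path, and sqlcmd invocation are illustrative assumptions:

```python
def build_restore_commands(event):
    """Shell commands that download the backup from S3 and restore it.
    Paths and the Parameter Store name are assumptions for illustration."""
    backup = "/tmp/{}.bak".format(event["db_name"])
    return [
        "aws s3 cp {} {}".format(event["s3_url"], backup),
        # The bootstrap script stored the random sa password in Parameter Store
        ("SA_PASSWORD=$(aws ssm get-parameter --name /sql/{}/sa-password "
         "--with-decryption --query Parameter.Value --output text)").format(event["name"]),
        ('/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P "$SA_PASSWORD" '
         "-Q \"RESTORE DATABASE [{}] FROM DISK = N'{}'\"").format(event["db_name"], backup),
    ]

def handler(event, context):
    import boto3  # imported lazily so the command builder is testable offline
    ssm = boto3.client("ssm")
    response = ssm.send_command(
        InstanceIds=[event["instanceId"]],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": build_restore_commands(event)},
    )
    # Pass the original event through, adding the command invocation ID.
    event["commandId"] = response["Command"]["CommandId"]
    return event
```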

Lambda Function 3 {wait_sql_command.py} Timeout – 5 minutes, Python 3.8

The final Lambda function waits for the Systems Manager command to restore the database backup.
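A sketch of {wait_sql_command} polls get_command_invocation until the command leaves the in-flight states. The status values come from the Systems Manager API; the polling interval and error handling are assumptions:

```python
import time

# Run Command statuses that mean the invocation is still in flight.
IN_PROGRESS = {"Pending", "InProgress", "Delayed", "Cancelling"}

def is_terminal(status):
    return status not in IN_PROGRESS

def handler(event, context):
    import boto3  # imported lazily, as in the other sketches
    ssm = boto3.client("ssm")
    status = "Pending"
    while not is_terminal(status):
        time.sleep(15)
        result = ssm.get_command_invocation(
            CommandId=event["commandId"],
            InstanceId=event["instanceId"],
        )
        status = result["Status"]
    if status != "Success":
        # Raising makes Step Functions mark the task failed (and retry it).
        raise RuntimeError("SSM command ended with status " + status)
    event["status"] = status
    return event
```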

Shell Scripts and Step Functions State Machine {SFN def}

Finally, the three Lambda functions discussed above are tied together with Step Functions:

  • The inputs of the Step Functions State Machine specify the characteristics of the instance to create: instance type, name, AMI, EBS volume configurations, IAM role, SQL Server database name and SQL Server backup location.
  • Step Functions calls the create_instance Lambda function, which starts an EC2 instance and returns the EC2 instance ID.
  • Step Functions calls the run_sql_command Lambda function, which runs Systems Manager commands on the provided EC2 instance ID. The run_sql_command Lambda function returns a Systems Manager command ID.
  • Step Functions calls the wait_sql_command Lambda function, which watches the Systems Manager command ID and reports success or failure of the state machine execution. Step Functions contains a retry for this Lambda function in case the download or restore takes longer than the wait specified in the Lambda function (a Lambda function can run for at most 15 minutes).
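The flow above can be sketched in Amazon States Language; here it is generated from Python for readability. The Lambda ARNs, retry interval, and attempt counts are placeholders to adapt to your own deployment:

```python
import json

def state_machine_definition(region, account_id):
    """Amazon States Language for the three-step provisioning flow."""
    def lambda_arn(name):
        return "arn:aws:lambda:{}:{}:function:{}".format(region, account_id, name)
    return {
        "Comment": "Provision SQL Server on EC2 and restore a backup",
        "StartAt": "CreateInstance",
        "States": {
            "CreateInstance": {
                "Type": "Task",
                "Resource": lambda_arn("create_instance"),
                "Next": "RunSqlCommand",
            },
            "RunSqlCommand": {
                "Type": "Task",
                "Resource": lambda_arn("run_sql_command"),
                "Next": "WaitSqlCommand",
            },
            "WaitSqlCommand": {
                "Type": "Task",
                "Resource": lambda_arn("wait_sql_command"),
                # Re-run the wait if the restore outlasts one Lambda invocation
                "Retry": [{
                    "ErrorEquals": ["States.ALL"],
                    "IntervalSeconds": 30,
                    "MaxAttempts": 5,
                    "BackoffRate": 1.5,
                }],
                "End": True,
            },
        },
    }

print(json.dumps(state_machine_definition("us-east-1", "123456789012"), indent=2))
```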

The inputs to the Step Functions State Machine are the parameters required by the Lambda functions; see the example below. Ensure you are using an AMI for Amazon Linux 2 with SQL Server; for example, AMI ami-0874d82d2138e9fd1 in us-east-1 or ami-03ff9501b439808eb in us-west-2. You will want to select an instance type that supports instance store volumes so that temp_db is mapped properly to a local volume.

{
  "instanceType": "z1d.xlarge",
  "ami": "ami-XXXXX",
  "logVolumeSize": 30,
  "dataVolumeSize": 20,
  "name": "sql-instance",
  "subnetId": "subnet-XXXX",
  "securityGroupId": "sg-XXXX",
  "iamRole": "IAM_ROLE_NAME",
  "db_name": "mydb",
  "s3_url": "s3://YOUR/S3/PATH/mydb.bak",
  "keyName": "YourKeyPair"
}

You can create corresponding Lambda functions to stop, restart, or terminate the EC2 instance. The Step Functions State Machine can be triggered manually, from another Step Functions State Machine, or from an Amazon CloudWatch event – e.g., perhaps you want to automatically stop any instances in a non-production account at night to reduce costs.
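A hypothetical companion Lambda function covering those lifecycle actions might look like the following; the event shape ("action", "instanceId") is an assumption for illustration, not part of the solution above:

```python
def resolve_action(event):
    """Pick the lifecycle action from the event; defaults to 'stop'."""
    action = event.get("action", "stop")
    if action not in ("stop", "restart", "terminate"):
        raise ValueError("unsupported action: " + action)
    return action

def handler(event, context):
    import boto3  # imported lazily, as in the other sketches
    ec2 = boto3.client("ec2")
    action = resolve_action(event)
    ids = [event["instanceId"]]
    if action == "terminate":
        ec2.terminate_instances(InstanceIds=ids)
    elif action == "restart":
        ec2.reboot_instances(InstanceIds=ids)
    else:
        ec2.stop_instances(InstanceIds=ids)
    event["actionTaken"] = action
    return event
```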

 

Cleaning up

In order to avoid unnecessary charges, please be sure to clean up any resources that were built as part of this architecture. This should include removing any EC2 instances and their associated EBS volumes, Lambda functions, Step Functions, S3 objects, Systems Manager Parameter Store parameters, and other Systems Manager resources that were built as part of this tutorial.

 

Conclusion

In this blog post, we presented a solution that allows you to automatically provision and configure SQL Server on Amazon Linux EC2 instances. The benefit of this architecture is that you will be able to quickly scale your SQL Server capacity with automation. When the provisioning of SQL Server instances is automated, it becomes feasible to stop or terminate these SQL Server instances when not in use, allowing you to reduce your SQL Server compute, storage, and possibly even licensing costs.

About Optum 

UnitedHealth Group (UHG) is a health care and well-being company with a mission to help people live healthier lives and help make the health system work better for everyone. Optum, part of UnitedHealth Group, is a leading information and technology-enabled health services business. With more than 210,000 people worldwide, Optum delivers intelligent, integrated solutions that help to modernize the health system and improve overall population health.

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.


AWS can help you assess how your company can get the most out of the cloud. Join the millions of AWS customers that trust us to migrate and modernize their most important applications in the cloud. To learn more about modernizing Windows Server or SQL Server, visit Windows on AWS. Contact us to start your modernization journey today.

Matt Nispel


Matt Nispel is an Enterprise Solutions Architect at AWS. He has more than 10 years of experience building cloud architectures for large enterprise companies. At AWS, Matt helps customers rearchitect their applications to take full advantage of the cloud. Matt lives in Minneapolis, Minnesota, and in his free time enjoys spending time with friends and family.

Bill Schneider


Bill Schneider is a Distinguished Engineer at Optum. He has over 20 years of industry experience working with healthcare data, and 5 years of experience with AWS. At Optum, he is a hands-on technical leader with a recent focus on re-designing applications for cloud, serverless technologies, and the big-data ecosystem. He lives in the Washington, DC area and in his spare time, he enjoys music, cooking, and outdoor activities.

Momchil Georgiev


Momchil Georgiev is a Senior Manager of Software Engineering at Optum. He has over 20 years of industry experience working with library, weather, and healthcare data, and 5 years of experience with AWS. At Optum, he is a technical leader re-imagining legacy applications for the cloud world. Outside work, he enjoys spending time with family and friends as well as traveling and cooking.