AWS Cloud Operations & Migrations Blog

How DocuTAP automates cloud operations using AWS Management Tools

Now that large organizations have the flexibility to quickly launch infrastructure and leverage new services, they must find the means to maintain consistent controls without restricting development velocity.

In this guest post, Brian Olson, Cloud Architect at health care company DocuTAP, discusses how a combination of AWS Management and Governance services and AWS developer tools allows a small, agile team of engineers to programmatically provision, monitor, and automate all the components of their cloud environment.

Introduction

DocuTAP is an on-demand-focused health care company. Our two main applications, Practice Management (PM) and Electronic Medical Records (EMR), run on Windows. DocuTAP has over a thousand clinics running our software, powered by several hundred Amazon EC2 instances.

Our engineering team has fewer than 10 members, including DBAs, DevOps engineers, and system administrators. It’s impractical for this small team to manage hundreds of EC2 instances by hand. In fact, no one on the engineering team has even logged into all of those servers.

In this blog post, I’m going to walk you through our solution for scaling configuration management in a Windows environment. Our solution consists of source control, a pipeline for delivering changes, several AWS services (AWS CodePipeline, AWS CodeBuild, AWS CodeDeploy, and AWS CloudFormation), some Windows PowerShell tools and techniques (Modules, Desired State Configuration, and Pester for unit tests), and the idea of paranoid defensive programming.

The Pipeline

We use a continuous integration (CI) and continuous delivery (CD) pipeline built around a Git workflow. Source control tracks changes to your automation, and every time you commit a change, the pipeline kicks off automated tests and starts moving that change toward production.

 

At DocuTAP we built out this pipeline with a combination of GitHub and several AWS services, as shown in the following diagram:

Infrastructure Deployment: CloudFormation

We’re invested in CloudFormation at DocuTAP. We use the YAML configuration language because it’s intuitive enough for developers and infrastructure engineers to collaborate on. The CloudFormation service usually does a good job of managing updates to resources, and CloudFormation Change Sets allow you to preview the changes CloudFormation is going to make before you turn it loose.
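When we want that preview from the command line, the AWS Tools for PowerShell can create and inspect a change set. Here’s a minimal sketch (the stack, change set, and template names are illustrative):

# Create a change set against an existing stack instead of updating it directly.
New-CFNChangeSet -StackName "docutap-dep" -ChangeSetName "preview-update" `
    -TemplateBody (Get-Content -Raw .\dep-stack.yaml) -Capability CAPABILITY_NAMED_IAM

# Review what CloudFormation intends to change before executing the change set.
(Get-CFNChangeSet -StackName "docutap-dep" -ChangeSetName "preview-update").Changes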

At DocuTAP we decompose CloudFormation templates into several distinct types.

Pipeline stack
The pipeline stack holds our CodePipeline setup, CodeBuild jobs, and a reference to the GitHub repository. This is the only stack we deploy manually, so we keep it as small as possible.
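Because it’s a one-time, manual step, creating that stack is a single call with the AWS Tools for PowerShell. A minimal sketch (stack and template names are illustrative):

# One-time, manual creation of the pipeline stack.
# CAPABILITY_NAMED_IAM is needed if the stack creates named IAM roles for CodePipeline and CodeBuild.
New-CFNStack -StackName "docutap-pipeline" `
    -TemplateBody (Get-Content -Raw .\pipeline.yaml) -Capability CAPABILITY_NAMED_IAM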

Dependency (DEP) stack
The DEP stack holds things that apply to an entire environment of the application, like IAM roles, CodeDeploy groups, load balancers, and general AWS Systems Manager State Manager associations that run patch scans and server inventory. This stack groups things that are painful and time-consuming to delete and recreate.

Instances (INST) stack
Our instance stacks have the EC2 instances, security groups that are not referenced by other stacks or services, and Auto Scaling policies. These stacks group things that are meant to be ephemeral. These are things that, when they break, you’re likely to say, “Delete it and try again.”

Systems Manager stacks
These hold State Manager associations that run instance-specific configuration. We keep these broken out so it’s easy to connect Run Command documents to the stacks they’re created by.

PowerShell Modules
Here’s a diagram of a Systems Manager State Manager run:

We keep our Systems Manager Run Command documents as thin as possible. These documents have to contain some PowerShell to kick off a config run, but it’s difficult to unit test or debug PowerShell wrapped in YAML. The less they do the better.

We break apart our PowerShell scripts into modules that expose common functions and give us some organization to group common features together.
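Each module is just a .psm1 file that exports the functions other scripts are allowed to call. Here’s a stripped-down sketch of one (the bodies are elided here and shown later in the post):

# CommonFunctions.psm1 (abbreviated sketch; full function bodies appear later in this post)
function Get-MyInstanceId {
    # Looks up (and caches) the EC2 instance ID from instance metadata.
}

function write-log {
    param([string]$message, [string]$level = "INFO", [string]$category = "general", [string]$item = "NA")
    # Writes a timestamped, JSON-formatted log line.
}

# Only export what callers are supposed to use.
Export-ModuleMember -Function Get-MyInstanceId, write-log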

As you start to break apart your scripts into modules, managing module dependencies becomes very important. Here are some general guidelines:

  1. Keep your module dependencies going in one direction (two modules should never import each other).
  2. Group code into layers (e.g., Windows, application type, server type).
  3. Document, document, document what’s in each module and when and how to use it.

For example, at DocuTAP we have a Windows module; an Electronic Medical Records (EMR) and Practice Management (PM) module; and application server and database server modules that interact like this:

Configuration Management and Idempotency: Defensive, paranoid programming

The PowerShell code that we maintain in these repositories runs every half hour to check configuration and apply changes if configuration has drifted.

Some of these changes could interrupt production workloads, like renaming the server and triggering a reboot, or updating directory permissions.

That means a few things: every script must be idempotent (meaning you can run the script multiple times and always get the same result), every change must be tested and retested, and good logging is very important.

Other than some simple PowerShell DSCs, most idempotency is the responsibility of the developer building the script. Developers need to make sure they have reasonable “happy state” checks to see if they need to make changes, and good safety checks before and after they apply changes.

In our example, we have a test to check whether the server name needs to be updated:

$tags = Get-EC2Tag -Filter @{Name = "resource-id"; Value = Get-MyInstanceId}
$serverName = ($tags | Where-object {$_.Key -eq "Name"}).value
if ($serverName -ne $env:computername) {
    write-log -message "Computer name does not match host name! Renaming to match, this will trigger a reboot!" -level "WARNING" -category "Windows" -item "hostname";
    rename-computer -newname "$serverName" -force -restart
}

And on any real server you would want an extra test before triggering the reboot to make sure your server can be safely rebooted. In our case, the most relevant check is for logged-on users.
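That check isn’t shown in the snippet above, but a hedged sketch of one, using quser to look for interactive sessions, could look like this (the Test-SafeToReboot function name is hypothetical):

# Hypothetical safety check: only allow the reboot when no interactive sessions are found.
function Test-SafeToReboot {
    # quser writes to stderr and produces no stdout when nobody is logged on,
    # so an empty result means the server can be rebooted safely.
    $sessions = @(quser 2>$null | Select-Object -Skip 1)   # skip the header row
    return ($sessions.Count -eq 0)
}

if (Test-SafeToReboot) {
    rename-computer -newname "$serverName" -force -restart
} else {
    write-log -message "Users are logged on, skipping rename and reboot this run" -level "WARNING" -category "Windows" -item "hostname"
}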

The ideal format for a configuration management script or function is to do the following:

  1. Test if we need to make a change, and log if we do.
  2. Test if it’s safe to make a change, and log if it is.
  3. Make the change.
  4. Test that the change worked, and log the result.

Having three tests or “safety checks” for every actual change you might make may seem excessive, but these scripts are going to be running unattended. If something goes wrong, the more logging you have to look into the problem, the better off you’ll be.
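Put together, a configuration function following that pattern tends to look like this skeleton (Test-HappyState, Test-SafeToChange, and Invoke-TheChange are placeholders for your own checks):

# Skeleton of the test / safety-check / change / verify pattern.
function Set-SomeConfiguration {
    # 1. Test if we need to make a change, and log if we do.
    if (Test-HappyState) { return }
    write-log -message "Configuration has drifted, change required" -level "WARNING" -category "general" -item "someconfig"

    # 2. Test if it's safe to make a change, and log if it is.
    if (-not (Test-SafeToChange)) {
        write-log -message "Not safe to apply change right now, skipping" -level "WARNING" -category "general" -item "someconfig"
        return
    }
    write-log -message "Safety checks passed, applying change" -level "INFO" -category "general" -item "someconfig"

    # 3. Make the change.
    Invoke-TheChange

    # 4. Test that the change worked, and log the result.
    if (Test-HappyState) {
        write-log -message "Change applied successfully" -level "INFO" -category "general" -item "someconfig"
    } else {
        write-log -message "Change did not take effect" -level "ERROR" -category "general" -item "someconfig"
    }
}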

Continuous Integration with CodeBuild: Unit tests

We run unit tests on our PowerShell code with a PowerShell module called Pester.
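Inside the CodeBuild job, running the suite boils down to a single Invoke-Pester call. A minimal sketch (the tests directory name is illustrative):

# Run every Pester test under the tests directory.
# -EnableExit makes Pester exit with the number of failed tests, which fails the CodeBuild stage.
Invoke-Pester -Path .\tests -EnableExit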

Lots of languages have unit testing frameworks, but here are some quick tips for writing PowerShell unit tests:

  1. Keep your PowerShell functions as simple as possible:
    1. Limit the number of inputs.
    2. Limit the number of outputs and the number of cmdlets your functions call.
    3. Limit the amount of branching inside of a function.
  2. Use functions to abstract over other management SDKs:
    1. We use the Citrix PowerShell SDK a lot. Instead of installing it everywhere we need unit tests, we have a wrapper module we can mock in our unit tests.
    2. Some built-in PowerShell functions are difficult to mock (for example, Rename-Computer); a thin wrapper of your own, sketched after this list, gets around that.
  3. Document, document, document:
    1. Document your code so it’s clear what you’re testing for, and why the results are valid.
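The wrapper from tip 2 doesn’t need to be clever. Here’s a hedged sketch of wrapping Rename-Computer so Pester can mock it (the function name is hypothetical):

# Hypothetical wrapper: unit tests mock Invoke-ServerRename instead of Rename-Computer itself.
function Invoke-ServerRename {
    param([string]$NewName)
    Rename-Computer -NewName $NewName -Force -Restart
}

In a test, Mock Invoke-ServerRename {} -ModuleName CommonFunctions keeps the run from actually renaming (and rebooting) the build host.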

Let’s walk through a quick example from the GitHub repo.

Inside of CommonFunctions.psm1 we have a function for getting the instance ID for the server your code is running on. This function is pretty small, and looks like this:

function Get-MyInstanceId {
    # Cache the instance ID in a global so we only hit the metadata endpoint once per run.
    if($global:instanceid -eq $null) {
        try {
            # 169.254.169.254 is the EC2 instance metadata endpoint.
            $global:instanceID = Invoke-RestMethod "http://169.254.169.254/latest/meta-data/instance-id"
        } catch {
            invoke-bomb "Could not get instanceid from ec2 meta data" 99;
        }
    }
    write-host $global:instanceid;
    return $global:instanceid;
}

In this example invoke-bomb is a function that performs graceful error handling.

This function depends on EC2 metadata, which wouldn’t be available on my laptop. However, I’d still like to make sure it will return the right value when EC2 metadata is available. To do this, I add a Pester test that mocks Invoke-RestMethod; the test lives in a separate .ps1 file in the “tests” directory and looks like this:

It "Returns the meta data result for the instance name" {
    Mock Invoke-RestMethod {return "instance_name"} -modulename CommonFunctions -Verifiable;
    Get-MyInstanceId | Should Be "instance_name"
}

This is a trivial example, but you get the idea.

One last point on unit tests is to be pragmatic about them. The more unit test code you have, the more unit test code you have to maintain. Only add unit tests when they clearly add value, because testing the same result repeatedly doesn’t buy you anything but more maintenance effort.

Continuous Integration with CodeBuild: PowerShell DSCs

The last step we have for CodeBuild is to output our MOF files for our PowerShell DSCs. If you’re not familiar with PowerShell DSC, there’s a good overview in Microsoft’s PowerShell DSC documentation.

DSC is a good fit when you can use a built-in DSC resource for creating files or adding Windows features, or when you have a very small configuration, like installing Chocolatey. If you find yourself writing complex TestScripts (the script DSC runs to decide if it needs to update configuration), you’re probably better off writing a function in a PowerShell module that will be easier to unit test.

Here’s an example of our custom script to install Chocolatey:

Script InstallChoco {
  GetScript = {
      $result = Test-Path C:\ProgramData\chocolatey\choco.exe;
      return @{Result = $result}
  }
  SetScript= {
      Set-ExecutionPolicy Bypass -Scope Process -Force; Invoke-Expression ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
  }
  TestScript = {
      Test-Path C:\ProgramData\chocolatey\choco.exe
  }
}

CodeBuild builds the MOF by importing the PowerShell module and calling the DSC configuration by name:

# Import the module that contains the DSC configuration.
Import-Module .\DscHsWindowsManage.psm1;
# List the loaded modules so the build log shows what was picked up.
Get-Module;
# Call the DSC configuration by name to compile the MOF.
DscHsWindowsManager;
Remove-Module DscHsWindowsManage;

The resulting MOF gets pushed out by CodeDeploy, which we’ll cover in more detail later.

Configuration Run

All of that work gets us to running instances using IAM instance profiles. We use AWS CodeDeploy to push our PowerShell management modules out to these instances. In our example, we’re just renaming the server to match the EC2 Name tag, but we do more complex things like setting up Citrix applications, configuring directory permissions, installing applications, and so on.

Once the modules are on the server we use a simple Run Command document to import those modules and call the functions inside of them. Here’s a high level overview of the config run:

Common configurations like scanning for patches and configuring the CloudWatch Logs agent run once a day.
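The PowerShell inside that Run Command document stays as thin as possible. A hedged sketch (the module path and the Invoke-ServerConfig entry point are illustrative; write-log is the logging function covered below):

# Thin wrapper executed by the State Manager association / Run Command document.
# The real logic lives in the modules CodeDeploy pushed to the instance.
Import-Module "C:\Docutap\Modules\CommonFunctions.psm1"

try {
    Invoke-ServerConfig
} catch {
    write-log -message "Config run failed: $($_.Exception.Message)" -level "ERROR" -category "general" -item "configrun"
}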

Logging

We built out a logging function that prepends a time stamp, formats the log messages as JSON (so they’re easier to search with CloudWatch Logs), and outputs them to a file that rolls over regularly.

function write-log([string]$message, [string]$level = "INFO", [string]$category = "general", [string]$item = "NA") {
    $logLine = "$((get-date).ToUniversalTime().toString("yyyy-MM-dd HH:mm:ss")) {`"computername`": `"$($env:computername)`", `"level`": `"$($level)`", `"category`": `"$($category)`", `"item`": `"$($item)`", `"message`": `"$($message)`"}"

    # Dump the log message into std out
    write-host $logLine

    $logFile = "$($global:logDir)\$((get-date).toString("yyyyMM")).log"

    $logLine | out-file -encoding 'UTF8' -append -filepath $logFile
}
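Calling it from the rename example earlier produces a line like this (the timestamp and host name are made up):

write-log -message "Computer name does not match host name!" -level "WARNING" -category "Windows" -item "hostname"
# 2019-06-01 14:30:00 {"computername": "WORKER01", "level": "WARNING", "category": "Windows", "item": "hostname", "message": "Computer name does not match host name!"}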

These logs get routed to CloudWatch Logs by the CloudWatch Logs agent so we can view them there.

Here’s a snippet of our CloudWatch Logs agent config that gets applied by a Systems Manager State Manager association:

{
    "Id": "SSMConfigLogs",
    "FullName": "AWS.EC2.Windows.CloudWatch.CustomLog.CustomLogInputComponent,AWS.EC2.Windows.CloudWatch",
    "Parameters": {
        "LogDirectoryPath": "C:\\Docutap\\SSMLogs\\",
        "TimestampFormat": "yyyy-MM-dd HH:mm:ss",
        "Encoding": "UTF-8",
        "Filter": "",
        "CultureName": "en-US",
        "TimeZoneKind": "UTC",
        "LineCount": "1"
    }
}

Because we format our logs as JSON we can use CloudWatch Logs Insights to filter and search our logs using the Insights query language.

For example, here’s a query looking for the most recent application versions that have been installed on one of our worker servers.

 

AWS is how!

About the Authors

Brian Olson is a Cloud Architect at DocuTAP Inc. He has a B.S. and an M.S. in Computer Science, and became passionate about DevOps in late 2013 when he took a job as a Sr. Security Analyst and started to work on automating simple steps of security research.

If you’re interested in more content on using PowerShell to interact with AWS and manage Windows servers, he runs a blog at https://rollingwebsphere.blogspot.com/.

Reach out to him on LinkedIn if you’re interested in talking about DevOps or cloud technology!

 

Eric Westfall is an Enterprise Solutions Architect at AWS. He helps customers in the Education vertical deploy workloads to AWS. In his spare time, he helps customers manage their cloud and hybrid environments using AWS Management Tools.

 

 

 

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.