AWS for Industries

Automating GxP Infrastructure Installation Qualification on AWS with Chef InSpec

Introduction

In this blog, we will discuss automating the infrastructure Installation Qualification (IQ) steps of GxP Computer System Validation using Infrastructure as Code and Chef InSpec in the Amazon Web Services (AWS) environment. In GxP Computer System Validation, the underlying infrastructure supporting a regulated workload is required to be qualified to demonstrate controls for a closed system.

This blog post shares a high-level strategy, an automation approach, and lessons learned for qualifying the infrastructure that supports GxP workloads running on AWS. We will also discuss how customers can maximize their cloud agility by automating infrastructure deployment and the infrastructure Installation Qualification (IQ) steps, which helps to significantly reduce GxP qualification cycle time.

Controls managed manually through documents are labor intensive to produce. Once they are in place, the qualified state is difficult to maintain, especially if the change release cycle is frequent. As a result, changes such as enhancements, bug fixes, and new features end up being bundled into bigger releases. Meeting compliance requirements is a key objective in delivering a highly reliable and consistent system. An automated cloud qualification framework improves the overall GxP compliance of workloads running on AWS, while providing the ability to scale to other regulated workloads throughout the AWS environment.

AWS Services in scope for automating testing

One of the main advantages of infrastructure deployment in the cloud is the ability to automate deployments through Infrastructure as Code. Automating deployment provides enhanced consistency, increased efficiency, improved security, and operational visibility. Most importantly, it helps maintain a continuously qualified state of the deployed infrastructure. A few examples of Infrastructure as Code and automation tools are AWS CloudFormation, AWS Cloud Development Kit, Terraform, Ansible, Chef, Puppet, and SaltStack.

AWS CloudFormation and benefits of Infrastructure as Code in Installation Qualification

AWS CloudFormation (CloudFormation) is a service that gives developers and businesses a way to create a collection of related AWS and third-party resources and to manage them in an orderly and predictable fashion. CloudFormation allows the entire infrastructure to be modeled in a text file called a template. This template becomes the single source of truth for your infrastructure, and it can be version controlled to provide traceability and auditability. In addition, scripting and version controlling the infrastructure deployment helps standardize the infrastructure components used, which supports continuous compliance and faster troubleshooting.

CloudFormation provisions resources in a repeatable fashion. Infrastructure stacks are built and rebuilt without having to perform manual actions or write custom scripts. CloudFormation determines the right operations to perform when managing your stack, and rolls back changes automatically if errors are detected.

CloudFormation interacts with AWS services through their APIs. The AWS infrastructure resources are described in JSON or YAML templates and submitted to CloudFormation, which then creates a stack of resources. A stack is a collection of AWS resources, defined by a template, that can be managed as a single unit.
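As a rough illustration of the template structure described above, the sketch below builds a minimal template for a single versioned Amazon S3 bucket and prints it as JSON. Templates are normally authored directly in JSON or YAML; Ruby is used here only to stay in the same language as the InSpec tests later in this post. The logical ID DemoBucket and the description text are hypothetical, and the properties shown are a small subset of what the AWS::S3::Bucket resource type supports.

```ruby
require 'json'

# Build a minimal CloudFormation template as a Ruby hash.
# "DemoBucket" and the description are hypothetical placeholders.
def build_bucket_template
  {
    'AWSTemplateFormatVersion' => '2010-09-09',
    'Description' => 'Sketch: versioned S3 bucket for a regulated workload',
    'Resources' => {
      'DemoBucket' => {
        'Type' => 'AWS::S3::Bucket',
        'Properties' => {
          # Keep object history so changes remain traceable.
          'VersioningConfiguration' => { 'Status' => 'Enabled' },
          # Block every form of public access to the bucket.
          'PublicAccessBlockConfiguration' => {
            'BlockPublicAcls' => true,
            'BlockPublicPolicy' => true,
            'IgnorePublicAcls' => true,
            'RestrictPublicBuckets' => true
          }
        }
      }
    }
  }
end

# Print the template as JSON, ready to be saved and version controlled.
puts JSON.pretty_generate(build_bucket_template)
```

The printed JSON could be committed to the repository and deployed as a stack, so that the same file serves as both the installation specification input and the deployment source.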

Chef InSpec

Chef InSpec is an open-source framework for testing deployed infrastructure. The Chef InSpec AWS resource pack provides test capabilities for more than 50 AWS resources. One of the benefits of scripting test sets in InSpec is the ability to write test cases in an easily readable format without writing low-level code or calling APIs directly. This makes it possible to test during the post-deployment phase to ensure that deployed resources have the expected attributes as specified in the installation specification. The resource pack provides a list of out-of-the-box features to test.

Running an InSpec test on deployed resources simplifies the testing effort in terms of test script manageability and integration. For example, to test features of a deployed Amazon Simple Storage Service (Amazon S3) bucket, the statement below can be written in InSpec's Ruby-based language, instead of writing many lines of code to test the same attributes:

# S3 bucket names cannot contain underscores, so the demo name uses hyphens.
describe aws_s3_bucket(bucket_name: 'inspec-demo-bucket') do
  it { should exist }
  it { should have_versioning_enabled }
  it { should have_secure_transport_enabled }
  it { should have_access_logging_enabled }
  it { should_not be_public }
end

Reference architecture

The solution enables end-to-end automation of infrastructure Installation Qualification, from version control of scripts to protocol generation and a dashboard for compliance monitoring.

Figure 1: Reference architecture for automated IQ process using InSpec

The following steps describe the automation workflow and highlight the AWS services and processes involved. At a high level, there are three main categories of processes that trigger eight automated steps. The three categories are:

  • CI/CD: AWS CodePipeline for version control, code build, infrastructure deployment, integration and tests (including InSpec).
  • IQ: These are the post-deployment steps. They run the processes that validate resources and generate the IQ protocol.
  • Continuous Compliance: Monitors the resources’ attributes for changes from the baseline. The dashboard displays deployed resources, with alerts for resources that differ from the expected state.

Let’s walk through the steps of the end-to-end Installation Qualification automation solution.

Step 1: AWS CodeCommit version control and trigger

Source code and scripts are version controlled in AWS CodeCommit. The repository contains scripts for infrastructure resource deployment and source code for IQ generation. Repository permissions need to be defined so that access is controlled and governed. Please refer to the CodeCommit Identity and Access Management documentation to ensure the proper users and roles are set up for the various actions.

Step 2: AWS CodeBuild trigger for build (Infrastructure Installation/Deployment)

AWS CodeBuild (CodeBuild) is triggered to run upon any change to the repository. CodeBuild performs a series of tasks as defined in the Infrastructure as Code script. This includes AWS resource creation, IQ resource deployment, and testing.

Step 3: InSpec execution

The InSpec script runs within CodeBuild to validate deployed resources by comparing them against the desired state defined in the installation specification. Below is a reference InSpec profile definition, with its metadata and dependencies, saved as an inspec.yml file.

name: my-profile 
title: My own AWS profile 
version: 0.1.0 
inspec_version: '>= 4.6.9' 
depends:
  - name: inspec-aws     
    url: https://github.com/inspec/inspec-aws/archive/x.tar.gz
supports:
  - platform: aws 

Chef InSpec allows for test results to be output to one or more reporters. A reporter is a facility for formatting and delivering the results of a Chef InSpec auditing run. A report can be generated in multiple formats:

  • JSON
  • YAML
  • HTML
  • and many more

In this example, the command line below prints results to the terminal through the cli reporter and also writes JSON reporter output to /tmp/output.json.

$ inspec exec example_profile --reporter cli json:/tmp/output.json

The InSpec JSON output files are stored in Amazon S3.

Step 4: AWS CodeBuild trigger for IQ automation

Upon completion of the InSpec test validation, the test results are inspected. If there are failures, the infrastructure changes are rolled back. Otherwise, CodeBuild triggers the IQ process.

Below is an example of a successful InSpec run:

Finished in 0.08943 seconds (files took 6.92 seconds to load)
16 examples, 0 failures
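The pass or fail decision in this step could be sketched as follows. This is a minimal illustration rather than the solution's actual code: it assumes the report follows the InSpec JSON reporter layout (profiles containing controls, which contain results with a status field), and the action names trigger_iq and roll_back, the control ID, and the bucket name are hypothetical.

```ruby
require 'json'

# Collect every failed result from a parsed InSpec JSON report.
# Assumes the JSON reporter layout: profiles -> controls -> results,
# where each result has a "status" of "passed", "failed" or "skipped".
def failed_results(report)
  report.fetch('profiles', []).flat_map do |profile|
    profile.fetch('controls', []).flat_map do |control|
      control.fetch('results', [])
             .select { |result| result['status'] == 'failed' }
             .map { |result| { 'control' => control['id'], 'test' => result['code_desc'] } }
    end
  end
end

# Decide the next pipeline action (action names are hypothetical).
def next_action(report)
  failed_results(report).empty? ? :trigger_iq : :roll_back
end

# Hypothetical report with one passing and one failing test.
sample = JSON.parse(<<~REPORT)
  {
    "profiles": [{
      "name": "example_profile",
      "controls": [{
        "id": "s3-bucket-1",
        "results": [
          {"status": "passed", "code_desc": "S3 Bucket example-bucket is expected to exist"},
          {"status": "failed", "code_desc": "S3 Bucket example-bucket is expected to have versioning enabled"}
        ]
      }]
    }]
  }
REPORT

puts next_action(sample)
failed_results(sample).each { |failure| puts "#{failure['control']}: #{failure['test']}" }
```

Note that this sketch treats a report with no results as passing; a production gate would probably also want to fail when no tests ran at all.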

Step 5: Execution of IQ report generation

This step is an AWS Step Functions workflow that consists of two AWS Lambda (Lambda) functions: Fetch Data and Generate PDF. The purpose of the Fetch Data Lambda function is to iterate through the InSpec output files in the Amazon S3 bucket and consolidate them into a single JSON file. This consolidated JSON file is the input to the subsequent Lambda function. The Generate PDF Lambda function performs data aggregation, formatting, and report generation. The outputs of the two Lambda functions are stored in separate Amazon S3 buckets for traceability and audit purposes.
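The consolidation performed by the Fetch Data function could look roughly like the sketch below. To keep the example self-contained, the reports are passed in as in-memory JSON strings rather than read from Amazon S3. The output schema (run_id, sources, results), the object keys, and the run ID are assumptions made for illustration, not the format used by the actual solution.

```ruby
require 'json'

# Merge several InSpec JSON reports into one consolidated document.
# `reports` maps a source name (for example an S3 object key) to the
# raw JSON string. The output schema is a hypothetical one.
def consolidate_reports(run_id, reports)
  results = reports.flat_map do |source, raw|
    JSON.parse(raw).fetch('profiles', []).flat_map do |profile|
      profile.fetch('controls', []).flat_map do |control|
        control.fetch('results', []).map do |result|
          {
            'source' => source,
            'profile' => profile['name'],
            'control' => control['id'],
            'test' => result['code_desc'],
            'status' => result['status']
          }
        end
      end
    end
  end
  { 'run_id' => run_id, 'sources' => reports.keys.sort, 'results' => results }
end

# Two hypothetical report files from a single run.
reports = {
  'inspec/s3.json' => '{"profiles":[{"name":"p","controls":[{"id":"s3-1",' \
                      '"results":[{"status":"passed","code_desc":"bucket exists"}]}]}]}',
  'inspec/vpc.json' => '{"profiles":[{"name":"p","controls":[{"id":"vpc-1",' \
                       '"results":[{"status":"failed","code_desc":"vpc is available"}]}]}]}'
}

puts JSON.pretty_generate(consolidate_reports('run-001', reports))
```

Flattening the results into one list keeps the source file name on every row, which preserves traceability back to the raw InSpec output.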

In order to enable quick retrieval of all IQ report runs, the data from the processing pipeline are stored in the following structure:

  • Amazon S3 raw data bucket: Bucket to store consolidated InSpec output JSON for every IQ run. This is the raw data that feeds as an input to the Generate PDF Lambda function.
  • Amazon S3 report bucket: Bucket to store actual IQ output.
  • DynamoDB table: Table that stores the metadata of each run. Key attributes in this table include Run ID, Timestamp, Path to Amazon S3 raw data, and Path to Amazon S3 IQ report. This table enables easy query and retrieval of all versions of the IQ reports.
DynamoDB table
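A metadata record for one run might be assembled along these lines. The attribute names, bucket names, and object key layout are hypothetical; only the four attributes themselves (Run ID, Timestamp, and the two Amazon S3 paths) come from the description above. Writing the record to DynamoDB is left out so that the sketch stays self-contained.

```ruby
require 'time'

# Build the metadata record for one IQ run.
# Attribute names and the S3 key layout are hypothetical.
def build_run_record(run_id, raw_bucket:, report_bucket:, finished_at: Time.now.utc)
  {
    'RunId' => run_id,
    'Timestamp' => finished_at.utc.iso8601,
    'RawDataPath' => "s3://#{raw_bucket}/#{run_id}/consolidated.json",
    'ReportPath' => "s3://#{report_bucket}/#{run_id}/iq-report.pdf"
  }
end

record = build_run_record(
  'run-001',
  raw_bucket: 'example-iq-raw-data',
  report_bucket: 'example-iq-reports',
  finished_at: Time.utc(2024, 1, 15, 10, 30, 0)
)
record.each { |attribute, value| puts "#{attribute}: #{value}" }
```

Using the Run ID as a common prefix in both buckets makes it straightforward to locate the raw data and the report that belong to the same run.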

Step 6: IQ completion notification

Upon completion of the AWS Step Functions workflow, an Amazon Simple Notification Service (Amazon SNS) notification is sent. The notification can be delivered in different ways, such as email to a group or an individual, text message, or any other protocol supported by Amazon SNS. There are two types of notification:

  • Successful report generation: Email to notify recipients of successful IQ report generation.
  • Error: In cases where errors are encountered and exceptions are not handled, an error notification is sent with a link to the relevant Amazon CloudWatch Logs log stream.

The Amazon SNS notification recipient should be specified in the Infrastructure as Code deployment as an environment variable for the Lambda function.
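The two notification types could be composed as in the sketch below. It is an illustration under stated assumptions: the environment variable name IQ_NOTIFICATION_RECIPIENT, the message wording, and the example paths are hypothetical, and the actual publish call to Amazon SNS is omitted.

```ruby
# Compose the notification for a finished IQ run.
# The environment variable name and message wording are hypothetical.
def build_notification(run_id, status, env: ENV, report_path: nil, log_stream: nil)
  recipient = env.fetch('IQ_NOTIFICATION_RECIPIENT')
  case status
  when :success
    {
      'recipient' => recipient,
      'subject' => "IQ report generated for run #{run_id}",
      'message' => "The IQ report is available at #{report_path}."
    }
  when :error
    {
      'recipient' => recipient,
      'subject' => "IQ report generation failed for run #{run_id}",
      'message' => "See CloudWatch log stream #{log_stream} for details."
    }
  else
    raise ArgumentError, "unknown status: #{status.inspect}"
  end
end

# A plain hash stands in for the Lambda environment in this example.
lambda_env = { 'IQ_NOTIFICATION_RECIPIENT' => 'qa-team@example.com' }

ok = build_notification('run-001', :success, env: lambda_env,
                        report_path: 's3://example-iq-reports/run-001/iq-report.pdf')
puts ok['subject']
puts ok['message']
```

Reading the recipient from the environment keeps it out of the function code, so it can be changed through the Infrastructure as Code deployment alone.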

Step 7: InSpec data engineering pipeline for dashboard

In parallel with IQ report generation, the output of the InSpec test run is used to support Continuous Compliance monitoring and tracking. Before the data can be used for the dashboard, it needs to be pre-processed and transformed. The data engineering pipeline includes data transformation, labelling, and metrics calculation.
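The labelling and metrics calculation could be sketched as below. The input rows, with a resource_type and a status field, are an assumed intermediate format produced by the transformation step, and the labels compliant, non-compliant, and not-tested are hypothetical names chosen for this example.

```ruby
# Compute per-resource-type compliance metrics from flattened test rows.
# Each row is assumed to carry "resource_type" and "status" fields.
def compliance_metrics(rows)
  rows.group_by { |row| row['resource_type'] }.map do |type, group|
    # Skipped tests are excluded from the pass rate.
    counted = group.reject { |row| row['status'] == 'skipped' }
    passed = counted.count { |row| row['status'] == 'passed' }
    label =
      if counted.empty? then 'not-tested'
      elsif passed == counted.size then 'compliant'
      else 'non-compliant'
      end
    {
      'resource_type' => type,
      'passed' => passed,
      'failed' => counted.size - passed,
      'pass_rate' => counted.empty? ? nil : (100.0 * passed / counted.size).round(1),
      'label' => label
    }
  end
end

# Hypothetical rows for two resource types.
rows = [
  { 'resource_type' => 'aws_s3_bucket', 'status' => 'passed' },
  { 'resource_type' => 'aws_s3_bucket', 'status' => 'passed' },
  { 'resource_type' => 'aws_s3_bucket', 'status' => 'failed' },
  { 'resource_type' => 'aws_vpc', 'status' => 'passed' }
]

compliance_metrics(rows).each do |metric|
  puts "#{metric['resource_type']}: #{metric['pass_rate']}% (#{metric['label']})"
end
```

A resource type is labelled compliant only when every counted test passes, which matches the idea of alerting on any deviation from the baseline.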

Step 8: Continuous compliance dashboard

Since this is a fully integrated Infrastructure as Code solution, any changes to the infrastructure deployment script will trigger the deployment and execution of test runs. The Continuous Compliance dashboard provides an overview of deployed resources and their corresponding status.

Figure 2: Reference dashboard to display compliance metrics by resources

Benefits of automated infrastructure Installation Qualification

There are many benefits of automating the infrastructure IQ process. One of the main benefits is the ability to run infrastructure deployments efficiently and to generate the IQ protocol automatically once deployment completes, which significantly reduces Change Management cycle time and supports continuous compliance. An automated solution for infrastructure IQ also improves consistency, because the development, deployment, and testing processes are automated.

Besides the benefits stated above, the following are some of the advantages of automating the infrastructure IQ process:

  • Cost reduction: Cost of managing infrastructure compliance can be reduced by decreasing manual effort of updating documentation.
  • Version control: The infrastructure environment can be governed by keeping the Infrastructure as Code scripts under version control and rolling back to the last good state upon failure.
  • IaC tool of choice: This reference architecture is independent of the Infrastructure as Code tool of choice. Any Infrastructure as Code tool can be used to deploy the AWS infrastructure.
  • Repeatable: The infrastructure can be replicated, re-deployed, and re-purposed. This solution can be centralized at the master account, organizational unit, or per-application level.
  • Scalable: The architecture is serverless and can be used to run qualification tests for most AWS services.

Assumptions

There are a few assumptions built into this solution to enable fully compliant IQ automation. With regard to security and permissions, only read access should be allowed in the production account’s AWS Management Console. Also, any deployment in the production environment should go through the Infrastructure as Code pipeline (upon approval of the Change Management request and the pull request).

Conclusion

In this blog, we walked through the reference architecture for automating the infrastructure IQ process. The steps provided above cover the end-to-end process of infrastructure deployment, qualification test case execution, IQ report generation, and a continuous compliance strategy. The effort of maintaining a continuously qualified and validated state is simplified with automation. Compared to a manual effort, the cycle time to produce an IQ report can be reduced by at least 60% through automation. Automating the infrastructure Installation Qualification process will help maximize the agility and features that AWS Cloud infrastructure offers.

To learn more about AWS CloudFormation, please visit the complete list of supported AWS resources and refer to AWS CloudFormation Template Anatomy for more information about template structure. Also, please check out the latest list of Chef InSpec AWS resources and visit Chef InSpec documentation for more information.

Iftikhar Khan

Iftikhar Khan is a Healthcare and Life Sciences Cloud Application Developer with 11+ years of experience in leading end-to-end software development from design to deployment using diverse technologies. He helped develop solutions for customers in the Health Care and Life Sciences industry, particularly in the regulatory and compliance space. He specializes in the migration and modernization of cloud-native applications, DevOps culture, and microservices. A thought leader in the area of AWS cloud computing, one of his areas of focus is continuous compliance automation.

Firasat Ansari

Firasat Ansari has over 17 years of experience working in the biopharmaceutical and manufacturing industry. In recent years, he has focused on delivering automated solutions in regulated environments using AWS cloud computing.

Nelson Key

Nelson Key has over 14 years of experience in the Healthcare and Life Sciences industry, with a specific focus on leading application development and automation in the biopharmaceutical industry. Prior to joining AWS, Nelson led enterprise platform and cross-functional application development at Amgen to deliver advanced analytics and insights across the value chain. His area of expertise is regulatory and compliance automation in cloud computing. Nelson received his MS in Computer Science from the University of Southern California.