What does this AWS Solutions Implementation do?

The Genomics Secondary Analysis Using AWS Step Functions and AWS Batch solution creates a scalable environment in AWS to develop, build, deploy, and run genomics secondary analysis pipelines, such as those that process raw whole genome sequences into variant calls. The solution includes continuous integration and continuous delivery (CI/CD), using AWS CodeCommit source code repositories and AWS CodePipeline to build and deploy updates to both the genomics workflows and the infrastructure that supports their execution. It follows infrastructure as code principles and best practices throughout, so you can evolve the solution rapidly.

Amazon CloudWatch operational dashboards are deployed to monitor status and performance for pipelines and tools. Customers can deploy this solution for their genomics analysis and research projects.

AWS Solutions Implementation overview

The diagram below presents the architecture you can automatically deploy using the solution's implementation guide and accompanying AWS CloudFormation template.

Genomics Secondary Analysis Using AWS Step Functions and AWS Batch solution architecture

The AWS CloudFormation template creates four CloudFormation stacks in your AWS account, including a setup stack that installs the solution. The other three are a landing zone (zone) stack containing the common solution resources and artifacts, a deployment pipeline (pipe) stack defining the solution's CI/CD pipeline, and a codebase (code) stack providing the source code for the tooling, workflow definitions, and job execution environment.

The solution’s setup stack creates an AWS CodeBuild project containing the setup.sh script. This script creates the remaining CloudFormation stacks and provides the source code for both the AWS CodeCommit pipe repository and the code repository, once they have been created.

The landing zone (zone) stack creates the CodeCommit pipe repository, an Amazon CloudWatch event, and the AWS CodePipeline pipe pipeline, which defines the continuous integration/continuous delivery (CI/CD) pipeline for the genomics workflow. The deployment pipeline (pipe) stack creates the CodeCommit code repository, an Amazon CloudWatch event, and the CodePipeline code pipeline.
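The chain of stacks described above can be summarized as a small data structure. The following Python sketch is illustrative only and is not part of the solution's code. The stack nicknames and resource lists are taken from the text, while the `PREDECESSORS` mapping (each stack is created after the one before it) and the `deployment_order` helper are assumptions introduced for this example. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Resources each stack creates or provides, as listed in the text above.
STACK_RESOURCES = {
    "setup": ["AWS CodeBuild project containing setup.sh"],
    "zone": [
        "CodeCommit pipe repository",
        "Amazon CloudWatch event",
        "CodePipeline pipe pipeline",
    ],
    "pipe": [
        "CodeCommit code repository",
        "Amazon CloudWatch event",
        "CodePipeline code pipeline",
    ],
    "code": ["tooling, workflow definitions, and job execution environment"],
}

# Assumed ordering: each stack is created only after its predecessor exists.
# Maps a stack to the set of stacks that must be created before it.
PREDECESSORS = {
    "setup": set(),
    "zone": {"setup"},
    "pipe": {"zone"},
    "code": {"pipe"},
}


def deployment_order(predecessors):
    """Return the stacks in an order that respects every predecessor edge."""
    return list(TopologicalSorter(predecessors).static_order())


if __name__ == "__main__":
    for stack in deployment_order(PREDECESSORS):
        print(f"{stack}: {', '.join(STACK_RESOURCES[stack])}")
```

Modeling the order as a dependency graph rather than a fixed list makes it straightforward to check that a change to the chain still leaves a valid creation order.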

The CodePipeline code pipeline deploys the codebase (code) CloudFormation stack. The resources deployed in your account include Amazon Simple Storage Service (Amazon S3) buckets, CodeCommit repositories for source code, AWS CodeBuild projects, AWS CodePipeline pipelines, Amazon Elastic Container Registry (Amazon ECR) image repositories, an example AWS Step Functions state machine, and AWS Batch compute environments, job queues, and job definitions. An example Amazon CloudWatch dashboard provides operational workload monitoring. In total, this solution enables building and deploying updates to both the genomics workflows and the infrastructure that supports their execution.
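To illustrate how a Step Functions state machine can drive AWS Batch jobs, the following Python sketch builds a two-step workflow definition in Amazon States Language. It is a minimal example under stated assumptions, not the state machine that ships with the solution. The state names, the job queue name, and the job definition names are hypothetical placeholders. The `arn:aws:states:::batch:submitJob.sync` resource is the Step Functions service integration that submits a Batch job and waits for it to finish.

```python
import json


def batch_task(name, job_queue, job_definition, next_state=None):
    """Build a Task state that submits an AWS Batch job and waits for it."""
    state = {
        "Type": "Task",
        # Step Functions service integration for AWS Batch (run a job, wait).
        "Resource": "arn:aws:states:::batch:submitJob.sync",
        "Parameters": {
            "JobName": name,
            "JobQueue": job_queue,
            "JobDefinition": job_definition,
        },
    }
    if next_state is not None:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


def build_state_machine(steps, job_queue):
    """Chain (state name, job definition) pairs into a linear workflow."""
    states = {}
    for index, (name, job_definition) in enumerate(steps):
        is_last = index == len(steps) - 1
        next_state = None if is_last else steps[index + 1][0]
        states[name] = batch_task(name, job_queue, job_definition, next_state)
    return {
        "Comment": "Illustrative secondary analysis workflow (hypothetical names)",
        "StartAt": steps[0][0],
        "States": states,
    }


# Hypothetical step and resource names, used only for this example.
STEPS = [
    ("AlignReads", "example-align-job-definition"),
    ("CallVariants", "example-call-variants-job-definition"),
]

definition = build_state_machine(STEPS, "example-job-queue")

if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

Each state maps to one Batch job, so adding a tool to the workflow amounts to appending another pair to `STEPS` and providing a matching job definition.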

Genomics Secondary Analysis Using AWS Step Functions and AWS Batch

Version 1.0.1
Last updated: 05/2020
Author: AWS

Estimated deployment time: 15 min

Features

Provide a scalable environment in AWS to run genomics analysis and research projects

Create a scalable environment in AWS to develop, build, deploy, and run genomics secondary analysis pipelines, such as processing raw whole genome sequences into variant calls.

Leverage infrastructure as code best practices

Rapidly evolve the solution using infrastructure as code principles and best practices.

Leverage continuous integration and continuous delivery (CI/CD)

Use AWS CodeCommit source code repositories and AWS CodePipeline to build and deploy updates to both the genomics workflows and the infrastructure that supports their execution.

Modify for your genomics analysis and research projects

Modify the solution to fit your particular needs, for example, by adding new containerized tools and creating new workflows. Each change will be tracked by the CI/CD pipeline, facilitating change control management, rollbacks, and auditing.