This Guidance demonstrates how to set up a continuous integration and continuous delivery (CI/CD) pipeline to automate the lifecycle of your bioinformatics workflows on AWS HealthOmics. By integrating your existing workflows with source control systems like Git, the CI/CD pipeline enables efficient development, testing, version management, and deployment of workflow updates. Whenever code changes are committed, the pipeline automatically builds, tests, and deploys the new workflow version. This approach streamlines your workflow processes, reducing manual effort while maintaining data provenance and consistent, repeatable results across all versioned workflows.

Architecture Diagram

[Architecture diagram description]

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • For efficient and effective operations, this Guidance helps you automate your build, test, and deployment processes with CodeBuild and CodePipeline. CodeCommit also supports your operations by storing code in private Git repositories for version control, so you can track changes, test new versions, and roll back if necessary. You can also orchestrate your automated tests with Step Functions to validate the quality and reliability of your deployments. Finally, you can centralize the management of your workflows in HealthOmics, a managed service purpose-built for healthcare and life science organizations to store, query, and analyze omics data, to improve visibility and monitoring.

    Read the Operational Excellence whitepaper 
  • CodeCommit, HealthOmics, and Amazon ECR work in tandem to protect your systems, applications, and data from potential threats. Specifically, CodeCommit provides secure storage and version control for your workflow code, with access controls, change tracking, and encryption. HealthOmics offers isolated, secure, and scalable processing of your bioinformatics workflows. Amazon ECR helps ensure secure storage and access control for your container images. Additionally, by separating your CI/CD and production environments, implementing least-privilege access, and securely managing your artifacts, you can achieve a higher level of isolation and security for your bioinformatics workflows.

    Read the Security whitepaper 
  • CodePipeline, Step Functions, and HealthOmics help you build resilient, highly available systems that can withstand failures. CodePipeline provides an automated way to build, test, and deploy new versions of your workflows. Step Functions orchestrates the steps in your CI/CD pipeline and provides a fault-tolerant way to coordinate them and automatically retry failed steps. HealthOmics manages the underlying infrastructure and resources, supporting the reliability and availability of your workflow processing.

    Read the Reliability whitepaper 
  • You can optimize the use of computing resources while maximizing efficiency with CodeBuild, CodePipeline, Step Functions, and HealthOmics. CodeBuild is a fully managed build and test service with features such as caching and auto-discovery. The efficient deployment processes, powered by CodePipeline and Step Functions, minimize the risk of performance regressions. Finally, HealthOmics provides a managed service for running your bioinformatics workflows, handling the provisioning and scaling of the underlying compute resources and storage systems for optimal workflow performance.

    Read the Performance Efficiency whitepaper 
  • By supporting cross-account deployments, this Guidance helps you maintain secure and isolated environments for development, testing, and production, reducing the risk of unintended resource use and cost. It uses CodeBuild, CodePipeline, Lambda, Amazon ECR, and HealthOmics to do this. For example, the automated build and deployment processes of CodeBuild and CodePipeline provision only the resources that are needed. By using Lambda for lightweight tasks, you reduce the need for always-on compute resources. Storing your built container images in Amazon ECR allows you to reuse them across multiple workflow deployments, saving time and compute costs. Furthermore, HealthOmics, as a managed service, removes the need to manage the underlying infrastructure and its configuration, which reduces your operational costs.

    Read the Cost Optimization whitepaper 
  • Minimize your carbon footprint and support responsible resource utilization with CodeBuild, Lambda, Amazon ECR, and HealthOmics. CodeBuild only provisions the necessary compute resources to perform the build and deployment tasks, scaling up and down as required, reducing energy consumption and the associated environmental impacts. Lambda avoids the need to provision and manage dedicated server infrastructure, running only when needed and shutting down when idle. Amazon ECR provides centralized, scalable, and durable storage of your container images, eliminating the need for additional container registries or storage solutions and reducing the overall hardware and energy footprint. By utilizing HealthOmics, you can use the service's scalable and serverless architecture to run your bioinformatics workflows and help lower your overall energy consumption.

    Read the Sustainability whitepaper 
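
To illustrate the automatic retry of failed steps described under the Reliability pillar, the following is a minimal sketch of a Step Functions state machine definition written in Amazon States Language and built as a Python dictionary. It is not taken from this Guidance's sample code. The state names, the Lambda function ARN, and the retry settings are assumptions chosen for illustration only.

```python
import json


def build_test_state_machine(run_test_lambda_arn: str, max_attempts: int = 3) -> dict:
    """Build an Amazon States Language definition with one retried test step.

    The state names and retry settings are hypothetical; adjust them to
    match your own pipeline.
    """
    return {
        "Comment": "Hypothetical test stage for a workflow CI/CD pipeline",
        "StartAt": "RunWorkflowTest",
        "States": {
            "RunWorkflowTest": {
                "Type": "Task",
                # Service integration that invokes a Lambda function.
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": run_test_lambda_arn,
                    "Payload.$": "$",
                },
                # Retry a failed task with exponential backoff.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": max_attempts,
                        "BackoffRate": 2.0,
                    }
                ],
                # After retries are exhausted, route any error to a Fail state.
                "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "TestFailed"}],
                "Next": "TestPassed",
            },
            "TestPassed": {"Type": "Succeed"},
            "TestFailed": {
                "Type": "Fail",
                "Error": "WorkflowTestFailed",
                "Cause": "The workflow test did not pass after all retries.",
            },
        },
    }


def retry_delays(retrier: dict) -> list:
    """Return the wait in seconds before each retry attempt of one retrier."""
    interval = retrier["IntervalSeconds"]
    rate = retrier["BackoffRate"]
    return [interval * rate**attempt for attempt in range(retrier["MaxAttempts"])]


if __name__ == "__main__":
    # The ARN below is a placeholder, not a real function.
    definition = build_test_state_machine(
        "arn:aws:lambda:us-east-1:123456789012:function:run-workflow-test"
    )
    print(json.dumps(definition, indent=2))
```

With these assumed settings, a failing test step would be retried up to three times, waiting 30, 60, and 120 seconds, before the state machine ends in the Fail state. The printed JSON could be supplied as the definition when you create the state machine.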

Implementation Resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
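
Separately from that sample code, the following minimal sketch shows one way a pipeline step might derive a versioned workflow name from a Git tag and record the commit as tags for provenance. The function names, the naming scheme, the tag keys, and the S3 URI are assumptions for illustration. The resulting dictionary uses parameter names accepted by the HealthOmics `CreateWorkflow` API (`name`, `engine`, `definitionUri`, `tags`), and the sketch itself makes no AWS calls.

```python
import re

# Accepts tags such as "v1.4.0" or "1.4.0" (hypothetical tagging convention).
_SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")
_ENGINES = {"WDL", "NEXTFLOW", "CWL"}


def workflow_release_name(base_name: str, git_tag: str) -> str:
    """Combine a workflow base name and a semantic version tag into one name."""
    match = _SEMVER_TAG.match(git_tag)
    if match is None:
        raise ValueError(f"Not a semantic version tag: {git_tag!r}")
    major, minor, patch = match.groups()
    return f"{base_name}-v{major}-{minor}-{patch}"


def create_workflow_request(
    base_name: str,
    git_tag: str,
    commit_sha: str,
    definition_uri: str,
    engine: str = "WDL",
) -> dict:
    """Build keyword arguments for a HealthOmics CreateWorkflow call."""
    if engine not in _ENGINES:
        raise ValueError(f"Unsupported workflow engine: {engine!r}")
    return {
        "name": workflow_release_name(base_name, git_tag),
        "engine": engine,
        "definitionUri": definition_uri,
        # Tag keys are hypothetical; they link the workflow to its source commit.
        "tags": {"git_tag": git_tag, "git_commit": commit_sha},
    }


if __name__ == "__main__":
    request = create_workflow_request(
        base_name="germline-calling",
        git_tag="v1.4.0",
        commit_sha="abc1234",
        definition_uri="s3://example-bucket/definitions/workflow.zip",
    )
    print(request)
    # With boto3 installed and credentials configured, the request could be
    # sent as: boto3.client("omics").create_workflow(**request)
```

Keeping the request construction in a pure function makes it straightforward to unit test in the build stage before any deployment step runs.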

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.
