Guidance for Bioinformatics Workflow Development Using DevOps on AWS
Important: This Guidance requires the use of AWS CodeCommit, which is no longer available to new customers. Existing customers of AWS CodeCommit can continue using and deploying this Guidance as normal.
Overview
How it works
The architecture diagram illustrates how to use this solution effectively. It shows the key components and their interactions, walking through the architecture's structure and functionality step by step.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
For efficient and effective operations, this Guidance helps you automate your build, test, and deployment processes with CodeBuild and CodePipeline. CodeCommit also supports your operations by storing code in private Git repositories for version control, so you can track changes, test new versions, and roll back if necessary. You can also orchestrate your automated tests with Step Functions to validate the quality and reliability of your deployments. Finally, you can centralize the management of your workflows in HealthOmics to improve visibility and monitoring; HealthOmics is a managed service purpose-built for healthcare and life science organizations to store, query, and analyze omics data.
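As a rough illustration of an automated workflow test, the sketch below builds the request that a CI step might use to start a HealthOmics test run for a given commit. The function name, the run-naming scheme, and all argument values are hypothetical and are not taken from the Guidance's sample code; only the request keys (workflowId, roleArn, name, outputUri, parameters) correspond to parameters of the HealthOmics StartRun operation.

```python
from typing import Any, Dict


def build_start_run_request(
    workflow_id: str,
    role_arn: str,
    output_uri: str,
    parameters: Dict[str, Any],
    commit_id: str,
) -> Dict[str, Any]:
    """Build keyword arguments for a HealthOmics StartRun call.

    The run name embeds a shortened commit ID so that a test run can be
    traced back to the change that triggered it (a hypothetical convention).
    """
    return {
        "workflowId": workflow_id,
        "roleArn": role_arn,
        "name": f"ci-test-{commit_id[:8]}",
        "outputUri": output_uri,
        "parameters": parameters,
    }


# Example values are placeholders, not real resources.
example_request = build_start_run_request(
    workflow_id="1234567",
    role_arn="arn:aws:iam::111122223333:role/ExampleOmicsRunRole",
    output_uri="s3://example-bucket/test-outputs/",
    parameters={"sample_name": "test-sample"},
    commit_id="0123abcd4567ef",
)
print(example_request["name"])
```

With boto3 installed and credentials configured, a request built this way could be passed to the service as boto3.client("omics").start_run(**example_request). A Step Functions state or a Lambda function would typically then poll the run status before reporting the test result.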
Security
CodeCommit, HealthOmics, and Amazon ECR work in tandem to protect your systems, applications, and data from potential threats. Specifically, CodeCommit provides secure storage and version control for your workflow code, with access controls, change tracking, and encryption. HealthOmics offers isolated, secure, and scalable processing of your bioinformatics workflows. Amazon ECR helps ensure secure storage and access control for your container images. Additionally, by separating your CI/CD and production environments, implementing least-privilege access, and securely managing your artifacts, you can achieve a higher level of isolation and security for your bioinformatics workflows.
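One way to picture least-privilege access to container images is an Amazon ECR repository policy that lets the HealthOmics service pull images and nothing more. The sketch below is an assumption about what such a policy could look like, not the policy shipped with the Guidance; the function name and the statement ID are hypothetical. Check the sample code for the permissions the Guidance actually configures.

```python
import json
from typing import Any, Dict

# Service principal for HealthOmics and the read-only actions needed to pull an image.
HEALTHOMICS_PRINCIPAL = "omics.amazonaws.com"
PULL_ACTIONS = [
    "ecr:BatchCheckLayerAvailability",
    "ecr:BatchGetImage",
    "ecr:GetDownloadUrlForLayer",
]


def build_ecr_repository_policy(sid: str = "AllowHealthOmicsPull") -> Dict[str, Any]:
    """Return a repository policy granting pull-only access to HealthOmics.

    Repository policies are attached to a single repository, so the
    statement carries no Resource element. No push or delete actions
    are granted.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Principal": {"Service": HEALTHOMICS_PRINCIPAL},
                "Action": list(PULL_ACTIONS),
            }
        ],
    }


# Serialize to the JSON text that a repository policy is stored as.
policy_text = json.dumps(build_ecr_repository_policy(), indent=2)
print(policy_text)
```

The resulting JSON text could be applied with the ECR SetRepositoryPolicy operation. In a cross-account setup, additional statements for the CI/CD account would be needed; those are omitted here.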
Reliability
Building resilient and highly available systems that can withstand failures requires services like CodePipeline, Step Functions, and HealthOmics. CodePipeline provides an automated way to build, test, and deploy new versions of your workflows. Step Functions orchestrates the various steps in your CI/CD pipeline, providing a fault-tolerant way to coordinate those steps and automatically retry the ones that fail. HealthOmics manages the underlying infrastructure and resources, supporting the reliability and availability of your workflow processing.
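The automatic retry behavior described above can be expressed in an Amazon States Language definition. The following is a minimal sketch, assuming a single Lambda-backed test step; the state names, the retry settings, and the Lambda ARN are hypothetical and do not reflect the state machine in the Guidance's sample code.

```python
import json
from typing import Any, Dict


def build_test_state_machine(run_test_lambda_arn: str) -> Dict[str, Any]:
    """Return a state machine definition with one retrying Task state.

    The Task is retried up to 3 times with exponential backoff
    (30 s, 60 s, 120 s). If it still fails, the Catch rule routes
    execution to an explicit Fail state.
    """
    return {
        "Comment": "Hypothetical workflow test orchestration",
        "StartAt": "RunWorkflowTest",
        "States": {
            "RunWorkflowTest": {
                "Type": "Task",
                "Resource": run_test_lambda_arn,
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "Catch": [
                    {"ErrorEquals": ["States.ALL"], "Next": "TestFailed"}
                ],
                "Next": "TestSucceeded",
            },
            "TestSucceeded": {"Type": "Succeed"},
            "TestFailed": {
                "Type": "Fail",
                "Error": "WorkflowTestFailed",
                "Cause": "The workflow test did not pass after retries.",
            },
        },
    }


# The ARN below is a placeholder, not a real function.
definition = build_test_state_machine(
    "arn:aws:lambda:us-east-1:111122223333:function:example-run-test"
)
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

Keeping the retry policy in the state machine definition, rather than in the test code itself, means transient failures are handled the same way for every workflow that passes through the pipeline.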
Performance Efficiency
You can optimize the use of computing resources while maximizing efficiency with CodeBuild, CodePipeline, Step Functions, and HealthOmics. CodeBuild supports a fully managed build and test workflow with features such as caching and auto-discovery. The efficient deployment processes, powered by CodePipeline and Step Functions, minimize the risk of performance regressions. Finally, HealthOmics provides a managed service for running your bioinformatics workflows, handling the provisioning and scaling of the underlying compute resources and storage systems for optimal workflow performance.
Cost Optimization
By supporting cross-account deployments, this Guidance helps you maintain secure and isolated environments for development, testing, and production, reducing the risk of inadvertent resource utilization and costs. It uses CodeBuild, CodePipeline, Lambda, Amazon ECR, and HealthOmics to do this. For example, the automated build and deployment processes of CodeBuild and CodePipeline mean that only the necessary resources are provisioned. By using Lambda for lightweight tasks, you reduce the need for always-on compute resources. Also, storing your built container images in Amazon ECR allows for reuse across multiple workflow deployments, saving time and compute costs. Furthermore, HealthOmics, as a managed service, eliminates the need for you to manage the underlying infrastructure and configuration complexities, which reduces your operational costs.
Sustainability
Minimize your carbon footprint and support responsible resource utilization with CodeBuild, Lambda, Amazon ECR, and HealthOmics. CodeBuild only provisions the necessary compute resources to perform the build and deployment tasks, scaling up and down as required, reducing energy consumption and the associated environmental impacts. Lambda avoids the need to provision and manage dedicated server infrastructure, running only when needed and shutting down when idle. Amazon ECR provides centralized, scalable, and durable storage of your container images, eliminating the need for additional container registries or storage solutions and reducing the overall hardware and energy footprint. By utilizing HealthOmics, you can use the service's scalable and serverless architecture to run your bioinformatics workflows and help lower your overall energy consumption.
Deploy with confidence
Ready to deploy? Review the sample code on GitHub for detailed deployment instructions. You can deploy it as-is or customize it to fit your needs.