This Guidance demonstrates how to import omics sequence data from Amazon Simple Storage Service (Amazon S3) into AWS HealthOmics Storage. HealthOmics Storage can help you store and share genomics data efficiently, allowing you to realize cost savings as your volume of genomics data grows. Because HealthOmics integrates with other AWS services, you can store your genomics data securely, and this Guidance can also help you protect patient privacy and automate workflows, streamlining data processing and analysis.
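
As a rough illustration of the import step, the following Python sketch assembles the parameters for the HealthOmics StartReadSetImportJob operation, which boto3 exposes as `start_read_set_import_job` on the `omics` client. The helper names, bucket, sequence store ID, and role ARN are hypothetical placeholders and are not taken from this Guidance's sample code. Submitting the request assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict, List, Optional

# File formats named in this Guidance.
SUPPORTED_FILE_TYPES = {"FASTQ", "BAM", "CRAM"}


def build_import_source(
    source1: str,
    file_type: str,
    subject_id: str,
    sample_id: str,
    name: str,
    source2: Optional[str] = None,
    reference_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one entry of the `sources` list for StartReadSetImportJob."""
    if file_type not in SUPPORTED_FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    for uri in (source1, source2):
        if uri is not None and not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got: {uri}")

    source_files = {"source1": source1}
    if source2 is not None:
        # Second file of a paired-end FASTQ sample.
        source_files["source2"] = source2

    item: Dict[str, Any] = {
        "sourceFiles": source_files,
        "sourceFileType": file_type,
        "subjectId": subject_id,
        "sampleId": sample_id,
        "name": name,
    }
    if reference_arn is not None:
        # A reference genome ARN can be supplied for aligned formats.
        item["referenceArn"] = reference_arn
    return item


def build_import_request(
    sequence_store_id: str, role_arn: str, sources: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Assemble the keyword arguments for start_read_set_import_job."""
    if not sources:
        raise ValueError("at least one source is required")
    return {
        "sequenceStoreId": sequence_store_id,
        "roleArn": role_arn,
        "sources": sources,
    }


def start_import(request: Dict[str, Any]) -> str:
    """Submit the import job and return its ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builders above work without it

    client = boto3.client("omics")
    response = client.start_read_set_import_job(**request)
    return response["id"]
```

Keeping the request builders separate from the boto3 call makes the parameter shape easy to check without touching an AWS account; only `start_import` needs network access and an IAM role that HealthOmics can assume to read the S3 objects.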

Please note: [Disclaimer]

Architecture Diagram

Download the architecture diagram PDF 

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • This Guidance is implemented using the AWS Cloud Development Kit (AWS CDK), where the business logic, infrastructure, and configuration are defined as code. This allows changes and integrations to be made as code and tracked within a version control system. 

    Read the Operational Excellence whitepaper 
  • Amazon S3 is protected by the AWS secure global network infrastructure. Security and compliance are a shared responsibility between AWS and the customer. This shared model can help relieve the customer's operational burden because AWS operates, manages, and controls the components from the host operating system and virtualization layer down to the physical security of the facilities in which the service operates. 

    Amazon S3 secures data from unauthorized access with encryption features and access management tools. HealthOmics encrypts sensitive customer data at rest by default using a service-owned AWS Key Management Service (AWS KMS) key. Customer-managed KMS keys are also supported. For more on data protection with HealthOmics, see Data protection in AWS HealthOmics.

    Read the Security whitepaper 
  • Because this Guidance is built on AWS serverless and managed services, AWS is responsible for the efficient operation of those services, and applications can scale with demand. This helps the workload perform its intended function correctly and consistently when it is expected to. It also allows customers to operate and test the workload throughout its lifecycle. 

    Read the Reliability whitepaper 
  • The backbone of this Guidance is a set of AWS serverless and managed services that minimize operational overhead, such as server management. HealthOmics Storage is purpose-built for omics sequence data, allowing customers to store, discover, and share raw sequence data efficiently, securely, and at low cost.

    Read the Performance Efficiency whitepaper 
  • This Guidance includes the functionality to move data into HealthOmics Storage. HealthOmics provides a cost-effective, omics-aware storage option for reference and sequence data that can reduce the total cost of ownership (TCO) for storing raw sequence data. Supported file formats include BAM, CRAM, and FASTQ.

    HealthOmics automatically moves data to a less expensive storage class when the data is not regularly accessed (for example, data that has not been accessed for more than 30 days). This is similar to the Amazon S3 Intelligent-Tiering storage class, which automates storage cost savings by moving data when access patterns change.

    This Guidance uses AWS Lambda, a serverless service, for event-driven computing, and AWS Step Functions to orchestrate and sequence the data import workflow. AWS serverless services allow applications to scale quickly with demand while using only the minimum resources required. 

    Read the Cost Optimization whitepaper 
  • When building cloud workloads, the practice of sustainability is knowing the impacts of the services used and applying design principles to reduce those impacts. In the case of this Guidance, because it relies extensively on serverless and managed services, the services scale to continually match the load, but with just the minimum resources needed, reducing the risk of over-provisioning resources. 

    Read the Sustainability whitepaper 
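
The Lambda and Step Functions orchestration described in the pillars above could look roughly like the following sketch, which builds an Amazon States Language definition as a Python dictionary: start the import, wait, check the job status, and loop until the job reaches a terminal state. The state names, the two Lambda function ARNs, the `$.job.status` path, and the polling interval are assumptions made for illustration, and the actual state machine in this Guidance's sample code may be structured differently. The status strings are those HealthOmics reports for read set import jobs.

```python
import json
from typing import Any, Dict, Set

# Terminal statuses reported for a HealthOmics read set import job.
SUCCESS_STATUS = "COMPLETED"
FAILURE_STATUSES = ["FAILED", "CANCELLED", "COMPLETED_WITH_FAILURES"]


def build_state_machine(
    start_fn_arn: str, check_fn_arn: str, wait_seconds: int = 60
) -> Dict[str, Any]:
    """Return a hypothetical polling state machine for a data import job."""
    choices = [
        {
            "Variable": "$.job.status",
            "StringEquals": SUCCESS_STATUS,
            "Next": "ImportSucceeded",
        }
    ]
    for status in FAILURE_STATUSES:
        choices.append(
            {"Variable": "$.job.status", "StringEquals": status, "Next": "ImportFailed"}
        )

    return {
        "Comment": "Illustrative import workflow: start, wait, poll, finish",
        "StartAt": "StartImport",
        "States": {
            # Lambda that submits the import job and returns its ID and status.
            "StartImport": {
                "Type": "Task",
                "Resource": start_fn_arn,
                "ResultPath": "$.job",
                "Next": "WaitForImport",
            },
            "WaitForImport": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "CheckStatus",
            },
            # Lambda that looks up the job and returns the current status.
            "CheckStatus": {
                "Type": "Task",
                "Resource": check_fn_arn,
                "ResultPath": "$.job",
                "Next": "IsImportDone",
            },
            # Any non-terminal status falls through to another wait cycle.
            "IsImportDone": {
                "Type": "Choice",
                "Choices": choices,
                "Default": "WaitForImport",
            },
            "ImportSucceeded": {"Type": "Succeed"},
            "ImportFailed": {
                "Type": "Fail",
                "Error": "ImportJobFailed",
                "Cause": "The import job ended in a non-successful terminal state",
            },
        },
    }


def undefined_targets(definition: Dict[str, Any]) -> Set[str]:
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets - set(states)


if __name__ == "__main__":
    definition = build_state_machine(
        "arn:aws:lambda:us-east-1:111122223333:function:ExampleStartImport",
        "arn:aws:lambda:us-east-1:111122223333:function:ExampleCheckImport",
    )
    print(json.dumps(definition, indent=2))
```

Polling with a Wait state, rather than blocking inside a Lambda function, means no compute runs while the import is in progress, which fits the pay-for-what-you-use reasoning in the Cost Optimization and Sustainability notes.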

Implementation Resources

A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of building the Guidance, including deployment, usage, and cleanup.

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.
