reference deployment

Illumina DRAGEN on AWS

Ultra-rapid analysis of next-generation sequencing (NGS) data with DRAGEN and F1 instances

This Quick Start deploys Dynamic Read Analysis for GENomics Complete Suite (DRAGEN CS), a data analysis platform by Illumina, on the AWS Cloud in about 15 minutes.

DRAGEN CS enables ultra-rapid analysis of next-generation sequencing (NGS) data, significantly reduces the time required to analyze genomic data, and improves accuracy. It includes bioinformatics pipelines that provide highly optimized algorithms for mapping, aligning, sorting, duplicate marking, and haplotype variant calling. These pipelines include DRAGEN Germline V2, DRAGEN Somatic V2 (Tumor and Tumor/Normal), DRAGEN Virtual Long Read Detection (VLRD), DRAGEN RNA Gene Fusion, DRAGEN Joint Genotyping, and GATK Best Practices.

The Quick Start builds an AWS environment that spans two Availability Zones for high availability, and provisions two AWS Batch compute environments: one for Spot Instances and one for On-Demand Instances. These environments include DRAGEN F1 instances that are connected to field-programmable gate arrays (FPGAs) for hardware acceleration.

This Quick Start was developed by Illumina in collaboration with AWS. Illumina is an APN Partner.

  •  What you'll build
    Use this Quick Start to set up the following configurable environment on AWS:

    • A highly available architecture that spans two Availability Zones.*
    • A virtual private cloud (VPC) configured with public and private subnets according to AWS best practices. This provides the network infrastructure for your deployment.*
    • An internet gateway to provide access to the internet.*
    • In the public subnets, managed NAT gateways to allow outbound internet access for resources in the private subnets.*
    • An AWS CodePipeline pipeline that builds a Docker image and uploads it into an Amazon Elastic Container Registry (Amazon ECR) repository.
    • Two AWS Batch compute environments: one for Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances and the other for On-Demand Instances.
    • An AWS Batch job queue that prioritizes submission to the compute environment for Spot Instances to optimize for cost.
    • An AWS Batch job definition to run DRAGEN.
    • AWS Identity and Access Management (IAM) roles and policies for the AWS Batch jobs to run.

    * The template that deploys the Quick Start into an existing VPC skips the tasks marked by asterisks and prompts you for your existing VPC configuration.
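    The Spot-first job queue described above can be sketched as a request for the AWS Batch CreateJobQueue API. This is a minimal illustration, not the Quick Start's actual template: the queue and compute environment names are hypothetical, and the helper only builds the request dictionary. AWS Batch tries compute environments in ascending `order`, so placing the Spot environment first makes it the preferred target and leaves On-Demand as the fallback.

```python
def build_job_queue_request(queue_name, spot_env, on_demand_env, priority=1):
    """Build keyword arguments for the AWS Batch CreateJobQueue API.

    Compute environments are tried in ascending `order`, so the Spot
    environment (order 1) is preferred over On-Demand (order 2).
    """
    return {
        "jobQueueName": queue_name,
        "state": "ENABLED",
        "priority": priority,
        "computeEnvironmentOrder": [
            {"order": 1, "computeEnvironment": spot_env},
            {"order": 2, "computeEnvironment": on_demand_env},
        ],
    }


# Hypothetical names; the Quick Start generates its own resource names.
queue_request = build_job_queue_request(
    "dragen-queue", "dragen-spot", "dragen-ondemand"
)
```

    With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("batch").create_job_queue(**queue_request)`.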

  •  How to deploy
    To deploy Illumina DRAGEN on AWS, follow the instructions in the deployment guide. The deployment process includes these steps:

    1. If you don't already have an AWS account, sign up at https://aws.amazon.com.
    2. Subscribe to DRAGEN Complete Suite in AWS Marketplace.
    3. Launch the Quick Start. Each deployment takes about 15 minutes. You can choose from two options: deploy into a new VPC, or deploy into an existing VPC.
    4. Test the deployment by running a DRAGEN job.
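    As a rough sketch of step 4, a test job could be submitted through the AWS Batch SubmitJob API. The job name, queue name, and job definition name below are hypothetical, and the command is a placeholder: the actual DRAGEN arguments come from the deployment guide and are not reproduced here. The helper only builds the request, so it can be inspected before anything is sent.

```python
def build_submit_job_request(job_name, job_queue, job_definition, command, attempts=1):
    """Build keyword arguments for the AWS Batch SubmitJob API.

    `command` overrides the container command from the job definition.
    AWS Batch accepts between 1 and 10 retry attempts.
    """
    if not 1 <= attempts <= 10:
        raise ValueError("attempts must be between 1 and 10")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {"command": list(command)},
        "retryStrategy": {"attempts": attempts},
    }


# Placeholder command: substitute the DRAGEN arguments from the deployment guide.
job_request = build_submit_job_request(
    "dragen-test",        # hypothetical job name
    "dragen-queue",       # hypothetical job queue name
    "dragen-jobdef",      # hypothetical job definition name
    ["<DRAGEN arguments>"],
    attempts=2,
)
```

    With boto3 available, `boto3.client("batch").submit_job(**job_request)` would submit the job and return a response containing its `jobId`.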

    To customize your deployment, you can configure the network architecture, set the desired number of virtual CPUs for the AWS Batch environment, specify a bid percentage for Spot Instances, and set the number of AWS Batch job retries.
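    The customization options above are passed to AWS CloudFormation as stack parameters. The sketch below shows the general shape of that parameter list; the parameter keys (`DesiredvCpus`, `BidPercentage`, `RetryNumber`) are assumptions chosen for illustration, so check the template for the real names before using them.

```python
def to_cfn_parameters(settings):
    """Convert a dict of settings into the list-of-dicts form that the
    CloudFormation CreateStack API expects. Values are sent as strings."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in settings.items()
    ]


# Hypothetical parameter keys and values, for illustration only.
stack_parameters = to_cfn_parameters(
    {
        "DesiredvCpus": 16,    # desired vCPUs for the AWS Batch environment
        "BidPercentage": 50,   # Spot bid as a percentage of the On-Demand price
        "RetryNumber": 1,      # number of AWS Batch job retries
    }
)
```

    A list like this could then be supplied as the `Parameters` argument of `boto3.client("cloudformation").create_stack(...)`, together with `StackName`, `TemplateURL`, and `Capabilities=["CAPABILITY_IAM"]`, since the template creates IAM roles.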

  •  Cost and licenses
    You are responsible for the cost of the AWS services used while running this Quick Start reference deployment. There is no additional cost for using the Quick Start.

    The AWS CloudFormation template for this Quick Start includes configuration parameters that you can customize. Some of these settings, such as instance type, affect the cost of deployment. For cost estimates, see the pricing page for each AWS service you use. Prices are subject to change.

    This Quick Start requires a subscription to the Amazon Machine Image (AMI) for DRAGEN Complete Suite, which is available with per-hour pricing from AWS Marketplace.