With Amazon Omics, you pay only for what you use. You are charged based on the amount of data you store and the compute instances you use for processing your workflow. With Amazon Omics, you can store sequence and reference data objects or variant and annotation data. You can also run bioinformatics workflows to analyze and transform genomic, transcriptomic, and other omics data. Amazon Omics is optimized for the storage and computation of omics data and works with other AWS services such as Amazon SageMaker, Amazon Simple Storage Service (S3), and Amazon Athena.
As part of the AWS Free Tier, you can get started with Amazon Omics for free. Your Free Tier starts from the first month when you create your first Amazon Omics resource. The details of the Amazon Omics Free Tier are in the table below.
Free Tier usage per month for the first 2 months
|Amazon Omics storage||1500 gigabase-months in active storage class and 1500 gigabase-months in archive storage class|
|Amazon Omics workflows||275 omics.m.xlarge instance hours or equivalent compute instances and 49,000 GB-hours of run storage|
|Amazon Omics analytics||200 gigabyte-months|
Amazon Omics storage pricing
When you store genomic sequences in Amazon Omics storage, you pay for storage per gigabase per month. A gigabase is one billion bases from your imported sequence files (such as FASTQ, BAM, and CRAM). Amazon Omics storage stores the bases, quality scores, alignments, and other metadata from your source files. You pay per gigabase stored, so you don’t need to worry about optimal file formats or compression techniques. Amazon Omics takes care of all of that for you.
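To get a rough sense of how many gigabases a file contains, you can count the bases yourself. The sketch below assumes an uncompressed FASTQ stream with exactly four lines per record; the function names are hypothetical, and the way Amazon Omics counts bases internally may differ, so treat the result as an estimate only.

```python
from typing import Iterable


def count_gigabases(fastq_lines: Iterable[str]) -> float:
    """Estimate gigabases in a four-line-per-record FASTQ stream."""
    total_bases = 0
    for index, line in enumerate(fastq_lines):
        # In a four-line FASTQ record, the second line holds the bases.
        if index % 4 == 1:
            total_bases += len(line.strip())
    # A gigabase is one billion bases.
    return total_bases / 1_000_000_000


def monthly_storage_cost(gigabases: float, price_per_gigabase_month: float) -> float:
    """Multiply gigabases by a per-gigabase monthly price that you supply."""
    return gigabases * price_per_gigabase_month


# Two tiny illustrative records: 8 bases + 4 bases = 12 bases.
example = ["@read1", "ACGTACGT", "+", "IIIIIIII", "@read2", "ACGT", "+", "IIII"]
gigabases = count_gigabases(example)
```

Pass the price for your Region and storage class as `price_per_gigabase_month`; no price is hard-coded here because rates vary by Region.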
Sequence objects are called read sets and are logically equivalent to a FASTQ, BAM, or CRAM file. Amazon Omics storage offers you an active storage class and an archive storage class for your read sets. Read sets in the archive class cost less per month to store than read sets in the active class. Read sets in the active class can be accessed in milliseconds, while read sets in the archive class need to be activated before they can be accessed. Read sets that have not been accessed for 30 days automatically move to the lower-cost archive storage class, where they stay until they are reactivated.
There is no import fee for read sets. Amazon Omics storage data is charged for a minimum storage duration of 30 days, and data deleted before 30 days incurs a prorated charge equal to the storage charge for the remaining days. Amazon Omics storage is designed for long-lived but infrequently accessed data that is retained for years.
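The minimum storage duration can be read as "you are billed for at least 30 days of storage." The sketch below is one interpretation of that rule, not the billing implementation; the function names are hypothetical.

```python
MIN_STORAGE_DAYS = 30  # minimum storage duration stated above


def billable_days(days_stored: int) -> int:
    """Days of storage charged, including any early-deletion charge."""
    return max(days_stored, MIN_STORAGE_DAYS)


def early_deletion_days(days_stored: int) -> int:
    """Remaining days charged when data is deleted before the minimum."""
    return max(MIN_STORAGE_DAYS - days_stored, 0)
```

Under this reading, data deleted after 10 days is charged for the 10 days stored plus 20 remaining days, and data kept for 45 days is charged for 45 days with no extra charge.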
You pay for GET requests made to your read-set objects. All other request types on read sets are free.
Amazon Omics analytics pricing
Amazon Omics analytics helps you prepare your genomic variant data and genomic annotations for use with the broad suite of AWS analytics and machine learning services such as Amazon Athena and Amazon SageMaker. You can store any amount of genomic variant data, and you pay only for what you store, where data size is defined as the size of the transformed data. When you query and analyze the data in other services, you pay separately for the use of those services.
Amazon Omics analytics data is charged for a minimum storage duration of 30 days, and data deleted before 30 days incurs a prorated charge equal to the storage charge for the remaining days.
Amazon Omics workflows pricing
Amazon Omics also manages the interpretation and execution of bioinformatics workflows. Amazon Omics workflows can run bioinformatics scripts written in the two most commonly used workflow languages, WDL and Nextflow. A single execution of these scripts is called a run. You pay only for what you use and are billed separately for omics instance types and run storage. All tasks in your workflow are mapped to the instance that is the best fit for their defined resources. For example, a task that is defined to use 8 CPUs and 60 GB of RAM will map to the omics.r.2xlarge instance type for execution. Workflow logs are stored in Amazon CloudWatch Logs in your account and billed in CloudWatch for as long as you choose to retain them. You can configure the service to report resource use per run for simplified budgeting, planning, and accounting.
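The best-fit mapping can be pictured as choosing the smallest instance that satisfies a task's CPU and memory request. The sketch below is illustrative only: the selection rule is an assumption rather than the service's actual algorithm, the candidate list covers only the four instance types named on this page, and their vCPU and memory sizes are assumed to mirror the equivalent Amazon EC2 sizes.

```python
from typing import NamedTuple, Optional


class InstanceType(NamedTuple):
    name: str
    vcpus: int
    memory_gb: int


# ASSUMPTION: sizes mirror the matching EC2 instance sizes; not taken from this page.
CANDIDATES = [
    InstanceType("omics.m.xlarge", 4, 16),
    InstanceType("omics.c.4xlarge", 16, 32),
    InstanceType("omics.r.2xlarge", 8, 64),
    InstanceType("omics.r.8xlarge", 32, 256),
]


def best_fit(vcpus: int, memory_gb: int) -> Optional[InstanceType]:
    """Return the smallest candidate that meets both requests, or None."""
    fitting = [
        inst for inst in CANDIDATES
        if inst.vcpus >= vcpus and inst.memory_gb >= memory_gb
    ]
    # ASSUMPTION: "best fit" means fewest vCPUs, then least memory.
    return min(fitting, key=lambda inst: (inst.vcpus, inst.memory_gb), default=None)
```

With these assumptions, `best_fit(8, 60)` selects omics.r.2xlarge, which matches the example in the paragraph above.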
A population sequencing initiative is starting to sequence individuals from a biobank it has collected. It chooses to do this in the EU West (Ireland) Region. It sequences 100,000 individuals, each at 130 gigabases, and stores the raw sequencing data in Amazon Omics storage. Each read set stays in the active storage class for the first 30 days after import and then moves to the archive storage class. Over the next five years, each read set is accessed twice, on average, and each access returns it to the active storage class for 30 days, for a total of 90 days in the active class. Each genome is downloaded in 500 parts, generating 500 GET API calls per download. The total cost over five years for a single genome is:
Active storage class: $0.005769 per gigabase-month * 130 gigabases * 90 days = $2.22
Archive storage class: $0.001154 per gigabase-month * 130 gigabases * (1825 - 90) days = $8.56
GET APIs: $0.005 / 1000 API calls * (2 * 500 API calls) = $0.005
Total cost for 5 years: $2.22 + $8.56 + $0.005 = $10.79 (or $2.16/year)
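The storage example above can be reproduced with a few lines of arithmetic. This is a sketch, not a billing calculator: the prices are the ones quoted in the example, and converting days to months with an average month of 365/12 days is an assumption that happens to match the rounded line items.

```python
DAYS_PER_MONTH = 365 / 12  # ASSUMPTION: average month length used for proration

ACTIVE_PRICE = 0.005769    # USD per gigabase-month, from the example
ARCHIVE_PRICE = 0.001154   # USD per gigabase-month, from the example
GET_PRICE_PER_1000 = 0.005 # USD per 1,000 GET calls, from the example


def storage_cost(price_per_gigabase_month: float, gigabases: float, days: float) -> float:
    """Cost of storing `gigabases` for `days` at a monthly per-gigabase price."""
    return price_per_gigabase_month * gigabases * days / DAYS_PER_MONTH


gigabases = 130
total_days = 5 * 365       # 1825 days
active_days = 90           # 30 days after import + 2 accesses * 30 days

active = storage_cost(ACTIVE_PRICE, gigabases, active_days)
archive = storage_cost(ARCHIVE_PRICE, gigabases, total_days - active_days)
get_cost = GET_PRICE_PER_1000 / 1000 * (2 * 500)
total = active + archive + get_cost

print(f"active={active:.2f} archive={archive:.2f} get={get_cost:.3f}")
```

Summing the unrounded components gives a figure about one cent lower than the $10.79 above, which adds the already rounded line items.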
A bioinformatics scientist wants to run a Nextflow workflow in Amazon Omics workflows in the US East (N. Virginia) Region. She has three tasks in the workflow. The first reserves 16 vCPUs and 30 GB of memory and takes 3 hours to run. The second reserves 32 vCPUs and 160 GB of memory and takes 2 hours to run. The third reserves 4 vCPUs and 10 GB of memory and takes 10 minutes to run. She registers the workflow and calls the StartRun API with the default 1200 GB file system. Her overall costs are:
Task 1 (omics.c.4xlarge): $0.9180/hr * 3 hrs = $2.754
Task 2 (omics.r.8xlarge): $2.7216/hr * 2 hrs = $5.4432
Task 3 (omics.m.xlarge): $0.2592/hr * 1/6 hrs = $0.0432
Storage: $0.0001918/GB-hour * (1200 GB * (3 hr + 2 hr + 1/6 hr)) = $1.18916
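The workflow example can be checked the same way. The sketch below uses only the rates and durations quoted above; like the example, it assumes the three tasks run one after another, so run storage is billed for the sum of their durations.

```python
# (instance type, hourly rate in USD, hours), taken from the example above
tasks = [
    ("omics.c.4xlarge", 0.9180, 3),
    ("omics.r.8xlarge", 2.7216, 2),
    ("omics.m.xlarge", 0.2592, 1 / 6),
]

RUN_STORAGE_PRICE = 0.0001918  # USD per GB-hour, from the example
FILESYSTEM_GB = 1200           # default file system size in the example

compute_cost = sum(rate * hours for _, rate, hours in tasks)

# ASSUMPTION: tasks run sequentially, so storage hours equal total task hours.
run_hours = sum(hours for _, _, hours in tasks)
storage_cost = RUN_STORAGE_PRICE * FILESYSTEM_GB * run_hours

total = compute_cost + storage_cost
print(f"compute={compute_cost:.4f} storage={storage_cost:.5f} total={total:.2f}")
```

Adding the four line items gives a total of about $9.43 for the run.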
A data scientist has 3,202 variant call format (VCF) files that he wants to analyze in Amazon Athena in the US East (N. Virginia) Region. He creates a variant store and ingests these files using the Amazon Omics APIs. The ingested data is 1.5 TB in size. Over the course of the next month, he runs 1,000 queries in Athena, calculating allele frequencies for different subpopulations, with each query scanning 50 GB on average. His overall monthly costs are:
Variant store: $0.035 per GB-month * (1024 GB/TB * 1.5 TB) = $53.76
Amazon Athena: $5/TB * (1,000 queries * 50 GB / 1024 GB/TB) = $244.14
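The analytics example reduces to two multiplications. The sketch below uses the rates quoted above and, like the example, treats 1 TB as 1024 GB; the variable names are illustrative.

```python
GB_PER_TB = 1024

VARIANT_STORE_PRICE = 0.035  # USD per GB-month, from the example
ATHENA_PRICE_PER_TB = 5.0    # USD per TB, from the example

ingested_tb = 1.5
queries = 1000
gb_per_query = 50

variant_store_cost = VARIANT_STORE_PRICE * ingested_tb * GB_PER_TB
athena_cost = ATHENA_PRICE_PER_TB * queries * gb_per_query / GB_PER_TB

monthly_total = variant_store_cost + athena_cost
print(f"variant store={variant_store_cost:.2f} athena={athena_cost:.2f} total={monthly_total:.2f}")
```

Together the two charges come to about $297.90 for the month, most of it from Athena queries rather than from the variant store itself.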