Codex Genetics Accelerates the Results of Clinical Trials Using AWS


Using AWS, Codex Genetics provides access to diagnostic and clinical trial data within 4 hours. Codex helps healthcare providers diagnose and treat neurological disorders and cancers. It uses Amazon S3 and Amazon ECS to store and process sequencing data and AWS Glue and Amazon Athena to query the results.

“For startups like Codex, the AWS Cloud provides a platform that enables us to focus on developing competencies and attracting healthcare customers that are focused on reducing time and costs.”

Allen Yu
Cofounder, Codex Genetics

About Codex Genetics

Codex Genetics has developed sequencing technology that provides valuable insights to healthcare providers and pharmaceutical companies as part of clinical trials for new treatments.

Benefits of AWS

  • Increases data querying speed by 70%
  • Saves days when sequencing clinical trial data
  • Reduces application costs by 70%
  • Dedicates 20% more resources to development
  • Cuts data storage expense by 10%

Advancing Biotech

Hong Kong–based Codex Genetics uses analytics powered by CoGenesis® Bioinformatics artificial intelligence to help the healthcare industry diagnose and treat neurological disorders and cancers. Founded in 2013, Codex works with pharmaceutical companies to develop medicines and assists healthcare providers such as public hospitals in Hong Kong in accelerating clinical trials.

Outgrowing On-Premises IT

Codex first launched its next-generation DNA sequencing technology and bioinformatics services as on-premises solutions located at customer sites. The major challenge with this approach was troubleshooting platform failures. When an issue occurred, Codex had to send an engineer onsite to resolve it. Allen Yu, cofounder of Codex Genetics, says, “We found it increasingly harder to find engineers to carry out visits because of a shortage of people in Hong Kong with the biology and computer science background that we need.”

A Cloud Provider That Understands Its Needs

To avoid onsite deployments, Codex looked to deliver its technology as a service from the cloud. It studied offerings from leading cloud providers to find the best fit for the business and chose Amazon Web Services (AWS). AWS had experience architecting for HIPAA in the cloud, which was crucial for Codex. Plus, documentation from AWS, particularly around its APIs, was more detailed than that of its competitors. “The AWS team had a deeper level of knowledge in optimizing the query of terabyte-scale databases down to seconds. We built two beta implementations on AWS, and based on these, we made an informed decision, balancing cost and performance,” says Yu.

Increases Data Querying Speed by 70%

Codex managed its cloud migration to AWS internally, working closely with AWS solutions architects. The Codex IT team ran proofs of concept to maximize the environment’s data-querying performance while minimizing admin workloads. The team tested Amazon Athena, a serverless interactive query service, with AWS Glue, a fully managed extract, transform, and load (ETL) service. Yu says, “With AWS Glue and Amazon Athena, query performance of complex join queries increased by 70 percent when compared with an on-premises database server, enabling us to deliver a better service to customers.”
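To make the Athena workflow concrete, here is a minimal sketch of submitting a SQL query and reading back the first page of results. It assumes a boto3 Athena client is passed in as `athena`; the function name `run_athena_query` and its parameters are illustrative and are not taken from Codex's implementation.

```python
import time


def run_athena_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Run a SQL query on Athena and return result rows as lists of strings.

    `athena` is assumed to be a boto3 Athena client, e.g. boto3.client("athena").
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Only the first page of results is read here. For SELECT queries,
    # the first row holds the column names.
    result = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A call might look like `run_athena_query(boto3.client("athena"), "SELECT gene FROM variants LIMIT 10", "genomics_db", "s3://example-bucket/athena-results/")`, where the table, database, and bucket names are hypothetical placeholders.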

Today, by taking advantage of the AWS Cloud environment, Codex can transform DNA sequencing data into diagnostic data cost-effectively and at speed. It uses Amazon Simple Storage Service (Amazon S3) to hold sequencing data, which gets transformed into files for analytics processing in Amazon Elastic Container Service (Amazon ECS), a container orchestration service. The solution runs on Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances. Dante Tsang, cloud solution engineer at Codex Genetics, says, “Processed data is then packaged and re-stored in Amazon S3 to reduce storage costs and speed up querying in Amazon Athena. This way, researchers can easily find the statistical data they are looking for, and access it via the Amazon API Gateway service.”

Keeping Data Well-Protected

Because Codex operates in healthcare, its data must be highly secure, so the company takes advantage of the multiple layers of data security built into AWS. To prevent unauthorized access, data is safeguarded using Amazon Cognito, a simple sign-up, sign-in, and access control service, alongside AWS Identity and Access Management (IAM). The compute layer is governed by AWS Web Application Firewall (AWS WAF) and AWS CloudTrail, which tracks user and API activity. This enables auditing and helps identify abnormal activity that Codex can shut down. The data layer is then protected through Advanced Encryption Standard (AES) encryption in Amazon S3.
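As an illustration of the data-layer protection described above, the following sketch requests AES-256 server-side encryption when writing an object to Amazon S3. It assumes a boto3 S3 client is passed in as `s3`. The article does not say whether Codex uses S3-managed keys or AWS KMS keys, so the S3-managed option shown here is an assumption, and the helper name `put_encrypted_object` is illustrative.

```python
def put_encrypted_object(s3, bucket, key, body):
    """Upload an object and ask S3 to encrypt it at rest with AES-256.

    `s3` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    "AES256" selects server-side encryption with S3-managed keys (SSE-S3).
    """
    return s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ServerSideEncryption="AES256",
    )
```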

Three Times More Data in Less Time at a Lower Cost

Codex provides healthcare customers with diagnostic and clinical trial data within 4 hours, compared to a week with a traditional lab. Using AWS, as well as a high-throughput Illumina NextSeq 2000 sequencer, Codex can process three times more data at a lower cost than it could in an on-premises infrastructure. This helps its customers, including public hospitals in Hong Kong and a global pharmaceutical company, accelerate their clinical trials to find treatments and develop new drugs faster. Comments Yu, “Sequencing processes are automatically triggered, and there’s no manual intervention. Customers can now receive the clinical trial data they need in hours.”
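The article does not describe how sequencing jobs are triggered. One common pattern on AWS, shown here purely as an assumption, is to react to Amazon S3 event notifications when a new sequencing file is uploaded. The sketch below only extracts the bucket and object key from such an event; starting the actual processing job is left out, and the function name `uploaded_objects` is illustrative.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications
        # (for example, spaces become '+').
        key = unquote_plus(s3_info["object"]["key"])
        pairs.append((bucket, key))
    return pairs
```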

Reducing the Cost of the Web Application by 70%

Codex customers access diagnostic and clinical trial data using a customer-facing web application. Rather than have an Amazon EC2 instance constantly running to serve the application, Codex uses AWS Lambda, which enables customers to run application code without provisioning dedicated servers. Functions run only when a customer launches the application. Comments Yu, “We have reduced the cost of the application by 70 percent, freeing up resources to invest in other areas of our business.”
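To show what running code without a dedicated server can look like, here is a minimal sketch of an AWS Lambda handler that responds to an Amazon API Gateway proxy request. The `sample_id` query parameter and the response payload are hypothetical placeholders, not details of the Codex application.

```python
import json


def handler(event, context):
    """Minimal AWS Lambda handler for an API Gateway proxy integration."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    sample_id = params.get("sample_id")  # hypothetical parameter name
    if not sample_id:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "sample_id is required"}),
        }
    # Placeholder payload: a real function would look up results here.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"sample_id": sample_id, "status": "placeholder"}),
    }
```

Because the function runs only when a request arrives, there is no idle instance to pay for between requests, which is the cost model the section above describes.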

Thanks to the high levels of automation in AWS and Lambda functions that bypass the need for dedicated server provisioning, Codex has been able to redeploy IT engineers away from infrastructure management. Tsang says, “The company has shifted about 20 percent more of its IT resources to software development focusing on innovation and how to gain insights from the data.”

Cutting Storage Costs by 10%

Besides the reduced computing costs on important applications, Codex has also lowered its storage costs by 10 percent compared to its on-premises storage and software-defined backup solutions. The combination of faster processing and lower expense for compute and storage makes cloud computing a key enabler for biotech companies that need large IT infrastructures behind them. Says Yu, “For startups like Codex, the AWS Cloud provides a platform that enables us to focus on developing competencies and attracting healthcare customers that are focused on reducing time and costs.”

AWS Services Used

Amazon Athena

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

AWS Glue

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. 

Amazon Simple Storage Service

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics.

Amazon Elastic Container Service

Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service. Customers such as Duolingo, Samsung, GE, and Cookpad use ECS to run their most sensitive and mission-critical applications because of its security, reliability, and scalability.