AWS HPC Blog
Category: Life Sciences
Announcing: Seqera Containers for the bioinformatics community
Genomics community: rejoice! Seqera and AWS have teamed up to announce Seqera Containers, an open-source, no-cost, reliable way to generate containers.
Linter rules for Nextflow to improve the detection of errors before runtime
Check out this post to learn how linter rules for Nextflow’s DSL can help you catch errors in your workflows before runtime. That means greater developer productivity and a faster time to science.
Accelerating molecule discovery with computational chemistry and Promethium on AWS
Interested in performing high-accuracy computational chemistry simulations faster? Check out this new post about Promethium, a solution from QC Ware that leverages AWS to accelerate simulations by up to 100x.
Running protein structure prediction at scale using a web interface for researchers
Today, we’ll show you our open-source sample implementation of a web frontend and cloud HPC backend to support researchers using AI tools like AlphaFold for drug discovery and design.
Benchmarking the Oxford Nanopore Technologies basecallers on AWS
Oxford Nanopore sequencers enable direct, real-time analysis of long DNA or RNA fragments. They work by monitoring changes to an electrical current as nucleic acids are passed through a protein nanopore. The resulting signal is decoded into the specific DNA or RNA sequence by compute-intensive algorithms called basecallers. This blog post presents the benchmarking results for two of those Oxford Nanopore basecallers, Guppy and Dorado, on AWS. This benchmarking project was a collaboration between G42 Healthcare, Oxford Nanopore Technologies, and AWS.
How Evolvere Biosciences performs macromolecule design on AWS
In this blog post, we catch a glimpse into drug discovery to see how Evolvere Biosciences has deployed a customized architecture with AWS Batch and Nextflow to quickly and easily run its macromolecule design pipeline.
BioContainers are now available in Amazon ECR Public Gallery
Today we are excited to announce that all 9000+ applications provided by the BioContainers community are available within ECR Public Gallery! You don’t need an AWS account to access these images, but having one allows many more pulls over the internet, and unmetered usage within AWS. If you perform any sort of bioinformatics analysis on AWS, you should check it out!
Optimize Protein Folding Costs with OpenFold on AWS Batch
In this post, we describe how to orchestrate protein folding jobs on AWS Batch. We also compare the performance of OpenFold and AlphaFold on a set of public targets. Finally, we will discuss how to optimize your protein folding costs.
Benchmarking NVIDIA Clara Parabricks Somatic Variant Calling Pipeline on AWS
Somatic variants are genetic alterations which are not inherited but acquired during one’s lifespan, for example those that are present in cancer tumors. In this post, we will demonstrate how to perform somatic variant calling from matched tumor and normal genome sequence data, as well as tumor-only whole genome and whole exome datasets using an NVIDIA GPU-accelerated Parabricks pipeline, and compare the results with baseline CPU-based workflows.
Data Science workflows at insitro: how redun uses the advanced service features from AWS Batch and AWS Glue
Matt Rasmussen, VP of Software Engineering at insitro, expands on his first post on redun, insitro’s data science tool for bioinformatics, to describe how redun makes use of advanced AWS features. Specifically, Matt describes how AWS Batch’s Array Jobs are used to support workflows with large fan-out, and how AWS Glue’s DynamicFrame is used to run computationally heterogeneous workflows with different back-end needs such as Spark, all in the same workflow definition.