Overview
The Allen Institute is reshaping the future of brain science by providing more than 1,000 organizations worldwide with open datasets and resources. As part of its mission to understand the brain’s structure, function, and role in disease, the Allen Institute created the Brain Knowledge Platform (BKP). Built on Amazon Web Services (AWS), the BKP supports cloud-driven neuroscience discovery at large scale.
Through the BKP, the Allen Institute gives scientists around the world a centralized environment for collaboration and discovery. By combining high performance computing (HPC) with generative AI, the BKP lets researchers move beyond traditional laboratory experimentation and accelerate discovery through in-silico research.
About the Allen Institute
A leader in large-scale research and committed to open science, the Allen Institute is an independent 501(c)(3) nonprofit research organization dedicated to answering the biggest questions in bioscience and accelerating breakthroughs worldwide.
Opportunity | Modernizing brain science with HPC to fuel discovery
Before implementing the BKP, the Allen Institute managed its rapidly growing neuroscience data through siloed, on-premises systems. The fragmented environment limited collaboration between teams, slowed down analysis, and made scaling increasingly difficult as data volumes approached exabytes. Those conditions also made it challenging for the organization to adopt emerging technologies—such as generative AI—as they became available.
To address these challenges, the organization’s data and technology team consulted with scientists to understand what a future-ready research platform might look like. “We wanted to provide scientists with a modern technological environment where they can do scientific discovery using state-of-the-art technologies on the cloud,” says Shoaib Mufti, senior director for data and technology at the Allen Institute. Researchers emphasized two needs: intuitive, AI-supported tools for navigating the Allen Institute’s growing body of data, and consistent frameworks and standards to help them align and compare data generated in different labs.
With this input, the Allen Institute sought a cloud-based solution that could provide a unified, modern technological environment for large-scale discovery. After evaluating several options, the organization selected AWS for its breadth of services, scalability, and ability to support collaboration.
Solution | Building a unified, scalable platform for neuroscience
Using AWS, the Allen Institute built the BKP, a unified, researcher-centered environment that incorporates AI tools for intuitive data access and large-scale discovery. Informed by feedback from scientists, the BKP brings together multimodal neuroscience data—including DNA, RNA, and cell imaging—and genetics tools in a single cloud-based system. The BKP ingests data from the Allen Institute’s laboratories as well as other sources and stores it in Amazon Simple Storage Service (Amazon S3), an object storage service, and Amazon Relational Database Service (Amazon RDS), an easy-to-manage relational database service. Within the database, users can find the data that they need using Amazon OpenSearch Service—an AWS-managed service that lets users run and scale OpenSearch clusters.
To scale compute-intensive workloads, the institute runs its HPC data processing pipelines on AWS Batch, a fully managed batch computing service. Amazon Elastic Compute Cloud (Amazon EC2) supplies the secure, resizable compute capacity behind those pipelines, which makes elastic, high-throughput computing possible. “Using AWS Batch is increasingly important for us as we run more HPC-related workloads for advanced computing,” says Mufti.
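As a rough illustration of how a pipeline can be fanned out on AWS Batch, the sketch below builds the request for an array job, in which Batch launches many child jobs from one submission and gives each a distinct index through the AWS_BATCH_JOB_ARRAY_INDEX environment variable. This is a minimal sketch, not the Allen Institute's pipeline: the job name, queue name, job definition name, sample counts, and both helper functions are hypothetical. The code only assembles the request and makes no AWS call.

```python
import math

# AWS Batch accepts array jobs with between 2 and 10,000 child jobs.
MIN_ARRAY_SIZE = 2
MAX_ARRAY_SIZE = 10_000


def build_array_job_request(n_samples, samples_per_task, job_queue, job_definition):
    """Return keyword arguments for the boto3 Batch client's submit_job call."""
    if n_samples < 1 or samples_per_task < 1:
        raise ValueError("n_samples and samples_per_task must be positive")
    array_size = math.ceil(n_samples / samples_per_task)
    if not MIN_ARRAY_SIZE <= array_size <= MAX_ARRAY_SIZE:
        raise ValueError(f"array size {array_size} is outside the allowed range")
    return {
        "jobName": "omics-pipeline-sketch",  # hypothetical name
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": array_size},
        "containerOverrides": {
            # Passed to every child job so it can work out its own slice.
            "environment": [
                {"name": "N_SAMPLES", "value": str(n_samples)},
                {"name": "SAMPLES_PER_TASK", "value": str(samples_per_task)},
            ]
        },
    }


def samples_for_child(array_index, n_samples, samples_per_task):
    """Return the sample indices one child job would process.

    Inside the container, array_index would be read from the
    AWS_BATCH_JOB_ARRAY_INDEX environment variable.
    """
    start = array_index * samples_per_task
    stop = min(start + samples_per_task, n_samples)
    return range(start, stop)


# Hypothetical workload: 2,500 samples, 100 per child job.
request = build_array_job_request(
    n_samples=2_500,
    samples_per_task=100,
    job_queue="hypothetical-queue",
    job_definition="hypothetical-job-definition",
)

# With AWS credentials configured, the request could be submitted with:
#   import boto3
#   boto3.client("batch").submit_job(**request)
```

Because each child job derives its slice from its own index, the child jobs do not need to coordinate with one another, which is what lets the service schedule them across as much Amazon EC2 capacity as the compute environment allows.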
The Allen Institute is using the BKP to design smarter, more precise neuroscience tools. Researchers use Amazon SageMaker AI to build, train, and deploy AI models that help predict which targets are viable, greatly decreasing research time and cost. Allen Institute researchers can accelerate the model training using Amazon SageMaker HyperPod, which streamlines model development by removing the undifferentiated heavy lifting in building generative AI models, and Amazon EC2 Elastic GPUs—GPU resources that can be attached to Amazon EC2 instances. “To determine whether a part of the genome is a good target for gene regulation, we have to build a genetic tool and go test it. That’s an expensive process in the wet lab,” says Tyler Mollenkopf, associate director for data and technology at the Allen Institute. “Using in-silico predictions and AI modeling, our scientists can get to precision tools much more efficiently.”
Outcome | Accelerating neuroscience workflows from weeks to 1 day
By combining HPC with generative AI on AWS, the Allen Institute has created a powerful environment that expands research possibilities. The BKP supports more than 300 of the Allen Institute’s scientists, as well as more than 1,000 organizations worldwide, facilitating faster analysis, broader collaboration, and greater scientific reach. Through Open Data on AWS, a service for sharing any volume of data with as many people as an organization wants, the Allen Institute makes large datasets available to the public.
Using HPC on AWS, the Allen Institute has achieved its goal of building a scalable, powerful, and adaptable environment for collaborative neuroscience research. “Previously, we were constrained by resources,” says Mufti. “Using AWS, our compute-intensive workloads can now be highly parallelized.” Thanks to this parallelization, omics pipeline sequences that once took weeks to run on premises now run in as little as 1 day on AWS.
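The effect of parallelization on run time can be sketched with a simple idealized model: if a workload splits into independent tasks of equal length, wall-clock time is the number of "waves" of tasks multiplied by the time per task. The function name and every number below are illustrative assumptions, not Allen Institute measurements, and the model ignores scheduling overhead, data transfer, and uneven task sizes.

```python
import math


def wall_clock_hours(n_tasks, hours_per_task, n_workers):
    """Ideal wall-clock time for independent tasks of equal length."""
    if n_tasks < 0 or n_workers < 1:
        raise ValueError("n_tasks must be >= 0 and n_workers must be >= 1")
    # Each wave runs up to n_workers tasks at the same time.
    waves = math.ceil(n_tasks / n_workers)
    return waves * hours_per_task


# Hypothetical workload: 336 one-hour tasks.
serial_hours = wall_clock_hours(n_tasks=336, hours_per_task=1.0, n_workers=1)
parallel_hours = wall_clock_hours(n_tasks=336, hours_per_task=1.0, n_workers=16)

print(f"1 worker:   {serial_hours / 24:.1f} days")
print(f"16 workers: {parallel_hours:.0f} hours")
```

Under these assumptions, a run that would take two weeks on a single worker finishes in under a day on 16 workers. Real pipelines usually fall short of this ideal because some stages cannot be split.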
Scientists also benefit from the scalability of AWS. Traditional tools could handle datasets containing up to 30,000 cells. Today, the BKP can support single-cell datasets of over 30 million cells—an increase of 1,000 times.
Looking ahead, the Allen Institute envisions the BKP as a place where scientists outside of the Allen Institute can run their own analysis pipelines, write code, and build workflows to further neuroscience research. “The BKP personifies the key values of the Allen Institute: open science, team science, big science, and impactful science,” says Mufti.