Accelerating new materials design with open data on AWS
The Materials Project at Lawrence Berkeley National Laboratory (LBNL) is an open database of material properties, covering the elements and substances that make up the products we use every day. By harnessing the Department of Energy’s (DOE) high-performance scientific computing and state-of-the-art electronic structure methods, the Materials Project provides open, web-based access on Amazon Web Services (AWS) to computational datasets on both known and potential materials, along with powerful analysis tools to help discover, inspire, and design new materials.
By computing properties of all known materials, the Materials Project can help users remove guesswork from materials design in a variety of applications and target the most promising compounds from computational datasets. To date, over 5,000 molecules and over 140,000 inorganic compounds are included in the database, with millions of calculated, associated properties. Research from the Materials Project’s open database can be applied to innovations like transparent conducting films, thermoelectric devices, LEDs, electrolytes, and other new technologies.
“Materials performance is the main limiter to new technology advancement,” said Kristin Persson, Director of the Materials Project, Professor in the Department of Materials Science and Engineering at University of California – Berkeley, and Faculty Senior Scientist at LBNL. “This work is critical to our journey towards increased energy efficiency and renewable energy production.”
When the Materials Project launched in 2011, it relied on on-premises support from the National Energy Research Scientific Computing Center (NERSC) at LBNL to buy, configure, and maintain its compute and data infrastructure. A team of four scientists led data production and ingestion into an on-premises MongoDB instance, a source-available, cross-platform, document-oriented NoSQL database. As its community grew to over 200,000 global users across multiple industries and academia, the Materials Project needed to move its on-premises infrastructure to the cloud so that it could keep scaling without adding staff.
In 2019, the Materials Project decided to migrate both its database and its customer-facing website to AWS to increase availability and the ability to scale. A single computer systems engineer from the Materials Project designed, developed, and now operates a microservices-based architecture using AWS services like Amazon Simple Storage Service (Amazon S3), Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Registry (Amazon ECR), and AWS Fargate (Fargate). The architecture uses auto scaling to meet modern requirements for increased security, high availability, rapid development, and scalability of the Materials Project’s technology services on a limited budget. Now, the small core team uses infrastructure-as-code from the ground up, so that the Materials Project can use its limited cloud computing and human resources efficiently and deliver data products to its users worldwide.
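To make the infrastructure-as-code idea concrete, here is a minimal sketch in Python of how one containerized service on Fargate, with target-tracking auto scaling, could be described as data and rendered to a template. This is an illustration only, not the Materials Project's actual configuration: the article does not say which infrastructure-as-code tool the team uses, so the choice of an AWS CloudFormation-style template is an assumption, and the function name `fargate_service_template`, the service name, the image URI, and all sizing values are hypothetical. The fragment is also deliberately incomplete; networking, IAM roles, and load balancing are left out.

```python
import json


def fargate_service_template(service_name, image_uri,
                             min_tasks=1, max_tasks=4, cpu_target=60.0):
    """Return a CloudFormation-style template (as a dict) for one service.

    Hypothetical helper for illustration; all names and values are assumptions.
    Networking, IAM roles, and load balancing are intentionally omitted.
    """
    # Application Auto Scaling identifies an ECS service as
    # "service/<cluster-name>/<service-name>".
    resource_id = {"Fn::Join": ["/", [
        "service",
        {"Ref": "Cluster"},
        {"Fn::GetAtt": ["Service", "Name"]},
    ]]}
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "Cluster": {"Type": "AWS::ECS::Cluster"},
            # The task definition points at a container image stored in Amazon ECR.
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "RequiresCompatibilities": ["FARGATE"],
                    "NetworkMode": "awsvpc",
                    "Cpu": "256",
                    "Memory": "512",
                    "ContainerDefinitions": [
                        {"Name": service_name, "Image": image_uri}
                    ],
                },
            },
            # The service keeps the desired number of tasks running on Fargate.
            "Service": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "Cluster": {"Ref": "Cluster"},
                    "LaunchType": "FARGATE",
                    "DesiredCount": min_tasks,
                    "TaskDefinition": {"Ref": "TaskDefinition"},
                },
            },
            # Bounds within which the task count is allowed to move.
            "ScalableTarget": {
                "Type": "AWS::ApplicationAutoScaling::ScalableTarget",
                "Properties": {
                    "ServiceNamespace": "ecs",
                    "ScalableDimension": "ecs:service:DesiredCount",
                    "ResourceId": resource_id,
                    "MinCapacity": min_tasks,
                    "MaxCapacity": max_tasks,
                },
            },
            # Target tracking: add or remove tasks to hold average CPU near cpu_target.
            "ScalingPolicy": {
                "Type": "AWS::ApplicationAutoScaling::ScalingPolicy",
                "Properties": {
                    "PolicyName": f"{service_name}-cpu-target-tracking",
                    "PolicyType": "TargetTrackingScaling",
                    "ScalingTargetId": {"Ref": "ScalableTarget"},
                    "TargetTrackingScalingPolicyConfiguration": {
                        "TargetValue": cpu_target,
                        "PredefinedMetricSpecification": {
                            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
                        },
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical service name and image URI, for illustration only.
    template = fargate_service_template(
        "example-api",
        "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-api:latest",
    )
    print(json.dumps(template, indent=2))
```

Because the whole service is expressed as reviewable text, a small team can version it, reuse the same function for each microservice, and change scaling limits in one place, which is the kind of efficiency the paragraph above attributes to building with infrastructure-as-code.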
“We do with a team of four what would have taken ten or more with traditional on-premise infrastructure,” said Dr. Patrick Huck, Senior Computer Systems Engineer at the Materials Project. The Materials Project also uses the AWS Global Data Egress Waiver to reduce the cost of serving open dataset downloads.
By using the cloud, the Materials Project can operate its website with the backbone needed to meet modern expectations of near-zero downtime for global users. Because AWS takes care of the undifferentiated heavy lifting of cloud infrastructure through automation and managed services, the Materials Project can focus on delivering scientific data and apps to its community and significantly reduce the maintenance required to keep up with modern technology and infrastructure. The same team of four has been able to support the Materials Project’s continued exponential growth in users.
“AWS significantly reduces the effort needed to architect and run compute, storage, and database infrastructure for the engineers in Materials Project,” said Dr. Huck. This lets the scientific team focus on the science and create more materials data products, which makes the site and its products more valuable to a growing user community. The site has had virtually no downtime since moving to AWS, which increases user satisfaction.
Learn more about the Materials Project and its impact on science. Visit the Research and Technical Computing on AWS hub for more about how AWS helps researchers accelerate time to science. Dive deeper into AWS for research with on-demand seminars.
Read more stories about AWS for researchers:
- Introducing 10 minute cloud tutorials for research
- How researchers at UC Davis support the swine industry with data analytics on AWS
- Preventing the next pandemic: How researchers analyze millions of genomic datasets with AWS
- Solving medical mysteries in the AWS Cloud: Medical data-sharing innovation through the Undiagnosed Diseases Network
- Cloud powers faster, greener, and more collaborative research, according to new IDC report
- How to set up Galaxy for research on AWS using Amazon Lightsail