AWS Public Sector Blog

Tag: open data

Accelerating new materials design with open data on AWS

The Materials Project at Lawrence Berkeley National Laboratory (LBNL) is an open database of material properties, covering the elements and substances that make up the products we use every day. By harnessing the Department of Energy’s (DOE) high-performance scientific computing and state-of-the-art electronic structure methods, the Materials Project provides open web-based access on AWS to computational datasets on both known and potential materials, along with powerful analysis tools to help discover, inspire, and design new materials.

Downscaled CMIP5, 1950 US Census, and open genomics data for Galaxy: The latest open data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on Amazon Web Services (AWS). Our full list of publicly available datasets is on the Registry of Open Data on AWS. This quarter, we released 13 new or updated datasets, including CMIP5, the 1950 US Decennial Census, and open genomics data for Galaxy. Read on for some highlights.

Predicting global biodiversity patterns in Costa Rica with ecosystem modeling on AWS

As part of the Amazon Sustainability Data Initiative (ASDI), AWS invited Rafael Monge Vargas, director of the National Center of GeoEnvironmental Information (CENIGA) at Costa Rica’s Ministry of Environment and Energy (MINAE), to share how his team is helping advance conservation and economic development in Costa Rica and how they use ASDI and AWS to support these efforts.

From open data to machine learning, making 1950 Census data available with AWS

On April 1, the US National Archives and Records Administration (NARA) released the 1950 Census data to the general public. Census records are released 72 years after a census is conducted, and it has been 10 years since the previous release, that of the 1940 Census. With the support of cloud technologies, this release marks a number of important firsts. AWS is honored to support the release of the 1950 Census and help make this data available to the public.

Bringing world-class satellite imagery to smallholder farmers with open data

As part of the Amazon Sustainability Data Initiative (ASDI), AWS invited Nils Helset, co-founder and chief executive officer (CEO) of DigiFarm, to share how AWS Cloud technology and open data support DigiFarm’s efforts in precision farming to make agricultural practices more sustainable and efficient.

Preventing the next pandemic: How researchers analyze millions of genomic datasets with AWS

How do we avoid the next global pandemic? For researchers collaborating with the University of British Columbia Cloud Innovation Center (UBC CIC), the answer to that question lies in a massive library of genetic sequencing data. But there is a problem: the library is so large that traditional computing can’t comprehensively analyze or process it. So the UBC CIC team collaborated with computational virologists to create Serratus, an open-science viral discovery platform built on the computational power of the Amazon Web Services (AWS) Cloud to transform the field of genomics.

Street-scale global maps, orca sounds, and COVID-19 detection data: The latest open data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. We work with data providers to democratize access to data by making it available to the public for analysis on AWS; to develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and to encourage the development of communities that benefit from access to shared datasets. This quarter, we released 19 new or updated datasets like validated OpenStreetMap data, bioacoustic data, COVID-19 detection data, and more.

Analyze terabyte-scale geospatial datasets with Dask and Jupyter on AWS

Terabytes of Earth Observation (EO) data are collected each day, quickly leading to petabyte-scale datasets. By bringing these datasets to the cloud, users can use the compute and analytics resources of the cloud to reliably scale with growing needs. In this post, we show you how to set up a Pangeo solution with Kubernetes, Dask, and Jupyter notebooks step-by-step on Amazon Web Services (AWS), to automatically scale cloud compute resources and parallelize workloads across multiple Dask worker nodes.

AWS hosts new open dataset to help businesses identify climate finance risks and investments

Companies and asset managers looking to protect their financial investments from climate change-related risks, and invest in more sustainable solutions, can now access a new dataset on the Amazon Web Services (AWS) Cloud to help inform their decision making. Amazon announced that the Legal Entity Identifier (LEI) dataset is now available and free for anyone to access in the cloud. The dataset includes key reference information that supports clear and unique identification of legal entities participating in financial transactions, and each LEI contains information about an entity’s ownership structure, including ‘who is who’ and ‘who owns whom’.

Climate data, koala genomes, analysis ready radar data, and highly-queryable genomic data: The latest open data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. We work with data providers to democratize access to data by making it available to the public for analysis on AWS; to develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and to encourage the development of communities that benefit from access to shared datasets. Our full list of publicly available datasets is on the Registry of Open Data on AWS. This quarter, we released 26 new or updated datasets, including datasets on climate, koala genomes, analysis-ready radar data, and highly queryable genomic data. Check out some highlights.