AWS Public Sector Blog
Decrease geospatial query latency from minutes to seconds using Zarr on Amazon S3
Geospatial data, including many climate and weather datasets, are often released by government and nonprofit organizations in compressed file formats such as the Network Common Data Form (NetCDF) or GRIdded Binary (GRIB). As the complexity and size of geospatial datasets continue to grow, it is more time- and cost-efficient to leave the files in one place, virtually query the data, and download only the subset that is needed locally. Unlike legacy file formats, the cloud-native Zarr format is designed for virtual and efficient access to compressed chunks of data saved in a central location such as Amazon S3. In this walkthrough, learn how to convert NetCDF datasets to Zarr using an Amazon SageMaker notebook and an AWS Fargate cluster and query the resulting Zarr store, reducing the time required for time series queries from minutes to seconds.
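To see why chunk layout matters for time series queries, here is a minimal back-of-the-envelope sketch. It is an illustration only, not code from the walkthrough: the array shape, both chunk shapes, and the function name `elements_read_for_point_series` are hypothetical. It estimates how many array elements must be fetched and decompressed to read the full time series at one grid point.

```python
import math

def elements_read_for_point_series(shape, chunks):
    """Estimate elements fetched to read every time step at one grid point.

    shape  -- (time, lat, lon) size of the full array (hypothetical values)
    chunks -- (time, lat, lon) size of one stored chunk (hypothetical values)

    A single grid point falls in exactly one chunk along lat and lon, so the
    number of chunks touched depends only on how the time axis is split.
    Each touched chunk has to be read and decompressed in full.
    """
    n_time = shape[0]
    chunk_time = chunks[0]
    chunks_touched = math.ceil(n_time / chunk_time)
    return chunks_touched, chunks_touched * math.prod(chunks)

# One year of hourly data on a 0.25-degree global grid (assumed shape).
shape = (8760, 721, 1440)

# Layout A: one full spatial grid per time step, as when each time step
# is stored as its own compressed block or file.
per_step = elements_read_for_point_series(shape, (1, 721, 1440))

# Layout B: chunks that are long in time and small in space, an assumed
# choice that favors time series reads.
time_major = elements_read_for_point_series(shape, (1024, 16, 16))

print("per-time-step layout:", per_step)
print("time-major layout:   ", time_major)
```

Under these assumed shapes, the time-major layout touches 9 chunks instead of 8,760 and reads several thousand times fewer elements. The right chunk shape in practice depends on the dataset and on the mix of queries it has to serve.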
33 new or updated datasets on the Registry of Open Data for Earth Day and more
The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. Through this program, customers are making over 100PB of high-value, cloud-optimized data available for public use. As April 22 is Earth Day, the AWS Open Data team wanted to highlight some new datasets from our geospatial and environmental communities of practice, as well as the other new or updated datasets available now on the Registry of Open Data on AWS and also discoverable on AWS Data Exchange.
Querying the Daylight OpenStreetMap Distribution with Amazon Athena
In 2020, Meta introduced the Daylight Map Distribution, which combines OpenStreetMap (OSM) data with quality and consistency checks from Daylight mapping partners to create a no-cost, stable, and simple-to-use global map. This blog post provides a brief overview of OSM and Daylight followed by a step-by-step tutorial using five real-world examples. We combine the powerful query capabilities of Amazon Athena with the feature-rich Daylight OSM data to demonstrate a typical OSM data analysis workflow.
Creating satellite communications data analytics pipelines with AWS serverless technologies
Satellite communications (satcom) networks typically offer a rich set of performance metrics, such as signal-to-noise ratio (SNR) and bandwidth delivered by remote terminals on land, sea, or air. Customers can use performance metrics to detect network and terminal anomalies and identify trends to impact business outcomes. This walkthrough presents an approach using serverless resources from AWS to build satcom control plane analytics pipelines. The presented architecture transforms the data to extract key performance indicators (KPIs) of interest, renders them in business intelligence tools, and applies machine learning (ML) to flag unexpected SNR deviations.
Orbital Sidekick uses AWS to monitor energy pipelines and reduce risks and emissions
Orbital Sidekick (OSK) uses advanced satellite technology and data analytics to help the energy industry protect pipelines and make them less vulnerable to risks such as leaks, contamination, and damage caused by construction and natural disasters. OSK uses compute and analytics services from AWS to power the scalable OSK data pipeline and imagery storage solution in order to persistently monitor tens of thousands of miles of pipeline energy infrastructure and deliver real-time, actionable insights to customers.
Supporting health equity with data insights and visualizations using AWS
In this guest post, Ajay K. Gupta, co-founder and chief executive officer (CEO) of HSR.health, explains how healthcare technology (HealthTech) nonprofit HSR.health uses geospatial artificial intelligence and AWS to develop solutions that support improvements in healthcare and health equity around the world.
34 new or updated datasets on the Registry of Open Data: New data for land use, Alzheimer’s Disease, and more
The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. This quarter, AWS released 34 new or updated datasets from Impact Observatory, The Allen Institute for Brain Science, Common Screens, and others, which are available now on the Registry of Open Data in the following categories.
22 new or updated open datasets on AWS: New polar satellite data, blockchain data, and more
The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. The full list of publicly available datasets is on the Registry of Open Data on AWS, and the datasets are now also discoverable on AWS Data Exchange. This quarter, AWS released 22 new or updated datasets including Amazonia-1 imagery, Bitcoin and Ethereum data, and elevation data over the Arctic and Antarctica. Check out some highlights.
How to partition your geospatial data lake for analysis with Amazon Redshift
Data lakes are becoming increasingly common in many different workloads, and geospatial is no exception. In 2021, Amazon Web Services (AWS) announced geography and geohash support on Amazon Redshift, so geospatial analysts have the capability to quickly and efficiently query geohashed vector data in Amazon Simple Storage Service (Amazon S3). In this blog post, I walk through how to use geohashing with Amazon Redshift partitioning for quick and efficient geospatial data access, analysis, and transformation in your data lake.
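As a rough illustration of the geohashing idea, the sketch below encodes a latitude/longitude pair with the standard geohash algorithm and uses a short prefix as a partition key. It is not code from the post: the function names and the `geohash=<prefix>/` key layout are assumptions made for this example, and the post itself may partition differently.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

def geohash_encode(lat, lon, precision=6):
    """Encode a point as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0         # bits accumulated for the current character
    bit_count = 0
    use_lon = True    # bits alternate between longitude and latitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)

def partition_prefix(lat, lon, precision=3):
    """Hypothetical partition key: nearby points share a short geohash prefix."""
    return f"geohash={geohash_encode(lat, lon, precision)}/"

print(geohash_encode(57.64911, 10.40744, precision=5))
print(partition_prefix(57.64911, 10.40744))
```

Because a shorter geohash is always a prefix of a longer one for the same point, the prefix length controls how coarse each partition is: fewer characters mean larger cells and fewer partitions.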
OpenFold, OpenAlex catalog of scholarly publications, and Capella Space satellite data: The latest open data on AWS
The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. Our full list of publicly available datasets is on the Registry of Open Data on AWS, and the datasets are now also discoverable on AWS Data Exchange. This quarter, we released 15 new or updated datasets including OpenFold, OpenAlex, and radar data from Capella Space. Check out some highlights from the new or updated datasets.