AWS Public Sector Blog

Tag: Jupyter

Decrease geospatial query latency from minutes to seconds using Zarr on Amazon S3


Geospatial data, including many climate and weather datasets, are often released by government and nonprofit organizations in compressed file formats such as the Network Common Data Form (NetCDF) or GRIdded Binary (GRIB). As geospatial datasets continue to grow in size and complexity, it becomes more time- and cost-efficient to leave the files in one place, query the data remotely, and download only the subset that is needed locally. Unlike legacy file formats, the cloud-native Zarr format is designed for efficient remote access to compressed chunks of data stored in a central location such as Amazon S3. In this walkthrough, learn how to convert NetCDF datasets to Zarr using an Amazon SageMaker notebook and an AWS Fargate cluster, then query the resulting Zarr store, reducing the time required for time series queries from minutes to seconds.
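To give a rough sense of why chunk layout matters for time series queries, here is a minimal sketch that counts how many chunks a single-point time series query would touch under two layouts. The grid dimensions, chunk shapes, and the `chunks_touched` helper are all hypothetical and chosen only for illustration; they are not taken from the walkthrough, and real latency also depends on compression, request overhead, and network throughput.

```python
from math import ceil, prod


def chunks_touched(query_shape, chunk_shape):
    """Count the chunks overlapped by a query box that starts at the array origin."""
    return prod(ceil(q / c) for q, c in zip(query_shape, chunk_shape))


# Hypothetical dataset: one year of hourly values on a 721 x 1440 lat/lon grid.
n_time, n_lat, n_lon = 8760, 721, 1440

# Query: the full time series at a single grid point.
query = (n_time, 1, 1)

# Layout A (assumed): one chunk per time step covering the whole grid,
# similar to reading one slice per file in a file-per-timestep archive.
per_timestep = (1, n_lat, n_lon)

# Layout B (assumed): chunks that are long in time and small in space.
time_optimized = (730, 10, 10)

for name, chunk_shape in [("per_timestep", per_timestep),
                          ("time_optimized", time_optimized)]:
    n_chunks = chunks_touched(query, chunk_shape)
    values_read = n_chunks * prod(chunk_shape)
    print(f"{name}: {n_chunks} chunks, {values_read} values read")
```

Under these assumptions the per-time-step layout touches 8760 chunks while the time-oriented layout touches 12, which is the kind of difference that rechunking during a NetCDF-to-Zarr conversion is meant to exploit.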

Predicting diabetic patient readmission using multi-model training on Amazon SageMaker Pipelines

Diabetes is a major chronic disease that often results in hospital readmissions due to multiple factors. An estimated $25 billion is spent on preventable hospital readmissions that result from medical errors and complications, poor discharge procedures, and lack of integrated follow-up care. If hospitals can predict which diabetic patients are likely to be readmitted, medical practitioners can provide additional, personalized care to pre-empt the readmission, potentially saving cost, time, and lives. In this blog post, learn how to use machine learning (ML) from AWS to create a solution that predicts hospital readmission, in this case of diabetic patients, based on multiple data inputs.