AWS Architecture Blog
Optimize your modern data architecture for sustainability: Part 2 – unified data governance, data movement, and purpose-built analytics
In the first part of this blog series, Optimize your modern data architecture for sustainability: Part 1 – data ingestion and data lake, we focused on two pillars of the modern data architecture: 1) data ingestion and 2) data lake. In this blog post, we provide guidance and best practices to optimize the components […]
How to select a Region for your workload based on sustainability goals
The Amazon Web Services (AWS) Cloud is a constantly expanding network of Regions and points of presence (PoP), with a global network infrastructure linking them together. The choice of Regions for your workload significantly affects your workload KPIs, including performance, cost, and carbon footprint. The Well-Architected Framework’s sustainability pillar offers design principles and best practices […]
Optimize your modern data architecture for sustainability: Part 1 – data ingestion and data lake
The modern data architecture on AWS focuses on integrating a data lake and purpose-built data services to efficiently build analytics workloads, which provide speed and agility at scale. Using the right service for the right purpose not only provides performance gains, but also facilitates the right utilization of resources. Review Modern Data Analytics Reference Architecture on […]
Field Notes: Develop Data Pre-processing Scripts Using Amazon SageMaker Studio and an AWS Glue Development Endpoint
This post was co-written with Marcus Rosen, a Principal – Machine Learning Operations with Rio Tinto, a global mining company. Data pre-processing is an important step in setting up Machine Learning (ML) projects for success. Many AWS customers use Apache Spark on AWS Glue or Amazon EMR to run data pre-processing scripts while using Amazon SageMaker […]

