AWS Architecture Blog

Category: AWS Glue

Reusable ETL framework architecture

Use a reusable ETL framework in your AWS lake house architecture

Data lakes and lake house architectures have become an integral part of a data platform for any organization. However, you may face multiple challenges while developing a lake house platform and integrating with various source systems. In this blog, we will address these challenges and show how our framework can help mitigate these issues. Lake […]

Serverless data archiving and retrieval

Reduce archive cost with serverless data archiving

For regulatory reasons, decommissioning core business systems in financial services and insurance (FSI) markets requires data to remain accessible years after the application is retired. Traditionally, FSI companies either outsourced data archiving to third-party service providers, which maintained application replicas, or purchased vendor software to query and visualize archival data. In this blog post, we […]

Architecture for AWS Clean Rooms Scope 3 collaboration

Managing data confidentiality for Scope 3 emissions using AWS Clean Rooms

Scope 3 emissions are indirect greenhouse gas emissions that are a result of a company’s activities, but occur outside the company’s direct control or ownership. Measuring these emissions requires collecting data from a wide range of external sources, like raw material suppliers, transportation providers, and other third parties. One of the main challenges with Scope […]

Data lake architecture with OpenSearch

Text analytics on AWS: implementing a data lake architecture with OpenSearch

Text data is a common type of unstructured data found in analytics. It is often stored without a predefined format and can be hard to obtain and process. For example, web pages contain text data that data analysts collect through web scraping and pre-process using lowercasing, stemming, and lemmatization. After pre-processing, the cleaned text is […]

Well-Architected logo

Optimize your modern data architecture for sustainability: Part 2 – unified data governance, data movement, and purpose-built analytics

In the first part of this blog series, Optimize your modern data architecture for sustainability: Part 1 – data ingestion and data lake, we focused on the 1) data ingestion, and 2) data lake pillars of the modern data architecture. In this blog post, we will provide guidance and best practices to optimize the components […]

Well-Architected logo

Optimize your modern data architecture for sustainability: Part 1 – data ingestion and data lake

The modern data architecture on AWS focuses on integrating a data lake and purpose-built data services to efficiently build analytics workloads, which provide speed and agility at scale. Using the right service for the right purpose not only provides performance gains, but facilitates the right utilization of resources. Review Modern Data Analytics Reference Architecture on […]

Let's Architect

Let’s Architect! Modern data architectures

With the rapid growth in data coming from data platforms and applications, and the continuous improvements in state-of-the-art machine learning algorithms, data are becoming key assets for companies. Modern data architectures include data mesh—a recent style that represents a paradigm shift, in which data is treated as a product and data architectures are designed around […]

DWBI workload with multiple tools

Data warehouse and business intelligence technology consolidation using AWS

Organizations have been using data warehouse and business intelligence (DWBI) workloads to support business decision making for many years. These workloads are brought to the Amazon Web Services (AWS) platform to utilize the benefit of AWS cloud. However, these workloads are built using multiple vendor tools and technologies, and the customer faces the burden of […]

Zendesk data pipelines

Insights for CTOs: Part 3 – Growing your business with modern data capabilities

This post was co-wrtiten with Jonathan Hwang, head of Foundation Data Analytics at Zendesk. In my role as a Senior Solutions Architect, I have spoken to chief technology officers (CTOs) and executive leadership of large enterprises like big banks, software as a service (SaaS) businesses, mid-sized enterprises, and startups. In this 6-part series, I share […]