AWS Architecture Blog

Category: Amazon EMR

Zendesk data pipelines

Insights for CTOs: Part 3 – Growing your business with modern data capabilities

This post was co-written with Jonathan Hwang, head of Foundation Data Analytics at Zendesk. In my role as a Senior Solutions Architect, I have spoken to chief technology officers (CTOs) and executive leadership of large enterprises like big banks, software as a service (SaaS) businesses, mid-sized enterprises, and startups. In this 6-part series, I share […]

Figure 1. Audit Surveillance data lake architecture diagram

How Parametric Built Audit Surveillance using AWS Data Lake Architecture

Parametric Portfolio Associates (Parametric), a wholly owned subsidiary of Morgan Stanley, is a registered investment adviser. Parametric provides investment advisory services to individual and institutional investors around the world. Parametric manages over 100,000 client portfolios with assets under management exceeding $400B (as of 9/30/21). As a registered investment adviser, Parametric is subject to numerous regulatory […]

Figure 2. Building Lake House architectures with AWS Glue

How to Accelerate Building a Lake House Architecture with AWS Glue

Customers are building databases, data warehouses, and data lake solutions in isolation from each other, each having its own separate data ingestion, storage, management, and governance layers. Often these disjointed efforts to build separate data stores end up creating data silos, data integration complexities, excessive data movement, and data consistency issues. These issues are preventing […]

reference architecture - build automated scene detection pipeline - Autonomous Driving

Field Notes: Building an automated scene detection pipeline for Autonomous Driving – ADAS Workflow

This Field Notes blog post from 2020 explains how to build an Autonomous Driving Data Lake using this Reference Architecture. Many organizations face the challenge of ingesting, transforming, labeling, and cataloging massive amounts of data to develop automated driving systems. In this re:Invent session, we explored an architecture to solve this problem using Amazon EMR, Amazon […]

Figure 2. AI Factory high-level architecture

ERGO Breaks New Frontiers for Insurance with AI Factory on AWS

This post is co-authored with Piotr Klesta, Robert Meisner, and Lukasz Luszczynski of ERGO. Artificial intelligence (AI) and related technologies are already finding applications in our homes, cars, industries, and offices. The insurance business is no exception to this. When AI is implemented correctly, it adds a major competitive advantage. It enhances the decision-making process, […]

Figure 2. Lake House architecture on AWS

Architecting Persona-centric Data Platform with On-premises Data Sources

Many organizations are moving their data from silos and aggregating it in one location. Collecting this data in a data lake enables you to perform analytics and machine learning on that data. You can store your data in purpose-built data stores, like a data warehouse, to get quick results for complex queries on structured data. […]

EMR solution diagram

Field Notes: Launch Amazon EMR with a Static Private IP in a Private Subnet

Organizations across every industry and sector are looking to easily and cost-effectively process vast amounts of data. Amazon EMR offers a way to instantly provision as much or as little capacity as needed to perform data-intensive tasks. When launching Amazon EMR, the IPs of the primary (master) and core node are automatically assigned at […]

Figure 3. Replay Architecture

Amazon MSK Backup for Archival, Replay, or Analytics

Amazon MSK is a fully managed service that helps you build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes. You can also stream changes to […]

Young man at his laptop

AWS Architecture Monthly Magazine: Education

One of the missions of the education industry is to educate the next generation of the industry-ready workforce. Whether K-12, higher education, or continuing education, enabling teachers and professors to effectively deliver curriculum and improve student performance is a goal of Education Technology (EdTech) and learning companies. Two trends for AWS use cases in education […]