AWS Big Data Blog

Category: AWS Database Migration Service

Apply record level changes from relational databases to Amazon S3 data lake using Apache Hudi on Amazon EMR and AWS Database Migration Service

Data lakes give organizations the ability to harness data from multiple sources in less time. Users across different roles are now empowered to collaborate and analyze data in different ways, leading to better, faster decision-making. Amazon Simple Storage Service (Amazon S3) is the highly performant object storage service for structured and unstructured data and the […]

Stream CDC into an Amazon S3 data lake in Parquet format with AWS DMS

Most organizations generate data in real time and in ever-increasing volumes. Data is captured from a variety of sources, such as transactional and reporting databases, application logs, customer-facing websites, and external feeds. Companies want to capture, transform, and analyze this time-sensitive data to improve customer experiences, increase efficiency, and drive innovations. With increased data volume and […]

Load ongoing data lake changes with AWS DMS and AWS Glue

Building a data lake on Amazon S3 provides an organization with countless benefits. It allows you to access diverse data sources, determine unique relationships, build AI/ML models to provide customized customer experiences, and accelerate the curation of new datasets for consumption. However, capturing and loading continuously changing updates from operational data stores—whether on-premises or on […]

Our data lake story: How Woot.com built a serverless data lake on AWS

In this post, we talk about designing a cloud-native data warehouse as a replacement for our legacy data warehouse built on a relational database. At the beginning of the design process, the simplest solution appeared to be a straightforward lift-and-shift migration from one relational database to another. However, we decided to step back and focus […]
