AWS Big Data Blog

Category: Kinesis Data Streams

Unified serverless streaming ETL architecture with Amazon Kinesis Data Analytics

Businesses across the world are seeing a massive influx of data at an enormous pace through multiple channels. With the advent of cloud computing, many companies are realizing the benefits of getting their data into the cloud to gain meaningful insights and save costs on data processing and storage. As businesses embark on their journey […]

Streaming data from Amazon S3 to Amazon Kinesis Data Streams using AWS DMS

Stream processing is useful where we need to detect a problem quickly and improve the outcome based on data, for example in production line monitoring or supply chain optimization. This blog post walks you through the process of streaming existing data files and ongoing changes from Amazon Simple Storage Service (Amazon S3) to […]

Stream CDC into an Amazon S3 data lake in Parquet format with AWS DMS

Most organizations generate data in real time and in ever-increasing volumes. Data is captured from a variety of sources, such as transactional and reporting databases, application logs, customer-facing websites, and external feeds. Companies want to capture, transform, and analyze this time-sensitive data to improve customer experiences, increase efficiency, and drive innovation. With increased data volume and […]

Stream, transform, and analyze XML data in real time with Amazon Kinesis, AWS Lambda, and Amazon Redshift

Enterprise data warehousing systems receive data in various formats, such as XML, JSON, or CSV. Most third-party system integrations happen through SOAP or REST web services, where the input and output data format is either XML or JSON. When applications deal with CSV or JSON, it becomes fairly simple to […]

Enhancing customer safety by leveraging the scalable, secure, and cost-optimized Toyota Connected Data Lake

Toyota Motor Corporation (TMC), a global automotive manufacturer, has made “connected cars” a core priority as part of its broader transformation from an auto company to a mobility company. In recent years, TMC and its affiliate technology and big data company, Toyota Connected, have developed an array of new technologies to provide connected services that […]

How Wind Mobility built a serverless data architecture

We parse through millions of scooter and user events generated daily (over 300 events per second) to extract actionable insight. We selected AWS Glue to perform this task. Our primary ETL job reads the newly added raw event data from Amazon S3, processes it using Apache Spark, and writes the results to our Amazon Redshift data warehouse. AWS Glue plays a critical role in our ability to scale on demand. After careful evaluation and testing, we concluded that AWS Glue ETL jobs meet all our needs and free us from procuring and managing infrastructure.

Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics

Most businesses generate data continuously in real time and at ever-increasing volumes. Data is generated as users play mobile games, load balancers log requests, customers shop on your website, and IoT sensors register temperature changes. You can capitalize on time-sensitive events, improve customer experiences, increase efficiency, and drive innovation by analyzing this data quickly. The […]

Under the hood: Scaling your Kinesis data streams

Real-time delivery of data and insights enables businesses to pivot quickly in response to changes in demand, user engagement, and infrastructure events, among many others. Amazon Kinesis offers a managed service that lets you focus on building your applications rather than managing infrastructure. Scalability is provided out of the box, allowing you to ingest and process gigabytes of […]
