AWS Big Data Blog
Top analytics announcements of AWS re:Invent 2022
Missed AWS re:Invent 2022? We’ve got you covered! AWS offers the most scalable, highest-performing data services to keep up with the growing volume and velocity of data and to help organizations be data-driven in real time. We help customers unify diverse data sources by investing in a zero-ETL future. We provide the industry’s most […]
Monitor AWS workloads without a single line of code with Logz.io and Kinesis Firehose
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Observability data provides near real-time insights into the health and performance of AWS workloads, so that engineers can quickly address production issues and troubleshoot them before widespread customer impact. As AWS workloads […]
Introducing native Delta Lake table support with AWS Glue crawlers
June 2023: This post was reviewed and updated for accuracy. Delta Lake is an open-source project that helps implement modern data lake architectures commonly built on Amazon S3 or other cloud storage services. With Delta Lake, you can achieve ACID transactions, time travel queries, CDC, and other common use cases on the cloud. Delta Lake is […]
Getting started with AWS Glue Data Quality for ETL Pipelines
June 2023: This post was reviewed and updated with the latest release from AWS Glue Data Catalog. Today, hundreds of thousands of customers use data lakes for analytics and machine learning. However, data engineers have to cleanse and prepare this data before it can be used. The underlying data has to be accurate and recent […]
AWS Marketplace Seller Insights team uses Amazon QuickSight Embedded to empower sellers with actionable business insights
AWS Marketplace enables independent software vendors (ISVs), data providers, and consulting partners to sell software, services, and data to millions of AWS customers. Working in partnership with the AWS Partner Network (APN), AWS Marketplace helps ISVs and partners build, market and sell their AWS offerings by providing crucial business, technical, and marketing support. The AWS […]
Amazon EMR Serverless cost estimator
Amazon EMR Serverless is a serverless option in Amazon EMR that makes it easy for data analysts and engineers to run applications using open-source big data analytics frameworks such as Apache Spark and Hive without configuring, managing, and scaling clusters or servers. You get all the features of the latest open-source frameworks with the performance-optimized […]
Analyze real-time streaming data in Amazon MSK with Amazon Athena
Recent advances in ease of use and scalability have made streaming data easier to generate and use for real-time decision-making. Coupled with market forces that have pushed businesses to react more quickly to industry changes, more and more organizations today are turning to streaming data to fuel innovation and agility. Amazon Managed Streaming for Apache […]
LaunchDarkly’s journey from ingesting 1 TB to 100 TB per day with Amazon Kinesis Data Streams
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. This post was co-written with Mike Zorn, Software Architect at LaunchDarkly as the lead author. LaunchDarkly’s feature management platform enables customers to release features and measure their impact. As part of this […]
Migrate Google BigQuery to Amazon Redshift using AWS Schema Conversion Tool (SCT)
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that provides the flexibility to use provisioned or serverless compute for your analytical workloads. Using Amazon Redshift Serverless and Query Editor v2, you can load and query large datasets in just a few clicks and pay only for what you use. The decoupled compute and […]
Create, Train and Deploy Multi Layer Perceptron (MLP) models using Amazon Redshift ML
Amazon Redshift is a fully managed, petabyte-scale cloud data warehouse used by tens of thousands of customers to process exabytes of data every day to power their analytics workloads. Amazon Redshift comes with a feature called Amazon Redshift ML, which puts the power of machine learning in the hands of every […]