AWS Big Data Blog
Build a DataOps platform to break silos between engineers and analysts
Organizations across the globe are striving to provide a better service to internal and external stakeholders by enabling various divisions across the enterprise, like customer success, marketing, and finance, to make data-driven decisions. Data teams are the key enablers in this process, and usually consist of multiple roles, such as data engineers and analysts. However, […]
Build a data lake using Amazon Kinesis Data Streams for Amazon DynamoDB and Apache Hudi
Amazon DynamoDB helps you capture high-velocity data such as clickstream data to form customized user profiles and online order transaction data to develop customer order fulfillment applications, improve customer satisfaction, and get insights into sales revenue to create a promotional offer for the customer. It’s essential to store these data points in a centralized data […]
Amazon EMR 2020 year in review
Tens of thousands of customers use Amazon EMR to run big data analytics applications on Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto at scale. Amazon EMR automates the provisioning and scaling of these frameworks, and delivers high performance at low cost with optimized runtimes and support for a wide range […]
Effective data lakes using AWS Lake Formation, Part 1: Getting started with governed tables
February 2023: The content of this blog post can now be found in the AWS Lake Formation public documentation. Please refer to it instead. Thousands of customers are building their data lakes on Amazon Simple Storage Service (Amazon S3). You can use AWS Lake Formation to build your data lakes easily, in a matter of days […]
Run usage analytics on Amazon QuickSight using AWS CloudTrail
Amazon QuickSight is a cloud-native BI service that allows end users to create and publish dashboards in minutes, without provisioning any servers or requiring complex licensing. You can view these dashboards on the QuickSight product console or embed them into applications and websites. After you deploy a dashboard, it’s important to assess how they and […]
Retaining data streams up to one year with Amazon Kinesis Data Streams
Streaming data is used extensively for use cases like sharing data between applications, streaming ETL (extract, transform, and load), real-time analytics, processing data from internet of things (IoT) devices, application monitoring, fraud detection, live leaderboards, and more. Typically, data streams are stored for short durations of time before being loaded into a permanent data store […]
Create a custom data connector to Slack’s Member Analytics API in Amazon QuickSight with Amazon Athena Federated Query
Amazon QuickSight recently added support for Amazon Athena Federated Query, which allows you to query data in place from various data sources. With this capability, QuickSight can extend support to query additional data sources like Amazon CloudWatch Logs, Amazon DynamoDB, and Amazon DocumentDB (with MongoDB compatibility) via their existing Amazon Athena data source. You […]
Building an administrative console in Amazon QuickSight to analyze usage metrics
November 2022: Please visit our blog on Admin console for the latest updates. Given the scalability of Amazon QuickSight to hundreds and thousands of users, a common use case is to monitor QuickSight group and user activities, analyze the utilization of dashboards, and identify usage patterns of an individual user and dashboard. With timely access to […]
Integrating Datadog data with AWS using Amazon AppFlow for intelligent monitoring
Infrastructure and operation teams are often challenged with getting a full view into their IT environments to do monitoring and troubleshooting. New monitoring technologies are needed to provide an integrated view of all components of an IT infrastructure and application system. Datadog provides intelligent application and service monitoring by bringing together data from servers, databases, […]
Getting started with Trace Analytics in Amazon OpenSearch Service
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. Updated May 11, 2021. See the release notes below for more details. Trace Analytics is now available for Amazon OpenSearch Service domains running versions 7.9 or later. Developers and IT Ops teams can use this feature to troubleshoot performance and […]