AWS Big Data Blog
With the release of Kinesis Data Firehose HTTP endpoint delivery, you can now stream your data through Amazon Kinesis or directly push data to Kinesis Data Firehose and configure it to deliver data to MongoDB Atlas. You can also configure Kinesis Data Firehose to transform the data before delivering it to its destination. You don’t have to write applications and manage resources to read data and push to MongoDB. It’s all managed by AWS, making it easier to estimate costs for your data based on your data volume. In this post, we discuss how to integrate Kinesis Data Firehose and MongoDB Cloud and demonstrate how to stream data from your source to MongoDB Atlas.
QuickSight folders give admins and authors a powerful way to organize, manage, and share content, and give readers a convenient way to discover it. Folders are now generally available in QuickSight Enterprise Edition in all supported QuickSight Regions.
This post shows how to implement Vega visualizations in Kibana, which is included with Amazon Elasticsearch Service (Amazon ES), using a real-world clickstream data sample. Vega visualizations are an integrated scripting mechanism in Kibana that performs on-the-fly computations on raw data to generate D3.js visualizations. For this post, we use a fully automated AWS CloudFormation setup to show how to build a customized histogram for a web analytics use case. This example implements an ad hoc, map-reduce-like aggregation of the underlying data for a histogram.
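To make the idea of a map-reduce-like histogram aggregation concrete, here is a minimal Python sketch. It is not the Vega specification from the post; the function name `histogram` and the fixed `bucket_width` parameter are assumptions chosen for illustration.

```python
from collections import Counter


def histogram(values, bucket_width):
    """Count values per fixed-width bucket (hypothetical helper).

    Map step: assign each value to the lower bound of its bucket.
    Reduce step: count how many values landed in each bucket.
    """
    buckets = Counter((v // bucket_width) * bucket_width for v in values)
    # Sort by bucket lower bound so the result reads like a histogram axis.
    return dict(sorted(buckets.items()))


# Example: session durations in seconds, grouped into 5-second buckets.
print(histogram([1, 4, 5, 9, 12], 5))
```

Each key in the result is the lower bound of a bucket and each value is the number of data points that fell into it.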
This post demonstrates how customers, system integrator (SI) partners, and developers can use the serverless streaming ETL capabilities of AWS Glue with Amazon Managed Streaming for Apache Kafka (Amazon MSK) to stream data to a data warehouse such as Amazon Redshift. We also show you how to view Twitter streaming data in Amazon QuickSight via Amazon Redshift.
This post discusses installing and configuring Prometheus and Grafana on an Amazon Elastic Compute Cloud (Amazon EC2) instance, configuring an EMR cluster to emit metrics that Prometheus can scrape from the cluster, and using the Grafana dashboards to analyze the metrics for a workload on the EMR cluster and optimize it. We also cover how Prometheus can push alerts to the Alertmanager and how to configure Amazon SNS to send email notifications.
This post discusses how Vortexa harnesses the power of Apache Kafka to improve real-time data accuracy and accelerate time-to-market by using a combination of Lenses.io for greater observability and Amazon Managed Streaming for Apache Kafka (Amazon MSK) to create clusters on demand.
Most organizations have to comply with regulations when dealing with their customer data. For that reason, datasets that contain personally identifiable information (PII) are often anonymized. Common examples of PII are tables and columns that contain personal information about an individual (such as first name and last name), or tables with columns that, if joined with another table, can be traced back to an individual. You can use AWS Analytics services to anonymize your datasets. In this post, I describe how to use Amazon Athena to anonymize a dataset. You can then use AWS Lake Formation to provide the right access to the right personas.
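As a rough illustration of what anonymizing PII columns can look like, the following Python sketch replaces selected fields with a salted SHA-256 digest. The post itself uses Amazon Athena; this is not that method, only the general idea. The column names in `PII_COLUMNS`, the `anonymize_row` helper, and the salt value are all hypothetical.

```python
import hashlib

# Hypothetical set of columns treated as PII for this example.
PII_COLUMNS = {"first_name", "last_name"}


def anonymize_row(row, salt):
    """Return a copy of row with PII columns replaced by salted SHA-256 digests."""
    anonymized = {}
    for column, value in row.items():
        if column in PII_COLUMNS:
            payload = (salt + str(value)).encode("utf-8")
            anonymized[column] = hashlib.sha256(payload).hexdigest()
        else:
            # Non-PII columns pass through unchanged.
            anonymized[column] = value
    return anonymized


row = {"first_name": "Ana", "last_name": "Silva", "order_total": 10}
print(anonymize_row(row, salt="example-salt"))
```

Because the digest is deterministic for a given salt, the same input value always maps to the same output, so anonymized tables can still be joined on the hashed column. The salt should be kept secret, since low-entropy values such as names are otherwise easy to guess by hashing candidates.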
This is a guest post by Sara Miller, Head of Data Management and Data Lake, Direct Energy; and Zhouyi Liu, Senior AWS Developer, Direct Energy. Enterprise companies like Direct Energy migrate on-premises data warehouses and services to AWS to achieve fully manageable digital transformation of their organization. Freedom from traditional data warehouse constraints frees up […]
Amazon QuickSight recently introduced four new features—embedded authoring, namespaces for multi-tenancy, custom user permissions, and account-level customizations—that, with existing dashboard embedding and API capabilities available in the Enterprise Edition, allow you to integrate advanced dashboarding and analytics capabilities within SaaS applications. Developers and independent software vendors (ISVs) who build these applications can now offer embedded, […]
New Relic can now ingest data directly from Amazon Kinesis Data Firehose, expanding the insights New Relic can give you into your cloud stacks so you can deliver more perfect software. Kinesis Data Firehose is a fully managed service for delivering real-time streaming data to AWS services like Amazon Simple Storage Service (Amazon S3), Amazon […]