AWS Big Data Blog
Category: Analytics
How Isentia improves customer experience by modernizing their real-time media monitoring and intelligence platform with Amazon Kinesis Data Analytics for Apache Flink
August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. This is a blog post co-written by Karl Platz at Isentia. In their own words, “Isentia is the leading media monitoring, intelligence and insights solution provider in […]
Build seamless data streaming pipelines with Amazon Kinesis Data Streams and Amazon Data Firehose for Amazon DynamoDB tables
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. The global wearables market […]
Migrate data into Amazon ES using remote reindex
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. Amazon OpenSearch Service recently launched support for remote reindexing. This feature adds the ability to copy data to an Amazon OpenSearch Service domain from self-managed Elasticsearch running on-premises, self-managed on Amazon Elastic Compute Cloud (Amazon EC2) on AWS, or another […]
Enable private access to Amazon Redshift from your client applications in another VPC
November 2023: This post was reviewed and updated to include configurations and options for Amazon Redshift Serverless. You can now use an Amazon Redshift-managed VPC endpoint (powered by AWS PrivateLink) to connect to your private Amazon Redshift cluster with the RA3-instance type or Amazon Redshift Serverless within your virtual private cloud (VPC). With an Amazon […]
Extract multidimensional data from Microsoft SQL Server Analysis Services using AWS Glue
AWS Glue is a fully managed service that makes it easier for you to extract, transform, and load (ETL) data for analytics. You can easily create ETL jobs to connect to backend data sources. There are several natively supported data sources, but what if you need to extract data from an unsupported data source? What if […]
Migrate terabytes of data quickly from Google Cloud to Amazon S3 with AWS Glue Connector for Google BigQuery
This blog post was last updated in July 2022 to cover the new version of the connector and to add details on how to push down queries to Google BigQuery. The cloud is often seen as advantageous for data lakes because of better security, faster time to deployment, better availability, more frequent feature and functionality updates, more elasticity, […]
Doing data preparation using on-premises PostgreSQL databases with AWS Glue DataBrew
Today, with AWS Glue DataBrew, data analysts and data scientists can easily access and visually explore any amount of data across their organization directly from their Amazon Simple Storage Service (Amazon S3) data lake, Amazon Redshift data warehouse, and Amazon Aurora and Amazon Relational Database Service (Amazon RDS) databases. Customers can choose from over 250 […]
Orchestrate an Amazon EMR on Amazon EKS Spark job with AWS Step Functions
At re:Invent 2020, we announced the general availability of Amazon EMR on Amazon EKS, a new deployment option for Amazon EMR that allows you to automate the provisioning and management of open-source big data frameworks on Amazon Elastic Kubernetes Service (Amazon EKS). With Amazon EMR on EKS, you can now run Spark applications alongside other […]
Build a real-time streaming application using Apache Flink Python API with Amazon Kinesis Data Analytics
August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Amazon Kinesis Data Analytics is now expanding its Apache Flink offering by adding support for Python. This is exciting news for many of our customers who use […]
Introducing Auto-Tune in Amazon ES
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. Today we announced Auto-Tune in Amazon OpenSearch Service, an innovation undertaken to automatically optimize resources in Elasticsearch clusters to improve their performance and availability. Auto-Tune gives us a unique opportunity to apply our learnings from operating clusters at cloud scale […]