AWS Big Data Blog

Introducing ACK controller for Amazon EMR on EKS

AWS Controllers for Kubernetes (ACK) was announced in August, 2020, and now supports 14 AWS service controllers as generally available with an additional 12 in preview. The vision behind this initiative was simple: allow Kubernetes users to use the Kubernetes API to manage the lifecycle of AWS resources such as Amazon Simple Storage Service (Amazon […]

Announcing AWS Glue crawler support for Snowflake

For data lake customers who need to discover petabytes of data, AWS Glue crawlers are a popular way to scan data in the background, so you can focus on using the data to make better intelligent decisions. You may also have data in data warehouses such as Snowflake and want the ability to discover the […]

How ENGIE automates the deployment of Amazon Athena data sources on Microsoft Power BI

ENGIE—one of the largest utility providers in France and a global player in the zero-carbon energy transition—produces, transports, and deals in electricity, gas, and energy services. With 160,000 employees worldwide, ENGIE is a decentralized organization and operates 25 business units with a high level of delegation and empowerment. ENGIE’s decentralized global customer base had accumulated […]

Announcing the latest version of the AWS Well-Architected Data Analytics Lens

We are delighted to announce the latest version of the Data Analytics Lens, an AWS Well-Architected whitepaper. AWS Well-Architected provides a consistent approach to evaluate architectures and implement scalable designs. The AWS Well-Architected Framework is based on six pillars—operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability. With the framework, cloud architects, system architects, engineers, and developers can build secure, high-performance, resilient, and efficient infrastructure for their applications and workloads.

Share and publish your Snowflake data to AWS Data Exchange using Amazon Redshift data sharing

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. Today, tens of thousands of AWS customers—from Fortune 500 companies, startups, and everything in between—use Amazon Redshift to run mission-critical business intelligence (BI) dashboards, […]

Use Karpenter to speed up Amazon EMR on EKS autoscaling

Amazon EMR on Amazon EKS is a deployment option for Amazon EMR that allows organizations to run Apache Spark on Amazon Elastic Kubernetes Service (Amazon EKS). With EMR on EKS, the Spark jobs run on the Amazon EMR runtime for Apache Spark. This increases the performance of your Spark jobs so that they run faster […]

Fulfillment by Amazon uses Amazon QuickSight Embedded to deliver key reporting insights to Amazon Marketplace sellers

Fulfillment by Amazon (FBA) was launched in 2006, allowing businesses to outsource shipping to Amazon. With this fulfillment option, Amazon stores, picks, packs, ships, and delivers the products to customer as well as handling the customer service and returns for those orders. Within Seller Central, a website where sellers can monitor their Amazon sales activity, […]

What’s new with AWS Glue at AWS re:Invent 2022

AWS re:Invent is a learning conference hosted by AWS for the global cloud computing community. This year’s re:Invent will be held in Las Vegas, Nevada, from November 28 to December 2. AWS Glue is a serverless data integration service that makes it easier for analytics users to discover, prepare, move, and integrate data from multiple […]

Use an event-driven architecture to build a data mesh on AWS

In this post, we take the data mesh design discussed in Design a data mesh architecture using AWS Lake Formation and AWS Glue, and demonstrate how to initialize data domain accounts to enable managed sharing; we also go through how we can use an event-driven approach to automate processes between the central governance account and […]