AWS Big Data Blog
Category: Analytics
Introducing ACK controller for Amazon EMR on EKS
AWS Controllers for Kubernetes (ACK) was announced in August 2020 and now supports 14 AWS service controllers as generally available, with an additional 12 in preview. The vision behind this initiative was simple: allow Kubernetes users to use the Kubernetes API to manage the lifecycle of AWS resources such as Amazon Simple Storage Service (Amazon […]
Announcing AWS Glue crawler support for Snowflake
For data lake customers who need to discover petabytes of data, AWS Glue crawlers are a popular way to scan data in the background, so you can focus on using the data to make more intelligent decisions. You may also have data in data warehouses such as Snowflake and want the ability to discover the […]
How ENGIE automates the deployment of Amazon Athena data sources on Microsoft Power BI
ENGIE—one of the largest utility providers in France and a global player in the zero-carbon energy transition—produces, transports, and deals in electricity, gas, and energy services. With 160,000 employees worldwide, ENGIE is a decentralized organization and operates 25 business units with a high level of delegation and empowerment. ENGIE’s decentralized global customer base had accumulated […]
Share and publish your Snowflake data to AWS Data Exchange using Amazon Redshift data sharing
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. Today, tens of thousands of AWS customers—from Fortune 500 companies, startups, and everything in between—use Amazon Redshift to run mission-critical business intelligence (BI) dashboards, […]
Use Karpenter to speed up Amazon EMR on EKS autoscaling
Amazon EMR on Amazon EKS is a deployment option for Amazon EMR that allows organizations to run Apache Spark on Amazon Elastic Kubernetes Service (Amazon EKS). With EMR on EKS, the Spark jobs run on the Amazon EMR runtime for Apache Spark. This increases the performance of your Spark jobs so that they run faster […]
Fulfillment by Amazon uses Amazon QuickSight Embedded to deliver key reporting insights to Amazon Marketplace sellers
Fulfillment by Amazon (FBA) was launched in 2006, allowing businesses to outsource shipping to Amazon. With this fulfillment option, Amazon stores, picks, packs, ships, and delivers the products to customers, and also handles the customer service and returns for those orders. Within Seller Central, a website where sellers can monitor their Amazon sales activity, […]
Your guide to streaming data & real-time analytics at re:Invent 2022
August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Mark your calendars for November 28 through December 2, 2022, to attend AWS re:Invent in Las Vegas – a learning conference hosted by AWS for the global […]
Use an event-driven architecture to build a data mesh on AWS
In this post, we take the data mesh design discussed in Design a data mesh architecture using AWS Lake Formation and AWS Glue, and demonstrate how to initialize data domain accounts to enable managed sharing; we also go through how we can use an event-driven approach to automate processes between the central governance account and […]
Build an optimized self-service interactive analytics platform with Amazon EMR Studio
Data engineers and data scientists depend on distributed data processing infrastructure like Amazon EMR to perform data processing and advanced analytics jobs on large volumes of data. In most mid-size and enterprise organizations, cloud operations teams own procuring, provisioning, and maintaining the IT infrastructure, and their objectives and best practices differ from the data […]
How Hudl built a cost-optimized AWS Glue pipeline with Apache Hudi datasets
This is a guest blog post co-written with Addison Higley and Ramzi Yassine from Hudl. Hudl Agile Sports Technologies, Inc. is a Lincoln, Nebraska-based company that provides tools for coaches and athletes to review game footage and improve individual and team play. Its initial product line served college and professional American football teams. Today, […]