AWS Big Data Blog

Add your own libraries and application dependencies to Spark and Hive on Amazon EMR Serverless with custom images

Amazon EMR Serverless allows you to run open-source big data frameworks such as Apache Spark and Apache Hive without managing clusters and servers. Many customers who run Spark and Hive applications want to add their own libraries and dependencies to the application runtime. For example, you may want to add popular open-source extensions to Spark, […]

Run a popular benchmark on Amazon Redshift Serverless easily with AWS Data Exchange

Amazon Redshift is a fast, easy, secure, and economical cloud data warehousing service designed for analytics. AWS announced the general availability of Amazon Redshift Serverless in July 2022, making Amazon Redshift easier to operate. Amazon Redshift Serverless makes it simple to run and scale analytics without having to manage your data warehouse infrastructure. Amazon Redshift […]

Code conversion from Greenplum to Amazon Redshift: Handling arrays, dates, and regular expressions

Amazon Redshift is a fully managed service for data lakes, data analytics, and data warehouses for startups, medium enterprises, and large enterprises. Amazon Redshift is used by tens of thousands of businesses around the globe for modernizing their data analytics platform. Greenplum is an open-source, massively parallel database used for analytics, mostly for on-premises infrastructure. […]

Build a search application with Amazon OpenSearch Serverless

In this post, we demonstrate how to build a simple web-based search application using the recently announced Amazon OpenSearch Serverless, a serverless option for Amazon OpenSearch Service that makes it easy to run petabyte-scale search and analytics workloads without having to think about clusters. The benefit of using OpenSearch Serverless as a backend for your […]

Green Flag uses Amazon QuickSight to democratize data and enable self-serve insights to all employees

This is a guest post by Jeremy Bristow, Head of Product at Green Flag. In the US, there’s a saying: “Sooner or later, you’ll break down and call Triple A.” In the UK, that same saying might be “Sooner or later, you’ll break down and call Green Flag.” Green Flag has been assisting stranded motorists […]

Accelerate your data exploration and experimentation with the AWS Analytics Reference Architecture library

Organizations use their data to solve complex problems by starting small, running iterative experiments, and refining the solution. Although the power of experiments can’t be ignored, organizations have to be cautious about the cost-effectiveness of such experiments. If time is spent creating the underlying infrastructure for enabling experiments, it further adds to the cost. Developers […]

Near-real-time fraud detection using Amazon Redshift Streaming Ingestion with Amazon Kinesis Data Streams and Amazon Redshift ML

The importance of data warehouses and analytics performed on data warehouse platforms has been increasing steadily over the years, with many businesses coming to rely on these systems as mission-critical for both short-term operational decision-making and long-term strategic planning. Traditionally, data warehouses are refreshed in batch cycles, for example, monthly, weekly, or daily, so that […]

How Novo Nordisk built a modern data architecture on AWS

Novo Nordisk is a leading global pharmaceutical company, responsible for producing life-saving medicines that reach more than 34 million patients each day. They do this following their triple bottom line—that they must strive to be environmentally sustainable, socially sustainable, and financially sustainable. The combination of using AWS and data supports all these targets. Data is […]

Convoy uses Amazon QuickSight to help shippers and carriers improve efficiency and save money with data-driven decisions

Convoy is the leading digital freight network in the United States. We move millions of truckloads around the country through our connected network of carriers, saving money for shippers, increasing earnings for drivers, and eliminating carbon waste for our planet. In 2015, Convoy started a movement toward efficient freight. We build technology to find smarter […]

Amazon Identity Services uses Amazon QuickSight to empower partners with self-serve data discovery

Amazon Identity Services is responsible for the way Amazon customers—buyers, sellers, developers—identify themselves on Amazon. Our team also manages customers’ core account information, such as names and delivery addresses. Our mission is to deliver the most intuitive, convenient, and secure authentication experience. We’re in charge of account security for Amazon, worldwide, on all device surfaces. […]