AWS Big Data Blog

Analyze daily trading activity using transaction data from Amazon Redshift in Amazon FinSpace

Financial services organizations use data from various sources to discover new insights and improve trading decisions. Finding the right dataset and getting access to the data can frequently be a time-consuming process. For example, to analyze daily trading activity, analysts need to find a list of available databases and tables, identify its owner’s contact information, […]

Register now for Flink Forward Global, October 26-27, 2021

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Flink Forward Global 2021 is a 2-day virtual conference for the Apache Flink and stream processing communities. Apache Flink is an open-source distributed engine for processing data […]

Embed multi-tenant dashboards in SaaS apps using Amazon QuickSight without provisioning or managing users

Amazon QuickSight is a fully-managed, cloud-native business intelligence (BI) service that makes it easy to connect to your data, create interactive dashboards, and share these with tens of thousands of users, either within QuickSight itself, or embedded in software as a service (SaaS) apps. QuickSight Enterprise Edition recently added row-level security (RLS) using tags, a […]

Federated authentication to Amazon Redshift using AWS Single Sign-On

July 2023: This post was reviewed for accuracy. Managing database users through identity federation allows you to manage authentication and authorization procedures centrally. Amazon Redshift, a fast, fully managed cloud data warehouse, provides browser-based plugins for JDBC/ODBC drivers, which helps you easily implement identity federation capabilities added with multi-factor authentication (MFA) to secure your data […]

Implement anti-money laundering solutions on AWS

The detection and prevention of financial crime continues to be an important priority for banks. Over the past 10 years, the level of activity in financial crimes compliance in financial services has expanded significantly, with regulators around the globe taking scores of enforcement actions and levying $36 billion in fines. Apart from the fines, the […]

How to attribute Amazon Redshift costs to your end-users

Amazon Redshift is a fast, fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing business intelligence (BI) tools. In this two-part series, we discuss how to attribute Amazon Redshift costs at the user and group level. When using Amazon Redshift, you can […]

Amazon Data Firehose now supports dynamic partitioning to Amazon S3

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Amazon Data Firehose provides a convenient way to reliably load streaming data into data lakes, data stores, and analytics services. It can capture, transform, and deliver streaming data to Amazon Simple Storage […]

How MOIA built a fully automated GDPR compliant data lake using AWS Lake Formation, AWS Glue, and AWS CodePipeline

This is a guest blog post co-written by Leonardo Pêpe, a Data Engineer at MOIA. MOIA is an independent company of the Volkswagen Group with locations in Berlin and Hamburg, and operates its own ride pooling services in Hamburg and Hanover. The company was founded in 2016 and develops mobility services independently or in partnership […]

Scope of Solution

Centralize feature engineering with AWS Step Functions and AWS Glue DataBrew

One of the key phases of a machine learning (ML) workflow is data preprocessing, which involves cleaning, exploring, and transforming the data. AWS Glue DataBrew, announced in AWS re:Invent 2020, is a visual data preparation tool that enables you to develop common data preparation steps without having to write any code or installation. In this […]

Get started with the Amazon Redshift Data API

June 2023: This post was reviewed and updated for accuracy. The GitHub repository mentioned in this post is now updated with examples for serverless. Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that enables you to analyze your data at scale. Tens of thousands of customers use Amazon Redshift to […]