AWS Big Data Blog

Featured Stateful Architecture

Doing more with less: Moving from transactional to stateful batch processing

Amazon processes hundreds of millions of financial transactions each day, including accounts receivable, accounts payable, royalties, amortizations, and remittances, from over a hundred different business entities. All of this data is sent to the eCommerce Financial Integration (eCFI) systems, where they are recorded in the subledger. Ensuring complete financial reconciliation at this scale is critical […]

Use AnalyticsIQ with Amazon QuickSight to gain insights for your business

Decisions are made every day in your organization that impact your business. Making the right decision at the right moment can deeply impact your organization’s growth and your customers. Likewise, having the right data and tools that generate insights into the data can empower your organization’s leaders to make the right decisions. In the healthcare […]

Automate building data lakes using AWS Service Catalog

Today, organizations spend a considerable amount of time understanding business processes, profiling data, and analyzing data from a variety of sources. The result is highly structured and organized data used primarily for reporting purposes. These traditional systems extract data from transactional systems that consist of metrics and attributes that describe different aspects of the business. […]

Build a REST API to enable data consumption from Amazon Redshift

API (Application Programming Interface) is a design pattern used to expose a platform or application to another party. APIs enable programs and applications to communicate with platforms and services, and can be designed to use REST (REpresentational State Transfer) as a software architecture style. APIs in OLTP (online transaction processing) are called frequently (tens to […]

How GE Aviation automated engine wash analytics with AWS Glue using a serverless architecture

This post is authored by Giridhar G Jorapur, GE Aviation Digital Technology. Maintenance and overhauling of aircraft engines are essential for GE Aviation to increase time on wing gains and reduce shop visit costs. Engine wash analytics provide visibility into the significant time on wing gains that can be achieved through effective water wash, foam […]

How ENGIE scales their data ingestion pipelines using Amazon MWAA

ENGIE—one of the largest utility providers in France and a global player in the zero-carbon energy transition—produces, transports, and deals electricity, gas, and energy services. With 160,000 employees worldwide, ENGIE is a decentralized organization and operates 25 business units with a high level of delegation and empowerment. ENGIE’s decentralized global customer base had accumulated lots […]

Build a modern data architecture on AWS with Amazon AppFlow, AWS Lake Formation, and Amazon Redshift: Part 2

In Part 1 of this post, we provided a solution to build the sourcing, orchestration, and transformation of data from multiple source systems, including Salesforce, SAP, and Oracle, into a managed modern data platform. Roche partnered with AWS Professional Services to build out this fully automated and scalable platform to provide the foundation for their […]

Best practices to optimize your Amazon Redshift and MicroStrategy deployment

This is a guest blog post co-written by Amit Nayak at Microstrategy. In their own words, “MicroStrategy is the largest independent publicly traded business intelligence (BI) company, with the leading enterprise analytics platform. Our vision is to enable Intelligence Everywhere. MicroStrategy provides modern analytics on an open, comprehensive enterprise platform used by many of the […]

Add comparative and cumulative date/time calculations in Amazon QuickSight

Amazon QuickSight recently added native support for comparative (e.g., year-over-year) and cumulative (e.g., year-to-date) period functions which allow you to easily introduce these calculations in business reporting, trend analysis and time series analysis. This allows authors in QuickSight to implement advanced calculations without having to use complicated date offsets in calculations to achieve such datetime-aware […]

Validate streaming data over Amazon MSK using schemas in cross-account AWS Glue Schema Registry

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Today’s businesses face an unprecedented growth in the volume of data. A growing portion of the data is generated in real time by IoT devices, websites, business […]