AWS Big Data Blog

Category: *Post Types

How Vanguard made their technology platform resilient and efficient by building cross-Region replication for Amazon Kinesis Data Streams

This is a guest post co-written with Raghu Boppanna from Vanguard. At Vanguard, the Enterprise Advice line of business improves investor outcomes through digital access to superior, personalized, and affordable financial advice. They made it possible, in part, by driving economies of scale across the globe for investors with a highly resilient and efficient technical […]

Control access to Amazon OpenSearch Service Dashboards with attribute-based role mappings

Federated users of Amazon OpenSearch Service often need access to OpenSearch Dashboards with roles based on their user profiles. OpenSearch Service fine-grained access control maps authenticated users to OpenSearch roles and then evaluates permissions to determine how to handle the user’s actions. However, when an enterprise-wide identity provider (IdP) manages the users, the mapping […]

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

This post is co-written with Olivia Michele and Dariswan Janweri P. at Ruparupa. Ruparupa was built by PT. Omni Digitama Internusa with the vision to cultivate synergy and create a seamless digital ecosystem within Kawan Lama Group that touches and enhances the lives of many. Ruparupa is the first digital platform built by Kawan Lama […]

Build a data storytelling application with Amazon Redshift Serverless and Toucan

This post was co-written with Django Bouchez, Solution Engineer at Toucan. Business intelligence (BI) with dashboards, reports, and analytics remains one of the most popular use cases for data and analytics. It provides business analysts and managers with a visualization of the business’s past and current state, helping leaders make strategic decisions that dictate the […]

How SafeGraph built a reliable, efficient, and user-friendly Apache Spark platform with Amazon EMR on Amazon EKS

This is a guest post by Nan Zhu, Tech Lead Manager, SafeGraph, and Dave Thibault, Sr. Solutions Architect – AWS. SafeGraph is a geospatial data company that curates over 41 million global points of interest (POIs) with detailed attributes, such as brand affiliation, advanced category tagging, and open hours, as well as how people interact […]

How Strategic Blue uses Amazon QuickSight and AWS Cost and Usage Reports to help their customers save millions

This is a guest post co-written with Frank Contrepois from Strategic Blue. For over 10 years, Strategic Blue has helped organizations unlock the most value from the cloud by enabling their customers to purchase non-standard commitments. By taking a commodity trading approach to purchasing from AWS, Strategic Blue helps customers purchase commitments for varying lengths […]

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

Organizations have chosen to build data lakes on top of Amazon Simple Storage Service (Amazon S3) for many years. A data lake is the most popular choice for organizations to store all their organizational data generated by different teams, across business domains, in all different formats, and even over history. According to a study, the […]

Monitor Apache HBase on Amazon EMR using Amazon Managed Service for Prometheus and Amazon Managed Grafana

Amazon EMR provides a managed Apache Hadoop framework that makes it straightforward, fast, and cost-effective to run Apache HBase. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is an open-source, non-relational, versioned database that runs on top of the Apache Hadoop Distributed File System (HDFS). It’s built […]

Chargeback Gurus empowers eCommerce merchants with advanced chargeback intelligence to recover millions using Amazon QuickSight

This is a guest post by Suresh Dakshina and Damodharan Sampathkumar from Chargeback Gurus. Chargeback Gurus, a global financial technology company, helps businesses fight, prevent, and win chargebacks. To date, we have helped businesses worldwide recover over $2 billion in lost revenue. As trusted advisors to card networks and Fortune 500 companies, we are known […]

How OLX Group migrated to Amazon Redshift RA3 for simpler, faster, and more cost-effective analytics

This is a guest post by Miguel Chin, Data Engineering Manager at OLX Group, and David Greenshtein, Specialist Solutions Architect for Analytics, AWS. OLX Group is one of the world’s fastest-growing networks of online marketplaces, operating in over 30 countries around the world. We help people buy and sell cars, find housing, get jobs, buy […]