AWS Big Data Blog
Ingesting Jira data into Amazon S3
Consolidating data from a work management tool like Jira and integrating it with other data sources like ServiceNow, GitHub, Jenkins, and time entry systems enables end-to-end visibility into different aspects of the software development lifecycle and helps keep your projects on schedule and within budget. Amazon Simple Storage Service (Amazon S3) is an object […]
Transform data and create dashboards simply using AWS Glue DataBrew and Amazon QuickSight
Before you can create visuals and dashboards that convey useful information, you need to transform and prepare the underlying data. The range and complexity of data transformation steps required depends on the visuals you would like in your dashboard. Often, the data transformation process is time-consuming and highly iterative, especially when you are working with […]
Amazon EMR now provides up to 30% lower cost and up to 15% improved performance for Spark workloads on Graviton2-based instances
Amazon EMR now supports M6g, C6g and R6g instances with Amazon EMR versions 6.1.0, 5.31.0 and later. These instances are powered by AWS Graviton2 processors that are custom designed by AWS using 64-bit Arm Neoverse cores to deliver the best price performance for cloud workloads running in Amazon Elastic Compute Cloud (Amazon EC2). On Graviton2 […]
Building an ad-to-order conversion engine with Amazon Kinesis, AWS Glue, and Amazon QuickSight
August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Businesses in ecommerce have the challenge of measuring their ad-to-order conversion ratio for ads or promotional campaigns displayed on a webpage. Tracking the number of users that […]
Preparing data for ML models using AWS Glue DataBrew in a Jupyter notebook
AWS Glue DataBrew is a new visual data preparation tool that makes it easy for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML). In this post, we examine a sample ML use case and show how to use DataBrew and a Jupyter notebook to […]
Enabling self-service data publication to your data lake using AWS Glue DataBrew
Data lakes give organizations a level of flexibility unparalleled by anything before them. The ability to load and query data in place, and in its natural form, has led to an explosion of data lake deployments that have allowed organizations to advance their data strategy faster than ever before. Most organizations have […]
Building a scalable streaming data processor with Amazon Kinesis Data Streams on AWS Fargate
Data is ubiquitous in businesses today, and the volume and speed of incoming data are constantly increasing. To derive insights from data, it’s essential to deliver it to a data lake or a data store and analyze it. Real-time or near-real-time data delivery can be cost prohibitive, so an efficient architecture is key for processing, […]
Controlling data lake access across multiple AWS accounts using AWS Lake Formation
When deploying data lakes on AWS, you can use multiple AWS accounts to better separate different projects or lines of business. In this post, we see how the AWS Lake Formation cross-account capabilities simplify securing and managing distributed data lakes across multiple accounts through a centralized approach, providing fine-grained access control to the AWS Glue […]
Amazon QuickSight: 2020 in review
As 2020 draws to a close, we’ve put together this post to walk you through all that’s changed in Amazon QuickSight this year. For your reading convenience, this post is broken up into the following sections: Embedded Analytics at scale; Faster insights with Q & ML; Business Intelligence (BI) with QuickSight; Build Rich, Interactive Dashboards […]
Data monetization and customer experience optimization using telco data assets: Part 1
The landscape of the telecommunications industry is changing rapidly. For telecom service providers (TSPs), revenue from core voice and data services continues to shrink due to regulatory pressure and emerging OTT players that offer an attractive alternative. Despite increasing demand from customers for bandwidth, speed, and efficiency, TSPs are finding that ROI from implementing new […]