AWS Storage Blog

Tag: AWS Glue

AWS DataSync Featured Image 2020

Derive insights from AWS DataSync task reports using AWS Glue, Amazon Athena, and Amazon QuickSight

Update (10/30/2024): On October 30, 2024, AWS DataSync launched Enhanced mode tasks, prompting updates to this blog. Updates include a new step in the “Step 2: Populate Glue catalog with task reports data using a Glue crawler” section and detailed information on the new capabilities in “Updated steps for working with task reports of new […]

Amazon S3 featured image 2023

Use generative AI to query your Amazon S3 data lake for insights

Businesses store large volumes of data in their data lakes and rely on this data to extract insights and make important business decisions. However, business stakeholders sometimes lack the technical skills required to run complex queries against their data lakes. Instead, they rely on data scientists or analysts to build reports and dashboards or to […]

Amazon S3 featured image 2023

Siemens builds Datalake2Go on AWS to analyze disparate data globally

Siemens is a technology company focused on industry, infrastructure, transport, and healthcare. From resource-efficient factories, resilient supply chains, and smart buildings and grids, to cleaner and more comfortable transportation and advanced healthcare, the company creates technology with purpose, adding real value for its customers. Siemens technology is everywhere, supporting the critical infrastructure and vital industries […]

AWS DataSync Featured Image 2020

Migrate on-premises data to AWS for insightful visualizations

When migrating data from on premises, customers seek a data store that is scalable, durable, and cost effective. Equally as important, BI must support modern, interactive, and fast dashboards that can scale to tens of thousands of users seamlessly while providing the ability to create meaningful data visualizations for analysis. Visualization of on-premises business analytics […]

AWS DataSync Featured Image 2020

How TMAP Mobility transferred 2.4 PB of Hadoop data using AWS DataSync

Launched in 2002, TMAP Mobility is Korea’s leading mobility platform, with 20 million registered users and 14 million monthly active users. TMAP provides navigation services based on a wide range of real-time traffic information and data. Previously, the Data Intelligence group at TMAP Mobility operated a mobility-data platform based on a Hadoop Distributed File System […]

Visualizing usage of Provisioned IOPS volumes on Amazon EBS for analysis

Organizations are always looking to right-size cloud infrastructure and optimize to cost. Historically, one of the areas where it has been difficult to right-size at scale are Provisioned IOPS volumes on Amazon EBS, as optimization usually required third-party tools. The recently announced AWS Compute Optimizer assists in solving that problem, as it helps customers optimize compute resources […]

Amazon S3

Query Amazon S3 Analytics data with Amazon Athena

I recently had a customer explain that they were aware of the benefits of various Amazon S3 storage classes, like S3 Standard, S3 Infrequent-Access, and S3 One-Zone Infrequent-Access, but they were not sure which tiers and lifecycle rules to apply to optimize their storage. This customer, and others like them, have multiple buckets and various […]