AWS Architecture Blog
Category: AWS Glue
How to Accelerate Building a Lake House Architecture with AWS Glue
Customers are building databases, data warehouses, and data lake solutions in isolation from each other, each having its own separate data ingestion, storage, management, and governance layers. Often these disjointed efforts to build separate data stores end up creating data silos, data integration complexities, excessive data movement, and data consistency issues. These issues are preventing […]
Benefits of Modernizing On-premises Analytics with an AWS Lake House
Organizational analytics systems have shifted from running in the background of IT systems to being critical to an organization’s health. Analytics systems help businesses make better decisions, but they tend to be complex and are often not agile enough to scale quickly. To help with this, customers upgrade their traditional on-premises online analytical processing (OLAP) […]
Improving Retail Forecast Accuracy with Machine Learning
The global retail market continues to grow larger and the influx of consumer data increases daily. The rise in volume, variety, and velocity of data poses challenges with demand forecasting and inventory planning. Outdated systems generate inaccurate demand forecasts. This results in multiple challenges for retailers. They are faced with over-stocking and lost sales, and […]
Building a Showback Dashboard for Cost Visibility with Serverless Architectures
Enterprises with centralized IT organizations and multiple lines of business frequently use showback or chargeback mechanisms to hold their departments accountable for their technology usage and costs. Chargeback involves actually billing a department for the cost of its usage. Showback focuses on visibility to make the department more cost-conscious and encourage operational efficiency. […]
Architecting Persona-centric Data Platform with On-premises Data Sources
Many organizations are moving their data from silos and aggregating it in one location. Collecting this data in a data lake enables you to perform analytics and machine learning on that data. You can store your data in purpose-built data stores, like a data warehouse, to get quick results for complex queries on structured data. […]
Using AppStream 2.0 to Deliver PACS and Image Analysis in Clinical Trials
Hospitals and clinical trial sites manage sensitive patient data. They are often required to grant remote access to custom Windows-based applications for patient record review and medical image analysis. This typically requires providing physicians and staff with remote access to on-premises workstations over VPN, with some flavor of remote desktop software. This can be both […]
Field Notes: Develop Data Pre-processing Scripts Using Amazon SageMaker Studio and an AWS Glue Development Endpoint
This post was co-written with Marcus Rosen, a Principal – Machine Learning Operations with Rio Tinto, a global mining company. Data pre-processing is an important step in setting up Machine Learning (ML) projects for success. Many AWS customers use Apache Spark on AWS Glue or Amazon EMR to run data pre-processing scripts while using Amazon SageMaker […]
Building a Cloud-based OLAP Cube and ETL Architecture with AWS Managed Services
For decades, enterprises used online analytical processing (OLAP) workloads to answer complex questions about their business by filtering and aggregating their data. These complex queries were compute- and memory-intensive. This required teams to build and maintain complex extract, transform, and load (ETL) pipelines to model and organize data, often with commercial-grade analytics tools. In this […]
Designing a Successful Pilot Phase for Your Cloud Migration
Pilot phases, or pilots, as we will call them from now on, should be conducted to test and find the positive and negative aspects of a particular use case, design pattern, or application migration approach. They allow you to validate the foundation of your architecture (for example, with a landing zone governed by AWS Control […]
NLX is Helping Travelers Amid Disruption with AI-Powered Automation
This post was co-written by Andrei Papancea and Vlad Papancea of NLX and Sekhar Mallipeddi. Travel impacts brought by the global pandemic left several airlines experiencing frequent flight disruptions, which increased the number of flight schedule change notifications sent to affected travelers. Every month, tens of thousands of passengers and related flight crew have to be contacted […]