AWS Big Data Blog
Access your existing data and resources through Amazon SageMaker Unified Studio, Part 1: AWS Glue Data Catalog and Amazon Redshift
This series of posts demonstrates how you can onboard and access existing AWS data sources using SageMaker Unified Studio. This post focuses on onboarding existing AWS Glue Data Catalog tables and database tables available in Amazon Redshift.
Connect, share, and query where your data sits using Amazon SageMaker Unified Studio
In this blog post, we will demonstrate how business units can use Amazon SageMaker Unified Studio to discover, subscribe to, and analyze these distributed data assets. Through this unified query capability, you can create comprehensive insights into customer transaction patterns and purchase behavior for active products without the traditional barriers of data silos or the need to copy data between systems.
Foundational blocks of Amazon SageMaker Unified Studio: An admin’s guide to implement unified access to all your data, analytics, and AI
In this post, we discuss the foundational building blocks of SageMaker Unified Studio and how, by abstracting complex technical implementations behind user-friendly interfaces, organizations can maintain standardized governance while enabling efficient resource management across business units. This approach provides consistency in infrastructure deployment while providing the flexibility needed for diverse business requirements.
Seamless integration of data lake and data warehouse using Amazon Redshift Spectrum and Amazon DataZone
Unlocking the true value of data often gets impeded by siloed information. Traditional data management—wherein each business unit ingests raw data in separate data lakes or warehouses—hinders visibility and cross-functional analysis. A data mesh framework empowers business units with data ownership and facilitates seamless sharing. However, integrating datasets from different business units can present several […]
Implement data quality checks on Amazon Redshift data assets and integrate with Amazon DataZone
In this post, we show how to capture the data quality metrics for data assets produced in Amazon Redshift. With Amazon DataZone, the data owner can directly import the technical metadata of a Redshift database table and views to the Amazon DataZone project’s inventory. As these data assets gets imported into Amazon DataZone, it bypasses the AWS Glue Data Catalog, creating a gap in data quality integration. This post proposes a solution to enrich the Amazon Redshift data asset with data quality scores and KPI metrics.




