This Guidance demonstrates how you can extend the data governance capabilities of Amazon DataZone to other Java Database Connectivity (JDBC) sources, such as MySQL, PostgreSQL, Oracle, and SQL Server. Extending governance to JDBC data sources, including self-managed databases and third-party offerings, gives you a unified way to govern all of your data assets. The Guidance is set up as an add-on for Amazon DataZone with the AWS Cloud Development Kit (AWS CDK), so you can deploy it automatically and customize it to fit your needs. You can then discover and collaborate on data in these databases, regardless of where the data assets are hosted.
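As an illustration of what connecting to the four engines named above involves, the following minimal sketch builds the standard JDBC connection URL for each one. The function name `build_jdbc_url`, the host names, and the database names are hypothetical and are not taken from the Guidance code; only the URL shapes and default ports are the conventional ones for each driver.

```python
# Conventional JDBC URL templates and default ports for the engines
# mentioned in this Guidance. For Oracle, "database" is the service name.
_JDBC_TEMPLATES = {
    "mysql": ("jdbc:mysql://{host}:{port}/{database}", 3306),
    "postgresql": ("jdbc:postgresql://{host}:{port}/{database}", 5432),
    "oracle": ("jdbc:oracle:thin:@//{host}:{port}/{database}", 1521),
    "sqlserver": ("jdbc:sqlserver://{host}:{port};databaseName={database}", 1433),
}


def build_jdbc_url(engine, host, database, port=None):
    """Return a JDBC URL for a supported engine (hypothetical helper)."""
    try:
        template, default_port = _JDBC_TEMPLATES[engine.lower()]
    except KeyError:
        raise ValueError(f"unsupported engine: {engine!r}") from None
    # Fall back to the engine's default port when none is given.
    return template.format(host=host, port=port or default_port, database=database)


if __name__ == "__main__":
    for name in _JDBC_TEMPLATES:
        print(build_jdbc_url(name, "db.example.internal", "sales"))
```

A lookup table keeps the engine-specific differences (notably the SQL Server `databaseName` property and the Oracle thin-driver prefix) in one place, so supporting another engine would mean adding one entry.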

Please note: [Disclaimer]

Architecture Diagram

[Architecture diagram description]

Download the architecture diagram PDF 

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • AWS Cloud Development Kit (AWS CDK), Service Catalog, Lambda, Step Functions, Amazon CloudWatch, and DynamoDB are services that work in tandem to support your operational excellence. First, AWS CDK automates and simplifies the configuration of this Guidance at scale, allowing it to be deployed from within any continuous integration and continuous delivery (CI/CD) tooling that you use. Second, Service Catalog automates and simplifies the deployment of user-targeted tools so that you can deploy these tools in a way that supports your tasks, with the assurance that all deployed resources are aligned with your governance standards. Third, Lambda and Step Functions are serverless, meaning no infrastructure needs to be managed, thereby reducing your operational complexity. Fourth, DynamoDB is used as a storage layer to track all outputs for each component of this Guidance, providing governance teams visibility to support management activities.

    Read the Operational Excellence whitepaper 
  • AWS Identity and Access Management (IAM), Secrets Manager, and AWS Key Management Service (AWS KMS) are services that protect both your information and systems. To start, all inter-service communications use IAM roles, and the multi-account option uses IAM roles with cross-account access. All roles follow the principle of least privilege: they contain only the minimum permissions required for the service to function properly. Some resources also include tag-based policies that prevent a project from accessing resources it is not authorized to use. In addition, Secrets Manager manages the credentials for data sources that are created through the components of this Guidance, storing them as secrets with highly restrictive access. Finally, AWS KMS provides customer managed keys for encrypting the secrets in Secrets Manager.

    Read the Security whitepaper 
  • Step Functions, Lambda, EventBridge, and DynamoDB are serverless AWS services, meaning that they ensure high availability at a Region level by default. These services also offer recovery from service failure aligned to service-specific service level agreements (SLAs) to help your workloads perform their intended functions correctly and consistently.

    Read the Reliability whitepaper 
  • When configuring this Guidance, Lambda functions are deployed as close as possible to the data source for improved performance. Additionally, execution logic inside every Lambda function is designed to eliminate redundant operations and to reuse previously created resources, like secrets, when applicable. Lambda supports the core functionality when connecting to data sources for this Guidance, as it is optimized to be lightweight and high performing.

    Read the Performance Efficiency whitepaper 
  • Step Functions, Lambda, EventBridge, DynamoDB, Secrets Manager, and AWS KMS are all serverless AWS services, so you are only charged for what you use. With AWS Glue, you pay only for the time that your extract, transform, and load (ETL) takes to run. There are no resources to manage or upfront costs, nor are you charged for startup or shutdown time.  

    Read the Cost Optimization whitepaper 
  • Because Step Functions, Lambda, EventBridge, DynamoDB, Secrets Manager, and AWS KMS are serverless AWS services, they scale up or down as needed, minimizing the environmental impact of the backend services. For example, EventBridge is a serverless event bus that provides near real-time access to changes in data in AWS services, your own applications, or software as a service (SaaS) applications. With this visibility, you can gain a better understanding of the environmental impacts of the services you are using, quantify those impacts through the entire workload lifecycle, and then apply appropriate design principles to reduce those impacts.

    Read the Sustainability whitepaper 
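The security practices described above (least-privilege roles, tag-based restrictions across projects, and secrets encrypted with a customer managed key) can be sketched as an IAM policy document. This is an illustrative sketch only, not the policy shipped with the Guidance: the function name, the `Project` tag key, and all ARNs are assumptions. The actions (`secretsmanager:GetSecretValue`, `secretsmanager:DescribeSecret`, `kms:Decrypt`) and the `aws:ResourceTag/<key>` condition key are standard IAM elements.

```python
import json


def build_secret_read_policy(secret_arn_prefix, kms_key_arn, project, tag_key="Project"):
    """Build a least-privilege policy for reading one project's secrets.

    Hypothetical helper: the tag key and ARNs are placeholders chosen
    for illustration, not values used by the Guidance itself.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access only to secrets tagged with this project.
                "Sid": "ReadProjectSecretsOnly",
                "Effect": "Allow",
                "Action": [
                    "secretsmanager:GetSecretValue",
                    "secretsmanager:DescribeSecret",
                ],
                "Resource": f"{secret_arn_prefix}*",
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": project}
                },
            },
            {
                # Decrypt only with the single customer managed key.
                "Sid": "DecryptWithCustomerManagedKey",
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": kms_key_arn,
            },
        ],
    }


if __name__ == "__main__":
    policy = build_secret_read_policy(
        "arn:aws:secretsmanager:us-east-1:111122223333:secret:datazone/",
        "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
        project="project-a",
    )
    print(json.dumps(policy, indent=2))
```

With a policy shaped like this, a role belonging to one project cannot read a secret tagged for another project, even if both secrets share the same name prefix.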

Implementation Resources

A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of working with the Guidance, including deployment, usage, and cleanup.

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.

Workshop

Connecting Data Products with Amazon DataZone Workshop

This workshop demonstrates how to extend Amazon DataZone and govern JDBC-backed data sources like MySQL, PostgreSQL, Oracle, and SQL Server databases.

Blog

Governing data in relational databases using Amazon DataZone

This blog post explains how you can extend the governance capabilities of Amazon DataZone to data assets hosted in relational databases based on MySQL, PostgreSQL, Oracle or SQL Server engines.

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.
