Overview
Data quality has always been an important issue for CDOs. It is often plagued by complexity and demands an attention to detail that can require significant human resources. Recent developments in Generative AI have provided breakthroughs in this area, enabling a level of automation in data quality analysis that improves productivity, reduces costs and improves business outcomes.
Our GenAI-powered engine for data quality in data pipelines is built using AWS CDK (Infrastructure as Code) and performs two critical functions: 1) it automatically checks tabular data against rules provided by the user, and 2) it autonomously generates data quality rules, which extends its adaptability and functionality. It also provides a brief report that justifies the rules that were generated and applied and explains what they are intended to do, addressing the important issue of explainability. A human-in-the-loop system enables immediate rectification of any errors or discrepancies, preventing corrupted data from moving forward in the process.
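To make the first function more concrete, the following is a minimal sketch of how user-provided rules could be checked against tabular data. It is an illustration only, not the engine's actual implementation: the `Rule` class, the `check_rows` function, the column names and the sample rows are all hypothetical, and each rule carries a short rationale to mirror the explainability report described above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A hypothetical data quality rule applied to a single column."""
    name: str
    column: str
    predicate: Callable[[Any], bool]  # returns True when the value is valid
    rationale: str                    # plain-language justification for the report


def check_rows(rows: List[Row], rules: List[Rule]) -> List[Dict[str, Any]]:
    """Return one record per (row, rule) pair where the rule is violated."""
    violations = []
    for index, row in enumerate(rows):
        for rule in rules:
            value = row.get(rule.column)
            if not rule.predicate(value):
                violations.append(
                    {"row": index, "rule": rule.name,
                     "column": rule.column, "value": value}
                )
    return violations


# Hypothetical rules and data, for illustration only.
RULES = [
    Rule("customer_id_present", "customer_id",
         lambda v: v not in (None, ""),
         "Every record must be attributable to a customer."),
    Rule("amount_non_negative", "amount",
         lambda v: isinstance(v, (int, float)) and v >= 0,
         "Negative amounts indicate a data entry or sign error."),
]

SAMPLE_ROWS = [
    {"customer_id": "C001", "amount": 120.5},
    {"customer_id": "", "amount": 30.0},
    {"customer_id": "C003", "amount": -5},
]

if __name__ == "__main__":
    for violation in check_rows(SAMPLE_ROWS, RULES):
        print(violation)
```

In this sketch a rule is just a named predicate on one column, so rules written by a user and rules generated automatically could share the same shape and be reviewed by a person before they are applied.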
The engine is triggered by Amazon EventBridge, which initiates the GenAI pipeline. It uses AWS Lambda for compute tasks such as data pre-processing before the data goes to Amazon S3 for secure storage. Amazon SNS notifies stakeholders of any potential data quality issues. AWS CDK implements a CI/CD pipeline for continuous improvement of the system, so that updates and enhancements are integrated seamlessly.
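The flow described above can be sketched as a single Lambda-style handler. This is a simplified illustration under stated assumptions, not the deployed code: the event payload shape (`detail.dataset`, `detail.rows`), the `preprocess` step, the empty-value check and the `store` and `notify` callables are all hypothetical. In a real deployment `store` and `notify` would typically wrap the S3 and SNS clients; they are injected here so the sketch runs without AWS credentials.

```python
import json
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def preprocess(rows: List[Row]) -> List[Row]:
    """Trim whitespace from string values; a stand-in for real pre-processing."""
    return [
        {key: value.strip() if isinstance(value, str) else value
         for key, value in row.items()}
        for row in rows
    ]


def handler(
    event: Dict[str, Any],
    context: Any,
    store: Callable[[str, str], None],   # assumed wrapper around S3 storage
    notify: Callable[[str, str], None],  # assumed wrapper around SNS publishing
) -> Dict[str, Any]:
    """Pre-process the rows, store them, and notify if issues are found."""
    # Assumption: the EventBridge event carries the rows in its "detail" field.
    detail = event.get("detail", {})
    dataset = detail.get("dataset", "unknown")
    rows = preprocess(detail.get("rows", []))

    # Hypothetical check: flag any row that contains an empty value.
    issue_rows = [
        index for index, row in enumerate(rows)
        if any(value in (None, "") for value in row.values())
    ]

    store(f"processed/{dataset}.json", json.dumps(rows))
    if issue_rows:
        notify(f"Data quality issues in {dataset}",
               json.dumps({"rows": issue_rows}))

    return {"dataset": dataset, "stored_rows": len(rows), "issue_rows": issue_rows}


if __name__ == "__main__":
    demo_event = {
        "detail": {
            "dataset": "orders",
            "rows": [{"id": " 1 ", "city": "London"}, {"id": "2", "city": "  "}],
        }
    }
    print(handler(demo_event, None,
                  store=lambda key, body: print("store:", key),
                  notify=lambda subject, message: print("notify:", subject, message)))
```

The sketch stores the pre-processed data and then notifies; the human-in-the-loop step that holds back suspect data before it moves forward is omitted for brevity.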
Sold by | Data Reply UK |
Fulfillment method | Professional Services |
Pricing Information
This service is priced based on the scope of your request. Please contact seller for pricing details.
Support
Access to AWS POA funding during the POC/Pilot/MVP phase
Post-MVP: basic support, including documentation, FAQs, and email support during business hours
Contact: a.main@reply.com