The Data Cloner

The Data Cloner is a system designed to create synthetic data. It processes data from various sources, identifies relationships, categorizes metadata, and generates realistic but non-sensitive data for cloud storage.

Overview

Key Features

Multi-Source Data Integration: Supports relational databases (Oracle, MySQL, PostgreSQL, SQL Server) and file-based data (CSV, TSV, etc.).
Machine Learning Model: Utilizes ML algorithms to analyze metadata and identify table relationships.
Business Categorization: Determines business relevance of table columns using automated classification techniques.
Validation GUI: Lets clients review and validate metadata through an interactive GUI.
Metadata Storage: Captures and stores metadata in a MySQL database for further analysis.
Synthetic Data Generation: Uses multiple libraries to generate synthetic datasets for testing and development.
Cloud Storage Integration: Exports processed and synthetic data to cloud-based storage (AWS S3, Blob storage, etc.).
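
As an illustration of the relationship-identification feature, the sketch below infers candidate foreign keys from sampled column values. It is a plain value-inclusion heuristic standing in for the ML model, whose actual algorithm the listing does not describe. The function name `infer_foreign_keys`, the `min_overlap` parameter, and the sample tables are all hypothetical.

```python
from itertools import permutations


def infer_foreign_keys(tables, min_overlap=1.0):
    """Return candidate (child_table, child_col, parent_table, parent_col) links.

    `tables` maps table name -> {column name -> list of sampled values}.
    A child column is a candidate when the share of its non-null values found
    in a unique column of another table is at least `min_overlap`.
    """
    candidates = []
    for child, parent in permutations(tables, 2):
        for p_col, p_vals in tables[parent].items():
            p_set = set(p_vals)
            # The parent side must look like a key: non-empty and unique.
            if not p_vals or len(p_set) != len(p_vals):
                continue
            for c_col, c_vals in tables[child].items():
                values = [v for v in c_vals if v is not None]
                if not values:
                    continue
                overlap = sum(v in p_set for v in values) / len(values)
                if overlap >= min_overlap:
                    candidates.append((child, c_col, parent, p_col))
    return candidates


# Hypothetical sampled data, for illustration only.
tables = {
    "customers": {"id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]},
    "orders": {"order_id": [10, 11, 12, 13], "customer_id": [1, 1, 3, 2]},
}
links = infer_foreign_keys(tables)
```

A heuristic like this can produce false positives on small samples (for example, two unrelated integer columns with the same range), which is one reason a client validation step is useful.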

Development Options

• Programming Language: Python

• Data Processing SDKs: Microsoft Presidio (for sensitive data processing and anonymization)

• Database Systems: MySQL, PostgreSQL, Oracle, SQL Server

• GUI Development: Client-side web application

• Synthetic Data Libraries: Faker, Mimesis

• Cloud Storage: AWS S3, Blob Storage

AWS Tools for Logging & Monitoring

• Amazon CloudWatch: Monitors logs, metrics, and alerts for application health tracking.

• AWS Lambda Logging: Logs function executions and errors for debugging.

• Amazon S3 Logging: Tracks access and modification history of stored data.

• AWS IAM Policies: Ensures secure access control for data storage and processing services.
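
A minimal sketch of how a pipeline stage might emit structured logs that a service such as Amazon CloudWatch Logs could ingest. It uses only Python's standard `logging` module; the logger name `data_cloner` and the fields `stage`, `table`, and `rows` are assumptions made for illustration, not part of the product.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in ("stage", "table", "rows"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload, sort_keys=True)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("data_cloner")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
log = build_logger(stream)
log.info(
    "synthetic rows written",
    extra={"stage": "generate", "table": "customers", "rows": 3},
)
entry = json.loads(stream.getvalue())
```

In an AWS Lambda function, output written through the standard logging module is forwarded to CloudWatch Logs, so one JSON object per line keeps the entries easy to filter by field.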

Key Benefits

• Automated Metadata Processing: Reduces manual effort in analyzing database schema and relationships.

• Enhanced Data Privacy: Uses Microsoft Presidio to anonymize sensitive data.

• Scalable Architecture: Supports cloud-based storage for large-scale data processing.

• Faster Insights: Enables quick validation and categorization of metadata.

• Synthetic Data Creation: Generates realistic yet non-sensitive data for testing and development.

• Cloud-Ready Solution: Facilitates seamless data transfer to cloud storage solutions.

How It Works

Source System Integration: Imports data from RDBMS or files (CSV, TSV, etc.).
Metadata Extraction: Collects schema information from the data sources.
Machine Learning Processing: Identifies table relationships and business categories.
Client Validation: Provides an interactive GUI for metadata verification.
Metadata Storage: Saves processed metadata in a MySQL database.
Synthetic Data Generation: Uses Faker and Mimesis to create realistic but anonymized datasets.
Cloud Storage: Transfers synthetic data to AWS S3 or Blob storage.
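
The steps above can be sketched end to end in a few lines. This is an illustrative outline under stated assumptions, not the vendor's implementation: an in-memory SQLite database stands in for the source RDBMS, simple name-based rules stand in for the ML categorization, and Python's `random` module stands in for Faker and Mimesis so that the example runs with the standard library alone. All function names and sample values are hypothetical.

```python
import csv
import io
import random
import sqlite3


def extract_schema(conn):
    """Collect {table: [(column, declared_type), ...]} from SQLite's catalog."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(c[1], c[2].upper()) for c in cols]
    return schema


def categorize(column):
    """Hypothetical name-based rules standing in for the classifier."""
    name = column.lower()
    if "email" in name:
        return "email"
    if "name" in name:
        return "person_name"
    if name == "id" or name.endswith("_id"):
        return "identifier"
    return "other"


FIRST = ["Alex", "Sam", "Jordan", "Riya", "Chen"]
LAST = ["Lopez", "Kim", "Patel", "Novak", "Okafor"]


def synth_value(category, row_number, rng):
    """Produce one synthetic value; no source data is read at this stage."""
    if category == "identifier":
        return row_number
    if category == "person_name":
        return f"{rng.choice(FIRST)} {rng.choice(LAST)}"
    if category == "email":
        return f"user{row_number}@example.com"
    return rng.randint(0, 999)


def synth_csv(columns, n_rows, seed=0):
    """Generate n_rows synthetic rows for the given columns as CSV text."""
    rng = random.Random(seed)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([name for name, _ in columns])
    for i in range(1, n_rows + 1):
        writer.writerow(
            [synth_value(categorize(name), i, rng) for name, _ in columns]
        )
    return buf.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, full_name TEXT, email TEXT)"
)
schema = extract_schema(conn)
csv_text = synth_csv(schema["customers"], n_rows=3)
```

The resulting CSV text could then be written to cloud storage, for example with the boto3 S3 client's `put_object` call. The upload is left out here so that the sketch needs no credentials or network access.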

Highlights

The system integrates machine learning, open-source SDKs, and database management tools to achieve automated metadata processing and synthetic data generation.
The system uses AES-256 encryption for secure data storage and transmission and implements role-based access control (RBAC) for data access. Data handling is aligned with global privacy regulations, and user activities and changes are tracked for compliance reporting.
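
To make the access-control and activity-tracking points concrete, here is a minimal sketch of a role check that records every decision in an audit trail. The role names, the permission names, and the in-memory `audit_log` list are assumptions for illustration; the listing does not specify how the product models roles or stores audit records.

```python
from datetime import datetime, timezone

# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "admin": {"read_metadata", "validate_metadata", "generate_data", "export_data"},
    "reviewer": {"read_metadata", "validate_metadata"},
    "viewer": {"read_metadata"},
}

# In-memory audit trail; a real deployment would persist these entries.
audit_log = []


def authorize(user, role, action):
    """Return True if the role permits the action, and record the decision."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append(
        {
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "allowed": allowed,
        }
    )
    return allowed


can_validate = authorize("dana", "reviewer", "validate_metadata")
can_export = authorize("dana", "reviewer", "export_data")
```

Recording denied attempts alongside permitted ones is a deliberate choice in this sketch, since compliance reports usually need both.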

Details

Sold by

Altimetrik

Pricing

Custom pricing options

Pricing is based on your specific requirements and eligibility. To get a custom quote for your needs, request a private offer.

Legal

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Support

Vendor support

The Data Cloner service is customized based on the scope and complexity of each engagement. Contact us for a personalized quote that fits your specific needs.