    [Cubig] Data Transform System

    Sold by: CUBIG 
    Deployed on AWS
    DTS (Data Transformation System) is an advanced synthetic data engine designed to generate safe, realistic datasets for AI and analytics. By bridging data gaps and enabling compliant data usage, DTS helps enterprises accelerate model training, improve accuracy, and protect sensitive information. With support for text, images, and tables, it delivers scalable, high-quality data tailored for modern AI workloads.

    Overview

    DTS (Data Transformation System) is a state-of-the-art synthetic data generation engine built for enterprises that require both data utility and privacy. It creates synthetic datasets that preserve up to 99% of the original data's statistical and structural value while eliminating the risk of exposing sensitive information. With multi-modal support for text, images, and tabular data, DTS provides a unified framework for building reliable training datasets across diverse AI and machine learning workloads.

    At its core, DTS integrates advanced privacy-preserving techniques such as differential privacy and automated de-identification to ensure regulatory compliance, even in highly restricted domains like healthcare, finance, and government. This enables organizations to use and share data safely without compromising accuracy or exposing personal identifiers.

    By leveraging scalable GPU infrastructure and optimized data pipelines, DTS accelerates model development, reduces the costs of manual data collection, and addresses critical data gaps that often hinder AI deployment. Whether for simulation, training, or analytics, DTS empowers enterprises with secure, high-quality synthetic data designed for next-generation AI systems.

    Highlights

    • Generate high-quality synthetic data that retains up to 99% of the original dataset value across text, images, and tables.
    • Ensure privacy and compliance with built-in differential privacy, preventing re-identification of sensitive information.
    • Accelerate AI training and analysis with scalable synthetic data, reducing costs and overcoming data gaps.

    Details

    Sold by: CUBIG
    Delivery option: Docker Container - GPU Required
    Supported services: Amazon ECS, Amazon EKS
    Latest version: 1.0.2
    Operating system: Linux
    Deployed on AWS

    Pricing

    [Cubig] Data Transform System

    Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    1-month contract (1)

    Dimension: DTS Server
    Description: DTS pricing varies based on data volume, data type/format, and your consumption needs. Listed prices are placeholders. Please contact our sales team at contact@cubig.ai to request a custom private offer tailored to your requirements.
    Cost/month: $9,500.00

    Vendor refund policy

    Please contact us at contact@cubig.ai for the refund policy.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Docker Container - GPU Required

    Supported services:
    • Amazon ECS
    • Amazon EKS
    Container image

    Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.

    Version release notes

    Release Notes - DTS AI Module

    Version 1.0.2 (2026-02-06) - Current Release

    [FIXED] Bug Fixes

    • Fixed AWS Marketplace compatibility issue: Resolved UnsupportedImageType error by removing Docker BuildKit metadata (provenance and SBOM)
    • Docker images now use single-architecture manifest (linux/amd64 only) as required by AWS Marketplace

    [IMPROVED] Build Improvements

    • Updated build script to explicitly disable BuildKit provenance attestation (--provenance=false)
    • Disabled SBOM (Software Bill of Materials) generation during build (--sbom=false)
    • Ensured consistent single-platform image for AWS ECR deployment
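
The build changes listed above correspond to standard `docker buildx build` options. The vendor's actual build script is not published, so the following is only a sketch of how those flags combine; the function name, image tag, and context directory are hypothetical.

```shell
# Hypothetical wrapper (not the vendor's script): build a single-platform
# image with provenance attestation and SBOM generation disabled.
# Usage: build_single_arch_image IMAGE_TAG [CONTEXT_DIR]
build_single_arch_image() {
  image_tag=$1
  context_dir=${2:-.}
  docker buildx build \
    --platform linux/amd64 \
    --provenance=false \
    --sbom=false \
    -t "$image_tag" \
    "$context_dir"
}
```

Without the two attestation flags, BuildKit may attach extra manifest entries to the image, which is the metadata the 1.0.2 notes describe removing.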

    [DOCS] Documentation

    • Added comprehensive troubleshooting guide for deployment issues
    • Enhanced user documentation for Docker volume mounting and data paths

    Version 1.0.1 (2026-02-06) - Initial Marketplace Submission

    [NEW] Customer Onboarding Enhancement

    [DOCS] Documentation Improvements

    • Enhanced Docker volume mount instructions with detailed examples
    • Added privacy parameter guides (epsilon, delta, iteration explanations)
    • Included modality-specific examples (Image/Text/Tabular)
    • Improved API usage examples with real-world scenarios

    [SECURITY] Security

    • Cython-compiled modules for intellectual property protection (30 modules, 3,202 lines)
    • FastAPI RESTful API with health check endpoint
    • Docker multi-stage build for minimal attack surface

    Version 1.0.0 (2026-02-02) - Initial Development Release

    [NEW] Initial Release Features

    • Multi-modal Synthetic Data Generation: Support for Image, Text, and Tabular data
    • Differential Privacy: Mathematically proven privacy protection with configurable epsilon and delta
    • Iterative Refinement: Progressive quality improvement through multiple iterations
    • GPU Acceleration: CUDA 12.1 support for high-performance generation
    • RESTful API: FastAPI-based interface with automatic documentation

    [ARCH] Architecture

    • Docker containerized deployment with NVIDIA GPU support
    • Multi-stage Docker build (Builder + Runtime)
    • Cython compilation for IP protection
    • FastAPI server with uvicorn

    [FEATURE] Core Capabilities

    • Image Generation: Stable Diffusion/FLUX with LoRA support
    • Text Generation: LLM-based (LLaMA, Qwen, Claude) with domain-specific prompts
    • Tabular Generation: Structured output with column schema support
    • Feature Extraction: CLIP (Image), Sentence-BERT (Text), One-hot encoding (Tabular)
    • DP Selection: Analytic Gaussian Mechanism (AGM) for privacy-preserving selection

    [DEPLOY] Deployment

    • AWS ECR compatible Docker image
    • Health check endpoint for container orchestration
    • Volume mount support for customer data
    • GPU requirement: NVIDIA GPU with CUDA support

    Additional details

    Usage instructions

    DTS AI Module - Usage Instructions

    Quick Start

    Step 1: Run the Container

    docker run -d \
      --name dts-ai-module \
      --gpus all \
      -p 8000:8000 \
      -v /path/to/your/data:/data \
      -v /path/to/output:/results \
      709825985650.dkr.ecr.us-east-1.amazonaws.com/cubig-ai/dts-v1:1.0.2

    Replace /path/to/your/data and /path/to/output with your actual host paths.
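
The two -v flags map host directories onto /data and /results inside the container, so request fields such as private_data_path must use the container-side path. A minimal sketch of that translation follows; the helper name `to_container_path` and the example host directory are assumptions for illustration and are not part of DTS.

```shell
# Hypothetical helper (not shipped with DTS): translate a host path that lies
# under a mounted directory into the path the container sees.
# Usage: to_container_path HOST_MOUNT CONTAINER_MOUNT HOST_PATH
to_container_path() {
  host_root=${1%/}          # strip one trailing slash, if present
  container_root=${2%/}
  host_path=$3
  case "$host_path" in
    "$host_root")
      printf '%s\n' "$container_root" ;;
    "$host_root"/*)
      # Replace the host mount prefix with the container mount prefix
      printf '%s\n' "$container_root/${host_path#"$host_root"/}" ;;
    *)
      echo "error: $host_path is not under $host_root" >&2
      return 1 ;;
  esac
}
```

For example, with /srv/dts/input mounted at /data, `to_container_path /srv/dts/input /data /srv/dts/input/xray_images` prints /data/xray_images, which is the value to pass as private_data_path.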

    Step 2: Verify Health

    Wait 30 seconds for service startup:

    curl http://localhost:8000/health 

    Expected response: { "status": "healthy", "service": "DTS AI Module", "version": "1.0.2" }
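
Startup time can vary (the troubleshooting notes mention 30-60 seconds), so polling the endpoint may be more reliable than a fixed wait. A minimal sketch, assuming the response body contains the string "healthy" as shown above; the function name and its default retry values are illustrative only.

```shell
# Poll the health endpoint until it reports "healthy" or the attempts run out.
# Usage: wait_for_health [URL] [MAX_TRIES] [DELAY_SECONDS]
wait_for_health() {
  url=${1:-http://localhost:8000/health}
  max_tries=${2:-30}
  delay=${3:-2}
  try=1
  while [ "$try" -le "$max_tries" ]; do
    # -f: treat HTTP error statuses as failures; -sS: quiet, but keep errors
    if body=$(curl -fsS "$url" 2>/dev/null); then
      case "$body" in
        *'"healthy"'*) printf '%s\n' "$body"; return 0 ;;
      esac
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  echo "error: $url not healthy after $max_tries attempts" >&2
  return 1
}
```

Calling `wait_for_health` with no arguments checks http://localhost:8000/health up to 30 times, two seconds apart, and returns a non-zero status if the service never becomes healthy.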

    Step 3: Access Documentation

    Open the API documentation at http://localhost:8000/docs

    API Usage Examples

    Example 1: Image Synthetic Data

    Generate 100 synthetic medical X-ray images:

    curl -X POST http://localhost:8000/generate \
      -H "Content-Type: application/json" \
      -d '{ "run_id": "medical_xray_001", "modality": "image", "private_data_path": "/data/xray_images", "num_samples": 100, "iteration": 2, "epsilon": 8.0, "delta": 1e-6, "positive_prompt": "a medical chest x-ray image, high quality", "image_width": 512, "image_height": 512, "gpu_id": "0" }'

    Result: Synthetic images saved to /results/medical_xray_001/final/synthetic_final/

    Example 2: Text Synthetic Data

    Generate 1,000 synthetic medical texts:

    curl -X POST http://localhost:8000/generate \
      -H "Content-Type: application/json" \
      -d '{ "run_id": "medical_text_001", "modality": "text", "private_data_path": "/data/diagnosis_texts.csv", "num_samples": 1000, "iteration": 3, "epsilon": 8.0, "domain": "medical", "sub_domain": "diagnosis", "gpu_id": "0" }'

    Result: Synthetic texts saved to /results/medical_text_001/final/synthetic_final.csv

    Example 3: Tabular Synthetic Data

    Generate 5,000 synthetic customer records:

    curl -X POST http://localhost:8000/generate \
      -H "Content-Type: application/json" \
      -d '{ "run_id": "customer_data_001", "modality": "tabular", "private_data_path": "/data/customer_dataset.csv", "num_samples": 5000, "iteration": 2, "epsilon": 8.0, "numerical_columns": ["age", "income", "credit_score"], "categorical_columns": ["gender", "education", "occupation"], "gpu_id": "0" }'

    Result: Synthetic data saved to /results/customer_data_001/final/synthetic_final.csv
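
Long inline JSON is easy to mis-quote in a shell, so it can help to keep each request body in a file and post it from there. The following is a sketch under that assumption; `dts_generate` is a hypothetical wrapper name, and only the /generate endpoint and the Content-Type header come from the examples above.

```shell
# Hypothetical wrapper: POST a JSON payload file to the /generate endpoint.
# Usage: dts_generate PAYLOAD_FILE [BASE_URL]
dts_generate() {
  payload_file=$1
  base_url=${2:-http://localhost:8000}
  if [ ! -r "$payload_file" ]; then
    echo "error: cannot read $payload_file" >&2
    return 1
  fi
  # --data @FILE sends the contents of FILE as the request body
  curl -fsS -X POST "${base_url%/}/generate" \
    -H "Content-Type: application/json" \
    --data @"$payload_file"
}
```

For instance, saving the Example 3 request body as tabular.json and running `dts_generate tabular.json` would send the same request as the curl command shown for that example.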


    Privacy Parameters Guide

    Epsilon - Privacy Budget

    • Lower values: Stronger privacy protection, lower data utility
    • Higher values: Weaker privacy protection, higher data utility
    • Recommended: 8.0 for balanced privacy-utility tradeoff
    • Range: 0.1 (very strong) to 10.0 (moderate)
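
For orientation, epsilon and delta are the two parameters of the standard (epsilon, delta) differential privacy definition. The listing does not state how DTS accounts for privacy across iterations or what it treats as neighboring datasets, so the block below is the textbook definition with a few evaluated values, not a description of the DTS implementation.

```latex
% Textbook definition: a randomized mechanism M is (\varepsilon,\delta)-DP if,
% for all neighboring datasets D, D' and all sets of outputs S,
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta .

% The factor e^{\varepsilon} bounds, up to \delta, how much a single record
% can change the probability of any outcome. Across the documented range:
e^{0.1} \approx 1.105, \qquad e^{8} \approx 2981, \qquad e^{10} \approx 22026 .
```

Read this way, a smaller epsilon keeps the two probabilities closer together, which is why lower values give stronger protection at the cost of utility.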

    Delta - Privacy Failure Probability

    • Recommended: 1e-6 (should be less than 1/N, where N is the number of samples)
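
The delta < 1/N rule can be checked before submitting a request. This sketch takes N to be the number of records in the private dataset, which is the usual convention but an assumption here; `check_delta` is a hypothetical helper, not a DTS command.

```shell
# Hypothetical pre-flight check: succeed (exit 0) only if delta < 1/N.
# Usage: check_delta DELTA N
check_delta() {
  awk -v d="$1" -v n="$2" 'BEGIN {
    if (n + 0 <= 0) exit 2            # N must be a positive number
    exit !((d + 0) < 1 / (n + 0))     # exit 0 when delta < 1/N, else 1
  }'
}
```

With the recommended delta of 1e-6, `check_delta 1e-6 5000` succeeds because 1/5000 is 2e-4, while `check_delta 1e-6 2000000` fails because 1/2000000 is 5e-7; very large datasets would need a smaller delta under this rule.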

    Iteration - Quality Improvement

    • 1: Fast, lower quality
    • 2-3: Balanced (recommended)
    • 4-5: Slower, higher quality

    Troubleshooting

    Issue: Connection refused to port 8000
    Solution: Wait 30-60 seconds for service startup, then retry

    Issue: GPU not found
    Solution: Verify NVIDIA Container Toolkit: docker run --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

    Issue: private_data_path not found
    Solution: Use container internal path (/data/...), not host path

    Issue: Out of memory
    Solution: Reduce num_samples, iteration, or image dimensions


    Support

    For technical support:

    1. Check API documentation at http://localhost:8000/docs 
    2. Review user guide: docker exec dts-ai-module cat /app/USER_GUIDE.md
    3. Contact AWS Marketplace support

    Support

    Vendor support

    Please contact us at contact@cubig.ai for any assistance or questions.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
