Overview
CUBIG DTS is an enterprise synthetic data engine for organizations that cannot freely use original data because of privacy, access, or data-quality constraints. It generates privacy-safe synthetic data across text, tabular, and image formats. DTS applies differential privacy at generation time and operates in a zero-access architecture: the synthetic output, not the original records, is what crosses the security boundary, so raw customer records never leave the client environment. Because DTS produces entirely new data points rather than masking or transforming originals, the output is structurally distinct from de-identified or anonymized copies of source data.
Teams can augment scarce datasets, correct class imbalance, replace missing values, and build higher-utility training data for AI and analytics workflows. This approach helps teams in finance, healthcare, and the public sector work with representative data without exposing regulated records. Common use cases include replacing restricted datasets that cannot leave the security perimeter, generating balanced training sets for fraud detection or medical diagnosis models, filling coverage gaps where minority classes are underrepresented, and supplementing missing values in incomplete records.
DTS runs on GPU infrastructure and provides a container-based deployment that integrates with Amazon ECS and Amazon EKS, so teams can scale synthetic data generation within their existing AWS environment.
Highlights
- Multimodal synthetic data across text, tabular, and image workflows
- Differential privacy and zero-access processing so raw records stay inside the client environment
- Augment scarce data, correct class imbalance, and replace missing values for better AI training and analytics
Details
Pricing
| Dimension | Description | Cost/month |
|---|---|---|
| DTS Server | DTS pricing varies based on data volume, data type/format, and your consumption needs. Listed prices are placeholders. Please contact our sales team at contact@cubig.ai to request a custom private offer tailored to your requirements. | $9,500.00 |
Vendor refund policy
For refund policy details, contact us at contact@cubig.ai.
Delivery details
Docker Container - GPU Required
- Amazon ECS
- Amazon EKS
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Security Update - SQLite 3.50.2
This release addresses CVE-2025-6965, an integer truncation and memory corruption vulnerability in SQLite. The bundled SQLite library has been upgraded from 3.37.2 to 3.50.2 in the Docker runtime image.
Changes
- Upgraded SQLite to 3.50.2 to resolve CVE-2025-6965
- Added symbolic links to ensure Python loads the patched SQLite library
- Refactored Dockerfile to copy pip packages directly from builder stage, avoiding runtime source-build issues
- No functional changes to the DTS API or synthetic data generation pipeline
- Backward compatible with v1.0.2
Verification
- sqlite3.sqlite_version reports 3.50.2 in the running container
- Health endpoint (GET /health) returns 200 OK
- All core modules (interface, utils, faiss, torch) import successfully
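The first verification item can be checked with Python's standard `sqlite3` module, for example by running a script inside the container. The following is a minimal sketch, not part of the DTS tooling: the helper names `parse_version` and `is_patched` are illustrative assumptions.

```python
import sqlite3

# Minimum SQLite version named in the release notes above.
PATCHED = (3, 50, 2)

def parse_version(text):
    """Turn a dotted version string such as '3.50.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))

def is_patched(version_text, minimum=PATCHED):
    """Return True when version_text is at or above the patched release.

    Hypothetical helper for illustration; DTS does not ship this function.
    """
    return parse_version(version_text) >= minimum

# sqlite3.sqlite_version is the version of the SQLite library that Python
# actually loaded at runtime, which is what the symbolic links affect.
print(sqlite3.sqlite_version, "patched:", is_patched(sqlite3.sqlite_version))
```

Comparing tuples of integers avoids the string-comparison trap where "3.9.0" would sort after "3.50.2".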
Recommended Action
All users should upgrade to v1.0.4 for the security fix.
Additional details
Usage instructions
DTS AI Module - Usage Instructions
Quick Start
Step 1: Run the Container
docker run -d \
  --name dts-ai-module \
  --gpus all \
  -p 8000:8000 \
  -v /path/to/your/data:/data \
  -v /path/to/output:/results \
  709825985650.dkr.ecr.us-east-1.amazonaws.com/cubig-ai/dts-v1:1.0.2
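Replace /path/to/your/data and /path/to/output with your actual host paths. The image tag shown here is 1.0.2; the release notes recommend v1.0.4, so use the tag that matches the version you have subscribed to.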
Step 2: Verify Health
Allow 30 to 60 seconds for the service to start, then query the health endpoint:
curl http://localhost:8000/health
Expected response: { "status": "healthy", "service": "DTS AI Module", "version": "1.0.2" }
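Instead of waiting a fixed interval, a script can poll the health endpoint until it answers. The sketch below uses only the Python standard library and assumes the response shape shown above (a JSON object with a `status` field). The function names and the retry settings (12 attempts, 5 seconds apart, about 60 seconds in total) are illustrative choices, not values prescribed by DTS.

```python
import json
import time
import urllib.request

def fetch_health(url):
    """GET the health endpoint and return the decoded JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))

def wait_until_healthy(url, attempts=12, delay=5.0,
                       fetch=fetch_health, sleep=time.sleep):
    """Poll url until it reports status 'healthy' or attempts run out.

    Returns the health payload on success and None on timeout.
    The retry count and delay are assumptions, not DTS defaults.
    """
    for attempt in range(attempts):
        try:
            payload = fetch(url)
            if payload.get("status") == "healthy":
                return payload
        except (OSError, ValueError):
            # OSError covers refused connections and timeouts;
            # ValueError covers a body that is not valid JSON yet.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

A typical call would be `wait_until_healthy("http://localhost:8000/health")`, treating a `None` result as a startup failure worth checking in the container logs.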
Step 3: Access Documentation
- Quick Start: http://localhost:8000/
- API Docs: http://localhost:8000/docs (Swagger UI)
- User Guide: docker exec dts-ai-module cat /app/USER_GUIDE.md
API Usage Examples
Example 1: Image Synthetic Data
Generate 100 synthetic medical X-ray images:
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{
    "run_id": "medical_xray_001",
    "modality": "image",
    "private_data_path": "/data/xray_images",
    "num_samples": 100,
    "iteration": 2,
    "epsilon": 8.0,
    "delta": 1e-6,
    "positive_prompt": "a medical chest x-ray image, high quality",
    "image_width": 512,
    "image_height": 512,
    "gpu_id": "0"
  }'
Result: Synthetic images saved to /results/medical_xray_001/final/synthetic_final/
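The same request can be sent from Python. The sketch below uses only the standard library and copies the field names from the curl example above. The helper names `build_image_request` and `post_generate` are hypothetical, and because the response format of /generate is not described here, the sketch returns the raw response body rather than parsing it.

```python
import json
import urllib.request

def build_image_request(run_id, private_data_path, num_samples, prompt,
                        iteration=2, epsilon=8.0, delta=1e-6,
                        width=512, height=512, gpu_id="0"):
    """Assemble the JSON body used in the image example above.

    Hypothetical helper; defaults mirror the values in that example.
    """
    return {
        "run_id": run_id,
        "modality": "image",
        "private_data_path": private_data_path,
        "num_samples": num_samples,
        "iteration": iteration,
        "epsilon": epsilon,
        "delta": delta,
        "positive_prompt": prompt,
        "image_width": width,
        "image_height": height,
        "gpu_id": gpu_id,
    }

def post_generate(payload, base_url="http://localhost:8000"):
    """POST payload to /generate and return the raw response body as text."""
    request = urllib.request.Request(
        base_url + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, `post_generate(build_image_request("medical_xray_001", "/data/xray_images", 100, "a medical chest x-ray image, high quality"))` sends the request shown in the curl example.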
Example 2: Text Synthetic Data
Generate 1,000 synthetic medical texts:
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{
    "run_id": "medical_text_001",
    "modality": "text",
    "private_data_path": "/data/diagnosis_texts.csv",
    "num_samples": 1000,
    "iteration": 3,
    "epsilon": 8.0,
    "domain": "medical",
    "sub_domain": "diagnosis",
    "gpu_id": "0"
  }'
Result: Synthetic texts saved to /results/medical_text_001/final/synthetic_final.csv
Example 3: Tabular Synthetic Data
Generate 5,000 synthetic customer records:
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{
    "run_id": "customer_data_001",
    "modality": "tabular",
    "private_data_path": "/data/customer_dataset.csv",
    "num_samples": 5000,
    "iteration": 2,
    "epsilon": 8.0,
    "numerical_columns": ["age", "income", "credit_score"],
    "categorical_columns": ["gender", "education", "occupation"],
    "gpu_id": "0"
  }'
Result: Synthetic data saved to /results/customer_data_001/final/synthetic_final.csv
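Before submitting a tabular request, it can help to confirm that every name in numerical_columns and categorical_columns appears in the CSV header. The sketch below is a client-side check, not part of DTS. It assumes the input CSV has a header row, which the source does not state, and the helper names are illustrative.

```python
import csv
import io

def missing_columns(csv_file, numerical_columns, categorical_columns):
    """Return requested column names that are absent from the CSV header.

    Assumes the first row of csv_file is a header row.
    """
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    requested = list(numerical_columns) + list(categorical_columns)
    return [name for name in requested if name not in present]

def overlapping_columns(numerical_columns, categorical_columns):
    """Return names listed as both numerical and categorical."""
    return sorted(set(numerical_columns) & set(categorical_columns))

# Demo with an in-memory file; a real run would open the host-side CSV.
sample = io.StringIO("age,income,credit_score,gender,education\n34,52000,710,F,BSc\n")
print(missing_columns(sample,
                      ["age", "income", "credit_score"],
                      ["gender", "education", "occupation"]))
# prints ['occupation']
```

An empty list from both helpers means the column lists are consistent with the file header.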
Privacy Parameters Guide
Epsilon - Privacy Budget
- Lower values: Stronger privacy protection, lower data utility
- Higher values: Weaker privacy protection, higher data utility
- Recommended: 8.0 for balanced privacy-utility tradeoff
- Range: 0.1 (very strong) to 10.0 (moderate)
Delta - Privacy Failure Probability
- Recommended: 1e-6 (delta should be less than 1/N, where N is the number of samples)
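For background, the standard textbook definition of (epsilon, delta)-differential privacy shows how the two parameters interact. The source does not say which privacy accounting DTS uses internally, so treat this as general context rather than a description of the DTS mechanism. Here M is a randomized mechanism, D and D' are datasets that differ in one record, and S is any set of outputs.

```latex
\begin{align*}
\Pr[\mathcal{M}(D) \in S] &\le e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta \\[4pt]
e^{0.1} &\approx 1.105, \qquad e^{8.0} \approx 2981 \\[4pt]
N = 10^{5}: \quad \tfrac{1}{N} &= 10^{-5} > 10^{-6} \\
N = 5 \times 10^{6}: \quad \tfrac{1}{N} &= 2 \times 10^{-7} < 10^{-6}
\end{align*}
```

The second line shows why lower epsilon means stronger protection: the multiplicative bound on how much one record can shift any output probability grows exponentially with epsilon. The last two lines work through the 1/N rule: delta = 1e-6 satisfies it for 100,000 samples but not for 5,000,000, where a smaller delta would be needed.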
Iteration - Quality Improvement
- 1: Fast, lower quality
- 2-3: Balanced (recommended)
- 4-5: Slower, higher quality
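The guidance above can be turned into a simple pre-flight check on the client side. This is a sketch under stated assumptions: the function name is hypothetical, the source does not say whether the API itself rejects out-of-range values, and `n` stands for the N in the delta rule, which the guide describes only as the number of samples.

```python
def check_privacy_params(epsilon, delta, n, iteration):
    """Return a list of warnings based on the guide above; empty if none.

    Hypothetical client-side helper. Pass delta=None when the request
    omits delta, as the text and tabular examples do.
    """
    warnings = []
    if not 0.1 <= epsilon <= 10.0:
        warnings.append("epsilon outside the documented range 0.1 to 10.0")
    if delta is not None and not 0 < delta < 1.0 / n:
        warnings.append("delta should be positive and below 1/N")
    if not 1 <= iteration <= 5:
        warnings.append("iteration outside the documented levels 1 to 5")
    return warnings

# The image example above (100 samples, delta 1e-6) passes all three checks.
print(check_privacy_params(epsilon=8.0, delta=1e-6, n=100, iteration=2))
# prints []
```

Returning warnings instead of raising keeps the decision with the caller, since the documented ranges read as recommendations rather than hard limits.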
Troubleshooting
Issue: Connection refused to port 8000
Solution: Wait 30-60 seconds for service startup, then retry
Issue: GPU not found
Solution: Verify NVIDIA Container Toolkit: docker run --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Issue: private_data_path not found
Solution: Use container internal path (/data/...), not host path
Issue: Out of memory
Solution: Reduce num_samples, iteration, or image dimensions
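The private_data_path issue usually comes from passing a host path where the container expects its own mount point. The sketch below translates a host path into the matching container path, assuming the -v host:/data mapping from Step 1. The function name and the example host directory /srv/dts/input are hypothetical.

```python
from pathlib import PurePosixPath

def to_container_path(host_path, host_mount, container_mount="/data"):
    """Map a host path under host_mount to its path inside the container.

    Raises ValueError when host_path is not inside host_mount, which
    means the file is not visible to the container at all.
    """
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_mount))
    return str(PurePosixPath(container_mount) / relative)

# /srv/dts/input is an illustrative host directory mounted with
# -v /srv/dts/input:/data
print(to_container_path("/srv/dts/input/xray_images", "/srv/dts/input"))
# prints /data/xray_images
```

The returned value is what belongs in private_data_path. The same idea applies in reverse for outputs: files written under /results in the container appear under the host directory given in the second -v option.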
Support
For technical support:
- Check API documentation at http://localhost:8000/docs
- Review user guide: docker exec dts-ai-module cat /app/USER_GUIDE.md
- Contact AWS Marketplace support
Support
Vendor support
Please reach us at contact@cubig.ai for any assistance or questions.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.