Overview
ZAO is a 4D molecular foundation model developed by SyntheticGestalt for drug discovery and molecular property prediction. Unlike conventional models that represent molecules as 1D sequences or 2D graphs, ZAO processes multiple 3D conformations of each molecule simultaneously, capturing spatial arrangements, molecular flexibility, and charge distributions that flat representations miss.
Send SMILES strings via REST API and receive 2048-dimensional molecular embeddings. These embeddings serve as drop-in features for any downstream ML model, including CatBoost, XGBoost, scikit-learn, and neural networks. All preprocessing, including conformer generation and feature extraction, is handled inside the container with no external tools required.
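A minimal sketch of that round trip, assuming the model is deployed as a SageMaker real-time endpoint and called through the boto3 `sagemaker-runtime` client. The endpoint name is a placeholder, and because this listing does not specify the shape of the JSON response, the function returns the parsed body unchanged.

```python
import json


def embed_smiles(runtime_client, endpoint_name, smiles):
    """Send SMILES strings to a deployed endpoint and return the parsed JSON reply.

    runtime_client: a boto3 "sagemaker-runtime" client, or any object that
        exposes the same invoke_endpoint method.
    endpoint_name: the name chosen at deployment time (hypothetical here).
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",  # assumption: the container can reply in JSON
        Body=json.dumps({"smiles": list(smiles)}),
    )
    # boto3 hands back the payload as a streaming body; read it, then decode.
    return json.loads(response["Body"].read())


# Usage (requires AWS credentials and a running endpoint):
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   result = embed_smiles(client, "zao-endpoint", ["CC(=O)Oc1ccccc1C(=O)O"])
```

Passing the client in as an argument keeps the function easy to test with a stub and avoids creating a new client on every call.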
ZAO embeddings fed into a simple CatBoost model achieve state-of-the-art performance across standard drug discovery benchmarks. On the TDC ADMET Benchmark (22 datasets), ZAO ranks #1 on the leaderboard in 8 datasets, including clearance, solubility, lipophilicity, CYP metabolism, and toxicity prediction, and top 3 in 9 datasets. On TDC DTI BindingDB_Patent, ZAO achieves PCC 0.687 (+0.1 over the previous leaderboard #1).
Processing multiple conformers per molecule is critical for accuracy. In ablation studies, using 10 conformers instead of a single conformer improves activity prediction R² by 12% and property prediction R² by 4.8%, which indicates that conformational flexibility carries information a single 3D structure does not.
ZAO is designed for drug-like compounds with molecular weight between 100 and 1000 Da. Molecules outside this range return null embeddings.
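Because out-of-range molecules come back with null embeddings, downstream code usually has to drop those rows before training. The sketch below assumes the response has already been parsed into a Python list that runs parallel to the input SMILES, with `None` standing in for a null embedding. The actual response layout is not documented here, so adapt the parsing step to what the endpoint returns.

```python
def drop_null_embeddings(smiles, embeddings, labels=None):
    """Keep only molecules whose embedding is not None.

    Returns (kept_smiles, kept_embeddings, kept_labels, skipped_smiles).
    kept_labels is None when no labels are supplied.
    """
    if len(smiles) != len(embeddings):
        raise ValueError("smiles and embeddings must have the same length")
    if labels is not None and len(labels) != len(smiles):
        raise ValueError("labels must have the same length as smiles")

    kept_smiles, kept_embeddings, skipped = [], [], []
    kept_labels = [] if labels is not None else None
    for i, (smi, emb) in enumerate(zip(smiles, embeddings)):
        if emb is None:
            # Null embedding: invalid SMILES or outside the supported range.
            skipped.append(smi)
            continue
        kept_smiles.append(smi)
        kept_embeddings.append(emb)
        if labels is not None:
            kept_labels.append(labels[i])
    return kept_smiles, kept_embeddings, kept_labels, skipped
```

Returning the skipped SMILES alongside the kept rows makes it easy to log which compounds were excluded from a training set.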
Volume discounts are available through AWS Marketplace Private Offers. For large-scale or enterprise usage, please contact zao@syntheticgestalt.com to request a Private Offer with custom pricing and terms.
Highlights
- Confidential by deployment: Runs as a SageMaker Model Package entirely within your AWS account, so proprietary SMILES and embeddings never leave your VPC. Access a state-of-the-art molecular foundation model without sharing any compound structures or results with SyntheticGestalt or any third party.
- 4D molecular understanding: Processes multiple 3D conformers per molecule, capturing spatial arrangements and molecular flexibility that 1D/2D models miss. Zero-setup inference with all preprocessing including conformer generation handled inside the container.
- State-of-the-art accuracy: #1 on TDC leaderboard in 8 of 22 ADMET benchmarks including clearance, solubility, lipophilicity, CYP metabolism, and toxicity prediction. Exceeds TDC DTI leaderboard #1 by +0.1 PCC.
Details
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g5.4xlarge Inference (Batch) (Recommended) | Model inference on the ml.g5.4xlarge instance type, batch mode | $1,296.00 |
| ml.g6e.2xlarge Inference (Real-Time) (Recommended) | Model inference on the ml.g6e.2xlarge instance type, real-time mode | $1,296.00 |
| ml.g4dn.xlarge Inference (Batch) | Model inference on the ml.g4dn.xlarge instance type, batch mode | $1,296.00 |
| ml.g5.2xlarge Inference (Real-Time) | Model inference on the ml.g5.2xlarge instance type, real-time mode | $1,296.00 |
| ml.g5.4xlarge Inference (Real-Time) | Model inference on the ml.g5.4xlarge instance type, real-time mode | $1,296.00 |
| ml.p3.2xlarge Inference (Batch) | Model inference on the ml.p3.2xlarge instance type, batch mode | $1,296.00 |
| ml.p3.2xlarge Inference (Real-Time) | Model inference on the ml.p3.2xlarge instance type, real-time mode | $1,296.00 |
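Since every listed dimension carries the same hourly rate, a rough estimate is rate × hosts × hours. The helper below is an illustrative calculation only: it uses the list price from the table and does not account for Private Offer pricing or any charges not shown in the table.

```python
from decimal import Decimal

# List price per host per hour, taken from the pricing table.
LIST_PRICE_PER_HOST_HOUR = Decimal("1296.00")


def estimate_software_cost(hours, hosts=1, rate=LIST_PRICE_PER_HOST_HOUR):
    """Return the estimated cost in USD for running `hosts` hosts for `hours` hours."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    # Convert via str() so float inputs such as 0.5 do not pick up binary noise.
    total = rate * Decimal(str(hours)) * hosts
    return total.quantize(Decimal("0.01"))
```

For example, one host running for two hours comes to $2,592.00 at list price.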
Vendor refund policy
No refunds. This product is billed hourly based on actual usage. You can delete your SageMaker endpoint at any time to stop incurring charges. For questions, contact zao@syntheticgestalt.com.
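One way to delete an endpoint programmatically, assuming boto3 is installed and configured. The endpoint name is a placeholder for whatever name was used at deployment.

```python
def delete_endpoint(sagemaker_client, endpoint_name):
    """Delete a SageMaker endpoint so it stops accruing hourly charges.

    sagemaker_client: a boto3 "sagemaker" client (not "sagemaker-runtime").
    """
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    return endpoint_name


# Usage (requires AWS credentials):
#   import boto3
#   delete_endpoint(boto3.client("sagemaker"), "zao-endpoint")
```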
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
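As a rough sketch of what creating a model and a real-time endpoint from the package involves, the function below assembles the arguments for three boto3 `sagemaker` client calls: `create_model`, `create_endpoint_config`, and `create_endpoint`. The ARNs and resource names are placeholders to be replaced with values from your own subscription and account. The default instance type is the real-time option marked as recommended in the pricing table.

```python
def build_deploy_requests(model_package_arn, role_arn, name="zao",
                          instance_type="ml.g6e.2xlarge", instance_count=1):
    """Return keyword arguments for the three SageMaker calls that deploy an endpoint."""
    config_name = f"{name}-config"
    return {
        "create_model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            "Containers": [{"ModelPackageName": model_package_arn}],
            # Assumption: Marketplace model packages generally need network isolation.
            "EnableNetworkIsolation": True,
        },
        "create_endpoint_config": {
            "EndpointConfigName": config_name,
            "ProductionVariants": [{
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }],
        },
        "create_endpoint": {
            "EndpointName": f"{name}-endpoint",
            "EndpointConfigName": config_name,
        },
    }


# Usage (requires AWS credentials and an active subscription):
#   import boto3
#   sm = boto3.client("sagemaker")
#   reqs = build_deploy_requests("<model-package-arn>", "<execution-role-arn>")
#   sm.create_model(**reqs["create_model"])
#   sm.create_endpoint_config(**reqs["create_endpoint_config"])
#   sm.create_endpoint(**reqs["create_endpoint"])
```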
Version release notes
Initial release of ZAO - 4D Molecular Foundation Model.
Features:
- 2048-dimensional molecular embeddings from SMILES strings
- Multi-conformer 3D analysis with E(3)-equivariant neural network
- GPU-accelerated conformer generation and feature extraction
- Supports application/json, application/jsonlines, text/plain, and text/csv input formats
- Real-time inference and batch transform
- State-of-the-art performance on TDC ADMET Benchmark (#1 in 8 of 22 datasets) and TDC DTI BindingDB_Patent (+0.1 PCC over leaderboard #1)
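For the batch transform mode listed above, a job is typically started with the boto3 `sagemaker` client's `create_transform_job` call. The sketch below only builds the request for a `text/plain` input file with one SMILES string per line. The job name, model name, and S3 URIs are placeholders, and whether the container expects line-split, multi-record batches is an assumption to verify against the bundled documentation.

```python
def build_transform_request(job_name, model_name, s3_input_uri, s3_output_uri,
                            instance_type="ml.g5.4xlarge", instance_count=1):
    """Return keyword arguments for sagemaker_client.create_transform_job."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        # Assumption: several lines may be sent to the container per request.
        "BatchStrategy": "MultiRecord",
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": s3_input_uri}
            },
            "ContentType": "text/plain",
            "SplitType": "Line",  # one SMILES string per line
        },
        "TransformOutput": {"S3OutputPath": s3_output_uri},
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


# Usage (requires AWS credentials and an existing SageMaker model):
#   import boto3
#   request = build_transform_request(
#       "zao-batch-1", "zao", "s3://my-bucket/in/", "s3://my-bucket/out/")
#   boto3.client("sagemaker").create_transform_job(**request)
```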
Additional details
Inputs
- Summary
ZAO accepts SMILES (Simplified Molecular Input Line Entry System) strings representing drug-like molecules.
Supported formats:
- application/json: A JSON object with a "smiles" key containing an array of SMILES strings. Example: {"smiles": ["CC(=O)Oc1ccccc1C(=O)O", "CC(C)Cc1ccc(C(C)C(=O)O)cc1"]}
- application/jsonlines: One JSON object per line, each with a "smiles" key. Example:
  {"smiles": "CC(=O)Oc1ccccc1C(=O)O"}
  {"smiles": "CC(C)Cc1ccc(C(C)C(=O)O)cc1"}
- text/plain: One SMILES string per line. Example:
  CC(=O)Oc1ccccc1C(=O)O
  CC(C)Cc1ccc(C(C)C(=O)O)cc1
- text/csv: One SMILES string per line (same as text/plain).
All preprocessing (standardization, 3D conformer generation, feature extraction) is handled inside the container. No external tools or preprocessing are required.
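The four request formats can be produced from the same list of SMILES strings. The helper below is a small sketch using only the standard library. The function name is illustrative and not part of the product.

```python
import json


def encode_smiles(smiles, content_type="application/json"):
    """Serialize SMILES strings into a request body for the given MIME type."""
    smiles = list(smiles)
    if content_type == "application/json":
        # A single object whose "smiles" key holds the whole array.
        return json.dumps({"smiles": smiles})
    if content_type == "application/jsonlines":
        # One object per line, each with a single SMILES string.
        return "\n".join(json.dumps({"smiles": s}) for s in smiles)
    if content_type in ("text/plain", "text/csv"):
        # One bare SMILES string per line.
        return "\n".join(smiles)
    raise ValueError(f"unsupported content type: {content_type}")
```

The returned string is what would be passed as the request body, with the matching MIME type set as the content type.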
- Limitations for input type
- Molecular weight must be between 100 and 1000 Da; molecules outside this range return null embeddings. Invalid SMILES strings also return null embeddings, together with an error message, and do not cause the rest of the request to fail. For real-time endpoints, the maximum recommended batch size is 10,000 SMILES per request.
- Input MIME type
- application/json, application/jsonlines, text/plain, text/csv
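To stay within the recommended 10,000 SMILES per real-time request, a larger collection can be split into batches on the client side. A minimal sketch follows; the constant and function names are illustrative.

```python
MAX_SMILES_PER_REQUEST = 10_000  # recommended ceiling for real-time endpoints


def chunk_smiles(smiles, size=MAX_SMILES_PER_REQUEST):
    """Yield consecutive batches of at most `size` SMILES strings."""
    if size < 1:
        raise ValueError("size must be at least 1")
    smiles = list(smiles)
    for start in range(0, len(smiles), size):
        yield smiles[start:start + size]
```

Each batch can then be sent as its own request, and the results concatenated in order.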
Support
Vendor support
Email: zao@syntheticgestalt.com
Response time: 2 business days
Support includes:
- Technical assistance with endpoint deployment and configuration
- Guidance on input/output formats and integration
- Troubleshooting inference errors
- Recommended instance type selection
Documentation and sample Jupyter notebook are included with the product.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.