ZAO - 4D Molecular Foundation Model

4D molecular foundation model that generates 2048-dimensional embeddings from SMILES strings. Processes multiple 3D conformations per molecule for drug discovery property prediction and virtual screening. State-of-the-art on TDC ADMET and DTI benchmarks.

Overview

ZAO is a 4D molecular foundation model developed by SyntheticGestalt for drug discovery and molecular property prediction. Unlike conventional models that represent molecules as 1D sequences or 2D graphs, ZAO processes multiple 3D conformations of each molecule simultaneously, capturing spatial arrangements, molecular flexibility, and charge distributions that flat representations miss.

Send SMILES strings via REST API and receive 2048-dimensional molecular embeddings. These embeddings serve as drop-in features for any downstream ML model, including CatBoost, XGBoost, scikit-learn, and neural networks. All preprocessing, including conformer generation and feature extraction, is handled inside the container with no external tools required.

ZAO embeddings fed into a simple CatBoost model achieve state-of-the-art performance across standard drug discovery benchmarks. On the TDC ADMET Benchmark (22 datasets), ZAO ranks #1 on the leaderboard in 8 datasets, including clearance, solubility, lipophilicity, CYP metabolism, and toxicity prediction, and in the top 3 in 9 datasets. On TDC DTI BindingDB_Patent, ZAO achieves a Pearson correlation coefficient (PCC) of 0.687 (+0.1 over the previous leaderboard #1).

Processing multiple conformers per molecule is critical for accuracy. Ablation studies show that using 10 conformers instead of a single conformer improves activity prediction R² by 12% and property prediction R² by 4.8%, which indicates that conformational flexibility carries information that a single 3D structure does not.

ZAO is designed for drug-like compounds with molecular weight between 100 and 1000 Da. Molecules outside this range return null embeddings.
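Because out-of-range molecules come back as null embeddings, downstream training code has to drop those rows while keeping the labels aligned. The following is a minimal sketch, assuming the embeddings have already been parsed into a Python list in which each null became `None`; the helper name `drop_null_embeddings` is hypothetical and not part of the product.

```python
from typing import List, Optional, Sequence, Tuple

Embedding = List[float]


def drop_null_embeddings(
    embeddings: Sequence[Optional[Embedding]],
    labels: Sequence[float],
) -> Tuple[List[Embedding], List[float], List[int]]:
    """Keep only rows with a non-null embedding.

    Returns the surviving embeddings, their labels, and the indices of the
    dropped rows so the skipped molecules can be reported.
    """
    if len(embeddings) != len(labels):
        raise ValueError("embeddings and labels must have the same length")
    kept_x: List[Embedding] = []
    kept_y: List[float] = []
    dropped: List[int] = []
    for i, (emb, label) in enumerate(zip(embeddings, labels)):
        if emb is None:
            # Assumption: a null embedding is represented as None after parsing.
            dropped.append(i)
        else:
            kept_x.append(list(emb))
            kept_y.append(label)
    return kept_x, kept_y, dropped
```

The kept lists can then be passed as features and targets to whichever downstream model is in use.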

Volume discounts are available through AWS Marketplace Private Offers. For large-scale or enterprise usage, please contact zao@syntheticgestalt.com to request a Private Offer with custom pricing and terms.

Highlights

Confidential by deployment: Runs as a SageMaker Model Package entirely within your AWS account, so proprietary SMILES and embeddings never leave your VPC. Access a state-of-the-art molecular foundation model without sharing any compound structures or results with SyntheticGestalt or any third party.
4D molecular understanding: Processes multiple 3D conformers per molecule, capturing spatial arrangements and molecular flexibility that 1D/2D models miss. Zero-setup inference with all preprocessing including conformer generation handled inside the container.
State-of-the-art accuracy: #1 on TDC leaderboard in 8 of 22 ADMET benchmarks including clearance, solubility, lipophilicity, CYP metabolism, and toxicity prediction. Exceeds TDC DTI leaderboard #1 by +0.1 PCC.

Details

Sold by

SyntheticGestalt

Pricing

Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.

Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

Usage costs (7)

Dimension	Description	Cost/host/hour
ml.g5.4xlarge Inference (Batch) (Recommended)	Model inference on the ml.g5.4xlarge instance type, batch mode	$1,296.00
ml.g6e.2xlarge Inference (Real-Time) (Recommended)	Model inference on the ml.g6e.2xlarge instance type, real-time mode	$1,296.00
ml.g4dn.xlarge Inference (Batch)	Model inference on the ml.g4dn.xlarge instance type, batch mode	$1,296.00
ml.g5.2xlarge Inference (Real-Time)	Model inference on the ml.g5.2xlarge instance type, real-time mode	$1,296.00
ml.g5.4xlarge Inference (Real-Time)	Model inference on the ml.g5.4xlarge instance type, real-time mode	$1,296.00
ml.p3.2xlarge Inference (Batch)	Model inference on the ml.p3.2xlarge instance type, batch mode	$1,296.00
ml.p3.2xlarge Inference (Real-Time)	Model inference on the ml.p3.2xlarge instance type, real-time mode	$1,296.00
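
Since every dimension in the table lists the same rate, the software charge is simply hosts times hours times that rate. The sketch below illustrates the arithmetic. It covers the software charge only (AWS infrastructure is billed separately), and it assumes fractional hours are billed proportionally, which this listing does not confirm. The function name is hypothetical.

```python
from decimal import Decimal

# Software rate listed in the table above (USD per host per hour).
SOFTWARE_RATE_PER_HOST_HOUR = Decimal("1296.00")


def software_cost(hosts: int, hours: Decimal) -> Decimal:
    """Estimate the software charge only; infrastructure costs are extra."""
    if hosts < 1 or hours < 0:
        raise ValueError("hosts must be >= 1 and hours must be >= 0")
    total = SOFTWARE_RATE_PER_HOST_HOUR * hosts * hours
    return total.quantize(Decimal("0.01"))
```

For example, `software_cost(1, Decimal("2.5"))` evaluates to `Decimal("3240.00")` under these assumptions.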

Vendor refund policy

No refunds. This product is billed based on actual usage (hourly). You can delete your SageMaker endpoint at any time to stop incurring charges. For questions, contact zao@syntheticgestalt.com.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

Amazon SageMaker model

An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.

Deploy the model on Amazon SageMaker AI using the following options:

Real-time inference

Deploy the model as an API endpoint for your applications. When you send data to the endpoint, SageMaker processes it and returns results in the API response. The endpoint runs continuously until you delete it. You're billed for software and SageMaker infrastructure costs while the endpoint runs. AWS Marketplace models don't support Amazon SageMaker Asynchronous Inference. For more information, see Deploy models for real-time inference.

Batch transform

Deploy the model to process batches of data stored in Amazon Simple Storage Service (Amazon S3). SageMaker runs the job, processes your data, and returns results to Amazon S3. When complete, SageMaker stops the model. You're billed for software and SageMaker infrastructure costs only during the batch job. Duration depends on your model, instance type, and dataset size. AWS Marketplace models don't support Amazon SageMaker Asynchronous Inference. For more information, see Batch transform for inference with Amazon SageMaker AI.
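
A batch transform job can be started through the SageMaker `CreateTransformJob` API. The sketch below only builds the request arguments, assuming a SageMaker model has already been created from the subscribed model package and that the input is a JSON Lines file in S3. The job name, model name, and S3 URIs are placeholders, and the default instance type is the one this listing marks as recommended for batch mode.

```python
from typing import Any, Dict


def build_transform_job_request(
    job_name: str,
    model_name: str,
    input_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.g5.4xlarge",
) -> Dict[str, Any]:
    """Build keyword arguments for the SageMaker CreateTransformJob API."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            # One {"smiles": ...} object per line, split on line boundaries.
            "ContentType": "application/jsonlines",
            "SplitType": "Line",
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri, "AssembleWith": "Line"},
        "TransformResources": {"InstanceType": instance_type, "InstanceCount": 1},
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("sagemaker").create_transform_job(**request)`.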

Version release notes

Initial release of ZAO - 4D Molecular Foundation Model.

Features:

2048-dimensional molecular embeddings from SMILES strings
Multi-conformer 3D analysis with E(3)-equivariant neural network
GPU-accelerated conformer generation and feature extraction
Supports application/json, application/jsonlines, text/plain, and text/csv input formats
Real-time inference and batch transform
State-of-the-art performance on TDC ADMET Benchmark (#1 in 8 of 22 datasets) and TDC DTI BindingDB_Patent (+0.1 PCC over leaderboard #1)

Additional details

Inputs

Summary: ZAO accepts SMILES (Simplified Molecular Input Line Entry System) strings representing drug-like molecules.

Supported formats:

application/json: A JSON object with a "smiles" key containing an array of SMILES strings. Example:
{"smiles": ["CC(=O)Oc1ccccc1C(=O)O", "CC(C)Cc1ccc(C(C)C(=O)O)cc1"]}

application/jsonlines: One JSON object per line, each with a "smiles" key. Example:
{"smiles": "CC(=O)Oc1ccccc1C(=O)O"}
{"smiles": "CC(C)Cc1ccc(C(C)C(=O)O)cc1"}

text/plain: One SMILES string per line. Example:
CC(=O)Oc1ccccc1C(=O)O
CC(C)Cc1ccc(C(C)C(=O)O)cc1

text/csv: One SMILES string per line (same as text/plain).
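
The supported formats above can be produced from a list of SMILES strings with the standard library alone. This is a minimal sketch. The function names are hypothetical helpers, not part of the product.

```python
import json
from typing import Iterable


def to_json(smiles: Iterable[str]) -> str:
    """application/json: one object with a "smiles" array."""
    return json.dumps({"smiles": list(smiles)})


def to_jsonlines(smiles: Iterable[str]) -> str:
    """application/jsonlines: one {"smiles": ...} object per line."""
    return "\n".join(json.dumps({"smiles": s}) for s in smiles)


def to_plain(smiles: Iterable[str]) -> str:
    """text/plain or text/csv: one SMILES string per line."""
    return "\n".join(smiles)
```

The returned string is the request body; the matching MIME type goes in the request's content type.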

All preprocessing (standardization, 3D conformer generation, feature extraction) is handled inside the container. No external tools or preprocessing are required.

Limitations for input type: Molecular weight must be between 100 and 1000 Da. Molecules outside this range return null embeddings. Invalid SMILES strings return null embeddings with an error message without failing the entire request. Maximum recommended batch size is 10,000 SMILES per request for real-time endpoints.
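
To stay within the recommended request size, a large library can be split into chunks before it is sent to a real-time endpoint. The following is a minimal sketch. `chunk_smiles` is a hypothetical helper, and the default size is the 10,000 figure stated above.

```python
from typing import Iterator, List, Sequence

# Recommended maximum number of SMILES per real-time request, as stated above.
MAX_SMILES_PER_REQUEST = 10_000


def chunk_smiles(
    smiles: Sequence[str], size: int = MAX_SMILES_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` SMILES strings."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(smiles), size):
        yield list(smiles[start:start + size])
```

Because input order is preserved within and across chunks, results can be concatenated in the order the chunks were sent.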

Input MIME type: application/json, application/jsonlines, text/plain, text/csv

Real-time inference sample input data

{"smiles": ["CC(=O)Oc1ccccc1C(=O)O", "CC(C)Cc1ccc(C(C)C(=O)O)cc1", "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"]}
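
A payload like the sample above could be sent to a deployed endpoint through the SageMaker Runtime `InvokeEndpoint` API. The sketch below takes the client as a parameter so it can be exercised without AWS access. The response schema is not described in this section, so the code assumes the endpoint replies with JSON and returns the parsed body unchanged. The helper name and any endpoint name are hypothetical.

```python
import json
from typing import Any, List


def embed_smiles(client: Any, endpoint_name: str, smiles: List[str]) -> Any:
    """Send one application/json request and return the decoded response.

    `client` is expected to be a boto3 "sagemaker-runtime" client.
    Assumption: the endpoint replies with a JSON body.
    """
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"smiles": smiles}),
    )
    # The response body is a stream; read it fully before parsing.
    return json.loads(response["Body"].read())
```

With boto3 installed, a call might look like `embed_smiles(boto3.client("sagemaker-runtime"), "my-zao-endpoint", ["CC(=O)Oc1ccccc1C(=O)O"])`.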

Batch transform sample input data

{"smiles": "CC(=O)Oc1ccccc1C(=O)O"}
{"smiles": "CC(C)Cc1ccc(C(C)C(=O)O)cc1"}
{"smiles": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"}

Support

Vendor support

Email: zao@syntheticgestalt.com
Response time: 2 business days

Support includes:

Technical assistance with endpoint deployment and configuration
Guidance on input/output formats and integration
Troubleshooting inference errors
Recommended instance type selection

Documentation and a sample Jupyter notebook are included with the product.

AWS infrastructure support

AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
