

By: Subtree Latest Version: DotScience CloudFormation Stack 0.3.1

Product Overview

End-to-End Data Engineering & Machine Learning Features

Run Tracker

Dotscience tracks, packages and links together every run that goes into the data engineering and model creation process. Discover previous work and see exactly how it was built by tracking every version of every element in the model development phase.

Data Versioning

As part of tracking runs, Dotscience bundles with each run a complete snapshot of the project workspace filesystem and any dependent datasets, using copy-on-write technology so that no more disk space is used than reproducibility requires.

Provenance Graph

Trace from a model back to its training data, and from there to the raw data, so that if stakeholders contest decisions made by a model, you can forensically reproduce its inferences and, if there are issues, isolate and fix them.
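
As a rough illustration of what tracing provenance involves, the sketch below records each run as a set of input and output artifacts and walks backwards from a model to the raw data it derives from. The `Run` record, the file names and the `trace_to_sources` function are hypothetical and are not the Dotscience API.

```python
from collections import namedtuple

# Hypothetical record of one run: the artifacts it read and the ones it wrote.
Run = namedtuple("Run", ["name", "inputs", "outputs"])


def trace_to_sources(artifact, runs):
    """Return the raw artifacts (those no run produced) that `artifact` derives from."""
    # Map every output artifact to the run that produced it.
    produced_by = {out: run for run in runs for out in run.outputs}
    sources, seen, stack = set(), set(), [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        run = produced_by.get(current)
        if run is None:
            # Nothing produced this artifact, so it is raw data.
            sources.add(current)
        else:
            # Keep walking backwards through the run's inputs.
            stack.extend(run.inputs)
    return sources


# Illustrative run history (made-up names).
runs = [
    Run("clean", inputs=["raw.csv"], outputs=["clean.csv"]),
    Run("train", inputs=["clean.csv", "labels.csv"], outputs=["model.pkl"]),
]
print(sorted(trace_to_sources("model.pkl", runs)))  # ['labels.csv', 'raw.csv']
```

Walking the graph in this direction answers the question "which raw data fed this model?"; reversing the mapping would answer "which models were built from this dataset?".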

Metric Explorer

Dotscience gives data science and ML engineering teams the unique ability to collaboratively track, record and share run metrics. Explore historic runs, and see relationships between hyperparameters & metrics so that you can gain insights into which hyperparameters to tune next, and make better decisions about where to invest time & effort.
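
To make the idea of relating hyperparameters to metrics concrete, here is a minimal sketch that groups recorded runs by one hyperparameter value and averages a metric for each group. The run records, the `learning_rate` and `accuracy` names and the `mean_metric_by_param` helper are illustrative assumptions, not Dotscience's data model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: each run carries its hyperparameters and metrics.
runs = [
    {"params": {"learning_rate": 0.1}, "metrics": {"accuracy": 0.80}},
    {"params": {"learning_rate": 0.1}, "metrics": {"accuracy": 0.84}},
    {"params": {"learning_rate": 0.01}, "metrics": {"accuracy": 0.90}},
]


def mean_metric_by_param(runs, param, metric):
    """Average `metric` over all runs that share the same value of `param`."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["params"][param]].append(run["metrics"][metric])
    return {value: mean(scores) for value, scores in grouped.items()}


summary = mean_metric_by_param(runs, "learning_rate", "accuracy")
# Pick the hyperparameter value with the highest average metric.
best_value = max(summary, key=summary.get)
print(summary, best_value)
```

A summary like this suggests which region of a hyperparameter to explore next; with only a few runs per value, it is a hint rather than evidence.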

S3 Datasets

Attach versioned S3 buckets as Dotscience Datasets while still tracking reproducibility & provenance all the way to the source. Dotscience mirrors the S3 dataset to local storage for extreme performance and reduced latency, and keeps track of which versions are accessed during data engineering and model training for you.

Bring Your Own Compute

Attach any compute as a runner: a laptop, GPU rig, enterprise data center or cloud instance. Be productive on a new runner in seconds, as Dotscience ensures an identical development environment even when you switch runners. Dotscience handles the storage and network complexity; all you need is an internet connection and Docker.

Auto Scaling

Simplify resource management by automatically provisioning runners on-demand from cloud infrastructure as users need them, enabling data scientists to self-serve Jupyter and script execution environments. Switch from CPU runners to GPU and back again seamlessly as requirements dictate. Optimize cloud spend with automatic shutdown when runners are idle. All changes are backed up to the hub when runners are shut down, as well as every time a run is recorded.
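
The idle-shutdown policy described above can be sketched as a simple timeout check. The `Runner` class, the `runners_to_shut_down` function and the 30-minute default timeout are assumptions made for illustration; the listing does not say how Dotscience decides that a runner is idle.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    name: str
    last_active: float  # time of last recorded activity, in seconds


def runners_to_shut_down(runners, now, idle_timeout=1800.0):
    """Return the names of runners that have been idle for at least `idle_timeout` seconds."""
    return [r.name for r in runners if now - r.last_active >= idle_timeout]


# Illustrative fleet: cpu-1 has been idle for 2000 s, gpu-1 for 500 s.
runners = [Runner("cpu-1", last_active=1000.0), Runner("gpu-1", last_active=2500.0)]
print(runners_to_shut_down(runners, now=3000.0))  # ['cpu-1']
```

In a real system, the shutdown step would run only after the runner's changes had been backed up, matching the backup-on-shutdown behavior described above.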




Delivery Methods

  • CloudFormation Template
