Overview

Cedana AI Compute Fabric provides system-level checkpointing and migration for GPU workloads running on Amazon EKS and Slurm.
Cedana makes execution state portable across nodes and instances, allowing AI training, fine-tuning, inference, and distributed workloads to pause, move, and resume without losing progress.
Unlike application-level checkpoints, Cedana operates transparently at the system layer, requiring no code changes while preserving full process state, GPU memory, and distributed context.
By decoupling AI workloads from fixed infrastructure, Cedana increases GPU utilization and delivers up to 2x higher AI job throughput per GPU. Workloads automatically recover from node failures, spot interruptions, and maintenance events without restarting from scratch.
Teams can dynamically reprioritize jobs, rebalance clusters, consolidate underutilized GPUs, and safely run long jobs on Spot instances.
The result:
- Improved reliability
- Reduced wasted compute
- Lower cloud costs
- Shorter queue times
- Higher productivity per $/GPU
Cedana integrates in minutes with Amazon EKS and Slurm environments, and supports single-node and distributed multi-GPU/CPU workloads.
Ideal for AI startups, research labs, enterprises, and platform teams operating multi-tenant GPU clusters, Cedana enables infrastructure automation, spot resilience, SLA enforcement, and efficient AI factory operations across AWS.
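For contrast with the transparent, system-level approach described above, application-level checkpointing requires the training code to save and reload its own state. A minimal generic sketch (the file name, loop, and values are illustrative only; this is not Cedana code):

```python
# Application-level checkpointing: the training code itself must persist
# and restore its state. Everything here is a generic illustration of the
# approach that system-level checkpointing replaces.
import os
import pickle

CKPT = "checkpoint.pkl"  # illustrative checkpoint path

def train(steps: int) -> dict:
    # Resume from a prior checkpoint if one exists.
    state = {"step": 0, "loss": None}
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            state = pickle.load(f)
    while state["step"] < steps:
        state["step"] += 1
        state["loss"] = 1.0 / state["step"]  # stand-in for a real update
        with open(CKPT, "wb") as f:          # explicit save every step
            pickle.dump(state, f)
    return state

print(train(5)["step"])  # 5
```

An application-level checkpoint captures only what the author chose to serialize; GPU memory, optimizer state, open connections, and distributed context must each be handled by hand, which is the gap a system-level approach closes.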
Highlights
- Cloud-Native GPU Checkpointing for Amazon EKS: Automatically checkpoint and migrate AI workloads across Amazon EKS without code changes. Preserve full execution state, including GPU memory and distributed processes, enabling seamless recovery from node failures, spot interruptions, and autoscaling events.
- Increase Throughput 2x and Reduce GPU Wait Times: Boost AI training and inference throughput by eliminating lost work from failures and preemptions. Cedana improves GPU utilization, enables dynamic job prioritization on Amazon EKS and Slurm, and reduces queue times across multi-tenant GPU clusters.
- Automate Spot Instances for Long-Running AI Jobs: Run training and stateful inference workloads reliably on Amazon EC2 Spot Instances without losing progress. Cedana automatically checkpoints and resumes GPU workloads across interruptions, enabling resilient Spot usage, lower cloud costs, and significantly higher throughput per $/GPU on Amazon EKS.
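The throughput and cost-per-GPU claims above rest on avoiding lost work at interruptions. A back-of-the-envelope sketch of the effect (the interruption interval and loss figures are illustrative assumptions, not vendor benchmarks):

```python
# Toy model: fraction of GPU-hours that produce retained progress on
# interruptible capacity. Without checkpointing, an interruption discards
# all work since the last restart; with checkpointing, only work since the
# last checkpoint. All numbers are illustrative assumptions.

def useful_fraction(interrupt_every_h: float, lost_per_interrupt_h: float) -> float:
    """Fraction of GPU-hours whose progress survives interruptions."""
    return max(0.0, 1.0 - lost_per_interrupt_h / interrupt_every_h)

# Assume a Spot interruption every 6 hours on average:
no_ckpt = useful_fraction(6.0, 3.0)    # restart loses ~half the interval's work
with_ckpt = useful_fraction(6.0, 0.1)  # resume loses only minutes of work

print(no_ckpt, with_ckpt)  # roughly 2x more useful GPU-hours retained
```

Under these assumed numbers, checkpointing roughly doubles the useful work per GPU-hour, which is the mechanism behind the "up to 2x" figure.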
Details
Pricing
| Dimension | Description | Cost/unit |
|---|---|---|
| $/GB | Instance Memory under Management | $2.00 |
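As a worked example of the pricing dimension above (the 80 GB workload size is hypothetical, used only to show the arithmetic):

```python
# Illustrative cost estimate for the "$/GB Instance Memory under Management"
# dimension. The rate comes from the pricing table; the 80 GB figure is a
# hypothetical workload, not part of the listing.
RATE_PER_GB = 2.00  # USD per GB of instance memory under management

def cost(memory_gb: float) -> float:
    """Usage cost for a given amount of instance memory under management."""
    return memory_gb * RATE_PER_GB

print(cost(80.0))  # 160.0
```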
Vendor refund policy
Contact our support team for refund information.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications to customers over the internet on a subscription basis. You pay recurring usage fees through your AWS bill, while the vendor handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Resources
Vendor resources
Support
Vendor support
For support, email support@cedana.ai.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.