Machine Learning Service – Amazon SageMaker Pricing

Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML. SageMaker supports the leading ML frameworks, toolkits, and programming languages.

With SageMaker, you pay only for what you use. You have two payment options: On-Demand Pricing, which has no minimum fees and no upfront commitments, and SageMaker Savings Plans, which offer a flexible, usage-based pricing model in exchange for a commitment to a consistent amount of usage.

Amazon SageMaker Free Tier

Amazon SageMaker is free to try as part of the AWS Free Tier. Your free tier starts in the first month when you create your first SageMaker resource. The details of the free tier for Amazon SageMaker are in the table below.

Amazon SageMaker capability	Free Tier usage per month for the first 2 months
Studio notebooks and notebook instances	250 hours of ml.t3.medium instance on Studio notebooks OR 250 hours of ml.t2.medium instance or ml.t3.medium instance on notebook instances
RStudio on SageMaker	250 hours of ml.t3.medium instance on RSession app AND free ml.t3.medium instance for RStudioServerPro app
Data Wrangler	25 hours of ml.m5.4xlarge instance
Feature Store	10 million write units, 10 million read units, 25 GB storage (standard online store)
Training	50 hours of m4.xlarge or m5.xlarge instances
Amazon SageMaker with TensorBoard	300 hours of ml.r5.large instance
Real-Time Inference	125 hours of m4.xlarge or m5.xlarge instances
Serverless Inference	150,000 seconds of on-demand inference duration
Canvas	160 hours/month for session time
HyperPod	50 hours of m5.xlarge instance

AWS Pricing Calculator

Calculate your Amazon SageMaker and architecture cost in a single estimate.


On-Demand Pricing

Studio Classic
Amazon SageMaker Studio Classic
Studio Classic offers one-step Jupyter notebooks in our legacy IDE experience. The underlying compute resources are fully elastic and the notebooks can be easily shared with others, allowing seamless collaboration. You are charged for the instance type you choose, based on the duration of use.
JupyterLab
Amazon SageMaker JupyterLab
Launch fully managed JupyterLab in seconds. Use the latest web-based interactive development environment for notebooks, code, and data. You are charged for the instance type you choose, based on the duration of use.
Code Editor
Amazon SageMaker Code Editor
Code Editor, based on Code-OSS (Visual Studio Code – Open Source), enables you to write, test, debug, and run your analytics and ML code. It is fully integrated with SageMaker Studio and supports IDE extensions available in the Open VSX extension registry.
RStudio
RStudio offers on-demand cloud compute resources to accelerate model development and improve productivity. You are charged for the instance types you choose to run the RStudio Session app and the RStudio Server Pro app.

Notebook Instances
Notebook instances are compute instances running the Jupyter notebook app. You are charged for the instance type you choose, based on the duration of use.
Processing
Amazon SageMaker Processing
Amazon SageMaker Processing lets you easily run your pre-processing, post-processing, and model evaluation workloads on fully managed infrastructure. You are charged for the instance type you choose, based on the duration of use.
TensorBoard
Amazon SageMaker with TensorBoard
Amazon SageMaker with TensorBoard provides a hosted TensorBoard experience to visualize and debug model convergence issues for Amazon SageMaker training jobs.
Data Wrangler
Amazon SageMaker Data Wrangler

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning from weeks to minutes. You pay for the time used to cleanse, explore, and visualize data. Customers running SageMaker Data Wrangler instances are subject to the pricing below. Customers running SageMaker Data Wrangler on SageMaker Canvas workspace instances are subject to SageMaker Canvas pricing. See the SageMaker Canvas pricing page for more details.

Amazon SageMaker Data Wrangler Jobs

An Amazon SageMaker Data Wrangler job is created when a data flow is exported from SageMaker Data Wrangler. With SageMaker Data Wrangler jobs, you can automate your data preparation workflows. SageMaker Data Wrangler jobs help you reapply your data preparation workflows on new datasets to help save you time, and are billed by the second.
Feature Store
Amazon SageMaker Feature Store
Amazon SageMaker Feature Store is a central repository to ingest, store and serve features for machine learning. You are charged for feature group writes, reads, and data storage in SageMaker Feature Store, with different pricing for the standard online store and in-memory online store.

For the standard online store, data storage is charged per GB per month. For throughput, you can choose between on-demand and provisioned capacity mode. For on-demand, writes are charged as write request units per KB and reads are charged as read request units per 4 KB. For provisioned capacity mode, you specify the read and write capacity that you expect your application to require. SageMaker Feature Store charges one WCU for each write per second (up to 1 KB) and one RCU for each read per second (up to 4 KB). You will be charged for the throughput capacity (reads and writes) you provision for your feature group, even if you do not fully utilize the provisioned capacity.

For the in-memory online store, writes are charged as write request units per KB with a minimum of 1 unit per write, reads are charged as read request units per KB with a minimum of 1 unit per read, and data storage is charged per GB per hour. There is a minimum data storage charge of 5 GiB (5.37 GB) per hour for the in-memory online store.
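The 5 GiB storage floor for the in-memory online store can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes the floor is applied as a simple maximum over the stored amount for each billed hour.

```python
# 1 GiB is 1024**3 bytes; 1 GB is 1000**3 bytes.
GIB_IN_GB = 1024**3 / 1000**3

# The minimum billed storage of 5 GiB, expressed in decimal GB (about 5.37 GB).
MIN_BILLED_GB = 5 * GIB_IN_GB


def billed_in_memory_storage_gb(stored_gb: float) -> float:
    """Return the storage amount billed for one hour (hypothetical helper).

    Assumes the minimum is applied as max(stored, 5 GiB).
    """
    return max(stored_gb, MIN_BILLED_GB)


small_store = billed_in_memory_storage_gb(2.0)   # below the floor
large_store = billed_in_memory_storage_gb(10.0)  # above the floor
```

Under this assumption, a feature group holding 2 GB is billed as if it held about 5.37 GB, while one holding 10 GB is billed for 10 GB.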
Training
Amazon SageMaker Training
Amazon SageMaker makes it easy to train machine learning (ML) models by providing everything you need to train, tune, and debug models. You are charged for usage of the instance type you choose. When you use Amazon SageMaker Debugger to debug issues and monitor resources during training, you can use built-in rules to debug your training jobs or write your own custom rules. There is no charge to use built-in rules to debug your training jobs. For custom rules, you are charged for the instance type you choose, based on the duration of use.
MLflow
Amazon SageMaker with MLflow
Amazon SageMaker with MLflow allows customers to pay only for what they use. Customers pay for MLflow Tracking Servers based on compute and storage costs.

Customers will pay for compute based on the size of the Tracking Server and number of hours it was running. In addition, customers will pay for any metadata stored on the MLflow Tracking Server.
Real-Time Inference
Amazon SageMaker Hosting: Real-Time Inference
Amazon SageMaker provides real-time inference for your use cases needing real-time predictions. You are charged for usage of the instance type you choose. When you use Amazon SageMaker Model Monitor to maintain highly accurate models providing real-time inference, you can use built-in rules to monitor your models or write your own custom rules. For built-in rules, you get up to 30 hours of monitoring at no charge. Additional charges will be based on duration of usage. You are charged separately when you use your own custom rules.
Asynchronous Inference
Amazon SageMaker Asynchronous Inference:
Amazon SageMaker Asynchronous Inference is a near-real-time inference option that queues incoming requests and processes them asynchronously. Use this option when you need to process large payloads as the data arrives or run models that have long inference processing times and do not have sub-second latency requirements. You are charged for the type of instance you choose.
Batch Transform
Amazon SageMaker Batch Transform
Using Amazon SageMaker Batch Transform, there is no need to break down your data set into multiple chunks or manage real-time endpoints. SageMaker Batch Transform allows you to run predictions on large or small batch datasets. You are charged for the instance type you choose, based on the duration of use.
Serverless Inference
Amazon SageMaker Serverless Inference
Amazon SageMaker Serverless Inference enables you to deploy machine learning models for inference without configuring or managing any of the underlying infrastructure. You can either use on-demand Serverless Inference or add Provisioned Concurrency to your endpoint for predictable performance.

With on-demand Serverless Inference, you only pay for the compute capacity used to process inference requests, billed by the millisecond, and the amount of data processed. The compute charge depends on the memory configuration you choose.

Provisioned Concurrency

Optionally, you can also enable Provisioned Concurrency for your serverless endpoints. Provisioned Concurrency allows you to deploy models on serverless endpoints with predictable performance and high scalability by keeping your endpoints warm for a specified number of concurrent requests and a specified time. As with on-demand Serverless Inference, when Provisioned Concurrency is enabled, you pay for the compute capacity used to process inference requests, billed by the millisecond, and the amount of data processed. You also pay for Provisioned Concurrency usage, based on the memory configured, duration provisioned, and amount of concurrency enabled.
JumpStart
Amazon SageMaker JumpStart
Amazon SageMaker JumpStart helps you quickly and easily get started with machine learning with one-click access to popular model collections (also known as “model zoos”). JumpStart also offers end-to-end solutions that solve common ML use cases and can be customized for your needs. There is no additional charge for using JumpStart models or solutions. You will be charged for the underlying Training and Inference instance hours used, the same as if you had created them manually.
Profiler
Amazon SageMaker Profiler collects system-level data for visualization of high-resolution CPU and GPU trace plots. This tool is designed to help data scientists and engineers identify hardware-related performance bottlenecks in their deep learning models, saving end-to-end training time and cost. Currently, SageMaker Profiler only supports profiling of training jobs that use the ml.g4dn.12xlarge, ml.p3dn.24xlarge, and ml.p4d.24xlarge training compute instance types.

Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Europe (Ireland), and Israel (Tel Aviv).

Amazon SageMaker Profiler is currently in preview and available without cost to customers in supported regions.
HyperPod
Amazon SageMaker HyperPod
Amazon SageMaker HyperPod is purpose-built to accelerate foundation model (FM) development. To make FM training more resilient, it continuously monitors cluster health, repairs and replaces faulty nodes on the fly, and saves frequent checkpoints to automatically resume training without losing progress. SageMaker HyperPod is pre-configured with SageMaker distributed training libraries that enable you to improve FM training performance while fully utilizing the cluster’s compute and network infrastructure.
Inference optimization
The inference optimization toolkit makes it easy for you to implement the latest inference optimization techniques in order to achieve state-of-the-art (SOTA) cost performance on Amazon SageMaker, while saving months of developer time. You can choose from a menu of popular optimization techniques provided by SageMaker and run optimization jobs ahead of time, benchmark the model for performance and accuracy metrics, and then deploy the optimized model to a SageMaker endpoint for inference.

Instance details

Amazon SageMaker P5 instance product details

Instance Size	vCPUs	Instance Memory (TiB)	GPU Model	GPUs	Total GPU memory (GB)	Memory per GPU (GB)	Network Bandwidth (Gbps)	GPUDirect RDMA	GPU Peer to Peer	Instance Storage (TB)	EBS Bandwidth (Gbps)
ml.p5.48xlarge	192	2	NVIDIA H100	8	640 HBM3	80	3200 EFAv2	Yes	900 GB/s NVSwitch	8x3.84 NVMe SSD	80

Amazon SageMaker P4d instance product details

Instance Size	vCPUs	Instance Memory (GiB)	GPU Model	GPUs	Total GPU memory (GB)	Memory per GPU (GB)	Network Bandwidth (Gbps)	GPUDirect RDMA	GPU Peer to Peer	Instance Storage (GB)	EBS Bandwidth (Gbps)
ml.p4d.24xlarge	96	1152	NVIDIA A100	8	320 HBM2	40	400 ENA and EFA	Yes	600 GB/s NVSwitch	8x1000 NVMe SSD	19
ml.p4de.24xlarge	96	1152	NVIDIA A100	8	640 HBM2e	80	400 ENA and EFA	Yes	600 GB/s NVSwitch	8x1000 NVMe SSD	19

Amazon SageMaker P3 instance product details

Instance Size	vCPUs	Instance Memory (GiB)	GPU Model	GPUs	Total GPU memory (GB)	Memory per GPU (GB)	Network Bandwidth (Gbps)	GPU Peer to Peer	Instance Storage (GB)	EBS Bandwidth (Gbps)
ml.p3.2xlarge	8	61	NVIDIA V100	1	16	16	Up to 10	N/A	EBS-Only	1.5
ml.p3.8xlarge	32	244	NVIDIA V100	4	64	16	10	NVLink	EBS-Only	7
ml.p3.16xlarge	64	488	NVIDIA V100	8	128	16	25	NVLink	EBS-Only	14
ml.p3dn.24xlarge	96	768	NVIDIA V100	8	256	32	100	NVLink	2 x 900 NVMe SSD	19

Amazon SageMaker P2 instance product details

Instance Size	vCPUs	Instance Memory (GiB)	GPU Model	GPUs	Total GPU memory (GB)	Memory per GPU (GB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
ml.p2.xlarge	4	61	NVIDIA K80	1	12	12	Up to 10	High
ml.p2.8xlarge	32	488	NVIDIA K80	8	96	12	10	10
ml.p2.16xlarge	64	732	NVIDIA K80	16	192	12	25	20

Amazon SageMaker G4 instance product details

Instance Size	vCPUs	Instance Memory (GiB)	GPU Model	GPUs	Total GPU memory (GB)	Memory per GPU (GB)	Network Bandwidth (Gbps)	Instance Storage (GB)	EBS Bandwidth (Gbps)
ml.g4dn.xlarge	4	16	NVIDIA T4	1	16	16	Up to 25	1 x 125 NVMe SSD	Up to 3.5
ml.g4dn.2xlarge	8	32	NVIDIA T4	1	16	16	Up to 25	1 x 125 NVMe SSD	Up to 3.5
ml.g4dn.4xlarge	16	64	NVIDIA T4	1	16	16	Up to 25	1 x 125 NVMe SSD	4.75
ml.g4dn.8xlarge	32	128	NVIDIA T4	1	16	16	50	1 x 900 NVMe SSD	9.5
ml.g4dn.16xlarge	64	256	NVIDIA T4	1	16	16	50	1 x 900 NVMe SSD	9.5
ml.g4dn.12xlarge	48	192	NVIDIA T4	4	64	16	50	1 x 900 NVMe SSD	9.5

Amazon SageMaker G5 instance product details

Instance Size	vCPUs	Instance Memory (GiB)	GPU Model	GPUs	Total GPU Memory (GB)	Memory per GPU (GB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)	Instance Storage (GB)
ml.g5.xlarge	4	16	NVIDIA A10G	1	24	24	Up to 10	Up to 3.5	1x250
ml.g5.2xlarge	8	32	NVIDIA A10G	1	24	24	Up to 10	Up to 3.5	1x450
ml.g5.4xlarge	16	64	NVIDIA A10G	1	24	24	Up to 25	8	1x600
ml.g5.8xlarge	32	128	NVIDIA A10G	1	24	24	25	16	1x900
ml.g5.16xlarge	64	256	NVIDIA A10G	1	24	24	25	16	1x1900
ml.g5.12xlarge	48	192	NVIDIA A10G	4	96	24	40	16	1x3800
ml.g5.24xlarge	96	384	NVIDIA A10G	4	96	24	50	19	1x3800
ml.g5.48xlarge	192	768	NVIDIA A10G	8	192	24	100	19	2x3800

Amazon SageMaker Trn1 instance product details

Instance Size	vCPUs	Memory (GiB)	Trainium Accelerators	Total Accelerator Memory (GB)	Memory per Accelerator (GB)	Instance Storage (GB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
ml.trn1.2xlarge	8	32	1	32	32	1 x 500 NVMe SSD	Up to 12.5	Up to 20
ml.trn1.32xlarge	128	512	16	512	32	4 x 2000 NVMe SSD	800	80

Amazon SageMaker Inf1 instance product details

Instance Size	vCPUs	Memory (GiB)	Inferentia Accelerators	Total Accelerator Memory (GB)	Memory per Accelerator (GB)	Instance Storage	Inter-accelerator Interconnect	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
ml.inf1.xlarge	4	8	1	8	8	EBS only	N/A	Up to 25	Up to 4.75
ml.inf1.2xlarge	8	16	1	8	8	EBS only	N/A	Up to 25	Up to 4.75
ml.inf1.6xlarge	24	48	4	32	8	EBS only	Yes	25	4.75
ml.inf1.24xlarge	96	192	16	128	8	EBS only	Yes	100	19

Amazon SageMaker Inf2 instance product details

Instance Size	vCPUs	Memory (GiB)	Inferentia Accelerators	Total Accelerator Memory (GB)	Memory per Accelerator (GB)	Instance Storage	Inter-accelerator Interconnect	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
ml.inf2.xlarge	4	16	1	32	32	EBS only	N/A	Up to 25	Up to 10
ml.inf2.8xlarge	32	128	1	32	32	EBS only	N/A	Up to 25	10
ml.inf2.24xlarge	96	384	6	192	32	EBS only	Yes	50	30
ml.inf2.48xlarge	192	768	12	384	32	EBS only	Yes	100	60

Amazon SageMaker Studio

Amazon SageMaker Studio is a single web-based interface for complete ML development, offering a choice of fully managed integrated development environments (IDEs) and purpose-built tools. You can access SageMaker Studio free of charge. You are only charged for the underlying compute and storage that you use for different IDEs and ML tools within SageMaker Studio.

You can use many services from SageMaker Studio, AWS SDK for Python (Boto3), or AWS Command Line Interface (AWS CLI), including the following:

IDEs on SageMaker Studio to perform complete ML development with a broad set of fully managed IDEs, including JupyterLab, Code Editor based on Code-OSS (Visual Studio Code – Open Source), and RStudio
SageMaker Pipelines to automate and manage ML workflows
SageMaker Autopilot to automatically create ML models with full visibility
SageMaker Experiments to organize and track your training jobs and versions
SageMaker Debugger to debug anomalies during training
SageMaker Model Monitor to maintain high-quality models
SageMaker Clarify to better explain your ML models and detect bias
SageMaker JumpStart to easily deploy ML solutions for many use cases. You may incur charges from other AWS services used in the solution for the underlying API calls made by Amazon SageMaker on your behalf.
SageMaker Inference Recommender to get recommendations for the right endpoint configuration

You pay only for the underlying compute and storage resources within SageMaker or other AWS services, based on your usage.

To use Amazon Q Developer in JupyterLab, you must subscribe to Amazon Q Developer Pro. Amazon Q Developer pricing is available here.

Foundation model evaluations

SageMaker Clarify supports foundation model evaluations with both automatic and human-based evaluation methods. Each of these has different pricing. If you are evaluating a foundation model from Amazon SageMaker JumpStart that is not yet deployed to your account, SageMaker will temporarily deploy the JumpStart model on a SageMaker instance for the duration of the inference. The specific instance will conform to the instance recommendation provided by JumpStart for that model.

Automatic evaluation:
Foundation model evaluations run as a SageMaker Processing job. The evaluation job invokes SageMaker Inference. Customers are charged for the inference and for the evaluation job, and only for the duration of the evaluation job. The cost of the evaluation job is the sum of the hourly cost of the evaluation instance and the hourly cost of the hosting instance, each applied over the duration of the job.

Human-based evaluation:
When you use the human-based evaluation feature where you bring your own workforce, you are charged for three items: 1) SageMaker instance used for inference, 2) the instance used to run the SageMaker Processing Job that hosts the human evaluation, and 3) a charge of $0.21 per completed human evaluation task. A human task is defined as an occurrence of a human worker submitting an evaluation of a single prompt and its associated inference responses in the human evaluation user interface. The price is the same whether you have 1 or 2 models in your evaluation job or you bring your own inference and also the same regardless of how many evaluation dimensions and rating methods you include. The $0.21 per task pricing is the same for all AWS regions. There is no separate charge for the workforce, as the workforce is supplied by you.
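The three charge items for a bring-your-own-workforce evaluation can be combined as in the sketch below. Only the $0.21 per-task price comes from the text above; the function name and the instance cost figures in the example call are hypothetical placeholders, since actual instance charges depend on the instance types and durations you use.

```python
PRICE_PER_HUMAN_TASK = 0.21  # per completed human evaluation task, all Regions


def human_evaluation_charge(completed_tasks: int,
                            inference_instance_cost: float,
                            processing_instance_cost: float) -> float:
    """Sum the three charge items for a human-based evaluation job."""
    task_cost = completed_tasks * PRICE_PER_HUMAN_TASK
    return round(inference_instance_cost + processing_instance_cost + task_cost, 2)


# Hypothetical job: 500 completed tasks, $10.00 of inference instance time,
# and $4.00 of Processing job instance time.
example_total = human_evaluation_charge(500, 10.00, 4.00)
```

The per-task portion does not change with the number of models compared or the number of evaluation dimensions, so only the first argument drives that part of the bill.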

AWS-managed evaluation:
For an AWS-managed expert evaluation, pricing is customized for your evaluation needs in a private engagement while working with the AWS expert evaluations team.

Amazon SageMaker Studio Lab

You can build and train ML models using Amazon SageMaker Studio Lab for free. SageMaker Studio Lab offers developers, academics, and data scientists a no-configuration development environment to learn and experiment with ML at no additional charge.

Amazon SageMaker Canvas

Amazon SageMaker Canvas expands ML access by providing business analysts the ability to generate accurate ML predictions using a visual point-and-click interface—no coding or ML experience required.

Amazon SageMaker Data Labeling

Amazon SageMaker Data Labeling provides two data labeling offerings, Amazon SageMaker Ground Truth Plus and Amazon SageMaker Ground Truth. You can learn more about Amazon SageMaker Data Labeling, a fully managed data labeling service that makes it easy to build highly accurate training datasets for ML.

Amazon SageMaker shadow testing

SageMaker helps you run shadow tests to evaluate a new ML model before production release by testing its performance against the currently deployed model. There is no additional charge for SageMaker shadow testing other than usage charges for the ML instances and ML storage provisioned to host the shadow model. The pricing for ML instances and ML storage dimensions is the same as the real-time inference option specified in the preceding pricing table. There is no additional charge for data processed in and out of shadow deployments.

Amazon SageMaker Edge

Learn more about pricing for Amazon SageMaker Edge to optimize, run, and monitor ML models on fleets of edge devices.

Amazon SageMaker Savings Plans

Amazon SageMaker Savings Plans help to reduce your costs by up to 64%. The plans automatically apply to eligible SageMaker ML instance usage, including SageMaker Studio notebooks, SageMaker notebook instances, SageMaker Processing, SageMaker Data Wrangler, SageMaker Training, SageMaker Real-Time Inference, and SageMaker Batch Transform, regardless of instance family, size, or Region. For example, you can change usage from a CPU instance ml.c5.xlarge running in US East (Ohio) to an ml.inf1 instance in US West (Oregon) for inference workloads at any time and automatically continue to pay the Savings Plans price.


Total cost of ownership (TCO) with Amazon SageMaker

Amazon SageMaker offers at least 54% lower total cost of ownership (TCO) over a three-year period compared to other cloud-based self-managed solutions. Learn more with the complete TCO analysis for Amazon SageMaker.

Pricing examples

Pricing example #1: JupyterLab

As a data scientist, you spend 20 days using JupyterLab for quick experimentation on notebooks, code, and data for 6 hours per day on an ml.g4dn.xlarge instance. You create and then run a JupyterLab space to access the JupyterLab IDE. Compute is charged only for the instance used while the JupyterLab space is running. Storage charges for a JupyterLab space accrue until it is deleted.

Compute

Instance	Duration	Days	Total duration	Cost per hour	Total
ml.g4dn.xlarge	6 hours	20	6 * 20 = 120 hours	$0.7364	$88.368

Storage

You will be using 5 GB of General Purpose SSD storage for 480 hours (24 hours * 20 days). In a Region that charges $0.112 per GB-month:
$0.112 per GB-month * 5 GB * 480 hours / (24 hours/day * 30-day month) = $0.373
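The compute and storage figures in this example can be reproduced with a few lines of arithmetic. This is a minimal sketch, not a billing tool: the variable names are illustrative, and it uses the $0.112 per GB-month rate and 5 GB volume that appear in the storage calculation.

```python
# Compute: 6 hours per day for 20 days on ml.g4dn.xlarge.
HOURLY_RATE = 0.7364
compute_hours = 6 * 20
compute_cost = compute_hours * HOURLY_RATE

# Storage: 5 GB kept for 480 hours, prorated against a 30-day month.
STORAGE_RATE_PER_GB_MONTH = 0.112
storage_gb = 5
storage_hours = 24 * 20
hours_per_month = 24 * 30
storage_cost = STORAGE_RATE_PER_GB_MONTH * storage_gb * storage_hours / hours_per_month

print(f"compute: ${compute_cost:.3f}, storage: ${storage_cost:.3f}")
```

The proration step is the part most easily overlooked: storage is priced per GB-month, so 480 hours of use counts as 480/720 of a month.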

Pricing example #2: Code Editor

As an ML engineer, you spend 20 days using Code Editor for ML production code editing, execution, and debugging for 6 hours per day on an ml.g4dn.xlarge instance. You create and then run a Code Editor space to access the Code Editor IDE. Compute is charged only for the instance used while the Code Editor space is running. Storage charges for a Code Editor space accrue until it is deleted.

Compute

Instance	Duration	Days	Total duration	Cost per hour	Total
ml.g4dn.xlarge	6 hours	20	6 * 20 = 120 hours	$0.7364	$88.368

Storage

Pricing example #3: Studio Classic

A data scientist goes through the following sequence of actions while using notebooks in Amazon SageMaker Studio Classic.

Opens notebook 1 in a TensorFlow kernel on an ml.c5.xlarge instance and then works on this notebook for 1 hour.
Opens notebook 2 on an ml.c5.xlarge instance. It will automatically open in the same ml.c5.xlarge instance that is running notebook 1.
Works on notebook 1 and notebook 2 simultaneously for 1 hour.
The data scientist will be billed for a total of 2 hours of ml.c5.xlarge usage. For the overlapped hour where she worked on notebook 1 and notebook 2 simultaneously, each kernel application will be metered for 0.5 hours and she will be billed for 1 hour.

Kernel application	Notebook instance	Hours	Cost per hour	Total
TensorFlow	ml.c5.xlarge	1	$0.204	$0.204
TensorFlow	ml.c5.xlarge	0.5	$0.204	$0.102
Data Science	ml.c5.xlarge	0.5	$0.204	$0.102
				$0.408
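The metering rule in this example, where applications sharing an instance split each overlapping hour evenly, can be sketched as follows. The function name and the interval representation are hypothetical, and generalizing the even split to arbitrary overlaps is an assumption based on the two-notebook case described above.

```python
def meter_shared_instance(sessions):
    """Split instance time evenly among the apps active in each time slice.

    sessions maps an app name to its (start_hour, end_hour) on one instance.
    """
    # Every start or end time marks a boundary between time slices.
    boundaries = sorted({t for span in sessions.values() for t in span})
    metered = {app: 0.0 for app in sessions}
    for begin, end in zip(boundaries, boundaries[1:]):
        active = [app for app, (s, e) in sessions.items() if s <= begin and end <= e]
        for app in active:
            metered[app] += (end - begin) / len(active)
    return metered


HOURLY_RATE = 0.204  # ml.c5.xlarge, from the table above

# Notebook 1 (TensorFlow) runs for hours 0-2; notebook 2 (Data Science) joins for hour 1-2.
usage = meter_shared_instance({"TensorFlow": (0, 2), "Data Science": (1, 2)})
total_cost = sum(usage.values()) * HOURLY_RATE
```

The metered hours always sum to the wall-clock time the instance was in use, which is why the bill is for 2 hours rather than 3.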

Pricing example #4: RStudio

A data scientist goes through the following sequence of actions while using RStudio:

Launches RSession 1 on an ml.c5.xlarge instance, then works in this session for 1 hour.
Launches RSession 2 on an ml.c5.xlarge instance. It will automatically open in the same ml.c5.xlarge instance that is running RSession 1.
Works on RSession 1 and RSession 2 simultaneously for 1 hour.
The data scientist will be billed for a total of two (2) hours of ml.c5.xlarge usage. For the overlapped hour where she worked on RSession 1 and RSession 2 simultaneously, each RSession application will be metered for 0.5 hours and she will be billed for 1 hour.

Meanwhile, the RServer runs 24/7 regardless of whether any RSessions are running. If the admin chooses “Small” (ml.t3.medium), it is free of charge. If the admin chooses “Medium” (ml.c5.4xlarge) or “Large” (ml.c5.9xlarge), it is charged hourly as long as RStudio is enabled for the SageMaker Domain.

RSession app	RSession instance	Hours	Cost per hour	Total
Base R	ml.c5.xlarge	1	$0.204	$0.204
Base R	ml.c5.xlarge	0.5	$0.204	$0.102
Base R	ml.c5.xlarge	0.5	$0.204	$0.102
				$0.408

Pricing example #5: Processing

Amazon SageMaker Processing only charges you for the instances used while your jobs are running. When you provide the input data for processing in Amazon S3, Amazon SageMaker downloads the data from Amazon S3 to local file storage at the start of a processing job.

A data analyst runs a processing job to preprocess and validate data on two ml.m5.4xlarge instances for a job duration of 10 minutes. She uploads a dataset of 100 GB to S3 as input for the processing job, and the output data (which is roughly the same size) is stored back in S3.

Hours	Processing instances	Cost per hour	Total
1 * 2 * 0.167 = 0.334	ml.m5.4xlarge	$0.922	$0.308

General purpose (SSD) storage (GB)	Cost per GB-month	Total
100 GB * 2 = 200	$0.14	$0.0032

The subtotal for Amazon SageMaker Processing job = $0.308.
The subtotal for 200 GB of general purpose SSD storage = $0.0032.
The total price for this example would be $0.3112.
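The instance portion of this example follows directly from the instance count, job duration, and hourly rate. The sketch below reproduces it, rounding 10 minutes to 0.167 hours as the table does; the storage subtotal is taken as quoted rather than recomputed, and the variable names are illustrative.

```python
INSTANCE_COUNT = 2
JOB_HOURS = round(10 / 60, 3)   # 10 minutes, rounded to 0.167 hours as in the table
HOURLY_RATE = 0.922             # ml.m5.4xlarge

# One job run on two instances.
instance_hours = 1 * INSTANCE_COUNT * JOB_HOURS
processing_cost = round(instance_hours * HOURLY_RATE, 3)

STORAGE_SUBTOTAL = 0.0032       # quoted above for 200 GB of SSD storage
total_cost = round(processing_cost + STORAGE_SUBTOTAL, 4)

print(f"processing: ${processing_cost}, total: ${total_cost}")
```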

Pricing example #6: Data Wrangler

As a data scientist, you spend three days using Amazon SageMaker Data Wrangler to cleanse, explore, and visualize your data for 6 hours per day, for a total of 18 hours. To execute your data preparation pipeline, you then create a SageMaker Data Wrangler job that prepares updated data on a weekly basis. Each job lasts 40 minutes, and the job runs weekly for one month.

The table below summarizes your total usage for the month and the associated charges for using Amazon SageMaker Data Wrangler.

Application	SageMaker Studio instance	Days	Duration	Total duration	Cost per hour	Cost sub-total
SageMaker Data Wrangler	ml.m5.4xlarge	3	6 hours	18 hours	$0.922	$16.596
SageMaker Data Wrangler job	ml.m5.4xlarge	-	40 minutes	2.67 hours	$0.922	$2.461

Total monthly charges for using Data Wrangler = $16.596 + $2.461 = $19.057

Pricing example #7: Feature Store

You have a web application that issues reads and writes of 25 KB each to the Amazon SageMaker Feature Store. For the first 10 days of a month, you receive little traffic to your application, resulting in 10,000 writes and 10,000 reads each day to the SageMaker Feature Store. On day 11 of the month, your application gains attention on social media and application traffic spikes to 200,000 writes and 200,000 reads that day. Your application then settles into a more regular traffic pattern, averaging 80,000 writes and 80,000 reads each day through the end of the month.

The table below summarizes your total usage for the month and the associated charges for using Amazon SageMaker Feature Store.

Day of the month	Total writes	Total write units	Total reads	Total read units
Days 1 to 10	100,000 writes (10,000 writes * 10 days)	2,500,000 (100,000 * 25 KB)	100,000 reads (10,000 reads * 10 days)	700,000++ (100,000 * 25/4 KB)
Day 11	200,000 writes	5,000,000 (200,000 * 25 KB)	200,000 reads	1,400,000++ (200,000 * 25/4 KB)
Days 12 to 30	1,520,000 writes (80,000 * 19 days)	38,000,000 (1,520,000 * 25 KB)	1,520,000 reads (80,000 * 19 days)	10,640,000++ (1,520,000 * 25/4 KB)
Total chargeable units		45,500,000 write units		12,740,000 read units
Monthly charges for writes and reads		$56.875 (45.5 million write units * $1.25 per million writes)		$3.185 (12.74 million read units * $0.25 per million reads)

++ All fractional read units are rounded to the next whole number

Data storage
Total data stored = 31.5 GB
Monthly charges for data storage = 31.5 GB * $0.45 = $14.175

Total monthly charges for Amazon SageMaker Feature Store = $56.875 + $3.185 + $14.175 = $74.235
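The Feature Store figures for this example can be reproduced from the traffic pattern and the unit sizes. The following is a minimal sketch with illustrative variable names; it assumes one write unit per KB and one read unit per 4 KB, with fractional units rounded up per request, as the example's footnote describes.

```python
import math

PAYLOAD_KB = 25
WRITE_PRICE_PER_MILLION = 1.25
READ_PRICE_PER_MILLION = 0.25
STORAGE_PRICE_PER_GB_MONTH = 0.45
STORED_GB = 31.5

# Requests per day: 10 quiet days, one spike on day 11, then 19 regular days.
daily_requests = [10_000] * 10 + [200_000] + [80_000] * 19
total_requests = sum(daily_requests)  # same count for reads and writes

# Units per request, rounded up to the next whole unit.
write_units = total_requests * math.ceil(PAYLOAD_KB / 1)
read_units = total_requests * math.ceil(PAYLOAD_KB / 4)

write_cost = write_units / 1_000_000 * WRITE_PRICE_PER_MILLION
read_cost = read_units / 1_000_000 * READ_PRICE_PER_MILLION
storage_cost = STORED_GB * STORAGE_PRICE_PER_GB_MONTH
total_cost = write_cost + read_cost + storage_cost

print(f"writes ${write_cost:.3f} + reads ${read_cost:.3f} + storage ${storage_cost:.3f} = ${total_cost:.3f}")
```

Because each 25 KB read rounds up from 6.25 to 7 read units, the rounding accounts for 12% of the read charge in this example.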

Pricing example #8: Training

A data scientist has spent a week working on a model for a new idea. She trains the model 4 times on an ml.m4.4xlarge for 30 minutes per training run with Amazon SageMaker Debugger enabled using 2 built-in rules and 1 custom rule that she wrote. For the custom rule, she specified ml.m5.xlarge instance. She trains using 3 GB of training data in Amazon S3, and pushes 1 GB model output into Amazon S3. SageMaker creates general-purpose SSD (gp2) volumes for each training instance. SageMaker also creates general-purpose SSD (gp2) volumes for each rule specified. In this example, a total of 4 general-purpose SSD (gp2) volumes will be created. SageMaker Debugger emits 1 GB of debug data to the customer’s Amazon S3 bucket.

Hours	Training instance	Debug instance	Cost per hour	Subtotal
4 * 0.5 = 2.00	ml.m4.4xlarge	n/a	$0.96	$1.92
4 * 0.5 * 2 = 4	n/a	No additional charges for built-in rule instances	$0	$0
4 * 0.5 = 2	ml.m5.xlarge	n/a	$0.23	$0.46
				-------
				$2.38

	General purpose (SSD) storage for training (GB)	General purpose (SSD) storage for debugger built-in rules (GB)	General purpose (SSD) storage for debugger custom rules (GB)	Cost per GB-month	Subtotal
Capacity used	3	2	1
Cost	$0	No additional charges for built-in rule storage volumes	$0	$0.10	$0

The total charges for training and debugging in this example are $2.38. The compute instances and general purpose storage volumes used by Amazon SageMaker Debugger built-in rules do not incur additional charges.
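The instance charges in this example reduce to three line items, which the sketch below adds up. Variable names are illustrative; the rates and durations are the ones quoted in the example, and built-in rules are modeled with a zero rate because they carry no charge.

```python
RUNS = 4
HOURS_PER_RUN = 0.5

TRAINING_RATE = 0.96      # ml.m4.4xlarge, per hour
CUSTOM_RULE_RATE = 0.23   # ml.m5.xlarge, per hour
BUILT_IN_RULE_RATE = 0.0  # built-in Debugger rules are not charged
BUILT_IN_RULES = 2

training_cost = RUNS * HOURS_PER_RUN * TRAINING_RATE
built_in_rule_cost = RUNS * HOURS_PER_RUN * BUILT_IN_RULES * BUILT_IN_RULE_RATE
custom_rule_cost = RUNS * HOURS_PER_RUN * CUSTOM_RULE_RATE

total_cost = round(training_cost + built_in_rule_cost + custom_rule_cost, 2)
print(f"total: ${total_cost}")
```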

Pricing example #9: MLflow

You have two teams of data scientists: one with 10 data scientists and the other with 40. To accommodate these two teams, you enable two different MLflow Tracking Servers: one Small and one Medium. Each team is conducting machine learning (ML) experiments and needs to record the metrics, parameters, and artifacts produced by their training attempts. They want to use MLflow Tracking Servers for 160 hours per month. Assuming each data science team stores 1 GB of metadata to track runs in experiments, the bill at the end of the month would be calculated as follows:

Compute charges for Small Instance: 160 * $0.60 = $96
Compute charges for Medium Instance: 160 * $1.04 = $166.40
Storage charges for two teams: 2 * 1 GB * $0.10 = $0.20

Total = $262.60
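The MLflow bill can be recomputed as shown below. This sketch assumes a Medium Tracking Server rate of $1.04 per hour, which is the rate implied by the $166.4 compute subtotal over 160 hours; the variable names are illustrative.

```python
HOURS_PER_MONTH = 160
SMALL_RATE = 0.60         # Small Tracking Server, per hour
MEDIUM_RATE = 1.04        # Medium Tracking Server, per hour (assumed from the subtotal)
STORAGE_RATE_PER_GB = 0.10
TEAMS = 2
METADATA_GB_PER_TEAM = 1

small_compute = HOURS_PER_MONTH * SMALL_RATE
medium_compute = HOURS_PER_MONTH * MEDIUM_RATE
storage = TEAMS * METADATA_GB_PER_TEAM * STORAGE_RATE_PER_GB

total = round(small_compute + medium_compute + storage, 2)
print(f"total: ${total:.2f}")
```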

Pricing example #10: Real-time inference

The subtotal for training, hosting, and monitoring = $305.857. The subtotal for 3,100 MB of data processed In and 310 MB of data processed Out for hosting per month = $0.054. The total charges for this example would be $305.911 per month.

Note: for built-in rules with an ml.m5.xlarge instance, you get up to 30 hours of monitoring aggregated across all endpoints each month at no charge.

Data In per month - hosting	Data Out per month - hosting	Cost per GB In or Out	Total
100 MB * 31 = 3,100 MB		$0.016	$0.0496
	10 MB * 31 = 310 MB	$0.016	$0.00496

Hours per month	Hosting instances	Model Monitor instances	Cost per hour	Total
24 * 31 * 2 = 1,488	ml.c5.xlarge		$0.204	$303.552
31 * 0.08 = 2.5		ml.m5.4xlarge	$0.922	$2.305

The model in example #5 is then deployed to production to two (2) ml.c5.xlarge instances for reliable multi-AZ hosting. Amazon SageMaker Model Monitor is enabled with one (1) ml.m5.4xlarge instance, and monitoring jobs are scheduled once per day. Each monitoring job takes 5 minutes to complete. The model receives 100 MB of data per day, and inferences are 1/10 the size of the input data.

Pricing example #11: Asynchronous Inference

The subtotal for SageMaker Asynchronous Inference = $15.81 + $0.56 + 2 * $0.0048 = $16.38. The total Asynchronous Inference charges for this example would be $16.38 per month.

Data In per month	Data Out per month	Cost per GB In or Out	Total
10 KB * 1,024 * 31 = 310 MB		$0.02	$0.0048
	10 KB * 1,024 * 31 = 310 MB	$0.02	$0.0048

General-purpose (SSD) storage (GB)	Cost per GB-month	Total
4	$0.14	$0.56

Hours per month	Hosting instances	Cost per hour	Total
2.5 * 31 * 1 = 77.5	ml.c5.xlarge	$0.204	$15.81

Amazon SageMaker Asynchronous Inference charges you for instances used by your endpoint. When not actively processing requests, you can configure auto-scaling to scale the instance count to zero to save on costs. For input payloads in Amazon S3, there is no cost for reading input data from Amazon S3 and writing the output data to S3 in the same Region.

The model in example #5 is used to run a SageMaker Asynchronous Inference endpoint. The endpoint is configured to run on 1 ml.c5.xlarge instance and scale down the instance count to zero when not actively processing requests. The ml.c5.xlarge instance in the endpoint has 4 GB of general-purpose (SSD) storage attached to it. In this example, the endpoint maintains an instance count of 1 for 2 hours per day and has a cooldown period of 30 minutes, after which it scales down to an instance count of zero for the rest of the day. Therefore, you are charged for 2.5 hours of usage per day.

The endpoint processes 1,024 requests per day. The size of each invocation request/response body is 10 KB, and each inference request payload in Amazon S3 is 100 MB. Inference outputs are 1/10 the size of the input data, which are stored back in Amazon S3 in the same Region. In this example, the data processing charges apply to the request and response body, but not to the data transferred to/from Amazon S3.
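The scale-to-zero billing described above can be sketched in Python. Two values are assumptions rather than figures stated in this example: the ml.c5.xlarge rate of $0.204 per hour (the rate used in the real-time inference example, which reproduces the $15.81 line item) and a data processing rate of $0.016 per GB (the rate used in the serverless examples, which reproduces the 0.0048 line items). Names are illustrative.

```python
DAYS = 31
HOURLY_RATE = 0.204    # assumed ml.c5.xlarge rate, USD per hour
STORAGE_RATE = 0.14    # USD per GB-month of general-purpose SSD
DATA_RATE = 0.016      # assumed USD per GB processed in or out

# The instance is billed while active and during the scale-down cooldown.
active_hours_per_day = 2.0
cooldown_hours_per_day = 0.5
billed_hours = (active_hours_per_day + cooldown_hours_per_day) * DAYS

compute_cost = billed_hours * HOURLY_RATE
storage_cost = 4 * STORAGE_RATE

# Only request/response bodies count as processed data; S3 payloads do not.
requests_per_day = 1024
body_kb = 10
gb_each_way = body_kb * requests_per_day * DAYS / (1024 * 1024)
data_cost_each_way = gb_each_way * DATA_RATE

async_total = compute_cost + storage_cost + 2 * data_cost_each_way
print(f"Billed hours: {billed_hours}")
print(f"Total: ${async_total:.2f}")
```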

Pricing example #12: Batch Transform

The total charges for inference in this example would be $2.88.

Hours	Hosting instances	Cost per hour	Total
3 * 0.25 * 4 = 3 hours	ml.m4.4xlarge	$0.96	$2.88

The model in example #5 is used to run SageMaker Batch Transform. The data scientist runs four separate SageMaker Batch Transform jobs on 3 ml.m4.4xlarge for 15 minutes per job run. She uploads an evaluation dataset of 1 GB in S3 for each run, and inferences are 1/10 the size of the input data, which are stored back in S3.
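Batch Transform is billed by instance-hours, so the charge is the product of instance count, job duration, and number of jobs. A minimal sketch using the figures from this example (variable names are illustrative):

```python
RATE = 0.96          # ml.m4.4xlarge, USD per hour (from the table above)

instances = 3
hours_per_job = 0.25  # 15 minutes
jobs = 4

instance_hours = instances * hours_per_job * jobs
batch_cost = round(instance_hours * RATE, 2)

print(f"{instance_hours} instance-hours -> ${batch_cost:.2f}")
```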

Pricing example #13: On-demand Serverless Inference

With on-demand Serverless Inference, you only pay for the compute capacity used to process inference requests, billed by the millisecond, and the amount of data processed. The compute charge depends on the memory configuration you choose.

If you allocated 2 GB of memory to your endpoint, executed it 10 million times in one month and it ran for 100 ms each time, and processed 10 GB of Data-In/Out total, your charges would be calculated as follows:

Monthly compute charges

Number of requests	Duration of each request	Total inference duration (sec)	Cost per sec	Monthly inference duration charge
10 M	100 ms	1 M	$0.00004	$40

Monthly data processing charges

Data processing (GB)	Cost per GB In or Out	Monthly data processing charge
10 GB	$0.016	$0.16

The subtotal for on-demand SageMaker Serverless Inference duration charge = $40. The subtotal for 10 GB data processing charge = $0.16. The total charges for this example would be $40.16.
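The on-demand calculation for this example can be written as a small Python function. This is a sketch under the rates quoted in this example ($0.00004 per second for a 2 GB endpoint and $0.016 per GB processed); the function and parameter names are hypothetical.

```python
def serverless_on_demand_cost(requests, duration_ms, data_gb,
                              price_per_sec=0.00004, data_rate=0.016):
    """Return (duration charge, data charge, total) in USD."""
    total_seconds = requests * duration_ms / 1000
    duration_charge = total_seconds * price_per_sec
    data_charge = data_gb * data_rate
    return duration_charge, data_charge, duration_charge + data_charge


duration_charge, data_charge, serverless_total = serverless_on_demand_cost(
    requests=10_000_000, duration_ms=100, data_gb=10
)
print(f"Duration: ${duration_charge:.2f}")
print(f"Data: ${data_charge:.2f}")
print(f"Total: ${serverless_total:.2f}")
```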

Pricing example #14: Provisioned Concurrency on Serverless Inference

Let’s assume you are running a chatbot service for a payroll processing company. You expect a spike in customer inquiries at the end of March, before the tax filing deadline. For the rest of the month, traffic is expected to be low. So, you deploy a serverless endpoint with 2 GB of memory and add Provisioned Concurrency of 100 for the last 5 days of the month from 9 AM to 5 PM (8 hours per day), during which your endpoint processes 10 M requests and 10 GB of Data-In/Out in total. For the rest of the month, the chatbot runs on on-demand Serverless Inference and processes 3 M requests and 3 GB of Data-In/Out. Let’s assume the duration of each request is 100 ms.

Provisioned Concurrency (PC) charges
PC price is $0.000010/sec
PC usage duration (sec) = 5 days * 100 PC * 8 hrs * 3,600 sec = 14,400,000 sec
PC usage charge = 14,400,000 sec * $0.000010/sec = $144

Inference duration charges for traffic served by Provisioned Concurrency
Inference duration price is $0.000023/sec
Total inference duration for PC (sec) = 10 M * 100 ms / 1,000 = 1 M sec
Inference duration charges for PC = 1,000,000 sec * $0.000023/sec = $23

On-demand inference duration charges
The monthly compute price is $0.00004/sec and the free tier provides 150k sec.
Total compute (sec) = 3 M * 100 ms / 1,000 = 0.3 M sec
Total compute – Free tier compute = Monthly billable compute in sec
0.3 M sec – 150k sec = 150k sec
Monthly compute charges = 150k sec * $0.00004/sec = $6

Data Processing
Cost/GB of Data Processed In/Out = $0.016
Total GB processed = 10 + 3 = 13
Total cost = $0.016 * 13 = $0.208

Total charges for March
Total charges = Provisioned Concurrency charges + Inference duration charges for Provisioned Concurrency + On-demand inference duration charges + Data Processing charges
= $144 + $23 + $6 + $0.208 = $173.21
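The four components of the March bill can be combined in one short Python script. This is a sketch using the prices quoted in this example; the variable names are illustrative, and the free tier is applied only to on-demand duration, as in the calculation above.

```python
PC_PRICE = 0.000010          # USD per second of Provisioned Concurrency
PC_INFERENCE_PRICE = 0.000023  # USD per second of inference served by PC
ON_DEMAND_PRICE = 0.00004    # USD per second of on-demand inference
DATA_RATE = 0.016            # USD per GB processed in or out
FREE_TIER_SECONDS = 150_000

# Provisioned Concurrency is billed for the whole window it is enabled.
pc_seconds = 5 * 100 * 8 * 3600
pc_charge = pc_seconds * PC_PRICE

# Inference duration for the 10 M requests served by Provisioned Concurrency.
pc_inference_seconds = 10_000_000 * 100 / 1000
pc_inference_charge = pc_inference_seconds * PC_INFERENCE_PRICE

# On-demand duration for the remaining 3 M requests, net of the free tier.
on_demand_seconds = 3_000_000 * 100 / 1000
billable_seconds = max(0, on_demand_seconds - FREE_TIER_SECONDS)
on_demand_charge = billable_seconds * ON_DEMAND_PRICE

data_charge = (10 + 3) * DATA_RATE

march_total = pc_charge + pc_inference_charge + on_demand_charge + data_charge
print(f"PC: ${pc_charge:.2f}, PC inference: ${pc_inference_charge:.2f}")
print(f"On-demand: ${on_demand_charge:.2f}, data: ${data_charge:.3f}")
print(f"Total: ${march_total:.3f}")
```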

Pricing example #15: JumpStart

A customer uses JumpStart to deploy a pre-trained BERT Base Uncased model to classify customer review sentiment as positive or negative.

The customer deploys the model to two (2) ml.c5.xlarge instances for reliable multi-AZ hosting. The model receives 100 MB of data per day, and inferences are 1/10 the size of the input data.

Hours per month	Hosting instances	Cost per hour	Total
24 * 31 * 2 = 1488	ml.c5.xlarge	$0.204	$303.55

Data In per month - Hosting	Data Out per month - Hosting	Cost per GB In or Out	Total
100 MB * 31 = 3,100 MB		$0.02	$0.06
	10 MB * 31 = 310 MB	$0.02	$0.01

The subtotal for hosting = $303.55. The subtotal for 3,100 MB of data processed In and 310 MB of data processed Out for hosting per month = $0.07. The total charges for this example would be $303.62 per month.

Pricing example #16: HyperPod Cluster

Let's say you want to provision a cluster of 16 ml.g5.24xlarge instances for 1 month (30 days) with an additional 100 GB of storage per instance to support model development. The total charges for the cluster and additional storage in this example are $117,497.60.

Compute

Instance	Duration	Instances	Cost per hour	Subtotal
ml.g5.24xlarge	30 days * 24 hours = 720 hours	16	$10.18	$117,273.60

Storage

General purpose (SSD) storage	Duration	Instances	Cost per GB-month	Subtotal
100 GB	30 days * 24 hours = 720 hours	16	$0.14	$224.00

Pricing Example #17: Foundation model evaluations (automatic evaluation)

Foundation model evaluations with SageMaker Clarify only charges you for the instances used while your automatic evaluation jobs are running. When you select an automatic evaluation task and dataset, SageMaker loads the prompt dataset from Amazon S3 onto a SageMaker evaluations instance.

In the following example, an ML engineer runs an evaluation of the Llama 2 7B model in US East (N. Virginia) for summarization task accuracy. The recommended instance type for inference for Llama 2 7B is ml.g5.2xlarge. The recommended minimum instance for an evaluation is ml.m5.2xlarge. In this example, the job runs for 45 minutes (the duration depends on the size of the dataset), and the cost would be $1.48 for the evaluation job and detailed results.

Processing Job Hours (example)	Region	Instance Type	Instance	Cost per hour	Cost
0.75	us-east-1	LLM hosting	ml.g5.2xlarge	$1.52	$1.14
0.75	us-east-1	evaluation	ml.m5.2xlarge	$0.46	$0.35
Total					$1.48

In the next example, the same engineer in Virginia runs another evaluation job for summarization task accuracy, but uses a customized version of Llama 2 7B that is deployed to their account and up and running. In this case, because the model is already deployed to their account, the only incremental cost would be for the evaluation instance.

Processing Job Hours	Region	Instance Type	Instance	Cost per hour	Cost
0.75	us-east-1	evaluation	ml.m5.2xlarge	$0.46	$0.35
Total					$0.35
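Both automatic evaluation scenarios follow the same rule: each instance is billed for the job duration, and the hosting instance is only an incremental cost when the model is not already deployed. The Python sketch below converts the 45-minute job to 0.75 hours and uses the hourly rates from the tables above; the function name and parameters are hypothetical.

```python
def automatic_evaluation_cost(job_minutes, evaluation_rate, hosting_rate=None):
    """Return the incremental cost in USD of one automatic evaluation job.

    Pass hosting_rate=None when the model endpoint is already running,
    so only the evaluation instance is counted.
    """
    hours = job_minutes / 60
    cost = hours * evaluation_rate
    if hosting_rate is not None:
        cost += hours * hosting_rate
    return cost


# Scenario 1: the job also pays for an ml.g5.2xlarge hosting instance.
with_hosting = automatic_evaluation_cost(45, evaluation_rate=0.46, hosting_rate=1.52)
# Scenario 2: the model is already deployed, so only ml.m5.2xlarge is billed.
evaluation_only = automatic_evaluation_cost(45, evaluation_rate=0.46)

print(f"With hosting: ${with_hosting:.3f}")
print(f"Evaluation only: ${evaluation_only:.3f}")
```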

Pricing Example #18: Foundation model evaluations (human-based evaluation)

In the following example, a machine learning engineer in US East (N. Virginia) runs a human-based evaluation of Llama-2-7B for summarization task accuracy and uses their own private workforce to perform the evaluation. The recommended instance type for Llama-2-7B is ml.g5.2xlarge. The recommended minimum instance for a human-based evaluation Processing Job is ml.t3.medium. Inference on Llama-2-7B runs for 45 minutes (the duration depends on the size of the dataset). The dataset contains 50 prompts, and the developer requires 2 workers to rate each prompt-response set (configurable when creating the evaluation job through the “workers per prompt” parameter). There will be 100 tasks in this evaluation job (1 task for each prompt-response pair per worker: 2 workers x 50 prompt-response sets = 100 human tasks). The human workforce takes one day (24 hours) to complete all 100 human evaluation tasks in the evaluation job (the time depends on the number and skill level of workers, and on the length and complexity of prompts and inference responses).

Compute Hours	Human tasks	Region	Instance Type	Instance	Cost per hour	Cost per human task	Total Cost
0.75		US East (N. Virginia)	LLM hosting	ml.g5.2xlarge	$1.52		$1.14
24		US East (N. Virginia)	Processing Job	ml.t3.medium	$0.05		$1.20
	100	Any				$0.21	$21.00
Total							$23.34

In the next example, the same engineer in US East (N. Virginia) runs the same evaluation job but uses Llama-2-7B already deployed to their account and up and running. In this case, the only incremental cost would be for the evaluation processing job and human tasks.

Compute Hours	Human tasks	Region	Instance Type	Instance	Cost per hour	Cost per human task	Total Cost
24		US East (N Virginia)	Processing Job	ml.t3.medium	$0.05		$1.20
	100	Any				$0.21	$21.00
Total							$22.20
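A human-based evaluation adds a per-task charge on top of the instance charges. The following Python sketch reproduces both totals above from the stated inputs (50 prompts, 2 workers per prompt, a 24-hour processing job, and 45 minutes of hosting); the function and parameter names are hypothetical.

```python
HOSTING_RATE = 1.52      # ml.g5.2xlarge, USD per hour
PROCESSING_RATE = 0.05   # ml.t3.medium, USD per hour
TASK_RATE = 0.21         # USD per human task


def human_evaluation_cost(prompts, workers_per_prompt, processing_hours,
                          hosting_minutes=0):
    """Return (number of human tasks, total cost in USD).

    Use hosting_minutes=0 when the model is already deployed and running.
    """
    tasks = prompts * workers_per_prompt
    cost = (hosting_minutes / 60) * HOSTING_RATE
    cost += processing_hours * PROCESSING_RATE
    cost += tasks * TASK_RATE
    return tasks, cost


tasks, full_cost = human_evaluation_cost(50, 2, 24, hosting_minutes=45)
_, deployed_cost = human_evaluation_cost(50, 2, 24)

print(f"{tasks} human tasks")
print(f"With hosting: ${full_cost:.2f}")
print(f"Model already deployed: ${deployed_cost:.2f}")
```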

Next steps

Features page

Discover a wide range of SageMaker features

Learn more

Console

Get started building with SageMaker in the AWS Management Console
