Introducing SageMaker Core: A new object-oriented Python SDK for Amazon SageMaker

February 2025: This post was reviewed and updated for accuracy.

We’re excited to announce the release of SageMaker Core, a new Python SDK from Amazon SageMaker designed to offer an object-oriented approach for managing the machine learning (ML) lifecycle. This new SDK streamlines data processing, training, and inference and features resource chaining, intelligent defaults, and enhanced logging capabilities. With SageMaker Core, managing ML workloads on SageMaker becomes simpler and more efficient. The SageMaker Core SDK comes bundled as part of the SageMaker Python SDK version 2.231.0 and above.

In this post, we show how the SageMaker Core SDK simplifies the developer experience while providing APIs for running the steps of a typical ML lifecycle. We also discuss the main benefits of the SDK and share resources for learning more about it.

Traditionally, developers have had two options when working with SageMaker: the AWS SDK for Python, also known as Boto3, or the SageMaker Python SDK. Although both provide comprehensive APIs for ML lifecycle management, they often rely on loosely typed constructs such as hard-coded constants and JSON dictionaries, mimicking a REST interface. For instance, to create a training job, Boto3 offers a create_training_job API, but retrieving job details requires the separate describe_training_job API.

When using Boto3, developers have to remember and hand-write lengthy JSON dictionaries and make sure that every key is placed correctly. Let’s take a closer look at the create_training_job method from Boto3:

response = client.create_training_job(
    TrainingJobName='string',
    HyperParameters={
        'string': 'string'
    },
    AlgorithmSpecification={
            .
            .
            .
    },
    RoleArn='string',
    InputDataConfig=[
        {
            .
            .
            .
        },
    ],
    OutputDataConfig={
            .
            .
            .
    },
    ResourceConfig={
            .
            .
            .    
    },
    VpcConfig={
            .
            .
            .
    },
    .
    .
    .
    .
    # Not all arguments and fields are shown, for brevity.
)

Arguments such as AlgorithmSpecification, InputDataConfig, OutputDataConfig, ResourceConfig, and VpcConfig each require a verbose JSON dictionary. Because these dictionaries hold many string keys and values, it’s easy to mistype a key or leave one out. No type checking is possible: as far as the Python interpreter is concerned, every key is just a string.

Similarly, the SageMaker Python SDK requires us to create an estimator object and invoke the fit() method on it. Although these constructs work well, they aren’t intuitive. It’s hard for developers to connect the term estimator with something that trains a model.
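To make the typo risk concrete, the following sketch contrasts a plain dictionary with a typed object. It uses only the Python standard library. The ResourceConfig dataclass here is a hypothetical stand-in written for illustration, not the SageMaker Core class of the same name.

```python
from dataclasses import dataclass

# A plain dictionary accepts any key, so the misspelled "InstanceTyp"
# is not flagged by Python itself.
resource_config = {"InstanceTyp": "ml.m5.xlarge", "InstanceCount": 1}


@dataclass
class ResourceConfig:
    """Hypothetical typed stand-in for the dictionary above."""

    instance_type: str
    instance_count: int


def build_config(**kwargs):
    """Return a ResourceConfig, or the error message if a field name is wrong."""
    try:
        return ResourceConfig(**kwargs)
    except TypeError as err:
        return str(err)


ok = build_config(instance_type="ml.m5.xlarge", instance_count=1)
bad = build_config(instance_typ="ml.m5.xlarge", instance_count=1)

print(ok)   # a ResourceConfig instance
print(bad)  # an error message that names the misspelled argument
```

With the dictionary, the mistake would only surface when the service rejects the request. With the typed object, it surfaces as soon as the object is constructed, and an IDE or type checker can flag it even earlier.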

Introducing SageMaker Core SDK

SageMaker Core SDK addresses this problem by replacing such long dictionaries with object-oriented interfaces. Developers work with typed objects, and SageMaker Core takes care of converting those objects to dictionaries and executing the actions on the developer’s behalf.
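As a rough illustration of what converting objects to dictionaries can involve, the following sketch serializes nested dataclasses into a PascalCase dictionary of the kind Boto3 expects. This is not how SageMaker Core is implemented. The class names, field names, and helper functions are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional


def snake_to_pascal(name: str) -> str:
    """Convert a snake_case field name to a PascalCase key."""
    return "".join(part.capitalize() for part in name.split("_"))


def to_request_dict(obj):
    """Recursively turn nested dataclass instances into a PascalCase dictionary."""
    if is_dataclass(obj):
        return {
            snake_to_pascal(f.name): to_request_dict(getattr(obj, f.name))
            for f in fields(obj)
            if getattr(obj, f.name) is not None  # unset optional fields are omitted
        }
    if isinstance(obj, list):
        return [to_request_dict(item) for item in obj]
    return obj


@dataclass
class ClusterConfig:  # hypothetical shape
    instance_type: str
    instance_count: int


@dataclass
class JobRequest:  # hypothetical shape
    job_name: str
    cluster_config: ClusterConfig
    role_arn: Optional[str] = None


request = to_request_dict(
    JobRequest(
        job_name="demo-job",
        cluster_config=ClusterConfig(instance_type="ml.m5.xlarge", instance_count=1),
    )
)
print(request)
```

Simple capitalization like this would not reproduce every real key (for example, VolumeSizeInGB), so an actual mapping needs more than this sketch provides.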

The following are the key features of SageMaker Core:

Object-oriented interface – It provides object-oriented classes for tasks such as processing, training, or deployment. Providing such an interface can enforce strong type checking, make the code more manageable, and promote reusability. Developers can benefit from all features of object-oriented programming.
Resource chaining – Developers can seamlessly pass SageMaker resources as objects by supplying them as arguments to different resources. For example, we can create a model object and pass that model object as an argument while setting up the endpoint. In contrast, while using Boto3, we need to supply ModelName as a string argument.
Abstraction of low-level details – It automatically handles resource state transitions and polling logic, freeing developers from managing these intricacies and allowing them to focus on higher value tasks.
Support for intelligent defaults – It supports SageMaker intelligent defaults, allowing developers to set default values for parameters such as AWS Identity and Access Management (IAM) roles and virtual private cloud (VPC) configurations. This streamlines the setup process, because the SageMaker Core APIs pick up the default settings automatically from the environment.
Auto code completion – It enhances the developer experience by offering real-time suggestions and completions in popular integrated development environments (IDEs), reducing chances of syntax errors and speeding up the coding process.
Full parity with SageMaker APIs, including generative AI – It provides access to the SageMaker capabilities, including generative AI, through the core SDK, so developers can seamlessly use SageMaker Core without worrying about feature parity with Boto3.
Comprehensive documentation and type hints – It provides robust and comprehensive documentation and type hints so developers can understand the functionalities of the APIs and objects, write code faster, and reduce errors.
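The idea behind intelligent defaults in the list above can be sketched in a few lines: an explicitly passed value wins, and otherwise a configured default is used. The following is a conceptual illustration only, not the SDK's mechanism. The DEFAULTS dictionary, its keys, and the resolve function are hypothetical, and the role ARN is a placeholder.

```python
from typing import Any, Dict, Optional

# Hypothetical defaults, standing in for values loaded from the environment.
DEFAULTS: Dict[str, Any] = {
    "role_arn": "arn:aws:iam::111122223333:role/ExampleRole",
    "vpc_config": {
        "subnets": ["subnet-example"],
        "security_group_ids": ["sg-example"],
    },
}


def resolve(name: str, explicit: Optional[Any] = None) -> Any:
    """Prefer the explicitly passed value; otherwise fall back to the default."""
    if explicit is not None:
        return explicit
    if name in DEFAULTS:
        return DEFAULTS[name]
    raise ValueError(f"{name} was not provided and has no default")


# No role passed: the configured default is used.
print(resolve("role_arn"))
# Explicit role passed: it overrides the default.
print(resolve("role_arn", "arn:aws:iam::111122223333:role/OtherRole"))
```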

For this walkthrough, we use a straightforward ML lifecycle involving data preparation, training, and deployment of an XGBoost model for customer churn prediction. We use the SageMaker Core SDK to execute all the steps.

Prerequisites

To get started with SageMaker Core, make sure Python 3.8 or greater is installed in the environment. There are two ways to get started with SageMaker Core:

If not using SageMaker Python SDK, install the sagemaker-core SDK using the following code example.
```
%pip install sagemaker-core
```
If you’re already using SageMaker Python SDK, upgrade it to version 2.231.0 or later. Versions 2.231.0 and later have SageMaker Core preinstalled. The following code example shows the command for upgrading the SageMaker Python SDK.
```
%pip install --upgrade "sagemaker>=2.231.0"
```
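After installing, it can be useful to confirm which versions are present. The following sketch uses only the standard library module importlib.metadata, available in Python 3.8 and later. The helper names are made up for this example, and the version comparison is deliberately simple: it looks only at the leading numeric parts of each version component.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version such as '2.231.0'."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad to three components so that '2.231' compares equal to '2.231.0'.
    return tuple(parts + [0] * max(0, 3 - len(parts)))


def meets_minimum(version: str, minimum: str = "2.231.0") -> bool:
    """Check a SageMaker Python SDK version against the minimum from this post."""
    return release_tuple(version) >= release_tuple(minimum)


for package in ("sagemaker", "sagemaker-core"):
    print(package, installed_version(package) or "not installed")
```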

Solution walkthrough

To manage your ML workloads on SageMaker using SageMaker Core, use the steps in the following sections.

Data preparation

In this phase, prepare the training and test data for the use case. Here, we use a synthetic dataset that AWS provides for customer churn prediction. The following code creates a ProcessingJob object using the static method create:

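from sagemaker_core.resources import ProcessingJob
from sagemaker_core.shapes import (
    AppSpecification,
    ProcessingClusterConfig,
    ProcessingInput,
    ProcessingOutput,
    ProcessingOutputConfig,
    ProcessingResources,
    ProcessingS3Input,
    ProcessingS3Output,
)

# Create and start a ProcessingJob resource.
# region, role, formatted_timestamp, and the S3 URI variables are assumed to be defined earlier.
processing_job = ProcessingJob.create(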
    processing_job_name=f"sagemaker-core-data-prep-{formatted_timestamp}",
    processing_resources=ProcessingResources(
        cluster_config=ProcessingClusterConfig(
            instance_count=1,
            instance_type="ml.m5.xlarge",
            volume_size_in_gb=20
        )
    ),
    app_specification=AppSpecification(
        image_uri=f"683313688378.dkr.ecr.{region}.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3",
        container_entrypoint=["python3", "/opt/ml/processing/code/preprocess.py"]
    ),
    role_arn=role,
    processing_inputs=[
        ProcessingInput(
            input_name="input",
            s3_input=ProcessingS3Input(
                s3_uri=f"s3://sagemaker-example-files-prod-{region}/datasets/tabular/synthetic/churn.txt",
                s3_data_type="S3Prefix",
                local_path="/opt/ml/processing/input",
                s3_input_mode="File"
            ),
        ),
        ProcessingInput(
            input_name="code",
            s3_input=ProcessingS3Input(
                s3_uri=pre_processing_code_s3_uri,
                s3_data_type="S3Prefix",
                local_path="/opt/ml/processing/code",
                s3_input_mode="File"
            ),
        )
    ],
    processing_output_config=ProcessingOutputConfig(
            outputs=[
                ProcessingOutput(
                    output_name="train",
                    s3_output=ProcessingS3Output(
                        s3_uri=processed_train_data_uri,
                        s3_upload_mode="EndOfJob",
                        local_path="/opt/ml/processing/output/train"
                    )
                ),
                ProcessingOutput(
                    output_name="validation",
                    s3_output=ProcessingS3Output(
                        s3_uri=processed_validation_data_uri,
                        s3_upload_mode="EndOfJob",
                        local_path="/opt/ml/processing/output/validation"
                    )
                ),
                ProcessingOutput(
                    output_name="test",
                    s3_output=ProcessingS3Output(
                        s3_uri=processed_test_data_uri,
                        s3_upload_mode="EndOfJob",
                        local_path="/opt/ml/processing/output/test"
                    )
                )
            ]
        )
)

# Wait for the ProcessingJob to complete
processing_job.wait()

The preceding code creates the following objects:

ProcessingClusterConfig: Contains the infrastructure details for running the processing job.
AppSpecification: Contains details about the SageMaker managed scikit-learn Docker container, which runs preprocess.py as the entry point.
ProcessingInput (two objects): Provide details about the raw input data and the processing code. The local path for the processing code, /opt/ml/processing/code, matches the directory used in the container_entrypoint argument of AppSpecification.
ProcessingOutput (three objects): Provide details about the processed training, validation, and test data.

All of these objects are passed to the create method, which starts the SageMaker Processing job.
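The job name in the processing code embeds a formatted_timestamp variable that the snippet does not define. One possible way to build such a name is sketched below. The timestamp format and the timestamped_name helper are assumptions made for illustration. The validation reflects the SageMaker naming rule for job names: letters, digits, and hyphens, up to 63 characters.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Letters, digits, and hyphens only; must start and end with a letter or digit.
JOB_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MAX_JOB_NAME_LENGTH = 63


def timestamped_name(prefix: str, now: Optional[datetime] = None) -> str:
    """Build a job name that stays unique by appending a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    formatted_timestamp = now.strftime("%Y-%m-%d-%H-%M-%S")
    name = f"{prefix}-{formatted_timestamp}"
    if len(name) > MAX_JOB_NAME_LENGTH or not JOB_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid job name: {name!r}")
    return name


print(timestamped_name("sagemaker-core-data-prep"))
```

Including a timestamp matters because job names must be unique within an account and Region, so rerunning a notebook with a fixed name would fail.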

Training

In this step, you train an XGBoost model on the prepared data from the previous step. The following code snippet shows the training API. You create a TrainingJob object using the create method, specifying the training image, input data channels, output path, instance type, instance count, and hyperparameters.

from sagemaker_core.resources import TrainingJob
from sagemaker_core.shapes import (
    AlgorithmSpecification,
    Channel,
    DataSource,
    S3DataSource,
    ResourceConfig,
    StoppingCondition,
    OutputDataConfig,
)

# Create training job
training_job = TrainingJob.create(
    training_job_name=job_name,
    hyper_parameters=hyperparameters,
    algorithm_specification=AlgorithmSpecification(
        training_image=image,
        training_input_mode="File",
    ),
    role_arn=role,
    input_data_config=[
        Channel(
            channel_name="train",
            content_type="csv",
            data_source=DataSource(
                s3_data_source=S3DataSource(
                    s3_data_type="S3Prefix",
                    s3_uri=processed_train_data_uri,
                    s3_data_distribution_type="FullyReplicated",
                )
            ),
        ),
        Channel(
            channel_name="validation",
            content_type="csv",
            data_source=DataSource(
                s3_data_source=S3DataSource(
                    s3_data_type="S3Prefix",
                    s3_uri=processed_validation_data_uri,
                    s3_data_distribution_type="FullyReplicated",
                )
            ),
        ),
    ],
    output_data_config=OutputDataConfig(s3_output_path=s3_output_path),
    resource_config=ResourceConfig(
        instance_type=INSTANCE_TYPE,
        instance_count=INSTANCE_COUNT,
        volume_size_in_gb=VOLUME_SIZE_IN_GB,
    ),
    stopping_condition=StoppingCondition(max_runtime_in_seconds=MAX_RUNTIME_IN_SECONDS),
)

# Wait for the training job to complete
training_job.wait()

The preceding code creates the following objects:

AlgorithmSpecification: Contains the URI of the SageMaker managed XGBoost container that runs the training job.
Channel (two objects): Provide details about the training and validation data.
OutputDataConfig: Provides the Amazon S3 location where the model is stored.
ResourceConfig: Specifies the infrastructure details of the training cluster.
StoppingCondition: Specifies the maximum runtime allowed for the training job.

All of these objects are passed to the create method, which starts the SageMaker Training job.
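The wait() call in the training code blocks until the job reaches a terminal state, which is the polling logic that SageMaker Core handles on the developer's behalf. To show roughly what such a loop involves, here is a standard-library sketch. It is not the SDK's implementation. The function name, parameters, and simulated status sequence are all made up for this example.

```python
import time
from typing import Callable, Optional

TERMINAL_STATES = {"Completed", "Failed", "Stopped"}


def wait_for_terminal_state(
    describe: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll describe() until it reports a terminal status, then return it."""
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    while True:
        status = describe()
        if status in TERMINAL_STATES:
            return status
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Simulated status sequence standing in for repeated describe calls.
statuses = iter(["InProgress", "InProgress", "Completed"])
final = wait_for_terminal_state(lambda: next(statuses), poll_seconds=0)
print(final)  # Completed
```

Without an abstraction like wait(), every script that starts a job would need its own copy of a loop like this, along with its own handling of timeouts and failure states.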

Model creation and deployment

Deploying a model on a SageMaker endpoint consists of three steps:

Create a SageMaker model object
Create the endpoint configuration
Create the endpoint

SageMaker Core provides an object-oriented interface for all three steps.

Create a SageMaker model object

The following code snippet shows the model creation experience in SageMaker Core.

from sagemaker_core.resources import Model
from sagemaker_core.shapes import ContainerDefinition

model_s3_uri = training_job.model_artifacts.s3_model_artifacts

# Create SageMaker model: An image along with the model artifact to use.
customer_churn_model = Model.create(
    model_name="customer-churn-xgboost",
    primary_container=ContainerDefinition(
        image=image,
        model_data_url=model_s3_uri,
    ),
    execution_role_arn=role,
)

Here, the container definition is a ContainerDefinition object rather than a dictionary. As in the processing and training steps, you use the create method, this time from the Model class. The container definition points to the same XGBoost container image used for training and to the model artifact location produced by the training job. This is also resource chaining in action: the output of the TrainingJob is passed as input to the model.

Create the endpoint configuration

Create the endpoint configuration. The following code snippet shows the experience in SageMaker Core.

from sagemaker_core.resources import Endpoint, EndpointConfig
from sagemaker_core.shapes import ProductionVariant

endpoint_config_name = "churn-prediction-endpoint-config"  # Name of endpoint configuration
model_name = customer_churn_model.get_name()  # Get name of SageMaker model created in previous step
endpoint_name = "customer-churn-endpoint"  # Name of SageMaker endpoint

endpoint_config = EndpointConfig.create(
    endpoint_config_name=endpoint_config_name,
    production_variants=[
        ProductionVariant(
            variant_name="AllTraffic",
            model_name=model_name,
            instance_type=instance_type,
            initial_instance_count=1,
        )
    ],
)

Note that ProductionVariant is an object now.
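The endpoint configuration code reads the model name from the model object with get_name() instead of repeating a hard-coded string. The following toy sketch shows that pattern in isolation, using a helper that accepts either a resource object or a plain name. ToyModel, ToyVariant, and the helper functions are hypothetical stand-ins and are not SageMaker Core classes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ToyModel:
    """Hypothetical stand-in for a created model resource."""

    model_name: str

    def get_name(self) -> str:
        return self.model_name


@dataclass
class ToyVariant:
    """Hypothetical stand-in for a production variant."""

    variant_name: str
    model_name: str


def resolve_model_name(model: Union[str, ToyModel]) -> str:
    """Accept a resource object or a plain name and return the name."""
    return model if isinstance(model, str) else model.get_name()


def make_variant(model: Union[str, ToyModel]) -> ToyVariant:
    """Build a variant from whichever form of the model the caller has."""
    return ToyVariant(variant_name="AllTraffic", model_name=resolve_model_name(model))


churn_model = ToyModel(model_name="customer-churn-xgboost")
print(make_variant(churn_model))
```

Deriving the name from the object keeps the model and the endpoint configuration in sync if the model name changes.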

Create the endpoint

Create the endpoint using the following code snippet.

sagemaker_endpoint = Endpoint.create(
    endpoint_name=endpoint_name,
    endpoint_config_name=endpoint_config.get_name(),
)

The entire code for this post is available on GitHub.

As we have shown in these steps, SageMaker Core simplifies the development experience by providing an object-oriented interface for interacting with SageMaker resources. The use of intelligent defaults and resource chaining reduces the amount of boilerplate code and manual parameter specification, resulting in more readable and maintainable code.

Cleanup

Any endpoint created using the code in this post will incur charges. Shut down any unused endpoints by using the delete() method.
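A small helper can make cleanup less error-prone by deleting resources in the reverse of their creation order and continuing past individual failures. The sketch below assumes only that each resource exposes a delete() method, as described for endpoints. The delete_all helper and the FakeResource class are made up for illustration, so the example runs without any AWS calls.

```python
from typing import Any, Iterable, List, Tuple


def delete_all(resources: Iterable[Any]) -> List[Tuple[Any, Exception]]:
    """Call delete() on each resource, newest first, and collect failures."""
    failures = []
    for resource in reversed(list(resources)):
        try:
            resource.delete()
        except Exception as err:  # keep going so one failure does not block the rest
            failures.append((resource, err))
    return failures


class FakeResource:
    """Stand-in used only to demonstrate the helper without AWS calls."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def delete(self) -> None:
        self.log.append(self.name)


log: List[str] = []
created = [
    FakeResource("model", log),
    FakeResource("endpoint-config", log),
    FakeResource("endpoint", log),
]
failures = delete_all(created)
print(log)  # ['endpoint', 'endpoint-config', 'model']
```

With the objects from this post, the equivalent call would be delete_all([customer_churn_model, endpoint_config, sagemaker_endpoint]), assuming the model and endpoint configuration objects expose delete() in the same way as the endpoint.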

A note on existing SageMaker Python SDK

The SageMaker Python SDK will use SageMaker Core as its foundation and will benefit from the object-oriented interfaces created as part of SageMaker Core. Going forward, customers can choose to use the object-oriented approach while using the SageMaker Python SDK.

Benefits

The SageMaker Core SDK offers several benefits:

Simplified development – By abstracting low-level details and providing intelligent defaults, SageMaker Core lets developers focus on building and deploying ML models without getting slowed down by repetitive tasks. It also relieves developers of the cognitive load of remembering long and complex multilevel dictionaries. They can instead work in the object-oriented paradigm that most developers are comfortable with.
Increased productivity – Features like automatic code completion and type hints help developers write code faster and with fewer errors.
Enhanced readability – Dedicated resource classes and resource chaining result in more readable and maintainable code.
Lightweight integration with AWS Lambda – Because this SDK is lightweight (about 8 MB when unzipped), it is straightforward to build an AWS Lambda layer for SageMaker Core and use it for executing various steps in the ML lifecycle through Lambda functions.

Conclusion

SageMaker Core is a powerful addition to Amazon SageMaker, providing a streamlined and efficient development experience for ML practitioners. With its object-oriented interface, resource chaining, and intelligent defaults, SageMaker Core empowers developers to focus on building and deploying ML models without getting slowed down by complex orchestration of JSON structures. Check out the following resources to get started today on SageMaker Core:

About the authors

Vikesh Pandey is a Principal GenAI/ML Specialist Solutions Architect at AWS, helping customers from financial industries design, build, and scale their GenAI/ML workloads on AWS. He has more than a decade and a half of experience working across the entire ML and software engineering stack. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.

Shweta Singh is a Senior Product Manager in the Amazon SageMaker Machine Learning (ML) platform team at AWS, leading SageMaker Python SDK. She has worked in several product roles at Amazon for over 5 years. She has a Bachelor of Science degree in Computer Engineering and a Master of Science in Financial Engineering, both from New York University.
