
Amazon SageMaker

Amazon SageMaker is a fully-managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. With Amazon SageMaker, all the barriers and complexity that typically slow down developers who want to use machine learning are removed. The service includes models that can be used together or independently to build, train, and deploy your machine learning models.


Active Learning for Text Classification

Latest Version: 1.4
This algorithm trains a supervised text classification model and uses Active Learning to select the most relevant samples for tagging.

    Product Overview

    Active Learning for Text Classification trains a text classification model on a small corpus of labeled training data and then selects, from a large corpus of unlabeled data, the samples whose annotation is most likely to improve model accuracy. By identifying which samples should be tagged first, the algorithm reduces the time and effort needed to build a usable machine learning model.
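The listing does not say which sampling strategy the product uses. As an illustration only, the sketch below shows least-confidence sampling, one common Active Learning strategy: the samples whose most likely class has the lowest predicted probability are sent for annotation first. The function name `least_confident` and the example probabilities are hypothetical and are not part of the product.

```python
def least_confident(probabilities, k):
    """Return indices of the k samples the model is least sure about.

    `probabilities` holds one list of class probabilities per unlabeled
    sample. This is an illustrative strategy, not the vendor's method.
    """
    # Confidence of a sample = probability of its most likely class.
    scored = [(max(p), i) for i, p in enumerate(probabilities)]
    scored.sort()  # lowest confidence first; ties fall back to index order
    return [i for _, i in scored[:k]]


# Hypothetical class probabilities for four unlabeled samples.
pool_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.34, 0.33, 0.33],
]
picked = least_confident(pool_probs, 2)
```

Here the two samples with the flattest probability distributions are selected, because the model's prediction for them is closest to a guess.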

    Key Data

    Type
    Algorithm
    Fulfillment Methods
    Amazon SageMaker

    Highlights

    • Active Learning for Text Classification can be used to prioritize the data labeling task, thereby reducing the data tagging effort required to build a working machine learning model.

    • This solution can be used to iteratively sample the right data points and retrain the model to build a supervised text classifier. It identifies which samples to label first based on what the current model has learned, and it uses Active Learning methods to select those samples from the unlabeled data, so that a better-performing model can be built faster.

    • PACE-ML is the Mphasis framework and methodology for end-to-end machine learning development and deployment. PACE-ML enables organizations to improve the quality and reliability of machine learning solutions in production and helps automate, scale, and monitor them. Need customized Machine Learning and Deep Learning solutions? Get in touch!


    Pricing Information

    Use this tool to estimate the software and infrastructure costs based on your configuration choices. Your usage and costs might be different from this estimate. They will be reflected on your monthly AWS billing reports.

    Contact us to request contract pricing for this product.



    Software Pricing

    Algorithm Training: $10.00/hr (running on ml.m5.4xlarge)

    Model Realtime Inference: $10.00/hr (running on ml.m5.large)

    Model Batch Transform: $20.00/hr (running on ml.m5.large)

    Infrastructure Pricing

    With Amazon SageMaker, you pay only for what you use. Training and inference are billed by the second, with no minimum fees and no upfront commitments. Pricing within Amazon SageMaker is broken down by on-demand ML instances, ML storage, and fees for data processing in notebooks and inference instances.

    SageMaker Algorithm Training: $0.922/host/hr (running on ml.m5.4xlarge)

    SageMaker Realtime Inference: $0.115/host/hr (running on ml.m5.large)

    SageMaker Batch Transform: $0.115/host/hr (running on ml.m5.large)
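The software and infrastructure rates add up per hour of use. The sketch below shows that arithmetic for the default instance types listed above. It assumes the software fee is charged per host, just like the infrastructure fee, and it ignores storage, data processing, and taxes; the function name `estimate_cost` is hypothetical, and the result is only a rough estimate.

```python
def estimate_cost(software_per_hr, infra_per_host_hr, hours=1.0, hosts=1):
    """Rough cost estimate in USD.

    Assumption: the software fee applies per host, like the
    infrastructure fee. Storage, data processing, and taxes are ignored.
    """
    total = (software_per_hr + infra_per_host_hr) * hours * hosts
    return round(total, 3)  # round to a tenth of a cent


# Rates taken from the listing; durations are made-up examples.
training = estimate_cost(10.00, 0.922, hours=1)   # ml.m5.4xlarge
realtime = estimate_cost(10.00, 0.115, hours=1)   # ml.m5.large
batch = estimate_cost(20.00, 0.115, hours=2)      # ml.m5.large
```

Because SageMaker bills by the second, `hours` can be fractional, for example `0.25` for a 15-minute training job.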

    Algorithm Training

    For algorithm training in Amazon SageMaker, the software is priced based on hourly pricing that can vary by instance type. Additional infrastructure cost, taxes or fees may apply.
    InstanceType      Algorithm/hr
    ml.m4.4xlarge     $10.00
    ml.m5.4xlarge     $10.00  (Vendor Recommended)
    ml.m4.16xlarge    $10.00
    ml.m5.2xlarge     $10.00
    ml.p3.16xlarge    $10.00
    ml.m4.2xlarge     $10.00
    ml.c5.2xlarge     $10.00
    ml.p3.2xlarge     $10.00
    ml.c4.2xlarge     $10.00
    ml.m4.10xlarge    $10.00
    ml.c4.xlarge      $10.00
    ml.m5.24xlarge    $10.00
    ml.c5.xlarge      $10.00
    ml.p2.xlarge      $10.00
    ml.m5.12xlarge    $10.00
    ml.p2.16xlarge    $10.00
    ml.c4.4xlarge     $10.00
    ml.m5.xlarge      $10.00
    ml.c5.9xlarge     $10.00
    ml.m4.xlarge      $10.00
    ml.c5.4xlarge     $10.00
    ml.p3.8xlarge     $10.00
    ml.m5.large       $10.00
    ml.c4.8xlarge     $10.00
    ml.p2.8xlarge     $10.00
    ml.c5.18xlarge    $10.00

    Usage Information

    Training

    • The system trains on user-provided text datasets.
    • The training dataset must contain two files, "train.csv" and "validation.csv", both UTF-8 encoded.
    1. train.csv
    • train.csv must contain three columns: ID, Text and Category.
    2. validation.csv
    • validation.csv has the same format as train.csv and must contain the same three columns: ID, Text and Category.

    • ID: Unique identification number associated with the text.

    • Text: Textual data that needs to be categorized.

    • Category: Class associated with the text.

    • Please refer to the sample notebook for a detailed description.
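Before starting a training job, it can help to check both files against the format described above. The following is a minimal sketch using only the Python standard library. It assumes the header names match ID, Text and Category exactly (including case) and that empty Text or Category values are invalid; the listing does not confirm either point, and the function names are hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = ["ID", "Text", "Category"]


def validate_rows(stream):
    """Check header and rows of an open CSV text stream; return the row count."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    seen_ids = set()
    count = 0
    for row in reader:
        if row["ID"] in seen_ids:
            raise ValueError(f"duplicate ID: {row['ID']}")
        seen_ids.add(row["ID"])
        # Assumption: every row needs non-empty text and a category.
        if not row["Text"] or not row["Category"]:
            raise ValueError(f"empty Text or Category for ID {row['ID']}")
        count += 1
    return count


def validate_file(path):
    """Open a file as UTF-8 (as the listing requires) and validate it."""
    with open(path, newline="", encoding="utf-8") as f:
        return validate_rows(f)


# In-memory example; real use would call validate_file("train.csv").
sample = io.StringIO("ID,Text,Category\n1,hello there,greeting\n2,see you,farewell\n")
row_count = validate_rows(sample)
```

Opening the file with `encoding="utf-8"` also surfaces encoding problems early, since Python raises `UnicodeDecodeError` on invalid bytes.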

    Channel specification

    Fields marked with * are required

    training *
    Input modes: File
    Content types: application/zip, text/plain, application/json, text/csv
    Compression types: None

    Model input and output details

    Input

    Summary

    If you are using real-time inference, create the endpoint first and then use the following command to invoke it:

    aws sagemaker-runtime invoke-endpoint --endpoint-name "endpoint-name" --body fileb://$file_name --content-type text/csv --accept application/output.csv
    Input MIME type
    text/csv
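The same call can be made from Python with boto3. This is a sketch under assumptions: the endpoint name and file path are placeholders, the `Accept` value is copied verbatim from the CLI command above, and the response is assumed to be UTF-8 text. The helper names `build_invoke_args` and `invoke` are hypothetical; `invoke_endpoint` is the boto3 SageMaker Runtime call.

```python
def build_invoke_args(endpoint_name, payload):
    """Assemble keyword arguments for the SageMaker Runtime invoke_endpoint call."""
    return {
        "EndpointName": endpoint_name,
        "Body": payload,                     # raw bytes of the CSV input file
        "ContentType": "text/csv",
        "Accept": "application/output.csv",  # copied as-is from the CLI command
    }


def invoke(endpoint_name, csv_path):
    """Send a CSV file to an existing endpoint and return the response text."""
    import boto3  # imported here so the helper above works without boto3

    with open(csv_path, "rb") as f:
        payload = f.read()
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(**build_invoke_args(endpoint_name, payload))
    # Assumption: the endpoint returns UTF-8 encoded text.
    return response["Body"].read().decode("utf-8")
```

A call such as `invoke("endpoint-name", "input.csv")` requires AWS credentials and a running endpoint, so it is not executed here.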

    Output

    Summary
    • Content types: text/csv
    • The output is a CSV file with data points sampled from the unlabeled data. It contains the ID and the associated text of each sample selected for human annotation.
    • These sampled data points should be annotated, added to train.csv, and removed from unlablled.csv before the machine learning model is trained again.
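The bookkeeping step between two training rounds can be sketched as follows. The example works on in-memory rows such as those produced by `csv.DictReader`. It assumes the unlabeled file holds ID and Text columns and that annotators add a Category value to each sampled row; the function name `apply_annotations` is hypothetical.

```python
def apply_annotations(train_rows, unlabeled_rows, annotated_rows):
    """Move annotated samples from the unlabeled pool into the training set.

    Each row is a dict. Returns (new_train_rows, new_unlabeled_rows)
    without modifying the input lists.
    """
    for row in annotated_rows:
        if not row.get("Category"):
            raise ValueError(f"sample {row['ID']} has no Category yet")
    annotated_ids = {row["ID"] for row in annotated_rows}
    new_train = list(train_rows) + [
        {"ID": r["ID"], "Text": r["Text"], "Category": r["Category"]}
        for r in annotated_rows
    ]
    # Drop the annotated samples so they are not sampled again.
    new_unlabeled = [r for r in unlabeled_rows if r["ID"] not in annotated_ids]
    return new_train, new_unlabeled


# Hypothetical data for one iteration.
train = [{"ID": "1", "Text": "refund please", "Category": "billing"}]
unlabeled = [
    {"ID": "2", "Text": "app crashes on start"},
    {"ID": "3", "Text": "change my password"},
]
annotated = [{"ID": "2", "Text": "app crashes on start", "Category": "bug"}]
train, unlabeled = apply_annotations(train, unlabeled, annotated)
```

The updated rows can then be written back with `csv.DictWriter` before the next training job is started.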
    Output MIME type
    text/plain, text/csv, application/json

    End User License Agreement

    By subscribing to this product you agree to the terms and conditions outlined in the product End User License Agreement (EULA).

    Support Information

    Active Learning for Text Classification

    For any assistance reach out to us at:

    AWS Infrastructure

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.


    Refund Policy

    Currently we do not support refunds, but you can cancel your subscription to the service at any time.

    Customer Reviews

    There are currently no reviews for this product.