
Amazon SageMaker

Amazon SageMaker is a fully managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. Amazon SageMaker removes the barriers and complexity that typically slow down developers who want to use machine learning. The service includes modules that can be used together or independently to build, train, and deploy your machine learning models.


H2O.ai H2O-3 GBM Algorithm

By: H2O.ai
Latest Version: 0.1
1 AWS review
Gradient Boosting Machine from H2O-3 Core Library

    Product Overview

    Gradient Boosting Machine (for Regression and Classification) is a forward learning ensemble method. The guiding heuristic is that good predictive results can be obtained through increasingly refined approximations. H2O’s GBM sequentially builds regression trees on all the features of the dataset in a fully distributed way - each tree is built in parallel.
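
    As a sketch of that heuristic in standard GBM notation (following Hastie et al. (2001), which this implementation cites below): each stage m fits a new tree h_m to the current residuals and refines the ensemble by

    $$F_m(x) = F_{m-1}(x) + \nu \, h_m(x), \qquad 0 < \nu \le 1,$$

    where the shrinkage factor \nu corresponds to the learn_rate hyperparameter listed under Usage Information.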

    Key Data

    Type: Algorithm
    Fulfillment Methods: Amazon SageMaker

    Highlights

    • H2O’s Gradient Boosting Algorithms follow the algorithm specified by Hastie et al. (2001)


    Pricing Information

    Use this tool to estimate the software and infrastructure costs based on your configuration choices. Your actual usage and costs might differ from this estimate and will be reflected on your monthly AWS billing reports.


    Estimating your costs

    Choose your region and launch option to see the pricing details. Then, modify the estimated price by choosing different instance types.


    Software Pricing

    Algorithm Training: $0.00/hr (running on ml.c5.2xlarge)

    Model Realtime Inference: $0.00/hr (running on ml.c5.2xlarge)

    Model Batch Transform: $0.00/hr (running on ml.c5.2xlarge)

    Infrastructure Pricing

    With Amazon SageMaker, you pay only for what you use. Training and inference are billed by the second, with no minimum fees and no upfront commitments. Pricing within Amazon SageMaker is broken down by on-demand ML instances, ML storage, and fees for data processing in notebooks and inference instances.
    Learn more about SageMaker pricing

    SageMaker Algorithm Training: $0.408/host/hr (running on ml.c5.2xlarge)

    SageMaker Realtime Inference: $0.408/host/hr (running on ml.c5.2xlarge)

    SageMaker Batch Transform: $0.408/host/hr (running on ml.c5.2xlarge)
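
    As a worked example at the rates above: a 30-minute training job on one ml.c5.2xlarge host would cost roughly 0.5 hr × ($0.408 infrastructure + $0.00 software) ≈ $0.20, billed by the second. These figures are illustrative; actual charges depend on your region and current rates.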

    Algorithm Training

    For algorithm training in Amazon SageMaker, the software is priced at an hourly rate that can vary by instance type. Additional infrastructure costs, taxes, or fees may apply.
    Instance Type                        Algorithm/hr
    ml.c5.2xlarge (Vendor Recommended)   $0.00
    ml.c5.4xlarge                        $0.00
    ml.c5.9xlarge                        $0.00
    ml.c5.18xlarge                       $0.00
    ml.c4.2xlarge                        $0.00
    ml.c4.4xlarge                        $0.00
    ml.c4.8xlarge                        $0.00
    ml.m5.xlarge                         $0.00
    ml.m5.2xlarge                        $0.00
    ml.m5.4xlarge                        $0.00
    ml.m5.12xlarge                       $0.00
    ml.m5.24xlarge                       $0.00
    ml.m4.2xlarge                        $0.00
    ml.m4.4xlarge                        $0.00
    ml.m4.10xlarge                       $0.00
    ml.m4.16xlarge                       $0.00

    Usage Information

    Fulfillment Methods

    Amazon SageMaker

    See the documentation for the list of all available parameters that can be passed to the algorithm. Note: the only required parameter is the "training" hyperparameter. Be sure to define "distribution" if the expected target is categorical, or define "categorical_columns" with the specific categorical columns in the dataset. A minimal launch sketch follows.
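
    The sketch below launches a training job with the SageMaker Python SDK. The algorithm ARN, IAM role, S3 paths, and the exact FreeText format of the "training" hyperparameter are assumptions (the listing does not show them); copy the real ARN from your Marketplace subscription.

        import sagemaker
        from sagemaker.algorithm import AlgorithmEstimator

        # Placeholder ARN: use the one shown in your AWS Marketplace subscription.
        ALGO_ARN = "arn:aws:sagemaker:us-east-1:123456789012:algorithm/h2o-gbm"

        estimator = AlgorithmEstimator(
            algorithm_arn=ALGO_ARN,
            role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
            instance_count=1,
            instance_type="ml.c5.2xlarge",  # vendor-recommended instance type
            sagemaker_session=sagemaker.Session(),
            hyperparameters={
                # "training" is the only required hyperparameter; a JSON-style
                # string is assumed here, since the listing only says FreeText.
                "training": '{"target": "label", "distribution": "bernoulli"}',
                "ntrees": "50",
                "learn_rate": "0.1",
            },
        )

        # "training" is the single required channel (File mode, CSV content type).
        estimator.fit({"training": "s3://my-bucket/h2o-gbm/train.csv"})  # placeholder URI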

    Metrics

    Name      Regex
    AUC       AUC: ([0-9.]*)
    MSE       MSE: ([0-9.]*)
    RMSE      RMSE: ([0-9.]*)
    auc_pr    auc_pr: ([0-9.]*)
    LogLoss   LogLoss: ([0-9.]*)
    Gini      Gini: ([0-9.]*)
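
    SageMaker applies each regex to the algorithm's training log and records capture group 1 as the metric value. A minimal sketch of that extraction in Python, against a hypothetical log line (the real log format comes from the training container):

        import re

        log_line = "AUC: 0.9731"  # hypothetical training-log line

        metric_regexes = {
            "AUC": r"AUC: ([0-9.]*)",
            "RMSE": r"RMSE: ([0-9.]*)",
            "LogLoss": r"LogLoss: ([0-9.]*)",
        }

        for name, pattern in metric_regexes.items():
            match = re.search(pattern, log_line)
            if match:
                # Capture group 1 holds the numeric value, e.g. AUC 0.9731.
                print(name, float(match.group(1)))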

    Channel specification

    Fields marked with * are required

    training *

    training data
    Input modes: File
    Content types: csv
    Compression types: None
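
    Because the channel expects CSV in File mode, the training input is a plain CSV object in S3 with a header row. A hypothetical example, where the label column matches the target named in the "training" hyperparameter (column names are illustrative):

        feature_1,feature_2,color,label
        0.52,1.30,red,0
        0.11,2.70,blue,1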

    Hyperparameters

    Fields marked with * are required

    training *
    Training Parameters: distribution?, categorical_columns?, target?
    Type: FreeText
    Tunable: No

    balance_classes

    Balance training data class counts via over/under-sampling
    Type: Categorical
    Tunable: No

    categorical_encoding

    One of: auto, enum, one_hot_internal, one_hot_explicit, binary, eigen, label_encoder, sort_by_response, enum_limited (default: auto).
    Type: FreeText
    Tunable: No

    class_sampling_factors

    Desired over/under-sampling ratios per class (in lexicographic order), e.g., '1.0,1.5,1.7'
    Type: FreeText
    Tunable: No

    col_sample_rate

    Column sample rate (from 0.0 to 1.0)
    Type: Continuous
    Tunable: No

    col_sample_rate_change_per_level

    Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0)
    Type: Continuous
    Tunable: No

    col_sample_rate_per_tree

    Column sample rate per tree (from 0.0 to 1.0)
    Type: Continuous
    Tunable: No

    distribution

    Distribution function
    Type: FreeText
    Tunable: No

    fold_assignment

    Cross-validation fold assignment scheme, if fold_column is not specified.
    Type: FreeText
    Tunable: No

    fold_column

    Column with cross-validation fold index assignment per observation.
    Type: FreeText
    Tunable: No

    histogram_type

    What type of histogram to use for finding optimal split points
    Type: FreeText
    Tunable: No

    huber_alpha

    Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).
    Type: Continuous
    Tunable: No

    ignore_const_cols

    Ignore constant columns.
    Type: Categorical
    Tunable: No

    ignored_columns

    Names of columns to ignore for training
    Type: FreeText
    Tunable: No

    learn_rate

    Learning rate (from 0.0 to 1.0)
    Type: Continuous
    Tunable: No

    learn_rate_annealing

    Scale the learning rate by this factor after each tree (e.g., 0.99 or 0.999)
    Type: Continuous
    Tunable: No

    max_abs_leafnode_pred

    Maximum absolute value of a leaf node prediction
    Type: Continuous
    Tunable: No

    max_after_balance_size

    Maximum relative size of the training data after balancing class counts
    Type: Continuous
    Tunable: No

    max_depth

    Maximum tree depth.
    Type: Integer
    Tunable: No

    max_hit_ratio_k

    Maximum number (top K) of predictions to use for hit ratio computation
    Type: Integer
    Tunable: No

    max_runtime_secs

    Maximum allowed runtime in seconds for model training. Use 0 to disable.
    Type: Continuous
    Tunable: No

    min_rows

    Fewest allowed (weighted) observations in a leaf
    Type: Integer
    Tunable: No

    min_split_improvement

    Minimum relative improvement in squared error reduction for a split to happen
    Type: Integer
    Tunable: No

    nbins

    For numerical columns (real/int), build a histogram of (at least) this many bins, then split at the best point
    Type: Integer
    Tunable: No

    nbins_cats

    For categorical columns (factors), build a histogram of this many bins, then split at the best point. Higher values can lead to more overfitting.
    Type: Integer
    Tunable: No

    nbins_top_level

    For numerical columns (real/int), build a histogram of (at most) this many bins at the root level, then decrease by factor of two per level
    Type: Integer
    Tunable: No

    nfolds

    Number of folds for K-fold cross-validation (0 to disable or >= 2).
    Type: Integer
    Tunable: No

    ntrees

    Number of trees.
    Type: Integer
    Tunable: No

    offset_column

    Offset column. This will be added to the combination of columns before applying the link function.
    Type: FreeText
    Tunable: No

    pred_noise_bandwidth

    Bandwidth (sigma) of Gaussian multiplicative noise ~N(1,sigma) for tree node predictions
    Type: Continuous
    Tunable: No

    quantile_alpha

    Desired quantile for Quantile regression, must be between 0 and 1.
    Type: Continuous
    Tunable: No

    sample_rate

    Row sample rate per tree (from 0.0 to 1.0)
    Type: Continuous
    Tunable: No

    sample_rate_per_class

    A list of row sample rates per class (relative fraction for each class, from 0.0 to 1.0) for each tree, e.g., '1.3,1.1,0.5'
    Type: FreeText
    Tunable: No

    score_each_iteration

    Whether to score during each iteration of model training.
    Type: Categorical
    Tunable: No

    score_tree_interval

    Score the model after every so many trees. Disabled if set to 0.
    Type: Integer
    Tunable: No

    seed

    Seed for pseudo random number generator
    Type: Integer
    Tunable: No

    stopping_metric

    One of: auto, deviance, logloss, mse, rmse, mae, rmsle, auc, lift_top_group, misclassification, mean_per_class_error (default: auto).
    Type: FreeText
    Tunable: No

    stopping_rounds

    Early stopping based on convergence of stopping_metric.
    Type: Integer
    Tunable: No

    stopping_tolerance

    Relative tolerance for metric-based stopping criterion
    Type: Continuous
    Tunable: No

    tweedie_power

    Tweedie power
    Type: Continuous
    Tunable: No

    weights_column

    Column with observation weights.
    Type: FreeText
    Tunable: No

    End User License Agreement

    By subscribing to this product, you agree to the terms and conditions outlined in the product's End User License Agreement (EULA).

    Support Information

    AWS Infrastructure

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced technical support engineers. The service helps customers of all sizes and technical abilities successfully use the products and features provided by Amazon Web Services.


    Refund Policy

    There is no refund policy, as the algorithm is offered free of charge.

    Customer Reviews

    Anonymous
    Only loads JSON
    Aug 9, 2019 (verified purchase review from AWS Marketplace)
    This model does not train because it tries to load JSON data only. I inputted a .csv but it did not work. …