Posted On: Jun 28, 2022

Amazon SageMaker provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning practitioners get started on training and deploying machine learning models quickly. These algorithms and models can be used for both supervised and unsupervised learning. They can process various types of input data including tabular, image, and text.

Starting today, Amazon SageMaker provides four new tabular data modeling algorithms: LightGBM, CatBoost, AutoGluon-Tabular, and TabTransformer. These popular, state-of-the-art algorithms can be used for both tabular classification and regression tasks. They are available through the SageMaker JumpStart UI in SageMaker Studio, as well as through Python code using the SageMaker Python SDK. To learn how to use these algorithms, see the SageMaker example notebooks below:

  • LightGBM is a popular, high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT) algorithm. To learn how to use this algorithm, please see the example notebooks for Classification and Regression.
  • CatBoost is another popular, high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT) algorithm. To learn how to use this algorithm, please see the example notebooks for Classification and Regression.
  • AutoGluon-Tabular is an open-source AutoML project, developed and maintained by Amazon, that performs advanced data processing, deep learning, and multi-layer stack ensembling. To learn how to use this algorithm, please see the example notebooks for Classification and Regression.
  • TabTransformer is a novel deep tabular data modeling architecture built on self-attention-based Transformers, an innovation from Amazon Science research. To learn how to use this algorithm, please see the example notebooks for Classification and Regression.

All four algorithms are available in all regions where Amazon SageMaker is available. To get started with these new models on SageMaker, refer to the documentation.
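
As a rough illustration of the Python route mentioned above, the sketch below shows one way the training artifacts for these algorithms might be looked up with the SageMaker Python SDK. The model ID strings in the table, the helper names `jumpstart_model_id` and `retrieve_training_artifacts`, and the default instance type are assumptions made for illustration and are not taken from this announcement; check the IDs against the current JumpStart model listing before relying on them. The second helper also assumes the `sagemaker` package is installed and AWS credentials and a default region are configured.

```python
# Assumed JumpStart model IDs for the four algorithms (illustrative only;
# verify against the JumpStart model listing in the documentation).
JUMPSTART_MODEL_IDS = {
    ("lightgbm", "classification"): "lightgbm-classification-model",
    ("lightgbm", "regression"): "lightgbm-regression-model",
    ("catboost", "classification"): "catboost-classification-model",
    ("catboost", "regression"): "catboost-regression-model",
    ("autogluon-tabular", "classification"): "autogluon-classification-ensemble",
    ("autogluon-tabular", "regression"): "autogluon-regression-ensemble",
    ("tabtransformer", "classification"): "pytorch-tabtransformerclassification-model",
    ("tabtransformer", "regression"): "pytorch-tabtransformerregression-model",
}


def jumpstart_model_id(algorithm: str, task: str) -> str:
    """Return the assumed JumpStart model ID for an algorithm/task pair."""
    key = (algorithm.strip().lower(), task.strip().lower())
    try:
        return JUMPSTART_MODEL_IDS[key]
    except KeyError:
        raise ValueError(f"Unknown algorithm/task pair: {key}") from None


def retrieve_training_artifacts(
    algorithm: str,
    task: str,
    instance_type: str = "ml.m5.xlarge",  # hypothetical default
    model_version: str = "*",  # "*" requests the latest available version
) -> dict:
    """Look up the training image, script, and pre-trained model URIs.

    Requires the SageMaker Python SDK plus AWS credentials and a region.
    """
    # Imported lazily so the lookup helper above works without the SDK.
    from sagemaker import image_uris, model_uris, script_uris

    model_id = jumpstart_model_id(algorithm, task)
    return {
        "model_id": model_id,
        "image_uri": image_uris.retrieve(
            region=None,
            framework=None,
            model_id=model_id,
            model_version=model_version,
            image_scope="training",
            instance_type=instance_type,
        ),
        "script_uri": script_uris.retrieve(
            model_id=model_id,
            model_version=model_version,
            script_scope="training",
        ),
        "model_uri": model_uris.retrieve(
            model_id=model_id,
            model_version=model_version,
            model_scope="training",
        ),
    }
```

The returned URIs would typically be passed to a `sagemaker.estimator.Estimator` along with an IAM role and training data location; the example notebooks cover that full workflow.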