Posted On: Jun 28, 2022
Amazon SageMaker provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning practitioners get started on training and deploying machine learning models quickly. These algorithms and models can be used for both supervised and unsupervised learning. They can process various types of input data including tabular, image, and text.
Starting today, Amazon SageMaker provides four new tabular data modeling algorithms: LightGBM, CatBoost, AutoGluon-Tabular, and TabTransformer. These popular, state-of-the-art algorithms can be used for both tabular classification and regression tasks. They are available through the SageMaker JumpStart UI inside SageMaker Studio, as well as through Python code using the SageMaker Python SDK. To learn how to use these algorithms, see the SageMaker example notebooks below:
- LightGBM is a popular, high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT) algorithm. To learn how to use this algorithm, please see example notebooks for Classification and Regression.
- CatBoost is another popular, high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT) algorithm. To learn how to use this algorithm, please see example notebooks for Classification and Regression.
- AutoGluon-Tabular is an open-source AutoML project, developed and maintained by Amazon, that performs advanced data processing, deep learning, and multi-layer stack ensembling. To learn how to use this algorithm, please see example notebooks for Classification and Regression.
- TabTransformer is a novel deep tabular data modeling architecture built upon self-attention-based Transformers, an innovation from Amazon science research. To learn how to use this algorithm, please see example notebooks for Classification and Regression.
More detailed explanations of how to use these algorithms can be found in the following blog posts: Bringing the power of deep learning to data in tables, and New built-in Amazon SageMaker algorithms for tabular data modeling: LightGBM, CatBoost, AutoGluon-Tabular, and TabTransformer.
All four algorithms are available in all AWS Regions where Amazon SageMaker is available. To get started with these new models on SageMaker, refer to the documentation.
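As a rough illustration of the SageMaker Python SDK route mentioned above, the sketch below looks up a JumpStart model ID for an algorithm and task, then retrieves the training container image, training script, pre-trained artifact, and default hyperparameters for it. The model ID strings, the helper names `jumpstart_model_id` and `retrieve_training_artifacts`, and the `ml.m5.xlarge` instance type are assumptions made for this example and are not taken from this announcement; check them against the current SageMaker documentation before use. The second helper needs the `sagemaker` package and valid AWS credentials.

```python
# Assumed JumpStart model IDs (not stated in this announcement);
# verify them against the current SageMaker documentation.
JUMPSTART_MODEL_IDS = {
    ("lightgbm", "classification"): "lightgbm-classification-model",
    ("lightgbm", "regression"): "lightgbm-regression-model",
    ("catboost", "classification"): "catboost-classification-model",
    ("catboost", "regression"): "catboost-regression-model",
    ("autogluon-tabular", "classification"): "autogluon-classification-ensemble",
    ("autogluon-tabular", "regression"): "autogluon-regression-ensemble",
    ("tabtransformer", "classification"): "pytorch-tabtransformerclassification-model",
    ("tabtransformer", "regression"): "pytorch-tabtransformerregression-model",
}


def jumpstart_model_id(algorithm: str, task: str) -> str:
    """Hypothetical helper: map an algorithm name and task to a model ID."""
    key = (algorithm.strip().lower(), task.strip().lower())
    try:
        return JUMPSTART_MODEL_IDS[key]
    except KeyError:
        raise ValueError(f"No known model ID for {algorithm!r} / {task!r}") from None


def retrieve_training_artifacts(model_id, instance_type="ml.m5.xlarge", model_version="*"):
    """Hypothetical helper: fetch what a training job for this model needs."""
    # Imported lazily so the lookup helper above works without the SDK installed.
    from sagemaker import hyperparameters, image_uris, model_uris, script_uris

    return {
        # Container image used to run the training job.
        "image_uri": image_uris.retrieve(
            region=None,
            framework=None,
            model_id=model_id,
            model_version=model_version,
            image_scope="training",
            instance_type=instance_type,
        ),
        # Training script bundle for the algorithm.
        "source_uri": script_uris.retrieve(
            model_id=model_id, model_version=model_version, script_scope="training"
        ),
        # Pre-trained model artifact used as the starting point.
        "model_uri": model_uris.retrieve(
            model_id=model_id, model_version=model_version, model_scope="training"
        ),
        # Default hyperparameters, which can be overridden before training.
        "hyperparameters": hyperparameters.retrieve_default(
            model_id=model_id, model_version=model_version
        ),
    }


if __name__ == "__main__":
    # Only the local lookup runs here; no AWS calls are made.
    print(jumpstart_model_id("LightGBM", "classification"))
```

The returned values would typically be passed to a `sagemaker.estimator.Estimator` to launch a training job; the example notebooks listed above show the full workflow.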