Posted On: Dec 16, 2022
Today, Amazon SageMaker Feature Store is announcing SageMaker Python SDK support for its offline store. Amazon SageMaker Feature Store is a fully managed, purpose-built repository to store, update, search, and share machine learning (ML) features. The SageMaker Feature Store offline store contains historical ML features, and you can use it to generate datasets for training and batch inference. Until today, you had to use Amazon Athena and AWS Glue and write ad hoc SQL queries to create these datasets.
With this release, you can use SageMaker Python SDK methods to create training datasets. The SDK can read the data into a dataframe or export it to a CSV file. Instead of writing complex SQL queries, you can call these methods for common offline store use cases such as joining Feature Groups, time travel, creating point-in-time accurate joins, and filtering duplicate records from training datasets.
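To make two of these use cases concrete, the following is a conceptual sketch in plain Python of what a point-in-time accurate join and duplicate filtering do. It is not the SageMaker Python SDK API and not how the offline store implements these operations. The function names (`point_in_time_join`, `dedupe_latest`), the field names (`customer_id`, `event_time`, `avg_spend`, `purchased`), and the sample records are all hypothetical and chosen only for illustration.

```python
from datetime import datetime


def dedupe_latest(records, key, time_field="event_time"):
    """Keep only the most recent record for each key value."""
    latest = {}
    for rec in records:
        k = rec[key]
        if k not in latest or rec[time_field] > latest[k][time_field]:
            latest[k] = rec
    return list(latest.values())


def point_in_time_join(base_rows, feature_rows, key, time_field="event_time"):
    """Attach to each base row the newest feature row with the same key
    whose event time is not later than the base row's event time.
    This avoids leaking future feature values into a training example."""
    joined = []
    for base in base_rows:
        candidates = [
            f for f in feature_rows
            if f[key] == base[key] and f[time_field] <= base[time_field]
        ]
        match = max(candidates, key=lambda f: f[time_field], default=None)
        row = dict(base)
        if match is not None:
            # Copy feature columns, leaving the base key and timestamp intact.
            for name, value in match.items():
                if name not in (key, time_field):
                    row[name] = value
        joined.append(row)
    return joined


# Hypothetical base records (labels) and feature records.
orders = [
    {"customer_id": "c1", "event_time": datetime(2022, 12, 1), "purchased": 1},
    {"customer_id": "c1", "event_time": datetime(2022, 12, 10), "purchased": 0},
    {"customer_id": "c2", "event_time": datetime(2022, 12, 5), "purchased": 1},
]
customer_features = [
    {"customer_id": "c1", "event_time": datetime(2022, 11, 20), "avg_spend": 40.0},
    {"customer_id": "c1", "event_time": datetime(2022, 12, 5), "avg_spend": 55.0},
    {"customer_id": "c2", "event_time": datetime(2022, 12, 6), "avg_spend": 12.0},
]

training_rows = point_in_time_join(orders, customer_features, key="customer_id")
latest_features = dedupe_latest(customer_features, key="customer_id")
```

In this sketch, the first `c1` order picks up the November feature value, the second picks up the December 5 value, and the `c2` order gets no feature columns because its only feature record was written after the order. The SDK methods announced here are intended to cover this kind of logic so that you do not have to hand-write it in SQL.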