Machine Learning for Everyone with Amazon SageMaker Autopilot and Domo

By Tamara Astakhova, Chaitanya Hazarey, Bo Niu, Solutions Architects – AWS
By Kavitha Rajendran, Data Scientist – AWS
By Ben Ainscough, Head of AI and Data Science Product – Domo

Domo is an AWS Machine Learning Competency Partner and modern, cloud-based business intelligence (BI) platform that puts well-governed BI in the hands of everyone across the organization.

Domo natively combines Integration Platform as a Service (IPaaS) capabilities for data integration, analytics, and visualizations for real-time and predictive insights. It also provides a foundation for rapidly building your own apps to take immediate action on those insights.

Customers depend on the Domo Business Cloud to access their data from a variety of sources and make it usable.

With the right data readily available, users are now looking to machine learning (ML) to help make the predictions needed to automate and speed up critical business processes and workflows.

Once ML predictions are produced, Domo provides a wide range of capabilities to leverage those predictions throughout the organization.

In Domo, users can easily produce visualizations, dashboards, or apps to communicate insights to the rest of the organization. These insights can even be published in other systems or webpages with the power of Domo Everywhere.

In this post, we describe how the Domo platform, powered by machine learning capabilities, can help organizations improve decision making and adapt faster to business changes.

Problem Statement

Machine learning allows users to derive insights about their business. Many organizations recognize the benefits of ML for business decision making. They also recognize the challenge of working with multidimensional data, which is difficult and time-consuming for humans to analyze.

Additionally, there are many business users who have deep knowledge of the data but insufficient data science and machine learning expertise to create ML models for their complex data.

The AutoML approach addresses both situations by automating the steps of the ML pipeline, which shortens the time needed to build a model.

Domo with Amazon SageMaker Autopilot

In collaboration with Amazon Web Services (AWS), Domo has created AutoML capabilities that are powered by Amazon SageMaker Autopilot.

Amazon SageMaker Autopilot is a fully managed AWS solution that automatically creates, trains, and tunes the best classification and regression ML models based on the data provided by a customer.

It automatically inspects your dataset and explores various solutions to determine the optimal combination of data preprocessing steps, ML algorithms, and hyperparameters.

Integrating Domo with Amazon SageMaker Autopilot allows business users to manage, clean, and refine data in Domo, and use AutoML to automatically create, train, and tune the ML models needed to get insights from their data.

It also enables users to share those insights with anyone, make predictions, and drive insightful business decisions.

With just a few clicks on the Domo platform, AutoML transforms your data to be ready for the machine learning process. It also runs hundreds of training jobs on any data set to automatically find the model that achieves the highest performance for your task.

You can then choose the model that best suits your business use case and easily deploy it to production to make predictions and drive better business decisions.

The combination of Domo AutoML and Amazon SageMaker Autopilot will help you more quickly adapt to incoming data to make business decisions faster.
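
For readers curious about what an Autopilot job looks like underneath, the following is a minimal sketch of the request accepted by the public SageMaker CreateAutoMLJob API through boto3. It is not Domo's implementation, which this post does not describe. The job name, S3 paths, role ARN, and target column are hypothetical placeholders, and the sketch only builds the request so it can run without AWS credentials.

```python
def build_autopilot_request(job_name, train_s3_uri, output_s3_uri,
                            target_column, role_arn, max_candidates=100):
    """Build keyword arguments for boto3's SageMaker create_auto_ml_job call."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [{
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_s3_uri,
                }
            },
            # Column that Autopilot should learn to predict.
            "TargetAttributeName": target_column,
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Cap on how many candidate models Autopilot may train.
        "AutoMLJobConfig": {
            "CompletionCriteria": {"MaxCandidates": max_candidates}
        },
        "RoleArn": role_arn,
    }


# All values below are hypothetical placeholders.
request = build_autopilot_request(
    job_name="example-automl-job",
    train_s3_uri="s3://example-automl-bucket/train/",
    output_s3_uri="s3://example-automl-bucket/output/",
    target_column="taste_score",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)

# With AWS credentials configured, the job could be started with:
# import boto3
# boto3.client("sagemaker").create_auto_ml_job(**request)
```

The problem type is left out on purpose: when it is omitted, Autopilot infers from the target column whether the task is classification or regression.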

Integration Overview

Architecture

The architecture diagram in Figure 1 shows how the Domo platform is integrated with Amazon SageMaker Autopilot and Amazon SageMaker Batch Transform, which automatically manages the processing of large datasets.

Figure 1 – AutoML processing using Domo and Amazon SageMaker Autopilot.

The Domo platform, hosted on AWS, connects customer data from cloud, on-premises, and/or proprietary systems and processes it in real time.
Domo integrates with Amazon SageMaker Autopilot to automatically train ML models with customer data.
Domo integrates with Amazon SageMaker Batch Transform to make predictions on customer data with the ML model created via Amazon SageMaker Autopilot.
Domo presents the results to the customer through its user interface (UI) and provides the ability to export data back to the customer's source system.

Data Flow

The diagram in Figure 2 shows the automation of the ML process using Domo AutoML, powered by Amazon SageMaker Autopilot and Amazon SageMaker Batch Transform.

Figure 2 – AutoML process data flow.

Customers import data from cloud, on-premises, and/or proprietary systems into the Domo platform via Domo connectors.
Customers manage the dataset using the Domo console UI.
Customers prepare and clean the data with Domo Magic ETL (Extract, Transform, Load).
Customers create an AutoML job in the Domo console for a specific dataset, which is processed by Amazon SageMaker Autopilot.
Customers review the training results and pick the best candidate produced by the training job.
Predictions are made as the dataset is updated with new data, which is processed by Amazon SageMaker Batch Transform.
Customers view the prediction via visualization in the Domo console UI.

Getting Data Ready for Supervised Learning

There are multiple ways to bring your data into the Domo platform and to process and transform it:

Pre-Domo Processing Options: You’ll generally use Workbench and Domo connectors to bring data into Domo, and you can perform some minimal transformations within Workbench and certain connectors before you upload the data into Domo.
In-Domo Processing Options: After you’ve brought your data into Domo, you’ll generally use Magic Transforms (ETL, MySQL, Amazon Redshift, or DataFusion) to create and transform new datasets within Domo.
In-Card Processing Options: After you’ve created datasets and started to build cards with them, you can use Beast Mode within Analyzer to add any additional dimensions or calculations that you need.

How to Use Domo AutoML

These are the steps for anyone to onboard into Domo and start using AutoML:

Upload data into Domo for AutoML.
Build a machine learning model on training data.
Deploy the model and run inference via AutoML.

Step 1: Upload Your Data into Domo

After logging in to your Domo instance, click the Data tab.

Domo has five different ways to bring data into your Domo instance:

Cloud Apps: There are 664 Cloud App Connectors available, including one for Amazon Simple Storage Service (Amazon S3).
Files: Direct file upload from your local machine.
Database
On-premises
Via API

1(a) Import CSV File via UI into Domo to Start AutoML

By selecting the File option, you can drag and drop your CSV file into Domo.
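
Because AutoML works on tabular data, it can help to confirm that a CSV file is rectangular before uploading it. The following is a minimal sketch of such a local check using Python's standard csv module. It is an optional pre-upload step assumed for illustration, not part of Domo, and the sample columns are made up.

```python
import csv
import io


def check_tabular(csv_text, delimiter=","):
    """Return (header, row_count); raise ValueError if a row's width differs from the header's."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, None)
    if not header:
        raise ValueError("file is empty or has no header row")
    row_count = 0
    # Row 1 is the header, so data rows are numbered from 2.
    for row_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore completely blank lines
        if len(row) != len(header):
            raise ValueError(
                f"row {row_no} has {len(row)} fields, expected {len(header)}"
            )
        row_count += 1
    return header, row_count


# Made-up sample data for illustration only.
sample = "store_id,visits,score\n101,420,4.5\n102,380,4.1\n"
header, rows = check_tabular(sample)
```

For a file on disk, the same function can be called with the file's contents, for example `check_tabular(open("data.csv", newline="").read())`, where `data.csv` is a placeholder name.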

1(b) Upload Data From Your S3 Bucket into Domo

In your AWS account, create an AWS Identity and Access Management (IAM) user that has access to the S3 bucket containing the data to be uploaded into Domo for AutoML.
Get the Access Key Id, Secret Access Key, Bucket Name, and Region.
Click on the CLOUD APP tab.

Click on Amazon S3 Advanced and then GET THE DATA.

In the Credentials drop-down, select Add Account and provide your AWS account IAM user Access Key Id, Secret Access Key, Bucket Name, and Region.

If the credentials are correct, your Domo instance connects to S3 and asks you for the details of the object(s), such as file name, file type, file compression type, and delimiter character.
Note that Domo currently supports only tabular file formats.
For real-time data, Domo provides a way to pull only the delta (the updated portion of the file) from S3.
You can schedule the data pull every hour, every day at a particular time, every weekday, every week, or every month.
If you want to upload only once, select the Manually option in the drop-down.

Add a name and description for your dataset and click SAVE.
Once the data is uploaded successfully, it will be available under Datasets.
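
The IAM user created at the start of this step only needs to read from the bucket. The following is a minimal sketch of a read-only IAM policy document, built in Python. The exact permissions the Domo connector requires are not listed in this post, so the action list is an assumption, and the bucket name is a hypothetical placeholder.

```python
import json


def build_s3_read_policy(bucket_name):
    """Build a read-only IAM policy document for a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permissions apply to the bucket ARN itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Object-level permissions apply to keys inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# "example-automl-bucket" is a placeholder, not a real bucket.
policy = build_s3_read_policy("example-automl-bucket")
print(json.dumps(policy, indent=2))
```

Scoping the policy to one bucket and to read actions keeps the access key shared with Domo limited to the data intended for AutoML.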

Step 2: Build an ML Model on Your Training Data

Select the dataset you uploaded, on which the ML model will be trained.
In the Overview tab, you can view the list of options: Direct impact, Create visualization, Share this dataset, Tidy up this dataset, Create a new alert, Train new AutoML.

By clicking Train New AutoML, you can start the model training on your training dataset.
Once the model is trained, you'll be able to see the results under AutoML when you click on your dataset.
The result includes: number of test runs; model algorithm used (XGBoost/Linear learner); ML task type (classification/regression); prediction column/label name; model parameters; and performance metrics (such as prediction accuracy, training error, and validation error).
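
To illustrate how a best model can be chosen from these results, here is a minimal sketch that ranks candidates by their final objective metric. The dictionary shape follows the candidate entries returned by the SageMaker ListCandidatesForAutoMLJob API; how Domo ranks candidates internally is not described in this post, and the candidate names and metric values below are made up.

```python
def pick_best_candidate(candidates):
    """Return the completed candidate with the best final objective metric."""
    completed = [
        c for c in candidates
        if c.get("CandidateStatus") == "Completed"
        and "FinalAutoMLJobObjectiveMetric" in c
    ]
    if not completed:
        raise ValueError("no completed candidates to choose from")

    def score(candidate):
        metric = candidate["FinalAutoMLJobObjectiveMetric"]
        # Flip the sign for metrics that should be minimized (for example, MSE).
        if metric.get("Type") == "Minimize":
            return -metric["Value"]
        return metric["Value"]

    return max(completed, key=score)


# Made-up candidates for illustration only.
candidates = [
    {"CandidateName": "trial-1", "CandidateStatus": "Completed",
     "FinalAutoMLJobObjectiveMetric": {
         "Type": "Maximize", "MetricName": "validation:accuracy", "Value": 0.87}},
    {"CandidateName": "trial-2", "CandidateStatus": "Completed",
     "FinalAutoMLJobObjectiveMetric": {
         "Type": "Maximize", "MetricName": "validation:accuracy", "Value": 0.91}},
    {"CandidateName": "trial-3", "CandidateStatus": "Failed"},
]
best = pick_best_candidate(candidates)
```

Failed or unfinished runs are skipped, so only candidates that produced a final metric are compared.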

Step 3: Model Deployment and Inference via AutoML

Once the model training is done, click on Deploy algorithm.
This deploys the best model (out of 100 test runs in this example) and runs inference on the testing data.
You can upload the testing data into Domo as described in Step 1.
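
For context, inference on a full dataset corresponds to a SageMaker Batch Transform job. The following is a minimal sketch of the request accepted by the public CreateTransformJob API through boto3. It is not Domo's implementation. The job name, model name, S3 paths, and instance type are hypothetical placeholders, and the sketch only builds the request so it can run without AWS credentials.

```python
def build_transform_request(job_name, model_name, input_s3_uri, output_s3_uri,
                            instance_type="ml.m5.large", instance_count=1):
    """Build keyword arguments for boto3's SageMaker create_transform_job call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": input_s3_uri,
                }
            },
            # Treat each line of the CSV input as one record to score.
            "ContentType": "text/csv",
            "SplitType": "Line",
        },
        "TransformOutput": {
            "S3OutputPath": output_s3_uri,
            # Write one prediction per line in the output file.
            "AssembleWith": "Line",
        },
        "TransformResources": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


# All values below are hypothetical placeholders.
transform_request = build_transform_request(
    job_name="example-batch-inference",
    model_name="example-best-candidate-model",
    input_s3_uri="s3://example-automl-bucket/test/",
    output_s3_uri="s3://example-automl-bucket/predictions/",
)

# With AWS credentials configured, the job could be started with:
# import boto3
# boto3.client("sagemaker").create_transform_job(**transform_request)
```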

Freddy’s Case Study

Freddy’s Frozen Custard & Steakburgers is a fast-casual restaurant offering a unique combination of cooked-to-order Steakburgers, Vienna beef hot dogs, shoestring fries, and other savory items along with freshly churned frozen custard treats.

Founded in 2002 and franchised in 2004, Freddy’s has nearly 400 restaurants across 32 states in the U.S.

Freddy’s was seeking a way to easily understand variations in taste-of-food scores across restaurant franchises to optimize menus and incentivize guests at each location. This required the evaluation of 18 different datasets spanning more than 100 different columns of information created for each one of Freddy’s restaurants at multiple points in time.

With so many individual columns of data to consider, it was not feasible to manually identify which areas of each restaurant were doing well and which areas needed improvement.

Freddy’s used Domo AutoML, powered by Amazon SageMaker Autopilot, to automatically create machine learning models that extracted useful information from Freddy’s data to make predictions based on menu and price changes.

Freddy’s now uses these accurate ML models from within the Domo platform. As their data analysts were already familiar with the Domo platform, they were able to accelerate time to market by 28x.

As a result:

Freddy’s can use 5x larger datasets for accurate predictions.
Freddy’s launched a new guest loyalty program to effectively incentivize high-value guests.
Freddy’s conducted A/B testing to accurately score stores and optimize prices and menu mix.
Freddy’s increased foot traffic in their restaurants year over year, despite the COVID-19 impact, and is experiencing strong comp growth as well.

Conclusion

Machine learning provides insights into complex business problems, helping organizations automate processes and improve decision making.

Traditionally, training and deploying complex ML models involved multiple steps and required a unique set of skills at each step. It took a long time to complete the entire process.

Domo AutoML, powered by Amazon SageMaker Autopilot, addresses these challenges by automating pipeline steps and expediting the entire process. This helps organizations improve decision making and adapt faster to business changes.

Visit the Domo website to see the AutoML solution in action. You may also contact Domo to start a free trial.

Domo – AWS Partner Spotlight

Domo is an AWS Machine Learning Competency Partner and modern, cloud-based business management platform that puts real-time data directly in the hands of everyone in the organization.
