SageMaker Takes the Heavy Lifting Out of Machine Learning

Kumar Venkateswar, SageMaker PM

Crypto. Fintech. Big Data. The technology industry loves buzzwords, but with so many new ideas and themes, it’s hard to know which ones will actually break through and have an industry-changing impact.

One movement that is here to stay is the use of artificial intelligence (AI) and machine learning (ML). Once considered the domain of futurists and sci-fi movies, AI/ML has found new applications in a variety of verticals, thanks to the introduction of products like Amazon SageMaker.

Prior to SageMaker, many startups were unable to take advantage of the benefits of AI/ML because they didn’t have the resources necessary to build and train large-scale data models. Now, even early-stage companies have the ability to build, train, and deploy machine learning models quickly and easily, decreasing the investment needed while also increasing the number of ways ML can be applied.

To get a better understanding of the service and how it came to be, we sat down with one of the product managers of SageMaker, Kumar Venkateswar. He told us about what the team set out to build, the problems they were looking to fix, and what startups are currently using the service.

AWS Startups: In a nutshell, what is SageMaker and what does it do?

Kumar Venkateswar: SageMaker is an end-to-end ML service that enables data scientists, developers, and machine learning experts to quickly and cost-effectively start building, training, and deploying models.

How has SageMaker improved upon the previous methods for ML?

SageMaker helps in three main ways: First, it reduces the cost dramatically. Companies used to have to hire infrastructure engineers who were dedicated to data science, as well as pay to set up EMR clusters just for ML. SageMaker handles the heavy lifting on infrastructure, which decreases the time commitment needed from engineers and eliminates the need for standing infrastructure to support ML activity. This means you can scale your systems up and down depending on your immediate needs.

Secondly, it makes the whole process easier while also being flexible. One of the coolest things about SageMaker is that it supports all the major frameworks. Not only does that mean you can use TensorFlow, MXNet, Caffe2, or Gluon, but you can also run them all in much the same way without a lot of editing and changing. This gives users the flexibility to swap out frameworks and algorithms as they wish, an openness that is noticeable throughout many services in the AWS ecosystem.

Lastly, SageMaker makes the process quicker as well. The platform takes care of a lot of the prep work, which decreases the amount of time needed to run an ML test. This enables users to iterate more quickly and opens up the potential applications of ML.

One of the cooler SageMaker features for startups is that it comes with preloaded algorithms. Could you explain what that means and how they were chosen?

Definitely. So, like you said, SageMaker comes preloaded with 12 ready-to-use algorithms that can be customized by each user to build a unique model. We included that set because they’re the most commonly used throughout the industry. 95% of ML is not deep learning or the cutting-edge stuff, but rather simpler algorithms applied in interesting ways. We took those commonly used algorithms (like linear learning, XGBoost, random forest, and gradient boosted trees) and made it so anyone can easily apply them to their specific situation.
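To make the point about simple algorithms concrete, here is a minimal sketch of a one-feature linear model fit by ordinary least squares in plain Python. It is an illustration only and is not SageMaker's built-in implementation; the function names (fit_line, predict) and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)
print(predict(model, 5.0))  # 11.0
```

A model this small trains instantly on a laptop; the interview's point is that the same kind of algorithm, run at scale on real data, covers a large share of practical ML work.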

So, it’s clear that SageMaker cuts down the cost of the necessary components to set up an ML model, but is it actually cheap enough for a startup to train and deploy these models?

Absolutely! We already have startup customers, such as Convoy, Hudl, and Figure Eight, that train their models, some daily, for pretty affordable amounts. [Ed note: See below for more information about these companies!]

How did you come to work on SageMaker?

I came to Amazon in 2016 to concentrate specifically on ML initiatives. After talking with a ton of customers to learn what pain points we could help with, I found that I kept hearing about issues similar to ones I had previously dealt with while working in the ML applications space.

At the time, there were a lot of repeatable activities users needed to go through when setting up an ML experiment. Those included creating a notebook environment to do data exploration, setting up a cluster in order to do distributed training, and then setting up a distributed environment to perform inferences at scale. You’d have to do these things as one-offs each time, which made it so that people would only use ML in critical situations because of the time and money investment needed. We built SageMaker to address these problems in a very Amazon way: with an emphasis on scalability and availability.

We’ve talked about how SageMaker is built to be easy for beginners. Does that mean more advanced AI/ML users won’t find it as helpful?

While we have built it to be easily used for everyday ML, it’s by no means limited to that—we are also pushing the boundaries. A good example of this is the Automatic Model Tuning feature, which is an expert tool for experts. All that to say, we’ve worked hard to make sure the platform scales well with any user’s expertise.

For example, there are some really interesting ways companies are leveraging SageMaker in the autonomous vehicle space, as well as the video analytics space. These are the types of problems where customers need to scale their solutions over 50-100 Volta GPUs at once, which requires enormous compute power but is doable with SageMaker.
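The idea behind hyperparameter tuning, which the Automatic Model Tuning feature mentioned above automates, can be sketched with a plain random search. This is a simplified stand-in and does not claim to reflect how the SageMaker feature works internally; validation_loss, the parameter names, and their ranges are all hypothetical.

```python
import random


def validation_loss(learning_rate, max_depth):
    """Hypothetical stand-in for training a model and scoring it on held-out data."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Sample hyperparameters at random and keep the best-scoring trial."""
    rng = random.Random(seed)  # seeded so repeated runs give the same trials
    trials = []
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "max_depth": rng.randint(2, 10),
        }
        trials.append((validation_loss(**params), params))
    best_loss, best_params = min(trials, key=lambda trial: trial[0])
    return best_loss, best_params, trials


best_loss, best_params, trials = random_search(n_trials=20)
print(best_params, best_loss)
```

In a real setting each trial would be a full training job, which is why running many of them in parallel on managed infrastructure matters.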

Startups using SageMaker

“At Convoy we provide a freight-shipping network, connecting available truckers and trucking capacity with companies looking to optimize their supply chains. One of the ways we use machine learning is to build our supply availability model, which forecasts the activity within our ecosystem to help us make more informed decisions. The biggest benefit we’ve seen from SageMaker so far is that it enables our engineering team and data science team to easily collaborate, which also makes it easier to deploy models to production. With that, each team can focus on their own objectives, while still driving towards a shared goal.”

-David Tsai, Growth & Marketplace Engineering at Convoy

“Hudl uses machine learning and deep learning in a variety of ways, including automated player tracking and tagging of sports videos when interesting or notable events occur. Although we’re still getting ramped up using SageMaker, our team is bullish on the service as a whole and we expect it to save us time training TensorFlow models. Previously, we were doing this on a couple of local development servers, which can create a bottleneck when multiple people are looking to use a single GPU. SageMaker accelerates that whole process. Also, we’re big fans of the Jupyter Notebook instances.”

-Ben Cook, Research Director at Hudl

“Our team at Figure Eight has built a platform that trains, tests, and tunes machine learning models for companies in a variety of industries. For example, one of our previous projects was working with an online marketplace to automatically shorten product titles (which can often be 40-50 words long) so they could then be read aloud for a conversational shopping experience. A standout SageMaker feature for us is its ability to take care of provisioning and hosting. Historically, our team often had to spend more time on those parts of the process than actually building out the models. SageMaker has enabled us to both spend more time focusing on important processes and increase the rate at which we prototype solutions.”

-Joan Xiao, Lead Machine Learning Scientist at Figure Eight
