Artificial Intelligence

Syed Jaffry

Author: Syed Jaffry

Syed Jaffry is a Sr. Solutions Architect with AWS. He works with a range of companies from mid-sized organizations to large enterprises, financial services to ISVs, in helping them build and operate secure, resilient, scalable, and high performance applications in the cloud.

Model hosting patterns in Amazon SageMaker, Part 3: Run and optimize multi-model inference with Amazon SageMaker multi-model endpoints

Amazon SageMaker multi-model endpoint (MME) enables you to cost-effectively deploy and host multiple models in a single endpoint and then horizontally scale the endpoint to achieve scale. As illustrated in the following figure, this is an effective technique to implement multi-tenancy of models within your machine learning (ML) infrastructure. We have seen software as a […]

How to scale machine learning inference for multi-tenant SaaS use cases

This post is co-written with Sowmya Manusani, Sr. Staff Machine Learning Engineer at Zendesk Zendesk is a SaaS company that builds support, sales, and customer engagement software for everyone, with simplicity as the foundation. It thrives on making over 170,000 companies worldwide serve their hundreds of millions of customers efficiently. The Machine Learning team at […]