Artificial Intelligence

Frank Liu

Deploy large models on Amazon SageMaker using DJLServing and DeepSpeed model parallel inference

The last few years have seen rapid development in the field of natural language processing (NLP). Although hardware has improved, such as with the latest generation of accelerators from NVIDIA and Amazon, advanced machine learning (ML) practitioners still regularly encounter issues deploying their large language models. Today, we announce new capabilities in Amazon SageMaker that […]

Model serving in Java with AWS Elastic Beanstalk made easy with Deep Java Library

Deploying your machine learning (ML) models to run on a REST endpoint has never been easier. Using AWS Elastic Beanstalk and Amazon Elastic Compute Cloud (Amazon EC2) to host your endpoint and Deep Java Library (DJL) to load your deep learning models for inference makes the model deployment process extremely easy to set up. Setting […]

Model serving made easier with Deep Java Library and AWS Lambda

Developing and deploying a deep learning model involves many steps: gathering and cleansing data, designing the model, fine-tuning model parameters, evaluating the results, and going through it again until a desirable result is achieved. Then comes the final step: deploying the model. AWS Lambda is one of the most cost-effective services that let you run code without […]