AWS Partner Network (APN) Blog

Nolan Chen

Nolan Chen is a Partner Solutions Architect at AWS, where he helps startups build innovative solutions using the cloud. Prior to AWS, Nolan specialized in data security and in helping customers deploy high-performing wide area networks. Nolan holds a bachelor’s degree in Mechanical Engineering from Princeton University.

Running GenAI Inference with AWS Graviton and Arcee AI Models

Growing demand for generative AI (GenAI) applications has increased the need for compute resources that can run these workloads efficiently. In this post, we share a step-by-step guide to optimizing GenAI inference workloads on AWS Graviton-based instances. We walk you through downloading Arcee AI small language models (SLMs), applying quantization techniques, and deploying the models for efficient inference on Graviton.