Containers
Tag: Machine Learning
Powering the Next Generation of AI Workloads on Amazon EKS with Anyscale
Ray is an open-source framework that manages, executes, and optimizes compute needs for AI workloads. It is designed to make it easy to write parallel and distributed Python applications by providing a simple and intuitive API for distributed computing. Ray unifies infrastructure by leveraging any compute instance and accelerator on AWS via a single, flexible […]
Host the Whisper Model with Streaming Mode on Amazon EKS and Ray Serve
OpenAI Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It has demonstrated strong ASR performance across various languages, including the ability to transcribe speech in multiple languages and translate it into English. The Whisper model is open-sourced under the Apache 2.0 license, making it accessible for developers to build useful […]
Quora achieved 3x lower latency and 25% lower costs by modernizing model serving with Nvidia Triton on Amazon EKS
Quora is a leading Q&A platform with a mission to share and grow the world’s knowledge, serving hundreds of millions of users worldwide every month. Quora uses machine learning (ML) to generate a custom feed of questions, answers, and content recommendations based on each user’s activity, interests, and preferences. ML drives targeted advertising on […]
Distributed machine learning with Amazon ECS
Running distributed machine learning (ML) workloads on Amazon Elastic Container Service (Amazon ECS) allows ML teams to focus on creating, training, and deploying models, rather than spending time managing the container orchestration engine. With a simple architecture, transparent control plane upgrades, and native AWS Identity and Access Management (IAM) authentication, Amazon ECS provides a great environment […]
Train Llama2 with AWS Trainium on Amazon EKS
Generative AI is not only transforming the way businesses function but also accelerating the pace of innovation within the broader AI field. This transformative force is redefining how businesses use technology, equipping them with capabilities to create human-like text, images, code, and audio, which were once considered beyond reach. Generative AI offers a range […]
Run Monte Carlo simulations at scale with AWS Step Functions and AWS Fargate
Organizations across financial services and other industries have business processes that require executing the same business logic across billions of records for their machine learning and compliance needs. Many organizations rely on internal custom orchestration systems or big data frameworks to coordinate the parallel processing of their business logic across many parallel compute nodes. […]
Build a multi-tenant chatbot with RAG using Amazon Bedrock and Amazon EKS
With the availability of generative AI models, many customers are exploring ways to build chatbot applications that can serve a wide range of their end customers, with each chatbot instance specializing in a specific tenant’s contextual information. They also want to run such multi-tenant applications at scale on cost-efficient infrastructure familiar to their development teams. […]
Run Spark-RAPIDS ML workloads with GPUs on Amazon EMR on EKS
Apache Spark revolutionized big data processing with its distributed computing capabilities, which enabled efficient data processing at scale. It offers the flexibility to run on traditional Central Processing Units (CPUs) as well as specialized Graphics Processing Units (GPUs), which provide distinct advantages for various workloads. As the demand for faster and more efficient machine […]
How Quora modernized MLOps on Amazon EKS to improve customer experience with scalable ML applications
This blog post was co-written by Lida Li of Quora. Quora is a leading Q&A platform with a mission to share and grow the world’s knowledge, serving hundreds of millions of users worldwide every month. Quora uses machine learning (ML) to generate a custom feed of questions, answers, and content recommendations based on each […]
Actuate uses AWS Fargate for ML-based, real-time video monitoring and threat detection
This post was written in collaboration with Scott Underwood, Jacob Weiss, Tatiana Hanazaki, and Mark Berbera from Actuate AI. The goal at Actuate AI is to leverage technology to make the world a safer place. Our team at Actuate AI aims to do that by using cutting-edge computer vision to reduce the response time of […]