Containers

Tag: Machine Learning

Distributed machine learning with Amazon ECS

Running distributed machine learning (ML) workloads on Amazon Elastic Container Service (Amazon ECS) allows ML teams to focus on creating, training, and deploying models, rather than spending time managing the container orchestration engine. With a simple architecture, transparent control plane upgrades, and native AWS Identity and Access Management (IAM) authentication, Amazon ECS provides a great environment […]

Train Llama2 with AWS Trainium on Amazon EKS

Introduction Generative AI is not only transforming the way businesses function but also accelerating the pace of innovation within the broader AI field. This transformative force is redefining how businesses use technology, equipping them with capabilities to create human-like text, images, code, and audio, which were once considered beyond reach. Generative AI offers a range […]

Run Monte Carlo simulations at scale with AWS Step Functions and AWS Fargate

Introduction Organizations across financial services and other industries have business processes that require executing the same business logic across billions of records for their machine learning and compliance needs. Many organizations rely on internal custom orchestration systems or big data frameworks to coordinate the parallel processing of their business logic across many parallel compute nodes. […]

Build a multi-tenant chatbot with RAG using Amazon Bedrock and Amazon EKS

Introduction With the availability of Generative AI models, many customers are exploring ways to build chatbot applications that can cater to a wide range of their end-customers, with each chatbot instance specializing in a specific tenant’s contextual information, and to run such multi-tenant applications at scale on cost-efficient infrastructure familiar to their development teams. […]

Run Spark-RAPIDS ML workloads with GPUs on Amazon EMR on EKS

Introduction Apache Spark revolutionized big data processing with its distributed computing capabilities, which enabled efficient data processing at scale. It offers the flexibility to run on traditional Central Processing Units (CPUs) as well as specialized Graphics Processing Units (GPUs), which provides distinct advantages for various workloads. As the demand for faster and more efficient machine […]

How Quora modernized MLOps on Amazon EKS to improve customer experience with scalable ML applications

This blog post was co-written by Lida Li of Quora. Introduction Quora is a leading Q&A platform with a mission to share and grow the world’s knowledge, serving hundreds of millions of users worldwide every month. Quora uses machine learning (ML) to generate a custom feed of questions, answers, and content recommendations based on each […]

Actuate uses AWS Fargate for ML-based, real-time video monitoring and threat detection

This post was written in collaboration with Scott Underwood, Jacob Weiss, Tatiana Hanazaki, and Mark Berbera from Actuate AI. The goal at Actuate AI is to leverage technology to make the world a safer place. Our team at Actuate AI aims to do that by using cutting-edge computer vision to reduce the response time of […]

Autonomous ML-based detection and identification of root cause for incidents in microservices running on EKS

This blog was co-written with Gavin Cohen, VP of Product at Zebrium. Overview If you’ve never experienced the frustration of hunting for root cause through huge volumes of logs, then you’re one of the few lucky ones! The process typically starts by searching for errors around the time of the problem and then scanning for […]

Advertising click-prediction modeling on Amazon EKS

In digital advertising, the ad click-through rate (CTR) model predicts the probability of a click given the ad and context x (for example, shopping query, time of day, device). The output of a CTR model can be seen as a conditional probability p(y = click|x). A precise estimation of this probability influences our ability […]