AWS Compute Blog
Category: Amazon Machine Learning
Designing Serverless Integration Patterns for Large Language Models (LLMs)
This post is written by Josh Hart, Principal Solutions Architect and Thomas Moore, Senior Solutions Architect This post explores best practice integration patterns for using large language models (LLMs) in serverless applications. These approaches optimize performance, resource utilization, and resilience when incorporating generative AI capabilities into your serverless architecture. Overview of serverless, LLMs and example […]
Serverless ICYMI Q2 2024
Welcome to the 26th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed! In case you missed our last ICYMI, check out what happened last […]
Serverless ICYMI Q1 2024
Welcome to the 25th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed! In case you missed our last ICYMI, check out what happened last […]
Generative AI Infrastructure at AWS
Building and training generative artificial intelligence (AI) models, as well as predicting and providing accurate and insightful outputs requires a significant amount of infrastructure. There’s a lot of data that goes into generating the high-quality synthetic text, images, and other media outputs that large-language models (LLMs), as well as foundational models (FMs), create. To start, […]
The attendee’s guide to the AWS re:Invent 2023 Compute track
This post by Art Baudo – Principal Product Marketing Manager – AWS EC2, and Pranaya Anshu – Product Marketing Manager – AWS EC2 We are just a few weeks away from AWS re:Invent 2023, AWS’s biggest cloud computing event of the year. This event will be a great opportunity for you to meet other cloud […]
Building a serverless document chat with AWS Lambda and Amazon Bedrock
This post is written by Pascal Vogel, Solutions Architect, and Martin Sakowski, Senior Solutions Architect. Large language models (LLMs) are proving to be highly effective at solving general-purpose tasks such as text generation, analysis and summarization, translation, and much more. Because they are trained on large datasets, they can use a broad generalist knowledge base. […]
Optimizing GPU utilization for AI/ML workloads on Amazon EC2
This blog post is written by Ben Minahan, DevOps Consultant, and Amir Sotoodeh, Machine Learning Engineer. Machine learning workloads can be costly, and artificial intelligence/machine learning (AI/ML) teams can have a difficult time tracking and maintaining efficient resource utilization. ML workloads often utilize GPUs extensively, so typical application performance metrics such as CPU, memory, and […]
Amazon EC2 DL1 instances Deep Dive
This post is written by Amr Ragab, Principal Solutions Architect, Amazon EC2. AWS is excited to announce that the new Amazon Elastic Compute Cloud (Amazon EC2) DL1 instances are now generally available in US-East (N. Virginia) and US-West (Oregon). DL1 provides up to 40% better price performance for training deep learning models as compared to […]
Deploying machine learning models with serverless templates
This post written by Sean Wilkinson, Machine Learning Specialist Solutions Architect, and Newton Jain, Senior Product Manager for Lambda After designing and training machine learning models, data scientists deploy the models so applications can use them. AWS Lambda is a compute service that lets you run code without provisioning or managing servers. Lambda’s pay-per-request billing, automatic […]
Building deep learning inference with AWS Lambda and Amazon EFS
This post shows how you can use EFS for Lambda to deploy large DL libraries and models into a function for synchronous invocations.