AWS Compute Blog
Category: Serverless
Deploying AI models for inference with AWS Lambda using zip packaging
Users usually package their function code as container images when using machine learning (ML) models that are larger than 250 MB, which is the Lambda deployment package size limit (unzipped) for zip files. In this post, we demonstrate an approach that downloads ML models directly from Amazon S3 into your function’s memory so that you can continue packaging your function code using zip files.
How to export to Amazon S3 Tables by using AWS Step Functions Distributed Map
In this post, we show how to use Step Functions Distributed Map to process Amazon S3 objects and export results to Amazon S3 Tables, creating a scalable and maintainable data processing pipeline.
Accessing private Amazon API Gateway endpoints through custom Amazon CloudFront distribution using VPC Origins
This post demonstrates how you can connect CloudFront with a private REST API in Amazon API Gateway using a VPC origin.
Building resilient multi-Region Serverless applications on AWS
This post presents architectural best practices for building resilient serverless applications, demonstrated through a multi-Region serverless authorizer implementation.
Serverless generative AI architectural patterns – Part 2
This post explores two complementary approaches for non-real-time scenarios: buffered asynchronous processing for time-intensive individual requests, and batch processing for scheduled or event-driven workflows.
Serverless generative AI architectural patterns – Part 1
This two-part series explores the architectural patterns, best practices, code implementations, and design considerations essential for successfully integrating generative AI solutions into both new and existing applications. In this post, we focus on patterns for architecting real-time generative AI applications.
Effectively building AI agents on AWS Serverless
Imagine an AI assistant that doesn’t just respond to prompts – it reasons through goals, acts, and integrates with real-time systems. This is the promise of agentic AI. According to Gartner, by 2028 over 33% of enterprise applications will embed agentic capabilities – up from less than 1% today. While early generative AI efforts focused […]
Implementing message prioritization with quorum queues on Amazon MQ for RabbitMQ
Quorum queues are now available on Amazon MQ for RabbitMQ from version 3.13. Quorum queues are a replicated First-In, First-Out (FIFO) queue type that uses the Raft consensus algorithm to maintain data consistency. Quorum queues on RabbitMQ version 3.13 lack one key feature compared to classic queues: message prioritization. However, RabbitMQ version 4.0 introduced support […]
Building resilient multi-tenant systems with Amazon SQS fair queues
Today, AWS introduced Amazon Simple Queue Service (Amazon SQS) fair queues, a new feature that mitigates noisy neighbor impact in multi-tenant systems. With fair queues, your applications become more resilient and easier to operate, reducing operational overhead while improving quality of service for your customers. In distributed architectures, message queues have become the backbone of […]
Infrastructure as code translation for serverless using AI code assistants
Serverless applications commonly use infrastructure as code (IaC) frameworks to define and manage their cloud resources. Teams choose different IaC tools based on their skills, existing tooling, or compliance needs. As applications grow, the need to shift between IaC formats may arise to adopt new features or align with evolving standards. Developers are rapidly adopting AI-powered […]