AWS Lambda expands response streaming support to all commercial AWS Regions
AWS Lambda response streaming is now available in all commercial AWS Regions, bringing full regional parity for this capability. Customers in newly supported Regions can use the InvokeWithResponseStream API to progressively stream response payloads back to clients as data becomes available.
Response streaming enables functions to send partial responses to clients incrementally rather than buffering the entire response before transmission. This reduces time-to-first-byte (TTFB) latency and is well suited for latency-sensitive workloads such as LLM-based applications, as well as web and mobile applications where users benefit from seeing responses appear incrementally. Response streaming supports payloads up to a default soft limit of 20 MB.
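For Node.js managed runtimes, streaming handlers are written with the `awslambda.streamifyResponse` wrapper that the Lambda runtime provides as a global. The sketch below shows the pattern; the specific chunks written are illustrative assumptions, not part of the announcement.

```javascript
// Minimal sketch of a streaming handler for the Lambda Node.js managed
// runtime. `awslambda.streamifyResponse` is a global injected by the
// runtime, so this only runs inside Lambda, not locally.
export const handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    // Each write can be flushed to the client as soon as it happens,
    // reducing time-to-first-byte instead of buffering the full response.
    for (const chunk of ["Hello, ", "streaming ", "world!"]) {
      responseStream.write(chunk);
    }
    // Signal that the response is complete.
    responseStream.end();
  }
);
```

Because the wrapper is runtime-provided, no extra dependency is needed in the deployment package.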
With this expansion, customers in all commercial Regions can stream responses using the InvokeWithResponseStream API through a supported AWS SDK, or through Amazon API Gateway REST APIs with response streaming enabled. Response streaming supports Node.js managed runtimes as well as custom runtimes.
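On the client side, a streamed invocation can be consumed through the AWS SDK for JavaScript v3 with `InvokeWithResponseStreamCommand`, whose response exposes an async-iterable `EventStream`. In this sketch the function name and Region are placeholder assumptions; running it requires valid AWS credentials.

```typescript
// Sketch of consuming a streamed Lambda response with the AWS SDK for
// JavaScript v3. Requires AWS credentials; function name and Region
// below are placeholders.
import {
  LambdaClient,
  InvokeWithResponseStreamCommand,
} from "@aws-sdk/client-lambda";

const client = new LambdaClient({ region: "us-east-1" });

async function streamInvoke(): Promise<void> {
  const response = await client.send(
    new InvokeWithResponseStreamCommand({
      FunctionName: "my-streaming-function", // placeholder name
    })
  );

  // PayloadChunk events arrive as the function writes them, before the
  // invocation finishes; InvokeComplete marks the end of the stream.
  for await (const event of response.EventStream ?? []) {
    if (event.PayloadChunk?.Payload) {
      process.stdout.write(Buffer.from(event.PayloadChunk.Payload));
    }
    if (event.InvokeComplete) {
      break;
    }
  }
}
```

Iterating the event stream this way lets the client render partial output (for example, LLM tokens) as it arrives.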
Streaming responses incur an additional cost for network transfer of the response payload. You are billed for the number of bytes generated and streamed out of your Lambda function beyond the first 6 MB, which is included at no additional charge. To get started with Lambda response streaming, visit the AWS Lambda documentation.