Securing PartyRock: How we protect Amazon Bedrock endpoints using AWS WAF
PartyRock is an intuitive, hands-on generative AI app-building playground based on Amazon Bedrock. It allows users to experiment with generative AI technologies and build fun applications without coding, such as quiz generators or resume optimizers. Although providing a free generative AI playground online offers immense value to builders, it also presents significant security challenges.
In this post, we explore how we used AWS WAF to protect PartyRock from potential threats such as Distributed Denial of Service (DDoS) and Denial of Wallet (DoW). Developers who are integrating generative AI capabilities into their online applications can learn concrete AWS WAF-based techniques to protect their applications from similar threats.
The security challenges
Both threats we are addressing in this post can impact the availability and user experience of our free generative AI playground:
- DDoS attacks: A DDoS attack overwhelms your resources with a flood of traffic, potentially taking your service offline. PartyRock could be targeted by actors attempting to compromise its availability, damage its reputation, or demand ransom.
- Service abuse: Malicious actors might attempt to exploit the platform for unintended purposes, such as reselling generic generative AI compute power. In our case, they would exploit the cloud’s ability to scale resources automatically (cloud elasticity) by generating massive numbers of app plays on PartyRock. Given that an average app play on PartyRock consumes 2,000 input tokens and 400 output tokens using Anthropic’s Claude 3 Sonnet model on Amazon Bedrock, every one thousand app plays costs us approximately $12. Unchecked abuse of app plays could lead to DoW situations, in which malicious actors turn the cloud’s flexibility against PartyRock and run up unexpected and potentially significant cloud bills.
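The cost figure above can be reproduced with a short calculation. This is a minimal sketch: the per-token prices are assumptions (on-demand Claude 3 Sonnet prices as we recall them, not stated in this post), and `cost_per_plays` is a hypothetical helper name. Check the current Amazon Bedrock pricing page before relying on the numbers.

```python
# Assumed on-demand prices in USD per 1,000 tokens (not taken from this post).
INPUT_PRICE_PER_1K = 0.003
OUTPUT_PRICE_PER_1K = 0.015


def cost_per_plays(plays: int, input_tokens: int = 2000, output_tokens: int = 400) -> float:
    """Estimate the model cost in USD for a number of app plays."""
    input_cost = plays * input_tokens / 1000 * INPUT_PRICE_PER_1K
    output_cost = plays * output_tokens / 1000 * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


if __name__ == "__main__":
    # Token counts per play come from the post; prices are the assumptions above.
    print(f"1,000 plays: ${cost_per_plays(1_000):.2f}")
    print(f"1,000,000 plays: ${cost_per_plays(1_000_000):,.2f}")
```

Under these assumptions, input and output tokens each contribute about half of the cost, and the total scales linearly with the number of plays, which is why unbounded automated plays translate directly into an unbounded bill.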
To address these challenges, we implemented robust security controls from the beginning, with AWS WAF playing a central role in our defense strategy.
PartyRock’s serverless architecture
Before diving into our security measures, we review PartyRock’s serverless architecture, which we deployed using AWS Cloud Development Kit (AWS CDK):
- Amazon CloudFront: Serves as the entry point of PartyRock, providing better user experiences through caching and accelerating dynamic API calls. We implement AWS WAF protections, such as Bot Control rules, at the CloudFront edge.
- Amazon S3: Hosts the static frontend of PartyRock.
- AWS Lambda: Powers our React application and APIs by running application logic and integrating with other infrastructure components such as Amazon Cognito and Bedrock. We make the Lambda function available through CloudFront to reduce latency and cost, and we configure the Lambda function URL directly as a CloudFront origin. In our case, the APIs don’t need the advanced capabilities that API gateways usually provide, which lets us keep the architecture simple, with fewer building blocks, and makes it cheaper and faster. In addition, we configured the Lambda function URL with Origin Access Control, a feature of CloudFront, to make sure that only our CloudFront distribution can communicate with it. The Lambda function responsible for interfacing with Bedrock is configured with response streaming to stream Bedrock responses back to users.
- Amazon Cognito: Manages user authentication through social identity providers such as Google, Amazon, and Apple.
- Amazon Aurora Serverless: Serves as our PostgreSQL database, offering rich features and automatic scaling. Amazon Aurora Serverless is an on-demand managed database that reduces the operational heavy lifting, allowing us to focus on features and iterate quickly.
- Amazon CloudWatch RUM: Analyzes client-side performance and errors.
Implementing multi-layered defenses with AWS WAF
Defense in depth is a cybersecurity strategy that employs multiple layers of security controls to protect an infrastructure. Rather than relying on a single security measure, this approach uses a combination of various defensive mechanisms. The idea is that if one layer of security fails or is breached, other layers are still in place to detect, prevent, or mitigate the attack. We used a combination of AWS WAF rules, with an authentication step, to protect PartyRock.
IP reputation
We configured the Amazon IP reputation list, a managed rule group based on Amazon internal threat intelligence. It is useful if you want to block IP addresses typically associated with bots or other threats. It includes three rules:
- AWSManagedIPReputationList: Inspects for IP addresses that have been identified as actively engaging in malicious activities.
- AWSManagedReconnaissanceList: Inspects for connections from IP addresses that are performing reconnaissance against AWS resources.
- AWSManagedIPDDoSList: Inspects for IP addresses that have been identified as actively engaging in DDoS activities.
Real-world impact: In February 2024, the AWSManagedIPReputationList rule helped us identify and block two hundred thousand requests in tens of minutes related to malicious activities, without impacting the experience of our users.
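As an illustration, a web ACL rule that references this managed rule group could look like the following sketch. It builds the rule in the JSON shape used by the AWS WAF (WAFv2) API. The rule name, metric name, and the `ip_reputation_rule` helper are hypothetical and are not taken from the PartyRock configuration.

```python
import json


def ip_reputation_rule(priority: int) -> dict:
    """Build a web ACL rule entry that references the Amazon IP reputation list."""
    return {
        "Name": "amazon-ip-reputation",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": "AWSManagedRulesAmazonIpReputationList",
            }
        },
        # Managed rule groups use OverrideAction instead of Action.
        # "None" keeps the actions defined by the rules inside the group.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "amazon-ip-reputation",  # hypothetical metric name
        },
    }


if __name__ == "__main__":
    print(json.dumps(ip_reputation_rule(priority=0), indent=2))
```

Giving this rule a low priority number means it is evaluated early, so traffic from known-bad IP addresses can be dropped before more expensive rules run.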
Rate limiting
Rate-based rules are designed to prevent web applications from being overwhelmed by numerous requests within a short period of time. This is particularly useful in mitigating DDoS attacks, bot attacks, and malicious traffic. The rule aggregates requests according to defined criteria, and rate limits the aggregate groupings, based on the rule’s evaluation window, request limit, and action settings.
We implemented multiple rate-based rules to prevent our application from being overwhelmed:
- A blanket rate limit to prevent any single source IP from overwhelming the website.
- A cookie-based rate limit to prevent a single user session from overwhelming the website, regardless of the IP addresses it uses.
Real-world impact: In June 2024, both rules helped us block millions of malicious requests trying to overwhelm PartyRock infrastructure.
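The two rate-based rules described above could be sketched as follows, again in the JSON shape of the AWS WAF API. The limits, evaluation window, cookie name, and rule names are illustrative assumptions, not the values used by PartyRock.

```python
# Evaluation windows (in seconds) accepted by AWS WAF rate-based rules.
ALLOWED_WINDOWS = {60, 120, 300, 600}


def _visibility(metric_name: str) -> dict:
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def ip_rate_limit_rule(priority: int, limit: int = 1000, window_sec: int = 300) -> dict:
    """Blanket limit: block a source IP that exceeds `limit` requests per window."""
    if window_sec not in ALLOWED_WINDOWS:
        raise ValueError(f"window_sec must be one of {sorted(ALLOWED_WINDOWS)}")
    return {
        "Name": "rate-limit-per-ip",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,
                "EvaluationWindowSec": window_sec,
                "AggregateKeyType": "IP",
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": _visibility("rate-limit-per-ip"),
    }


def cookie_rate_limit_rule(priority: int, cookie_name: str = "session",
                           limit: int = 500, window_sec: int = 300) -> dict:
    """Per-session limit: aggregate on a cookie value instead of the source IP."""
    if window_sec not in ALLOWED_WINDOWS:
        raise ValueError(f"window_sec must be one of {sorted(ALLOWED_WINDOWS)}")
    return {
        "Name": "rate-limit-per-session",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,
                "EvaluationWindowSec": window_sec,
                "AggregateKeyType": "CUSTOM_KEYS",
                "CustomKeys": [
                    {
                        "Cookie": {
                            "Name": cookie_name,  # hypothetical session cookie name
                            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                        }
                    }
                ],
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": _visibility("rate-limit-per-session"),
    }
```

Aggregating on a cookie complements the IP-based limit: a session that rotates through many IP addresses still accumulates requests under the same key.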
Custom mitigation rules
We created custom rules that help us respond quickly to DDoS attacks that need a manual response in AWS WAF. They are configured to block requests with certain attributes (for example, the IP address or the JA3 TLS fingerprint) that match a value from a list that we define. Under normal circumstances, they contain no matching statements and therefore block no requests. When we need to respond to a DDoS attack manually, we first analyze the attack using CloudWatch Logs Insights to identify the top talkers (for example, the JA3 fingerprints that contribute most to the attack). Then, we add a corresponding matching statement to the custom rules. This helps us respond faster to threats.
Real-world impact: In January 2024, we used the JA3-based custom rule to block sustained HTTP floods generating millions of requests per hour.
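The manual response workflow described in this section can be sketched in two steps: find the top talkers, then generate the match statements. In this sketch, plain Python stands in for the CloudWatch Logs Insights query that we actually use for the analysis. It assumes that each AWS WAF log record exposes the fingerprint in a `ja3Fingerprint` field, and the function and rule names are hypothetical.

```python
from collections import Counter


def top_ja3(log_records: list, n: int = 3) -> list:
    """Return the n most frequent JA3 fingerprints found in WAF log records."""
    counts = Counter(
        record["ja3Fingerprint"]
        for record in log_records
        if record.get("ja3Fingerprint")  # skip records without a fingerprint
    )
    return [fingerprint for fingerprint, _ in counts.most_common(n)]


def ja3_match(fingerprint: str) -> dict:
    """Match requests whose JA3 fingerprint equals the given value exactly."""
    return {
        "ByteMatchStatement": {
            "SearchString": fingerprint,
            "FieldToMatch": {"JA3Fingerprint": {"FallbackBehavior": "NO_MATCH"}},
            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
            "PositionalConstraint": "EXACTLY",
        }
    }


def ja3_block_rule(fingerprints: list, priority: int) -> dict:
    """Build a rule that blocks any request matching one of the fingerprints."""
    if not fingerprints:
        raise ValueError("at least one fingerprint is required")
    statements = [ja3_match(fp) for fp in fingerprints]
    # An OrStatement needs at least two nested statements.
    statement = (
        statements[0] if len(statements) == 1
        else {"OrStatement": {"Statements": statements}}
    )
    return {
        "Name": "block-ja3-top-talkers",  # hypothetical rule name
        "Priority": priority,
        "Statement": statement,
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "block-ja3-top-talkers",
        },
    }
```

Before blocking on a fingerprint, it is worth checking that legitimate clients don’t share it, because common browsers and HTTP libraries can produce the same JA3 value as an attacker’s tooling.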
Hardening the security of registration step with CAPTCHA
The registration step is critical for the security of PartyRock. If malicious actors create fake accounts, then they could abuse our free generative AI compute power.
As a tradeoff to making PartyRock free, we decided to exclusively allow registration using federated identities from three providers: Amazon, Apple, and Google. In the registration step, users first have to log in through one of these identity providers and go through its verification steps, then authorize PartyRock to access the necessary information. Finally, before the user account is created on PartyRock, an AWS WAF custom rule presents a CAPTCHA as a further verification step.
Using the CAPTCHA JavaScript API, we changed our frontend code to display the CAPTCHA form within the PartyRock web application, making the registration experience smoother. Read this Amazon Networking post for more information about possible CAPTCHA integrations.
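A custom rule that applies the CAPTCHA action to registration requests could be sketched as follows. The URI path, rule name, and immunity time are hypothetical placeholders. The post does not describe how PartyRock scopes this rule, so the path match is only one possible way to target the registration step.

```python
def registration_captcha_rule(priority: int, path: str = "/api/register",
                              immunity_seconds: int = 300) -> dict:
    """Build a rule that requires a solved CAPTCHA for requests to `path`."""
    return {
        "Name": "captcha-on-registration",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "ByteMatchStatement": {
                "SearchString": path,  # hypothetical registration endpoint
                "FieldToMatch": {"UriPath": {}},
                "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                "PositionalConstraint": "STARTS_WITH",
            }
        },
        # Requests without a valid, unexpired CAPTCHA token receive the puzzle.
        "Action": {"Captcha": {}},
        # How long a solved CAPTCHA remains valid before another one is required.
        "CaptchaConfig": {"ImmunityTimeProperty": {"ImmunityTime": immunity_seconds}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "captcha-on-registration",
        },
    }
```

When the CAPTCHA is rendered inside the application through the JavaScript API, the token obtained after solving it is sent with the registration request, so the rule lets that request through without interrupting the user with a separate page.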
Advanced management of automated bot traffic
The last layer of security that we configured in AWS WAF is Bot Control. Our goal is to identify and manage automated bot traffic, regardless of whether the traffic comes from an anonymous or logged-in user. The Bot Control managed rule group provides two levels of protection:
- A basic, common inspection level that adds labels to self-identifying bots, verifies generally desirable bots, and detects high confidence bot signatures.
- A targeted inspection level that adds detection for sophisticated bots that don’t self-identify.
Real-world impact: In the following graphs, you can observe how the common inspection level of Bot Control helped us block different HTTP floods throughout the year.
The targeted inspection level of Bot Control provides advanced detections based on techniques such as browser interrogation, fingerprinting, and behavior heuristics to identify bad bot traffic. These detections need the AWS WAF JavaScript challenge running on the client side.
For this, we integrated the Bot Control JavaScript SDK in our pages. When a page loads, the SDK runs the JavaScript challenge asynchronously to avoid impacting the user experience. When a challenge is successfully solved, the SDK places an encrypted token in the cookies. The token is sent to AWS WAF in every request from the user’s session, allowing AWS WAF to analyze the user behavior. The token also includes detected fingerprints of automated browsers. In addition, the SDK collects telemetry during the user navigation to further enrich the detection and mitigation logic.
Because the challenge runs asynchronously, we need to allow the first requests of a session, even if they don’t carry a valid token. Thanks to the TGT_VolumetricIpTokenAbsent rule in Bot Control, if our application receives an increase in requests from a single IP without the token, then AWS WAF forces the challenge by responding with a JavaScript challenge payload that silently completes the challenge, then redirects to the original page.
Real-world impact: TGT_VolumetricIpTokenAbsent protected PartyRock from hundreds of thousands of requests that were missing the token in June 2024. The second graph shows how Bot Control responded with a CAPTCHA to requests coming from clients that showed signals of a browser automation framework.
As mentioned before, Bot Control analyzes the behavior of a user session using an encrypted token, acquired after the JavaScript challenge is successfully solved. It allowed us to identify sessions with a higher than usual volume of requests and challenge them with CAPTCHA responses.
Real-world impact: The following graphs show requests challenged with CAPTCHA throughout 2024 because of elevated per session traffic levels to PartyRock.
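For reference, enabling the Bot Control managed rule group at a given inspection level could be sketched as follows, in the JSON shape of the AWS WAF API. The rule name, metric name, and helper function are hypothetical. The sketch keeps the default actions of the rules in the group rather than reproducing any overrides PartyRock may use.

```python
INSPECTION_LEVELS = ("COMMON", "TARGETED")


def bot_control_rule(priority: int, inspection_level: str = "TARGETED") -> dict:
    """Build a web ACL rule entry that enables the Bot Control managed rule group."""
    if inspection_level not in INSPECTION_LEVELS:
        raise ValueError(f"inspection_level must be one of {INSPECTION_LEVELS}")
    return {
        "Name": "bot-control",  # hypothetical rule name
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": "AWSManagedRulesBotControlRuleSet",
                "ManagedRuleGroupConfigs": [
                    {
                        "AWSManagedRulesBotControlRuleSet": {
                            # TARGETED adds the token-based detections on top of COMMON.
                            "InspectionLevel": inspection_level,
                        }
                    }
                ],
            }
        },
        # Keep the actions defined by the rules inside the group.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "bot-control",
        },
    }
```

Because Bot Control is priced per inspected request, it is commonly placed after cheaper rules such as IP reputation and rate limits, so that it only evaluates traffic that the earlier layers let through.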
Parting thoughts
To maintain an open, creative environment for generative AI experimentation while protecting against potential threats and abuse, we’ve created robust multi-layered defenses for PartyRock. Using AWS WAF, we implemented the following rules at a high level:
- IP reputation based rules to block requests coming from IPs that are associated with malicious activities, such as DDoS attacks.
- Different flavors of rate limits to prevent any single IP or session from overwhelming PartyRock’s infrastructure.
- Custom rules to block requests based on suspicious attributes that are identified during incident response.
- CAPTCHA rule for a further verification step during the user registration workflow.
- Bot Control managed rule group to identify automated traffic coming from botnets and manage it appropriately.
This AWS WAF configuration isn’t static. Through the continuous monitoring and analysis of our application logs, we regularly refine and expand our ruleset to address emerging threats and patterns that we observe in day-to-day operations.
We also protected our Cognito User Pools using another set of AWS WAF rules. You can learn more about protecting Cognito User Pools with AWS WAF in the documentation.
To get started with AWS WAF, watch this short video. You can find more in-depth information in the AWS WAF documentation.
To learn more about how to use AWS WAF to avoid cost-prohibitive traffic in large language model (LLM) apps, watch this talk from re:Inforce 2024.