Listing Thumbnail

    S4 Firewall - LLM Token Budget and Runaway Loop Control for Amazon VPC

     Info
    Deployed on AWS
    S4 Firewall is an in-VPC forwarding proxy that puts a pre-emptive spend firewall in front of your LLM traffic. Point your application's base_url at it (OpenAI-compatible, Anthropic Messages-compatible, or Bedrock-compatible); every request runs a synchronous attribute -> reserve -> budget/anomaly -> forward -> reconcile pipeline. Its headline job is the runaway-loop circuit breaker: a deterministic hard cap that blocks a request before it is relayed once a tenant, feature, or customer budget would be exceeded, plus best-effort detection of agent loops and near-duplicate call chains to contain a runaway agent before it burns the budget. Token spend is attributed per feature/tenant/customer and emitted to CloudWatch and an optional counts-only audit ledger. Ships as an Amazon Linux 2023 arm64 AMI with a least-privilege IAM role and one-click CloudFormation - no control plane, no database. Billed per instance per hour, with an annual option.

    Overview

    S4 Firewall is a forwarding proxy you run inside your own Amazon VPC to put a budget and a circuit breaker in front of your LLM token spend. Your application changes only its base_url: S4 Firewall exposes an OpenAI-compatible, an Anthropic Messages-compatible, and a Bedrock-compatible intake, relays each request to the upstream provider your application already uses, and returns the upstream response (including streaming responses, passed through chunk by chunk without buffering so time-to-first-token is preserved).

    Headline capability - the runaway-loop circuit breaker. Every request runs through a synchronous in-memory pipeline before it is relayed: attribute the request to a feature, tenant, and customer; reserve the worst-case cost (input tokens counted now, output priced at max_tokens times the output rate); check the reservation against the hierarchy of budgets; then either forward or block. There are two layers, kept honestly distinct. Layer 1, the hard cap, is deterministic and pre-emptive: cumulative spend is known, so any request whose reservation would push the running total past a configured hard cap is blocked before it is relayed - a 100 percent pre-emptive block of over-cap requests, with the same state in producing the same decision out, fixed by chaos tests. Layer 2, the loop block, is best-effort and behavioral: a runaway is only knowable after a few calls (those few are already billed), so this layer detects agent loops, near-duplicate call chains, and in-session amplification and bounds the blast radius - containing the runaway within a small number of requests or a small dollar amount. Layer 2 is explicitly best-effort, not a 100 percent guarantee, and ships with a conservative default plus a dry-run shadow mode so you can measure the false-block rate before you enforce.

    Honest budgeting under output uncertainty. The number of output tokens is unknowable until the response returns, so stopping before the bill is incurred is reserve-then-reconcile, not a flat estimate: the reservation uses the worst case to make the hard-cap decision, and when the response returns the provider's reported usage is taken as the source of truth and reconciled against the reservation. Token counts are normalized across providers and split into input, output, cached-read, and cache-write so the per-feature accounting reflects each provider's real rate card.

    Attribution and metering. Tag requests by header or API key to attribute token spend to a feature, tenant, or customer - finer than IAM-principal granularity. Spend rolls up by dimension and is emitted to Amazon CloudWatch (namespace S4/Firewall) and, optionally, to a counts-only append-only S3 audit ledger that records token counts, quantities, and attribution metadata - never prompt or response bodies.

    Data handling - the honest version. S4 Firewall itself does not persist or transmit your prompts or responses. Its only outbound call is the provider request your application would have made anyway; the firewall does not add an egress. The ledger and metrics carry token counts, not content (counts-not-content), fixed by property tests. Where prompts egress depends on the upstream you choose: when you route to Amazon Bedrock through a VPC interface endpoint (PrivateLink, which this AMI can provision), the Bedrock calls stay inside your VPC/AWS boundary; when you route to a third-party provider on the public internet, that traffic egresses to the internet and does not stay in your VPC. The value S4 Firewall provides is that the firewall does not hoard your prompts, does not send them to any third party of its own, and writes no bodies to the ledger - not that your prompts never leave the VPC.

    Operations. No separate control plane and no external database: budget state is held in-memory per instance and re-derived from zero on restart. The data plane is a single static binary running under a hardened systemd unit with zero elevated capabilities and a least-privilege IAM role (upstream model invocation, CloudWatch PutMetricData scoped to the S4/Firewall namespace, and write-only PutObject to the ledger bucket). No telemetry home-call and no license-key check - billing is AMI hourly + annual. Deploy in minutes with the included CloudFormation template, which optionally creates the Bedrock VPC interface endpoint.

    S4 Metrics bundle. S4 Firewall is sold standalone and is also offered as a bundle SKU with S4 Metrics (observability) as a separate Marketplace offer - observe plus enforce in one bill. The bundle is a distinct offer entity; this listing's pricing covers the standalone S4 Firewall product.

    There is no lock-in - it is a normal Amazon Linux 2023 AMI you run in your own VPC, billed per instance per hour through your AWS bill, with an annual contract option.

    Highlights

    • Runaway-loop circuit breaker, pre-emptive: a deterministic hard cap blocks any request before it is relayed once a feature/tenant/customer budget would be exceeded (100 percent pre-emptive block of over-cap requests), plus best-effort detection of agent loops and near-duplicate call chains that bounds the blast radius of a runaway agent before it burns the month's budget. (Layer 2 is best-effort, not a 100 percent guarantee; ships with a dry-run shadow mode.)
    • Per-feature/tenant/customer token attribution and budgets, in your VPC: a forwarding proxy you point your base_url at (OpenAI-compatible, Anthropic Messages-compatible, Bedrock-compatible). Reserve-then-reconcile accounting uses the provider's reported usage as the source of truth and splits input/output/cached/cache-write at each provider's real rate, emitted to Amazon CloudWatch and an optional counts-only audit ledger.
    • No prompts hoarded, no separate control plane: S4 Firewall does not persist or transmit your prompts - the only egress is the provider call your app already makes, and the ledger carries token counts, not content. Optional Amazon Bedrock VPC interface endpoint keeps the Bedrock path inside your AWS boundary. Runs as a single static binary on a standard Amazon Linux 2023 AMI with a least-privilege IAM role and one-click CloudFormation - no external database, no telemetry home-call.

    Details

    Delivery method

    Delivery option
    64-bit (Arm) Amazon Machine Image (AMI)

    Latest version

    Operating system
    AmazonLinux 2023

    Deployed on AWS
    New

    Introducing multi-product solutions

    You can now purchase comprehensive solutions tailored to use cases and industries.

    Multi-product solutions

    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.
    Financing for AWS Marketplace purchases

    Pricing

    S4 Firewall - LLM Token Budget and Runaway Loop Control for Amazon VPC

     Info
    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time. Alternatively, you can pay upfront for a contract, which typically covers your anticipated usage for the contract duration. Any usage beyond contract will incur additional usage-based costs.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator  to estimate your infrastructure costs.

    Usage costs (12)

     Info
    Dimension
    Cost/hour
    c7g.large
    Recommended
    $0.12
    c6g.medium
    $0.12
    c7g.8xlarge
    $0.12
    c6g.8xlarge
    $0.12
    c6g.large
    $0.12
    c7g.4xlarge
    $0.12
    c6g.4xlarge
    $0.12
    c6g.xlarge
    $0.12
    c7g.xlarge
    $0.12
    c7g.2xlarge
    $0.12

    Vendor refund policy

    Email aws-support@abyo.net  within 30 days of charge for refund requests; refunds are evaluated case by case.

    How can we make this page better?

    Tell us how we can improve this page, or report an issue with this product.
    Tell us how we can improve this page, or report an issue with this product.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA) .

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

     Info

    Delivery details

    64-bit (Arm) Amazon Machine Image (AMI)

    Amazon Machine Image (AMI)

    An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

    Version release notes

    Initial release: in-VPC LLM token budget and runaway-loop control.

    Additional details

    Usage instructions

    Deploy via the included CloudFormation (cfn-single.yaml for a single instance, or cfn-ha.yaml for a redundant fleet behind an internal load balancer); point your application base_url at the firewall. See the runbook and docs on the AMI.

    Support

    Vendor support

    Email support at aws-support@abyo.net .

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Similar products

    Customer reviews

    Ratings and reviews

     Info
    0 ratings
    5 star
    4 star
    3 star
    2 star
    1 star
    0%
    0%
    0%
    0%
    0%
    0 reviews
    No customer reviews yet
    Be the first to review this product . We've partnered with PeerSpot to gather customer feedback. You can share your experience by writing or recording a review, or scheduling a call with a PeerSpot analyst.