Overview
S4 Scan learns from your Athena query history which columns are read and which predicates filter your tables, then rewrites the underlying S3 Parquet into the physical layout those queries want: hot columns first, sorted on the columns you filter by so row-group statistics prune, low-cardinality columns dictionary-encoded, files compacted to an efficient size, and compressed with zstd. Athena and Redshift Spectrum bill per terabyte scanned from S3, so a smaller, better-pruned layout is a direct, recurring bill reduction.
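To illustrate why sorting on a filtered column lets row-group statistics prune, here is a minimal sketch in plain Python. It is not S4 Scan's implementation: the column `event_days`, the group size, and the helper names are all hypothetical, and real Parquet readers apply the same min/max check per row group internally.

```python
import random

def row_groups(values, group_size):
    """Split values into consecutive row groups and record (min, max) per group."""
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    return [(min(g), max(g)) for g in groups]

def groups_scanned(stats, lo, hi):
    """Count row groups whose [min, max] range overlaps the predicate lo <= x <= hi."""
    return sum(1 for gmin, gmax in stats if gmax >= lo and gmin <= hi)

random.seed(0)
# Hypothetical filter column: day-of-year for 100,000 rows, in arrival order.
event_days = [random.randint(1, 365) for _ in range(100_000)]

unsorted_stats = row_groups(event_days, 10_000)
sorted_stats = row_groups(sorted(event_days), 10_000)

# Predicate: WHERE event_day BETWEEN 100 AND 107
unsorted_hits = groups_scanned(unsorted_stats, 100, 107)
sorted_hits = groups_scanned(sorted_stats, 100, 107)
print(f"row groups scanned: unsorted={unsorted_hits}, sorted={sorted_hits}")
```

In the unsorted layout every row group spans nearly the full value range, so none can be skipped; after sorting, only the groups whose range covers the predicate need to be read.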
Safety is the core of the product. S4 Scan never overwrites your source data in place: it writes optimized data to a shadow location, verifies that every query returns byte-for-byte identical results (values, nulls, decimal scale, timestamp time zone), and only then swaps the AWS Glue table pointer. One-command rollback restores the original layout at any time; if verification fails the swap is refused. There is no lock-in - output is standard, Athena-readable Parquet using standard codecs.
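The shadow-write, verify, swap, and rollback flow described above can be sketched with in-memory stand-ins. Everything here is an assumption for illustration: the `catalog` dict stands in for the Glue table pointer, `storage` stands in for S3, and the function names are hypothetical rather than S4 Scan's actual interface.

```python
# Hypothetical in-memory stand-ins; none of these names come from S4 Scan.
ORIGINAL = "s3://bucket/sales/original/"
SHADOW = "s3://bucket/sales/shadow/"

catalog = {"sales": ORIGINAL}  # table name -> current data location
storage = {
    ORIGINAL: [(1, None, "9.50"), (2, "b", "3.00")],
    SHADOW: [(2, "b", "3.00"), (1, None, "9.50")],  # same rows, new physical order
}
history = {}  # table name -> previous location, kept for rollback

def run_query(location, predicate):
    """Return matching rows in a canonical order so physical row order is irrelevant."""
    return sorted((r for r in storage[location] if predicate(r)), key=repr)

def swap_if_verified(table, shadow, predicates):
    """Point the table at the shadow copy only if every query result matches."""
    source = catalog[table]
    for predicate in predicates:
        if run_query(source, predicate) != run_query(shadow, predicate):
            return False  # verification failed: refuse the swap
    history[table] = source
    catalog[table] = shadow  # only the pointer changes; source data is untouched
    return True

def rollback(table):
    """Restore the pointer to the location recorded before the last swap."""
    catalog[table] = history.pop(table)

queries = [lambda r: True, lambda r: r[0] > 1]
swapped = swap_if_verified("sales", SHADOW, queries)
```

Values are compared exactly, so a shadow copy that stored `"9.5"` instead of `"9.50"` (a decimal-scale change) or replaced a null would fail verification and leave the catalog pointing at the original location.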
The dry-run command projects the dollar savings on your real tables before you commit, attributing the reduction honestly across pruning, compression, and compaction. S4 Scan ships as a self-contained AMI that re-optimizes on an EventBridge schedule with no human in the loop, billed per instance per hour through your AWS bill.
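As a rough illustration of how a projected saving might be attributed across stages, the sketch below applies each stage in sequence and credits it with the bytes it removes. This is not the dry-run's actual method: the per-TB price is an assumed figure (check your region's Athena pricing), the "kept" fractions are hypothetical inputs, and sequential attribution depends on the order in which stages are applied.

```python
PRICE_PER_TB = 5.00  # assumed USD per TB scanned; verify against your region's pricing

def project_savings(tb_scanned_per_month, pruning_kept, compression_kept, compaction_kept):
    """Attribute monthly savings to each stage.

    Each *_kept argument is the fraction of bytes still scanned after that
    stage (hypothetical inputs, e.g. 0.4 means the stage removes 60%).
    Returns (savings per stage in USD, remaining monthly scan cost in USD).
    """
    stages = [
        ("pruning", pruning_kept),
        ("compression", compression_kept),
        ("compaction", compaction_kept),
    ]
    remaining_tb = tb_scanned_per_month
    breakdown = {}
    for name, kept in stages:
        after_tb = remaining_tb * kept
        breakdown[name] = (remaining_tb - after_tb) * PRICE_PER_TB
        remaining_tb = after_tb
    return breakdown, remaining_tb * PRICE_PER_TB

# Hypothetical workload: 200 TB scanned per month before optimization.
breakdown, remaining_cost = project_savings(200, 0.4, 0.7, 0.95)
for stage, dollars in breakdown.items():
    print(f"{stage}: ${dollars:.2f} saved per month")
print(f"remaining scan cost: ${remaining_cost:.2f} per month")
```

The per-stage savings plus the remaining cost always add up to the original monthly scan cost, which makes the attribution easy to sanity-check.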
Highlights
- Safe by construction: optimized data is written to a shadow location, verified query-result-identical (values, nulls, decimal scale, timestamp tz), then swapped via the Glue catalog - with one-command rollback. Source data is never modified in place.
- See the savings before you commit: a dry-run projects the monthly dollar reduction on your real tables and attributes it honestly to partition / row-group pruning, dictionary + zstd, and small-file compaction.
- No lock-in, no babysitting: output is standard Athena-readable Parquet, and the AMI re-optimizes on an EventBridge schedule so savings compound automatically.
Pricing
| Dimension | Description | Cost/hour |
|---|---|---|
| c6i.xlarge (Recommended) | Hourly software fee for c6i.xlarge | $0.75 |
| t3.large | Hourly software fee for t3.large | $0.75 |
| t3.medium | Hourly software fee for t3.medium | $0.75 |
| m5.large | Hourly software fee for m5.large | $0.75 |
| m5.xlarge | Hourly software fee for m5.xlarge | $0.75 |
| c5.xlarge | Hourly software fee for c5.xlarge | $0.75 |
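For a rough sense of scale, the hourly software fee can be converted to a monthly figure and compared with scan savings. This is a back-of-the-envelope sketch: the 730-hour month and the per-TB scan price are assumptions, and the figure excludes EC2 instance charges and any per-GB usage metering.

```python
HOURLY_FEE = 0.75       # USD, from the pricing table above
HOURS_PER_MONTH = 730   # assumed average month for an always-on instance
PRICE_PER_TB = 5.00     # assumed USD per TB scanned; verify for your region

monthly_fee = HOURLY_FEE * HOURS_PER_MONTH
break_even_tb = monthly_fee / PRICE_PER_TB  # TB of scan reduction that offsets the fee

print(f"monthly software fee: ${monthly_fee:.2f}")
print(f"scan reduction needed to break even: {break_even_tb:.1f} TB per month")
```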
Vendor refund policy
Email support@abyo.net within 30 days of charge for refund requests; refunds are evaluated case by case.
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
Version release notes
v1.1: real partition optimization (flat to partitioned, with a safe Glue swap) and optional per-GB usage metering.
Additional details
Usage instructions
Deploy with the CloudFormation template deploy/cloudformation/s4scan-quickstart.yaml. EnableMetering defaults to true, which enables per-GB usage billing in addition to the hourly or annual fee.
Support
Vendor support
Email support: support@abyo.net.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.