AWS Partner Network (APN) Blog

Bring context from the physical world to AI with Wherobots on AWS

By: Damian Wylie, Head of Product – Wherobots
By: Ben Pruden, Head of Go-to-Market – Wherobots
By: Damion Harrylal, Senior Solutions Architect – AWS
By: Ragib Ahsan, Worldwide AI Acceleration Architect II – AWS

Generating context for AI systems to interpret the physical world on Amazon Web Services (AWS) is a data problem. An insurer needs to reprice wildfire policies using satellite burn boundaries joined with building footprints and weather patterns. A logistics team needs to reroute deliveries using real-time road and storm data. The spatial data for these use cases exists in Amazon Simple Storage Service (Amazon S3) data lakes and publicly available sources. However, the complex spatial data processing operations required to make it AI-ready (coordinate transformations, raster imagery joins, and specialized spatial functions such as geostatistics) go beyond what general-purpose analytical systems were designed to handle.

In this post, we show you how to use Wherobots on AWS to turn raw spatial data into AI-ready context. You will see how customers like Leaf Agriculture, SatSure, and Aarden apply Wherobots’ capabilities in production, and find out how to get started.

AWS customers can use Wherobots to bridge the gap from raw data to context that AI can use, at any scale. You can process county-to-global scale spatial data stored in Amazon S3 into context that today’s AI systems can reason about, using a distributed query engine, inference pipelines, and integrations with agents. For the insurer in the wildfire scenario, this means going from raw satellite imagery to a risk-scored portfolio overlay in hours rather than weeks, without building custom infrastructure.

Rise in demand for physical world intelligence

Risk assessment, route optimization, and land monitoring depend on spatial data that AI systems can consume. For teams building on AWS, the more physical world questions you answer with spatial data, the more you need spatial infrastructure that can keep pace. According to Gartner, earth intelligence opportunities will exceed $20 billion by 2030. Spatial data processing at this scale requires infrastructure designed for spatial workloads: single-machine systems like PostGIS don’t scale, and general-purpose analytical systems do not support the coordinate reference systems, raster imagery joins, or spatial functions these workloads demand.

Two principles should shape your approach to this challenge. First, spatial data should be AI-ready, reduced into actionable spatial context through inference and preprocessing workflows. Second, spatial data should be treated as a core data type in your infrastructure. When infrastructure is built around this data from the start, solutions become faster to build and cheaper to run, and teams can innovate on shorter cycles.

How Wherobots grounds AI in the physical world

Building AI that understands the physical world generally requires three capabilities:

  • Scalable inference on imagery or sensor data
  • A fast and efficient engine for spatial data processing
  • Interfaces for agentic use

Wherobots provides all three capabilities on AWS:

Assess risk faster with scalable inference using RasterFlow

Going back to the wildfire example, assessing risk at scale requires running inference on satellite and drone imagery to detect burned areas, vegetation loss, and structural damage. With RasterFlow, you can go from raw earth observation imagery and sensor data to AI-ready spatial features and earth intelligence. This solution lets you run distributed PyTorch model inference across your satellite and drone imagery on demand, with automatic mosaicking, prediction, and vectorization of results loaded into Apache Iceberg tables stored in Amazon S3.
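To make the "prediction, then vectorization" step concrete, here is a minimal single-machine sketch in plain Python. It is not the RasterFlow API and does not reflect how RasterFlow is implemented: the function name `vectorize_mask`, the toy score grid, and the 0.5 threshold are all assumptions for illustration. It groups above-threshold pixels from a model's score raster into connected regions and emits one record per region, which is the kind of row a pipeline could then load into a table. Real vectorization would trace polygon outlines in a geographic coordinate system; this sketch only records pixel-space bounding boxes.

```python
from typing import Dict, List


def vectorize_mask(scores: List[List[float]], threshold: float = 0.5) -> List[Dict[str, int]]:
    """Group above-threshold pixels into 4-connected regions, one record per region."""
    rows, cols = len(scores), len(scores[0])
    seen = [[False] * cols for _ in range(rows)]
    features: List[Dict[str, int]] = []

    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or scores[r][c] < threshold:
                continue
            # Flood fill outward from this unvisited above-threshold pixel.
            stack = [(r, c)]
            seen[r][c] = True
            min_r = max_r = r
            min_c = max_c = c
            pixels = 0
            while stack:
                cr, cc = stack.pop()
                pixels += 1
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    in_bounds = 0 <= nr < rows and 0 <= nc < cols
                    if in_bounds and not seen[nr][nc] and scores[nr][nc] >= threshold:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            features.append({
                "min_row": min_r, "min_col": min_c,
                "max_row": max_r, "max_col": max_c,
                "pixels": pixels,
            })
    return features


# Hypothetical per-pixel "burned" scores from a model (illustrative values only).
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.2, 0.1, 0.6],
    [0.0, 0.1, 0.2, 0.9],
]
features = vectorize_mask(scores)
for feature in features:
    print(feature)
```

With the toy grid above, the sketch finds two regions: a three-pixel region in the top-left corner and a two-pixel region on the right edge.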

Join and enrich spatial data at scale with WherobotsDB

After you have vectorized spatial features, you need to join, filter, and enrich them with other datasets to produce actionable context. WherobotsDB handles this at scale with native support for raster, vector, and tabular data operations on a distributed, Spark-based architecture that reads directly from your S3 buckets without copying data (zero-copy).
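As a conceptual illustration of what a spatial join computes, the following plain-Python sketch matches building locations against a burn boundary with a point-in-polygon test. It is not WherobotsDB code: the data, identifiers, and the `point_in_polygon` helper are hypothetical, and it assumes both datasets already share one planar coordinate system. A distributed engine performs the same kind of predicate across partitioned data with spatial indexing, which this toy version omits.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical inputs: one burn boundary and three building centroids.
burn_boundary: List[Point] = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
buildings: Dict[str, Point] = {
    "bldg-1": (1.0, 1.0),
    "bldg-2": (5.0, 1.0),
    "bldg-3": (3.5, 2.5),
}

# The "join": keep every building whose centroid falls inside the burn boundary.
exposed = sorted(
    building_id
    for building_id, centroid in buildings.items()
    if point_in_polygon(centroid, burn_boundary)
)
print(exposed)  # ['bldg-1', 'bldg-3']
```

Comparing every point with every polygon edge like this is fine for a handful of features, but the cost grows with the product of the two dataset sizes, which is why large joins are run on engines built for them.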

Wherobots was designed from the ground up for the coordinate transformations, raster imagery joins, and spatial functions these workloads require. As shown in Figure 1, WherobotsDB is the only query engine capable of finishing the SpatialBench queries at a scale factor of 1,000 (SF1000) in less than 10 hours. See the Apache Sedona SpatialBench documentation for full methodology, instance types, and test conditions.

Figure 1 – Apache Sedona SpatialBench query capability matrix

In benchmarks run on comparable AWS instance configurations, spatial operations ran significantly faster in WherobotsDB and at a fraction of the cost compared to the next-closest-performing Spark engine, as shown in Figure 2.

 Figure 2 – Apache Sedona SpatialBench aggregated costs at SF1000

Wherobots also led the effort to add native GEO type support to Apache Iceberg, enabling spatial capabilities across an open lakehouse architecture. This means your spatial data can live in the same scalable infrastructure as the rest of your data, reducing vendor lock-in and storage costs.

Query spatial data with AI agents and coding assistants

The Wherobots Model Context Protocol (MCP) server lets large language models (LLMs) and AI agents explore the data lake and lakehouse tables connected to the Wherobots Data Catalog and other integrated catalogs, generate and run spatial SQL queries, and produce insights from natural language. The Wherobots Spatial AI Coding Assistant brings this spatial reasoning into your AI agent workflows, so you don’t need to write spatial code by hand. It plugs into integrated development environments (IDEs) such as Kiro, Visual Studio Code, and Cursor, and pairs your LLM with the Wherobots MCP server to understand the spatial context of datasets in Amazon S3 and turn natural language into code that runs on WherobotsDB.
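For readers unfamiliar with MCP, the sketch below shows the general shape of the JSON-RPC 2.0 message an agent sends to invoke a tool on an MCP server. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification. Everything else is an assumption for illustration: the tool name `run_spatial_sql`, the table names, and the query are hypothetical and do not describe the tools the Wherobots MCP server actually exposes. `ST_Intersects` is a standard spatial SQL predicate available in Apache Sedona.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical spatial SQL an LLM might generate from a natural-language question
# such as "which buildings fall inside a burned area?". Table names are made up.
query = (
    "SELECT b.building_id "
    "FROM buildings AS b JOIN burn_areas AS f "
    "ON ST_Intersects(b.geometry, f.geometry)"
)

# `run_spatial_sql` is a placeholder tool name, not a documented Wherobots tool.
request = build_tool_call(1, "run_spatial_sql", {"query": query})
print(json.dumps(request, indent=2))
```

In practice the IDE or agent framework builds and sends these messages for you. The sketch only shows what crosses the wire between the LLM client and the server.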

How Wherobots runs on AWS

Wherobots integrates natively with the AWS services your teams already use, so you can keep your existing security model, governance, and billing in place.

Storage. Reads from and writes to Amazon S3 using private connectivity and AWS Identity and Access Management (IAM) role-based access control. Wherobots writes results as Apache Iceberg tables directly to S3.
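As a rough illustration of IAM role-based access to a data lake bucket, the sketch below builds a least-privilege S3 policy document in Python. The bucket name is hypothetical, and the action list is an assumption: the exact permissions and trust relationship that Wherobots requires are defined in the Wherobots documentation, not here.

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-spatial-data-lake"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Listing applies to the bucket itself.
            "Sid": "ListBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
        {
            # Reads and writes apply to the objects inside the bucket.
            "Sid": "ReadWriteObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Separating the bucket-level statement from the object-level statement keeps each action paired with the resource type it applies to, which makes the policy easier to audit and to narrow down to a prefix later.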

Deployment. Deploy Wherobots within your own Amazon Virtual Private Cloud (Amazon VPC) for full network isolation (in-VPC deployment), or run it serverless with no infrastructure to manage.

Data governance. Wherobots integrates with AWS Glue (including the Data Catalog) or Databricks Unity Catalog on AWS for unified governance across spatial and non-spatial assets.

Orchestration. Integrates with Apache Airflow, including Amazon Managed Workflows for Apache Airflow (Amazon MWAA).

AI and imagery. RasterFlow runs entirely on AWS: it performs inference on imagery in Amazon S3 and writes vectorized results back as Iceberg tables.

Wherobots is available through AWS Marketplace with consolidated billing. AWS Marketplace simplifies discovery and deployment and gives you a single place to manage software procurement alongside your AWS spend.

Architecture overview

As shown in Figure 3, Wherobots fits into your existing AWS data architecture as a specialized compute engine within Amazon VPC, using IAM-based access controls.

Figure 3 – Wherobots on AWS architecture

Figure 4 shows an example data pipeline used by AWS and Wherobots customers to create AI context for the physical world, commonly including Amazon Relational Database Service (Amazon RDS) for PostgreSQL in the gold layer for low-latency application serving.

Figure 4 – Example data pipeline frequently used by AWS and Wherobots customers for creating context about the physical world on AWS

Customers building physical world AI with Wherobots and AWS

Leaf Agriculture provides a unified API and query layer across more than 60 machine and tractor telemetry types, covering millions of acres of farmland globally. Partnering with AWS and Wherobots, Leaf deployed a modern spatial data stack that is fast and cost-efficient.

“Previously, our data volumes and processing requirements were increasing faster than we could keep up with, burdening our team with costly rebuilds. Now with Wherobots on AWS, not only can we easily scale to millions of acres, we also can rest assured that our costs won’t spiral out of control.”

– G. Bailey Stockdale, CEO, Leaf Agriculture

SatSure uses satellite data and AI across agriculture, banking, financial services, and infrastructure. Using RasterFlow, SatSure runs distributed inference on satellite imagery at scale, vectorizing predictions into Iceberg tables for downstream analysis.

“RasterFlow meaningfully accelerates the work SatSure and Wherobots already do together. By automating mosaicking, preprocessing, and distributed inference into a single, on-demand workflow, it removes much of the engineering overhead required to operationalize our models at national and multi-season scale.”

– Rashmit Singh, CTO, SatSure

Aarden.ai, founded by former Zillow data infrastructure engineers, uses Wherobots to combine parcel-level data with land cashflow intelligence and feed AI models that build prospectuses far faster than traditional approaches.

“Spark is incredibly powerful, but it’s also a huge learning curve. Wherobots shortened the painful part of Spark and gave us production-grade scalability without having to babysit clusters. We were able to accelerate our spatial data processing from over 7 days to less than 30 minutes, a 300x improvement.”

– Ben Hudson, Co-founder and Head of Applied Science, Aarden.ai

Getting started

Get started using a free trial by subscribing to Wherobots through the AWS Marketplace. You can then continue with on-demand usage and consolidated billing. Visit wherobots.com for documentation, tutorials, and use case examples, or contact the Wherobots team to schedule a technical discussion or demo.


Wherobots – AWS Partner Spotlight

Wherobots is an AWS Partner and the AI Context Engine for the Physical World, offering purpose-built spatial data and AI infrastructure running natively on AWS. Founded by the original creators of Apache Sedona (over 70 million downloads), Wherobots is 3 times faster and delivers up to 45% better price-performance than most alternatives. From startups to enterprises, organizations use Wherobots to build AI systems that understand the physical world, processing petabyte-scale spatial data into operational intelligence across financial services, insurance, energy, agriculture, logistics, telecommunications, and more.

Contact Wherobots | Partner Overview | AWS Marketplace