AWS HPC Blog
End-to-end scalable vision intelligence pipeline using LIDAR 3D Point Clouds on AWS
This post was contributed by Frantz Lohier (AWS), Andrew Timpone (AWS), Saurabh Bahl (AWS), Phil Cooper (AWS)
New techniques in 3D world modeling and automated scene interpretation are driving a revolution in industries such as mining and construction. By pairing vision capture systems (drones or 3D scanning equipment) with cloud-enabled AI technologies, it is now possible to analyze complex scenes, plan or track progress on the ground, and ultimately guide or optimize operations. Market signals in the mining and construction sector suggest that AWS customers are seeking vision intelligence solutions to automate the generation of actionable inspection reports. Four underlying classes of algorithms are required to achieve this end goal:
- SLAM (Simultaneous Localization and Mapping): to map a terrain and precisely localize machines (such as excavators in mining)
- Photogrammetry: to reconstruct a scene from still images captured at various poses, or from 3D scans
- Point-cloud or 3D reconstructed scene interpretation techniques, such as DSM-to-DTM curation (converting a digital surface model into a digital terrain model)
- Image or video content interpretation using AI techniques
Part of the challenge is scaling the execution of these compute-intensive algorithms, for example across mining sites and excavators and, more generally, across the vast amount of historical data generated. In this post, we share best practices for mapping the execution of a popular open-source 3D point-cloud library onto a set of serverless AWS services.
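To give a flavour of the DSM-to-DTM class of algorithms mentioned above, here is a deliberately simplified, pure-NumPy ground filter: it keeps the minimum elevation per XY grid cell, on the assumption that vegetation and structures sit above the ground. Production tools (for example, PDAL's SMRF filter) are far more sophisticated; treat this only as a sketch of the idea.

```python
import numpy as np

def dsm_to_dtm_min_grid(points, cell_size=1.0):
    """Toy DSM-to-DTM curation: approximate bare-earth elevation by keeping
    the minimum z value in each XY grid cell (trees and buildings sit above
    the ground, so the cell minimum is a crude ground estimate).

    points: (N, 3) array of x, y, z coordinates.
    Returns a dict mapping (ix, iy) grid cell -> estimated ground elevation.
    """
    cells = {}
    ix = np.floor(points[:, 0] / cell_size).astype(int)
    iy = np.floor(points[:, 1] / cell_size).astype(int)
    for i, j, z in zip(ix, iy, points[:, 2]):
        key = (i, j)
        if key not in cells or z < cells[key]:
            cells[key] = z
    return cells

# Example: flat ground at z=0 with a 5 m "tree" point in the same cell
pts = np.array([
    [0.2, 0.3, 0.0], [0.6, 0.7, 5.0],   # same cell: ground + canopy point
    [1.5, 0.5, 0.1],                     # neighbouring cell, ground only
])
dtm = dsm_to_dtm_min_grid(pts, cell_size=1.0)
print(dtm[(0, 0)])  # 0.0 -- the canopy point is filtered out
```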
LiDAR Data: Field to Cloud Processing Strategies
The evolution of LiDAR technology exemplifies both the promise and the challenge of exponential data growth. The first airborne and mobile LiDAR systems were revolutionary, enabling the creation of the Digital Terrain Models and Digital Surface Models that transformed how we understood our physical world. Data processing was elegantly simple: portable computers with a few hundred megabytes of storage could handle entire survey datasets locally. Today’s reality presents a stark contrast. Mobile LiDAR sensors like the Velodyne Puck generate 600,000 points per second, deliver centimeter-level accuracy, and can create digital twins of unprecedented fidelity. Yet this leap in precision comes with an exponential cost: terabytes of data that demand sophisticated cloud infrastructure to unlock their value. This transformation forces a critical architectural decision: should processing power be pushed to the edge, accepting the constraints of battery life and local computing limitations, or should this massive data volume be pipelined efficiently to the cloud, where scalable processing can extract maximum intelligence?
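A quick back-of-the-envelope calculation shows why this edge-versus-cloud decision matters. The bytes-per-point figure below is an assumption for illustration (uncompressed LAS point records are typically a few tens of bytes each):

```python
# Rough daily data volume for a sensor running continuously at the
# point rate quoted above; bytes-per-point is an illustrative assumption.
POINTS_PER_SEC = 600_000
BYTES_PER_POINT = 28          # illustrative uncompressed LAS record size
SECONDS_PER_DAY = 24 * 3600

daily_gib = POINTS_PER_SEC * BYTES_PER_POINT * SECONDS_PER_DAY / 1024**3
print(round(daily_gib))  # ~1352 GiB per day, before LAZ compression
```

Even with LAZ compression reducing this several-fold, sustained collection at these rates quickly outstrips what edge hardware can store and process locally.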
From monitoring slope stability in mines through accurate LiDAR surveys, to mapping river networks through dense forests, to powerline inspections, LiDAR and 3D scanning are powerful sensing modalities for understanding the world.
Today, the global LiDAR market size is estimated at USD 2.74 billion in 2024 and is projected to reach USD 4.71 billion by 2030, growing at a CAGR of 9.5% (LiDAR Market Size, Share & Trends Analysis Report, 2030).

AWS helps organizations define and implement data transfer strategies for both connected and offline environments. The choice of transfer method depends on data volume and operational needs. For regular survey operations, AWS DataSync provides a secure online service that can fully utilize a 10 Gbps network connection with a single task, making it ideal for routine LiDAR uploads. Organizations with consistent high-volume data flows benefit from AWS Direct Connect, which offers dedicated network connections that bypass internet congestion while providing predictable bandwidth and lower transfer costs. For massive datasets or initial migrations, optimization strategies include minimizing cross-Region traffic to reduce costs, pre-compressing LiDAR point clouds using the LAZ format, and implementing hybrid approaches that combine DataSync with Amazon Simple Storage Service (Amazon S3) File Gateway to reduce on-premises infrastructure while maintaining seamless connectivity.

For most LiDAR surveys, AWS DataSync offers the best combination of speed, automation, and cost-effectiveness. For specific cases, we also use parallel transfers through multiple DataSync tasks or Amazon S3 multipart uploads. The choice between these methods is based on data volume, transfer frequency, existing infrastructure, and cost requirements. As 4G, 5G, and satellite networks continue to grow, smart edge connectivity using AWS IoT Greengrass v2, developed with our partner Hexagon, has helped customers build strategies that reduce cost and improve delivery.
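For the multipart uploads mentioned above, the helper below sketches how a large compressed point-cloud archive would be split into part ranges (the part size and file size are illustrative choices, and S3 limits an upload to 10,000 parts). In practice, boto3's `upload_file` with a `TransferConfig(multipart_chunksize=...)` handles this splitting automatically.

```python
import math

def plan_multipart_upload(file_size_bytes, part_size_mb=64):
    """Split a large LiDAR archive (e.g. a .laz file) into byte ranges for
    an Amazon S3 multipart upload. S3 requires parts of at least 5 MiB
    (except the last) and allows at most 10,000 parts per upload."""
    part_size = part_size_mb * 1024 * 1024
    n_parts = math.ceil(file_size_bytes / part_size)
    assert n_parts <= 10_000, "increase part_size_mb for very large files"
    return [(i * part_size, min((i + 1) * part_size, file_size_bytes) - 1)
            for i in range(n_parts)]

# A hypothetical 1 GiB compressed point-cloud archive in 256 MiB parts:
ranges = plan_multipart_upload(1 * 1024**3, part_size_mb=256)
print(len(ranges))   # 4
print(ranges[0])     # (0, 268435455)
```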
LiDAR technology has revolutionized spatial data collection through multiple platforms:
- Airborne systems
- Mobile mapping units
- Handheld devices
- Drone-mounted sensors
From Mount Everest to remote jungles, these diverse platforms bring measurement capabilities to previously inaccessible environments. As our data collection expands, AWS provides the robust infrastructure needed to process, extract, and manage these massive datasets effectively. Our workflow focuses on three critical stages:
- Efficient ingestion
  - Streamlined data transfer from field to cloud
  - Solutions for the traditional surveyor’s data transfer challenges
  - Optimized upload protocols for large datasets
- Smart processing
  - Scalable point cloud processing, for example using OpenDroneMap (ODM)
  - Automated extraction of Digital Surface Models (DSM)
  - Generation of Digital Terrain Models (DTM)
  - Auto-scaling capabilities for varying data volumes
- Strategic delivery
  - Production of analysis-ready LiDAR data
  - Vector extraction for model development
  - Customized output formats based on client requirements
This workflow enables efficient handling of data volumes ranging from gigabytes to petabytes, ensuring scalable processing and delivery of precise, client-specific products.
AWS sample scalable implementation for processing aerial images
In this section, we focus on optimizing LiDAR point cloud processing by detailing a cloud-enabled image processing pipeline designed to process large volumes of drone-generated LiDAR scans and images.
Figure 1: Drone Image Processing System architecture, as described in the post.
The Drone Image Processing System is a cloud-native, scalable solution built on AWS that transforms raw drone imagery into valuable geospatial products using the open-source OpenDroneMap (ODM) library and photogrammetry algorithms. The system handles the complete workflow from image upload through processing to result delivery, with real-time status updates and robust error handling. The architecture (Figure 1) consists of:
- Amazon CloudFront as the content delivery network
- Amazon Route 53 for DNS
- Amazon API Gateway for API access
- AWS WAF and AWS Shield for automatic detection and mitigation of attacks
- Amazon Cognito for authentication
- Amazon S3 Access Points for image upload and download
- A Virtual Private Cloud (VPC) with both public and private subnets
- An Application Load Balancer to manage web access to the application
- A NAT Gateway for private-subnet access to the internet
- Amazon ECS with AWS Fargate for the containers hosting the APIs and web application, and for the OpenDroneMap cluster and node containers
- Amazon RDS for PostgreSQL to store and manage application data and state
- An Amazon ECR repository for container images
- AWS Step Functions to orchestrate the image processing workflow and its steps
- AWS Lambda for task management and updates
- Amazon S3 for storage of the unprocessed and processed images
- AWS X-Ray and Amazon CloudWatch for observability

In case of privacy concerns, uploaded images and 3D LiDAR scans can be preprocessed to blur relevant image regions prior to storage.
Figure 2: The user workflow for end-to-end drone image processing, as described in the post.
The workflow (Figure 2) begins when a drone operator accesses the web application through their browser. The Application Load Balancer routes the request to one of the web application instances running on ECS Fargate. The application presents an intuitive upload interface. When users upload their drone images, the application immediately streams the files to Amazon S3 storage, organizing them in a structured folder hierarchy under a unique job identifier. Simultaneously, the application extracts metadata from the images (GPS coordinates, camera settings, timestamps) and creates a comprehensive job record in the Amazon Relational Database Service (Amazon RDS) for PostgreSQL database with status “UPLOADED”. Upon user confirmation to begin processing, the application updates the job status to “PROCESSING” and triggers the AWS Step Functions workflow. This serverless orchestration service manages the complex processing pipeline, starting by invoking Lambda functions that launch the appropriate ECS containers.
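The metadata extraction step above reads, among other things, the GPS position the drone wrote into each image's EXIF block. A minimal sketch of the coordinate conversion involved, assuming Pillow-style EXIF access (the example coordinates are illustrative):

```python
def dms_to_decimal(dms, ref):
    """Convert an EXIF GPS tuple of (degrees, minutes, seconds) rationals
    into signed decimal degrees. With Pillow, these values come from
    Image.open(path).getexif().get_ifd(0x8825), whose GPSLatitude and
    GPSLatitudeRef entries feed straight into this helper."""
    deg = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
    return -deg if ref in ("S", "W") else deg

# 47 deg 36' 34.8" South becomes a negative decimal latitude:
print(round(dms_to_decimal((47, 36, 34.8), "S"), 4))  # -47.6097
```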
The system launches two types of containers: ClusterODM (the orchestrator) and multiple NodeODM containers (the workers). This distributed architecture allows for parallel processing of large image datasets, with ClusterODM managing task distribution and NodeODM containers performing the actual photogrammetry computations. ClusterODM reads the input images from Amazon S3 and intelligently distributes processing tasks across available NodeODM containers, implementing tenant-aware resource allocation. Each NodeODM container processes its assigned images through the complete photogrammetry pipeline, generating intermediate results that are stored back in S3. The processing includes sophisticated algorithms for feature detection, image matching, bundle adjustment, dense point cloud generation, mesh creation, and texture mapping. The system transforms Digital Surface Models (DSM) into Digital Terrain Models (DTM) and generates high-quality orthophotos. Throughout processing, the system maintains comprehensive monitoring through CloudWatch, with both ClusterODM and NodeODM containers reporting progress and resource utilization. The application polls job status and provides real-time updates to users. Upon completion, Step Functions coordinates the cleanup of processing resources while ensuring all results are properly stored in Amazon S3. The system generates download URLs and notifies users of completion, allowing them to access their processed geospatial products including DSM, DTM, orthophotos, and processing reports.
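As a sketch of how a job could be handed to such an ODM endpoint, the open-source PyODM client can submit a batch of images to a ClusterODM or NodeODM instance. The host name, port, and option values below are illustrative assumptions, not the system's actual configuration:

```python
# Output options for the photogrammetry run; keys follow ODM's documented
# task options, values here are illustrative.
ODM_OPTIONS = {
    "dsm": True,                  # produce a Digital Surface Model
    "dtm": True,                  # produce a Digital Terrain Model
    "orthophoto-resolution": 4,   # target orthophoto resolution (cm/pixel)
}

def submit_job(image_paths, host="clusterodm.internal", port=3000):
    """Send images to a (hypothetical) ClusterODM endpoint, which fans the
    work out to NodeODM workers, then wait for and download the results."""
    from pyodm import Node  # pip install pyodm; imported lazily here

    node = Node(host, port)
    task = node.create_task(image_paths, ODM_OPTIONS)
    task.wait_for_completion()
    task.download_assets("./results")
    return task
```

ClusterODM exposes the same REST interface as a single NodeODM node, which is why the same client code works against either.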
Figure 3: The image processing pipeline as described in the post.
The image processing pipeline (Figure 3) represents the core photogrammetry workflow that transforms raw drone images into valuable geospatial products. Raw drone images undergo comprehensive validation including format verification, resolution checks, GPS metadata validation, and overlap analysis. The system extracts critical metadata including camera parameters, flight patterns, and geospatial coordinates that guide the processing algorithms. The core processing occurs through a sophisticated 10-step pipeline:
- Feature Detection: Identifies distinctive points in each image using advanced computer vision algorithms
- Image Matching: Finds corresponding features between overlapping images to establish spatial relationships
- Bundle Adjustment: Optimizes camera positions and orientations for maximum accuracy
- Dense Point Cloud Generation: Creates detailed 3D point representations of the surveyed area
- Mesh Generation: Converts point clouds into triangulated surface models
- Texture Mapping: Applies photographic textures to 3D meshes for realistic visualization
- DSM Generation: Creates Digital Surface Models including all surface features
- DTM Generation: Produces Digital Terrain Models representing bare earth elevation
- Orthophoto Generation: Generates geometrically corrected aerial photographs
- Report Generation: Creates comprehensive processing reports with quality metrics
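To make the first two steps above concrete, here is a toy, pure-NumPy illustration of image matching using Lowe's ratio test to reject ambiguous correspondences. Real pipelines (ODM delegates structure-from-motion to OpenSfM) use robust feature detectors and approximate nearest-neighbour search; treat this purely as a sketch of the idea:

```python
import numpy as np

def match_features(desc_a, desc_b, ratio=0.75):
    """Toy image matching: for each descriptor in image A, find its nearest
    neighbour in image B and keep the match only if the nearest distance is
    clearly smaller than the second nearest (Lowe's ratio test)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[k]:
            matches.append((i, j))
    return matches

# Synthetic 2-D descriptors: rows 0 and 1 correspond across images, while
# the third descriptor in A is ambiguous (two near-identical candidates in B)
a = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
b = np.array([[1.0, 0.05], [0.05, 1.0], [0.52, 0.5], [0.5, 0.52]])
print(match_features(a, b))  # [(0, 0), (1, 1)] -- the ambiguous one is dropped
```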
The pipeline produces multiple output formats optimized for different use cases: GeoTIFF files for GIS applications, PLY point clouds for 3D analysis, OBJ meshes for visualization, and PDF reports for documentation. Processing typically takes 15-45 minutes depending on image count and complexity. All outputs are systematically organized in S3 with appropriate metadata, lifecycle policies for cost optimization, and secure download URLs for user access. The system maintains processing history and enables reprocessing with different parameters.
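The systematic S3 organization and lifecycle policies mentioned above can be sketched as follows; the key layout, bucket name, rule ID, and transition timings are illustrative assumptions, not the system's actual configuration:

```python
def output_key(job_id, product, filename):
    """Hypothetical per-job S3 layout for processed products."""
    return f"jobs/{job_id}/outputs/{product}/{filename}"

# Illustrative lifecycle rule: move older processed artifacts to cheaper
# storage classes for cost optimization.
LIFECYCLE_RULE = {
    "ID": "archive-processed-lidar",
    "Status": "Enabled",
    "Filter": {"Prefix": "jobs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 180, "StorageClass": "GLACIER"},
    ],
}
# Applied with boto3:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="drone-products",
#       LifecycleConfiguration={"Rules": [LIFECYCLE_RULE]})
# Time-limited download links come from generate_presigned_url(..., ExpiresIn=3600).

print(output_key("a1b2c3", "dtm", "terrain.tif"))  # jobs/a1b2c3/outputs/dtm/terrain.tif
```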
Figure 4, 5, 6: Sample orthophoto, digital surface model, and digital terrain model, described in the post.
The first image is an orthophoto: a corrected aerial or satellite image with the distortion from camera tilt and terrain removed. The second is a digital surface model: a 3D digital representation of the Earth’s surface that includes all natural and man-made features such as buildings, trees, and terrain. The third is a digital terrain model: a 3D representation of the landscape’s bare-earth surface, showing elevation, slope, and shape with features like buildings and trees removed. DTMs are often generated from LiDAR or photogrammetry, and are used in engineering, mapping, and environmental studies for tasks like water-flow modeling and construction planning.
From this data, customers can use these models to better understand the terrain and its obstacles, and to plan the use of resources on site more effectively.
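One example of such terrain planning is deriving a slope map from the DTM, for instance to check which cells are too steep for haul roads. A minimal NumPy sketch (production workflows would use rasterio or GDAL on real elevation rasters):

```python
import numpy as np

def slope_degrees(dtm, cell_size=1.0):
    """Per-cell slope in degrees from a DTM elevation grid, computed from
    the magnitude of the elevation gradient."""
    dz_dy, dz_dx = np.gradient(dtm, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# A ramp rising 1 m per 1 m cell has a 45-degree slope everywhere:
ramp = np.tile(np.arange(4, dtype=float), (4, 1))
print(round(slope_degrees(ramp)[0, 1], 1))  # 45.0
```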
Conclusion
The digital transformation landscape is being revolutionized by smart digital reality technologies that generate detailed point cloud data through affordable cameras, LiDAR sensors, and autonomous robots. This surge in data collection has created an urgent need for automated processing solutions, where cloud computing plays a crucial role in converting raw data into valuable insights.

AWS scalable HPC solutions address these challenges by providing on-demand compute resources that can dynamically scale to process massive point cloud datasets without the constraints of traditional infrastructure. By leveraging AWS Step Functions for job orchestration and Amazon Elastic Container Service (Amazon ECS) Fargate along with Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances, organizations can process 3D imagery workloads at up to 90% cost savings compared to on-premises solutions. Additionally, AWS serverless technologies like AWS Lambda and AWS Step Functions enable event-driven processing pipelines that automatically trigger when new drone imagery arrives, eliminating idle resource costs and ensuring processing capacity is available precisely when needed.

This architectural approach not only accelerates time-to-insight but also implements cost governance through automated scaling policies and usage-based pricing models. The growing demand for real-time data processing across multiple locations has intensified the market’s need for sophisticated, well-designed architectural solutions, and AWS’s combination of HPC capabilities and serverless flexibility delivers the performance, geographic reach, and cost efficiency that modern 3D data processing demands.