This Guidance shows how to build scalable geospatial data repositories on AWS, simplifying the design of data pipelines and facilitating faster access to raw data. By integrating Earth on AWS datasets from the Registry of Open Data on AWS, it eliminates the need for storing this data in your own data lake, reducing costs and complexity. This Guidance also offers integration with a variety of dissemination mechanisms and supports diverse processing demands, from basic spatial queries to complex analytics. These features allow you to streamline geospatial workflows and enhance data accessibility.
Architecture Diagram

[Architecture diagram description]
Step 1
Invoke a data ingestion pipeline based on new scene detection. Subscribe to Amazon Simple Notification Service (Amazon SNS) topics for managed datasets with appropriate filters, and configure time-based ingestion rules using Amazon CloudWatch.
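As a rough illustration of Step 1, the sketch below builds the request parameters for an Amazon SNS subscription with a filter policy and for a scheduled rule. It assumes the boto3 `sns.subscribe` and `events.put_rule` calls (the `events` API serves CloudWatch Events scheduled rules). The `cloud_cover` field, the ARNs, and the rule name are hypothetical placeholders; the actual message fields depend on the dataset you subscribe to.

```python
import json


def build_sns_subscription(topic_arn: str, lambda_arn: str, max_cloud_cover: float) -> dict:
    """Build keyword arguments for boto3's sns.subscribe() call."""
    # Hypothetical filter: deliver only notifications whose JSON body has a
    # numeric "cloud_cover" field at or below the threshold. The real field
    # names depend on the dataset's notification format.
    filter_policy = {"cloud_cover": [{"numeric": ["<=", max_cloud_cover]}]}
    return {
        "TopicArn": topic_arn,
        "Protocol": "lambda",
        "Endpoint": lambda_arn,
        "Attributes": {
            "FilterPolicy": json.dumps(filter_policy),
            # Match against the message body rather than message attributes.
            "FilterPolicyScope": "MessageBody",
        },
    }


def build_schedule_rule(name: str, schedule_expression: str) -> dict:
    """Build keyword arguments for boto3's events.put_rule() call."""
    return {
        "Name": name,
        "ScheduleExpression": schedule_expression,  # e.g. "rate(1 day)"
        "State": "ENABLED",
    }


# With credentials configured, the parameters could be passed on like this:
#   boto3.client("sns").subscribe(**build_sns_subscription(...))
#   boto3.client("events").put_rule(**build_schedule_rule(...))
```

Keeping the parameter construction separate from the API call makes the filter logic easy to review and test without AWS credentials.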
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
CloudWatch provides comprehensive monitoring and observability for your applications. It captures and analyzes events, logs, and metrics to give you real-time insights into your system's health and performance. By using CloudWatch, you can proactively detect issues, troubleshoot problems more efficiently, and respond to incidents faster. This continuous monitoring helps ensure better application reliability while allowing you to maintain optimal performance across your AWS infrastructure.
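One way to detect issues proactively is an alarm on a function's error count. The following is a minimal sketch, assuming the boto3 `cloudwatch.put_metric_alarm` call and the standard `AWS/Lambda` `Errors` metric. The function name, alarm name, threshold, and period are illustrative assumptions, not values prescribed by this Guidance.

```python
def build_error_alarm(function_name: str, threshold: int = 1) -> dict:
    """Build keyword arguments for boto3's cloudwatch.put_metric_alarm() call."""
    return {
        "AlarmName": f"{function_name}-errors",  # hypothetical naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,  # evaluate in 5-minute windows (assumed)
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Periods with no invocations should not trigger the alarm.
        "TreatMissingData": "notBreaching",
    }


# With credentials configured:
#   boto3.client("cloudwatch").put_metric_alarm(**build_error_alarm("ingest-scene"))
```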
Security
AWS provides a comprehensive suite of security services and features to protect your data and resources. AWS Identity and Access Management (IAM) enables fine-grained access control, allowing you to set permission policies that restrict who can access and manage AWS resources. Data is protected in several ways: Amazon Simple Storage Service (Amazon S3) uses server-side encryption and bucket policies to protect data at rest, while AWS Key Management Service (AWS KMS) offers customer-managed keys for encrypting data in Amazon S3, Amazon Relational Database Service (Amazon RDS), and Amazon DynamoDB.
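To make the idea of a fine-grained permission policy concrete, here is a small sketch that generates an IAM policy document granting read-only access to a single prefix of one S3 bucket. The bucket and prefix names are hypothetical; the policy grammar (`Version`, `Statement`, `Effect`, `Action`, `Resource`, `Condition`) is standard IAM JSON.

```python
import json


def read_only_prefix_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy document allowing read-only access to one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, restricted to the prefix.
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object reads are granted only under the prefix.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for illustration only.
    print(json.dumps(read_only_prefix_policy("example-geo-lake", "processed"), indent=2))
```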
This Guidance enhances network security by attaching security groups to container task network interfaces, which protects virtual private cloud (VPC) resources. It also configures network access control lists for subnet-level access restrictions and uses VPC endpoints to keep traffic within the AWS network, safeguarding data in transit. Furthermore, the use of managed services like Amazon Elastic Container Service (Amazon ECS), AWS Lambda, and Amazon SageMaker reduces your security maintenance burden under the shared responsibility model.
Reliability
The services selected for this Guidance offer high availability, durability, and scalability for your applications. Lambda enhances reliability by running functions across multiple Availability Zones (AZs) so that event processing continues even if one AZ fails. Aurora PostgreSQL provides robust high-availability options, replicating data six ways across three AZs for improved fault tolerance, even with a single database instance.
For controlled scaling and resilience, AWS Step Functions allows you to manage processing rates, preventing overload on downstream services and avoiding rate limits. It also orchestrates stateless components, which are inherently more scalable, robust, and manageable. Data durability is supported by Amazon S3, which stores S3 Standard objects redundantly across multiple AZs within a Region and can copy them to another Region when you configure cross-Region replication. Both DynamoDB and Aurora provide backup capabilities that support point-in-time recovery.
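The rate control described above can be sketched with an Amazon States Language definition that uses a `Map` state with `MaxConcurrency` and a `Retry` policy with exponential backoff. This is a minimal sketch, not the state machine shipped with this Guidance: the state names, the `$.scenes` input path, the Lambda ARN, and the retry values are all assumptions.

```python
import json


def build_state_machine(process_scene_lambda_arn: str, max_concurrency: int) -> dict:
    """Return an Amazon States Language definition that throttles fan-out."""
    return {
        "Comment": "Process scenes with bounded concurrency (illustrative)",
        "StartAt": "ProcessScenes",
        "States": {
            "ProcessScenes": {
                "Type": "Map",
                "ItemsPath": "$.scenes",  # assumed input shape: {"scenes": [...]}
                # Caps parallel iterations so downstream services are not overloaded.
                "MaxConcurrency": max_concurrency,
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "ProcessScene",
                    "States": {
                        "ProcessScene": {
                            "Type": "Task",
                            "Resource": process_scene_lambda_arn,
                            # Retry failed tasks after 2 s, 4 s, then 8 s.
                            "Retry": [
                                {
                                    "ErrorEquals": ["States.TaskFailed"],
                                    "IntervalSeconds": 2,
                                    "MaxAttempts": 3,
                                    "BackoffRate": 2.0,
                                }
                            ],
                            "End": True,
                        }
                    },
                },
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical function ARN, for illustration only.
    arn = "arn:aws:lambda:us-west-2:111122223333:function:process-scene"
    print(json.dumps(build_state_machine(arn, max_concurrency=5), indent=2))
```

Lowering `max_concurrency` trades throughput for gentler load on downstream databases and APIs.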
Performance Efficiency
This Guidance uses AWS managed services that automatically adjust to workload demands. For example, Lambda scales automatically to handle querying and data processing based on incoming event volume. Step Functions manages workflow orchestration, dynamically adjusting to increased loads by parallelizing or queueing tasks.
For data storage and access, this Guidance utilizes three key services:
- Amazon S3 accommodates high throughput and numerous requests without provisioning, optimizing for various access patterns.
- DynamoDB offers flexible query capabilities with automatic scaling.
- Aurora, when run in its serverless configuration, automatically adjusts compute and memory capacity to match workload demands.
Together, these services provide a scalable infrastructure that handles varying workloads efficiently. Your application can maintain performance and responsiveness as demand fluctuates, without manual intervention or complex capacity planning.
Cost Optimization
This Guidance optimizes costs through several strategies:
- Storage: Uses datasets directly from the Registry of Open Data on AWS, and recommends downloading raw files to your own data lake only when they are needed for multiple processing iterations.
- Compute: Uses serverless services for automatic scaling, employs Spot Instances for batch processing with On-Demand failover, and suggests Reserved Instances for Amazon RDS for PostgreSQL.
- Data transfer: Uses VPC endpoints, contains processing within a single VPC, and eliminates the need for additional transfer services between components.
These approaches minimize expenses across storage, compute, and networking while maintaining performance for geospatial processing workloads.
Sustainability
Through the use of managed, serverless services, this Guidance minimizes the environmental impact of backend resources by scaling them up and down to meet demand. Additionally, you can monitor CloudWatch metrics to make sure that the scaled environment is not overprovisioned, further reducing your environmental impact.
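A check for overprovisioning can be as simple as averaging utilization datapoints over a period. The sketch below assumes datapoints in the shape returned by the boto3 `cloudwatch.get_metric_statistics` call when the `Average` statistic is requested (for example, `CPUUtilization` in the `AWS/ECS` namespace). The 20 percent threshold is an arbitrary assumption that you would tune for your workload.

```python
from statistics import mean


def is_overprovisioned(datapoints: list[dict], threshold_percent: float = 20.0) -> bool:
    """Return True when average utilization stays below the threshold.

    `datapoints` follows the "Datapoints" list of a get_metric_statistics
    response, where each entry carries an "Average" value.
    """
    averages = [dp["Average"] for dp in datapoints if "Average" in dp]
    if not averages:
        # No data: do not claim overprovisioning without evidence.
        return False
    return mean(averages) < threshold_percent


if __name__ == "__main__":
    # Hypothetical datapoints, for illustration only.
    sample = [{"Average": 5.0}, {"Average": 10.0}]
    print(is_overprovisioned(sample))
```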
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.