AWS for Industries
Weather-based photovoltaic monitoring: how BDPV scales on AWS
Solar power adoption is growing at an unprecedented pace worldwide, with 346 GW added in 2023, a 74% increase over the previous year. This surge brings a critical challenge for home solar producers: how can they monitor their installations effectively to ensure optimal performance and return on investment? ASSO BDPV (Photovoltaic Database, from the French Base de Données PhotoVoltaïque), a French non-profit organization, tackled this challenge by building an AWS-powered platform that monitors over 3,000 solar installations in near real-time. Their solution combines advanced weather data and sophisticated algorithms to monitor and optimize solar energy production for small-scale producers.
In this post, we explore how BDPV used AWS services to build a scalable, reliable, and cost-effective monitoring architecture. As a volunteer-run organization, they face unique technical challenges. We examine how they strategically combined multiple AWS services and the real impact of their solution on France’s growing solar community.
Monitoring of solar installations for individual producers
ASSO BDPV was created in 2016 to support residential photovoltaic producers who often lack the expertise to properly monitor their solar installations. Building upon a web platform initially created by volunteers in 2008, the organization lets producers track their solar production data in near real-time and benchmark their performance against nearby installations. This comparative approach provides producers with valuable insights into their installation’s health and efficiency.
However, solar energy production is inherently variable, influenced by multiple factors including weather conditions and physical obstacles that create shadows. To enhance monitoring precision and automate anomaly detection, BDPV recently conducted research that led to the development of a sophisticated algorithm. For each installation, the algorithm calculates theoretical production at 15-minute intervals using satellite irradiance measurements, weather reanalysis, and digital terrain modeling. Comparing this theoretical output with actual measured production allows the system to automatically identify anomalies and track solar panel degradation over time.
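The comparison between theoretical and measured production can be sketched in a few lines of Python. This is a minimal illustration and not BDPV's algorithm: the function name `compare_day`, the daily aggregation, and the 0.8 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DayComparison:
    measured_kwh: float
    simulated_kwh: float
    ratio: Optional[float]  # None when no production was expected
    is_anomaly: bool


def compare_day(
    measured_kwh: Sequence[float],
    simulated_kwh: Sequence[float],
    min_ratio: float = 0.8,  # hypothetical threshold, not taken from the post
) -> DayComparison:
    """Compare one day of 15-minute measured values with the simulated values."""
    if len(measured_kwh) != len(simulated_kwh):
        raise ValueError("both series must cover the same 15-minute slots")
    measured_total = sum(measured_kwh)
    simulated_total = sum(simulated_kwh)
    if simulated_total <= 0:
        # Nothing was expected, so no ratio can be computed for this day.
        return DayComparison(measured_total, simulated_total, None, False)
    ratio = measured_total / simulated_total
    return DayComparison(measured_total, simulated_total, ratio, ratio < min_ratio)
```

A day where the installation produces half of what the simulation expects is flagged, while a day matching the simulation is not. A production system would likely need more context (data gaps, snow cover, inverter clipping) before raising an alert.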
The algorithm forms the cornerstone of BDPV’s comprehensive performance reports, which are generated every two weeks for more than 3,000 installations. These reports provide individual producers with detailed insights into their installation’s performance and automatically notify them of any detected anomalies. This proactive monitoring approach strengthens producer confidence in solar technology while enabling swift responses to potential issues, thereby minimizing production losses.
Figure 1. Performance report for a solar power installation
A closer look at BDPV’s performance reports
The performance reports provide each producer with comprehensive insights into their installation’s health through several key visualizations and metrics:
1. Solar path and shading analysis
Figure 2. Shading analysis showing solar paths across seasons, with color intensity indicating radiation levels
The report includes a shading analysis that overlays the sun’s trajectory throughout the year with potential obstructions. Using color gradients from yellow to dark orange, it shows the theoretical maximum radiation intensity, helping solar panel owners understand how shadows from nearby buildings or terrain features impact their production at different times of the day and year.
2. Production performance tracking
Figure 3. Daily solar production over the past 30 days: actual measurements (blue bars) compared to simulated output (red line), with maximum production over the period (orange dashed line)
The production monitoring report offers a clear visual comparison between actual and simulated output. Blue bars represent the measured production, while a red line indicates the expected performance based on the simulation. This simulation accounts for actual weather conditions using satellite data, which explains why the red line follows the same weather-driven variations as the measured production. The report displays this data at both weekly and daily intervals, providing both long-term trends and immediate performance insights.
3. Long-term performance evolution
Figure 4. Performance ratio over time: weekly ratio of actual production to simulated expected output (blue dots), with trending line (black)
The report tracks installation efficiency over time through a performance ratio analysis. The visualization plots one point per week, each the ratio of measured production to simulated expected production, together with a black trend line fitted by robust regression. Confidence intervals shown in green indicate the expected performance range, while red zones highlight periods of significant underperformance that may require attention. The system calculates annual degradation rates from the regression slope, helping producers understand their panels’ aging process and compare it to industry standards.
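To make the link between regression slope and degradation rate concrete, here is a minimal sketch. The post does not say which robust regression BDPV uses, so the Theil-Sen estimator (the median of all pairwise slopes) is an assumption chosen for illustration, and both function names are hypothetical.

```python
from itertools import combinations
from statistics import median
from typing import Sequence


def theil_sen_slope(times_years: Sequence[float], ratios: Sequence[float]) -> float:
    """Median of pairwise slopes: a simple outlier-resistant trend estimate."""
    points = list(zip(times_years, ratios))
    slopes = [
        (r2 - r1) / (t2 - t1)
        for (t1, r1), (t2, r2) in combinations(points, 2)
        if t2 != t1
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct times")
    return median(slopes)


def annual_degradation_points(times_years: Sequence[float], ratios: Sequence[float]) -> float:
    """Loss in performance ratio per year, in percentage points (positive = degrading)."""
    return -100.0 * theil_sen_slope(times_years, ratios)
```

With a performance ratio that falls by 0.01 per year, the function returns roughly 1 percentage point per year, even if one of the points is a strong outlier. That resistance to isolated bad weeks is the reason a robust fit suits this kind of data better than ordinary least squares.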
These automated analyses help producers maintain optimal performance of their installations. A summary section in the bottom right of the report displays key metrics including long-term degradation rates and system alerts, making it easy for non-technical users to quickly assess their system’s health.
The challenge: scaling the performance report generation
Although BDPV had successfully managed its community of thousands of solar producers for years through simple benchmarking, the introduction of their sophisticated performance analysis system brought unprecedented scaling challenges.
“Moving from basic comparative analysis to detailed performance reports was like switching from a bicycle to a rocket ship in terms of complexity,” recalls David Trebosc, BDPV’s president. “Our initial system worked well for comparing installations, but generating these new, highly detailed performance reports at scale required a complete rethinking of our approach.”
The scale of the operation was daunting: more than 3,000 solar installations transmitting daily production data, each needing comprehensive performance reports every two weeks. Behind these reports are layers of complex data processing. The system needed to ingest and analyze vast amounts of weather data, extracting time series for each installation from multiple satellite images. For each installation, the algorithm needed to process this weather data alongside digital terrain models to calculate theoretical production at 15-minute intervals. Then, the system had to compare this theoretical output against actual measured production data to detect anomalies and calculate performance metrics. Finally, this analyzed data needed to be transformed into detailed visual reports with multiple charts and metrics, including solar path analysis, production comparisons, and long-term degradation trends. All of this had to be generated and distributed to thousands of users.
Operating as a non-profit association added another layer of complexity to the challenge. With limited financial resources and relying on volunteer support rather than full-time technical staff, BDPV needed a solution that would be both scalable and cost-effective. “We had to be creative,” says David Trebosc. “We couldn’t throw money at the problem – we needed smart solutions that would maximize our limited resources.”
The processing workflow needed careful coordination of multiple interconnected steps. The system had to manage numerous data processing tasks, from satellite data ingestion to final report generation. Some of these tasks ran in sequence and others in parallel, and the whole pipeline required robust monitoring and error handling.
“What we needed was an architecture that could handle complex scientific calculations at scale, while remaining manageable for a volunteer-run organization. Finding that balance was our biggest challenge,” explains Yves-Marie Saint-Drenan, the researcher behind BDPV’s new algorithm.
A composite architecture, combining several AWS services
To address this challenge, BDPV sought a carefully architected solution using cloud services. The goal was to select services based on their specific strengths and cost-effectiveness, creating a system that could deliver high performance and reliability within the organization’s constraints.
AWS offers a diverse array of services that cater to various processing requirements. The architecture developed by BDPV uses these services to optimize performance and cost-effectiveness throughout its pipeline:
- Amazon S3 serves as the primary storage solution, hosting both raw satellite imagery and processed data. Its cost efficiency and ability to parallelize read and write operations made it perfect for the large dataset of historical weather data.
- AWS Step Functions orchestrates the entire workflow, managing complex processing sequences and enabling parallel execution where possible. This service has proven invaluable for coordinating the various components while maintaining visibility into the process.
- AWS Lambda functions handle short-running, parallelizable tasks such as extracting specific pixels from satellite images. Their serverless nature means you only pay for actual computation time, making them cost-effective.
- Amazon Elastic Container Service (Amazon ECS) handles compute-intensive tasks that need specialized libraries or longer processing times than Lambda allows. It provides the flexibility to run containerized workloads while maintaining cost efficiency.
- AWS Glue manages the Extract, Transform, Load (ETL) processes, particularly for concatenating daily data and creating time series. Its managed nature reduces operational overhead.
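As a rough illustration of how these services fit together, the sketch below builds an Amazon States Language definition in Python that chains an ECS task, a Map state fanning out over Lambda invocations, and a Glue job. It is not BDPV's state machine: every state name, cluster, task definition, function, and job name is a placeholder, and networking, IAM, retry, and error-handling settings are omitted, so it is not deployable as written.

```python
import json


def build_definition(max_concurrency: int = 50) -> dict:
    """Return an illustrative Step Functions definition (placeholder names only)."""
    return {
        "Comment": "Illustrative sketch only; not BDPV's actual state machine",
        "StartAt": "ProcessSatelliteImages",
        "States": {
            "ProcessSatelliteImages": {
                "Type": "Task",
                "Resource": "arn:aws:states:::ecs:runTask.sync",
                "Parameters": {
                    "Cluster": "example-cluster",
                    "TaskDefinition": "example-satellite-task",
                },
                "ResultPath": None,  # discard the task result, keep the original input
                "Next": "ExtractPixelsPerDay",
            },
            "ExtractPixelsPerDay": {
                "Type": "Map",
                "ItemsPath": "$.days",  # assumes the execution input lists the days
                "MaxConcurrency": max_concurrency,
                "ItemProcessor": {
                    "StartAt": "ExtractPixels",
                    "States": {
                        "ExtractPixels": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {
                                "FunctionName": "example-extract-pixels",
                                "Payload.$": "$",
                            },
                            "End": True,
                        }
                    },
                },
                "Next": "RepartitionByGridPoint",
            },
            "RepartitionByGridPoint": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": "example-repartition-job"},
                "End": True,
            },
        },
    }


def undefined_targets(definition: dict) -> list:
    """List StartAt/Next targets that do not name a top-level state."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    targets += [s["Next"] for s in states.values() if "Next" in s]
    return [t for t in targets if t not in states]


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

The `.sync` suffix on the ECS and Glue integrations tells Step Functions to wait for the job to finish before moving on, which keeps the sequencing logic inside the workflow rather than in custom polling code.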
A closer look at the performance reports generation workflow
Figure 5. Performance reports generation workflow
The generation of performance reports involves several steps, orchestrated through Step Functions workflows. In this section we examine each step of this process:
1. Satellite data processing
The workflow, triggered daily by Amazon EventBridge on a schedule, begins with satellite data retrieval and processing. This step uses Amazon ECS because of the requirement for specialized libraries that exceed Lambda’s size limitations. The raw satellite images are processed and cropped to the relevant geographic areas.
2. Cost optimization with Amazon S3 Glacier
To optimize storage costs, original LSA-SAF (Land Surface Analysis Satellite Application Facility) satellite images are automatically moved to S3 Glacier through lifecycle policies. This approach significantly reduces storage costs while maintaining access to historical data when needed.
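A lifecycle policy of this kind can be expressed as a small configuration document. The sketch below is an assumption-laden example: the prefix `raw/lsa-saf/`, the 30-day delay, the rule ID, and the bucket name are invented for illustration, since the post does not give BDPV's actual settings.

```python
def glacier_lifecycle_configuration(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that archives objects under `prefix`."""
    return {
        "Rules": [
            {
                "ID": "archive-raw-satellite-images",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_configuration(bucket: str, configuration: dict) -> None:
    """Apply the configuration with boto3 (requires AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3 installed

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )


if __name__ == "__main__":
    # Hypothetical values; adjust the prefix and delay to your own access pattern.
    print(glacier_lifecycle_configuration("raw/lsa-saf/", 30))
```

Because the cropped, processed data stays in a standard storage class, only the rarely accessed originals incur Glacier retrieval delays.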
3. Grid point time series generation
The system generates time series for each grid point (representing a geographic location) using a combination of Lambda functions for parallel processing and AWS Glue for data aggregation. This process was detailed in a previous post, demonstrating how to efficiently handle this large-scale data processing. In Step 3a, each Lambda function processes a day of data. Then, in Step 3b, time series data is restructured with AWS Glue, changing the partition key from date-based to grid point-based. This transformation optimizes the data structure for the subsequent processing steps.
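The change of partition key in Step 3b is easier to see on a toy example. The following plain-Python sketch only shows the shape of the transformation; the real job runs in AWS Glue, and the input layout (a mapping from date to per-grid-point values) is an assumption for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def repartition_by_grid_point(
    by_date: Dict[str, Dict[str, List[float]]]
) -> Dict[str, List[Tuple[str, List[float]]]]:
    """Turn {date: {grid_point: values}} into {grid_point: [(date, values), ...]}."""
    by_grid_point = defaultdict(list)
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    for day in sorted(by_date):
        for grid_point, values in by_date[day].items():
            by_grid_point[grid_point].append((day, values))
    return dict(by_grid_point)


if __name__ == "__main__":
    daily = {
        "2024-01-02": {"gp1": [3.0], "gp2": [4.0]},
        "2024-01-01": {"gp1": [1.0], "gp2": [2.0]},
    }
    print(repartition_by_grid_point(daily))
```

After this step, reading the full history of one grid point touches a single partition instead of one file per day, which is what makes the later per-installation processing efficient.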
4. Historical data integration
Newly extracted data is concatenated with previously processed data from earlier runs, so that the dataset used for analysis is continuous and complete.
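One way to keep such a concatenation safe to re-run is to key records by timestamp, as in the sketch below. The record layout and the rule that newer values replace older ones are assumptions for the example, not details from BDPV's pipeline.

```python
from typing import List, Tuple

Record = Tuple[str, float]  # (ISO timestamp, value)


def merge_series(history: List[Record], new_records: List[Record]) -> List[Record]:
    """Merge new records into the history, de-duplicated and in time order."""
    merged = dict(history)
    merged.update(new_records)  # assumption: a re-extracted value replaces the old one
    return sorted(merged.items())
```

Running the merge twice with the same new records yields the same series, so a retried workflow execution does not create duplicate rows.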
5. Data pre-processing
A Step Functions workflow manages the pre-processing phase by orchestrating Lambda functions and ECS tasks that prepare the data for report generation. By the end of this stage, all necessary data is organized by grid point on Amazon S3, ready for the final processing step.
6. Parallel report generation
The final stage involves generating performance reports for each installation. To optimize processing time, the workflow parallelizes the generation across grid points using multiple ECS tasks. This approach significantly reduces the overall processing time for the 3,000+ installations.
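The fan-out pattern can be imitated locally to show the idea. In this sketch a thread pool stands in for the ECS tasks, and `generate_reports_for_batch` is a hypothetical placeholder for the real report generation; the batch size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def generate_reports_for_batch(grid_points: List[str]) -> List[str]:
    """Placeholder for one ECS task: returns one report identifier per grid point."""
    return [f"report-{gp}" for gp in grid_points]


def run_all(grid_points: Sequence[str], batch_size: int, max_workers: int) -> List[str]:
    """Process batches in parallel; results keep the order of the input."""
    batches = chunk(grid_points, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(generate_reports_for_batch, batches)
        return [report for batch in results for report in batch]
```

Batching by grid point lets installations that share the same weather time series be handled by the same task, so each task loads its input data once.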
7. Data storage and state management
Amazon DynamoDB tracks the processing status and stores calculation parameters, particularly those used for comparing installations with each other. Amazon S3 stores the data generated at each step of the process, providing a reliable, cost-effective, and scalable storage solution.
This workflow demonstrates how different AWS services can be combined to create a robust and efficient processing pipeline. Orchestration with Step Functions provides reliable execution and monitoring of the entire process, while the combination of compute services (Amazon ECS, Lambda, and AWS Glue) provides the right balance of performance and cost-effectiveness for different types of processing tasks.
Real-world impact
Since implementing the AWS solution, BDPV has achieved significant improvements in their monitoring capabilities:
- Processing capacity increased to handle 3,000+ installations, with reports generated every two weeks
- 90% reduction in compute and storage costs through serverless and container-based compute, and optimized use of Amazon S3 and S3 Glacier
- Early detection of performance issues, helping producers minimize production losses
- Evolution from simple benchmarking to precise performance monitoring
“AWS services have enabled us to build something unprecedented in the solar industry – a free monitoring platform that combines satellite data with sophisticated algorithms to provide detailed performance analysis for thousands of small-scale installations,” says David Trebosc. “What’s remarkable is that we’ve achieved this as a volunteer-run organization. Despite having no prior cloud computing experience, AWS’s ease of implementation and wealth of online resources made it relatively straightforward for us to bring this solution to life.”
Looking ahead, BDPV continues to focus on enhancing their monitoring capabilities. The team is currently analyzing the data from their new performance reports to refine their anomaly detection system. Their goal is to speed up the detection of potential issues while maintaining high accuracy and avoiding false alarms. This ongoing research demonstrates BDPV’s commitment to continuous improvement in supporting the growing community of solar homeowners.
Conclusion
In this post, we explored how BDPV used multiple AWS services to build a scalable architecture for monitoring photovoltaic installations. Combining Amazon S3, AWS Step Functions, AWS Lambda, AWS Glue, and Amazon ECS allowed them to create a solution that processes complex satellite data and generates detailed performance reports for thousands of installations.
This architecture demonstrates how organizations can:
- Use a composite approach combining different AWS services for optimal performance and cost
- Use serverless technologies to minimize operational overhead
- Scale complex data processing workflows while maintaining cost efficiency
- Build sophisticated monitoring solutions with limited resources
Ready to build a similar solution? Start by exploring the Step Functions documentation to understand how you can orchestrate your own data processing workflows. From there, dive into AWS Workshops for hands-on labs and more AWS posts for deeper insights into scalable solutions.