AWS Public Sector Blog
Ready to Use Future Climate Information
The climate data factory is a French startup working to make future climate information easily accessible. The team believes that easy access to actionable data on future climate conditions is essential to fight climate change.
Read on for the Q&A with Dr. Harilaos Loukos of the climate data factory on how we can best make use of data.
What is future climate information and what do you mean by “ready to use”?
Future climate information is created by climate models that simulate the Earth’s future climate under scenarios of future atmospheric greenhouse gas concentrations. These are sophisticated 3D models developed by research institutes around the world. They include all of the known drivers of our climate, starting with atmospheric and ocean physics, as well as the biological and chemical processes that influence the climate system. These models are computationally demanding to run and generate petabytes of data, which creates challenges for data storage and access.
By “ready to use,” we mean that we (the climate data factory) have taken care of the data management, processing, calibration with observations, and quality control that is necessary before you can use the data for your own applications. Indeed, the output from climate models is distributed as raw model data, and the mandatory tasks of data management and processing are left to every user to perform. These tasks require significant expertise and resources that can take weeks or months to perform, even for skilled users. For non-specialists, these tasks are particularly difficult.
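To give a rough sense of what “calibration with observations” involves, here is a deliberately simplified sketch of one common idea, a mean “delta change” adjustment of model output toward observed values. This is only an illustration with made-up numbers; the climate data factory’s actual processing and quality-control chain is more sophisticated and is documented in its peer-reviewed papers.

```python
# Simplified illustration only: a basic "delta change" adjustment of raw
# model output toward observations. All values and variable names here are
# hypothetical; this is not the climate data factory's actual method.
import numpy as np

obs_historical = np.array([14.2, 14.5, 14.1, 14.8])    # observed temperatures (°C)
model_historical = np.array([13.1, 13.6, 13.0, 13.9])  # raw model output, same period
model_future = np.array([15.0, 15.4, 15.2, 15.9])      # raw model projection

# Estimate the model's systematic bias over the historical period...
bias = obs_historical.mean() - model_historical.mean()

# ...and apply it to the future projection to obtain calibrated values.
calibrated_future = model_future + bias
print(calibrated_future)
```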
Who uses the data that you produce and what do they use it for?
Our data is used by climate change consultants and scientists for impact studies and modeling. Consultants, for example, use the data to understand how temperature and rainfall patterns may evolve and how those changes could affect the socio-economic activity of a specific city or territory, by analyzing the vulnerability of its supporting infrastructure (agriculture, energy, health, transport, and economic activity) to present and future climate. Based on this assessment, they develop adaptation strategies to reduce vulnerability to climate change.
Our “ready to use” data is most valuable for quantitative studies using environmental models, because raw climate data is not fit for this purpose. For example, crop yield models are calibrated with observations of the present climate, but quality future climate data is necessary to produce assessments of potential crop yield changes that may affect food security. The same holds true for hydrological or ecological models.
How can we access the data?
By visiting our ecommerce site, you can easily search, select, and download climate change model data from a catalog of six weather variables and 30 climate indicators for over 4,300 cities and 70 countries worldwide. We can add any missing location or country within 48 hours. We also support our users with data selection, data use, or any related issues they might have. For organizations that need large data volumes or specific processing, we respond to specific requests as an on-demand service.
Our processing methods are fully documented through online help, reports, and peer-reviewed scientific papers. We are committed to producing our data transparently, to the highest standards of the climate modeling research community, and we partner with academics on both the technical and scientific aspects of our processing chain.
Can you describe how you have used AWS technology to develop your solution?
Our operations include raw data download and storage, data production, data handling, and storage of the resulting products. As a startup with limited resources and no previous experience with Amazon Web Services, we migrated gradually from local resources to AWS. We started by using Amazon Simple Storage Service (Amazon S3) to serve our data products to our ecommerce site, which was the fastest and most efficient way to deliver our data. We now serve all of our data from Amazon S3 and archive it to Amazon Glacier.
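As an illustration of that serving pattern, the sketch below uses boto3 to attach a lifecycle rule that archives older objects to Glacier and to generate a presigned download URL for the ecommerce site. The bucket, prefix, and object key names are hypothetical placeholders, not the company’s actual configuration.

```python
# A minimal sketch: serve a data package from Amazon S3 via a time-limited
# presigned URL, and let a lifecycle rule transition older objects to
# Glacier. Bucket and key names below are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")

# Lifecycle rule: archive processed packages to Glacier after 90 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-climate-data-products",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-to-glacier",
                "Filter": {"Prefix": "packages/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            }
        ]
    },
)

# Presigned URL so the ecommerce site can hand out a temporary download link.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-climate-data-products",
            "Key": "packages/paris_tas_rcp85.zip"},
    ExpiresIn=3600,  # link valid for one hour
)
print(url)
```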
We have also migrated our data handling (going from bulk-processed data to the packaged data products) to AWS. We use core services like Amazon EC2 and AWS Batch with Amazon EFS, which give us control over the time/cost ratio for generating updated data packages. We have now started migrating the last component, data production. It consists of tens of thousands of lines of shell and Python code that will now run on Amazon EC2, AWS Batch, Amazon EFS, and Amazon RDS.
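A minimal sketch of driving one such packaging step through AWS Batch from Python is shown below. The job queue, job definition, script name, and parameters are assumptions standing in for whatever the actual processing chain defines.

```python
# Submit a hypothetical repackaging job to AWS Batch with boto3. Queue,
# job definition, command, and bucket names are placeholders, not the
# climate data factory's actual pipeline.
import boto3

batch = boto3.client("batch")

response = batch.submit_job(
    jobName="repackage-tas-europe",
    jobQueue="climate-processing-queue",   # assumed job queue name
    jobDefinition="package-builder:3",     # assumed job definition and revision
    containerOverrides={
        "command": ["python", "build_package.py",
                    "--region", "europe", "--variable", "tas"],
        "environment": [{"name": "OUTPUT_BUCKET",
                         "value": "example-climate-data-products"}],
    },
)
print(response["jobId"])
```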
We understand that you have some exciting plans to improve your services to your customers, can you share?
Yes, our next version of “ready to use” data will be based on higher-resolution observational datasets. It gets a bit technical, but the idea is to use the most up-to-date historical data produced by public climate centers. The ERA5 reanalysis dataset covering 1979-2018, produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), will be made available this December. The good news is that it will be available directly on Amazon S3 via the Registry of Open Data on AWS, which will make things easier (and cheaper) for us, and probably for others as well.
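For readers who want to explore such public datasets, here is a minimal sketch of anonymous access to a bucket on the Registry of Open Data on AWS. The bucket name "era5-pds" and the key layout are assumptions about how the ERA5 data is organized; check the registry entry for the authoritative structure.

```python
# List objects in a public Registry of Open Data bucket without credentials.
# The bucket name and prefix layout are assumptions for illustration.
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

# List the files published for January 2018 (assumed year/month key layout).
listing = s3.list_objects_v2(Bucket="era5-pds", Prefix="2018/01/")
for obj in listing.get("Contents", []):
    print(obj["Key"], obj["Size"])
```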
What has impressed you most about using AWS?
I am always amazed by the incredible services and resources that are available to startups today compared to the pre-cloud world of not so long ago. Needless to say, the climate data factory wouldn’t exist without those services and resources.
But what strikes me the most is how proactive AWS is at making you feel that you count, even if you are a small startup. We receive great support, and we have a local person we can talk to who answers the same day and identifies the right support for us, like the solutions architects who spent half a day with us working on our issue.
Learn more about the Registry of Open Data on AWS and read more Q&As with organizations using AWS for open data here and here.