Zooniverse’s Open Source Answer to Disaster Relief
Zooniverse created the Planetary Response Network (PRN) to support relief efforts helping crisis victims. The Zooniverse team has known the PRN could be helpful since it was first used after the Nepal earthquake in 2015. The PRN compares satellite data from before and after a disaster and uses that comparison to tell ground-based rescue teams where they need to go to be most effective.
This data provides a ground-level view of big, global-scale issues. Zooniverse aggregates data from 1.4M volunteers, and 65 full-time researchers analyze that data in an effort to reduce the impact of disasters.
We sat down with Sascha T. Ishikawa, Citizen Science Web Developer at Zooniverse, to talk about the PRN and their use of Sentinel-2 data for their project.
Can you tell us what AWS services you use to power PRN?
For all of our major services, we use Amazon Simple Storage Service (Amazon S3), Amazon Elastic Compute Cloud (Amazon EC2), and Amazon Kinesis. Everything Zooniverse and PRN lives in an S3 bucket, from third-party data to static websites and data sets. All of our projects are open source. We have volunteers providing classifications on pre-verified data sources, like satellites, during a certain time around a disaster, which can create a backlog of data that needs to get processed and stored in a database. This firehose of data can take a while to get processed. To combat this challenge, we have EC2 instances ready to go to crunch all of the data. We then have Kinesis set up to listen to the stream of classifications and message board posts. (A classification is when a volunteer sees an image or a video clip and classifies or annotates something in it, like a collapsed building; see images below.) Kinesis is at the heart of what we call “subject retirement.” When an image has been seen by enough people, it gets taken offline to focus volunteers’ efforts on the remaining, unseen images.
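To make the idea of subject retirement concrete, here is a minimal sketch in Python. It is not Zooniverse's code: the class name, the method names, and the in-memory counters are all hypothetical, and the limit of 20 is only an assumption borrowed from the "checked 20 times" figure in the next question. A real deployment would read classification events from a stream such as Kinesis rather than from a list.

```python
from collections import defaultdict

# Assumption: a subject retires after 20 classifications.
RETIREMENT_LIMIT = 20


class RetirementTracker:
    """Hypothetical in-memory counter for classifications per subject."""

    def __init__(self, limit=RETIREMENT_LIMIT):
        self.limit = limit
        self.counts = defaultdict(int)
        self.retired = set()

    def record(self, subject_id):
        """Count one classification; return True if it retires the subject."""
        if subject_id in self.retired:
            # Late classifications for a retired subject are ignored.
            return False
        self.counts[subject_id] += 1
        if self.counts[subject_id] >= self.limit:
            self.retired.add(subject_id)
            return True
        return False

    def active(self, subject_ids):
        """Return the subjects that should still be shown to volunteers."""
        return [s for s in subject_ids if s not in self.retired]


# Usage with a small limit so the effect is easy to see.
tracker = RetirementTracker(limit=3)
for event in ["img-1", "img-2", "img-1", "img-1", "img-2"]:
    tracker.record(event)
remaining = tracker.active(["img-1", "img-2"])
```

In this example "img-1" reaches the limit and drops out, so only "img-2" remains in the pool shown to volunteers.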
You launched with 1,300 images and each one was checked 20 times within 2 hours. How do you handle such a quick burst of traffic? Did you need to scale up at any point to handle the enthusiastic response?
No doubt we had to scale up; all of this is handled with an elastic load balancer, which automatically distributes incoming application traffic across multiple Amazon EC2 instances in the cloud. Zooniverse experiences a constant trickle of data coming from volunteers, press releases, and media attention, but then it can spike to tens of thousands of classifications per hour. It has taken us a while to get that right so there is no downtime. During the 2-hour burst when the PRN launched, we experienced no downtime!
Were you able to replicate some of the AWS architecture you’ve used from other citizen science projects for this initiative?
I have been using AWS since 2009, so I took my experience and definitely used it towards this project. The PRN was built using the Zooniverse project builder, so it is no different from a technology standpoint than our other projects with AWS (hear about Zooniverse’s other projects with AWS in this video). However, what is unique about this project is that it has a separate app that handles data processing to bring satellite images in. And we are constantly looking to improve and incorporate the latest technology.
What lessons from this response will you carry into your next emergency response effort?
The PRN is our answer to disaster relief efforts. It is an ongoing project that we will refine over the next month. The earliest event it was used for was the Nepal earthquake, when it was in a very early beta stage. It was a proof of concept that ended up proving useful, so the funding continued and we got to use it to help relief efforts in Nepal and Ecuador. The plan is to make this a standalone app that fetches data from the European Space Agency or Planet Labs.
You used Sentinel-2 data for a comparative data set. What are the advantages of Sentinel-2? How do you think Sentinel-2 data can be used in future emergency response efforts?
Having Sentinel-2 data available to anyone via Amazon S3 allows us to use it, often within hours of production. Although Sentinel-2 data has a lower spatial resolution (10 m) than Planet Labs data (3-5 m), it contains multi-spectral data (which Planet Labs does not), ranging from visible to near-infrared. The latter, although we didn’t use it for this last Ecuador earthquake, is more sensitive to vegetation, which could be a good indication of landslides. The reason we didn’t use near-infrared was that its spatial resolution is slightly lower than that of the regular visual spectrum images. In short, we made a judgment call to prioritize spatial resolution over multi-spectral data.
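As an illustration of why near-infrared is sensitive to vegetation, one widely used measure is the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red), which is high over healthy vegetation and low over bare ground. The sketch below is not something the PRN team describes doing. It assumes per-pixel red and near-infrared reflectance values are already available, and the function names and the 0.3 drop threshold are hypothetical choices for the example.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        # Avoid division by zero for pixels with no signal.
        return 0.0
    return (nir - red) / total


def vegetation_loss(before, after, drop=0.3):
    """Return indices of pixels whose NDVI fell by more than `drop`.

    `before` and `after` are equal-length lists of (red, nir) pairs.
    The 0.3 default is an illustrative threshold, not a calibrated one.
    """
    flagged = []
    for i, ((r0, n0), (r1, n1)) in enumerate(zip(before, after)):
        if ndvi(r0, n0) - ndvi(r1, n1) > drop:
            flagged.append(i)
    return flagged


# Two pixels: the first is unchanged, the second loses its vegetation signal.
before = [(0.1, 0.5), (0.1, 0.5)]
after = [(0.1, 0.5), (0.3, 0.3)]
flagged = vegetation_loss(before, after)
```

Here only the second pixel is flagged, the kind of change that might point to a landslide and be worth showing to volunteers.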
Take a look at the images below, and here is an actual example of volunteers spotting some form of damage.
Thank you to Zooniverse for their work and their time talking with us! Check out Zooniverse’s other projects here.