Safer, Built by Thorn, Combats Child Sexual Abuse Material on Content-Hosting Sites Using AWS-Powered Solution
2021
The spread of child sexual abuse material (CSAM) online is a complex and pervasive problem, and an urgent one for content publishers and user-generated content-hosting sites that let users store photos, videos, and other data. Perpetrators use those companies’ sites to build communities where they disseminate and collect CSAM without the companies’ knowledge. “Unfortunately, child abuse content likely exists on the majority of sites that accept user-generated content,” Julie Cordua, CEO of the nonprofit Thorn, said in a press release. Reports of abusive material in the United States alone have risen by 15,000 percent in the last 15 years, according to Thorn, which builds technology to defend children from sexual abuse. In the midst of this spike, Thorn set out to stop the viral spread of CSAM by building a tool that could quickly identify, remove, and report CSAM on content-hosting sites. That tool is Safer. In a little over a year and a half, it has helped identify more than 150,000 pieces of CSAM.
Content publishers and user-generated content-hosting sites are often highly motivated to help in the fight against CSAM but lack the tools to identify and remove it effectively. Using a variety of Amazon Web Services (AWS) solutions, Thorn, an AWS Partner, developed Safer, an application that works within customer storage environments to detect CSAM. The application then elevates suspected CSAM to the company for review and helps report confirmed CSAM to the National Center for Missing and Exploited Children (NCMEC), the nonprofit national clearinghouse for this content, which is uniquely positioned to engage law enforcement to rescue victims. SmugMug, through its subsidiary Flickr, was one of the first beta customers of Safer, validating that Safer could scale to scan petabytes of data. By using a full stack of services from AWS, Thorn built an accessible tool that any company can use to identify, remove, and report CSAM on content-hosting sites without a large investment in staff headcount, unnecessary exposure of employees to disturbing material, or unforeseen legal risk.
“AWS helps us remain streamlined across all our products and engineering resources internally and use the newest, most secure, and most flexible tools to stay responsive to our customer base.”
Sarah Potts
Director of Strategic Initiatives, Thorn
Building on AWS to Tackle Abuse
From its founding in 2012, Thorn set out to close the technology gap between bad actors and those on the front lines of protecting and identifying abused children. “When Thorn started, we set our mission specifically on the intersection of technology and child sexual abuse,” says Sarah Potts, director of strategic initiatives for Thorn. To expand its efforts, the organization built Safer, its first enterprise product to address CSAM. “We want to help companies employ the latest detection techniques and best practices to keep their sites, their employees, and their communities safe from child sexual abuse,” says Potts. “We want the internet to be a place where users are not accidentally encountering CSAM. Not only does this pose a severe risk of secondary trauma to internet users, but also victims of these crimes are retraumatized knowing that their abuse content continues to exist on the internet. Removing this information provides these victims a chance to recover.” In the 2017 Survivors Survey from the Canadian Centre for Child Protection, 69 percent of survivors of child sexual abuse reported a fear of being recognized by their abuse imagery—and in some cases actually have been. The permanence of these images on the internet has a lasting impact long after the abuse has taken place.
Importantly, Safer makes protection against CSAM more accessible to the tech industry. “Safer is specifically intended to equip content-hosting companies with the tools they need to quickly identify, remove, and report CSAM at scale,” notes Christina Crimmins, director of product for Safer. “The idea is not only to match images and videos against known pieces of CSAM through hash matching but also—as new ones become uncovered by the Safer community—to share that knowledge and improve matching going forward.” The majority of the tech industry lacks the expertise to tackle the issue with the same level of sophistication that Safer delivers. “The idea is for Safer to be both scalable and accessible to companies that otherwise wouldn’t have access to this technology but that still experience the same issues as any of the larger content-hosting sites,” notes Caroline Chang, senior product marketing manager for Thorn. “It’s a third-party solution that brings the expertise of the entire Thorn system directly to a company’s trust and safety team. Once it’s implemented, it runs as if it were an in-house solution.”
By using detection services that are hosted on the content publishers’ and content-hosting sites’ cloud infrastructure, Safer is able to detect potential CSAM faster than current industry practices. Classifiers are used to elevate potential never-before-detected CSAM. Then hashes—sometimes likened to a digital fingerprint—are generated from scanned user content and sent to a central matching service hosted on Thorn’s cloud infrastructure, where they are compared to a database of known CSAM hashes. Critically, customers’ images or videos never leave the customers’ storage environment. “The content-hosting sites’ user data remains private. Classification and hashing take place within the customer’s infrastructure so that Thorn never receives the actual image or video,” emphasizes Crimmins. “We only receive a hash representation of the content, and then we use that hash to match against a database of hashes that represent known or previously reported CSAM.” If Safer detects CSAM, the customer can use Safer’s reporting feature to share the content with the NCMEC, where the report is reviewed by an expert analyst. If appropriate, the information is sent to law enforcement to attempt to rescue the child. In the meantime, Safer provides secure storage options for reported CSAM in accordance with regulatory obligations while preventing employees from being unintentionally exposed to CSAM.
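The hash-only flow described above can be illustrated with a minimal sketch. It assumes SHA-256 as a stand-in hash (the case study does not name the hashing algorithms Safer uses), and the names hash_content, MatchingService, and scan are hypothetical, chosen only for this illustration. A cryptographic hash like this one matches byte-identical files only, so a production system may rely on other kinds of hashes.

```python
import hashlib


def hash_content(data: bytes) -> str:
    """Runs inside the customer's environment; only the digest is shared.

    SHA-256 is an assumption for illustration, not Safer's documented method.
    """
    return hashlib.sha256(data).hexdigest()


class MatchingService:
    """Hypothetical stand-in for a central matching service.

    It stores and receives hashes only, never the underlying files.
    """

    def __init__(self, known_hashes):
        self._known = set(known_hashes)

    def is_known(self, digest: str) -> bool:
        return digest in self._known

    def add_hash(self, digest: str) -> None:
        # Newly confirmed hashes can be added so later scans benefit.
        self._known.add(digest)


def scan(files: dict, service: MatchingService) -> list:
    """Return the names of files whose hashes match the known set.

    The raw bytes are hashed locally and are never passed to the service.
    """
    return [
        name
        for name, data in files.items()
        if service.is_known(hash_content(data))
    ]


# Placeholder byte strings stand in for stored user files.
service = MatchingService([hash_content(b"previously-reported-file")])
flagged = scan(
    {"a.bin": b"previously-reported-file", "b.bin": b"unrelated-file"},
    service,
)
print(flagged)
```

In this sketch the boundary between the two environments is the call to is_known: everything before it runs next to the customer's storage, and only a fixed-length digest crosses over.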
From Beta Testing to Full Launch: Collaboration between Thorn, AWS, and Flickr
Using AWS services, Thorn built and launched Safer as a beta solution in 2018. Thorn uses Amazon Elastic Kubernetes Service (Amazon EKS) to make the Safer matching system highly available, secure, and scalable. Hashes representing known or previously reported CSAM and the associated metadata are stored in Amazon Simple Storage Service (Amazon S3) and Amazon Relational Database Service (Amazon RDS). The company also uses Amazon Elastic Container Registry (Amazon ECR) to manage development environments and to share Docker images with customers that plan to run Safer in their own AWS environment. Companies running Safer on AWS use Amazon Simple Queue Service (Amazon SQS) to manage content volume fluctuations when scanning for CSAM as well as reporting detected CSAM. Companies running Safer on AWS also use Amazon S3 to store content, which enables a seamless integration with Safer’s detection and reporting functionality.
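The role a queue plays in absorbing content volume fluctuations can be sketched as follows. This is an illustration only: Python's in-process queue.Queue stands in for Amazon SQS, and run_scan_workers and scan_fn are hypothetical names, not part of Safer. The point is that a burst of uploads is placed on the queue at once, while a fixed pool of workers drains it at its own pace.

```python
import queue
import threading


def run_scan_workers(items, scan_fn, num_workers=2):
    """Drain a burst of queued items with a fixed pool of workers.

    queue.Queue is a local stand-in for a managed message queue.
    """
    pending = queue.Queue()
    results = []
    lock = threading.Lock()

    # The burst arrives all at once; the queue absorbs it.
    for item in items:
        pending.put(item)

    def worker():
        while True:
            try:
                item = pending.get_nowait()
            except queue.Empty:
                return  # Nothing left to scan.
            outcome = scan_fn(item)
            with lock:
                results.append((item, outcome))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# A trivial scan function stands in for real detection work.
results = run_scan_workers(["a", "b", "c"], lambda name: name.upper())
print(sorted(results))
```

Because producers and workers only share the queue, the number of workers can be changed without touching the code that enqueues content.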
Flickr, an online photo-management site, became Safer’s second beta customer, helping realize Safer’s vision to make CSAM detection technology available to content-hosting sites and to validate how to operate such a solution at scale. “Flickr was a perfect beta partner,” says Cordua. “With a passionate team committed to upholding company values and serving the community, Flickr was deeply involved with iterative improvements that have made Safer unobjectionably better—developing a tool that will serve the needs of both our future customers and the children at the center of our mission.”
SmugMug was enthusiastic about exploring solutions to eliminate CSAM, starting with its photo-hosting site Flickr, which the company had acquired in 2018. As a free content-hosting site, Flickr has more images than any other SmugMug property and is a more likely target for CSAM. Using Amazon S3 since its launch in 2006, SmugMug has stored billions of photos and videos, amounting to petabytes of data, for its SmugMug and Flickr customers, reading and writing from Amazon S3 at throughputs of up to 500 Gbps. “AWS has helped us to grow our storage using Amazon S3 over 1,000 times without needing to worry about storage capacity and scalability. On AWS, we can focus on innovations to thrill our customers,” says Andrew Shieh, director of operations at SmugMug. That preexisting relationship made Flickr’s integration with Safer seamless.
Aside from using Safer’s hash-matching technology to proactively identify CSAM, Flickr was the first company to adopt its critical reporting feature, designed to interface with the CyberTipline Reporting API at the NCMEC. As the clearinghouse and comprehensive reporting center for missing and sexually exploited children, the NCMEC maintains a variety of hash lists of known CSAM, which specially trained analysts review and further investigate while collaborating with law enforcement for victim identification. “Because Thorn’s mission is to streamline the entire system so that it is easier to remove CSAM from the internet, partnering with the NCMEC is a really big part of our work,” says Potts. “So making NCMEC reports easy for companies to submit—with all the information the NCMEC needs to do its job—is a part of Safer we’re really proud of.” The reporting feature is a direct line of communication into the NCMEC and streamlines mandatory reporting once CSAM has been flagged by Safer, resulting in greater efficiency and employee wellness. Safer was built to support teams of any size, enabling variations in the identification and reporting of CSAM based on customers’ company policies and procedures for investigating and escalating toxic content and suspect accounts.
Working with Flickr, Thorn fine-tuned its AWS architecture so that it could scale and be fully customizable to various content-hosting companies’ needs. “We had a really close feedback cycle with Flickr. We would release something to the Flickr team members, and they would tell us, ‘This isn’t working’ or ‘This new feature would be great for us to have,’” says Emily Schultz, director of engineering at Thorn. “That helped us grow the product to have the features we need as we expand our customer base beyond Flickr.” Soon, Flickr was running the Safer hash on every single piece of content uploaded to the site—which led to a massive win in the fight against child abuse. In May 2020, Flickr reported that Safer had detected and reported abuse content that led to the identification of 21 child victims. That report led to an investigation by law enforcement, and the perpetrator is now in federal prison.
By using AWS, Safer can scale to support content-hosting sites of any size. “We’ve worked with companies that have a single individual responsible for handling CSAM as they begin to explore the depth of this issue on their sites, and we’ve worked with some companies that have matured their processes over the years and are supported by a dedicated team of moderators,” says Chang. “What we’ve learned through our experiences with Flickr is that we need to build a system that will remain flexible to those different needs. No company ever has to feel like it needs to reach a certain critical point in order to adopt Safer. Instead, we fit into the stage and operating practices of the company and then let it grow.” Thorn has also developed a flexible pricing model for Safer: an annual license fee that is tiered based on the number of files a company has. “The costs increase as the volume of content being scanned increases, and AWS has enabled us to scale these services to provide a cost-efficient alternative to building in house. User-generated content will remain unpredictable, but the level of coverage should not,” explains Chang. In September 2020, Safer launched on the AWS Marketplace, which expanded the reach of Safer by introducing the application to AWS customers.
Eliminating CSAM from the Internet
Since October 2018, Safer has grown to support more than 15 customers and has expanded its capabilities to detect even more CSAM. As of October 2020, the application stores over six million hashes that represent CSAM imagery—a 247 percent increase in hashes since its initial launch. From January 2019 to August 2020, Safer matched more than 100,000 image and video files to known CSAM. Thorn’s ultimate goal is to eliminate CSAM from the web entirely.
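As a quick check on what a 247 percent increase means, the short calculation below works backward from the six million figure. The resulting baseline is an inference, not a number stated in the case study, and because the source says "over six million," it is best read as a rough lower bound. The function name implied_baseline is hypothetical.

```python
def implied_baseline(current: float, percent_increase: float) -> float:
    """Return the starting value implied by a percent increase.

    A 247 percent increase means the current value is 3.47 times the start.
    """
    return current / (1 + percent_increase / 100)


# Six million is the figure quoted in the text; the result is approximate.
baseline = implied_baseline(6_000_000, 247)
print(f"Implied hashes at launch: about {baseline / 1_000_000:.1f} million")
```

Under these assumptions, the database held roughly 1.7 million hashes at launch.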
Using Amazon S3 and other AWS services, Thorn was able to develop and operate a tool for identifying, removing, and reporting CSAM on a massive scale. AWS has partnered closely with Thorn to help combat CSAM, providing millions of dollars in AWS credits for Thorn to develop and operate its services and hundreds of hours of pro bono advanced cloud services and technical support so that Thorn can continue to innovate and scale the growth of its products. “AWS is a key contributor to all our work because of its variety of offerings,” says Potts. “AWS helps us remain streamlined across all our products and engineering resources internally and use the newest, most secure, and most flexible tools to stay responsive to our customer base.” With CSAM detection and post-detection solutions powered by AWS and designed to scale up and iterate quickly, Thorn empowers individual companies to contribute and use intelligence across a network of companies to find and take down more CSAM. Every new image or video found is another step closer to making the internet safer, building exponential impact to protect children from exploitation.
Benefits of AWS
- Developed a solution on AWS for detecting, reporting, and removing CSAM
- Automated verification of content against 6 million hashes that represent known CSAM imagery
- Proven technology that matched more than 100,000 image and video files to known CSAM from January 2019 to August 2020
AWS Services Used
Amazon Simple Storage Service (Amazon S3)
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
Amazon Simple Queue Service (Amazon SQS)
Amazon Simple Queue Service (Amazon SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.
Amazon Elastic Container Registry (Amazon ECR)
Amazon Elastic Container Registry (Amazon ECR) is a fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts anywhere.
AWS Marketplace
AWS Marketplace is an online software store that helps customers find, buy, and immediately start using the software and services that run on AWS.