Spokeo Case Study
Spokeo, headquartered in Pasadena, California, is a website that enables users to search for people by first or last name, address, phone, or social handles—or any combination of these identifiers—in order to connect with others and protect themselves. The Spokeo people-search engine aggregates data from online and offline sources, including white-pages listings, public records, and social-network information. Spokeo users have access to 12 billion public records packaged into simple, easy-to-use people profiles.
“Google search ranks us higher if our pages are fast. By making our site faster using Amazon EFS, we’re improving our ranking in addition to making our customers happier.”
Austin Fonacier, Lead Software Architect, Spokeo
To support its fast-growing search engine, Spokeo needed a better way to scale its systems. “Our web application hosts more than 18 million unique visitors per month, times however many page views each visitor has,” says Mike Daly, the company’s chief technology officer. “That’s in addition to having 12 billion records, which we need to aggregate and process at scale. Our previous on-premises solution couldn’t provide that scale.”
Webpage-loading time was another major challenge for Spokeo. To rank highly in Google search results and give customers a fast search experience, the company needed to serve pages with consistently high performance. However, it could not reach the speed it wanted with its existing web application framework. “We tweaked it as much as we could, and we still could only render pages at a limited speed,” says Austin Fonacier, the company’s lead software architect. “It took too much time, money, and effort, and we still had unmeasurable performance gains.”
To address these challenges, Spokeo required a shared file system that could serve records to a fleet of application servers simultaneously. Without a shared file system, the company would need to process and aggregate data for every single search, and build each page dynamically.
Why Amazon Web Services
To solve its scalability challenges, Spokeo decided to move its on-premises web application to the Amazon Web Services (AWS) Cloud. “AWS is—and has always been—far ahead of the competition in terms of the range of services offered,” Daly says. “In addition to being ahead of the game, AWS always has competitive pricing.”
Initially, Spokeo moved its web application to Amazon Elastic Compute Cloud (Amazon EC2) instances. The company also started using Amazon DynamoDB, a fully managed NoSQL database service, to host its 12 billion records: the original records received from third-party sources as well as the data that results from processing and aggregation.
To solve its performance and search-result challenges, Spokeo began using Amazon Elastic File System (Amazon EFS), a shared-file-storage solution for the cloud. With Amazon EFS, Spokeo stores records in files that serve as a shareable cache for fast rendering. As a result, the company can stage and cache search results, which shortens page load times and speeds up search results. By relying on Amazon EFS, the organization avoided having to build its own network file-storage solution and having to dynamically render individual pages for each search.
“We knew we wanted to deliver pages faster to our end users and web crawlers, and Amazon EFS was the right solution as far as speed, scalability, and costs were concerned,” says Fonacier. Spokeo uses Amazon EFS as the core of its webpage indexing solution. A reverse proxy layer behind the application checks whether a cached page is in Amazon EFS, and, if so, delivers it to an end user. Using Amazon EFS, Spokeo supports 30 million webpage views each day. “Amazon EFS can support 250,000 page views per second if needed,” says Fonacier. Spokeo was able to get up and running on Amazon EFS very quickly. “Once we got the notification that our Amazon EFS system was ready, it was in production in only a few minutes,” Fonacier says.
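The cache check described above can be sketched in a few lines of Python. This is a minimal illustration, not Spokeo's implementation: the function names `cache_file` and `serve_page`, the hashed file layout, and the `render` callback are all hypothetical, and the shared file system is assumed to be mounted as an ordinary directory on every application server.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_file(cache_root: Path, url_path: str) -> Path:
    """Map a URL path to a file under the shared cache directory (hypothetical layout)."""
    digest = hashlib.sha256(url_path.encode("utf-8")).hexdigest()
    # A two-character prefix directory keeps any single directory from growing too large.
    return cache_root / digest[:2] / f"{digest}.html"


def serve_page(
    cache_root: Path, url_path: str, render: Callable[[str], str]
) -> tuple[str, bool]:
    """Return (html, cache_hit). Render and store the page only on a miss."""
    target = cache_file(cache_root, url_path)
    try:
        # Cache hit: the page was pre-generated or rendered by an earlier request.
        return target.read_text(encoding="utf-8"), True
    except FileNotFoundError:
        pass

    # Cache miss: fall back to the (assumed) dynamic renderer.
    html = render(url_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file and rename it, so readers never see a partial page.
    tmp = target.with_suffix(".tmp")
    tmp.write_text(html, encoding="utf-8")
    tmp.replace(target)
    return html, False
```

Because every server reads from the same directory, a page rendered once by any server becomes a cache hit for all of the others, which is the property a shared file system provides here.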
Most recently, Spokeo completed a full migration of its remaining data center systems and applications to AWS.
Relying on AWS to power its website, Spokeo can easily scale to support its growing volume of site traffic. Daly says, “We have more than 12 billion public records that we aggregate from a variety of sources, and AWS makes those records easily searchable for our website visitors.” Fonacier adds, “We used to get hit with huge spikes of traffic, going from zero to thousands upon thousands of requests per second, and our infrastructure would crumble underneath.”
Using Amazon EFS, Spokeo has improved its page-loading speeds: serving a webpage’s content from Amazon EFS takes between 30 and 50 milliseconds, compared to 300 to 500 milliseconds previously. “We can render pages much faster than we ever could before by using Amazon EFS,” says Fonacier. “It gives us a great way to pre-generate and store all our pages and still be very fast when either Google or our end users hit them, regardless of whether it’s during peak traffic times or slow ones. Now we’re getting more consistent traffic, because our Google search traffic is served out of our Amazon EFS solution.”
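To put the quoted figures in proportion, the short calculation below uses only numbers stated in this case study: 30 million page views per day, the 250,000 page views per second that Fonacier cites, and the before and after serving times. The variable names are illustrative, and the daily average assumes traffic spread evenly across the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_views = 30_000_000          # page views per day, as quoted
quoted_peak_per_second = 250_000  # page views per second, as quoted

# Average request rate if traffic were perfectly even over the day.
average_per_second = daily_views / SECONDS_PER_DAY

# How far the quoted peak sits above that average.
headroom = quoted_peak_per_second / average_per_second

# Serving-time improvement at the fast and slow ends of the quoted ranges.
speedup_fast_end = 300 / 30
speedup_slow_end = 500 / 50

print(f"average load: {average_per_second:.1f} page views per second")
print(f"quoted peak is about {headroom:.0f} times the average")
print(f"speedup: {speedup_fast_end:.0f}x to {speedup_slow_end:.0f}x")
```

In other words, the serving time dropped by roughly a factor of ten at both ends of the range, and the quoted peak capacity is several hundred times the average daily load.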
With faster page-loading speeds and search-result rendering, Spokeo can improve its search-engine-optimization rankings. “Google search ranks us higher if our pages are fast,” Fonacier says. “By making our site faster using Amazon EFS, we’re improving our ranking in addition to making our customers happier.”
Previously, managing scalability and speed was very difficult for the Spokeo team. Now, with Amazon EFS underneath, the entire process is simplified. “We’ve never done network file storage at scale before, so it’s something we would have had to figure out ourselves—spinning up and managing the system, and tweaking it to get it as fast as we want it to be,” says Fonacier. “So instead of spending thousands of hours doing all that and putting a lot of money into it, we literally just clicked a few buttons and had the solution ready. Amazon EFS just works, out of the box. We don’t have to think about it. We don’t have to worry about IOPS or provisioning.”
Spokeo is also seeing significant cost savings by using Amazon EFS. “Delivering pages out of Amazon EFS is much cheaper than spinning up all the architecture required to dynamically render each page for every search,” says Daly. “If we received 20 times our normal amount of traffic tomorrow, for example, our Amazon EFS costs would remain the same. When you’re dealing with the scale we are, that is very beneficial. Using AWS, we know we can easily and cost effectively support whatever traffic comes our way. That’s why it’s the best technology for our needs.”
AWS Services Used
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale.
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources.