Change is a constant on the front page of Woot.com. At Woot, the original daily deals site, founded in 2004 and acquired by Amazon in 2010, new deals on electronics, clothing and outdoor gear, sports equipment, housewares, and other products appear every day, and sometimes every 30 minutes.
In 2018, the company decided it was time for a change on the site's backend, too. Specifically, Woot wanted to deprecate its legacy data warehouse—based on Amazon Relational Database Service (RDS) for Oracle Database—and shift to a cloud-native data warehousing solution on Amazon Web Services (AWS).
The challenges that the legacy warehouse posed for the company included the need for new custom pipelines each time data sources were added, which sometimes took weeks to build; a cumbersome querying process that meant some potentially valuable queries were never even attempted; and the necessity of tightly restricting user access to the data warehouse, because it resided in the company's production AWS account.
Today, Woot is running a serverless data warehouse based on Amazon Kinesis Data Firehose and Amazon Simple Storage Service (Amazon S3) for data ingestion and storage. It uses AWS Lambda to orchestrate AWS Glue for ETL job scheduling and metadata management tasks. Amazon Athena and Amazon QuickSight offer powerful, user-friendly querying and data visualization, even for users with no SQL knowledge. And all this sits in a separate data warehouse account, fully segregated from the company's production account.
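As a rough illustration of what "AWS Lambda orchestrating AWS Glue" can look like, the sketch below shows a Lambda handler that starts a Glue job run for each new object reported in an Amazon S3 event notification. It is not Woot's actual code: the S3 trigger, the job name `warehouse-etl`, and the `--source_path` job argument are all assumptions made for the example.

```python
# Minimal sketch, not Woot's implementation.
# Assumptions: the function is triggered by S3 event notifications, and a Glue
# job with the hypothetical name below reads a "--source_path" argument.
JOB_NAME = "warehouse-etl"  # hypothetical Glue job name


def handler(event, context, glue=None):
    """Start one Glue job run per S3 object listed in the incoming event."""
    if glue is None:
        # Imported lazily so the function can be exercised with a stub client.
        import boto3
        glue = boto3.client("glue")

    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        response = glue.start_job_run(
            JobName=JOB_NAME,
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        run_ids.append(response["JobRunId"])
    return {"started": run_ids}
```

Passing the Glue client in as an optional parameter is a design choice for the example only; it lets the handler be tried locally with a stand-in object before it is deployed.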
Given the range of options available from AWS for obtaining, managing, and gaining insights from data, how exactly did Woot decide on the solution it selected? In short, by listening to customers—in this case, the many categories of employees who rely on the data warehouse to ensure great experiences for Woot customers.
"I wanted this project to be a force for good within Woot," says Chaya Carey, a data engineer at Woot and the sole employee responsible for managing the company's data warehouse. "With the tight deadline we faced, it was tempting to just get a list of requirements, execute, and worry about technical debt later. Instead, we spent a lot of time talking about who used the data warehouse, what challenges they were encountering, and what they needed to use the data for."
One goal Carey developed through these conversations was to shift to a model of shared responsibility for data, which would eliminate the need for her to build or alter custom pipelines for every new service or service change. "I wanted services to send data to the data warehouse and have it be accepted with minimal intervention," she says. "But I needed to find an easy way to push data that fit with the existing skill set of developers."
Carey found a ready solution by having developers use AWS Software Development Kits (SDKs) for the various programming languages and platforms in use at Woot to send data to the warehouse's Kinesis Data Firehose delivery stream.
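For a sense of how small that integration can be, here is a minimal sketch using the AWS SDK for Python (boto3) and its Firehose `put_record` call. The delivery stream name, the envelope fields, and the helper names `build_record` and `send_event` are hypothetical and chosen for illustration; the article does not describe Woot's actual record format.

```python
import json
from datetime import datetime, timezone

# Minimal sketch, not Woot's implementation.
STREAM_NAME = "warehouse-ingest"  # hypothetical delivery stream name


def build_record(service, payload):
    """Wrap a service payload in a small JSON envelope (assumed format)."""
    envelope = {
        "service": service,
        "sent_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
    # A trailing newline keeps records separable when Firehose batches
    # several of them into one S3 object.
    return {"Data": (json.dumps(envelope) + "\n").encode("utf-8")}


def send_event(service, payload, client=None, stream_name=STREAM_NAME):
    """Push one record to the delivery stream and return its record ID."""
    if client is None:
        # Imported lazily so the function can be exercised with a stub client.
        import boto3
        client = boto3.client("firehose")
    response = client.put_record(
        DeliveryStreamName=stream_name,
        Record=build_record(service, payload),
    )
    return response["RecordId"]
```

A service would then call something like `send_event("orders", {"order_id": 123})` wherever it already handles the event, rather than maintaining a separate batch export job.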
"Instead of building a batch job to send data from a service, all developers need to do now is add an API call that pushes data to the Firehose endpoint," says Carey. "Kinesis Data Firehose made the shared responsibility model a much easier sell to our developers. This was a big win for the migration, because we eliminated the lag time we used to have for adding new services or adapting to changes in existing ones."
By choosing Amazon Athena and QuickSight for data querying and visualization, Woot has made life much easier for the many employees (including accountants, financial analysts, inventory analysts, vendor managers, and customer service representatives) who need information from the Woot data warehouse to do their jobs but lack data science or business intelligence skill sets.
"Queries in the previous solution required opening a ticket, obtaining manager approval, receiving a password that was only good for 90 days—and on top of that you needed to understand SQL to write your query," says Carey. "Now, by using Amazon QuickSight, anyone can build graphs and other visualizations just by dragging and dropping, with no SQL knowledge needed. For employees who want more customization, there is an option to query through the Athena console, but, again, no SQL knowledge is needed."
Not only is the process of querying now simpler, but the queries themselves also take much less time to complete. "Every user we've talked to has told us how much faster querying is in Amazon Athena," says Carey. "We're also hearing that queries that were previously too complex are running with no trouble on Athena, which means people are able to answer even more questions than they could before."
Because the AWS tools in the new solution are so user-friendly, a growing number of employees are taking a self-service approach to answering questions. "People are so impressed by the visualizations they can build in QuickSight that they are looking for more and more ways to use it," says Carey. "We only have four BI employees, and traditionally they've always had more requests than they could get to. Now nontechnical employees can use Amazon QuickSight to get information on their own, so Woot BI resources have more time for strategic projects."
Carey says the migration has not only addressed the challenges of the previous solution but has also positioned Woot to start experimenting with the many other tools and services available on AWS, all while saving money. "By shifting to a serverless AWS data warehouse solution, we cut the cost of operating our data warehouse by almost 90 percent," says Carey. She is also pleased to report that, as a result of the new solution's serverless architecture, she was finally able to take a three-week vacation without being paged once.
Carey adds, "The fact that the data warehouse is now in its own account and stores everything in Amazon S3 enables me and our BI engineers to seamlessly integrate with and explore other technologies, such as Amazon Elastic MapReduce, Amazon SageMaker, and Amazon Redshift Spectrum. We're really excited about where we can go from here."
To learn more, visit aws.amazon.com/what-is/data-warehouse/.