Jungle Scout Leverages AWS Data Lab to Accelerate Analytics Solution on AWS


Challenge
With a financial and business mandate to reduce data silos and advance the company’s analytics capabilities, Jungle Scout’s engineering team was tasked with building a data lake to modernize its existing data platform. With only a few months to complete this work, Jungle Scout used the AWS Data Lab to accelerate the development of its data lake and to architect the solution in line with current best practices.
Solution
With the support of the AWS Data Lab, Jungle Scout's team built the foundation of an Amazon Simple Storage Service (Amazon S3) based data lake in only four days. The data lake ingests data from Jungle Scout's custom Node.js application in near-real time using Amazon Kinesis Data Firehose. Amazon EMR transforms, cleanses, and prepares the raw dataset, using its integrated Apache Hudi functionality to manage change data capture, and then stores the curated dataset in a separate Amazon S3 bucket. AWS Step Functions orchestrates the pipeline. Finally, Jungle Scout data scientists and data engineers run ad hoc queries against the curated dataset using Amazon Athena. See the architecture diagram below.
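The ingestion step can be illustrated with a minimal sketch. It is not Jungle Scout's code: their producer is a Node.js application, and Python is used here only for illustration. The sketch assumes events are plain dictionaries, that they are written as newline-delimited JSON, and that `firehose_client` is a boto3 Firehose client created elsewhere. The stream name and event fields are hypothetical. Batches respect the `PutRecordBatch` limits of 500 records and 4 MiB per call.

```python
import json
from typing import Dict, Iterable, Iterator, List

MAX_RECORDS_PER_BATCH = 500              # PutRecordBatch record-count limit
MAX_BYTES_PER_BATCH = 4 * 1024 * 1024    # PutRecordBatch request-size limit (4 MiB)


def to_record(event: Dict) -> Dict:
    """Serialize one event as a newline-terminated JSON record."""
    payload = json.dumps(event, separators=(",", ":")) + "\n"
    return {"Data": payload.encode("utf-8")}


def batch_records(events: Iterable[Dict]) -> Iterator[List[Dict]]:
    """Group records into batches that stay within both Firehose limits."""
    batch: List[Dict] = []
    size = 0
    for event in events:
        record = to_record(event)
        record_size = len(record["Data"])
        is_full = len(batch) == MAX_RECORDS_PER_BATCH
        too_big = size + record_size > MAX_BYTES_PER_BATCH
        if batch and (is_full or too_big):
            yield batch
            batch, size = [], 0
        batch.append(record)
        size += record_size
    if batch:
        yield batch


def send(firehose_client, stream_name: str, events: Iterable[Dict]) -> int:
    """Send all events and return how many records Firehose reported as failed."""
    failed = 0
    for batch in batch_records(events):
        response = firehose_client.put_record_batch(
            DeliveryStreamName=stream_name,  # hypothetical stream name supplied by caller
            Records=batch,
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

A production producer would also retry the individual records that Firehose reports as failed. This sketch only counts them.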
Benefit
Jungle Scout left the AWS Data Lab with a functioning data lake and a repeatable pattern for building data pipelines that hydrate the data lake and provide a simple method for joining datasets from different systems in one centralized location. By using Amazon S3 as the core of the data lake, Jungle Scout can reduce its storage footprint across other databases and remove data silos, ultimately helping the team reduce cost and increase productivity. Additionally, the data lake allows Jungle Scout to store time series data at intraday rather than daily granularity in a cost-effective way. The solution also makes it simpler to manage multiple versions of product metadata changes, giving Jungle Scout’s data scientists and engineers the flexibility to view data changes several times per day and troubleshoot data faster.
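The idea of managing multiple versions of a product's metadata can be sketched in a few lines. This is a conceptual illustration only, not Apache Hudi's implementation or Jungle Scout's pipeline. It assumes each change record carries a key identifying the product and an ordering field that says which version is newer. The field names `product_id` and `updated_at` are hypothetical.

```python
from typing import Dict, Iterable, List


def apply_changes(
    snapshot: Dict[str, Dict],
    changes: Iterable[Dict],
    key: str = "product_id",       # hypothetical record key field
    ordering: str = "updated_at",  # hypothetical ordering field
) -> Dict[str, Dict]:
    """Upsert changes into a snapshot, keeping the newest version per key."""
    merged = dict(snapshot)
    for change in changes:
        current = merged.get(change[key])
        # A late-arriving older version must not overwrite a newer one.
        if current is None or change[ordering] >= current[ordering]:
            merged[change[key]] = change
    return merged


def history(
    changes: Iterable[Dict],
    key_value: str,
    key: str = "product_id",
    ordering: str = "updated_at",
) -> List[Dict]:
    """Return every recorded version of one product, oldest first."""
    versions = [c for c in changes if c[key] == key_value]
    return sorted(versions, key=lambda c: c[ordering])
```

Keeping both the latest snapshot and the ordered history is what lets an engineer see how a product's metadata changed during a day, rather than only its end-of-day state.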
“By leveraging the AWS Data Lab, we were able to launch our analytics solution to production only three months after joining the lab and with only two engineers working full-time on the project. This has resulted in a major shift in how engineers at Jungle Scout build data processing pipelines.”
Alex Handley, Principal Architect, Jungle Scout
About Jungle Scout
Jungle Scout is an all-in-one platform for finding, launching, and selling Amazon products. Jungle Scout provides customers with a full suite of business management solutions and powerful market intelligence to help entrepreneurs and brands manage their Amazon businesses.
About AWS Data Lab
AWS Data Lab offers accelerated, joint engineering engagements between customers and AWS technical resources to create tangible deliverables that accelerate data and analytics modernization initiatives. During the lab, AWS Data Lab Solutions Architects and AWS service experts support the customer by providing prescriptive architectural guidance, sharing best practices, and removing technical roadblocks. Customers leave the engagement with a prototype that is custom fit to their needs, a path to production, deeper knowledge of AWS services, and new relationships with AWS service experts.
Architecture Diagram
