Case Study / Matching & Solution Business

2023

Recruit uses AWS for their common infrastructure to support large-scale data processing.

Utilizing over 30,000 CPU cores to process their compute jobs, Recruit has accelerated their business and development.

1,200

Number of machine learning jobs run per day

30,000 CPU cores

Maximum concurrent processing capacity

Approximately 50

Number of projects in use

50%

Reduced development lead time for recommendation systems

Approximately 10 person-months per year

Operational workload saved for the recommendation system

Overview

Recruit Co., Ltd. operates the Matching & Solution business within the Recruit Group. To accelerate its business and development, the company built "Crois," a data processing platform used in many projects across Recruit's internal organizations, including the recommendation system for Recruit's real estate information site. Crois utilizes Amazon Web Services (AWS) managed services and can scale up to 30,000 CPU cores for large-scale workloads.


Business Issues | Developed "Crois," a data processing platform used company-wide

Recruit provides services that connect customers and clients, such as "Rikunabi," "Jalan," and "SUUMO." Operating services of such different scales requires analysis of advertising data, log data, and demographic data. The company's Data Management & Planning Office therefore developed "Crois" as a data processing platform that can be used across internal organizations and projects.

“We initially developed Version 1 in 2017 as a prototype of our data processing infrastructure for ad technology. Version 2, the current form, was released in 2019 after continuous evolution. In 2021, Recruit merged its operating and functional companies into a new organization. After examining the data analysis environments of each company, we found similar processing and requirements, so we began operating this product as infrastructure that can be used throughout the company,” says Mr. Naoyuki Abe, Division Officer of the Data Technology Unit, Data Management & Planning Office, Product Development Management Office, Product Management Division.

"Crois" is a data processing platform equipped with a workflow function for running analysis jobs that use machine learning and other methods. Data processing steps are cataloged as reusable "modules." Highly versatile modules can be used across the organization regardless of business unit. Users can also register their own modules for common use.

“The concept we've developed is a "data solution kitchen." We regard data as ingredients and modules as recipes, and the business organization cooks a dish called a data solution from those recipes. This relieves data scientists and data engineers in the business organization of the burden of building a separate analysis infrastructure, and at the same time allows scientists and planners who do not have engineering skills to focus on data analysis without the help of engineers.” (Mr. Abe)


“By utilizing AWS managed services for "Crois," a large-scale data processing platform for our internal organization, we have achieved the high agility and scalability required by Recruit's business.”

Mr. Sogo Oishi
Division Officer, Data Management & Planning Office, Product Development Management Office, Product Management Division, Recruit Co., Ltd.

The Solution: AWS Batch and Amazon ECS on AWS Fargate enable large-scale processing

Currently, "Crois" provides functions including a module execution infrastructure, a container image catalog, a workflow engine, and an authorization management system, all of which can be operated via an API or a web UI.

AWS Batch and Amazon ECS on AWS Fargate are used for the container execution infrastructure. AWS Batch is used for tasks that consume large amounts of CPU, memory, and disk resources, such as machine learning (ML), and for tasks that require GPU instances. Amazon ECS on AWS Fargate, which can scale quickly, is used for lightweight tasks such as querying a data warehouse.
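The split between the two execution backends can be pictured as a small routing rule. The following is only a sketch: the `TaskSpec` shape, the `choose_backend` function, and the numeric cutoffs are hypothetical and are not taken from "Crois"; the case study only says that heavy or GPU tasks go to AWS Batch and lightweight tasks go to Amazon ECS on AWS Fargate.

```python
from dataclasses import dataclass

# Assumed cutoffs for what counts as a "lightweight" task.
# These numbers are illustrative only and do not come from the case study.
LIGHT_MAX_VCPU = 4
LIGHT_MAX_MEMORY_GIB = 16
LIGHT_MAX_DISK_GIB = 20


@dataclass(frozen=True)
class TaskSpec:
    """Resource request for one module run (hypothetical shape)."""
    name: str
    vcpu: float
    memory_gib: float
    disk_gib: float = 0.0
    gpus: int = 0


def choose_backend(task: TaskSpec) -> str:
    """Return "batch" for GPU or resource-heavy tasks, "fargate" otherwise."""
    if task.gpus > 0:
        return "batch"
    is_heavy = (
        task.vcpu > LIGHT_MAX_VCPU
        or task.memory_gib > LIGHT_MAX_MEMORY_GIB
        or task.disk_gib > LIGHT_MAX_DISK_GIB
    )
    return "batch" if is_heavy else "fargate"


if __name__ == "__main__":
    examples = [
        TaskSpec("warehouse-query", vcpu=0.5, memory_gib=1),
        TaskSpec("model-training", vcpu=32, memory_gib=128, disk_gib=500),
        TaskSpec("gpu-inference", vcpu=4, memory_gib=16, gpus=1),
    ]
    for spec in examples:
        print(f"{spec.name}: {choose_backend(spec)}")
```

In a real platform the cutoffs would more likely follow the limits and pricing of each service than fixed constants like these.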

AWS Step Functions is used as the workflow engine. By handing all control of job execution over to Step Functions and separating it from "Crois," Recruit was able to ensure both scalability and fault tolerance.
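As an illustration of what delegating job control to Step Functions can look like, the sketch below builds an Amazon States Language definition with one Fargate step followed by one AWS Batch step. The state names, the `build_workflow` helper, the retry settings, and the placeholder ARNs are assumptions made for this example; the actual "Crois" workflows are not described in the case study.

```python
import json


def build_workflow(cluster_arn: str, task_definition_arn: str,
                   job_queue_arn: str, job_definition_arn: str) -> dict:
    """Build a two-step Amazon States Language definition (illustrative)."""
    # Assumed retry policy: Step Functions, not the platform, handles retries.
    retry = [{
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 30,
        "MaxAttempts": 2,
        "BackoffRate": 2.0,
    }]
    return {
        "Comment": "Hypothetical workflow: lightweight query, then heavy ML job",
        "StartAt": "QueryWarehouse",
        "States": {
            "QueryWarehouse": {
                "Type": "Task",
                # Step Functions service integration for ECS; ".sync" waits
                # for the task to finish before moving on.
                "Resource": "arn:aws:states:::ecs:runTask.sync",
                "Parameters": {
                    "LaunchType": "FARGATE",
                    "Cluster": cluster_arn,
                    "TaskDefinition": task_definition_arn,
                    # NetworkConfiguration is omitted here for brevity.
                },
                "Retry": retry,
                "Next": "TrainModel",
            },
            "TrainModel": {
                "Type": "Task",
                # Step Functions service integration for AWS Batch.
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": "train-model",
                    "JobQueue": job_queue_arn,
                    "JobDefinition": job_definition_arn,
                },
                "Retry": retry,
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder ARNs with a dummy account ID.
    definition = build_workflow(
        "arn:aws:ecs:ap-northeast-1:123456789012:cluster/example",
        "arn:aws:ecs:ap-northeast-1:123456789012:task-definition/query:1",
        "arn:aws:batch:ap-northeast-1:123456789012:job-queue/example",
        "arn:aws:batch:ap-northeast-1:123456789012:job-definition/train:1",
    )
    print(json.dumps(definition, indent=2))
```

Because the state machine owns sequencing and retries, the platform only needs to start an execution and read its status, which is one way to read the separation described above.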

“Since there were only a few development members at the beginning, the key point was to use managed services that could reduce the operational load. Considering the scalability needed to run hundreds of instances simultaneously, AWS Batch and Amazon ECS on AWS Fargate were the way to go, as the containers are simple to scale and do not impose an operational burden,” said Mr. Toshiaki Uno of the Data Management & Planning Office, Product Development Management Office, Product Management Division, explaining the reason for choosing these AWS services.

The authorization management system uses AWS Identity and Access Management (IAM) and AWS Key Management Service (AWS KMS) to ensure a high level of security by issuing, for each root project, a KMS key that is paired with an IAM role.

“If data is stored without encryption, developers with access privileges can view the contents. Therefore, we took security and privacy into consideration by encrypting all intermediate files and access keys with AWS KMS so that even developers cannot see related information,” said Mr. Uno.
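One way to pair a KMS key with an IAM role per project is a key policy that grants cryptographic operations only to that project's role. The sketch below generates such a policy statement; the `project_key_policy` helper, the role naming, and the choice of actions are assumptions for illustration and do not describe how "Crois" actually configures its keys.

```python
import json


def project_key_policy(account_id: str, project_role_name: str) -> dict:
    """Return a KMS key policy letting only one project role use the key.

    Hypothetical helper. A real key policy also needs a statement that lets
    administrators manage the key; it is omitted here for brevity.
    """
    role_arn = f"arn:aws:iam::{account_id}:role/{project_role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowProjectRoleToUseKey",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": [
                    "kms:Encrypt",
                    "kms:Decrypt",
                    "kms:GenerateDataKey",
                ],
                # In a key policy, "*" refers to the key the policy is attached to.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Dummy account ID and an assumed role name.
    policy = project_key_policy("123456789012", "example-project-role")
    print(json.dumps(policy, indent=2))
```

With a policy of this shape, a developer whose identity is not the project role cannot decrypt intermediate files encrypted under that project's key, which matches the intent described in the quote above.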

The development and operation of "Crois" have been conducted entirely in-house by the company's development team since Version 1. The development team, which started with a few people, has grown to more than a dozen as the number of job executions has increased.

“During development, we always discuss among team members which of the many AWS services are best to use and how to develop efficiently with these tools. Whenever we have a problem, we consult with AWS solutions architects,” said Mr. Uno.

The Impact: Approximately 1,200 jobs executed per day across about 50 projects

Launched in 2019, "Crois" was immediately put to use in internal projects. Its performance has gradually been recognized, and the number of jobs executed per day has now reached approximately 1,200. The utilization rate has been at 99.9%.

“In 2021, when we merged our operating companies, the Data Management & Planning Office became a matrix-type organization in which the domain-specific units for data strategy and planning in each business domain intersect with the specialized units that provide cross-disciplinary support across domains. Since then, the reputation of "Crois" has spread by word of mouth across the business organizations, and the number of requests to use it in their own businesses has increased,” said Mr. Masafumi Tsurutani of the Data Management & Planning Office, Product Development Management Office, Product Management Division.

On "Crois," multiple large-scale processes can be executed simultaneously, supporting up to 30,000 CPU cores as of December 2022.

“Within Recruit, there are many departments that run large-scale machine learning (ML) workloads, with requirements for 1,000 or 1,500 parallel processes. "Crois" is a scalable platform that can easily meet these requirements,” said Mr. Uno.

Currently, "Crois" is being used in about 50 projects within Recruit. A typical use case for large-scale processing is the recommendation system for "SUUMO," its real estate information website.

“The "SUUMO" recommendation system had been using an in-house batch workflow for many years, which had accumulated technical debt and reduced development and operational efficiency. The introduction of "Crois" has increased development efficiency through the use of general-purpose modules and the reuse of modules created by users themselves, reducing development lead time by 50%. We can also leave the management and operation of the compute infrastructure to AWS managed services, which is expected to save up to approximately 10 person-months of operation per year,” says Mr. Tsurutani.

Expectations for "Crois" continue to grow. As Mr. Abe says, “For business organizations, the biggest advantage is that they can concentrate on improving matching accuracy, which is what they should be committed to in the first place, without having to develop data processing infrastructure on their own.”

“As the number of projects in use increases, we expect that more computing resources and a faster execution infrastructure will be required in the future. We will continue to work with AWS to evolve into a more scalable infrastructure,” said Mr. Abe.

Architecture

[Figure: Recruit architecture diagram]

Company Profile: Recruit Co., Ltd.

The company was founded in 1960 as University Newspaper Advertising Co. Ltd. In 2012, it moved to a holding company structure and was split into Recruit Co., Ltd., which oversees the Matching & Solutions business, and several operating companies and functional companies. In 2021, the new Recruit was established by merging with seven core operating companies and functional companies. The company currently runs a matching business that connects clients and users and a SaaS-type business support business, guided by its vision of "Follow Your Heart" and its mission of "Opportunities for Life. Faster, simpler and closer to you."

Key Services Currently In Use

AWS Batch

AWS Batch enables developers, scientists, and engineers to efficiently run hundreds of thousands of batch and ML computing jobs while optimizing computational resources, so they can focus on analyzing results and solving problems. 


Amazon ECS on AWS Fargate

AWS Fargate is a serverless, pay-as-you-go computing engine that lets you focus on building applications without managing servers. AWS Fargate supports both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS).


AWS Step Functions

AWS Step Functions is a visual workflow service that enables developers to build distributed applications using AWS services, automate processes, orchestrate microservices, and build data and machine learning pipelines.


AWS IAM

AWS Identity and Access Management (IAM) allows you to specify which users and groups have access to AWS services and resources, centrally manage granular permissions, and analyze access to improve permissions across AWS.



