AWS Partner Network (APN) Blog

Simplifying Talent Acquisition Processes with Quantiphi and a Modern Data Strategy on AWS

By Prudhvi Raj Atluri, Associate Solution Architect, AWS – Quantiphi
By Sanchit Jain, Data Analytics and Cloud Practice Lead, AWS – Quantiphi
By Sudhir Gupta, Principal Partner Solutions Architect – AWS


Companies can utilize a variety of strategies to find talented individuals. Many use online talent databases for talent sourcing, while others prefer more traditional means such as referrals or networking.

Whichever methods a company uses to identify potential employees, it’s critical to use the right tools to monitor the talent acquisition process (TAP). Companies and talent sourcing agencies also need to oversee the process from start to finish, which is typically done with talent management software.

However, such software has its own limitations. In most cases, it does not retain the historical data or lifecycle records that capture the key metrics needed for analytics.

Quantiphi’s cloud-native data platform facilitates the convergence of talent and recruiter performance data to provide crucial insights into the talent acquisition process.

In this post, we will highlight the critical aspects of Quantiphi’s serverless, fully-managed extract, transform, load (ETL) pipeline along with the benefits of the centralized lake house (data lake + data warehouse) solution built on Amazon Web Services (AWS) in helping talent acquisition companies.

We’ll also describe how it helped a leading talent management software enabler company make better decisions and improve the way customers analyze and monitor TAP and hiring performance.

Quantiphi is an AWS Premier Tier Services Partner and AWS Marketplace Seller. It is an artificial intelligence (AI)-first company driven by the desire to solve transformational problems at the heart of the business.

Data Lake Architecture Helps Talent Acquisition Companies

Talent management software is commonly used by talent acquisition companies to manage the process of finding, assessing, and developing talent within an organization. However, while this software provides basic features for managing job vacancies and employee placements, it falls short in terms of advanced data analytics.

To derive meaningful insights and streamline the recruitment process, companies need a data lake analytics platform that can handle large amounts of data, promote self-service capabilities, and provide robust security and scalability.

This enables companies to easily move data, generate ad-hoc reports, and create business intelligence (BI) dashboards that can help analyze the job lifecycle, hiring performance, and recruiter performance over time.

Quantiphi’s Batch Data Analytics and BI Platform

Quantiphi has created a framework to rapidly build an AWS cloud-native data analytics and BI platform. This framework helps customers overcome the initial challenges of configuring a production-ready data lake and building scalable automated workflows with minimal customization and effort.

The framework contains modules and functions that are optimized for performance and scalability based on Quantiphi’s experience and expertise in delivering high-quality automated data pipelines to support data and analytics initiatives.

Quantiphi’s serverless, fully managed ETL framework provides a scheduled data ingestion mechanism that fetches incremental data, then stages, integrates, and finally ingests the published data into Amazon Redshift, a cloud data warehouse. This provides analytics and visualization capabilities around the talent acquisition data with the best price-performance at any scale.


Figure 1 – Detailed solution architecture.

Data Ingestion Layer (ETL/ELT)

Quantiphi’s data lake ingestion framework uses a workflow engine on AWS Step Functions to automate the process of transferring data. PySpark code running on AWS Glue is used to perform ETL of data, from the staging to the integration layer and eventually to the published layer.

This process is made easier and faster through the use of a lightweight, reusable blueprint that defines the configuration for various steps of data transformation. The framework is capable of performing both ETL (extract, transform, load) and ELT (extract, load, and transform) patterns.
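
As a rough illustration of how a blueprint could drive the workflow, the sketch below turns a small configuration into an Amazon States Language definition that runs one AWS Glue job per layer. The blueprint structure, the job names, and the job arguments are assumptions made for this example; the post does not show Quantiphi’s actual blueprint format.

```python
import json

# Hypothetical blueprint: the source, layer, and Glue job names are illustrative only.
BLUEPRINT = {
    "source": "candidates",
    "steps": [
        {"layer": "staging", "glue_job": "stage_candidates"},
        {"layer": "integration", "glue_job": "integrate_candidates"},
        {"layer": "published", "glue_job": "publish_candidates"},
    ],
}


def build_state_machine(blueprint):
    """Turn a blueprint into a state machine definition that runs one
    Glue job per layer, in the order the blueprint lists them."""
    steps = blueprint["steps"]
    states = {}
    for i, step in enumerate(steps):
        state = {
            "Type": "Task",
            # Service integration that starts a Glue job and waits for it to finish.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {
                "JobName": step["glue_job"],
                "Arguments": {
                    "--source": blueprint["source"],
                    "--layer": step["layer"],
                },
            },
        }
        if i + 1 < len(steps):
            state["Next"] = f"Run_{steps[i + 1]['layer']}"
        else:
            state["End"] = True
        states[f"Run_{step['layer']}"] = state
    return {"StartAt": f"Run_{steps[0]['layer']}", "States": states}


definition = build_state_machine(BLUEPRINT)
print(json.dumps(definition, indent=2))
```

Keeping the per-source details in configuration means a new data source only needs a new blueprint entry rather than a new workflow.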

Amazon DynamoDB manages the state of ETL bookmarks by saving the change data capture (CDC) checkpoint information for various data sources to ensure incremental data is processed.

AWS Lambda performs lightweight tasks that supplement the AWS Step Functions workflow and records the state of the ETL process in DynamoDB.
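
The checkpoint pattern described above can be sketched in a few lines. This example uses an in-memory class as a stand-in for the DynamoDB table so that it runs without AWS; the `updated_at` field and the `jobs` source name are hypothetical.

```python
from datetime import datetime, timezone


class CheckpointStore:
    """In-memory stand-in for a DynamoDB table keyed by data source name."""

    def __init__(self):
        self._items = {}

    def get(self, source):
        return self._items.get(source)

    def put(self, source, checkpoint):
        self._items[source] = checkpoint


def run_incremental_load(store, source, rows):
    """Return only rows changed after the stored checkpoint, then advance it."""
    last = store.get(source) or datetime.min.replace(tzinfo=timezone.utc)
    new_rows = [r for r in rows if r["updated_at"] > last]
    if new_rows:
        # Advance the checkpoint to the newest change seen in this batch.
        store.put(source, max(r["updated_at"] for r in new_rows))
    return new_rows


def ts(day):
    return datetime(2023, 1, day, tzinfo=timezone.utc)


store = CheckpointStore()
rows = [{"id": 1, "updated_at": ts(1)}, {"id": 2, "updated_at": ts(2)}]
first = run_incremental_load(store, "jobs", rows)

rows.append({"id": 3, "updated_at": ts(3)})
second = run_incremental_load(store, "jobs", rows)

print([r["id"] for r in first], [r["id"] for r in second])  # [1, 2] [3]
```

In a real pipeline the checkpoint would be written only after the batch has been loaded successfully, so that a failed run is retried from the previous checkpoint.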

Data Lake and Warehouse (Lake House)

Amazon Simple Storage Service (Amazon S3) stores all of the staging, integration, and published data files pushed from the ELT layer. The S3 bucket is partitioned per use case, and files are stored in JSON, Parquet, and Avro formats at different steps for faster retrieval.
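
One common way to partition such a bucket is with Hive-style key prefixes, which let query engines skip irrelevant partitions. The sketch below is an assumption for illustration: the post does not specify the key layout, the partition columns, or which layer uses which file format.

```python
from datetime import date

# Assumed mapping of layer to file format; the post names the formats but
# not which layer uses which.
LAYER_FORMAT = {"staging": "json", "integration": "avro", "published": "parquet"}


def object_key(layer, tenant, dataset, run_date, part=0):
    """Build a Hive-style partitioned key so queries can prune by tenant and date."""
    ext = LAYER_FORMAT[layer]
    return (
        f"{layer}/{dataset}/tenant={tenant}/"
        f"year={run_date:%Y}/month={run_date:%m}/day={run_date:%d}/"
        f"part-{part:05d}.{ext}"
    )


key = object_key("published", "acme", "job_requisitions", date(2023, 4, 7))
print(key)
# published/job_requisitions/tenant=acme/year=2023/month=04/day=07/part-00000.parquet
```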

AWS Glue databases hold the metadata schemas, and AWS Glue external tables catalog the staging and integration layer schemas.

AWS Lake Formation is used to centrally manage and control the access to the data in the S3 data lake across different groups of users. Different sets of policies are defined for each dataset present in the data lake, which helps in the democratization of the data while accessing through Amazon Athena. Thus, the column-level and row-level security for specific datasets is managed through these policies within AWS Lake Formation for different groups of users.
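
To make the row-level and column-level policies concrete, the sketch below builds the request payload for a Lake Formation data cells filter. The account ID, database, table, column names, and filter expression are hypothetical. The payload is only constructed here and not sent, so the example runs without AWS credentials.

```python
def data_cells_filter(catalog_id, database, table, name, row_expression, columns):
    """Build the TableData payload for Lake Formation's CreateDataCellsFilter API."""
    return {
        "TableCatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Name": name,
        # Row-level security: only rows matching the expression are visible.
        "RowFilter": {"FilterExpression": row_expression},
        # Column-level security: only the listed columns are visible.
        "ColumnNames": list(columns),
    }


table_data = data_cells_filter(
    catalog_id="111122223333",
    database="published",
    table="placements",
    name="recruiters_emea",
    row_expression="region = 'EMEA'",
    columns=["placement_id", "job_id", "recruiter_id", "region", "placed_on"],
)

# With boto3 installed and credentials configured, the payload could be sent as:
#   boto3.client("lakeformation").create_data_cells_filter(TableData=table_data)
print(table_data["RowFilter"])
```

The filter is then granted to a group of users, who see only the permitted rows and columns when they query the table through Athena.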

Amazon Redshift serves as the data warehouse. It maintains the talent acquisition data in fact and dimension tables and implements slowly changing dimensions (SCD) Type 2 to maintain the incremental history.
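
For readers unfamiliar with SCD Type 2, the idea is that a changed record never overwrites the old one: the current row is closed with an end date and a new current row is added. The following is a minimal in-memory sketch of that logic, with a hypothetical recruiter dimension; in the warehouse the same steps would be expressed in SQL.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # conventional end date for the current row


def apply_scd2(dimension, change, effective):
    """Apply one changed source record to a Type 2 dimension (a list of dicts).

    If the tracked attributes differ, the current row is closed and a new
    current row is appended. If nothing changed, the dimension is untouched.
    """
    key, attrs = change["recruiter_id"], change["attrs"]
    current = next(
        (r for r in dimension if r["recruiter_id"] == key and r["is_current"]),
        None,
    )
    if current is not None:
        if current["attrs"] == attrs:
            return dimension  # no change, keep history as is
        current["valid_to"] = effective
        current["is_current"] = False
    dimension.append(
        {
            "recruiter_id": key,
            "attrs": dict(attrs),
            "valid_from": effective,
            "valid_to": OPEN_END,
            "is_current": True,
        }
    )
    return dimension


dim = []
apply_scd2(dim, {"recruiter_id": 7, "attrs": {"team": "Tech"}}, date(2023, 1, 1))
apply_scd2(dim, {"recruiter_id": 7, "attrs": {"team": "Finance"}}, date(2023, 6, 1))
for row in dim:
    print(row["attrs"]["team"], row["valid_from"], row["valid_to"], row["is_current"])
```

Because both versions are kept, a report on hires from March 2023 can still attribute them to the team the recruiter belonged to at that time.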

Data marts are created as summary tables using materialized views on Redshift. They contain all of the calculated KPIs and metrics the dashboards require, aggregated to the level each dashboard needs. The materialized views are refreshed as often as the KPIs need to be updated on the dashboard.
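
A data mart of this kind might look like the following sketch, which generates the Redshift statements for a summary materialized view and its refresh. The schema, table, column, and KPI names are hypothetical and chosen only to fit the talent acquisition domain.

```python
def create_mv_sql(view, fact, dims, measures):
    """Return Redshift DDL for a summary materialized view grouped by `dims`."""
    select_list = ", ".join(
        list(dims) + [f"{expr} AS {alias}" for alias, expr in measures.items()]
    )
    group_by = ", ".join(dims)
    return (
        f"CREATE MATERIALIZED VIEW {view} AS "
        f"SELECT {select_list} FROM {fact} GROUP BY {group_by};"
    )


def refresh_mv_sql(view):
    """Return the statement that brings the view up to date with its base tables."""
    return f"REFRESH MATERIALIZED VIEW {view};"


create_sql = create_mv_sql(
    view="mart.mv_recruiter_monthly",
    fact="dw.fact_placements",
    dims=["recruiter_id", "hire_month"],
    measures={"hires": "COUNT(*)", "avg_days_to_fill": "AVG(days_to_fill)"},
)
refresh_sql = refresh_mv_sql("mart.mv_recruiter_monthly")
print(create_sql)
print(refresh_sql)
```

The refresh statement can be scheduled after each ETL run so the dashboards read precomputed aggregates rather than scanning the fact table.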

Consumption Layer

  • Amazon Athena provides ANSI SQL queries to extract the enriched data from the S3 data lake for ad-hoc requirements or reports. Athena also provides a way to extract and transform data from the data lake for Excel-based reports.
  • Amazon QuickSight visualizes dashboards from the backend materialized views on Redshift. A different set of dashboards is created for each persona and shared with the corresponding group of end users according to their access rights.
  • Row-level security is applied to the QuickSight datasets, allowing only the relevant group of users to access sensitive information and KPIs presented on the dashboards.
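
QuickSight row-level security is typically driven by a small rules dataset that maps users or groups to the field values they may see. The sketch below renders such a rules file as CSV. The group names and the `region` and `department` fields are hypothetical, and it assumes the grant-access mode in which an empty cell places no restriction on that field.

```python
import csv
import io


def rls_rules_csv(rules, fields):
    """Render dataset rules: one row per group, one column per restricted field."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["GroupName", *fields])
    for group, allowed in rules.items():
        # An empty cell is assumed to mean no restriction on that field.
        writer.writerow([group, *(allowed.get(f, "") for f in fields)])
    return buf.getvalue()


rules = {
    "recruiters-emea": {"region": "EMEA"},
    "hiring-managers-tech": {"region": "NA", "department": "Tech"},
    "talent-leadership": {},
}
rules_csv = rls_rules_csv(rules, ["region", "department"])
print(rules_csv)
```

Maintaining the rules as data keeps access changes out of the dashboards themselves: adding a new group is a new row, not a new dashboard.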

Logging, Monitoring, and Security Layer

  • AWS Key Management Service (AWS KMS) is used to encrypt the data at rest stored in the raw and refined S3 buckets, Redshift clusters, OpenSearch clusters, and Amazon CloudWatch log groups.
  • Amazon CloudWatch stores all of the application logs from the ELT Lambda functions and Glue jobs in their respective log groups. These logs are automatically streamed to the Amazon OpenSearch Service clusters.
  • Amazon OpenSearch Service is used to query and access the application logs streamed from the CloudWatch log groups. The logs are used to create OpenSearch monitoring dashboards, which provide real-time metrics on the ELT ingestion pipeline and help with performance tuning of the entire pipeline.
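
One way to stream logs from CloudWatch to OpenSearch is a Lambda function subscribed to the log group. CloudWatch Logs delivers subscription data as base64-encoded, gzipped JSON, which the function decodes and reshapes into an OpenSearch bulk request. The sketch below shows only that transformation, with a locally simulated event; the index name and document fields are assumptions, and the post does not say which streaming mechanism Quantiphi used.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Decode the base64-encoded, gzipped JSON payload from CloudWatch Logs."""
    raw = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(raw))


def to_bulk_body(payload, index):
    """Build an OpenSearch _bulk request body (newline-delimited JSON)."""
    lines = []
    for e in payload["logEvents"]:
        lines.append(json.dumps({"index": {"_index": index, "_id": e["id"]}}))
        lines.append(
            json.dumps(
                {
                    "@timestamp": e["timestamp"],
                    "message": e["message"],
                    "log_group": payload["logGroup"],
                    "log_stream": payload["logStream"],
                }
            )
        )
    return "\n".join(lines) + "\n"


# Simulate the event shape locally so the sketch runs without AWS.
sample = {
    "logGroup": "/aws-glue/jobs/output",
    "logStream": "jr_example",
    "logEvents": [
        {"id": "1", "timestamp": 1680000000000, "message": "rows_written=120"}
    ],
}
encoded = base64.b64encode(gzip.compress(json.dumps(sample).encode())).decode()
event = {"awslogs": {"data": encoded}}

body = to_bulk_body(decode_subscription_event(event), "etl-logs")
print(body)
```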

DevOps Implementation

Quantiphi developed this entire solution to provide a near real-time dashboard that continuously monitors and visualizes talent acquisition and hiring performance for a client’s customers. The solution is built to meet scalability requirements and to deliver faster insights to end users.

The following DevOps practices significantly increase how often the ELT pipeline can be released to production for new customers:

  • Automation of ELT pipeline using CI/CD and AWS CloudFormation templates.
  • Automation of Redshift and QuickSight dashboards using CI/CD and CloudFormation templates.
  • Automated monitoring and alerting mechanism deployment through CI/CD and CloudFormation templates.
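
As a hedged illustration of the first practice, the sketch below generates a minimal CloudFormation template that defines one AWS Glue job per pipeline layer, parameterized by tenant. The parameter names, job naming scheme, script locations, and Glue version are assumptions for this example, not details taken from the solution.

```python
import json


def glue_job_template(layers):
    """Return a minimal CloudFormation template with one Glue job per layer."""
    resources = {}
    for layer in layers:
        resources[f"{layer.capitalize()}Job"] = {
            "Type": "AWS::Glue::Job",
            "Properties": {
                # Fn::Sub fills in the Tenant parameter at deploy time.
                "Name": {"Fn::Sub": f"${{Tenant}}-{layer}-etl"},
                "Role": {"Ref": "GlueRoleArn"},
                "GlueVersion": "4.0",
                "Command": {
                    "Name": "glueetl",
                    "ScriptLocation": {
                        "Fn::Sub": f"s3://${{ScriptBucket}}/jobs/{layer}.py"
                    },
                },
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "Tenant": {"Type": "String"},
            "GlueRoleArn": {"Type": "String"},
            "ScriptBucket": {"Type": "String"},
        },
        "Resources": resources,
    }


template = glue_job_template(["staging", "integration", "published"])
print(json.dumps(template, indent=2))
```

A CI/CD pipeline can deploy the same template once per tenant with different parameter values, which is what makes onboarding a new customer largely a configuration change.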

Customer Use Case: TalentNet

As the originators of direct sourcing, TalentNet’s talent acquisition platform has helped the world’s leading businesses revolutionize how they attract, engage, and power their workforce. The platform is deployed as an enterprise software-as-a-service (SaaS) platform and hosted on AWS. Large contingent workforce programs utilize it to directly attract, retain, and engage top-notch talent without relying on conventional staffing suppliers.

TalentNet wanted to build a data lake platform capable of integrating incremental CDC data from a multi-tenant SaaS application. The platform also had to be flexible enough to combine the best features of an enterprise data warehouse and a data lake, with sophisticated governance and management.

Quantiphi developed a serverless, fully managed, and streaming ETL/ELT pipeline and a lake house solution to enable analytics on talent acquisition and hiring performance data. This allowed TalentNet to monitor and optimize talent acquisition processes to make recruitment more efficient and effective.

Using Quantiphi’s cloud-native data analytics and BI platform built on AWS, TalentNet could scale as required to support its existing client base and onboard new tenants with minimal configuration. The platform provided an efficient way to generate insights through QuickSight dashboards, which facilitated better decision-making for the recruiters, hiring managers, and company personnel responsible for managing talent and monitoring hiring performance.

“Quantiphi helped us manage and consolidate talent data and eventually added a business intelligence layer to the platform,” said Shawn Duggan, VP of Application Development and Platform Engineering at TalentNet. “Data-driven decisions are now integral to our customers’ talent acquisition processes and overall hiring strategy.”

The following figure describes how Quantiphi’s batch data ingestion pipeline ingests incremental CDC data to AWS.


Figure 2 – SaaS multi-tenant app to AWS data lake ingestion pipeline.

Summary

Integrating data lake and cloud technologies with SaaS-based applications such as talent management software can offer unprecedented insight to organizations, hiring managers, and recruiters.

Quantiphi’s cloud-native data analytics and business intelligence platform built on AWS helps integrate these applications and their data into a consolidated data lake. This provides strategic insights that extend beyond data analysis into process optimization to drive better business outcomes.

You can also learn more about Quantiphi in AWS Marketplace.



Quantiphi – AWS Partner Spotlight

Quantiphi is an AWS Premier Tier Services Partner and AI-first digital engineering company driven by the desire to solve transformational problems at the heart of business.
