SEEK Asia modernizes search with CI/CD and Amazon OpenSearch Service

This post was written in collaboration with Abdulsalam Alshallah (Salam), Software Architect, and Hans Roessler, Principal Software Engineer at SEEK Asia.

SEEK is a market leader in online employment marketplaces with deep and rich insights into the future of work. As a global business, SEEK has a presence in Australia, New Zealand, Hong Kong, Southeast Asia, Brazil and Mexico and its websites attract over 400 million visits per year. SEEK Asia’s business operates across seven countries and includes leading portal brands such as jobsdb.com and jobstreet.com and leverages data and technology to create innovative solutions for candidates and hirers.

In this post, we share how SEEK Asia modernized their search-based system with a continuous integration and continuous delivery (CI/CD) pipeline and Amazon OpenSearch Service.

Challenges associated with a self-managed search system

SEEK Asia provides a search-based system that enables employers to manage interactions between hirers and candidates. Although the system was already on AWS, it was a self-managed system running on Amazon Elastic Compute Cloud (Amazon EC2) with limited automation.

The self-managed system posed several challenges:

Slower release cycles – Deploying new configurations or new field mappings into the Elasticsearch cluster was a high-risk activity because changes affected the stability of the system. The little automation on both the self-managed cluster and workflows led to slower release cycles.
Higher operational overhead – Sizing the cluster to deliver greater performance, while managing cost effectively, was the other challenge. As with every other distributed system, even with sizing guidance, identifying the appropriate number of shards per node and the number of nodes to meet performance requirements still required some amount of trial and error, turning the exercise into a tedious and time-consuming activity. This consequently also led to slower release cycles. To overcome this challenge, in many occasions, oversizing the cluster became the quickest way to achieve the desired time to market, at the expense of cost.

Further challenges the team faced with self-managing their own Elasticsearch cluster included keeping up with new security patches, and minor and major platform upgrades.

Automating search delivery with Amazon OpenSearch Service

SEEK Asia knew that automation would the key to solving the challenges of their existing search service. Automating the undifferentiated heavy lifting would enable them to deliver more value to their customers quickly and improve staff productivity.

With the problems defined, the team set out to solve the challenges by automating the following:

Search infrastructure deployment
Search A/B testing infrastructure deployment
Redeployment of search infrastructure for any new infrastructure configuration (such as security patches or platform upgrades) and index mapping updates

The key services enabling the automation would be Amazon OpenSearch Service and establishing a search infrastructure CI/CD pipeline.

Architecture overview

The following diagram illustrates the architecture of the SEEK infrastructure and CI/CD pipeline with Amazon OpenSearch Service.

The workflow includes the following steps:

Before the workflow kicks off, an existing Amazon OpenSearch Service cluster with a live feeder hydrates it. The live feeder is a serverless application built on Amazon Simple Queue Service (Amazon SQS) via Amazon Simple Notification Service (Amazon SNS) and AWS Lambda. Amazon SQS queues documents for processing, Amazon SNS enables data fanout (if required), and a Lambda function is invoked to process messages in the SQS queue to import data into Amazon OpenSearch Service. The feeder receives live updates for changes that need to be reflected on the cluster. Write concurrency to Amazon OpenSearch Service is managed by limiting the number of concurrent Lambda function invocations.
The Amazon OpenSearch Service index mapping is version controlled in SEEK’s Git repository. Whenever an update to the index mapping is committed, the CI/CD pipeline kicks off a new Amazon OpenSearch Service cluster provisioning workflow.
As part of the workflow, a new data hydration initialization feeder is deployed. The initialization feeder construct is similar to the live feeder, with one additional component: a script that runs within the CI/CD pipeline to calculate the number of batches required to hydrate the newly provisioned Amazon OpenSearch Service cluster up to a specific timestamp. The feeder systems were designed to achieve idempotency processing. This meant unique identifiers (UIDs) from the source data stores are reused for each document, and duplicated documents update an existing document with the exact same values.
At the same time as Step 3, an Amazon OpenSearch Service cluster is deployed. To accelerate the initial data hydration process temporarily, the new cluster may be sized two or three times larger against sizing guidance with shard replicas and index refresh interval disabled until the hydration process is complete. The existing Amazon OpenSearch Service cluster remains as is, which means that two clusters are running concurrently.
The script inspects the number of documents the source data store has and groups the documents by batch sizes. SEEK identified that 1,000 documents per batch provided the optimal ingestion import time, after running numerous experiments.
Each batch is represented as one message and is queued into Amazon SQS via Amazon SNS. Every message that lands in Amazon SQS invokes a Lambda function. The Lambda function queries a separate data store, builds the document, and loads it into Amazon OpenSearch Service. The more messages that go into the queue, the more functions are invoked. To create baselines that allowed for further indexing optimization, the team took the following configurations into consideration and reiterated to achieve higher ingestion performance:
1. Memory of the Lambda function
2. Size of batch
3. Size of each document in the batch
4. Size of cluster (memory, vCPU, and number of primary shards)
With the initialization feeder running, new documents are streamed to the cluster until it is synced with the data source. Eventually, the newly provisioned Amazon OpenSearch Service cluster catches up and is in the same state as the existing cluster. The hydration is complete when there are no remaining messages in the SQS queue.
The initialization feeder is deleted and the Amazon OpenSearch Service cluster is downsized automatically to complete the deployment workflow, with replica shards created and the index refresh interval configured.
Live search traffic is routed to the newly provisioned cluster when A/B testing is enabled via the API layer built on Application Load Balancer, Amazon Elastic Container Service (Amazon ECS), and Amazon CloudFront. The API layer decouples the client interface from the backend implementation that runs on Amazon OpenSearch Service.

Improved time to market and other outcomes

With Amazon OpenSearch Service, SEEK was able to automate an entire cluster, complete with Kibana, in a secure, managed environment. If testing didn’t produce the desired results, the team could change the dimensions of the cluster horizontally or vertically using different instance offerings within minutes. This enabled them to perform stress tests quickly to identify the sweet spot between performance and cost of the workload.

“By integrating Amazon OpenSearch Service with our existing CI/CD tools, we’re able to fully automate our search function deployments, which accelerated software delivery time,” says Abdulsalam Alshallah, APAC Software Architect. “The newly found confidence in the modern stack, alongside improved engineering practices, allowed us to mitigate the risk of changes—improving our time to market by 89% with zero impact to uptime.”

With the adoption of Amazon OpenSearch Service, other teams also saw improvements, including the following:

Common Vulnerability and Exposure (CVE) has dropped to zero with Amazon OpenSearch Service handling the underlying hardware security updates on SEEK’s behalf, improving their security posture
Improved availability with the Amazon OpenSearch Service Availability Zone awareness feature

Conclusion

Amazon OpenSearch Service managed capabilities has helped SEEK Asia to improve customer experience with speed and automation. By removing the undifferentiated heavy lifting, teams can deploy changes quickly to their search engines, allowing customers to get the latest search features faster and ultimately contributing to the SEEK purpose of helping people live more productive working lives and organisations succeed.

To learn more about Amazon OpenSearch Service, see Amazon OpenSearch Service features, the Developer Guide, or Introducing OpenSearch.

About the Authors

Fabian Tan is a Principal Solutions Architect at Amazon Web Services. He has a strong passion for software development, databases, data analytics and machine learning. He works closely with the Malaysian developer community to help them bring their ideas to life.

Hans Roessler is a Principal Software Architect at SEEKAsia. He is excited about new technologies and upgrading legacy to newer stacks. Always staying in touch with the latest technologies is one of his passions.

Abdulsalam Alshallah (Salam) is a Software architect at SEEK, Previously a Lead Cloud Architect for SEEKAsia, Salam has always been excited about new technologies, Cloud, Serverless & DevOps, in addition to his passion of eliminating wasted time/effort & resources; He is also one of the leaders of AWS User Group Malaysia.