AWS Partner Network (APN) Blog

Customized Mapping Performance Evaluation with Amazon SageMaker and NextBillion.AI’s ENZYME System

By Muchen Tang, Data Scientist – NextBillion.ai
By Glendon Thaiw, Solutions Architect – AWS
By YokeTong Tan, Partner Solutions Architect – AWS


In today’s fast-paced world, accurate and reliable maps are essential for a wide range of applications, including navigation, logistics, and location-based services.

NextBillion.ai’s vision is to provide an industry-leading platform that helps enterprises build, scale, and manage their own mapping ecosystem. An AWS Partner and AWS Marketplace Seller, NextBillion.ai provides a bundle of location technologies such as navigation, routing, and snap-to-road. It also builds optimization solution suites for supply chain and logistics use cases in resource planning and operational scheduling.

To manage a billion different maps for millions of enterprises, a systematic approach to evaluating map quality for different clients is necessary. In this post, NextBillion.ai introduces the evolution of its internal system, called ENZYME, for measuring one key map quality metric: estimated time of arrival (ETA).

An inaccurate ETA can result in a poor user experience in industries like food delivery, ride hailing, field service, and logistics. It can also reduce overall system efficiency by causing sub-optimal decisions in areas like driver allocation and vehicle delivery planning.

What Affects ETA?

In practice, traditional routing engines produce routes by modeling the real road network as an abstract graph and running a “shortest path” algorithm on it. Here, “shortest” is an abstract notion: edge costs are normally not raw road lengths but driving times, obtained by weighting each road segment according to live traffic estimates.
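As a rough illustration of the idea above, the sketch below runs Dijkstra's algorithm over a graph whose edge weights are travel times rather than distances. It is not NextBillion.ai's route engine; the function name `fastest_route`, the edge tuple layout, and the toy network are all assumptions made for this example.

```python
import heapq


def fastest_route(graph, origin, destination):
    """Dijkstra's algorithm with edges weighted by travel time in seconds.

    graph maps node -> list of (neighbor, length_m, speed_mps) tuples
    (a hypothetical layout chosen for this sketch).
    Returns (eta_seconds, path), or (inf, []) if the destination is unreachable.
    """
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == destination:
            break
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry, a faster way to this node was found
        for neighbor, length_m, speed_mps in graph.get(node, []):
            # Edge weight is time = length / current speed, not length alone.
            candidate = elapsed + length_m / speed_mps
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if destination not in best:
        return float("inf"), []
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return best[destination], path[::-1]


# Hypothetical toy network: the direct road A->C is shorter than A->B->C
# (1.5 km versus 2 km) but congested, so it is slower to drive.
toy_graph = {
    "A": [("B", 1000.0, 20.0), ("C", 1500.0, 5.0)],
    "B": [("C", 1000.0, 20.0)],
    "C": [],
}
eta_seconds, path = fastest_route(toy_graph, "A", "C")
```

In this toy network the route through B is chosen even though it is longer, because the congested direct road takes more time. The speed on each edge is where a live traffic estimate would enter.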

Two main factors affect ETA accuracy:

  • Map data, which represents road geometry, road speed profiles, toll and traffic light information, turning rules and restrictions, vehicle height and weight restrictions, and so on.
  • Driver behavior, such as preferring the fastest or the shortest route, preferring trunk roads over residential roads, or avoiding tolls, which affects the choice of route and thus overall ETA accuracy.

Therefore, an efficient route planning engine combined with accurate map data, traffic information, and an advanced artificial intelligence (AI) model leads to an accurate ETA.

System Design

The following process flow was designed by NextBillion.ai to run the ETA evaluation process. An internal copy of the target map service, called the route engine, is replicated for the following reasons:

  • First, to retrieve the route planning results for every origin-destination pair for the training and evaluation process. Directly calling the production environment could affect its service-level agreement (SLA) and stability.
  • Second, to retrieve additional route-relevant features for the model training steps. Data extract, transform, and load (ETL) and processing steps are necessary to run the internal map service. At the same time, intermediate map data is generated for the later feature extraction steps.

Next, the client’s data is processed, primarily the stream of drivers’ global positioning system (GPS) trajectories, to retrieve features relevant to driver behavior. The feature extraction process is completed by incorporating the intermediate map data from earlier steps. This results in the generation of origin-destination pairs, including the ground-truth actual time of arrival (ATA) and route-related features.
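To make the step from a GPS trajectory to an origin-destination pair concrete, here is a minimal sketch. It assumes each trip arrives as a list of `(unix_seconds, latitude, longitude)` points, which is a hypothetical layout; the `TripSample` fields and the driven-distance feature are likewise illustrative and not ENZYME's actual feature set.

```python
import math
from dataclasses import dataclass


@dataclass
class TripSample:
    origin: tuple          # (lat, lon) of the first GPS point
    destination: tuple     # (lat, lon) of the last GPS point
    ata_seconds: float     # ground-truth actual time of arrival, as a duration
    driven_meters: float   # example route-related feature


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(min(1.0, math.sqrt(h)))


def trip_sample(points):
    """Build one training sample from the GPS points of a single trip."""
    if len(points) < 2:
        raise ValueError("a trip needs at least two GPS points")
    ordered = sorted(points)  # sort by timestamp
    coords = [(lat, lon) for _, lat, lon in ordered]
    driven = sum(haversine_m(p, q) for p, q in zip(coords, coords[1:]))
    ata = ordered[-1][0] - ordered[0][0]
    return TripSample(coords[0], coords[-1], ata, driven)


# Hypothetical three-point trip, given out of order on purpose.
sample = trip_sample([(60, 1.30, 103.81), (0, 1.30, 103.80), (120, 1.31, 103.81)])
```

The ATA here is simply the time between the first and last GPS point. A production pipeline would also need to segment the raw stream into trips and match points to roads, which this sketch leaves out.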

Subsequently, the model training process is used to train machine learning (ML) models to further improve ETA accuracy.

Figure 1 – ETA evaluation process flow.

There are multiple challenges in building the above ETA evaluation system:

  • The processes above are complex and each requires particular supporting tools, so an efficient process management system is needed to coordinate them. A proper data layer is also needed to store the intermediate results that connect the processes.
  • ETA evaluation requires processing and analyzing vast amounts of data. The overall processing is time-consuming and resource-intensive.
  • An adaptable AI service is necessary for model training, validation, and model performance metrics calculation. It should support a wide variety of ML models, as well as deep learning models for more accurate model performance.
  • As customer or user data is required for supervised model training, legal and data security concerns can arise. For example, some clients in the U.S. must follow government data protection policy that requires original data to be kept within the country. Others may worry about the risk of their data being used for other purposes, so the system needs to be lightweight enough to be deployed easily at any location according to customers’ requirements.

The diagram in Figure 2 below illustrates the system architecture for ENZYME. Amazon Elastic Kubernetes Service (Amazon EKS) runs all relevant services, Redis models the task queues on a first-come, first-served basis, and PostgreSQL stores all task-relevant metadata.

A web user interface (UI) lets users interact with the system, with functionality for data uploads, task creation, status checks, model training, result visualization, and more.

Figure 2 – System architecture for the ENZYME system built by NextBillion.ai.

The following describes the data flow of the ENZYME system for evaluating map quality (map data and route engine):

  1. Users create a map evaluation task, indicating basic parameters: vehicle type, map, origin point, destination point, and evaluation metrics.
  2. For each incoming task, the ENZYME API (HTTP service) generates a task ID and pushes task information into PostgreSQL and a new task entry into Redis.
  3. ENZYME Task Runner polls for new tasks from Redis, and retrieves corresponding details from PostgreSQL.
  4. ENZYME Task Runner triggers Amazon SageMaker to run the map evaluation process.
  5. SageMaker is configured to perform the end-to-end map evaluation process, including ETL, feature extraction, feedback training, and evaluation.
  6. SageMaker uses an Amazon Simple Storage Service (Amazon S3) bucket to read the data for evaluation and to store the results.
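The task flow in steps 1 to 4 can be sketched in a few lines. To keep the example self-contained, an in-memory `deque` stands in for the Redis queue and a `dict` stands in for the PostgreSQL metadata table; the function names, the status values, and the `run_evaluation` callback (a placeholder for triggering the SageMaker job) are all assumptions, not ENZYME's actual interfaces.

```python
import uuid
from collections import deque

task_queue = deque()  # stand-in for the Redis queue (first come, first served)
task_table = {}       # stand-in for the PostgreSQL task metadata table


def create_task(params):
    """API side: generate a task ID, store the metadata, enqueue the ID."""
    task_id = str(uuid.uuid4())
    task_table[task_id] = {"params": params, "status": "queued"}
    task_queue.append(task_id)
    return task_id


def run_next_task(run_evaluation):
    """Task runner side: take the oldest task, look up its details, run it.

    run_evaluation is a hypothetical callable standing in for the step that
    triggers the evaluation job. Returns the task ID handled, or None if the
    queue is empty.
    """
    if not task_queue:
        return None
    task_id = task_queue.popleft()
    record = task_table[task_id]
    record["status"] = "running"
    try:
        record["result"] = run_evaluation(record["params"])
        record["status"] = "succeeded"
    except Exception as exc:  # record the failure instead of stopping the runner
        record["status"] = "failed"
        record["error"] = str(exc)
    return task_id
```

Only the task ID travels through the queue, while the parameters and status live in the metadata store. This mirrors the split between Redis and PostgreSQL described above and lets the web UI read task status without touching the queue.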

Amazon SageMaker was selected to power the ETA evaluation engine due to the following:

  • Provides script processing with customized containers and supports Amazon S3 as the data layer for the inputs and outputs of intermediate steps, which makes it possible to decouple the complicated map data process into separate script processing steps. A customized container is also used to handle the complex ETL and processing steps.
  • Creates serverless Spark clusters on demand. This helps maximize the utilization of computational power and simplifies feature extraction.
  • Provides a wide array of built-in machine learning models and frameworks for training and optimizing the AI model. SageMaker’s automatic model tuning capabilities are used to fine-tune hyperparameters and select the best model configuration for improved accuracy. Additionally, SageMaker enables distributed training on large datasets in parallel, which helps to shorten the training time.
  • Simplifies the deployment of AI models using SageMaker’s built-in hosting functionality. This enables predictions to be served in real time, which further improves the ETA performance of the live service.

The above framework components are all hosted on AWS, which makes deployment in any AWS Region convenient. This is especially meaningful for clients who have strict data protection requirements.

Summary

Utilizing Amazon SageMaker, NextBillion.ai transformed the process of map quality evaluation, operating with heightened efficiency and accuracy to deliver cutting-edge solutions to clients. The overall training process time is reduced by 30% on average.

Taking Singapore map processing as an example, the previous training process took around 2.5 hours on average (excluding extra manual action time). With ENZYME, the processing time of one complete training cycle is reduced to 1.5 hours. For large maps such as the USA or India, the percentage savings are smaller because pure map data ETL takes up a larger share of the whole processing time.

ENZYME is used to train custom maps, which results in a more precise estimated time of arrival (ETA) for scenarios like food delivery and logistics. Specifically, the mean absolute percentage error (MAPE) between the route engine’s estimated time of arrival and the actual time of arrival has decreased by 10-20% compared to regular maps for car/truck driving modes.
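For readers unfamiliar with the metric, MAPE averages the absolute ETA error of each trip relative to its actual time of arrival. The sketch below uses the standard definition; the function name and the sample trip durations are made up for illustration and are not NextBillion.ai data.

```python
def mape(eta_seconds, ata_seconds):
    """Mean absolute percentage error of ETA against ATA, in percent."""
    if len(eta_seconds) != len(ata_seconds) or not ata_seconds:
        raise ValueError("need two non-empty sequences of equal length")
    if any(ata <= 0 for ata in ata_seconds):
        raise ValueError("actual travel times must be positive")
    errors = [abs(eta - ata) / ata for eta, ata in zip(eta_seconds, ata_seconds)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical trips: one ETA is 10% too short, the other 30% too long.
example_mape = mape([540.0, 1300.0], [600.0, 1000.0])
```

For the two hypothetical trips the per-trip errors are 10% and 30%, so the MAPE is about 20%. Because each error is divided by the trip's own ATA, short trips weigh as much as long ones.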

For a regular car driving mode map, which is commonly used in ride hailing use cases, the MAPE can be improved from around 20% (the average performance of a typical map vendor’s directions API, such as HERE or TomTom) to 11% with industry data input for training. For other scenarios like food delivery, which use motorcycles or even bicycles as transport vehicles, traditional map vendors perform worse because they put less effort into optimizing the map for these modes, and the MAPE can be as large as 35%. The machine learning model trained using ENZYME can cut it down to 16%.

To learn more about AI-powered map products and solutions, visit nextbillion.ai/.


NextBillion.ai – AWS Partner Spotlight

NextBillion.ai is an AWS Partner that provides a bundle of location technologies such as navigation, routing, and snap-to-road. It also builds optimization solution suites for supply chain and logistics use cases in resource planning and operational scheduling.

Contact NextBillion.ai | Partner Overview | AWS Marketplace