Machine Learning on AWS Powers Onfido’s Success in Preventing Fraud

2020

Online identification is critical to preventing fraud and identity theft. Since 2012, Onfido has been a leader in the field of digital identity verification. The company realized early on that machine learning and facial biometric recognition were key in automating the digital identity verification process. For Onfido, providing these services to more than 1,500 companies—from car-sharing services to cryptocurrency exchanges—requires computing that is agile, scalable, robust, and secure. So the company turned to Amazon Web Services (AWS).

Based in the United Kingdom with nine offices around the world, Onfido conducts millions of ID verifications per month. Its machine learning architecture can verify documents against more than 4,500 types of identification, whether it’s a California driver’s license or a passport from India. Onfido found it could do this efficiently and cost-effectively using a combination of AWS services, especially the fully managed Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon Elastic Compute Cloud (Amazon EC2) P3 Instances. P3 Instances are powered by NVIDIA V100 Tensor Core GPUs, which have been proven to reduce machine learning training times from days to minutes. “If there’s one service that’s helped us to scale, it’s Amazon EC2,” says Onfido cofounder and chief architect Ruhul Amin.


“If there’s one service that helped us to scale, it’s Amazon EC2. It enabled us to train more models much faster than we had before.”

Ruhul Amin
Cofounder and Chief Architect, Onfido
 

Using Machine Learning to Halt Fraud

Machine learning is critical to Onfido’s business model. Every time users take a photo of their ID or snap a selfie to verify their identity, Onfido runs a complex series of automated tasks that its artificial intelligence must be trained daily to perform, including document recognition, optical character recognition, biometric verification, face matching, and even “liveness” detection in a selfie. To augment these automated checks, Onfido also has a “human loop,” bringing in a team of identity verifiers to securely validate user identity. This seamless combination sets it apart from its competitors. But it’s vital that automated verification—before being passed off to manual verification—executes with speed and accuracy. “We operate with a very low rate of fraud that passes through automatic verification,” says Martins Bruveris, a machine learning researcher at Onfido.

Onfido had been working on AWS from as early as 2014, taking advantage of Amazon Simple Storage Service (Amazon S3), an object-based storage service built to store and retrieve any amount of data from anywhere. Aware of the computing capacity it could access on AWS, the company migrated its on-premises workloads to Amazon EC2 instances. “Amazon EC2 instances enabled us to scale and to train more models much faster than we had before,” says Tom Forbes, a software engineer at Onfido.

Discovering the Advantages of a Managed Kubernetes Service

While hosting its workloads on Amazon EC2 P3 Instances, Onfido moved to Kubernetes, relying on an open-source tool called kops to manage its Kubernetes cluster. But Onfido ran into problems, including outages and network issues. Upgrades were overly complex and involved downtime that Onfido wanted to avoid. “There was always a fear of what could go wrong,” says Amin. “I think one of our biggest Kubernetes outages happened because there was a networking component with a hardcoded limit of 100 nodes, and we struggled to understand why our cluster was continually falling down.”

Onfido looked into options to manage its Kubernetes-based processing, eventually landing on Amazon EKS. “We use our custom-built service to submit training jobs on Amazon EKS. This service uses the Python client for the Kubernetes API and the Kubeflow Fairing library under the hood,” says Roopali Parab, DevOps engineer at Onfido. Harvey Johal, DevOps manager at Onfido, admits that the company was initially skeptical about Amazon EKS because of the number of settings. “But it’s a very simple service to use, which honestly makes it easier because it takes out a lot of the complexity that can come with running Kubernetes on your own.” By using Amazon EKS with TensorFlow Serving, Onfido could have high-density, high-performance models while easily scaling up and down with consumer demand. “We’ve seen at least a two if not three times increase from when we started using Amazon EKS last year to where we are now in terms of volume and throughput,” says Forbes.
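Onfido's job-submission service is not public, so the following is only a minimal sketch of what submitting a GPU training job through the Python client for the Kubernetes API might look like. The function names, the `model-training` label, the container name, and the image are hypothetical; the sketch assumes the cluster exposes GPUs through the standard `nvidia.com/gpu` resource and does not attempt to reproduce the Kubeflow Fairing layer mentioned above.

```python
from typing import Any, Dict


def build_training_job(name: str, image: str, gpus: int = 1) -> Dict[str, Any]:
    """Return a Kubernetes batch/v1 Job manifest for one GPU training run."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "labels": {"app": "model-training"}},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed training run automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "trainer",  # hypothetical container name
                            "image": image,
                            # Request GPUs via the NVIDIA device plugin resource.
                            "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
                        }
                    ],
                }
            },
        },
    }


def submit_training_job(manifest: Dict[str, Any], namespace: str = "default") -> None:
    """Submit the manifest using the official Kubernetes Python client."""
    # Imported lazily so the manifest builder works without the client installed.
    from kubernetes import client, config

    config.load_kube_config()  # reads the local kubeconfig, e.g. one for an EKS cluster
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=manifest)
```

Keeping the manifest builder separate from the submission call means the job definition can be inspected or unit tested without access to a cluster.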

Onfido also adopted Amazon S3 Intelligent-Tiering, a cloud object storage class that delivers automatic cost savings by moving data between two access tiers (frequent access and infrequent access) when access patterns change. It is ideal for data with unknown or changing access patterns. This enabled Onfido to pay less to store the data that it doesn’t use often, helping the company realize a 27 percent cost savings.
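The case study does not say how Onfido moved its objects into Intelligent-Tiering. One common approach, shown here purely as an assumption, is an S3 lifecycle rule that transitions objects under a prefix to the `INTELLIGENT_TIERING` storage class. The rule ID, the prefix, the 30-day default, and both function names are hypothetical.

```python
from typing import Any, Dict


def intelligent_tiering_lifecycle(prefix: str, after_days: int = 30) -> Dict[str, Any]:
    """Build a lifecycle configuration that moves objects to Intelligent-Tiering."""
    return {
        "Rules": [
            {
                "ID": "move-to-intelligent-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": after_days, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str, configuration: Dict[str, Any]) -> None:
    """Apply the configuration to a bucket with boto3."""
    # Imported lazily so the rule builder can be used without boto3 installed.
    import boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )
```

New objects can also be written directly into the storage class at upload time, which avoids the transition step altogether.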

With its expanded use of AWS services, Onfido has seen its verifications soar 1,200 percent in the past 2 years. “Without the capability of AWS, we just wouldn’t be able to handle this number of jobs,” Forbes notes.

Delivering Customer Service and Security

Now that Onfido’s use of Amazon EC2 has cut down on the complexity of its on-premises configurations, Onfido researchers are free to focus on designing machine learning tests that improve the accuracy of the company’s automated services. “We’re able to get to the core of our job, which is machine learning magic, rather than worrying about if and when and how jobs are going to run,” says Forbes. And when Onfido needs help, AWS is there to provide support, whether over chat or videoconference. “I feel that any piece of feedback that we communicate back is heard and understood,” says Lyubomira Dimitrova, technical program manager at Onfido.

For a company like Onfido, data security is paramount, and AWS offers the level of security it needs. AWS CloudTrail records account activity, which supports governance, compliance, operational auditing, and risk auditing and helps Onfido detect security breaches. AWS Key Management Service (AWS KMS) makes it easy for customers to create and manage cryptographic keys. “We encrypt now on AWS KMS using Amazon S3 batch operations,” says Forbes, “and the associated AWS Lambda enables us to do this really easily across hundreds of millions of bits of media.”
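Forbes does not describe the Lambda function itself, so the following is a hedged sketch of how such a function could look, not Onfido's implementation. It assumes the S3 Batch Operations invocation schema in which each task carries `taskId`, `s3Key` (URL-encoded), and `s3BucketArn`, and it re-encrypts an object by copying it onto itself with SSE-KMS. The function name and the `kms_key_id` parameter are hypothetical, and the S3 client is passed in so the logic can be exercised without AWS access.

```python
from typing import Any, Dict
from urllib.parse import unquote_plus


def handle_batch_event(event: Dict[str, Any], s3_client: Any, kms_key_id: str) -> Dict[str, Any]:
    """Re-encrypt one object with SSE-KMS and report the result to S3 Batch Operations."""
    task = event["tasks"][0]
    bucket = task["s3BucketArn"].split(":::")[-1]  # arn:aws:s3:::bucket -> bucket
    key = unquote_plus(task["s3Key"])  # keys in the event are assumed URL-encoded

    try:
        # Copying an object onto itself is allowed when the encryption settings change.
        # Note: a single copy_object call only handles objects up to 5 GB.
        s3_client.copy_object(
            Bucket=bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
            ServerSideEncryption="aws:kms",
            SSEKMSKeyId=kms_key_id,
        )
        code, message = "Succeeded", "re-encrypted with SSE-KMS"
    except Exception as exc:  # report any failure back to the batch job
        code, message = "PermanentFailure", str(exc)

    return {
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": [{"taskId": task["taskId"], "resultCode": code, "resultString": message}],
    }
```

In a deployed function, a thin `lambda_handler(event, context)` wrapper would create the client with `boto3.client("s3")` and pass it in along with the key ID.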

Providing Computing for Millions of ID Checks

The team at Onfido has found that the wide array of services from AWS work together seamlessly to improve its machine learning–powered identity verification technology. “It’s nice to see a cloud provider with so many different services so cohesively put together,” says Forbes. “I think it’s an impressive feat of engineering.” Proud of what it has already built, Onfido is excited to continue exploring how it can take advantage of AWS solutions. “For every technical problem we have, we always ask, ‘Does AWS have a service for this?’ And the answer is often yes,” says Amin. “As a company, we’re hooked on it.”

Onfido's ML Architecture


About Onfido

Founded in 2012, Onfido helps companies verify their users with identity as the key to access. Onfido uses machine learning to validate a person’s government-issued ID, compare it with their facial biometrics, and uncover fraud quickly.

Benefits of AWS

  • Increased identity verifications by 1,200%
  • Saved 27% with Amazon S3 Intelligent-Tiering
  • Increased throughput by 2–3x with Amazon EKS
  • Ran significant numbers of training jobs with high data storage levels on tight timelines

AWS Services Used

Amazon Elastic Compute Cloud

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.


Amazon EC2 P3 Instances

Amazon EC2 P3 instances deliver high-performance compute in the cloud with up to 8 NVIDIA® V100 Tensor Core GPUs and up to 100 Gbps of networking throughput for machine learning and HPC applications.


Amazon Simple Storage Service

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.


