AWS Machine Learning Blog

Pixm takes on phishing attacks with deep learning using Apache MXNet on AWS

Despite numerous cybersecurity efforts, phishing attacks are still on the rise. Phishing is a form of fraud where perpetrators pretend to be reputable companies and attempt to get individuals to reveal personal information, such as passwords and credit card numbers. It’s the most common social tactic.  93 percent of all breaches today start with phishing emails, according to a recent Verizon Data Breach Investigations Report.

Traditional solutions to stop phishing attacks use blacklisting, IP reputation, and spam filters deployed in the cloud to stop already known phishing sites. According to the report, only 17 percent of phishing campaigns are reported, which means attacks from unreported or new phishing sites, also known as zero-day phishing, aren’t stopped. And even when attacks are detected, it can take minutes to hours before the attack is verified and entered into the blacklist database.

Pixm, a New York-based startup, takes a new approach to the growing problem of phishing by using computer vision. Pixm’s deep-learning computer-vision-based endpoint security solution detects phishing attacks in real time at the point of click, within your browser, either on a desktop or a laptop.

“There’s a lot of focus on malware, but it all starts with a phishing email. Yet there isn’t that much focus on phishing security,” explains Arun K. Buduri, Co-Founder and Chief Product Officer at Pixm. “Blacklisting and IP reputation are favored solutions to block phishing attacks, but they are reactive and after the fact.”

Heatmap of all phishing attacks blocked by Pixm during the 2016 US elections, where nearly 70% of the attacks blocked were hosted on US small businesses and universities.

The Pixm solution is deployed on your endpoint device in the same way you might install antivirus software on your desktop. When a customer opens a phishing link in their browser, the Pixm software visually analyzes the page and performs computer vision object detection coupled with spatial analysis to determine if it’s a phishing attack, and shuts it down within a second. For example, perpetrators often target customers of large banks, creating phishing sites that look just like an authorized bank’s website.

Using deep learning computer vision models created with the Apache MXNet deep learning framework, Pixm continually analyzes website screenshots. For example, Pixm uses object detection to train their models on brand logos to detect if a logo on a bank’s login page is authentic.

Pixm evaluated many deep learning frameworks such as Caffe, Caffe2, TensorFlow and Keras, but ultimately chose MXNet because they found it has the best support for multiple operating systems and high-performance model inference using Amazon EC2 compute instances. MXNet also allowed rapid-scale training on large volumes of image data using graphics processing units (GPU) available on Amazon EC2 P3 instances.

“In a span of 4 months we processed 8 to 10 million web pages for our customers,” says Buduri. “For model training, MXNet has distributed multi-GPU training which allows us to bring up a cluster of GPU instances and train across them faster.” Processing millions of web pages at such scale in a short time was possible by using a massively-scalable architecture implemented using AWS services including Amazon API Gateway, AWS Lambda, and Amazon DynamoDB.

Pixm ships its software with MXNet bundled to run deep learning computer vision algorithms directly on the endpoint. By using deep-learning computer vision for real-time detection of phishing websites, Pixm helps protect clients from major cyberattacks and the costs associated with them, which can amount to millions of dollars and loss of customer trust.

The Dana Foundation, a philanthropic organization that supports brain research through grants, publications, and educational programs, uses the Pixm phishing solutions to help protect its users from phishing attacks. “As social engineering attacks become more sophisticated, we wanted a more sophisticated solution as well,” explains James Rutt, Chief Information Officer at Dana. “I like that Pixm is using computer vision to analyze legitimacy. It takes cybersecurity to the next level.”

To get started with deep learning, you can try MXNet as a fully-managed experience on the Amazon SageMaker machine learning platform, or use MXNet with the AWS Deep Learning AMIs (Amazon Machine Images) which provide preconfigured development environments for deep learning.


About the author

Cynthya Peranandam is a Principal Marketing Manager for AWS artificial intelligence solutions, helping customers use deep learning to provide business value. In her spare time she likes to run and listen to music.