AWS Startups Blog

Using Machine Intelligence to Predict Your Career Success

The untapt Team at the AWS Pop-up Loft New York

Guest post by Ed Donner, CEO, untapt

There was great excitement in the artificial intelligence (AI) community last month with the launch of OpenAI, the non-profit that’s on a mission to use AI to benefit humanity. The group, chaired by Elon Musk and Sam Altman, is supported by a who’s who of tech luminaries including Reid Hoffman and Peter Thiel.

It was a fitting conclusion to a year filled with ground-breaking advances in AI and machine learning. Personal assistant launched to great reception. Amazon unveiled the Echo, which has a wide range of abilities, from ordering food to playing music. Facebook created a stir by unveiling Facebook M, a mysterious intelligence that uses an unknown algorithm, and now Mark Zuckerberg has an intriguing New Year’s resolution to build an intelligence system of his own.

Here at untapt, we’re using machine learning to reinvent the hiring process for software engineers. We’ve built a digital hiring platform that matches developers with hiring managers and puts them directly in touch with each other. Initially focused on FinTech, untapt will soon expand to cover technology hiring across broad industry verticals. I want to share a bit about our history, how we came to be, what we’re planning next, and how AWS helped make it happen.

The Human Element

What we learned: There have been many attempts to ‘solve’ recruitment over the last two decades, but it’s proven challenging to automate a process that is so nuanced and dependent on human factors.

First, I should start with the obvious question: why do software engineers need another job site? From a hiring perspective, the developer community is already a well-served demographic. Between incumbents like LinkedIn and, and newer, more progressive companies like Hired and AngelList, engineers already have plenty of places to turn when it comes time to look for work.

And anyway, what’s so difficult about matching people with jobs? Isn’t this a solved problem? It’s easy enough to use any of those platforms to build a list of Java developers living in New York City, for example. Is that not enough?

Well, as both software engineers and hiring managers will tell you, there’s a lot more to successful recruitment than simple keyword matching. One of our engineers noted that there are dozens of complex factors that both parties must take into account as they consider a possible match: salary, benefits, career history, engineering specialty, company size, seniority, location, visa status, relocation preferences, and the controversial “cultural fit.” The list goes on. Ignore any of those factors at the match phase, and there’s a good chance you’re wasting someone’s time.

This is where traditional recruiters excel. Humans are exceptional at handling these nuances with grace. There’s a reason that recruiters still exist 17 years after the launch of — humans are simply better equipped to deal with these kinds of complexities than heuristic-based digital matching systems.

With that said, some complex matches are beyond the reach of even the most thorough human recruiters. Consider a FinTech startup in Palo Alto that’s hiring locally, but willing to offer relocation if just the right candidate comes along. And here comes that very exceptional candidate: Jen, a Python developer in the Chicago area, looking for a quantitative role in a startup environment. Human recruiters, limited by the bandwidth of their own working hours and contact network, would almost certainly miss this unlikely match, however suitable it may be.

We’re leveraging recent advances in statistical learning to bridge the gap between human recruiters and competing digital recruitment platforms. It’s working: candidates suggested by the platform are already three times more likely to be invited to interview by hiring managers than candidates from the human selection process. We’re helping engineers and hiring managers save time, so they can get back to what they’re best at: writing great software.

Seeding the Marketplace

There’s a difficult chicken-and-egg problem in starting a business that relies on machine learning — you need data to train the model, but you need a model in place to start collecting data!

Before we could begin creating our machine learning engine, we needed to gather data on job seekers as well as hiring managers and firms.

Then we needed data about how hiring decisions are made at scale. For example, how many hiring managers are willing to relocate candidates? What are the most common skills among front-end developers? How do visas affect hiring manager decisions?

Here’s how we approached it. In February 2015, we began tackling the chicken-and-egg problem by releasing an early version of untapt to the world, with a basic algorithm. There, engineers could browse roles they were matched to, read up on hiring companies, and watch videos of hiring managers describing their roles. If candidates were interested, they could connect directly with the manager.

It took off much faster than we were expecting! By September, we had 15,000 engineers signed up, 70 companies hiring, and over 7,000 job applications. Those figures provided sufficient data to train a more sophisticated model.

Machine Learning to Predict the Interview Decision

Finding and improving the model takes research, experimentation, and perseverance.

The goal of our algorithm is to predict whether a developer will be interested in a particular role and will make it to the interview process. The algorithm uses a classification model with inputs related to both the developer and the job. To obtain increasingly valuable inputs, we experimented with hundreds of features, including interaction terms that incorporate attributes of both the developer and the job.
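To make the idea of interaction terms concrete, here is a minimal sketch of a classifier whose inputs combine developer and job attributes. It is an illustration under stated assumptions, not untapt’s actual model: the feature names, the toy records, the labels, and the choice of logistic regression are all hypothetical.

```python
import math

def featurize(dev, job):
    """Build model inputs from a (developer, job) pair. All fields are hypothetical."""
    overlap = len(set(dev["skills"]) & set(job["skills"])) / len(job["skills"])
    same_city = 1.0 if dev["city"] == job["city"] else 0.0
    # Interaction term: it is 1 only when BOTH sides agree on relocation,
    # so it depends on a developer attribute and a job attribute together.
    relocation_match = 1.0 if (dev["will_relocate"] and job["offers_relocation"]) else 0.0
    return [1.0, overlap, same_city, relocation_match]  # leading 1.0 is the bias input

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Estimated probability that the pair leads to an interview."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)))

def train(rows, labels, epochs=2000, lr=0.5):
    """Logistic regression fitted with plain stochastic gradient ascent."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = y - predict(weights, x)
            for i, v in enumerate(x):
                weights[i] += lr * error * v
    return weights

# Made-up training examples (label 1 = invited to interview).
remote_job = {"skills": ["python", "sql"], "city": "Palo Alto", "offers_relocation": True}
local_job = {"skills": ["python", "sql"], "city": "Palo Alto", "offers_relocation": False}
jen = {"skills": ["python", "sql"], "city": "Chicago", "will_relocate": True}
jen_staying = {"skills": ["python", "sql"], "city": "Chicago", "will_relocate": False}
java_dev = {"skills": ["java"], "city": "New York", "will_relocate": False}
local_dev = {"skills": ["python"], "city": "Palo Alto", "will_relocate": False}

pairs = [(jen, remote_job, 1), (java_dev, remote_job, 0),
         (local_dev, local_job, 1), (jen_staying, remote_job, 0)]
rows = [featurize(dev, job) for dev, job, _ in pairs]
labels = [label for _, _, label in pairs]
weights = train(rows, labels)

print(predict(weights, featurize(jen, remote_job)))
print(predict(weights, featurize(jen_staying, remote_job)))
```

In this toy data the two Chicago candidates differ only in the relocation interaction term, so that single input is what separates a likely interview from an unlikely one.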

Some of the features were generated via a statistical technique called topic modeling. This approach looks for patterns of words to provide a more human-like, nuanced understanding of resumes.
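As a rough illustration of how topic modeling can turn resume text into numeric features, the sketch below runs a tiny latent Dirichlet allocation (LDA) model with collapsed Gibbs sampling. The post does not say which topic model or library untapt used, so the choice of LDA, the function name `lda_gibbs`, the hyperparameters, and the toy resumes are all assumptions.

```python
import random

def lda_gibbs(docs, n_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Return one topic-proportion vector per document (usable as features)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]           # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for word in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[word]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                wid = word_id[word]
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Smoothed topic proportions: each row sums to 1.
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]

# Made-up, pre-tokenized resume snippets.
resumes = [
    "python pandas statistics modeling python regression".split(),
    "statistics regression modeling pandas".split(),
    "javascript angular css html javascript".split(),
    "html css angular frontend".split(),
]
features = lda_gibbs(resumes)
for row in features:
    print([round(p, 2) for p in row])
```

Each row can then be appended to a candidate’s other inputs, so two resumes that use different words for similar work may still end up with similar feature values.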

We’ve made several surprising discoveries. For example, it turns out that distance is a less important factor than we expected: people are more willing to relocate than we realized, and hiring managers are generally happy to speak to any strong candidate, regardless of proximity. Also, different hiring managers often look for completely different things. Some are strongly influenced by which school you went to, while others think it’s irrelevant.

The algorithm learns the preferences of each separate hiring manager from the decisions they make. We’re finding that we know more about the hiring practices of a manager than they do.

The results so far look very promising indeed. Managers are asking to interview 40% of their candidates, which is ahead of the industry by a wide margin (it can be as low as 1% for online job boards). And this is only the beginning. We’re experimenting with Factorization Machines, an advanced approach that considers a wider set of inputs to improve prediction accuracy, and allows us to combine our classifier model with collaborative filtering techniques.
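For readers unfamiliar with Factorization Machines, the sketch below shows the standard second-order scoring function, in which every feature gets a small latent vector and each pair of features interacts through the dot product of their vectors. One-hot developer and manager identifiers can sit in the same input vector as ordinary attributes, which is how the approach connects a classifier with collaborative filtering. The parameter values and the feature layout are made up for illustration, and nothing here reflects untapt’s actual implementation.

```python
def fm_score(x, w0, w, V):
    """Second-order factorization machine score for one feature vector x.

    w0 is the bias, w the per-feature weights, and V[i] the latent
    vector of feature i. Uses the linear-time form of the pairwise term.
    """
    n, k = len(x), len(V[0])
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for f in range(k):
        total = sum(V[i][f] * x[i] for i in range(n))
        total_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (total * total - total_sq)
    return score

def fm_score_naive(x, w0, w, V):
    """Same model written as an explicit sum over feature pairs (for checking)."""
    n = len(x)
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score

# Hypothetical layout: [developer A, developer B, manager X, manager Y, skill overlap]
x = [1.0, 0.0, 0.0, 1.0, 0.5]
w0 = -0.5
w = [0.1, -0.2, 0.05, 0.3, 0.4]
V = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.1], [0.2, 0.2]]

fast = fm_score(x, w0, w, V)
slow = fm_score_naive(x, w0, w, V)
print(fast, slow)
```

The two functions compute the same quantity; the first avoids the loop over all feature pairs, which matters when the input vector is long and sparse. For classification, the score would typically be passed through a sigmoid.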

Ready for Massive Scale on AWS

If you’re building a new marketplace, you need infrastructure with a low barrier to entry and the ability to scale rapidly. Make sure you aren’t locked into a particular set of technical choices that may inhibit you or fail to support your long-term goals.

When we started out, we were on a shoestring budget and following the Lean Startup model, and we wanted our product in the market as quickly as possible.

We needed to roll out something fast, at nominal cost, that could ramp up rapidly. Initially, we selected a cloud service that offered a simple integrated solution. But we quickly learned that we would be locked into a proprietary tech stack, which limited our future choices.

One month in, we made the switch to AWS, running our infrastructure across multiple Amazon EC2 instances and making extensive use of Amazon S3, the Amazon CloudFront CDN, and the Amazon Route 53 DNS. Our tech stack included Python/Flask on Heroku, MongoDB and PostgreSQL databases, AngularJS client, and R for data analysis and modeling. We’ve been experimenting with the Amazon Machine Learning API because of its rapid deployment of advanced models.

Amazingly, our infrastructure cost for the first few months was under $100, despite multiple development and test environments running on EC2. As the results flowed in and our business model was validated, we dialed up our marketing spend and saw traction across the platform. AWS kept pace. We spun up Heroku dynos and added EC2 instances to meet the demand.

Looking ahead to 2016, we expect to achieve 10X volumes in the next year. Here’s the thing: as we scale, we get more and more data to train the algorithms, and we become increasingly accurate at predicting the best fit between engineer and role. That means we get better at helping more talented developers discover ever-more-appropriate dream jobs. We demonstrated this at the AWS loft event in New York and were voted the startup most likely to grow exponentially, which is very much the plan!