AWS Startups Blog
Amenity Analytics Uses a Serverless-First Architecture and NLP to Break Down Text-Based Financial Data
Data has always been the lifeblood of financial analysis, dating back to the days of chalkboards and ticker tape. In fact, finance has arguably pushed technical innovation more than most fields, from punch tape to telex machines to real-time electronic quotes on stock tickers both big (Times Square) and small (scrolling text on TV).
The technological innovations we’ve witnessed in the 21st century include split-second automated trades, quant modeling and a turn toward big data. These days, investors, financial analysts and insurers can parse and review heaps of structured data, such as financial metrics and stock prices, with ease.
At the same time, says Amenity Analytics Co-founder and CEO Nathaniel Storch, it can take hours, if not days, to compile equally useful information that’s buried in textual data. Storch, a former financial analyst, says he “felt this pain personally while analyzing public companies.” It took him untold hours to get the information he needed from written data such as regulatory filings, news articles, research reports and earnings call transcripts. And that was for a single company. “Trying to get this information at scale was impossible. So we built Amenity Analytics to help our clients address this critical problem and treat information in text the same way they treat structured data.”
The startup, which has offices in Israel and New York, is at heart a natural language processing (NLP) company. Its algorithms sift through tremendous amounts of data, processing around a million pieces of text information per day. The software gleans insights that are then shared with its clients, which include the likes of Nasdaq and Moody’s. “Some of the most important information our clients need to inform their business decisions exists in text formats, and it goes largely unexploited as a source of insight due to the difficulties of analyzing text meaningfully,” says Vice President of Engineering Roy Penn.
For its customers—which include some of the world’s largest insurance companies, banks, investment firms and more—the company’s software generates top-line trends and scores around the ideas it unearths, then pinpoints the specific articles and sentences referenced.
For insurance companies, for example, Penn says, “We analyze and refine millions of news and other documents per day and put that into a clear set of risk metrics alerting underwriters to potential problems, with full transparency into the source content.”
Amenity’s customers expect that the company will unearth actionable data points, says Penn, “even if they’re hidden behind layers of wordsmithing, so we employ state-of-the-art linguistic pattern matching techniques.” According to Penn, the key to Amenity’s success in the NLP field has been creating its own framework. Most companies that use NLP (a branch of machine learning focused on understanding language as people actually produce it, rather than the well-defined outputs of computers) run commonly known algorithms. However, by devising and creating “whatever algorithms we want,” Penn says, “we are able to operate them in ways that have an edge on other companies.”
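To make the idea of linguistic pattern matching concrete, here is a minimal sketch in Python. It is not Amenity's framework, which the article does not describe in detail. The rule names, weights and regular expressions below are hypothetical, chosen only to show how rule-based matching can produce both a top-line score and the specific sentences behind it.

```python
import re
from dataclasses import dataclass

# Hypothetical rules (label, weight, pattern). These are illustrative
# assumptions, not Amenity's actual rules or scoring scheme.
RULES = [
    ("guidance_cut", -2, re.compile(
        r"\b(lower(ed|s|ing)?|cut(s|ting)?|reduc(e|ed|es|ing))\b.*\b(guidance|outlook|forecast)\b",
        re.IGNORECASE)),
    ("guidance_raise", 2, re.compile(
        r"\b(rais(e|ed|es|ing)|increas(e|ed|es|ing))\b.*\b(guidance|outlook|forecast)\b",
        re.IGNORECASE)),
    ("hedged_weakness", -1, re.compile(
        r"\b(headwinds?|softness|challenging)\b", re.IGNORECASE)),
]


@dataclass
class Hit:
    label: str     # which rule fired
    weight: int    # contribution to the document score
    sentence: str  # the source sentence, kept for transparency


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def analyze(text):
    """Return (total score, list of hits) for one document."""
    hits = [
        Hit(label, weight, sentence)
        for sentence in split_sentences(text)
        for label, weight, pattern in RULES
        if pattern.search(sentence)
    ]
    return sum(h.weight for h in hits), hits


if __name__ == "__main__":
    sample = ("We are lowering our full-year guidance. "
              "Demand was strong in Europe. "
              "We see some headwinds in Asia.")
    score, hits = analyze(sample)
    print(score)
    for h in hits:
        print(h.label, "->", h.sentence)
```

Keeping the matched sentence next to each score is what lets an analyst trace a metric back to its source text, which is the kind of transparency Penn describes.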
Predictably, this system of complex NLP classifications means heavy CPU workloads, which are handled by a serverless-first AWS architecture. “Our entire stack is based on AWS tools. We’ve written big parts of it in C and managed to squeeze it into several Lambda functions,” says Penn. “This gives our data scientists the ability to run many experiments quickly and cheaply. Overall, moving to serverless NLP has reduced our costs of analysis by 90 percent and time of analysis by 95 percent,” he says. Additionally, “maintenance and code complexity are reduced when we use serverless patterns, and that translates to faster development cycles.”
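For readers unfamiliar with the serverless pattern, the sketch below shows the general shape of an AWS Lambda function that analyzes a batch of documents. It is written in Python for readability, whereas Penn says much of Amenity's stack is written in C. The event layout (a "documents" list with "id" and "text" fields), the risk terms and the `score_text` helper are all assumptions made for illustration; only the `handler(event, context)` entry-point signature is the standard Lambda convention for Python.

```python
import json
import re

# Hypothetical risk vocabulary standing in for a real NLP model.
RISK_TERMS = ("lawsuit", "recall", "default")


def score_text(text):
    """Placeholder for the CPU-heavy NLP step: counts risk-term mentions."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(words.count(term) for term in RISK_TERMS)


def handler(event, context):
    """Lambda entry point. Each invocation scores one batch of documents.

    The event shape used here is an assumption for this sketch.
    """
    documents = event.get("documents", [])
    results = [
        {"id": doc["id"], "risk_score": score_text(doc["text"])}
        for doc in documents
    ]
    return {"statusCode": 200, "body": json.dumps({"results": results})}


if __name__ == "__main__":
    # Local invocation with a fake event; no AWS account is needed.
    fake_event = {"documents": [
        {"id": "a", "text": "Regulators announced a recall. A lawsuit followed the recall."},
    ]}
    print(handler(fake_event, None))
```

Because each invocation is stateless and handles its own batch, many copies can run in parallel on busy days and none run on quiet ones, which is the property behind the cost and speed gains Penn cites.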
Penn is a particular fan of the extract, transform, load (ETL) process through AWS Glue, saying that it perfectly fits Amenity’s needs for “a system that is fast enough, economical enough and scalable enough so that you can handle both very slow news days and crazy requests, like a customer needs 10 million pieces of information analyzed in a day.” He also cites time and cost savings: “With the ‘new idea’ process of ETL, we save about 50 percent of the cost, and with NLP, we managed to reduce the cost by about 10x and reduce the time of analysis by about 20x to 100x. And that is huge, because by doing that you enable yourself to complete a creative cycle faster, think faster, and implement and test faster.”
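Penn quotes his savings both as percentages and as multiples, and the two line up. Reading "reduce by Nx" as dividing the original figure by N (an interpretation, since the article does not define it), a short calculation converts one form to the other. The function name below is made up for this sketch.

```python
def fold_to_percent_reduction(fold):
    """Convert an N-fold reduction into a percentage reduction.

    Assumes 'reduced by Nx' means the new value is the old value divided by N.
    """
    if fold < 1:
        raise ValueError("fold must be at least 1")
    return 100 * (fold - 1) / fold


if __name__ == "__main__":
    for fold in (10, 20, 100):
        print(f"{fold}x reduction = {fold_to_percent_reduction(fold):.0f} percent lower")
```

Under that reading, a 10x cost reduction is a 90 percent saving, and a 20x to 100x reduction in analysis time is 95 to 99 percent, consistent with the 90 percent and 95 percent figures Penn gives for serverless NLP.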
In addition to the publicly available information Amenity scans and supplies to banks, investors and finance companies, it can also give companies internal intel by examining their secure, private documents. “It could be secure because of personal identifiable information, or maybe it has some of your secret sauce in it,” says Penn. “Hedge funds might have secret trade information and want to analyze their own documents but not let anyone else know about them. Insurance companies have a lot of emails going back and forth with their customers, and they might want to know the trends and topics that are being surfaced in their emails.” What constitutes valuable information differs from one company to another, Penn explains, but by configuring and tweaking data using the algorithms the Amenity team has created, “we can find whatever is important for everyone in their own universe.”
In the future, Penn says, Amenity plans to expand its offering to other industries, such as health, legal and education. The company also wants to expand its purview to London. In the meantime, Amenity will keep on gathering and analyzing text-based information and delivering it to its customers. “The more complex the better,” says Penn.