Faculty Uses Amazon EFS to Scale Innovative Machine-Learning Platform


More than 2,000 data scientists across the globe are solving their business problems with the help of Faculty (formerly ASI Data Science), a London-based artificial intelligence (AI) solution provider. Relying on the Faculty platform (formerly SherlockML), data scientists can write code that enables them to quickly create containerized machine-learning models and access large data sets.

Since Faculty’s launch in 2014, it has run its platform in the Amazon Web Services (AWS) Cloud. Andrew Brookes, the company’s chief technology officer, says, “We created a tool for enterprises, and they trusted AWS because it is the biggest and most successful cloud provider.” As the company’s customer base grew, Faculty needed a more scalable shared-file storage system. Brookes says, “Some projects require up to 10 terabytes of storage, and we needed the data science infrastructure to be able to scale easily to meet that, without us needing to provision or resize the file system to accommodate variance in data size.”

As a small company, Faculty also wanted to scale its file system without having to spend time managing the underlying infrastructure. Scott Stevenson, data engineer at Faculty, says, “Our core mission is building compelling software for our customers. We only have 15 people working on the Faculty platform, and we need to spend our time creating new features for our customers. We were focusing too much on managing the network file storage solution.”

In addition, the company wanted to give its users an easier way to collaborate. “Data scientists had silos of data that they worked on,” Stevenson says. Because there was no shared workspace, teams struggled to collaborate, and data could be out of date if teams were not alerted in time.

“The sign of a great technology is that you forget it’s there. Amazon EFS just works. It requires zero maintenance on our end.”

- Scott Stevenson, Data Engineer, Faculty

About Faculty

Headquartered in the United Kingdom, Faculty is a provider of data science, machine-learning, and artificial intelligence solutions. The company’s data science platform gives data scientists the ability to use code to build machine-learning models and gain access to large data sets.

Benefits

  • Scales to support 10 TB of customer data
  • Deploys the Faculty platform days faster
  • Gives developers more time to build innovative features
  • Increases collaboration and development consistency

Centralizing File Storage Using Amazon EFS

Faculty began using Amazon Elastic File System (Amazon EFS), a cloud-native shared-file system and scalable file storage, to overcome its challenges. “Amazon EFS was the best solution for our shared storage needs. Using this service, we no longer have to provision storage or worry about managing the network file system ourselves,” says Stevenson.

All Faculty data is stored on a single Amazon EFS file system, with one subdirectory per data science project. Amazon EFS gives Faculty users a shared workspace for code and data, which can be mounted on each customer’s data science environment. Any changes to a machine-learning model are reflected immediately on collaborators’ and data scientists’ machines.
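The one-subdirectory-per-project layout described above can be illustrated with a minimal sketch. It is not Faculty's actual code: the mount point `/mnt/efs`, the function name `project_workspace`, and the validation rules are all assumptions made for illustration, and the sketch only presumes the shared file system is already mounted at some local path.

```python
from pathlib import Path

# Hypothetical mount point; the article does not say where the file system is mounted.
EFS_MOUNT = Path("/mnt/efs")


def project_workspace(project_id: str, mount_root: Path = EFS_MOUNT) -> Path:
    """Return the shared directory for one project, creating it if needed."""
    # Reject empty names, dot entries, and path separators so a project id
    # cannot resolve to a directory outside its own subdirectory.
    if not project_id or project_id in {".", ".."} or "/" in project_id or "\\" in project_id:
        raise ValueError(f"invalid project id: {project_id!r}")
    workspace = mount_root / project_id
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace
```

Because every environment mounts the same file system, two collaborators calling `project_workspace` with the same project id would resolve to the same directory and see each other's files.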

Faculty also uses AWS CloudFormation templates to provision its infrastructure as code. “Using AWS CloudFormation, we have completely automated the deployment of Faculty,” Stevenson says.

Building New Features for Data Scientists Instead of Managing Storage

Using Amazon EFS, Faculty no longer needs to manage file storage for data science projects. With all data stored in a central file system, the company has a reliable, secure, and highly available solution for storing data. “The sign of a great technology is that you forget it’s there,” says Stevenson. “Amazon EFS just works. It requires zero maintenance on our end. Instead of trying to build and manage our own storage system, which would be technically challenging, we can rely on Amazon EFS to manage it for us.”

By migrating the Faculty file-storage system to Amazon EFS, engineers can spend more time innovating. For example, Faculty built a new feature that enables data scientists to spin servers up and down on demand and to train machine-learning models in parallel. “This is something data scientists have struggled to do on their own, which we were able to accomplish due to the time saved by moving to Amazon EFS,” says Stevenson.


Supporting Customers’ Current and Future Needs Better

Faculty is taking advantage of the elastic scalability of Amazon EFS to better support its customers’ growing data storage needs. Recently, Faculty worked with the U.K. government’s Home Office on a machine-learning project to automatically identify terrorist propaganda online. During this project, the Home Office stored 10 terabytes of video data, representing thousands of videos, on the Faculty platform. The data scientists used Faculty to create and train a machine-learning model that detected 94 percent of propaganda and automatically rejected extremist content. “The data scientists working on the project didn’t have to worry about how to store 10 terabytes of data; Amazon EFS easily accommodated that scale,” says Brookes. Additionally, Faculty did not have to provision or resize the file system to handle the increased volume of data during batch workloads.

By automating the provisioning of Amazon EFS file systems through AWS CloudFormation, Faculty can more quickly deploy the Faculty platform for new customers. “We deploy Faculty to new customers weekly, and because we’ve built the provisioning into code through AWS CloudFormation, everything is automated,” says Stevenson. “We are saving deployment engineers days of effort every time we deploy the solution.”
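As a rough illustration of what provisioning Amazon EFS through AWS CloudFormation can look like, the sketch below builds a small template as a Python dictionary. It is not Faculty's template: the function name `efs_template`, the logical resource names, and the example subnet and security group ids are hypothetical. The resource types `AWS::EFS::FileSystem` and `AWS::EFS::MountTarget` are standard CloudFormation types, and the template declares no storage size because EFS capacity grows and shrinks with the data.

```python
import json


def efs_template(subnet_ids, security_group_id):
    """Build a template with one EFS file system and one mount target per subnet."""
    resources = {
        "SharedFileSystem": {
            "Type": "AWS::EFS::FileSystem",
            # No capacity is declared; only encryption at rest is requested here.
            "Properties": {"Encrypted": True},
        }
    }
    for index, subnet_id in enumerate(subnet_ids):
        # Each mount target exposes the file system inside one subnet.
        resources[f"MountTarget{index}"] = {
            "Type": "AWS::EFS::MountTarget",
            "Properties": {
                "FileSystemId": {"Ref": "SharedFileSystem"},
                "SubnetId": subnet_id,
                "SecurityGroups": [security_group_id],
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


if __name__ == "__main__":
    # Placeholder ids for illustration only.
    print(json.dumps(efs_template(["subnet-aaaa1111"], "sg-bbbb2222"), indent=2))
```

Generating the template from code keeps each new customer deployment repeatable: the same function can be called with that customer's subnets and security group, and the resulting JSON handed to CloudFormation.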

Faculty has also improved collaboration between data scientists. “There is more consistency in the data science development process because everyone is working with the most updated code base thanks to Amazon EFS,” says Stevenson. “This ultimately helps us deliver a better product to our customers.”


Learn More

Learn more about Amazon Elastic File System (Amazon EFS).