2024

Faraday Lab Deploys LLMs Designed for French Accessibility in Days Using Amazon SageMaker

Because most generative artificial intelligence (AI) solutions are built on large language models (LLMs) trained on English datasets, startup Faraday Lab created its own generative AI models trained on French-language datasets to improve accessibility and diversity.

Overview

To scale its LLMs, Faraday Lab migrated to Amazon Web Services (AWS). By using AWS and working with Data Reply France, an AWS Partner, Faraday Lab reduced LLM training time, increased inference speed, and maintained data sovereignty for its clients.

Benefits

50%

faster LLM training time

30%

increased inference speed

About Faraday Lab

Faraday Lab is a French startup providing open-source large language models to 45,000 users across 30 countries. The company trains its artificial intelligence models in French and other European languages to make its solution more inclusive and to improve the quality of responses.

Opportunity | Training LLMs in Days on Amazon SageMaker for Faraday Lab

Faraday Lab, founded in 2023, aims to improve generative AI accessibility and diversity by training and fine-tuning LLMs on French-language datasets, giving everyone access to AI without geographic limitation. The startup’s original on-premises architecture wasn’t scalable, so it migrated to AWS and began using Amazon SageMaker, a service for building, training, and deploying machine learning models for any use case.

Faraday Lab worked with Data Reply France to learn how to use AWS infrastructure and tools. Data Reply France helped Faraday Lab with its fine-tuning process and with cleaning an open-source dataset, reducing it from 14 million samples to 100,000, including 10,000 high-quality language samples from the French Parliament. Using this dataset, Faraday Lab fine-tuned its model on Amazon SageMaker and deployed it in 15 days using Llama 2 Chat as a base model. The final tool, Ares chat, provides a no-code user interface for users to interact with the Aria LLM and to generate images using Ares creative, which is powered by Stable Diffusion XL 1.0.
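The case study does not describe how the dataset was cleaned from 14 million samples down to 100,000. As a rough illustration only, a cleaning pass of this kind might score each sample with simple quality heuristics and keep the top-ranked subset. The `quality_score` heuristics and thresholds below are assumptions for the sketch, not Faraday Lab's or Data Reply France's actual pipeline.

```python
# Hypothetical sketch of a dataset-cleaning pass: score each text sample
# with crude quality heuristics, then keep only the top-ranked entries.
# The heuristics and thresholds are illustrative assumptions.

def quality_score(text: str) -> float:
    """Reward longer, well-formed samples; zero out short fragments."""
    words = text.split()
    if len(words) < 5:                      # drop fragments outright
        return 0.0
    score = min(len(words) / 50, 1.0)       # favor substantial samples
    # Reward samples that look like complete sentences.
    if text[0].isupper() and text.rstrip().endswith((".", "!", "?")):
        score += 0.5
    return score

def clean_dataset(samples: list[str], keep: int) -> list[str]:
    """Rank samples by quality and keep the top `keep` entries."""
    ranked = sorted(samples, key=quality_score, reverse=True)
    return ranked[:keep]

raw = [
    "Oui.",                                               # fragment
    "La séance est ouverte et le débat peut commencer.",  # complete sentence
    "le budget sera examiné demain",                      # no capital, no period
]
print(clean_dataset(raw, keep=2))
```

At real scale this filtering would run over millions of records (for example with the Hugging Face `datasets` library's `filter` and `map` methods rather than in-memory lists), but the ranking idea is the same.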

Solution | Reducing LLM Training Time by 50 Percent and Maintaining GDPR Compliance Using AWS

By training its LLMs on Amazon SageMaker, Faraday Lab improved its models’ accuracy. Its Aria 70B version 3 model has a Massive Multitask Language Understanding (MMLU) score of 64.75. The company has reduced its LLM training time by 50 percent—the training itself takes only 2 days—and increased inference speed by 30 percent.

Faraday Lab uses models from Hugging Face, a community for open-source generative AI, including Llama 2 and Falcon. Faraday Lab’s own models are deployed on Hugging Face, and in the company’s first 6 months, 10,000 users downloaded them. The team also created a dedicated Chrome extension as an image generation tool for Ares creative. Most of Faraday Lab’s tools are accessible, without coding skills, straight from the Faraday website.

Faraday Lab also meets General Data Protection Regulation (GDPR) requirements. Its inference endpoints are hosted on AWS servers located in Europe so that both the user’s prompts and the model’s responses remain in the EU. And by training its models in French, Faraday Lab promotes data diversity and reduces data bias based on language and culture.

Using Amazon SageMaker, Faraday Lab keeps its costs low. “We can have an industrial-level solution with a small budget using AWS,” says William Elong, CEO of Faraday Lab. “The speed of training our models and the low spend are critical advantages. Using Amazon SageMaker to save time and costs is priceless.”

Outcome | Innovating Accessible Generative AI for the Future

Faraday Lab is expanding its solution to provide accessible generative AI to more users around the globe. It plans to implement AWS Inferentia—accelerators designed by AWS for deep learning inference applications—to reduce inference costs even further.

“I would invite all startups to use Amazon SageMaker to train their LLMs,” says Elong. “You have the best of both worlds with inference capacity and cost-effective training capacity.”
