
Osmo digitizes smell and cuts AI costs by 200x with Meta Llama on AWS


As part of its journey towards digitizing scent, Osmo builds AI-powered tools that generate fragrance formulas from creative briefs. Model development presents a uniquely complex challenge because results can only be evaluated by physically producing the scent. The team initially relied on proprietary frontier model APIs, but this became unsustainable due to limited customization, lack of model ownership, and high costs. By moving to Meta Llama on Amazon Web Services (AWS), Osmo gained full control of its models and infrastructure, cutting training costs by 20 times and inference costs by 200 times. In evaluations by trained perfumers, its open-model system matched the performance of leading frontier models.

Digitizing the final sense of human perception

Computers have learned to see. They've learned to hear. They can detect light, motion, and sound at scales and speeds that far exceed human ability. But for most of computing history, one sense has remained out of reach: smell. Osmo is on a mission to change that. The company combines machine learning, perfumery, chemistry, and neuroscience to build what it calls "Olfactory Intelligence," giving AI systems the ability to sense and interpret the chemical world.

Today, Osmo primarily serves the fragrance industry, while also actively expanding how scent can be used as a medium for brand storytelling. Through its Studio platform, users start from a sentence, a mood board, or an image, then set guardrails such as notes to avoid, budget, and clean standards. Studio translates those inputs into manufacturing-ready formulas that are delivered to customers’ doorsteps as physical samples. By lowering the barriers to scent creation, Osmo is transforming fragrance from an exclusive luxury into an accessible, powerful tool for modern brand expression.

Building this technology is a uniquely difficult machine learning problem. Smell is extraordinarily high-dimensional. The human nose has roughly 400 distinct receptor types, compared to just three for vision, and odor perception depends on complex molecular combinations. Human descriptions of scents are notoriously inconsistent, making labeled data hard to collect. "Unlike most AI systems, we can't evaluate results instantly. We have to physically produce the fragrance and have experts smell it," says Richard Whitcomb, CTO of Osmo. Each evaluation cycle takes approximately two weeks. Trained evaluators assess samples for how well they match the input brief along with qualitative attributes like balance, wearability, and overall quality as a fine fragrance. This creates a tightly coupled feedback loop between model output and real-world evaluation. Because each loop requires physical production, iteration is slow and expensive, making the cost and flexibility of the underlying AI infrastructure critically important.

Osmo initially approached the problem using large frontier model APIs, but the constraints of proprietary platforms became hard to ignore as its ambitions grew. Training methods were limited to what the platform permitted, and in a domain this specialized, the ability to deeply customize model behavior wasn't optional. Neither was ownership: in a field where fine-tuned weights and proprietary training data are the competitive advantage, building on models Osmo didn't control was an increasingly uncomfortable place to be.

Moving from closed models to full-stack AI ownership

Meta Llama models on AWS gave Osmo the control and flexibility it needed to build its system the right way. “We think about fragrance creation like writing—building formulas ingredient by ingredient, similar to how language models generate text,” says Wesley Qian, VP of engineering and research at Osmo. “You lay down the major components first, then fill in the gaps, and the model learns those patterns across data.” With Llama’s open weights, Osmo was able to better support that structured, sequential process by shaping model behavior, customizing training approaches, and retaining control over its infrastructure and models. That included implementing techniques such as Direct Preference Optimization (DPO), building agentic loops for iterative fragrance design, and exploring reinforcement learning approaches like Group Relative Policy Optimization (GRPO).
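To make the DPO technique named above concrete, the following is a minimal sketch of the per-pair DPO loss, assuming summed sequence log-probabilities as inputs; the numbers and variable names are illustrative, not Osmo's actual training code.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a full formula
    (token sequence) under the trainable policy or the frozen
    reference model; beta scales the implicit reward.
    """
    # Margin: how much more the policy prefers the chosen formula
    # over the reference model, minus the same quantity for the
    # rejected formula.
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)): small when the policy ranks the
    # preferred formula well above the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Policy strongly prefers the chosen formula relative to the reference:
low = dpo_loss(-10.0, -30.0, -20.0, -20.0)   # ≈ 0.127
# Policy prefers the rejected formula instead:
high = dpo_loss(-30.0, -10.0, -20.0, -20.0)  # ≈ 2.127
```

The loss pushes the policy to widen the gap between preferred and rejected outputs without needing an explicit reward model, which fits a domain where preferences come from expert perfumers rather than automated scoring.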

To support this approach at scale, Osmo built a flexible training and deployment pipeline on AWS. Amazon SageMaker powers Osmo's training and fine-tuning workflows on proprietary datasets, while model weights are stored in Amazon Simple Storage Service (Amazon S3) and imported into Amazon Bedrock for deployment. Amazon Bedrock provides auto-scaling for inference endpoints, leading to cost-efficient serving across both experimentation and production workloads. "Moving to Llama on AWS changed how we think about building AI systems," says Whitcomb. "Instead of adapting our product to fit a model, we can now adapt the model to fit our product."
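The S3-to-Bedrock step described above can be sketched with the Bedrock Custom Model Import API. This is a hedged illustration, not Osmo's pipeline: the bucket, role ARN, and job names are placeholders, and the actual boto3 call is shown commented out since it requires AWS credentials.

```python
def build_import_job_params(job_name, model_name, role_arn, weights_s3_uri):
    """Assemble parameters for an Amazon Bedrock model import job.

    Mirrors the shape expected by the boto3 `bedrock` client's
    create_model_import_job call; all identifiers are placeholders.
    """
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,
        "modelDataSource": {"s3DataSource": {"s3Uri": weights_s3_uri}},
    }

params = build_import_job_params(
    "llama-ft-import-001",                                # hypothetical job name
    "fragrance-llama-70b-ft",                             # hypothetical model name
    "arn:aws:iam::123456789012:role/BedrockImportRole",   # placeholder role ARN
    "s3://example-bucket/checkpoints/llama-70b-ft/",      # placeholder S3 path
)
# With credentials configured, the job would be started with:
# import boto3
# boto3.client("bedrock").create_model_import_job(**params)
```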

Throughout the process, AWS teams worked closely with Osmo, meeting weekly to evaluate data preparation strategies and model configurations. They provided reference implementations, debugging support, and training and inference scripts so Osmo could move quickly from experimentation to a stable production pipeline. After evaluating Meta Llama variants, AWS and Osmo identified the Meta Llama 3.3 70B Instruct model as the strongest performer, with the Llama 3.1 8B model also producing viable outputs. The collaboration extended into how Osmo refines its models over time: high-performing formulas are fed back into training through a filtered augmentation loop, adding successful outputs back into the dataset so that only validated results inform future model generations. "What AWS enabled for us is a true end-to-end workflow from training to deployment that we fully understand and control," says Whitcomb.
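The filtered augmentation loop can be sketched as a simple threshold filter over evaluator-scored outputs; the scoring scale and function names here are hypothetical, since the article does not specify how Osmo scores formulas.

```python
def filtered_augmentation(dataset, candidates, min_score=8.0):
    """Add only evaluator-validated formulas back into the training set.

    `candidates` are (formula, score) pairs, where score is a
    hypothetical 0-10 rating from trained evaluators; only formulas
    at or above `min_score` are kept for the next fine-tuning run.
    """
    validated = [formula for formula, score in candidates if score >= min_score]
    return dataset + validated

dataset = ["formula_a", "formula_b"]
candidates = [("formula_c", 9.1), ("formula_d", 5.4), ("formula_e", 8.0)]
augmented = filtered_augmentation(dataset, candidates)
# → ["formula_a", "formula_b", "formula_c", "formula_e"]
```

The key design choice is that only physically evaluated, high-scoring outputs re-enter training, so the model is never reinforced on its own unvalidated generations.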

Achieving parity, cutting costs, and taking control

Evaluations conducted by trained evaluators showed that the Meta Llama 3.3 70B model achieved performance parity with Osmo's production model built on large frontier APIs: there was no statistically significant difference across core metrics, including fit to descriptors and qualitative fragrance attributes. “We knew we were on the right track when our sensory panels couldn’t distinguish between the Meta Llama 3.3 70B outputs and those from much larger frontier models,” says Sam Barnett, senior machine learning engineer at Osmo. Most significantly, training costs dropped by approximately 10–20 times, and inference costs fell by up to 200 times depending on model size and throughput. It was a fundamental shift in what's economically possible. As Whitcomb says, "We can experiment more, iterate faster, and ultimately deliver better results to our customers, all while maintaining ownership of our models and data."

By eliminating dependency on proprietary AI providers, Osmo also removed the risk of forced deprecations or unexpected behavior changes. "This is really about moving from renting intelligence to building it ourselves," says Whitcomb, "and creating a digital perfumer that we fully control and can continuously improve." With costs reduced and control fully in hand, Osmo can iterate faster, generate more data, and refine its core models. “When you control the model and infrastructure, you can explore ideas that just aren’t possible in a more constrained system,” says Qian. This accelerates its broader mission to make olfactory intelligence as accessible and powerful as the tools we already have for sight and sound. "Once machines can interpret scent, you unlock entirely new applications," says Whitcomb, "from flagging spoiled food or sensing mold in a building to detecting disease and identifying environmental hazards."

Learn how Amazon SageMaker and Amazon Bedrock open up new possibilities for building, customizing, and deploying generative AI models with full control.
