
2024

Perplexity Builds Advanced Search Engine Using Anthropic’s Claude 3 in Amazon Bedrock

Learn how Perplexity’s AI-powered search engine uses Amazon Bedrock and Anthropic’s Claude 3 for accurate, comprehensive answers to user queries.

Off-loads management of ML infrastructure

Provides multiple LLM options for users

Simplifies access to open and proprietary models

Scales to accommodate additional models

Overview

Perplexity wanted to offer a powerful alternative to the traditional online search engine, so it created an interactive search companion that provides personalized and conversational answers that are backed by a curated list of sources. Users can choose among multiple high-performing large language models (LLMs) for relevant, accurate, and understandable information.

To simplify access to proprietary models, such as Anthropic's popular cutting-edge LLM Claude, and to fine-tune open-source LLMs, Perplexity needed a powerful global infrastructure for its search engine, Perplexity AI. The company chose to build Perplexity AI on Amazon Web Services (AWS), which provides a breadth of services offering enterprise-grade security and privacy, access to industry-leading foundation models (FMs), and applications powered by generative artificial intelligence (AI). In addition to running its own models on AWS, Perplexity offers its users access to Claude through Amazon Bedrock, a fully managed service that offers a choice of high-performing FMs from leading AI companies, including AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon, through a single API. Amazon Bedrock also provides a broad set of capabilities that organizations need to build generative AI applications with security, privacy, and responsible AI.

AWS re:Invent 2023 - Customer Keynote Perplexity | AWS Events

Opportunity | Building a Conversational Search Engine Using AWS

Launched in December 2022, Perplexity AI can gauge context and personalize interactions by learning a user’s interests and preferences over time. Users also gain visibility into the credibility of information because each search result is accompanied by a list of sources.

Since the inception of its public API service, Perplexity has been using Amazon SageMaker, a fully managed service that brings together a broad set of tools for high-performance, low-cost machine learning (ML) for virtually any use case. After evaluating several cloud providers, Perplexity chose AWS for training and inference of its models to complement its use of Amazon Bedrock. “Using AWS, we had access to GPUs and benefited from the technical expertise of the proactive AWS team,” says Denis Yarats, chief technology officer at Perplexity. The company tested instance types from Amazon Elastic Compute Cloud (Amazon EC2), which delivers a broad choice of compute, networking up to 3,200 Gbps, and storage services purpose-built to optimize price performance for ML projects. Specifically, Perplexity uses Amazon EC2 P4de Instances, which are powered by NVIDIA A100 GPUs and are optimized for distributed training, to fine-tune open-source FMs.

Through Amazon Bedrock, Perplexity AI users can select a model from the Claude 3 family of models from Anthropic, an AWS Partner. Claude 3 models feature expert knowledge, accuracy, and contextual understanding in addition to state-of-the-art performance. “Using a high-performing service such as Amazon Bedrock means we are tapping into Anthropic's powerful models in a way that allows our team to effectively maintain the reliability and latency of our product,” says William Zhang, technical team member at Perplexity.
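In practice, calling a Claude 3 model through Amazon Bedrock's single API looks something like the sketch below, which uses the Bedrock Converse API via boto3. This is a minimal illustration, not Perplexity's actual integration; the model ID and inference settings are assumptions, and the exact model identifiers available vary by AWS Region and account.

```python
import os

# Claude 3 Sonnet model ID in Amazon Bedrock (illustrative; check the
# Bedrock console for the exact IDs enabled in your Region and account).
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"


def build_messages(question: str) -> list:
    """Build a Converse-API message list for a single user question."""
    return [{"role": "user", "content": [{"text": question}]}]


def ask_claude(question: str, region: str = "us-east-1") -> str:
    """Send one question to Claude 3 through the Bedrock Converse API.

    Requires AWS credentials with bedrock:InvokeModel permission.
    """
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.converse(
        modelId=MODEL_ID,
        messages=build_messages(question),
        inferenceConfig={"maxTokens": 512, "temperature": 0.2},
    )
    # The Converse API returns the assistant message under output.message.
    return response["output"]["message"]["content"][0]["text"]


if __name__ == "__main__" and os.environ.get("RUN_BEDROCK_DEMO"):
    # Guarded behind an env var so the script is safe to import or run
    # without AWS credentials configured.
    print(ask_claude("In one sentence, what is a large language model?"))
```

Because Amazon Bedrock exposes every supported model family behind this one API shape, swapping in a different FM is largely a matter of changing `MODEL_ID`.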


Solution | Enhancing a Responsible and Accurate Search Experience Using Amazon Bedrock and Anthropic’s Claude 3

Because Claude provides information in concise, natural language, users can arrive at clear answers quickly. Users can also quickly upload and analyze large documents because Claude 3 models feature a context window of 200,000 tokens, the equivalent of roughly 150,000 words or more than 500 pages. “Ease of use is essential for making something part of our product,” says Zhang. “Using Claude 3 on Amazon Bedrock has been part of a great developer experience.”
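The article's rough conversion (200,000 tokens, or about 150,000 words) implies a ratio of roughly 0.75 words per token, which can be used as a back-of-the-envelope check on whether a document fits in the context window. The heuristic below is a sketch based only on that figure; real tokenizers vary by language and content.

```python
# Rough figures from the article: a 200,000-token context window,
# equivalent to about 150,000 words or more than 500 pages.
CONTEXT_WINDOW_TOKENS = 200_000
WORDS_PER_TOKEN = 0.75  # 150,000 words / 200,000 tokens


def estimated_tokens(word_count: int) -> int:
    """Estimate token count from a word count using the article's ratio."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count: int) -> bool:
    """Check whether a document of this length likely fits in one prompt."""
    return estimated_tokens(word_count) <= CONTEXT_WINDOW_TOKENS
```

By this estimate, a 150,000-word document lands exactly at the 200,000-token limit, while anything substantially longer would need to be split or summarized first.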

Perplexity aims for every search result to be accurate and helpful by reducing hallucinations—inaccurate outputs of LLMs. Anthropic’s previous model, Claude 2.1, had already cut its hallucination rate in half, and the Claude 3 family improves accuracy even further over Claude 2.1. As Anthropic works to drive model hallucinations to zero, Perplexity uses human annotators to further give its users accurate, safe, and trustworthy information. Additionally, Perplexity benefits from the commitment of Anthropic and AWS to responsible AI. “We appreciate that Amazon Bedrock has built-in content filters to alert us when people are trying to use our solution for unintended purposes,” says Aarash Heydari, cloud infrastructure engineer at Perplexity. As a safety and research company at its core, Anthropic is a market leader in combating “jailbreaks”—that is, attempts to generate harmful responses or misuse models.

Perplexity also continues to fine-tune other models on its AWS-powered infrastructure. In August 2023, Perplexity became an early beta tester of Amazon SageMaker HyperPod, which removes the undifferentiated heavy lifting involved in building and optimizing ML infrastructure for training FMs. Perplexity’s engineers worked alongside AWS solutions architects to create a scalable infrastructure that automatically splits training workloads across accelerated Amazon EC2 P4de Instances and processes them in parallel. Amazon SageMaker HyperPod is preconfigured with the distributed training libraries of Amazon SageMaker, further improving performance. “The speed of the training throughput doubled,” says Heydari. “The infrastructure was simple to manage, and hardware-related failures were dramatically reduced.”

To learn more about how Perplexity accelerates foundation model training by 40% with Amazon SageMaker HyperPod, read this case study.

Two months later, Perplexity released a public API so that users can access its proprietary online models, Sonar Small and Medium, which are hosted on AWS and fine-tuned from Mistral 7B and Mixtral 8x7B. These online LLMs prioritize knowledge from the internet over training data to respond to time-sensitive queries. “Our infrastructure for model training and inference is all powered by Amazon SageMaker HyperPod, which was a critical factor for us in choosing AWS,” Heydari says. “Amazon SageMaker HyperPod has been instrumental in driving our AI innovation.”
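Perplexity's public API follows the widely used OpenAI-style chat completions shape, so a request for one of the Sonar online models can be sketched with nothing but the Python standard library. Note the endpoint path, header names, and especially the model identifier here are assumptions for illustration—Perplexity's model names have changed over time, so consult its current API documentation before relying on any of them.

```python
import json
import os
import urllib.request

# Perplexity's OpenAI-compatible chat completions endpoint (assumed).
API_URL = "https://api.perplexity.ai/chat/completions"


def build_request(question: str, model: str = "sonar-small-online") -> dict:
    """Build an OpenAI-style chat completions payload.

    The model name is illustrative; check Perplexity's docs for
    currently supported identifiers.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }


def ask_perplexity(question: str) -> str:
    """POST the question to the public API (needs PERPLEXITY_API_KEY set)."""
    payload = json.dumps(build_request(question)).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__" and "PERPLEXITY_API_KEY" in os.environ:
    # Only attempts a network call when an API key is configured.
    print(ask_perplexity("What happened in AI news today?"))
```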

Perplexity AI continues to offer users a selection of models that suit their needs, automatically picking up the most recent iterations of Claude and making new features available to users.

“On AWS, we have a highly reliable experience with all the pieces of infrastructure that need to come together to make our complex product work,” says Heydari. “We stay on the cutting edge of AI capabilities, use powerful models, and are open to anything that enhances our user experience.”

About Perplexity

Perplexity AI is an AI-powered search engine and chatbot that uses advanced technologies such as natural language processing and Amazon Bedrock to provide accurate and comprehensive answers to queries from more than 10 million monthly users.

AWS Services Used

Amazon Bedrock

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.


Amazon SageMaker HyperPod

Amazon SageMaker HyperPod removes the undifferentiated heavy lifting involved in building and optimizing machine learning (ML) infrastructure for training foundation models (FMs), reducing training time by up to 40%.


Amazon EC2

Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 750 instances and choice of the latest processor, storage, networking, operating system, and purchase model to help you best match the needs of your workload.



Get Started

Organizations of all sizes across all industries are transforming their businesses and delivering on their missions every day using AWS. Contact our experts and start your own AWS journey today.