Cohere Rerank 3 Model - Multilingual

Rerank will return a sorted list of documents based on the semantic similarity between the query and documents in over 100 languages.

3.9

Overview

Cohere's Rerank endpoint significantly improves search quality by augmenting traditional keyword-based search systems with a semantic reranking system that contextualizes the meaning of a user's query beyond keyword relevance. Rerank delivers much higher-quality results than embedding-based search alone, and adding it to your application requires only a single line of code. The endpoint supports documents and queries written in over 100 languages.

Please note that as of July 2025, the minimum requirements to deploy this model are the following:

NVIDIA driver version: 535
CUDA version: 12.2

Highlights

Cohere's Rerank endpoint can be applied to both keyword-based search systems and vector search systems. When using Elasticsearch or OpenSearch, the Rerank endpoint can be added to the end of an existing search workflow and will allow users to incorporate semantic relevance into their keyword search system without changing their existing infrastructure. This is an easy and low-complexity method of improving search results by introducing semantic search technology.
This endpoint is powered by our large language model that computes a score for the relevance of the query with each of the initial search results. Compared to embedding-based semantic search, it yields better search results especially for complex and domain-specific queries.
Rerank supports JSON objects as documents where users can specify at query time the fields or keys that semantic search should be applied over.
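
As a rough illustration of adding Rerank to the end of an existing search workflow, the sketch below reorders first-stage search hits using a Rerank-style response. It assumes the response shape shown under Inputs (a results list whose entries carry an index and a relevance_score). The function name apply_rerank, the sample hits, and the scores are hypothetical.

```python
from __future__ import annotations

from typing import Any, Optional


def apply_rerank(hits: list[Any], rerank_response: dict, top_n: Optional[int] = None) -> list[tuple[Any, float]]:
    """Reorder first-stage search hits using a Rerank-style response.

    Assumes each result carries an "index" into `hits` and a
    "relevance_score"; sorts explicitly rather than relying on response order.
    """
    results = sorted(
        rerank_response.get("results", []),
        key=lambda r: r["relevance_score"],
        reverse=True,
    )
    if top_n is not None:
        results = results[:top_n]
    return [(hits[r["index"]], r["relevance_score"]) for r in results]


# Hypothetical keyword-search hits and a made-up Rerank response.
keyword_hits = ["doc about billing", "doc about GPU drivers", "doc about refunds"]
sample_response = {
    "results": [
        {"index": 1, "relevance_score": 0.91},
        {"index": 0, "relevance_score": 0.12},
        {"index": 2, "relevance_score": 0.47},
    ]
}
reranked = apply_rerank(keyword_hits, sample_response, top_n=2)
print(reranked)
```

Because the helper only needs the original hit list and the response, the keyword or vector search that produced the hits can stay unchanged.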

Details

Sold by

Cohere

Features and programs

Buyer guide

Gain valuable insights from real users who purchased this product, powered by PeerSpot.

Financing for AWS Marketplace purchases

AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

Pricing

Free trial

Try this product free for 7 days according to the free trial terms set by the vendor.

Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.

Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

Usage costs (3)

Dimension	Description	Cost/host/hour
ml.g5.2xlarge Inference (Batch), recommended	Model inference on the ml.g5.2xlarge instance type, batch mode	$9.16
ml.g5.xlarge Inference (Real-Time), recommended	Model inference on the ml.g5.xlarge instance type, real-time mode	$8.50
ml.g5.2xlarge Inference (Real-Time)	Model inference on the ml.g5.2xlarge instance type, real-time mode	$9.16
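
To make the hourly pricing concrete, the short calculation below estimates the software charge for a continuously running real-time endpoint and for a short batch job. The hourly rates come from the table above. The 30-day month, the 3-hour job, and the single-host count are assumptions, and AWS infrastructure costs are billed separately and are not included.

```python
# Software cost per host per hour, taken from the usage-cost table above.
HOURLY_RATE_USD = {
    "ml.g5.xlarge-realtime": 8.50,
    "ml.g5.2xlarge-realtime": 9.16,
    "ml.g5.2xlarge-batch": 9.16,
}


def software_cost(dimension: str, hours: float, hosts: int = 1) -> float:
    """Return the estimated software charge in USD (infrastructure excluded)."""
    return round(HOURLY_RATE_USD[dimension] * hours * hosts, 2)


# Assumption: one host running 24 hours a day for a 30-day month.
monthly = software_cost("ml.g5.xlarge-realtime", hours=24 * 30)
# Assumption: a 3-hour batch transform job on one host.
batch_job = software_cost("ml.g5.2xlarge-batch", hours=3)
print(f"real-time, 30 days: ${monthly:,.2f}")
print(f"batch, 3 hours: ${batch_job:,.2f}")
```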

Vendor refund policy

No refunds.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

Amazon SageMaker model

An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.

Deploy the model on Amazon SageMaker AI using the following options:

Real-time inference

Deploy the model as an API endpoint for your applications. When you send data to the endpoint, SageMaker processes it and returns results by API response. The endpoint runs continuously until you delete it. You're billed for software and SageMaker infrastructure costs while the endpoint runs. AWS Marketplace models don't support Amazon SageMaker Asynchronous Inference. For more information, see Deploy models for real-time inference.
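
A minimal sketch of calling a deployed real-time endpoint follows. It assumes the boto3 `sagemaker-runtime` client's `invoke_endpoint` call and the JSON request shape described under Inputs. The endpoint name is hypothetical, and the helper accepts any client object with the same method so that it can be exercised offline with a stand-in.

```python
import io
import json


def rerank_via_endpoint(client, endpoint_name: str, query: str, documents: list, top_n=None) -> list:
    """Send a rerank request to a SageMaker real-time endpoint and return its results list.

    `client` is expected to behave like boto3.client("sagemaker-runtime").
    """
    payload = {"query": query, "documents": documents}
    if top_n is not None:
        payload["top_n"] = top_n
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload),
    )
    # boto3 returns the body as a stream; read it and decode the JSON.
    return json.loads(response["Body"].read())["results"]


class FakeRuntimeClient:
    """Stand-in for the boto3 client so the sketch runs offline (illustrative only)."""

    def invoke_endpoint(self, **kwargs):
        self.last_request = kwargs
        body = json.dumps({"results": [{"index": 0, "relevance_score": 0.0048297215}]})
        return {"Body": io.BytesIO(body.encode("utf-8"))}


fake = FakeRuntimeClient()
results = rerank_via_endpoint(fake, "my-rerank-endpoint", "What is the refund policy?", ["No refunds."])
print(results)
# With real credentials, pass boto3.client("sagemaker-runtime") instead of the fake.
```

Remember that the endpoint is billed for as long as it runs, so delete it when it is no longer needed.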

Batch transform

Deploy the model to process batches of data stored in Amazon Simple Storage Service (Amazon S3). SageMaker runs the job, processes your data, and returns results to Amazon S3. When complete, SageMaker stops the model. You're billed for software and SageMaker infrastructure costs only during the batch job. Duration depends on your model, instance type, and dataset size. AWS Marketplace models don't support Amazon SageMaker Asynchronous Inference. For more information, see Batch transform for inference with Amazon SageMaker AI.

Version release notes

A key feature update adjusts the default maximum token limit for per-model reranking to balance performance and resource use, with customization options available via API or configuration. Critical bug fixes resolve the "Empty EncodedTexts" issue in Rerank and Embed endpoints by improving chunking logic for oversized inputs and adding safeguards to ensure valid outputs.

Additional details

Inputs

Summary: The model accepts JSON requests that specify the objects to be reranked. When the documents are JSON objects, the user can choose which keys are reranked by setting the rank_fields parameter. Alternatively, the user can send a plain list of texts to be reranked.

Req

{ "model": "...", "query": "...?", "documents": ["", ...], "max_tokens_per_doc": 1, "top_n": 100 }

Res

{ "results": [ { "index": 0, "relevance_score": 0.0048297215 } ] }

Input MIME type: application/json

Real-time inference sample input data

https://github.com/cohere-ai/cohere-developer-experience/blob/main/notebooks/sagemaker/Rerank%20Models.ipynb

Batch transform sample input data

https://github.com/cohere-ai/cohere-developer-experience/blob/main/notebooks/sagemaker/Rerank%20Models.ipynb

Input data descriptions

The following table describes supported input data fields for real-time inference and batch transform.

Field name	Description	Constraints	Required
query	The search query	Type: FreeText	Yes
documents	A list of document objects or strings to rerank. If a document object is provided, the text field is required unless the user specifies the fields to rerank over; all other fields are preserved in the response.	Default value: [] Type: FreeText	No
top_n	The number of most relevant documents or indices to return, defaults to the length of the documents	Default value: [] Type: Integer Minimum: 0 Maximum: 1	No
return_documents	If false, returns results without the document text: the API returns a list of {index, relevance score}, where index refers to the position in the list passed in the request. If true, returns results with the document text that was passed in: the API returns an ordered list of {index, text, relevance score}, where index and text refer to the list passed in the request.	Default value: FALSE Type: Categorical Allowed values: TRUE, FALSE	No
max_chunks_per_doc	The maximum number of chunks to produce internally from a document	Default value: [] Type: Integer Minimum: 0 Maximum: 10	No
rank_fields	If you sent a document object, you can specify the fields to rerank over	Default value: [] Type: FreeText	No
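
The fields in the table above can be assembled into a request body along the following lines. This is a sketch only: the helper name build_rerank_request and the sample documents are hypothetical, optional fields are simply omitted when unset, and the only range check applied is the 0 to 10 bound listed for max_chunks_per_doc.

```python
import json
from typing import Optional, Sequence, Union

Document = Union[str, dict]


def build_rerank_request(
    query: str,
    documents: Sequence[Document],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    max_chunks_per_doc: Optional[int] = None,
    rank_fields: Optional[Sequence[str]] = None,
) -> str:
    """Build a JSON request body using the field names from the input table."""
    if not query:
        raise ValueError("query is required")
    if max_chunks_per_doc is not None and not 0 <= max_chunks_per_doc <= 10:
        raise ValueError("max_chunks_per_doc must be between 0 and 10")
    body = {"query": query, "documents": list(documents)}
    optional = {
        "top_n": top_n,
        "return_documents": return_documents,
        "max_chunks_per_doc": max_chunks_per_doc,
        "rank_fields": list(rank_fields) if rank_fields is not None else None,
    }
    # Leave unset fields out so the service applies its own defaults.
    body.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(body)


# Hypothetical documents sent as JSON objects, reranked over two chosen keys.
docs = [
    {"title": "GPU setup", "body": "Install NVIDIA driver 535.", "author": "ops"},
    {"title": "Billing", "body": "Charges are hourly.", "author": "finance"},
]
request_body = build_rerank_request(
    "Which driver version do I need?", docs, rank_fields=["title", "body"], return_documents=True
)
print(request_body)
```

The resulting string can be sent as the application/json body of a real-time request.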

Resources

Vendor resources

Cohere

Cohere SDK

Rerank Blogpost

Support

Vendor support

Get support

AWS infrastructure support

AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Get support

Similar products

Cohere Rerank 3 Nimble Model - Multi

By Cohere

Rerank will return a sorted list of documents based on the semantic similarity between the query and documents in over 100 languages

Cohere Embed 4

By Cohere

Embed 4 transforms multiple modalities, such as texts and images, into numerical vectors.

Cohere Embed Model v3 - English

By Cohere

Cohere provides a representative AI model: Embed that translates text into numerical vectors that models can understand.

Cohere Rerank v3.5

By Cohere

Rerank improves search systems by sorting documents based on their semantic similarity to a query.

Cohere Embed Light v3 - Multilingual

By Cohere

Cohere provides a multilingual representative AI model, that translates texts and images into numerical vectors that models can understand.

Customer reviews

Ratings and reviews

3.9

10 ratings

Rating distribution (5 star to 1 star); reported shares: 20%, 70%, 10%

6 AWS reviews

4 external reviews

External reviews are from PeerSpot.

Diganth Sanghvi

Improved document chatbot has delivered accurate answers and builds stronger client trust

Reviewed on Mar 27, 2026

Review from a verified AWS customer

What is our primary use case?

My main use case for Cohere is that it's a good embedding model. I have used it with Titan, but Cohere came out better.

A specific example of how I've used Cohere for embeddings is when I was working with one of our clients where we were establishing a chatbot that can help us go through 31 PDFs. For embedding, we used Cohere and Titan, and Cohere was a superior product.

I have integrated Cohere in that chatbot project using SageMaker, and it was an easy API call that I used.

What is most valuable?

In my opinion, the best features Cohere offers are the embedding flexibility and the normal way the LLM reacted to the embeddings of Cohere. I used OpenSearch to integrate and store all the embeddings, and I also stored Titan embeddings in OpenSearch, but the results with Cohere were much better.

The flexibility I mentioned is evident because when we were using Titan, it was hallucinating a lot and not giving proper answers because I felt the embedding was poor. When we used it with Cohere, the embeddings were better and the chatbot with the LLM that used the embeddings from Cohere answered in a better way.

Cohere has positively impacted my organization as the project was a success. Clients were really happy with the results, and we received more business from them.

What needs improvement?

Cohere can be improved by having more integrations beyond its current offerings with Amazon. Integrations with Databricks, Azure, and Google Cloud would be beneficial.

For how long have I used the solution?

I have been using Cohere for the last two years.

What do I think about the stability of the solution?

Cohere is stable.

What do I think about the scalability of the solution?

Cohere's scalability is pretty good, as we used it for 31 PDFs and there was no bottleneck.

Cohere handles large-scale data and workloads really well. I did not see any bottleneck at all because the project was relatively small.

How are customer service and support?

We did not require any customer support.

Which solution did I use previously and why did I switch?

I did not previously use a different solution; this was the first time we were developing it and Cohere was from the start a competitive alternative to Titan, so we were always going to choose between Cohere or Titan.

How was the initial setup?

It was extremely easy for my team to learn and start using Cohere; it was just one API call.

What about the implementation team?

Cohere integrates really well with the tools I use, and I did not experience any challenges with OpenSearch or SageMaker.

What was our ROI?

Regarding the return on investment and any relevant metrics such as money or time saved or reduced resource needs, that is confidential.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing was that it was all managed by AWS, and we had AWS credits, so I did not have to dive into that.

Which other solutions did I evaluate?

Before choosing Cohere, I evaluated AWS Titan.

The key factors that led me to choose Cohere over other AI development platforms are that I was experimenting, and Cohere was better than Titan.

What other advice do I have?

Cohere has helped my organization innovate and stay ahead in our industry as Cohere was better than Titan, and it helped us to secure the client's confidence and we moved from proof of concept to production.

The advice I would give to others looking into using Cohere is to go for it. Cohere is the best embedding model, and I would not recommend wasting time with AWS Titan.

I rate Cohere a 10 out of 10 because Cohere was way better compared to Titan.

reviewer2802159

Summarization and chat completion have improved workflows but still need built-in OCR support

Reviewed on Feb 11, 2026

Review from a verified AWS customer

What is our primary use case?

I work with Cohere and have been doing so for about two months.

Currently, I am working with AWS Cloud and cloud services, and we use models like GPT-4o mini, 2.1, and Cohere.

We primarily use English only, with no other languages.

What is most valuable?

I assess the value of Cohere's API support in my business operations as easy to integrate.

The specific benefits I have seen from using Cohere include saving time to summarize information and for chat completion and related tasks.

I use Cohere for chat completion purposes.

My thoughts on the summarization feature are that it is beneficial for impacting our data analysis tasks.

What needs improvement?

English is where the language understanding was specifically beneficial for us.

Cohere is a solid LLM that processes all files well.

I would appreciate additional features such as OCR and similar capabilities.

For how long have I used the solution?

I have been working with Cohere for about two months.

What do I think about the stability of the solution?

There are no disadvantages or drawbacks of Cohere in comparison to ChatGPT or other AI solutions.

What do I think about the scalability of the solution?

There are no complexities with Cohere; the setup process is straightforward.

How are customer service and support?

I have not escalated any questions to the technical support team.

How would you rate customer service and support?

Negative

Which solution did I use previously and why did I switch?

I have experience working with several different AI products, but I do not perceive any significant difference between them; they are nearly identical with accuracy varying slightly across certain areas.

How was the initial setup?

There are no complexities; the setup process is straightforward.

What was our ROI?

I have not observed any measurable benefits or return on investment with Cohere.

What's my experience with pricing, setup cost, and licensing?

In my opinion, the pricing is reasonable.

Which other solutions did I evaluate?

There are no key differences or notable advantages or disadvantages of Cohere in comparison to other AI tools and LLM products that I am working with.

What other advice do I have?

I intended to clarify that I use Cohere for chat completion.

My primary concern stems from the missing OCR capabilities.

My overall review rating for Cohere is seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Shivam Singh

Controlled text generation has supported secure workflows and governed data privacy

Reviewed on Dec 12, 2025

Review from a verified AWS customer

What is our primary use case?

We adopted Cohere primarily for their command model to support enterprise-grade text generation and NLP workflows.

There was a use case for one of our customers where they required automated text generation and summarization of long documents and draft creation for internal content, so we used Cohere's command model with AWS Bedrock.

For another customer, there was a similar use case but they also wanted semantic search and RAG, and instruction-based responses for chat and workflow automation were required, so we used Cohere's command model for that.

What is most valuable?

Cohere's command model is particularly useful for scenarios where consistent controlled output is more important, especially where we need creative responses, so I think Cohere's command model fits better in that case. We also found it well suited for structured enterprise tasks such as policy drafting, knowledge extraction, and generating standardized text for operational workflows.

It struck a good balance between fluency and predictability, which helps our team and is valuable for our business-critical applications, giving better insight to our team.

One of the major benefits I saw was data isolation and governance since Cohere has been implemented.

Consistent output quality, strong instruction following, and excellent embedding performance for retrieval tasks have benefited our organization. It was also offered through Amazon Bedrock, so this complete offering and strength from Cohere's command model helped our customers, and it is enterprise-friendly with deployment options such as VPC and data isolation that helped significantly.

Data privacy was a major concern because we operate from Asia-Pacific, and there is strong governance for data privacy in our country, so data privacy is the major compliance that helped us here.

What needs improvement?

Cohere could improve in areas where the command model is not as creative as some larger LLMs available in the market, which is expected but noticeable in open-ended generative tasks.

Reporting and analytics in the dashboard could be more detailed and fine-tuned, which would enhance the experience.

Fine-tuning could be simplified to support broader teams without deep ML expertise.

For speeding up, what I have already suggested is that it can be more creative, and their reporting and analytics can be improved, as this would help teams without machine learning expertise and speed up their end goals.

The dashboard reporting can be improved.

For how long have I used the solution?

We have been using Cohere for around one year.

What do I think about the stability of the solution?

Cohere is stable.

What do I think about the scalability of the solution?

The scalability and performance are quite good.

How are customer service and support?

We have not reached out to customer support yet, but once we encounter an issue and need to raise a ticket, we will provide feedback.

What was our ROI?

Cohere helped us with all three aspects: money is saved, time is saved, and we needed fewer resources to meet our end goals.

What's my experience with pricing, setup cost, and licensing?

Compared to models available in the market, Cohere's pricing, setup cost, and licensing are better.

Which other solutions did I evaluate?

We have tried multiple models, but we found that Cohere's command was a better fit for our needs.

We explored models from Anthropic and AWS native models such as AWS Titan Text before choosing Cohere.

What other advice do I have?

Cohere offers great customization options.

If governance, consistency, and data privacy are priorities, Cohere meets our organization's requirements well.

I recommend that anyone, especially in environments where governance, consistency, and data privacy are priorities, should choose Cohere, particularly the command model for teams looking for a controlled enterprise-safe alternative for text generation, summarization, and instruction automation.

Currently, we have used Cohere from the AWS Bedrock offering only, but since AWS has changed their third-party model availability from partner accounts, in the future, we are going to be a reseller for Cohere.

The documentation and learning resources were very helpful.

Our overall review rating for Cohere is 8 out of 10.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

reviewer2784894

Fast document processing has improved tender workflows but documentation still needs work

Reviewed on Dec 04, 2025

Review provided by PeerSpot

What is our primary use case?

My main use case for Cohere is for LLM and chatbot development.

I use Cohere to fill boxes about documents, specifically about tenders.

Cohere helps me fill boxes about documents, and I work with docx documents for a private company.

What is most valuable?

The best features Cohere offers are that it is fast and great.

Speed has helped me in my day-to-day work, and I really notice the difference because it responds very quickly to LLM requests.

Cohere has positively impacted my organization because I use it with Oracle services, and in an enterprise way, it helped me offer clients a unique place to develop and use LLM.

What needs improvement?

I am uncertain about how Cohere can be improved.

The documentation and support could be improved, as there is limited documentation available on the web.

What do I think about the stability of the solution?

Cohere is stable.

What do I think about the scalability of the solution?

I am uncertain about Cohere's scalability.

How are customer service and support?

I am uncertain about customer support.

Which solution did I use previously and why did I switch?

I used GPT-4 before Cohere, and it is great.

Before choosing Cohere, I evaluated other options, specifically GPT-4.

What was our ROI?

I am uncertain if I have seen a return on investment or any relevant metrics such as time saved, money saved, or fewer employees needed.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing is that it is expensive to use all Oracle services.

What other advice do I have?

I do not want to add anything else about the features, including anything about accuracy or ease of use.

I do not have specific advice to give to others looking into using Cohere. I gave this review a rating of 6.

reviewer2784744

Reranking has boosted retrieval quality and has improved performance in my information systems

Reviewed on Dec 04, 2025

Review from a verified AWS customer

What is our primary use case?

My main use case for Cohere is Retrieval Augmented Generation.

A specific example of how I use Retrieval Augmented Generation with Cohere is for information retrieval systems.

What is most valuable?

The best feature Cohere offers is the Reranking model.

What stands out for me about the ranking model is that it improved performance in my work.

Cohere positively impacted my organization by improving the performance of my RAG system.

I noticed a 10% improvement in my log system after using Cohere.

What needs improvement?

Cohere is good enough, and I think it can be improved.

For how long have I used the solution?

I have been using Cohere for two years.

What do I think about the stability of the solution?

Cohere is stable.

What do I think about the scalability of the solution?

The scalability of Cohere is good.

How are customer service and support?

The customer support for Cohere is good.

How would you rate customer service and support?

Negative

How was the initial setup?

My experience with pricing, setup cost, and licensing for Cohere is good.

What was our ROI?

I have not seen metrics for return on investment, and I have no metrics to share.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing for Cohere is good.

What other advice do I have?

My advice to others looking into using Cohere is to try it.

My company does not have a business relationship with this vendor other than being a customer.

I gave this review a rating of 8.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
