Cohere Rerank v3.5

Sold by Cohere

Rerank improves search systems by sorting documents based on their semantic similarity to a query.

Ratings and reviews

4.5 (1 rating, 1 AWS review)

Reviews (1)

Singh Aman

RAG assistant workflows have improved answer relevance and now deliver faster accurate decisions

Reviewed on May 23, 2026

Review from a verified AWS customer

What is our primary use case?

We use Cohere Rerank v3.5 in our enterprise RAG-based AI assistant, which is hosted on AWS and integrated with Amazon Bedrock, OpenSearch, and vector embeddings. Its job is to improve document retrieval relevance for internal enterprise workflows such as access management, ticketing, and knowledge retrieval.

As a specific example, we used to run into bugs when asking questions because the knowledge base (KB) search was making multiple calls. To avoid that, we added a reranking model so that the right KB documents come back on the first attempt, which reduces the number of KB calls.
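The retrieve-once-then-rerank idea described above can be sketched as follows. This is a minimal illustration, not the reviewer's actual pipeline: the `overlap_score` function is a hypothetical stand-in for the reranking model (it only counts shared words), and the candidate documents are made up for the example.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a reranker: fraction of query words found in the document.

    Cohere Rerank does not score documents this way; this placeholder only
    lets the sketch run without calling any external service.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    top_n: int,
    score_fn: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every candidate once, then keep the top_n highest-scoring ones."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# A single broad KB search returns these candidates (illustrative data).
candidates = [
    "How to reset a password in the HR portal",
    "Request access to the finance dashboard",
    "Open a ticket for a broken laptop",
    "Steps to request access for a new team member",
]

# Rerank once and pass only the best chunks to the LLM, instead of
# issuing further KB calls to hunt for better context.
top = rerank("request access to dashboard", candidates, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

In a real deployment, `score_fn` would be replaced by the relevance scores returned by the reranking model.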

What is most valuable?

The best features of Cohere Rerank v3.5 that I have observed are high-quality reranking relevance, better contextual retrieval, easy AWS integration, fast response time, and strong multilingual support.

The easy AWS integration and fast response time help my team because our whole solution is on AWS, and AWS already offers Cohere as a provider with reranking available. We just reference the ARN of the reranking model, and it plugs into the KB search call, which makes integration straightforward.
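A rough sketch of referencing the model by ARN is shown below. It builds the request for the standalone `rerank` operation of the boto3 `bedrock-agent-runtime` client, not the reviewer's KB search configuration, which the review does not show. The field names follow that operation as best understood here and should be checked against the current AWS documentation. The model ARN, Region, and helper name `build_rerank_request` are assumptions for illustration.

```python
from typing import Any, Dict, List

# Assumed ARN for Cohere Rerank v3.5; confirm the Region and model ID in your account.
MODEL_ARN = "arn:aws:bedrock:us-west-2::foundation-model/cohere.rerank-v3-5:0"


def build_rerank_request(
    query: str,
    documents: List[str],
    top_n: int,
    model_arn: str = MODEL_ARN,
) -> Dict[str, Any]:
    """Build keyword arguments for a Bedrock rerank call (hypothetical helper)."""
    if not documents:
        raise ValueError("at least one document is required")
    return {
        "queries": [{"type": "TEXT", "textQuery": {"text": query}}],
        "sources": [
            {
                "type": "INLINE",
                "inlineDocumentSource": {
                    "type": "TEXT",
                    "textDocument": {"text": doc},
                },
            }
            for doc in documents
        ],
        "rerankingConfiguration": {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                # Never ask for more results than there are documents.
                "numberOfResults": min(top_n, len(documents)),
                "modelConfiguration": {"modelArn": model_arn},
            },
        },
    }


request = build_rerank_request(
    query="How do I request access to the finance dashboard?",
    documents=[
        "Access requests are approved by the resource owner.",
        "Laptops are replaced every three years.",
        "Finance dashboard access is requested through the access portal.",
    ],
    top_n=2,
)

# Sending the request needs AWS credentials and model access, so it is not run here:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime", region_name="us-west-2")
#   response = client.rerank(**request)
#   for result in response["results"]:
#       print(result["index"], result["relevanceScore"])
```

Keeping the request builder separate from the network call makes the payload easy to inspect and test before any AWS call is made.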

The most valuable feature was the reranking quality. After we introduced Cohere Rerank v3.5 into our pipeline, the relevance of the required chunks improved significantly, which directly reduced hallucinated responses from the downstream LLMs. Latency was also good enough to be acceptable for an enterprise-grade application.

Cohere Rerank v3.5 has positively impacted my organization by improving answer accuracy in our AI assistant workflows and reducing irrelevant retrieval results. This improved end-user trust in the system and helped move some proof-of-concept implementations closer to production readiness.

We have seen clear results: answer quality improved after we implemented reranking.

What needs improvement?

Latency can certainly be improved. It is currently around one to one and a half seconds, and it would be great if that could be reduced substantially. Other than that, I have not seen much room for improvement, because the version I am using is already much improved.

Better native observability would help, and pricing is not transparent on AWS. AWS-native analytics would also be helpful for developers, and if fine-tuning were available for the reranking models, it would give much better control over them, in my opinion.

I do not think there are any other improvements for Cohere Rerank v3.5 that we have not discussed, as we have already covered latency, integration, and the rest.

For how long have I used the solution?

I have been using Cohere Rerank v3.5 for the last one and a half years.

What do I think about the stability of the solution?

Cohere Rerank v3.5 is quite stable and fast.

What do I think about the scalability of the solution?

Cohere Rerank v3.5 scales on demand, as AWS already supports scalability for enterprise needs. With around one hundred users, fewer resources are used; with one thousand to ten thousand users, capacity scales with the load. We have not seen any performance degradation as user numbers increase, so scaling works quickly and well.

How are customer service and support?

I have not yet contacted customer support for Cohere Rerank v3.5 because we have not needed to. In terms of stability and scalability, the solution was stable during testing with enterprise workloads and handled large document retrieval scenarios with acceptable performance degradation. The API integration through AWS was straightforward and reliable, as everything was covered in the AWS documentation, which was quite good.

Which solution did I use previously and why did I switch?

We were not using any reranking model before. After adopting one, performance and answer quality improved significantly.

How was the initial setup?

The setup process is very easy because the AWS documentation already explains how to use the inference model, which I think is very good to have.

What was our ROI?

I have seen a return on investment: we have saved significant time, the end-user experience has improved considerably, and end users greatly appreciate the quality of the responses we provide.

Which other solutions did I evaluate?

Before choosing Cohere Rerank v3.5, I evaluated other options, including Amazon Titan, which provides embedding retrieval, as well as OpenSearch k-NN and some open-source reranking models from Hugging Face. We found that Cohere performs much better than those options.

What other advice do I have?

I would advise any developer who wants to improve their RAG response quality to look into Cohere Rerank v3.5, as it can improve response quality significantly. I gave this product a rating of nine out of ten.