
Overview
At Cohere, we are committed to breaking down barriers and expanding access to cutting-edge NLP technologies that power projects across the globe. By making our innovative multilingual language models available to all developers, we continue to move toward our goal of empowering developers, researchers, and innovators with state-of-the-art NLP technologies that push the boundaries of Language AI.
Our Multilingual Model maps text to a semantic vector space, positioning text with a similar meaning in close proximity. This process unlocks a range of valuable use cases for multilingual settings. For example, one can map a query to this vector space during a search to locate relevant documents nearby. This often yields search results that are several times better than keyword search.
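The search flow described above can be sketched as a nearest-neighbour lookup over embedding vectors. This is only an illustration, assuming the embeddings have already been obtained from the model: the three-dimensional vectors are made-up toy values rather than real model output, and `cosine_similarity` and `rank_documents` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (document_index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (hypothetical values).
doc_vecs = [
    [0.9, 0.1, 0.0],  # e.g. an English document about weather
    [0.0, 0.2, 0.9],  # e.g. a document about finance
    [0.8, 0.3, 0.1],  # e.g. a Spanish document about weather
]
query_vec = [1.0, 0.2, 0.0]  # e.g. a query about weather

results = rank_documents(query_vec, doc_vecs, top_k=2)
```

Because texts with similar meaning sit close together regardless of language, the two weather documents rank above the finance document in this toy setup. A production system would typically use a vector index rather than a linear scan.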
Highlights
- Humans speak over 7100 languages, yet the majority of language models only support the English language. This makes it incredibly challenging to build products and projects using multilingual language understanding. Cohere’s mission is to solve that by empowering our developers with technology that possesses the power of language. That’s why we’re introducing our first multilingual text understanding model that supports over 100 languages and delivers significantly better performance than existing open-source models.
- Our optimized containers enable low-latency inference on a diverse set of hardware accelerators available on AWS, providing different cost and performance points for SageMaker customers.
- Multilingual, Semantic Search, Embeddings, Text Classification, Cross-Lingual Content Moderation
Details
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g4dn.12xlarge Inference (Batch), recommended | Model inference on the ml.g4dn.12xlarge instance type, batch mode | $14.67 |
| ml.g5.xlarge Inference (Real-Time), recommended | Model inference on the ml.g5.xlarge instance type, real-time mode | $4.23 |
| ml.p3.2xlarge Inference (Real-Time) | Model inference on the ml.p3.2xlarge instance type, real-time mode | $11.475 |
| ml.g5.2xlarge Inference (Real-Time) | Model inference on the ml.g5.2xlarge instance type, real-time mode | $4.56 |
| ml.g4dn.xlarge Inference (Real-Time) | Model inference on the ml.g4dn.xlarge instance type, real-time mode | $2.2092 |
| ml.g4dn.2xlarge Inference (Real-Time) | Model inference on the ml.g4dn.2xlarge instance type, real-time mode | $2.82 |
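As a rough budgeting aid, the per-host hourly rates above can be multiplied out. The sketch below assumes the listed rate is the only charge being estimated; any separate AWS infrastructure charges are outside its scope, and `estimate_cost` is a hypothetical helper name.

```python
# Hourly rates copied from the pricing table (USD per host per hour).
HOURLY_RATE_USD = {
    "ml.g4dn.12xlarge": 14.67,  # batch mode
    "ml.g5.xlarge": 4.23,       # real-time mode
    "ml.p3.2xlarge": 11.475,    # real-time mode
    "ml.g5.2xlarge": 4.56,      # real-time mode
    "ml.g4dn.xlarge": 2.2092,   # real-time mode
    "ml.g4dn.2xlarge": 2.82,    # real-time mode
}


def estimate_cost(instance_type, hosts, hours):
    """Listed rate x number of hosts x hours of uptime, in USD."""
    if hosts < 0 or hours < 0:
        raise ValueError("hosts and hours must be non-negative")
    return HOURLY_RATE_USD[instance_type] * hosts * hours


# One ml.g5.xlarge real-time host running for a 30-day month.
monthly = estimate_cost("ml.g5.xlarge", hosts=1, hours=24 * 30)
```

For the example shown, 4.23 × 720 hours comes to about $3,045.60 at the listed rate.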
Vendor refund policy
No refunds.
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
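One way to turn a model package into a real-time endpoint is through the `boto3` SageMaker client's `create_model`, `create_endpoint_config`, and `create_endpoint` calls. The sketch below is a minimal outline under stated assumptions: the model package ARN, IAM role ARN, and resource name are placeholders you would supply from your own subscription and account, `build_deploy_requests` and `deploy` are hypothetical helper names, and network isolation is enabled on the assumption that your Marketplace subscription requires it.

```python
def build_deploy_requests(model_package_arn, role_arn, name,
                          instance_type="ml.g5.xlarge", instance_count=1):
    """Build keyword arguments for the three SageMaker API calls."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        # The container is defined by the subscribed model package.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        # Assumption: Marketplace model packages run with network isolation.
        "EnableNetworkIsolation": True,
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
            }
        ],
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def deploy(model_package_arn, role_arn, name, **options):
    """Create the model, endpoint configuration, and endpoint."""
    import boto3  # imported lazily so the builder works without AWS access

    sagemaker = boto3.client("sagemaker")
    model, endpoint_config, endpoint = build_deploy_requests(
        model_package_arn, role_arn, name, **options
    )
    sagemaker.create_model(**model)
    sagemaker.create_endpoint_config(**endpoint_config)
    sagemaker.create_endpoint(**endpoint)
    return name
```

Endpoint creation is asynchronous, so the endpoint is not ready to serve requests the moment `deploy` returns. The default instance type here is one of the real-time options from the pricing table; batch processing would use a transform job instead of an endpoint.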
Version release notes
Patch release for minor issue with datatype.
Additional details
Inputs
- Summary
The model accepts JSON requests that specify the input text to be embedded.
{ "texts": [ "hello", "goodbye" ], "truncate": "END" }
- Input MIME type
- application/json
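A request like the JSON shown above could be sent to a deployed endpoint with the `boto3` SageMaker Runtime client's `invoke_endpoint` call. This is a sketch, not a reference client: the endpoint name is a placeholder, `build_request_body` and `embed` are hypothetical helper names, and because the response schema is not described here, the parsed JSON is returned as-is without assuming any particular keys.

```python
import json


def build_request_body(texts, truncate=None):
    """Serialize the request in the shape of the example JSON above."""
    body = {"texts": list(texts)}
    if truncate is not None:
        body["truncate"] = truncate
    return json.dumps(body)


def embed(endpoint_name, texts, truncate=None):
    """Send texts to the endpoint and return the parsed JSON response."""
    import boto3  # imported lazily so the body builder works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # the documented input MIME type
        Body=build_request_body(texts, truncate).encode("utf-8"),
    )
    # Response layout is not documented in this listing, so no keys are assumed.
    return json.loads(response["Body"].read())


# Reproduces the example request body shown above.
body = build_request_body(["hello", "goodbye"], truncate="END")
```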
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| texts | An array of strings for the model to embed. Maximum number of texts per call is 1024. We recommend reducing the length of each text to be under 256 tokens for optimal quality. | Type: FreeText | Yes |
| truncate | One of NONE, LEFT, or RIGHT to specify how the API will handle inputs longer than the maximum token length. Passing LEFT will discard the start of the input. RIGHT will discard the end of the input. In both cases, input is discarded until the remaining input is exactly the maximum input token length for the model. If NONE is selected, when the input exceeds the maximum input token length an error will be returned. | Default value: NONE. Type: Categorical. Allowed values: NONE, LEFT, RIGHT | No |
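Since each call accepts at most 1024 texts, a larger corpus has to be split across several requests. A minimal sketch of that batching step follows; `batch_texts` is a hypothetical helper name, and only the 1024-text limit is taken from the table above.

```python
MAX_TEXTS_PER_CALL = 1024  # per-call limit stated in the input table


def batch_texts(texts, batch_size=MAX_TEXTS_PER_CALL):
    """Split texts into consecutive batches that respect the per-call limit."""
    if not 1 <= batch_size <= MAX_TEXTS_PER_CALL:
        raise ValueError(f"batch_size must be between 1 and {MAX_TEXTS_PER_CALL}")
    texts = list(texts)
    return [texts[start:start + batch_size]
            for start in range(0, len(texts), batch_size)]


# 2,500 placeholder documents split into request-sized batches.
corpus = [f"document {i}" for i in range(2500)]
batches = batch_texts(corpus)
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be sent as the `texts` field of one request. A smaller `batch_size` may be preferable in practice to keep individual requests quick.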
Support
Vendor support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
Customer reviews
Improved document chatbot delivers accurate answers and builds stronger client trust
What is our primary use case?
My main use case for Cohere is as an embedding model. I have used it alongside Titan, and Cohere came out ahead.
A specific example of how I've used Cohere for embeddings is when I was working with one of our clients where we were establishing a chatbot that can help us go through 31 PDFs. For embedding, we used Cohere and Titan, and Cohere was a superior product.
I integrated Cohere into that chatbot project using SageMaker, and it was an easy API call.
What is most valuable?
In my opinion, the best features Cohere offers are the embedding flexibility and how naturally the LLM responded to Cohere's embeddings. I used OpenSearch to store all the embeddings, and I stored Titan embeddings in OpenSearch as well, but the results with Cohere were much better.
The flexibility I mentioned is evident because when we were using Titan, it was hallucinating a lot and not giving proper answers because I felt the embedding was poor. When we used it with Cohere, the embeddings were better and the chatbot with the LLM that used the embeddings from Cohere answered in a better way.
Cohere has positively impacted my organization as the project was a success. Clients were really happy with the results, and we received more business from them.
What needs improvement?
Cohere can be improved by having more integrations beyond its current offerings with Amazon. Integrations with Databricks, Azure