
Overview
With extensive expertise in developing Spanish-language NLP products and services, the IIC aims to share its knowledge globally by offering top-tier NLP tools. The Rigo family comprises specialized Spanish-language NLP services rigorously tested and certified with the IIC quality seal.
RigoChat-7b, part of this family, addresses key Spanish NLP tasks such as Tool Use, Summarization, Math, Code, and Abstractive-QA. It is particularly suited to RAG (Retrieval-Augmented Generation) systems with Spanish databases, delivering accurate, context-based responses while minimizing hallucinations.
Built on open-weight models for commercial use, RigoChat-7b is fine-tuned on high-quality Spanish datasets, offering strong performance and cost-efficient integration into RAG systems.
An open-weight version of this model, limited to research and non-commercial purposes, is available on IIC’s public Hugging Face profile: https://huggingface.co/IIC/RigoChat-7b-v2
Highlights
- **RigoChat-7b** is an LLM based on cutting-edge open-weight models that has been specialized for the Spanish language. We used a combination of public and private Spanish datasets designed at the IIC. By fine-tuning with Direct Preference Optimization (DPO), we have managed to **outperform most state-of-the-art models over several high-quality Spanish corpora**. This demonstrates RigoChat's improved overall performance and robustness on general-purpose Spanish tasks.
- We recommend using this model as **a general chatbot or within applications designed for specific tasks**, such as RAG systems, SQL queries or as an autonomous agent to facilitate the use of tools. RigoChat-7b is ideal for integration into architectures with **low GPU cost**.
- Supported tasks: Text Generation, Summarization, Machine Translation, Tool Use, Math, Code, Abstractive Question Answering, Named Entity Recognition (NER), Automated Writing Assistance, Chatbots and Conversational Agents, Retrieval-Augmented Generation
Details
Pricing
Free trial
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g5.8xlarge Inference (Batch), recommended | Model inference on the ml.g5.8xlarge instance type, batch mode | $1.50 |
| ml.g5.8xlarge Inference (Real-Time), recommended | Model inference on the ml.g5.8xlarge instance type, real-time mode | $1.50 |
| ml.g5.12xlarge Inference (Batch) | Model inference on the ml.g5.12xlarge instance type, batch mode | $1.50 |
| ml.g5.4xlarge Inference (Batch) | Model inference on the ml.g5.4xlarge instance type, batch mode | $1.50 |
| ml.g5.16xlarge Inference (Batch) | Model inference on the ml.g5.16xlarge instance type, batch mode | $1.50 |
| ml.g5.12xlarge Inference (Real-Time) | Model inference on the ml.g5.12xlarge instance type, real-time mode | $1.50 |
| ml.g5.2xlarge Inference (Real-Time) | Model inference on the ml.g5.2xlarge instance type, real-time mode | $1.50 |
| ml.g5.4xlarge Inference (Real-Time) | Model inference on the ml.g5.4xlarge instance type, real-time mode | $1.50 |
| ml.g5.16xlarge Inference (Real-Time) | Model inference on the ml.g5.16xlarge instance type, real-time mode | $1.50 |
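As a rough illustration of how the per-host hourly rate in the table adds up, the sketch below multiplies the listed $1.50 rate by a host count and a number of hours. The function name `estimate_cost` is hypothetical, and the estimate covers only the rate shown in this table; any other charges are outside its scope.

```python
from decimal import Decimal

# Rate taken from the pricing table; it is the same for every listed instance type.
HOURLY_RATE_USD = Decimal("1.50")


def estimate_cost(hosts: int, hours: float, rate: Decimal = HOURLY_RATE_USD) -> Decimal:
    """Return rate * hosts * hours, rounded to cents (hypothetical helper)."""
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    if hours < 0:
        raise ValueError("hours must not be negative")
    total = rate * hosts * Decimal(str(hours))
    return total.quantize(Decimal("0.01"))


# Two hosts running for one day at the listed rate.
print(estimate_cost(hosts=2, hours=24))
```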
Vendor refund policy
No refunds
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
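One possible way to turn a subscribed model package into a real-time endpoint is sketched below with the boto3 SageMaker client. The helper name `deploy_model_package`, the variant name, and the choice to reuse one name for the model, endpoint configuration, and endpoint are assumptions made for illustration; the ARNs must come from your own account and subscription, and enabling network isolation is assumed here rather than stated by the listing.

```python
def deploy_model_package(sm_client, model_package_arn, role_arn, name,
                         instance_type="ml.g5.8xlarge", instance_count=1):
    """Create a model, endpoint config, and endpoint from a model package.

    sm_client is expected to behave like boto3.client("sagemaker").
    The default instance type is the one marked as recommended in the listing.
    """
    # 1. Register a SageMaker model that points at the model package.
    sm_client.create_model(
        ModelName=name,
        ExecutionRoleArn=role_arn,
        PrimaryContainer={"ModelPackageName": model_package_arn},
        EnableNetworkIsolation=True,  # assumption: isolated container
    )
    # 2. Describe the hosting fleet for the endpoint.
    sm_client.create_endpoint_config(
        EndpointConfigName=name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",  # hypothetical variant name
            "ModelName": name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    # 3. Start the endpoint; creation continues asynchronously on the AWS side.
    sm_client.create_endpoint(EndpointName=name, EndpointConfigName=name)
    return name


# Usage (requires boto3 and AWS credentials; placeholders are not real ARNs):
#   import boto3
#   deploy_model_package(boto3.client("sagemaker"),
#                        "<model-package-arn>", "<execution-role-arn>", "rigochat-7b")
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before any billable resources are created.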
Version release notes
We have updated the model inference endpoint with several security and performance patches.
Additional details
Inputs
- Summary
The text generation model supports JSON inputs in the same format as OpenAI’s Chat Completion API.
{"messages": [{"role": "system", "content": "Eres un asistente de consultas por chat."}, {"role": "user", "content": "Hola, ¿qué puedes hacer por mí?"}], "temperature": 0.5, "max_tokens": 1024}
- Input MIME type
- application/json
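A minimal sketch of sending the sample request above to a deployed real-time endpoint follows. It assumes a client that behaves like `boto3.client("sagemaker-runtime")`; the helper name `invoke_chat` and the endpoint name in the usage comment are hypothetical. The response schema is not documented in this listing, so the raw response body is returned as text rather than parsed into specific fields.

```python
import json

# Sample request from the listing, in the Chat Completion style format.
SAMPLE_PAYLOAD = {
    "messages": [
        {"role": "system", "content": "Eres un asistente de consultas por chat."},
        {"role": "user", "content": "Hola, ¿qué puedes hacer por mí?"},
    ],
    "temperature": 0.5,
    "max_tokens": 1024,
}


def invoke_chat(runtime_client, endpoint_name, payload):
    """Send a JSON payload to the endpoint and return the raw response text."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # the documented input MIME type
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response body is a stream; read it fully and decode as UTF-8.
    return response["Body"].read().decode("utf-8")


# Usage (requires boto3, AWS credentials, and a running endpoint):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   print(invoke_chat(runtime, "rigochat-7b", SAMPLE_PAYLOAD))
```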
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| messages | List of messages | Each message must have a “content” field with the textual interaction and a “role” field, which must be one of [“user”, “assistant”, “system”]. | Yes |
| max_tokens | Maximum number of tokens that can be generated. | Must be a positive integer. Defaults to 8196. | No |
| temperature | Sampling temperature. Controls the randomness of the generations; lower values produce less random completions. | Must be a float value between 0.0 and 2.0. Defaults to 0.7. | No |
| top_p | Alternative to temperature sampling. Only the most probable tokens with probabilities that add up to top_p or higher are kept for generation. | Must be a float value between 0.0 and 1.0. Defaults to 0.8. | No |
| top_k | The number of highest-probability vocabulary tokens to keep for top-k filtering. | Must be a positive integer. Defaults to 20. | No |
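The constraints in the table can be checked on the client side before a request is sent. The sketch below is one way to do that; the helper name `build_payload` is hypothetical, the range bounds are assumed to be inclusive, and a non-empty message list is assumed to be required. Optional fields that are left unset are omitted so the service defaults apply.

```python
ALLOWED_ROLES = {"user", "assistant", "system"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_positive_int(value):
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def build_payload(messages, max_tokens=None, temperature=None, top_p=None, top_k=None):
    """Validate fields against the documented constraints and build the request dict."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"invalid role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("each message needs a textual 'content' field")

    payload = {"messages": list(messages)}
    if max_tokens is not None:
        if not _is_positive_int(max_tokens):
            raise ValueError("max_tokens must be a positive integer")
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        if not _is_number(temperature) or not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        payload["temperature"] = temperature
    if top_p is not None:
        if not _is_number(top_p) or not 0.0 <= top_p <= 1.0:
            raise ValueError("top_p must be between 0.0 and 1.0")
        payload["top_p"] = top_p
    if top_k is not None:
        if not _is_positive_int(top_k):
            raise ValueError("top_k must be a positive integer")
        payload["top_k"] = top_k
    return payload


request = build_payload(
    [{"role": "user", "content": "Hola, ¿qué puedes hacer por mí?"}],
    temperature=0.5,
    max_tokens=1024,
)
```

The resulting dictionary can be serialized with `json.dumps` and sent with the `application/json` content type.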
Resources
Vendor resources
Support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.