Introducing Llama 3.2
Introducing Llama 3.2 from Meta, a new generation of vision and lightweight models that fit on edge devices, enabling more personalized AI experiences. Llama 3.2 includes small and medium-sized vision LLMs (11B and 90B) that support image reasoning, and lightweight, text-only models (1B and 3B) built for on-device use cases. The new models are designed to be more accessible and efficient, with a focus on responsible innovation and system-level safety.
Benefits
Meet Llama
For over a decade, Meta has focused on putting tools into the hands of developers and fostering collaboration and advancement among developers, researchers, and organizations. Llama models are available in a range of parameter sizes, enabling developers to select the model that best fits their needs and inference budget. Llama models in Amazon Bedrock open up a world of possibilities because developers don't need to worry about scalability or managing infrastructure. Amazon Bedrock offers developers a simple, turnkey way to get started with Llama.
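As a sketch of what that turnkey path looks like, the example below calls a Llama model through the Bedrock Converse API with boto3. The model ID and Region are illustrative assumptions; check the Bedrock console for the identifiers enabled in your account.

    # Minimal sketch: invoke a Llama model on Amazon Bedrock via the Converse API.
    # The model ID and Region below are illustrative assumptions; use the ones
    # enabled for your account.
    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

    response = bedrock_runtime.converse(
        modelId="us.meta.llama3-2-90b-instruct-v1:0",  # assumed ID; any available Llama model works
        messages=[
            {"role": "user", "content": [{"text": "Summarize the benefits of on-device AI in two sentences."}]}
        ],
        inferenceConfig={"maxTokens": 256, "temperature": 0.5},
    )

    print(response["output"]["message"]["content"][0]["text"])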
Use cases
Llama models excel at image understanding and visual reasoning, language nuances, contextual understanding, and complex tasks like visual data analysis, image captioning, dialogue generation, and translation, and can handle multi-step tasks effortlessly. Additional use cases Llama models are a great fit for include sophisticated visual reasoning and understanding, image-text retrieval, visual grounding, document visual question answering, text summarization and accuracy, text classification, sentiment analysis and nuanced reasoning, language modeling, dialogue systems, code generation, and following instructions.
Model versions
Llama 3.2 90B
Multimodal model that takes both text and image inputs and outputs text. Ideal for applications requiring sophisticated visual intelligence, such as image analysis, document processing, multimodal chatbots, and autonomous systems.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: No
Supported use cases: Image understanding, visual reasoning, and multimodal interaction, enabling advanced applications such as image captioning, image-text retrieval, visual grounding, visual question answering, and document visual question answering, with a unique ability to reason and draw conclusions from visual and textual inputs.
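As an illustration of the image understanding use cases above, here is a minimal sketch of sending an image plus a text prompt to Llama 3.2 90B through the Bedrock Converse API; the model ID and file name are illustrative assumptions.

    # Minimal sketch: image reasoning with Llama 3.2 90B via the Converse API.
    # "chart.png" and the model ID are illustrative assumptions.
    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

    with open("chart.png", "rb") as f:
        image_bytes = f.read()

    response = bedrock_runtime.converse(
        modelId="us.meta.llama3-2-90b-instruct-v1:0",
        messages=[
            {
                "role": "user",
                "content": [
                    {"image": {"format": "png", "source": {"bytes": image_bytes}}},
                    {"text": "Describe the key trend shown in this chart."},
                ],
            }
        ],
    )

    print(response["output"]["message"]["content"][0]["text"])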
Llama 3.2 11B
Multimodal model that takes both text and image inputs and outputs text. Ideal for applications requiring sophisticated visual intelligence, such as image analysis, document processing, and multimodal chatbots.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: No
Supported use cases: Image understanding, visual reasoning, and multimodal interaction, enabling advanced applications such as image captioning, image-text retrieval, visual grounding, visual question answering, and document visual question answering.
Llama 3.2 3B
Text-only lightweight model built to deliver highly accurate and relevant results. Designed for applications that require low-latency inferencing with limited computational resources, particularly on edge devices. Ideal for query and prompt rewriting, mobile AI-powered writing assistants, and customer service chatbots.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: No
Supported use cases: Advanced text generation, summarization, sentiment analysis, emotional intelligence, contextual understanding, and common sense reasoning.
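A minimal sketch of the query and prompt rewriting use case with Llama 3.2 3B follows, assuming the illustrative model ID below and a small token budget to keep latency low.

    # Minimal sketch: query rewriting with Llama 3.2 3B (model ID is an assumption).
    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

    user_query = "cheap flights nyc to sf next weekend kid friendly"

    response = bedrock_runtime.converse(
        modelId="us.meta.llama3-2-3b-instruct-v1:0",
        messages=[
            {
                "role": "user",
                "content": [{"text": "Rewrite this search query as one clear, complete question:\n" + user_query}],
            }
        ],
        inferenceConfig={"maxTokens": 64, "temperature": 0.2},
    )

    print(response["output"]["message"]["content"][0]["text"])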
Llama 3.2 1B
Text-only lightweight model built to deliver fast and accurate responses. Ideal for edge devices and mobile applications. The model enables on-device AI capabilities while preserving user privacy and minimizing latency.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: No
Supported use cases: Multilingual dialogue use cases such as personal information management, multilingual knowledge retrieval, and rewriting tasks.
Llama 3.1 8B
Ideal for environments with limited computational power and resources, faster training times, and edge devices.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: Yes
Supported use cases: Text summarization, text classification, sentiment analysis, and language translation.
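Because fine-tuning is supported for this model, here is a minimal sketch of starting a customization job with the Bedrock API. The role ARN, S3 URIs, base model identifier, job names, and hyperparameter values are illustrative assumptions and must be replaced with values from your own account.

    # Minimal sketch: fine-tune Llama 3.1 8B with a Bedrock model customization job.
    # All ARNs, S3 paths, names, and hyperparameter values are illustrative assumptions.
    import boto3

    bedrock = boto3.client("bedrock", region_name="us-east-1")

    bedrock.create_model_customization_job(
        jobName="llama31-8b-summarization-ft",
        customModelName="llama31-8b-summarizer",
        roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",
        baseModelIdentifier="meta.llama3-1-8b-instruct-v1:0",
        customizationType="FINE_TUNING",
        trainingDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/train.jsonl"},
        outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},
        hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.0001"},
    )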
Llama 3.1 70B
Ideal for content creation, conversational AI, language understanding, research and development, and enterprise applications. With new latency-optimized inference capabilities available in public preview, this model sets a new performance benchmark for AI solutions that process extensive text inputs, enabling applications to respond more quickly and handle longer queries more efficiently.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: Yes
Supported use cases: Text summarization, text classification, sentiment analysis, and language translation.
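The latency-optimized inference mentioned above can be requested per call. The sketch below assumes the Converse API's performance configuration setting along with an illustrative model ID and Region; actual availability depends on the Regions covered by the public preview.

    # Minimal sketch: request latency-optimized inference for Llama 3.1 70B.
    # Model ID, Region, and availability are assumptions tied to the public preview.
    import boto3

    bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-2")

    response = bedrock_runtime.converse(
        modelId="us.meta.llama3-1-70b-instruct-v1:0",
        messages=[
            {"role": "user", "content": [{"text": "Classify the sentiment of: 'The update made everything slower.'"}]}
        ],
        performanceConfig={"latency": "optimized"},
    )

    print(response["output"]["message"]["content"][0]["text"])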
Llama 3.1 405B
Ideal for enterprise-level applications, research and development, synthetic data generation, and model distillation. With latency-optimized inference capabilities available in public preview, this model delivers exceptional performance and scalability, enabling organizations to accelerate their AI initiatives while maintaining high-quality outputs across diverse use cases.
Max tokens: 128K
Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Fine-tuning supported: Coming soon
Supported use cases: General knowledge, long-form text generation, machine translation, enhanced contextual understanding, advanced reasoning and decision making, better handling of ambiguity and uncertainty, increased creativity and diversity, steerability, math, tool use, multilingual translation, and coding.
Llama 3 8B
Ideal for environments with limited computational power and resources, faster training times, and edge devices.
Max tokens: 8K
Languages: English
Fine-tuning supported: No
Supported use cases: Text summarization, text classification, sentiment analysis, and language translation.
Llama 3 70B
Ideal for content creation, conversational AI, language understanding, research and development, and enterprise applications.
Max tokens: 8K
Languages: English
Fine-tuning supported: No
Supported use cases: Text summarization and accuracy, text classification and nuance, sentiment analysis and nuanced reasoning, language modeling, dialogue systems, code generation, and following instructions.
Llama 2 13B
Fine-tuned model with 13B parameters. Suitable for smaller-scale tasks such as text classification, sentiment analysis, and language translation.
Max tokens: 4K
Languages: English
Fine-tuning supported: Yes
Supported use cases: Assistant-like chat
Llama 2 70B
Fine-tuned model with 70B parameters. Suitable for larger-scale tasks such as language modeling, text generation, and dialogue systems.
Max tokens: 4K
Languages: English
Fine-tuning supported: Yes
Supported use cases: Assistant-like chat