Author: Kazuki Fujii

Training Llama 3.3 Swallow: A Japanese sovereign LLM on Amazon SageMaker HyperPod

The Institute of Science Tokyo has successfully trained Llama 3.3 Swallow, a 70-billion-parameter large language model (LLM) with enhanced Japanese capabilities, using Amazon SageMaker HyperPod. The model demonstrates superior performance on Japanese language tasks, outperforming GPT-4o-mini and other leading models. This technical report details the training infrastructure, optimizations, and best practices developed during the project.