
Overview
Syn is a high-performance, lightweight 10B-class Japanese LLM co-developed by Upstage and Karakuri for enterprise AI applications. Combining Upstage's global AI expertise with Karakuri's advanced Japanese NLP, Syn offers enterprise-grade quality, reliability, and adaptability.
Powered by AWS Trainium, Syn ensures exceptional accuracy, seamless fine-tuning, and cost efficiency, making it a future-ready AI solution. Designed for industry- and domain-specific tasks, it excels in Finance, Law, Healthcare, and other key verticals while optimizing structured text processing. With its lightweight architecture, Syn reduces infrastructure costs while maintaining top-tier performance, enabling businesses to deploy AI efficiently and scale with ease.
Highlights
- **Enterprise-Optimized Japanese AI**: Developed specifically for Japanese enterprises, ensuring high precision in business applications.
- **Domain-specific Adaptation & Fine-Tunability**: Highly adaptable for critical industries such as Finance, Law, Healthcare, and Manufacturing, allowing seamless fine-tuning for specific business needs.
- **Cost-Effective AI Deployment**: Optimized with AWS Trainium, Syn maximizes performance while minimizing operational costs, ensuring efficient and scalable AI adoption.
- **Advanced Multilingual Capabilities**: Built on Upstage's Solar Mini foundation, offering seamless integration with key languages, including English, Korean, and Japanese.
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.m5.12xlarge Inference (Batch), Recommended | Model inference on the ml.m5.12xlarge instance type, batch mode | $0.80 |
| ml.g5.12xlarge Inference (Real-Time), Recommended | Model inference on the ml.g5.12xlarge instance type, real-time mode | $0.80 |
| ml.g6.12xlarge Inference (Real-Time) | Model inference on the ml.g6.12xlarge instance type, real-time mode | $0.80 |
| ml.g4dn.12xlarge Inference (Real-Time) | Model inference on the ml.g4dn.12xlarge instance type, real-time mode | $0.80 |
| ml.g5.24xlarge Inference (Real-Time) | Model inference on the ml.g5.24xlarge instance type, real-time mode | $0.80 |
Vendor refund policy
We do not currently offer refunds.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
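As a sketch, a subscribed model package can be turned into a SageMaker model with the `CreateModel` API. The ARNs and model name below are placeholders, not values from this listing:

```python
# Placeholders for illustration: substitute the model package ARN from your
# subscription and an IAM execution role ARN from your own account.
MODEL_PACKAGE_ARN = "arn:aws:sagemaker:us-east-1:123456789012:model-package/syn-example"
ROLE_ARN = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

def build_create_model_request(model_name: str) -> dict:
    """Build the request for sagemaker.create_model() from a model package."""
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": ROLE_ARN,
        "PrimaryContainer": {"ModelPackageName": MODEL_PACKAGE_ARN},
        # AWS Marketplace model packages run with network isolation enabled.
        "EnableNetworkIsolation": True,
    }

request = build_create_model_request("syn-realtime-model")
# The request would then be sent with boto3 (omitted to keep this self-contained):
# import boto3
# boto3.client("sagemaker").create_model(**request)
# ...followed by create_endpoint_config() and create_endpoint() for real-time use.
```

After the model is created, an endpoint configuration referencing one of the supported instance types above completes the real-time deployment.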
Version release notes
Initial release of Syn (Solar Japanese).
Additional details
Inputs
- Summary
The request payload is compatible with OpenAI's Chat Completions endpoint.
- Limitations for input type
- Syn (solar-japanese) supports a maximum context length of 32k (32,768) tokens, shared between the input and generated tokens.
- Input MIME type
- application/json
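As an illustration, a minimal Chat Completions-style request body can be built as follows; the endpoint name is a hypothetical placeholder, and only `messages` is required:

```python
import json

# Hypothetical endpoint name for illustration.
ENDPOINT_NAME = "syn-solar-japanese-endpoint"

# OpenAI Chat Completions-style payload; "messages" is the only required field.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "日本の首都はどこですか？"},
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}
body = json.dumps(payload, ensure_ascii=False)

# The body would be sent with the SageMaker runtime client
# (commented out to keep this sketch self-contained):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName=ENDPOINT_NAME,
#     ContentType="application/json",  # the supported input MIME type
#     Body=body,
# )
# print(response["Body"].read().decode("utf-8"))
```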
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| messages | A list of messages, each containing a role and content. The role must be one of [system, user, assistant]. | Type: FreeText | Yes |
| frequency_penalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | Default: 0.0; Type: Continuous; Min: -2.0; Max: 2.0 | No |
| presence_penalty | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | Default: 0.0; Type: Continuous; Min: -2.0; Max: 2.0 | No |
| max_tokens | The maximum number of tokens that can be generated in the chat completion. Syn supports a maximum 32k (32,768) context for input and generated tokens combined. | Default: 16; Type: Integer; Min: 0; Max: 32768 | No |
| temperature | The sampling temperature to use, between 0 and 2. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. | Default: 1.0; Type: Continuous; Min: 0.0; Max: 2.0 | No |
| top_p | An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. | Default: 1.0; Type: Continuous; Min: 0.0; Max: 1.0 | No |
| stream | Specifies whether to stream the response. | Default: false; Type: Categorical; Allowed values: true, false | No |
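The numeric constraints in the table above can be checked client-side before sending a request. The validator below is an illustrative helper, not part of the model's API:

```python
# Illustrative client-side check of the parameter ranges listed in the table.
PARAM_SPECS = {
    "frequency_penalty": {"min": -2.0, "max": 2.0, "default": 0.0},
    "presence_penalty": {"min": -2.0, "max": 2.0, "default": 0.0},
    "max_tokens": {"min": 0, "max": 32768, "default": 16},
    "temperature": {"min": 0.0, "max": 2.0, "default": 1.0},
    "top_p": {"min": 0.0, "max": 1.0, "default": 1.0},
}

def validate_params(params: dict) -> dict:
    """Fill in documented defaults and reject out-of-range values."""
    merged = {name: spec["default"] for name, spec in PARAM_SPECS.items()}
    for name, value in params.items():
        spec = PARAM_SPECS.get(name)
        if spec is None:
            raise ValueError(f"unknown parameter: {name}")
        if not spec["min"] <= value <= spec["max"]:
            raise ValueError(f"{name}={value} outside [{spec['min']}, {spec['max']}]")
        merged[name] = value
    return merged

checked = validate_params({"temperature": 0.2, "max_tokens": 1024})
```

Unspecified parameters fall back to the defaults from the table, so `checked` here carries `top_p=1.0` and both penalties at `0.0` alongside the two supplied values.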
Resources
Vendor resources
Support
Vendor support
Contact us for model fine-tuning and enterprise integration inquiries.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.