
Overview
Palmyra-Fin-70B-32K is a model built by Writer specifically to meet the needs of the financial industry. It is a leading LLM on financial benchmarks, outperforming other large language models in various financial tasks and evaluations.
Highlights
- **Palmyra-Fin-70B-32K** is a state-of-the-art model for financial applications, achieving 100% accuracy on needle-in-haystack tasks across its entire 32,768 token context window. It's the first model to pass the CFA Level III test with a 73% score, surpassing the average passing score of 60%. On the long-fin-eval benchmark, it outperforms both open-source and proprietary models, demonstrating superior capabilities in processing and synthesizing information from lengthy financial documents.
- With its extended **32,768** token context window, Palmyra-Fin-70B-32K excels at analyzing and summarizing complex financial reports, market data, and economic indicators. It enhances financial decision-making through advanced entity recognition and deep understanding of financial terminology. The model's long-context capabilities make it ideal for in-depth financial research, investment analysis, and knowledge discovery from extensive financial sources.
Details
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g5.48xlarge Inference (Batch), recommended | Model inference on the ml.g5.48xlarge instance type, batch mode | $57.08 |
| ml.p4d.24xlarge Inference (Real-Time), recommended | Model inference on the ml.p4d.24xlarge instance type, real-time mode | $57.08 |
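As a rough illustration of how the per-host hourly rate adds up, the sketch below multiplies the listed $57.08 rate by a host count and a number of hours. The function name `estimate_cost` is hypothetical, and the result covers only the rate shown in the table; any separate AWS infrastructure charges are not included.

```python
from decimal import Decimal

# Hourly rate taken from the pricing table (USD per host per hour).
HOURLY_RATE_USD = Decimal("57.08")


def estimate_cost(hosts: int, hours: float) -> Decimal:
    """Estimate the listed charge for `hosts` instances running `hours` hours.

    Hypothetical helper for illustration; it only multiplies the listed rate.
    """
    if hosts < 0 or hours < 0:
        raise ValueError("hosts and hours must be non-negative")
    total = HOURLY_RATE_USD * hosts * Decimal(str(hours))
    # Round to whole cents.
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Two hosts running for one full day.
    print(estimate_cost(hosts=2, hours=24))  # prints 2739.84
```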
Vendor refund policy
All fees are non-refundable and non-cancellable except as required by law.
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
Version release notes
Initial release: Batch transform is not supported.
Additional details
Inputs
- Summary: The model accepts JSON requests with parameters that control the generated text. See the field descriptions below.
- Input MIME type: application/json
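A minimal sketch of sending such a JSON request to a deployed real-time endpoint is shown here. It assumes a boto3 `sagemaker-runtime` client, whose `invoke_endpoint` call takes `EndpointName`, `ContentType`, and `Body`. The helper name `invoke_model`, the endpoint name, and the assumption that a non-streaming reply is a single JSON document are illustrative and not confirmed by this listing.

```python
import json


def invoke_model(runtime_client, endpoint_name: str, payload: dict) -> dict:
    """Send a JSON request to a SageMaker endpoint and decode the reply.

    `runtime_client` is expected to behave like
    boto3.client("sagemaker-runtime"). Assumes a non-streaming request
    whose response body is one JSON document.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # the input MIME type listed above
        Body=json.dumps(payload),
    )
    # The response "Body" is a stream-like object; read it fully, then parse.
    return json.loads(response["Body"].read())


# Example usage (requires boto3, AWS credentials, and a deployed endpoint;
# the endpoint name is hypothetical):
#
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   reply = invoke_model(client, "my-palmyra-fin-endpoint",
#                        {"messages": "Summarize the key risks in this filing."})
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a billed endpoint.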
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| messages | Text input for the model to respond to. | Type: FreeText | Yes |
| stream | If set to `true`, the system returns a stream of JSON events as the response. The stream ends with a final event whose `event_type` is `"stream-end"` and which contains the full response. Streaming is particularly useful for user interfaces that display content incrementally as it is generated. | Default value: FALSE; Type: Categorical; Allowed values: TRUE, FALSE | No |
| temperature | Lower values make the response more predictable and less random. To increase variety in the output, you can instead raise the value of `top_p`. | Default value: 0.0; Type: Continuous; Minimum: 0; Maximum: 2 | No |
| max_tokens | The maximum number of tokens the model generates as part of the response. Note: a low value may result in incomplete generations. | Default value: 1024; Type: Integer; Minimum: 0; Maximum: 32768 | No |
| top_p | Use a lower value to ignore less probable options. | Default value: 0.99; Type: Continuous; Minimum: 0.01; Maximum: 0.99 | No |
| presence_penalty | Reduces repetitiveness of generated tokens. Similar to `frequency_penalty`, except that the penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies. | Default value: 0.0; Type: Continuous; Minimum: -2.0; Maximum: 2.0 | No |
| frequency_penalty | Reduces repetition in the output. Higher values apply a stronger penalty to tokens that have already appeared, based on their frequency in the prompt or previous generation. | Default value: 0.0; Type: Continuous; Minimum: -2.0; Maximum: 2.0 | No |
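The constraints in the table can be checked on the client side before a request is sent. The sketch below builds a request body and rejects out-of-range values. The helper name `build_request` is hypothetical. The listing describes `messages` only as free text, so any particular structure for it (for example a list of role/content objects) is an assumption to verify against the vendor's documentation.

```python
import json

# (minimum, maximum) bounds copied from the input field table.
NUMERIC_BOUNDS = {
    "temperature": (0, 2),
    "max_tokens": (0, 32768),
    "top_p": (0.01, 0.99),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def build_request(messages, stream=False, **params):
    """Build a JSON-serializable request body, enforcing the listed limits.

    Hypothetical helper: optional fields are included only when provided,
    so the documented server-side defaults apply otherwise.
    """
    if not messages:
        raise ValueError("messages is required")
    if not isinstance(stream, bool):
        raise TypeError("stream must be a boolean")

    payload = {"messages": messages}
    if stream:
        payload["stream"] = True

    for name, value in params.items():
        if name not in NUMERIC_BOUNDS:
            raise ValueError(f"unknown field: {name}")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
        if name == "max_tokens" and not isinstance(value, int):
            raise TypeError("max_tokens must be an integer")
        low, high = NUMERIC_BOUNDS[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    return payload


if __name__ == "__main__":
    body = build_request(
        "Summarize the main revenue drivers in this report.",
        temperature=0.2,
        max_tokens=512,
    )
    print(json.dumps(body, indent=2))
```

Validating locally surfaces a mistake such as `top_p=1.0` (above the listed maximum of 0.99) before the request reaches the endpoint.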
Support
Vendor support
Email support is available Monday to Friday at support@writer.com.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.