
Overview
Palmyra-Med-70b-32k, created by Writer, builds on the foundation of Palmyra-Med-70b, offering an extended context length to meet the needs of the healthcare industry. It is the leading LLM on biomedical benchmarks, with an average score of 85.87%, outperforming GPT-4, Claude Opus, Gemini, the Med-PaLM-2 base model, and a medically trained human test-taker.
Highlights
- **Palmyra-Med-70B-32k**, developed by **Writer**, is the leading LLM on biomedical benchmarks with an average score of 85.87%. It outperforms larger models like GPT-4, Gemini, and Med-PaLM-1 across 9 diverse biomedical datasets, showcasing state-of-the-art results in critical areas such as Clinical KG, Medical Genetics, and PubMedQA. This exceptional performance underscores its robust grasp of biomedical knowledge and its potential to significantly advance healthcare applications.
- With an extended context window of **32,768 tokens**, Palmyra-Med-70B-32k excels in processing lengthy medical documents and complex healthcare scenarios. This capability, combined with specialized training on high-quality biomedical data, enables superior performance in analyzing clinical notes, summarizing EHR data, and extracting key information from research articles. These features make it an invaluable tool for enhancing clinical decision-making, supporting medical research, and advancing healthcare informatics.
Details

Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g5.48xlarge Inference (Batch), recommended | Model inference on the ml.g5.48xlarge instance type, batch mode | $57.08 |
| ml.p4d.24xlarge Inference (Real-Time), recommended | Model inference on the ml.p4d.24xlarge instance type, real-time mode | $57.08 |
Vendor refund policy
All fees are non-refundable and non-cancellable except as required by law.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
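As a rough sketch of the deployment flow described above, the snippet below builds the request payloads for the three SageMaker calls (`create_model`, `create_endpoint_config`, `create_endpoint`) that stand up a real-time endpoint from a Marketplace model package. The model package ARN, role ARN, and resource names are placeholders; the real ARN comes from your AWS Marketplace subscription.

```python
# Hypothetical identifier -- the real model package ARN comes from your
# AWS Marketplace subscription.
MODEL_PACKAGE_ARN = "arn:aws:sagemaker:us-east-1:123456789012:model-package/example"

def build_deploy_requests(model_name, endpoint_name, role_arn,
                          instance_type="ml.p4d.24xlarge"):
    """Return keyword arguments for the three SageMaker calls that deploy a
    Marketplace model package to a real-time endpoint."""
    create_model = {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "Containers": [{"ModelPackageName": MODEL_PACKAGE_ARN}],
        # Marketplace model packages require network isolation.
        "EnableNetworkIsolation": True,
    }
    create_endpoint_config = {
        "EndpointConfigName": f"{endpoint_name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    }
    create_endpoint = {
        "EndpointName": endpoint_name,
        "EndpointConfigName": f"{endpoint_name}-config",
    }
    return create_model, create_endpoint_config, create_endpoint
```

With boto3, each dictionary would be passed as keyword arguments to the corresponding `sagemaker` client call (`client.create_model(**create_model)`, and so on); this sketch only assembles the payloads and makes no AWS calls.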
Version release notes
Initial release: Batch transform is not supported.
Additional details
Inputs
- Summary: The model accepts JSON requests with parameters that control the generated text. See the field descriptions below.
- Input MIME type: `application/json`
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| messages | Text input for the model to respond to. | Type: FreeText | Yes |
| stream | If set to `true`, the system returns a stream of JSON events as the response. The stream concludes with a final event marked by an `event_type` of `"stream-end"`, which contains the full response. Streaming is particularly useful for user interfaces that display content incrementally as it is generated. | Default value: `false`<br>Type: Categorical<br>Allowed values: `true`, `false` | No |
| temperature | Controls randomness: lower values make the response more predictable and less random, while higher values increase variety and unpredictability. See also `top_p` for an alternative way to control output diversity. | Default value: 0.0<br>Type: Continuous<br>Minimum: 0<br>Maximum: 2 | No |
| max_tokens | The maximum number of tokens the model will generate as part of the response. Note: setting a low value may result in incomplete generations. | Default value: 1024<br>Type: Integer<br>Minimum: 0<br>Maximum: 32768 | No |
| top_p | Nucleus sampling: use a lower value to ignore less probable options. | Default value: 0.99<br>Type: Continuous<br>Minimum: 0.01<br>Maximum: 0.99 | No |
| presence_penalty | Reduces repetitiveness of generated tokens. Similar to `frequency_penalty`, except that the penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies. | Default value: 0<br>Type: Continuous<br>Minimum: -2.0<br>Maximum: 2.0 | No |
| frequency_penalty | Reduces repetition in the output. Higher values apply a stronger penalty to tokens that have already appeared, based on their frequency in the prompt or previous generation. | Default value: 0.0<br>Type: Continuous<br>Minimum: -2.0<br>Maximum: 2.0 | No |
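Putting the fields above together, a minimal request body might be assembled as follows. The chat-style `messages` array is an assumption based on common conventions (the table only describes the field as free text), and the range checks mirror the documented constraints.

```python
import json

def build_request(messages, temperature=0.0, max_tokens=1024, top_p=0.99,
                  presence_penalty=0.0, frequency_penalty=0.0, stream=False):
    """Assemble and validate a JSON request body using the documented fields."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if not 0 <= max_tokens <= 32768:
        raise ValueError("max_tokens must be in [0, 32768]")
    if not 0.01 <= top_p <= 0.99:
        raise ValueError("top_p must be in [0.01, 0.99]")
    for name, penalty in (("presence_penalty", presence_penalty),
                          ("frequency_penalty", frequency_penalty)):
        if not -2.0 <= penalty <= 2.0:
            raise ValueError(f"{name} must be in [-2.0, 2.0]")
    return json.dumps({
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
        "stream": stream,
    })

# Chat-style message list -- an assumption; the listing only says free text.
body = build_request(
    [{"role": "user", "content": "Summarize this clinical note: ..."}],
    temperature=0.2, max_tokens=512)
```

The serialized body would then be sent to the deployed endpoint with content type `application/json`, for example via the `sagemaker-runtime` `invoke_endpoint` API.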
Resources
Support
Vendor support
Email support services are available Monday through Friday: support@writer.com
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.