
    Command A (H100)

    Sold by: Cohere 
    Deployed on AWS
    Command A is a highly efficient generative model that excels at agentic and multilingual use cases.

    Overview

    Command A is Cohere's flagship generative model, optimized for companies that require fast, secure, and highly performant AI solutions. Command A delivers maximum performance with minimal hardware costs compared to leading proprietary and open-weights models, such as GPT-4o and DeepSeek-V3. For private deployments, Command A excels on business-critical agentic and multilingual tasks, and can be deployed on just 2 GPUs, compared to competitive models that typically require as many as 32 GPUs. In head-to-head human evaluation across business, STEM, and coding tasks, Command A matches or outperforms its larger and slower competitors, while offering superior throughput and increased efficiency.

    Highlights

    • Command A adapts in real time and solves multi-step problems based on context and the objectives it is given, unlocking automation and assistance opportunities for businesses across the globe.
    • Command A is highly balanced: it performs strongly on critical use cases without sacrificing performance in other essential areas, making it a strong general-purpose model. This balanced starting point also makes Command A well suited to customization and fine-tuning for specific use cases as needed.
    • A model of this capability would typically be expensive to serve, but Command A is highly efficient, running on just two A100 or H100 GPUs.

    Details

    Sold by: Cohere
    Delivery method: Amazon SageMaker model
    Deployed on AWS

    Pricing

    Command A (H100)

    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    Usage costs (2)

    Dimension | Description | Cost/host/hour
    ml.g4dn.12xlarge Inference (Batch), Recommended | Model inference on the ml.g4dn.12xlarge instance type, batch mode | $34.25
    ml.p5.48xlarge Inference (Real-Time), Recommended | Model inference on the ml.p5.48xlarge instance type, real-time mode | $34.25
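
    As a rough illustration of the usage-based billing above: a real-time endpoint on ml.p5.48xlarge left running around the clock accrues software charges of about $34.25 × 24 hours × 30 days ≈ $24,660 per month, before SageMaker infrastructure costs, while a batch transform job incurs the hourly rate only for the duration of the job.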

    Vendor refund policy

    There are no refunds.


    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information


    Delivery details

    Amazon SageMaker model

    An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.

    Deploy the model on Amazon SageMaker AI using the following options:
    • Deploy the model as an API endpoint for your applications. When you send data to the endpoint, SageMaker processes it and returns the results in the API response. The endpoint runs continuously until you delete it. You're billed for software and SageMaker infrastructure costs while the endpoint runs. AWS Marketplace models don't support Amazon SageMaker Asynchronous Inference. For more information, see Deploy models for real-time inference.
    • Deploy the model to process batches of data stored in Amazon Simple Storage Service (Amazon S3). SageMaker runs the job, processes your data, and returns the results to Amazon S3. When complete, SageMaker stops the model. You're billed for software and SageMaker infrastructure costs only during the batch job. Duration depends on your model, instance type, and dataset size. For more information, see Batch transform for inference with Amazon SageMaker AI.
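
    Both deployment options can be exercised with the SageMaker Python SDK. The sketch below is a minimal illustration, assuming you have subscribed to this listing; the model package ARN, S3 paths, and endpoint name are placeholders, and the instance types are taken from the pricing table above.

    # Minimal sketch: deploy the model package for real-time or batch inference.
    # Placeholders: model package ARN, S3 paths, endpoint name.
    import sagemaker
    from sagemaker import ModelPackage

    session = sagemaker.Session()
    role = sagemaker.get_execution_role()  # or pass an IAM role ARN with SageMaker permissions

    model = ModelPackage(
        role=role,
        model_package_arn="arn:aws:sagemaker:us-east-1:123456789012:model-package/command-a-example",  # placeholder
        sagemaker_session=session,
    )

    # Option 1: real-time endpoint (software and infrastructure billed while it runs).
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.p5.48xlarge",
        endpoint_name="command-a-endpoint",
    )

    # Option 2: batch transform over JSON records in S3 (billed only while the job runs).
    transformer = model.transformer(
        instance_count=1,
        instance_type="ml.g4dn.12xlarge",
        output_path="s3://my-bucket/command-a/output/",  # placeholder bucket
    )
    transformer.transform(
        data="s3://my-bucket/command-a/input/",  # placeholder bucket
        content_type="application/json",
    )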
    Version release notes

    We've updated our SageMaker integration with a major version release for Embed and Rerank models, including notebook updates. The "/invocation" endpoint now defaults to API V2, ensuring a seamless transition to the latest version. Please see the notebook on how to use this model with the API update: https://github.com/cohere-ai/cohere-aws/blob/main/notebooks/sagemaker/Command%20Models.ipynb 

    New features:
    • API version control: you can now specify the API version (v1 or v2) in the endpoint URL, providing greater flexibility and control over API interactions.

    Bug fixes:
    • Billing token issue: resolved an issue where billing tokens were consistently returned as 0 for embed requests.
    • Image processing error: addressed a problem where the inference server failed to process valid base64 image URIs, resulting in "failed to parse image" errors. This issue was specific to the inference server and did not affect other routes.

    Additional details

    Inputs

    Summary

    The model accepts JSON requests with parameters that can be used to control the generated text. See the example and field descriptions below.

    Input MIME type
    application/json
    { ""messages"" = [ {""role"": ""SYSTEM"", ""content"": ""You are a helpful assistant""}, {""role"": ""USER"", ""content"": ""What is an interesting new role in AI if I don't have an ML background?""}, {""role"": ""ASSISTANT"", ""content"": ""You could explore being a prompt engineer!""}, {""role"": ""USER"", ""content"": ""What are some skills I should have?""}, ], }
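
    For reference, a request with this shape can be sent to a deployed real-time endpoint using the boto3 SageMaker runtime client. This is a minimal sketch; the endpoint name is a placeholder for whatever name you chose at deployment.

    import json
    import boto3

    runtime = boto3.client("sagemaker-runtime")

    payload = {
        "messages": [
            {"role": "SYSTEM", "content": "You are a helpful assistant"},
            {"role": "USER", "content": "What is an interesting new role in AI if I don't have an ML background?"},
            {"role": "ASSISTANT", "content": "You could explore being a prompt engineer!"},
            {"role": "USER", "content": "What are some skills I should have?"},
        ],
    }

    response = runtime.invoke_endpoint(
        EndpointName="command-a-endpoint",  # placeholder; use your endpoint name
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    print(json.loads(response["Body"].read()))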

    Input data descriptions

    The following fields are supported in the request body for real-time inference and batch transform.

    messages (text; required)
    A list of chat messages in chronological order, representing a conversation between the user and the model. Messages can come from the User, Assistant, Tool, and System roles.

    documents (text; optional)
    A list of relevant documents that the model can cite to generate a more accurate reply. Each document is either a string or a document object with content and metadata.

    tools (text; optional)
    A list of available tools (functions) that the model may suggest invoking before producing a text response. When tools is passed (without tool_results), the text content in the response will be empty and the tool_calls field in the response will be populated with a list of tool calls that need to be made. If no calls need to be made, the tool_calls array will be empty.

    citation_options (text; optional)
    Options for controlling citation generation.

    logprobs (categorical: TRUE, FALSE; optional)
    Defaults to FALSE. When set to TRUE, the log probabilities of the generated tokens are included in the response.

    stop_sequences (text; optional)
    A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any string in the list, it stops generating tokens and returns the generated text up to that point, not including the stop sequence.

    strict_tools (categorical: TRUE, FALSE; optional)
    When set to TRUE, tool calls in the Assistant message are forced to follow the tool definition strictly. Note: the first few requests with a new set of tools take longer to process.

    tool_choice (categorical: REQUIRED, NONE; optional)
    Controls whether the model is forced to use a tool when answering. When REQUIRED is specified, the model must use at least one of the user-defined tools, and the tools parameter must be passed in the request. When NONE is specified, the model must not use any of the specified tools and gives a direct response. If tool_choice isn't specified, the model is free to choose whether to use the specified tools.

    stream (categorical: TRUE, FALSE; optional)
    Defaults to FALSE. When TRUE, the response is an SSE stream of events. Streaming is useful for user interfaces that render the response piece by piece as it is generated.

    safety_mode (categorical: CONTEXTUAL, STRICT, OFF; optional)
    Selects the safety instruction inserted into the prompt. Defaults to CONTEXTUAL. When OFF is specified, the safety instruction is omitted. Safety modes are not yet configurable in combination with the tools, tool_results, and documents parameters.
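
    Putting a few of these fields together, the payload below is a minimal sketch of a grounded-generation request; the document strings and stop sequence are illustrative only, and the body can be sent with invoke_endpoint (or invoke_endpoint_with_response_stream when stream is true) as shown earlier.

    import json

    payload = {
        "messages": [
            {"role": "USER", "content": "Summarize our refund policy for a customer."}
        ],
        # Plain-string documents; document objects with content and metadata
        # are also accepted, per the field description above.
        "documents": [
            "Refunds are issued within 30 days of purchase with a valid receipt.",
            "Digital goods are refundable only if the download was never started.",
        ],
        "stop_sequences": ["--END--"],  # up to 5 stop strings
        "safety_mode": "CONTEXTUAL",    # CONTEXTUAL, STRICT, or OFF
        "logprobs": False,              # include token log probabilities when True
        "stream": False,                # set True for an SSE stream of events
    }
    body = json.dumps(payload)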

    Support

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced technical support engineers. The service helps customers of all sizes and technical abilities to successfully use the products and features provided by Amazon Web Services.
