Reviews from AWS customers

3 AWS reviews

External reviews

5 reviews

External reviews are not included in the AWS star rating for the product.


3-star reviews

    Mdpman 김

Automation has accelerated agent workflows and now needs broader connections for enterprise data

  • May 02, 2026
  • Review provided by PeerSpot

What is our primary use case?

I chose Fireworks AI because, for requirements around low latency and enterprise automation, it was able to provide expert-level capability. In the projects where I applied Fireworks AI, I built a system covering memory, autonomous collaboration, and controlled generation. After introducing Fireworks AI's high-speed inference engine, communication between agents became about twice as fast as before. The function calling capability that lets agents invoke external tools was very stable, and I confirmed that complex workflows that query and reflect enterprise data in real time could be implemented reliably. This was the decisive differentiator that allowed us to practically apply AI automation in enterprise environments.

A concrete example of how Fireworks AI helped is an access-control system. We developed an agent so that when people enter a factory inside the company, we receive their medical examination documents and, based on the information in those documents, determine whether to approve or deny their access registration or entry. By automating that, we quickly verify employees' health conditions and can impose entry restrictions where needed.
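As the reviewer describes it, the agent's final approve/deny step reduces to checks over fields extracted from the medical document. A minimal sketch of what that decision logic might look like (the field names, thresholds, and rules here are hypothetical illustrations, not details from the reviewer's actual system):

```python
# Hypothetical sketch of the approve/deny step described above.
# Field names, thresholds, and decision rules are illustrative
# assumptions, not taken from the reviewer's system.

REQUIRED_FIELDS = {"employee_id", "exam_date", "blood_pressure", "infectious_disease"}

def decide_entry(exam: dict) -> tuple:
    """Return ("approve" | "deny", reason) for an extracted exam record."""
    missing = REQUIRED_FIELDS - exam.keys()
    if missing:
        # Incomplete documents are denied rather than guessed at.
        return "deny", f"missing fields: {sorted(missing)}"
    if exam["infectious_disease"]:
        return "deny", "positive infectious-disease result"
    systolic, diastolic = exam["blood_pressure"]
    if systolic >= 180 or diastolic >= 110:
        return "deny", "blood pressure above safety threshold"
    return "approve", "all checks passed"
```

In a real pipeline, the LLM would handle the unstructured part (reading the two-to-three-page scan into a structured record), while a deterministic rule layer like this one makes the final entry decision auditable.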

What is most valuable?

The most satisfying feature of Fireworks AI was the combination of efficient inference speed and stable function calling. The core of an autonomous agent system is the model's ability to interact with external tools in real time. Fireworks AI is not just fast at plain text generation; its real innovation is reducing the latency incurred when agents work through complex tasks and, along the way, choose and call tools.

Current LLMs have evolved from traditional foundation models into hybrid models, and while reasoning capability has improved, response time has become slower because they rely on techniques like Chain-of-Thought (CoT). To keep responses fast despite that, the handling of external function calls and related steps must be well optimized; otherwise the final answer does not come back quickly. By optimizing this through Fireworks, we were able to speed up the response time, which is a weakness of existing LLMs. A major advantage is that customers and business users can obtain answers quickly through text.
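The tool-calling loop the reviewer describes can be sketched against Fireworks AI's OpenAI-compatible chat completions API: the request declares tools, the model responds with tool calls, and the application executes them and feeds results back. The endpoint URL matches Fireworks' public docs at the time of writing, but the `lookup_access_record` tool and its handler are hypothetical examples:

```python
import json

# Sketch of an agent tool-calling round trip. The Fireworks endpoint
# below is its documented OpenAI-compatible chat completions URL; the
# tool name and handler are hypothetical, for illustration only.
FIREWORKS_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

TOOLS = [{
    "type": "function",
    "function": {
        "name": "lookup_access_record",
        "description": "Fetch an employee's latest access/health record.",
        "parameters": {
            "type": "object",
            "properties": {"employee_id": {"type": "string"}},
            "required": ["employee_id"],
        },
    },
}]

def lookup_access_record(employee_id: str) -> dict:
    # Stand-in for a real enterprise-data query.
    return {"employee_id": employee_id, "status": "cleared"}

HANDLERS = {"lookup_access_record": lookup_access_record}

def dispatch_tool_calls(message: dict) -> list:
    """Execute each tool call the model requested; return tool-role replies
    to append to the conversation before the next model turn."""
    replies = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        args = json.loads(fn["arguments"])          # model sends JSON-encoded args
        output = HANDLERS[fn["name"]](**args)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return replies
```

The latency the review emphasizes accumulates per round trip: each tool call adds a model turn, so a fast inference engine directly shortens the whole agent loop, not just single-response time.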

In terms of metrics: a health checkup record is at least two to three pages of PDF or scanned files, so a human reading it takes roughly one to three minutes. Using LLMs and Fireworks, we built an integrated system that can make a determination in about thirty seconds to one minute and then pass that result on to other systems.

What needs improvement?

Beyond the function calling we are using now, if Fireworks AI offered a wider variety of connections as part of our RAG system, an even better setup would be possible. Fireworks is based on tool calling, so it needs to add more kinds of connections to enable faster data retrieval and optimization.

Although various methodologies for optimizing and measuring LLM usage are being discussed, inside enterprises what actually matters is measuring work-handling capability and processing speed. Based on that, and through interviews with business-side staff, we measured the speed improvements in a somewhat indirect manner.

For how long have I used the solution?

I have been using Fireworks AI for about two years.

What was our ROI?

The companies we usually work with are enterprise-level companies in Korea, so we cannot provide actual company names or detailed data. However, when customers have certain requirements, we can quickly create agents for them, and Fireworks has helped a lot in connecting those agents through protocols like A2A and MCP. As a result, time spent on simple repetitive tasks fell by over sixty percent, and task processing speed improved by about thirty percent, which naturally led to cost reduction and optimization. Concretely, where one person used to complete one unit of work, they can now handle 1.5 or more units.

What other advice do I have?

Based on my experience, I rate Fireworks AI seven out of ten; because various connections are still somewhat lacking, I deducted three points. Since we are basically a CSP partner, we use a public cloud as our base. However, depending on their needs, some enterprise-level customers want to apply it via their own in-house or local LLM, so a hybrid concept is also under consideration. Our company fundamentally aims for a multi-cloud approach, using GCP, AWS, and Azure together. Currently I am mainly focused on the Azure side, so we deal only with Azure-based systems.


    Pratiksh S.

Review for Fireworks AI

  • September 05, 2024
  • Review provided by G2

What do you like best about the product?
They have categorised the models according to users' requirements, and users pay only for the products they use. No extra cost.
What do you dislike about the product?
They need to use more dependable parameters, and they should increase their serverless model limits.
What problems is the product solving and how is that benefiting you?
AI is booming in the industry, and with Fireworks it feels easy to deploy models to organisational servers. Additionally, they offer Meta Llama models.

