Speech-to-Text & Text-to-Speech GenAI API

Sold by: Deepgram

Deepgram, Language AI models to power your apps.

4.7

3 AWS reviews

4 external reviews

Overview

Deepgram voice AI models power your apps with world-class speech-to-text and domain-specific language models (DSLMs). Effortlessly accurate. Blazing fast. Enterprise-ready scale. Unbeatable pricing. Everything developers need to build with confidence and ship faster.

Deepgram Datasheet - https://drive.google.com/file/d/1YngGzUJZhnH8nj-ZFuhrSiaLD4w9HFd4/view?usp=sharing
Deepgram API Playground to try out all features and models (free tier) - https://playground.deepgram.com/?smart_format=true&language=en&model=nova
Deepgram Summarization (domain specific language model) - https://developers.deepgram.com/docs/summarization
Generative AI Demo with partners: OneReach.ai, Vonage and Deepgram Partner to Revolutionize Conversational AI - https://www.youtube.com/watch?v=CFTk0S6tGF8 (2min)
Introducing Nova-2: The Fastest, Most Accurate Speech-to-Text API (video 9min) - https://www.youtube.com/watch?v=PSaVX6ST-FM

For questions and custom quote options, reach out to us at aws@deepgram.com.

Highlights

Transcription (STT)
- 20x faster: Transcribe in real time, or process an hour of pre-recorded audio in about 12 seconds.
- <300ms latency: The fastest real-time transcription speeds for human-like conversational AI experiences, real-time analytics, and enablement.
- >90% accuracy: Deepgram leads the industry with the most accurate models on the market across use case categories.
Understanding
- Summarization
- Sentiment analysis
- Language translation
- Speaker diarization
- Language detection
- And more...
Custom Model Training
- Deepgram supports customer-specific custom model training so that your model meets your business objectives.
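
As a rough check on the speed figures above, the "hour of audio in about 12 seconds" claim can be restated as a multiple of real time. This is plain arithmetic on the numbers quoted in the listing, not a benchmark, and the function name is illustrative. The listing does not say what baseline "20x faster" refers to, so that figure is left out of the calculation.

```python
def speedup_over_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than real time the audio was processed."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# One hour of audio (3600 s) processed in about 12 s, as quoted in the listing.
factor = speedup_over_real_time(3600, 12)
print(f"{factor:.0f}x real time")
```

On those numbers, pre-recorded audio would be processed at roughly 300 times real-time speed.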

Details

Features and programs

Buyer guide

Gain valuable insights from real users who purchased this product, powered by PeerSpot.

Financing for AWS Marketplace purchases

AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

Pricing

Free trial

Try this product free according to the free trial terms set by the vendor.

Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.

Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

1-month contract (2)

Dimension	Description	Cost/month
Enterprise Offering	Custom Enterprise Offering	$10,000,000.00
Cost per Transcription Hour	Deepgram charges per transcription hour	$1,250.00

Vendor refund policy

Deepgram Terms of Service: https://deepgram.com/terms/

Custom pricing options

Request a private offer to receive a custom quote.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

Software as a Service (SaaS)

SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

Resources

Vendor resources

https://deepgram.com/product-overview/

Support

Vendor support

For sales, contracting, and usage inquiries, please email aws@deepgram.com

AWS infrastructure support

AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Product comparison

Updated weekly

Speech-to-Text & Text-to-Speech GenAI API

By Deepgram

AssemblyAI

By AssemblyAI

Customer-Led Conversational Assistant

By PolyAI

Accolades

Top

In Scheduling & Coordination, Speech Recognition, Sales & Marketing

Top

In Speech to Text, Customer Support, Speech Recognition

Top

100

In Natural Language Processing

Customer reviews

Sentiment is AI generated from actual customer reviews on AWS and G2

Reviews

Functionality

Ease of use

Customer service

Cost effectiveness

2 reviews

Insufficient data

85 reviews

Positive

0 reviews

Insufficient data

Overview

AI generated from product descriptions

Speech-to-Text & Text-to-Speech GenAI API (Deepgram)

Speech Recognition Speed

Real-time transcription with processing speed of 20x faster than traditional methods, capable of transcribing an hour of audio in approximately 12 seconds

Latency Performance

Ultra-low latency under 300 milliseconds for near-instantaneous speech-to-text conversion

Accuracy Metrics

Speech recognition accuracy exceeding 90% across multiple use case categories

Language Understanding Capabilities

Advanced natural language processing features including summarization, sentiment analysis, speaker diarization, language detection, and translation

Model Customization

Support for customer-specific custom model training to adapt speech recognition for unique business requirements

AssemblyAI

Speech Recognition

Advanced multilingual speech recognition with high accuracy and low word error rates

Language Processing

Support for 99+ languages with automatic language detection and custom vocabulary capabilities

Audio Intelligence

Comprehensive suite of AI models including speaker diarization, sentiment analysis, content moderation, and PII redaction

Large Language Model Integration

LeMUR framework for processing audio transcripts using advanced language model capabilities

Transcription Flexibility

Support for async and real-time transcription with multiple file type compatibility across 33 audio and video formats

Customer-Led Conversational Assistant (PolyAI)

Natural Language Understanding

Advanced proprietary Large Language Model (ConveRT) trained specifically for customer service applications

Speech Recognition Technology

Spoken language understanding system capable of processing diverse accents, dialects, and background noise

Conversational AI Architecture

Customer-led conversational assistant platform enabling natural language interaction with interruption and topic flexibility

Language Processing Capability

Multi-language support with ability to understand and respond across different linguistic contexts

Dialogue Management

Customizable conversational assistant deployment with continuous improvement through expert dialogue systems scientists and machine learning developers

Contract

Standard contract

Customer reviews

Ratings and reviews

4.7

3 ratings

5 star

4 star

3 star

2 star

1 star

33%

67%

Star ratings include only reviews from verified AWS customers. External reviews can also include a star rating, but star ratings from external reviews are not averaged in with the AWS customer star ratings.

Arunkumar HG

A Powerful, Adaptable, and Constantly Evolving STT Solution for Voice Automation

Reviewed on Oct 17, 2025

Review from a verified AWS customer

What is our primary use case?

For the last two years, our primary use case for Deepgram has been to power sophisticated, AI-driven voice bots for major US clients.

The technical workflow is as follows:

A client initiates a call to a Twilio number.
Our system captures the audio and streams it in real time to Deepgram's Speech-to-Text service.
Deepgram transcribes the speech into text with high accuracy.
This text is then passed to a Large Language Model (LLM) to analyze and determine the user's intent.
Based on the identified intent, we trigger the appropriate backend functions to generate a relevant response.
Finally, we use a Text-to-Speech (TTS) engine, such as ElevenLabs, to convert the response back into audio and play it for the user.

The entire process is built upon the speed and reliability of Deepgram's transcription. Our environment is deployed on the Public Cloud, specifically using Amazon Web Services (AWS).
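
The workflow above can be sketched as a single function that passes one caller utterance through each stage in order. This is a minimal illustration, not the reviewer's implementation: every function name here is hypothetical, and the speech-to-text, LLM, and text-to-speech stages are replaced with local stand-ins so the sketch runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Turn:
    transcript: str
    intent: str
    reply_text: str
    reply_audio: bytes


def handle_turn(
    caller_audio: bytes,
    transcribe: Callable[[bytes], str],
    detect_intent: Callable[[str], str],
    handlers: Dict[str, Callable[[str], str]],
    synthesize: Callable[[str], bytes],
) -> Turn:
    """Run one utterance through STT -> intent detection -> backend -> TTS."""
    transcript = transcribe(caller_audio)      # speech-to-text stage
    intent = detect_intent(transcript)         # LLM intent stage
    handler = handlers.get(intent, handlers["fallback"])
    reply_text = handler(transcript)           # backend function for the intent
    reply_audio = synthesize(reply_text)       # text-to-speech stage
    return Turn(transcript, intent, reply_text, reply_audio)


# Hypothetical stand-ins for the real services.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_detect_intent(text: str) -> str:
    return "check_balance" if "balance" in text.lower() else "fallback"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


handlers = {
    "check_balance": lambda text: "Your balance is available in the app.",
    "fallback": lambda text: "Sorry, could you rephrase that?",
}

turn = handle_turn(
    b"What is my balance?", fake_transcribe, fake_detect_intent, handlers, fake_synthesize
)
print(turn.intent, "->", turn.reply_text)
```

Passing the stages in as callables keeps each one swappable, which matches the reviewer's setup of combining separate STT, LLM, and TTS providers.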

What is most valuable?

Here are the features I've found most valuable:

Continuous Innovation and Responsiveness: I find it incredibly valuable that Deepgram is not a static product. They are constantly evolving and genuinely listen to user feedback. The evolution from their Nova models to the new Flux model, which was specifically designed to solve end-of-speech detection for conversational AI, is a perfect example. It shows they are committed to solving real-world problems for their users.
High Accuracy and Reliability: For my voice bot solutions, accuracy is non-negotiable. The models are remarkably accurate, performing at 90-92% efficiency even with challenging conditions like background noise and a wide range of international accents. Furthermore, the service has been incredibly stable; in my four years of using it, we've never experienced downtime.
Excellent Configurability and Ease of Integration: Deepgram offers a level of granular control that allows me to fine-tune the STT engine's behavior, which is a significant advantage over competitors. This flexibility, combined with straightforward integration, extensive documentation, and robust code examples, allows my team to be highly efficient.
Cost-Effectiveness and Scalability: The pay-as-you-go pricing model is both affordable and transparent. It provides a significant return on investment because it satisfies all our primary requirements—technical accuracy, ease of integration, and low implementation cost—within a scalable and predictable financial model.
Outstanding Customer Support: The support team is brilliant and always ready to assist. Having access to official support channels, active community forums, and frequent webinars ensures that we are never without resources, which is crucial for a business-critical application.

What needs improvement?

Honestly, Deepgram has been exceptionally proactive in addressing the primary area that needed improvement. My main challenge was with the real-time detection of when a user has finished speaking in a live conversation, which is critical for a responsive voice bot. They directly solved this by releasing their Flux model.

Because Flux is a recent release, I haven't yet had enough time to thoroughly test it and identify new limitations. At this stage, any "improvement" would be more of a "nice-to-have" feature rather than a fix for an existing problem. The core service is already very robust and meets all of our current needs.

What additional features should be included in the next release?

Looking toward the future, here are a few features that could add even more value to an already excellent platform:

Advanced Built-in Analytics: While I can get the raw transcript and build my own analytics pipeline, it would be powerful to have features like sentiment analysis, emotion detection, or automatic summarization offered directly through the API. This would save significant development time.
More Granular Speaker Diarization: For calls with multiple participants, enhancing the real-time speaker diarization (labeling who is speaking) to be even more precise would be a fantastic addition for creating detailed call analyses.
Tighter Integration with TTS: Since Deepgram is also expanding into Text-to-Speech (TTS), offering a more seamlessly integrated STT-to-TTS pipeline could simplify the development stack for creating voice agents from start to finish.
Specialized, Pre-Trained Industry Models: While the general models are highly accurate, offering even more specialized, pre-trained models for specific industries like finance, healthcare, or legal, which are heavy on specific jargon, could push the accuracy even higher for those niche use cases.

For how long have I used the solution?

I have been using the solution for four years.

What do I think about the stability of the solution?

Based on my experience, my impression is that the solution is exceptionally stable.

We have never experienced any downtime. Their service is very transparent, and they even provide a status page where you can check the availability of their systems. It's a reliable and robust platform that we can depend on for our business-critical voice bot applications.

What do I think about the scalability of the solution?

We have never faced any issues with downtime or performance, even as our usage has grown. The architecture is clearly built to handle high volumes of real-time transcription. Furthermore, its pay-as-you-go, usage-based pricing model directly supports this scalability, making it financially viable to grow our services without being locked into a rigid plan. It's a system that scales seamlessly both technically and financially.

How are customer service and support?

Based on my experience, the customer service and support from Deepgram have been outstanding.

The support team is brilliant, highly reachable, and always ready to assist whenever we have a question or need help. It's a comprehensive support system that goes beyond just a direct contact channel; we have access to official support, very active community forums, and they frequently schedule webinars to share announcements and updates.

I've always felt that there are plenty of resources available, and we've never been left without a solution. It's a very real and accessible support system - a simple email or call gets you the assistance you need.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Yes, I did. Initially, I used AssemblyAI in parallel with Deepgram while evaluating the best solution for our needs.

I made the switch to using Deepgram exclusively because of its superior configurability. While AssemblyAI is a solid product, I found that Deepgram provides a much deeper, more granular level of control. It allows me to fine-tune the behavior of the STT engine down to a micro-level, which is critical for optimizing the performance and accuracy of our voice bots. That ability to precisely tailor the service to our specific use case is why Deepgram ultimately stood out as the better choice for us.

How was the initial setup?

The initial setup was very straightforward.

It was a simple "Do-It-Yourself" (DIY) process that our in-house team handled entirely on our own, without needing to involve any external vendors. The primary reasons it was so easy were the extensive resources Deepgram provides:

Excellent Documentation: The documentation is clear, comprehensive, and easy to follow.
Rich Code Samples: They have robust GitHub repositories filled with plenty of examples and code samples in multiple languages, including Python, Java, and JavaScript. This made integration into our existing systems much faster.
Strong Community and Support: The availability of an active support community meant that if we had any questions, resources were readily available.

These factors combined made the implementation and integration process smooth and efficient.

What about the implementation team?

We implemented the solution entirely with our in-house team. It was a straightforward process, and we did not involve any vendors.

What was our ROI?

Our return on investment (ROI) with Deepgram has been excellent, although I don't track it as a specific percentage. The value comes from several key areas:

Low Implementation Cost: The solution is very developer-friendly with great documentation, which allowed our in-house team to integrate it quickly without needing to hire external vendors. This significantly reduced our initial investment.
Cost-Effective Operational Model: The pay-as-you-go pricing is transparent and affordable. It scales directly with our usage, which means our costs are always aligned with our business volume, preventing large, unnecessary expenses.
High-Value Enabler: The primary ROI comes from the fact that Deepgram's high accuracy and reliability are the foundation of our voice bot service. It enables us to deliver a high-quality product to our clients, which in turn generates our revenue. The investment in Deepgram directly translates to our ability to operate and grow our business.

In short, the ROI is demonstrated by low initial costs, predictable operational expenses, and the high quality of the core technology that powers our entire service offering.

Which other solutions did I evaluate?

Yes, before committing to Deepgram as our primary solution, I evaluated other options. The main competitor I looked at was AssemblyAI.

I used both AssemblyAI and Deepgram in parallel for a period to directly compare their performance in our real-world use cases. While AssemblyAI is also a good service, I ultimately chose Deepgram because it offered significantly more configurability. This allowed me to fine-tune the Speech-to-Text engine at a much more granular level, which was crucial for achieving the highest possible accuracy and performance for our specific voice bot applications.

What other advice do I have?

Yes, I absolutely have some advice for anyone considering or currently using Deepgram.

Don't Settle for the Defaults: The single biggest advantage of Deepgram over its competitors is its deep configurability. My advice is to really spend time with their documentation and API parameters. You can fine-tune the models to your specific audio environment, the accents you typically encounter, and the vocabulary relevant to your industry. This is where you can move from 90% accuracy to 95% or higher for your specific use case.
Stay Engaged with Their Updates: Deepgram innovates at a rapid pace. The release of the Flux model is a perfect example of how they solve real-world problems their users are facing. I highly recommend subscribing to their newsletters and attending their webinars. You might find that they've released a new feature or model that directly addresses a challenge you're working on, saving you significant development effort.
Leverage the Full Ecosystem: Think of Deepgram as the first crucial step in a larger data pipeline. The real power is unlocked when you connect its highly accurate transcripts to other services. As in my use case, feeding the text into an LLM for intent recognition, sentiment analysis, or summarization opens up a world of possibilities. You can analyze sales calls, automate customer support, or create detailed meeting summaries.
Use the Community and Support: Don't hesitate to engage with their support channels or community forums if you run into issues. My experience has been that they are incredibly responsive and helpful. The community is also active, and it's likely someone else has faced and solved a similar problem to yours.

In summary, my advice is to be an active user. The more you explore the platform's capabilities and stay current with its evolution, the greater the return on your investment will be. It's a top-tier solution that rewards a hands-on approach.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

reviewer2764962

Has reduced response time and replaced human support with accurate bilingual transcriptions

Reviewed on Oct 13, 2025

Review from a verified AWS customer

What is our primary use case?

I use Deepgram for a company that asked me to implement an AI voice agent for a security application that alerts neighbors to incidents occurring nearby in their neighborhoods.

I implemented this in January 2025, using Deepgram as a transcriber for those conversations for three months, and I love the technology because it transcribes very well all the conversations, making the implementation relatively easy.

My main use case for Deepgram is just for transcribing, and since this company is a Spanish company, I got deep into some use cases and settings configurations to adjust those transcriptions that include both Spanish and English words.

Deepgram handled these bilingual conversations once we adjusted some settings. For example, the company name is in English while the conversation is in Spanish, so we needed to configure it to transcribe accurately, because Vapi used that transcription for the LLM agent to speak those words through an agent voice. The bilingual transcriptions were quite precise, and while there is room for improvement, the results met expectations, making Deepgram a good fit for that work.

What is most valuable?

The best feature Deepgram offers for me is transcription, which I think is the most robust solution among providers, since Deepgram does the job quite well.

Deepgram's transcription stands out compared to other solutions primarily due to its speed and accuracy; those are important points for me because not all providers or tools handled Spanish well, but Deepgram adjusted perfectly for that use case, and we also chose 11Labs voice, a South American voice, which worked very well with Deepgram.

Deepgram has positively impacted my organization by achieving our desired results, which is very good from the overall technology perspective, saving a lot of time for the support team since the voice agent replaced the human agents managing the calls, thus improving response time and reducing the time dedicated by those human agents.

What needs improvement?

Regarding improvements for Deepgram, I think the quality of the transcriptions could be enhanced, as the Spanish accent poses challenges, making it harder to transcribe some words, and considering additional accents from Chilean or Argentine speakers could improve the model's performance with local words.

I don't have any additional improvements for Deepgram besides those I mentioned earlier about Spanish accents and transcription quality.

For how long have I used the solution?

I have been working in my current field for 14 years, and since I am 40, I have quite a bit of experience.

What do I think about the stability of the solution?

In my experience, Deepgram is very stable, as I haven't encountered any downtime or issues.

What do I think about the scalability of the solution?

Deepgram's scalability has been fine; there were some limit issues with Vapi, but those issues stemmed from the Vapi platform and not Deepgram itself.

How are customer service and support?

I did not interact with customer support for Deepgram, so I cannot comment on my experience with them.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Before Deepgram, we used other transcribers, but I don't remember the specific ones because they didn't work so well, prompting us to switch.

What was our ROI?

I have seen a return on investment in terms of time saved and fewer employees needed for both.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing was good, as I found it to be cheaper without any problems.

Which other solutions did I evaluate?

Before choosing Deepgram, I did evaluate other options, but it was mainly a decision based on the integration with Vapi.

What other advice do I have?

My advice for others looking into using Deepgram is to read the documentation because the API is very flexible, and I encourage them to just test it out as it's a wonderful technology.

I was offered a gift card in AWS for this review.

On a scale of 1-10, I rate Deepgram a 9.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)

Umar Ijaz

Handles large data, good documentation is available and powerful model

Reviewed on Jun 27, 2024

Review provided by PeerSpot

What is our primary use case?

I use Deepgram for audio transcriptions and speech recognition. I am working on a feedback survey app where users provide verbal feedback that Deepgram transcribes into text.

We receive the results and implement features like punctuation and Smart Format.
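
Options like these are typically switched on through query parameters on the transcription request. The sketch below only builds the request URL and makes no network call. The `smart_format`, `language`, and `model` parameters appear in the playground link on this page; the base URL and the `punctuate` parameter name are assumptions that should be confirmed against the vendor documentation.

```python
from urllib.parse import urlencode

# Assumed base URL for pre-recorded transcription; confirm against the vendor docs.
LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(
    model: str = "nova",
    language: str = "en",
    smart_format: bool = True,
    punctuate: bool = True,
) -> str:
    """Build a transcription request URL with formatting options as query parameters."""
    params = {
        "model": model,
        "language": language,
        # Booleans are sent as lowercase strings, matching the playground URL.
        "smart_format": str(smart_format).lower(),
        "punctuate": str(punctuate).lower(),  # assumed parameter name
    }
    return f"{LISTEN_URL}?{urlencode(params)}"


print(build_listen_url())
```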

How has it helped my organization?

Deepgram has significantly improved our transcription process in terms of speed and accuracy. It has allowed us to efficiently convert verbal feedback into text, enabling quicker analysis and implementation of new features.

Integrating Deepgram has streamlined our workflow, enhancing productivity and delivering more accurate transcription results.

What is most valuable?

We previously used IBM Watson, which was slow and had limitations in accurately transcribing words. After evaluating OpenAI's Whisper model, we discovered Deepgram, which incorporates Whisper and adds the powerful Nova model.

Deepgram's latency is impressively low, around 0.5 to 1 second, making it a superior choice.

What needs improvement?

Live transcription could be improved. Sometimes, Deepgram's WebSocket is disposed of due to redundancy issues. Enhanced stability in live transcription would be beneficial.
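
One common way to soften dropped connections on the client side is to reconnect with exponential backoff. The sketch below is generic rather than specific to any vendor SDK: `stream_once` is a hypothetical callable that opens a connection and streams until it finishes, and it is assumed to raise `ConnectionError` when the socket drops. Real WebSocket libraries raise their own exception types, so the `except` clause would need adjusting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_reconnect(
    stream_once: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call stream_once, retrying with exponential backoff and jitter on dropped connections."""
    for attempt in range(1, max_attempts + 1):
        try:
            return stream_once()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Delay doubles on each attempt, capped at max_delay, plus random jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")


# Hypothetical stream that drops twice before completing.
attempts = {"count": 0}


def flaky_stream() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("socket closed")
    return "stream finished"


print(run_with_reconnect(flaky_stream, sleep=lambda _: None))
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting.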

For how long have I used the solution?

I have been using Deepgram for one and a half years.

What do I think about the stability of the solution?

Initially, we encountered some stability issues, but Deepgram has since improved its architecture. With the addition of hooks for status updates, the accuracy has improved to approximately 90 to 95%, which is better than other models we've tested.

What do I think about the scalability of the solution?

It's scalable. Our platform handles 50 to 60 users simultaneously without compromising accuracy. For instance, a 20-minute audio file was transcribed within a second, demonstrating its ability to handle large volumes of audio data effectively.

How are customer service and support?

My experience with customer service and support has been positive. They are responsive and helpful, and they provide timely resolutions to any issues.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We previously used IBM Watson, but it didn't deliver appropriate results. We searched for alternatives and found OpenAI's Whisper model, which was initially slow. After thorough analysis, we discovered Deepgram. It proved to be superior, leading to our decision to migrate. We used a detailed spreadsheet to compare various models before making the switch.

How was the initial setup?

Thanks to clear documentation, the initial setup was very easy. If you have prerequisite knowledge of the programming language you're using, it’s straightforward to follow the documentation and implement it into your system. When I started, I closely followed the documentation, which made the process very manageable.

Deployment model: We last deployed it on the Google Cloud Platform (GCP).

What about the implementation team?

The implementation was done in-house.

What was our ROI?

Our ROI has increased due to enhanced transcription accuracy and speed, leading to more efficient workflows and better user satisfaction.

What's my experience with pricing, setup cost, and licensing?

The pricing is moderate. While live transcription may incur some charges when the connection is open, they become minimal over time. So, it's a balanced option—neither cheap nor overly expensive.

Which other solutions did I evaluate?

Yes, besides IBM Watson, we evaluated OpenAI's Whisper model.

What other advice do I have?

Deepgram is highly recommended. Users don’t need to do anything special before using it, as the documentation is comprehensive. I am a Node.js developer and have used Deepgram packages for Node.js. Understanding your programming language is key, whether it's Node.js, Python, or others.

AI Features:

I have integrated various AI models into our application. Deepgram's sentiment analysis feature allows us to create graphs and analyses to determine if words are positive, negative, or neutral. This helps us summarize feedback and derive actionable insights.
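
As an illustration of how per-segment sentiment labels might be turned into chart-ready numbers, the sketch below tallies the share of each label. The input format is an assumption: it takes a plain list of label strings, whereas an actual API response would need to be parsed into that form first.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("positive", "neutral", "negative")


def sentiment_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of segments carrying each sentiment label."""
    counts = Counter(label for label in labels if label in LABELS)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Hypothetical per-segment labels for one piece of verbal feedback.
sample = ["positive", "positive", "neutral", "negative"]
print(sentiment_shares(sample))
```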

My ratings:

I would rate it an eight out of ten. The live transcription feature needs improvement as the WebSocket sometimes gives errors or breaks down during live streams.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Google

Arslan Rasheed

Used for TTS (Text-to-Speech) and STT (Speech-to-Text) purposes

Reviewed on Jun 25, 2024

Review provided by PeerSpot

What is our primary use case?

We use the solution for TTS (Text-to-Speech) and STT (Speech-to-Text) purposes.

What is most valuable?

The solution's Speech-to-Text conversion feature is really awesome.

What needs improvement?

Deepgram is currently restricted to only the English variants, but it should include other languages, such as German or French.

For how long have I used the solution?

I have been using Deepgram for five to six months.

What do I think about the stability of the solution?

Deepgram is a stable solution.

What do I think about the scalability of the solution?

The Deepgram cloud can handle large volumes of audio data. Around three to four people use the solution in our organization.

How was the initial setup?

The solution’s initial setup is easy.

What's my experience with pricing, setup cost, and licensing?

Deepgram is a cheap solution. We can create an account for $200, which we can initially use for the Deepgram services.

What other advice do I have?

I have used Deepgram with Twilio for the calling system. I would recommend Deepgram to users who want to use it for speech-to-text purposes.

Overall, I rate the solution an eight out of ten.

Boris Morozov

Offers great speed during transcription compared to other tools

Reviewed on Jun 24, 2024

Review provided by PeerSpot

What is our primary use case?

I am the software development team lead in my company, and we are creating a speech recognition product based on a few different engines. The company works in the legal process, so my team generates a machine-based transcript and then converts it to a readable format. It saves time for the person using the machine-based transcript as a starting point.

What is most valuable?

The solution's most valuable feature is its speed of transcription. It is one of the fastest tools: compared to the second-fastest solution you can get, it is 20 times faster. Thus, it is not just a marginally faster product.

What needs improvement?

In our testing, other products offered higher transcript accuracy than Deepgram. We ran a blind test on five jobs with five different transcribers. Each job was transcribed by two different engines, one of which was Deepgram, and the transcribers received both versions without knowing which engine had produced which. They then chose the version they thought would take less work to convert into the final, legally binding document. Because transcribers are paid per page, they had a clear incentive to pick the version that required the least work. We are a small company, so the sample size was not large, but we found that the other tools were marginally more accurate than Deepgram.

I would like it to be more accurate. With better accuracy, I would get maintainability, fast transcripts, and the features I need in a single tool.

For how long have I used the solution?

I have been using Deepgram for a year. I use the solution's latest version. I am an end-user of the tool.

What do I think about the stability of the solution?

I don't use all of the features in the tool because Deepgram adds new features every few months. The features that I have been using in the tool have been very stable. I have never had any issues with the tool, and it has never crashed. In our company, we just update to the latest version and see no issues until the next update.

What do I think about the scalability of the solution?

I use the tool as a software developer and transcriber. Though I don't know the exact number of transcribers who work in our company, I can say that it runs into the hundreds.

How are customer service and support?

I contacted the solution's technical support for help, received decent service, and had no complaints at all. At the moment, updating Deepgram by yourself is a very easy process: you just get the version of the tool and pull it from Docker. If you want to update the model, however, you have to contact support. I haven't contacted the support team very often, perhaps three times, and only to update to new versions of the model.

Which solution did I use previously and why did I switch?

My company still uses multiple products in parallel to Deepgram. For example, in transcript-related business, some clients want to get the transcript super fast or in a few hours. You need to produce the machine transcript in a few minutes and give it to the transcriber, who should start working immediately on it. In such cases, there is a huge improvement when one uses Deepgram, which is a major advantage.

How was the initial setup?

If I compare Deepgram with the other engines I have used, I would say that the ease of installation is one of the strong points of the product. Compared to all other engines, the installation of Deepgram has been simpler and far more stable. It just gets updated, and it runs properly. The good thing is that the tool is modular: the modules and the engine itself are separate objects, so if you want to update only a module, there is no need to redeploy anything. That matters to us because most other engines that we have in our company are also on an on-premises model.

The solution is deployed on on-premises and cloud environments. It is deployed in the private section of our company's cloud. We aren't using the API and prefer to use our own deployment.

What's my experience with pricing, setup cost, and licensing?

When using Deepgram, one needs to pay for the hours or minutes of transcription needed. The more hours you commit to in advance, the cheaper the price. It is slightly cheaper than the other engines we used. You should also take into account that you usually pay for cloud resources, even if you are doing an on-premises deployment. Since transcription is fast, you can theoretically save twice: once on the money you pay for usage, and a second time on cloud resources, because if you finish transcribing in a minute instead of an hour and your pipeline scales down, you only pay for that minute.
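To make the "save twice" argument concrete, here is a rough sketch of the arithmetic. All rates below are hypothetical placeholders chosen for illustration, not Deepgram's or any cloud provider's actual prices, and the function name is my own.

```python
def transcription_cost(audio_hours, usage_rate, compute_hours, compute_rate):
    """Total cost = usage fee for the audio + cloud compute time spent transcribing."""
    return audio_hours * usage_rate + compute_hours * compute_rate


# Hypothetical numbers: one hour of audio, compute billed at 1.20 per hour.
# A slower engine keeps the instance busy for the full hour.
slow = transcription_cost(audio_hours=1, usage_rate=0.30,
                          compute_hours=1.0, compute_rate=1.20)

# A faster, slightly cheaper engine finishes in about one minute,
# so the pipeline can scale down and stop paying for compute.
fast = transcription_cost(audio_hours=1, usage_rate=0.25,
                          compute_hours=1 / 60, compute_rate=1.20)

print(f"slow engine: {slow:.2f}, fast engine: {fast:.2f}")
```

With these made-up inputs, most of the difference comes from the compute term rather than the usage fee, which is the point of the argument: the second saving only materializes if the pipeline really does scale down when the job finishes.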

What other advice do I have?

Whether you should use the tool or not depends on your use cases. If factors like maintainability, meaning how easy the engine is to maintain, are your priorities, you should definitely go with Deepgram.

Speaking about how Deepgram handles large volumes of audio data without compromising accuracy, it is always a trade-off. As I said, Deepgram is not as accurate as the other engines we're using, but the difference is marginal, and if speed is more important to you, you should go with Deepgram. If the accuracy of the transcript is far more important, other engines give better results.

For an end user, the tool offers on-premises deployment, and it also has an API. Using the API is super easy since you just log in to the site and create an account, and you can start using it. If you want to deploy it on an on-premises model, you need to have a basic understanding of the cloud and how it works. The tool has a step-by-step guide on its website, which is very nice, and I still use it because it is super simple and can be easily understood.
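For readers curious what "start using it" looks like, below is a minimal sketch of a call to the hosted API for pre-recorded audio. It is written from my understanding of the public REST endpoint, so check the endpoint, the query parameters (`model`, `smart_format`), and the response shape against the current Deepgram documentation. The helper names and the audio URL are my own placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for pre-recorded audio; verify against the official docs.
DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def build_request(api_key, audio_url, model="nova-2"):
    """Build a POST request asking the API to transcribe a hosted audio file."""
    query = urllib.parse.urlencode({"model": model, "smart_format": "true"})
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DEEPGRAM_URL}?{query}",
        data=body,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_transcript(response):
    """Pull the first transcript out of the (assumed) JSON response shape."""
    return response["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe(api_key, audio_url):
    """Send the request and return the transcript text (needs network access)."""
    with urllib.request.urlopen(build_request(api_key, audio_url)) as resp:
        return extract_transcript(json.load(resp))
```

Calling `transcribe("YOUR_API_KEY", "https://example.com/audio.wav")` with a real key and a reachable audio file should return the transcript as a string. Building the request and parsing the response are kept separate so each part can be checked without a network call.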

We haven't actually used the AI features at the moment because we're not quite sure how we can use them in our company. We work on legal transcripts, and those have to be transcribed word for word, even if the person who speaks says certain things incorrectly. We have to maintain 100 percent accuracy. I am not sure how we can apply the AI features in the tool, but we are always looking at the AI aspect in our company.

Deepgram offers some clear advantages over other applications. If you want a tool that produces transcripts very quickly, nothing else comes even close to Deepgram.

I rate the tool an eight or nine out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud
