    Speech-to-Text & Text-to-Speech GenAI API

    Sold by: Deepgram 
    Free Trial
    Deepgram, Language AI models to power your apps.

    Overview

    Deepgram voice AI models power your apps with world-class speech-to-text and domain-specific language models (DSLMs). Effortlessly accurate. Blazing fast. Enterprise-ready scale. Unbeatable pricing. Everything developers need to build with confidence and ship faster.

    For questions and custom quote options, reach out to us at aws@deepgram.com.

    Highlights

    • Transcription (STT)
      - 20x faster: Transcribe in real time, or process an hour of pre-recorded audio in about 12 seconds.
      - <300ms latency: The fastest real-time transcription speeds for human-like conversational AI experiences, real-time analytics, and enablement.
      - >90% accuracy: Deepgram leads the industry with the most accurate models on the market across use case categories.
    • Understanding
      - Summarization
      - Sentiment analysis
      - Language translation
      - Speaker diarization
      - Language detection
      - And more...
    • Custom Model Training
      - Deepgram supports customer-specific custom model training to ensure your model meets your business objectives.
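
    The "hour of audio in about 12 seconds" figure can be turned into a rough processing-time estimate. The sketch below is plain arithmetic on the numbers quoted in the listing; the function names are illustrative, and it assumes the same throughput holds for files of other lengths, which the listing does not state. The separate "20x faster" claim is a comparison whose baseline is not given here, so it is not modeled.

```python
# Throughput implied by "an hour of pre-recorded audio in about 12 seconds".
AUDIO_SECONDS = 60 * 60    # one hour of audio
PROCESSING_SECONDS = 12    # approximate figure quoted in the listing


def speedup_over_real_time(audio_s: float, processing_s: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    return audio_s / processing_s


def estimated_processing_seconds(audio_s: float, speedup: float) -> float:
    """Projected wall-clock time, assuming the same speedup applies."""
    return audio_s / speedup


speedup = speedup_over_real_time(AUDIO_SECONDS, PROCESSING_SECONDS)
print(speedup)                                          # 300.0
print(estimated_processing_seconds(20 * 60, speedup))   # 4.0 (a 20-minute file)
```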

    Details

    Delivery method

    Deployed on AWS

    Features and programs

    Buyer guide

    Gain valuable insights from real users who purchased this product, powered by PeerSpot.

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    Free trial

    Try this product free according to the free trial terms set by the vendor.

    Speech-to-Text & Text-to-Speech GenAI API

    Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    1-month contract (2)

    Dimension                     Description                                Cost/month
    Enterprise Offering           Custom Enterprise Offering                 $10,000,000.00
    Cost per Transcription Hour   Deepgram charges per transcription hour    $1,250.00

    Vendor refund policy

    Deepgram Terms of Service: https://deepgram.com/terms/ 

    Custom pricing options

    Request a private offer to receive a custom quote.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Resources

    Support

    Vendor support

    For sales, contracting, and usage inquiries, please email aws@deepgram.com

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 10 in Scheduling & Coordination, Speech Recognition, Sales & Marketing
    Top 10 in Speech to Text, Customer Support, Speech Recognition
    Top 100 in Natural Language Processing

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2
    Reviews     Functionality       Ease of use         Customer service    Cost effectiveness
    2 reviews   Insufficient data   Insufficient data   Insufficient data   Insufficient data
    0 reviews   Insufficient data   Insufficient data   Insufficient data   Insufficient data

    Overview

    AI generated from product descriptions
    Speech Recognition Speed
    Real-time transcription with processing speed of 20x faster than traditional methods, capable of transcribing an hour of audio in approximately 12 seconds
    Latency Performance
    Ultra-low latency under 300 milliseconds for near-instantaneous speech-to-text conversion
    Accuracy Metrics
    Speech recognition accuracy exceeding 90% across multiple use case categories
    Language Understanding Capabilities
    Advanced natural language processing features including summarization, sentiment analysis, speaker diarization, language detection, and translation
    Model Customization
    Support for customer-specific custom model training to adapt speech recognition for unique business requirements
    Speech Recognition
    Advanced multilingual speech recognition with high accuracy and low word error rates
    Language Processing
    Support for 99+ languages with automatic language detection and custom vocabulary capabilities
    Audio Intelligence
    Comprehensive suite of AI models including speaker diarization, sentiment analysis, content moderation, and PII redaction
    Large Language Model Integration
    LeMUR framework for processing audio transcripts using advanced language model capabilities
    Transcription Flexibility
    Support for async and real-time transcription with multiple file type compatibility across 33 audio and video formats
    Natural Language Understanding
    Advanced proprietary Large Language Model (ConveRT) trained specifically for customer service applications
    Speech Recognition Technology
    Spoken language understanding system capable of processing diverse accents, dialects, and background noise
    Conversational AI Architecture
    Customer-led conversational assistant platform enabling natural language interaction with interruption and topic flexibility
    Language Processing Capability
    Multi-language support with ability to understand and respond across different linguistic contexts
    Dialogue Management
    Customizable conversational assistant deployment with continuous improvement through expert dialogue systems scientists and machine learning developers

    Contract

    Standard contract
    No
    No
    No

    Customer reviews

    Ratings and reviews

    4.7
    3 ratings
    5 star
    4 star
    3 star
    2 star
    1 star
    33%
    67%
    0%
    0%
    0%
    3 AWS reviews | 4 external reviews
    Star ratings include only reviews from verified AWS customers. External reviews can also include a star rating, but star ratings from external reviews are not averaged in with the AWS customer star ratings.
    Arunkumar HG

    A Powerful, Adaptable, and Constantly Evolving STT Solution for Voice Automation

    Reviewed on Oct 17, 2025
    Review from a verified AWS customer

    What is our primary use case?

    For the last two years, our primary use case for Deepgram has been to power sophisticated, AI-driven voice bots for major US clients.

    The technical workflow is as follows:

    1. A client initiates a call to a Twilio number.
    2. Our system captures the audio and streams it in real time to Deepgram's Speech-to-Text service.
    3. Deepgram transcribes the speech into text with high accuracy.
    4. This text is then passed to a Large Language Model (LLM) to analyze and determine the user's intent.
    5. Based on the identified intent, we trigger the appropriate backend functions to generate a relevant response.
    6. Finally, we use a Text-to-Speech (TTS) engine, such as ElevenLabs, to convert the response back into audio and play it for the user.

    The entire process is built upon the speed and reliability of Deepgram's transcription. Our environment is deployed on the Public Cloud, specifically using Amazon Web Services (AWS).
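
    The six-step workflow above can be sketched as a single turn handler. This is only an illustration of the shape of the pipeline: every function here is a hypothetical stand-in (no Twilio, Deepgram, LLM, or ElevenLabs SDK call is used), the "audio" is just UTF-8 text, and the intent step is a keyword rule rather than a real LLM.

```python
from typing import Callable, Dict

# All functions below are hypothetical stand-ins, not real SDK calls.


def transcribe(audio_chunk: bytes) -> str:
    """Stand-in for speech-to-text (steps 2-3): pretend audio is already text."""
    return audio_chunk.decode("utf-8")


def detect_intent(text: str) -> str:
    """Stand-in for the LLM intent step (step 4), using simple keyword rules."""
    lowered = text.lower()
    if "balance" in lowered:
        return "check_balance"
    if "agent" in lowered:
        return "transfer_to_agent"
    return "unknown"


# Step 5: map each intent to a backend function that produces a reply.
HANDLERS: Dict[str, Callable[[], str]] = {
    "check_balance": lambda: "Your balance is available in the app.",
    "transfer_to_agent": lambda: "Connecting you to an agent.",
    "unknown": lambda: "Sorry, could you rephrase that?",
}


def synthesize(text: str) -> bytes:
    """Stand-in for text-to-speech (step 6)."""
    return text.encode("utf-8")


def handle_turn(audio_chunk: bytes) -> bytes:
    """Run one caller utterance through STT -> intent -> backend -> TTS."""
    text = transcribe(audio_chunk)
    intent = detect_intent(text)
    reply = HANDLERS[intent]()
    return synthesize(reply)


print(handle_turn(b"What is my balance?"))
```

    Keeping each stage behind its own function makes it straightforward to swap a stand-in for a real service client one step at a time.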

    What is most valuable?

    The features I've found most valuable are:

    • Continuous Innovation and Responsiveness: I find it incredibly valuable that Deepgram is not a static product. They are constantly evolving and genuinely listen to user feedback. The evolution from their Nova models to the new Flux model, which was specifically designed to solve end-of-speech detection for conversational AI, is a perfect example. It shows they are committed to solving real-world problems for their users.
    • High Accuracy and Reliability: For my voice bot solutions, accuracy is non-negotiable. The models are remarkably accurate, performing at 90-92% efficiency even with challenging conditions like background noise and a wide range of international accents. Furthermore, the service has been incredibly stable; in my four years of using it, we've never experienced downtime.
    • Excellent Configurability and Ease of Integration: Deepgram offers a level of granular control that allows me to fine-tune the STT engine's behavior, which is a significant advantage over competitors. This flexibility, combined with straightforward integration, extensive documentation, and robust code examples, allows my team to be highly efficient.
    • Cost-Effectiveness and Scalability: The pay-as-you-go pricing model is both affordable and transparent. It provides a significant return on investment because it satisfies all our primary requirements—technical accuracy, ease of integration, and low implementation cost—within a scalable and predictable financial model.
    • Outstanding Customer Support: The support team is brilliant and always ready to assist. Having access to official support channels, active community forums, and frequent webinars ensures that we are never without resources, which is crucial for a business-critical application.

    What needs improvement?

    Honestly, Deepgram has been exceptionally proactive in addressing the primary area that needed improvement. My main challenge was with the real-time detection of when a user has finished speaking in a live conversation, which is critical for a responsive voice bot. They directly solved this by releasing their Flux model.

    Because Flux is a recent release, I haven't yet had enough time to thoroughly test it and identify new limitations. At this stage, any "improvement" would be more of a "nice-to-have" feature rather than a fix for an existing problem. The core service is already very robust and meets all of our current needs.
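
    To show why end-of-speech detection is hard, here is a minimal sketch of the naive baseline: a fixed silence timeout over per-frame audio energy. This is not how Flux works (the review does not describe its internals); the function name, threshold, and frame counts are all illustrative assumptions. With 20 ms frames, for instance, 25 silent frames would mean a 500 ms wait before the bot may reply.

```python
from typing import Iterable, Optional


def end_of_speech_index(frame_energies: Iterable[float],
                        silence_threshold: float = 0.01,
                        min_silent_frames: int = 25) -> Optional[int]:
    """Return the frame index at which the speaker is judged to have finished.

    The turn ends once some speech has been heard and is then followed by
    `min_silent_frames` consecutive frames below `silence_threshold`.
    Returns None if that never happens.
    """
    heard_speech = False
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy >= silence_threshold:
            heard_speech = True
            silent_run = 0          # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= min_silent_frames:
                return i
    return None


# Leading silence, speech, a short mid-sentence pause, more speech, long silence.
frames = [0.0] * 5 + [0.5] * 10 + [0.0] * 3 + [0.4] * 4 + [0.0] * 30
print(end_of_speech_index(frames))  # 46
```

    The trade-off is visible in `min_silent_frames`: a small value cuts callers off during natural pauses, while a large value makes the bot feel slow to respond.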

    What additional features should be included in the next release?


    Looking toward the future, here are a few features that could add even more value to an already excellent platform:

    1. Advanced Built-in Analytics: While I can get the raw transcript and build my own analytics pipeline, it would be powerful to have features like sentiment analysis, emotion detection, or automatic summarization offered directly through the API. This would save significant development time.
    2. More Granular Speaker Diarization: For calls with multiple participants, enhancing the real-time speaker diarization (labeling who is speaking) to be even more precise would be a fantastic addition for creating detailed call analyses.
    3. Tighter Integration with TTS: Since Deepgram is also expanding into Text-to-Speech (TTS), offering a more seamlessly integrated STT-to-TTS pipeline could simplify the development stack for creating voice agents from start to finish.
    4. Specialized, Pre-Trained Industry Models: While the general models are highly accurate, offering even more specialized, pre-trained models for specific industries like finance, healthcare, or legal (which are heavy on specific jargon) could push the accuracy even higher for those niche use cases.

    For how long have I used the solution?

    I have been using the solution for four years.

    What do I think about the stability of the solution?

    Based on my experience, my impression is that the solution is exceptionally stable.

    We have never experienced any downtime. Their service is very transparent, and they even provide a status page where you can check the availability of their systems. It's a reliable and robust platform that we can depend on for our business-critical voice bot applications.

    What do I think about the scalability of the solution?

    We have never faced any issues with downtime or performance, even as our usage has grown. The architecture is clearly built to handle high volumes of real-time transcription. Furthermore, its pay-as-you-go, usage-based pricing model directly supports this scalability, making it financially viable to grow our services without being locked into a rigid plan. It's a system that scales seamlessly both technically and financially.

    How are customer service and support?

    Based on my experience, the customer service and support from Deepgram have been outstanding.

    The support team is brilliant, highly reachable, and always ready to assist whenever we have a question or need help. It's a comprehensive support system that goes beyond just a direct contact channel; we have access to official support, very active community forums, and they frequently schedule webinars to share announcements and updates.

    I've always felt that there are plenty of resources available, and we've never been left without a solution. It's a very real and accessible support system - a simple email or call gets you the assistance you need.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    Yes, I did. Initially, I used AssemblyAI in parallel with Deepgram while evaluating the best solution for our needs.

    I made the switch to using Deepgram exclusively because of its superior configurability. While AssemblyAI is a solid product, I found that Deepgram provides a much deeper, more granular level of control. It allows me to fine-tune the behavior of the STT engine down to a micro-level, which is critical for optimizing the performance and accuracy of our voice bots. That ability to precisely tailor the service to our specific use case is why Deepgram ultimately stood out as the better choice for us.

    How was the initial setup?

    The initial setup was very straightforward.

    It was a simple "Do-It-Yourself" (DIY) process that our in-house team handled entirely on our own, without needing to involve any external vendors. The primary reasons it was so easy were the extensive resources Deepgram provides:

    • Excellent Documentation: The documentation is clear, comprehensive, and easy to follow.
    • Rich Code Samples: They have robust GitHub repositories filled with plenty of examples and code samples in multiple languages, including Python, Java, and JavaScript. This made integration into our existing systems much faster.
    • Strong Community and Support: The availability of an active support community meant that if we had any questions, resources were readily available.

    These factors combined made the implementation and integration process smooth and efficient.

    What about the implementation team?

    We implemented the solution entirely with our in-house team. It was a straightforward process, and we did not involve any vendors.

    What was our ROI?

    Our return on investment (ROI) with Deepgram has been excellent, although I don't track it as a specific percentage. The value comes from several key areas:

    1. Low Implementation Cost: The solution is very developer-friendly with great documentation, which allowed our in-house team to integrate it quickly without needing to hire external vendors. This significantly reduced our initial investment.
    2. Cost-Effective Operational Model: The pay-as-you-go pricing is transparent and affordable. It scales directly with our usage, which means our costs are always aligned with our business volume, preventing large, unnecessary expenses.
    3. High-Value Enabler: The primary ROI comes from the fact that Deepgram's high accuracy and reliability are the foundation of our voice bot service. It enables us to deliver a high-quality product to our clients, which in turn generates our revenue. The investment in Deepgram directly translates to our ability to operate and grow our business.

    In short, the ROI is demonstrated by low initial costs, predictable operational expenses, and the high quality of the core technology that powers our entire service offering.

    Which other solutions did I evaluate?

    Yes, before committing to Deepgram as our primary solution, I evaluated other options. The main competitor I looked at was AssemblyAI.

    I used both AssemblyAI and Deepgram in parallel for a period to directly compare their performance in our real-world use cases. While AssemblyAI is also a good service, I ultimately chose Deepgram because it offered significantly more configurability. This allowed me to fine-tune the Speech-to-Text engine at a much more granular level, which was crucial for achieving the highest possible accuracy and performance for our specific voice bot applications.

    What other advice do I have?

    Yes, I absolutely have some advice for anyone considering or currently using Deepgram.

    1. Don't Settle for the Defaults: The single biggest advantage of Deepgram over its competitors is its deep configurability. My advice is to really spend time with their documentation and API parameters. You can fine-tune the models to your specific audio environment, the accents you typically encounter, and the vocabulary relevant to your industry. This is where you can move from 90% accuracy to 95% or higher for your specific use case.
    2. Stay Engaged with Their Updates: Deepgram innovates at a rapid pace. The release of the Flux model is a perfect example of how they solve real-world problems their users are facing. I highly recommend subscribing to their newsletters and attending their webinars. You might find that they've released a new feature or model that directly addresses a challenge you're working on, saving you significant development effort.
    3. Leverage the Full Ecosystem: Think of Deepgram as the first crucial step in a larger data pipeline. The real power is unlocked when you connect its highly accurate transcripts to other services. As in my use case, feeding the text into an LLM for intent recognition, sentiment analysis, or summarization opens up a world of possibilities. You can analyze sales calls, automate customer support, or create detailed meeting summaries.
    4. Use the Community and Support: Don't hesitate to engage with their support channels or community forums if you run into issues. My experience has been that they are incredibly responsive and helpful. The community is also active, and it's likely someone else has faced and solved a similar problem to yours.

    In summary, my advice is to be an active user. The more you explore the platform's capabilities and stay current with its evolution, the greater the return on your investment will be. It's a top-tier solution that rewards a hands-on approach.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    reviewer2764962

    Has reduced response time and replaced human support with accurate bilingual transcriptions

    Reviewed on Oct 13, 2025
    Review from a verified AWS customer

    What is our primary use case?

    I use Deepgram for a company that asked me to implement an AI voice agent for a security application that alerts neighbors to incidents occurring nearby in their neighborhoods.

    I implemented this in January 2025 and used Deepgram as the transcriber for those conversations for three months. I love the technology because it transcribes the conversations very well, which made the implementation relatively easy.

    My main use case for Deepgram is transcription only. Since this is a Spanish company, I dug into the settings and configuration to tune transcriptions that include both Spanish and English words.

    Deepgram handled these bilingual conversations once we adjusted some settings. For example, the company name is in English while the conversation is in Spanish, so we needed to configure Deepgram to transcribe it accurately, because Vapi used that transcription for the LLM agent to speak those words through an agent voice. The bilingual transcriptions were quite precise, and while there is room for improvement, the results met expectations, making Deepgram a good fit for that work.

    What is most valuable?

    The best feature Deepgram offers for me is transcription itself, which I think is the most robust among the providers, since Deepgram does the job quite well.

    Deepgram's transcription stands out from other solutions primarily for its speed and accuracy. Those matter to me because not all providers or tools handled Spanish well, but Deepgram adapted perfectly to that use case. We also chose an 11Labs voice, a South American voice, which worked very well with Deepgram.

    Deepgram has positively impacted my organization by achieving our desired results. It saves the support team a lot of time, since the voice agent replaced the human agents managing the calls, improving response time and reducing the time those agents dedicate to calls.

    What needs improvement?

    Regarding improvements for Deepgram, I think the quality of the transcriptions could be enhanced: Spanish accents pose challenges and make some words harder to transcribe, and accounting for additional accents, such as those of Chilean or Argentine speakers, could improve the model's performance with local words.

    I don't have any additional improvements for Deepgram besides those I mentioned earlier about Spanish accents and transcription quality.

    For how long have I used the solution?

    I have been working in my current field for 14 years, and since I am 40, I have quite a bit of experience.

    What do I think about the stability of the solution?

    In my experience, Deepgram is very stable, as I haven't encountered any downtime or issues.

    What do I think about the scalability of the solution?

    Deepgram's scalability has been fine; there were some limit issues with Vapi, but those issues stemmed from the Vapi platform and not Deepgram itself.

    How are customer service and support?

    I did not interact with customer support for Deepgram, so I cannot comment on my experience with them.

    How would you rate customer service and support?

    Neutral

    Which solution did I use previously and why did I switch?

    Before Deepgram, we used other transcribers, but I don't remember the specific ones because they didn't work so well, prompting us to switch.

    What was our ROI?

    I have seen a return on investment in terms of time saved and fewer employees needed for both.

    What's my experience with pricing, setup cost, and licensing?

    My experience with pricing, setup cost, and licensing was good, as I found it to be cheaper without any problems.

    Which other solutions did I evaluate?

    Before choosing Deepgram, I did evaluate other options, but it was mainly a decision based on the integration with Vapi.

    What other advice do I have?

    My advice for others looking into using Deepgram is to read the documentation because the API is very flexible, and I encourage them to just test it out as it's a wonderful technology.

    I was offered a gift card in AWS for this review.

    On a scale of 1-10, I rate Deepgram a 9.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Umar Ijaz

    Handles large data, good documentation is available and powerful model

    Reviewed on Jun 27, 2024
    Review provided by PeerSpot

    What is our primary use case?

    I use Deepgram for audio transcriptions and speech recognition. I am working on a feedback survey app where users provide verbal feedback that Deepgram transcribes into text.

    We receive the results and implement features like punctuation and Smart Format.
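
    As a rough illustration of switching on punctuation and Smart Format, the sketch below only builds a request URL with query parameters; it makes no network call. The endpoint and the `punctuate` and `smart_format` parameter names reflect my understanding of Deepgram's public API and should be checked against the current documentation. The helper name `build_listen_url` is hypothetical, and the reviewer's actual code (written with the Node.js packages) is not shown in the review.

```python
from urllib.parse import urlencode

# Assumed pre-recorded transcription endpoint; verify against the API docs.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(punctuate: bool = True,
                     smart_format: bool = True,
                     **extra: str) -> str:
    """Build a request URL with formatting features enabled as query params."""
    params = {
        "punctuate": str(punctuate).lower(),       # "true" / "false"
        "smart_format": str(smart_format).lower(),
    }
    params.update(extra)                           # any further options, e.g. language
    return f"{BASE_URL}?{urlencode(params)}"


print(build_listen_url(language="en"))
# https://api.deepgram.com/v1/listen?punctuate=true&smart_format=true&language=en
```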

    How has it helped my organization?

    Deepgram has significantly improved our transcription process in terms of speed and accuracy. It has allowed us to efficiently convert verbal feedback into text, enabling quicker analysis and implementation of new features. 

    Integrating Deepgram has streamlined our workflow, enhancing productivity and delivering more accurate transcription results.

    What is most valuable?

    We previously used IBM Watson, which was slow and had limitations in accurately transcribing words. After evaluating OpenAI's Whisper model, we discovered Deepgram, which incorporates Whisper and adds the powerful Nova model.

    Deepgram's latency is impressively low, around 0.5 to 1 second, making it a superior choice.

    What needs improvement?

    Live transcription could be improved. Sometimes, Deepgram's WebSocket is disposed of due to redundancy issues. Enhanced stability in live transcription would be beneficial.
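
    A common client-side mitigation for dropped streaming connections is to reconnect with exponential backoff. The sketch below is generic and uses no real WebSocket or Deepgram SDK call: `connect` is a hypothetical callable supplied by the caller, and the delays are illustrative. It does not address the root cause the reviewer describes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_backoff(connect: Callable[[], T],
                         max_attempts: int = 5,
                         base_delay: float = 0.5,
                         max_delay: float = 8.0,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure.

    Re-raises the last ConnectionError once `max_attempts` is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)   # cap the wait between retries
    raise ValueError("max_attempts must be at least 1")


# Usage with a hypothetical connection function that fails twice, then succeeds.
attempts = []


def flaky_connect() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("socket dropped")
    return "connected"


print(connect_with_backoff(flaky_connect, sleep=lambda s: None))  # connected
```

    Passing `sleep` as a parameter keeps the retry logic testable without real waiting.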

    For how long have I used the solution?

    I have been using Deepgram for one and a half years.

    What do I think about the stability of the solution?

    Initially, we encountered some stability issues, but Deepgram has since improved its architecture. With the addition of hooks for status updates, the accuracy has improved to approximately 90 to 95%, which is better than other models we've tested.

    What do I think about the scalability of the solution?

    It's scalable. Our platform handles 50 to 60 users simultaneously without compromising accuracy. For instance, a 20-minute audio file was transcribed within a second, demonstrating its ability to handle large volumes of audio data effectively.

    How are customer service and support?

    My experience with customer service and support has been positive. They are responsive and helpful, and they provide timely resolutions to any issues.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    We previously used IBM Watson, but it didn't deliver appropriate results. We searched for alternatives and found OpenAI's Whisper model, which was initially slow. After thorough analysis, we discovered Deepgram. It proved to be superior, leading to our decision to migrate. We used a detailed spreadsheet to compare various models before making the switch.

    How was the initial setup?

    Thanks to clear documentation, the initial setup was very easy. If you have prerequisite knowledge of the programming language you're using, it’s straightforward to follow the documentation and implement it into your system. When I started, I closely followed the documentation, which made the process very manageable.

    Deployment model: We last deployed it on the Google Cloud Platform (GCP).

    What about the implementation team?

    The implementation was done in-house.

    What was our ROI?

    Our ROI has increased due to enhanced transcription accuracy and speed, leading to more efficient workflows and better user satisfaction.

    What's my experience with pricing, setup cost, and licensing?

    The pricing is moderate. While live transcription may incur some charges when the connection is open, they become minimal over time. So, it's a balanced option—neither cheap nor overly expensive.

    Which other solutions did I evaluate?

    Yes, besides IBM Watson, we evaluated OpenAI's Whisper model.

    What other advice do I have?

    Deepgram is highly recommended. Users don’t need to do anything special before using it, as the documentation is comprehensive. I am a Node.js developer and have used Deepgram packages for Node.js. Understanding your programming language is key, whether it's Node.js, Python, or others.

    AI Features:

    I have integrated various AI models into our application. Deepgram's sentiment analysis feature allows us to create graphs and analyses to determine if words are positive, negative, or neutral. This helps us summarize feedback and derive actionable insights.
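
    The kind of aggregation behind such graphs can be sketched as a simple tally. The input here is a plain list of labels; the actual shape of Deepgram's sentiment output is not shown in the review, so the function name and data format are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("positive", "neutral", "negative")


def sentiment_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of items carrying each sentiment label.

    Labels outside LABELS are ignored; an empty input yields all zeros.
    """
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


shares = sentiment_shares(["positive", "positive", "negative", "neutral"])
print(shares)  # {'positive': 0.5, 'neutral': 0.25, 'negative': 0.25}
```

    The resulting shares can be fed directly into a bar or pie chart for a feedback summary.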

    My ratings:

    I would rate it an eight out of ten. The live transcription feature needs improvement as the WebSocket sometimes gives errors or breaks down during live streams.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Google
    Arslan Rasheed

    Used for TTS (Text-to-Speech) and STT (Speech-to-Text) purposes

    Reviewed on Jun 25, 2024
    Review provided by PeerSpot

    What is our primary use case?

    We use the solution for TTS (Text-to-Speech) and STT (Speech-to-Text) purposes.

    What is most valuable?

    The solution's Speech-to-Text conversion feature is really awesome.

    What needs improvement?

    Deepgram is currently restricted to only the English variants, but it should include other languages, such as German or French.

    For how long have I used the solution?

    I have been using Deepgram for five to six months.

    What do I think about the stability of the solution?

    Deepgram is a stable solution.

    What do I think about the scalability of the solution?

    The Deepgram cloud can handle large volumes of audio data. Around three to four people use the solution in our organization.

    How was the initial setup?

    The solution’s initial setup is easy.

    What's my experience with pricing, setup cost, and licensing?

    Deepgram is a cheap solution. We can create an account for $200, which we can initially use for the Deepgram services.

    What other advice do I have?

    I have used Deepgram with Twilio for the calling system. I would recommend Deepgram to users who want to use it for speech-to-text purposes.

    Overall, I rate the solution an eight out of ten.

    Boris Morozov

    Offers great speed during transcription compared to other tools

    Reviewed on Jun 24, 2024
    Review provided by PeerSpot

    What is our primary use case?

    I am the software development team lead in my company, and we are creating a speech recognition product based on a few different engines. The company works in the legal process, so my team generates a machine-based transcript and then converts it to a readable format. It saves time for the person using the machine-based transcript as a starting point.

    What is most valuable?

    The solution's most valuable feature is its speed of transcription. It is among the fastest tools available: compared to the second-fastest solution you can get, Deepgram is 20 times faster. Thus, it is not just a marginally faster product.

    What needs improvement?

    In comparison to Deepgram, I would say that the transcript accuracy offered by other products is higher. In our company, we ran a blind test on five jobs. Each job was transcribed by two different engines, one of which was Deepgram, so each transcriber received two versions without knowing which engine had produced which. The transcribers then chose the version they thought would take less work to convert into the final, legally binding document. Since the transcribers are paid per page, they had a clear incentive to pick the version that let them finish with as little work and as quickly as possible. We didn't have a huge sample size because we are a small company: five jobs with five different transcribers, and two versions per job. From that blind test, we found that the other tools were marginally more accurate than Deepgram.

    I would like it to be more accurate. An improved tool would give me maintainability and fast transcripts along with the accuracy we need.
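
    The reviewer's caveat about sample size can be made concrete with a small sketch. The review does not say how the five blind comparisons split, so the snippet below simply lists every possible outcome and the p-value an exact two-sided sign test would give it. The function name and the choice of a sign test are illustrative assumptions, not part of the reviewer's procedure.

```python
from math import comb


def sign_test_p_value(wins: int, trials: int) -> float:
    """Two-sided exact sign test against a 50/50 null hypothesis.

    wins is the number of blind comparisons in which one engine's
    transcript was preferred; trials is the total number of comparisons.
    """
    # Use the larger side so the tail is always the upper tail.
    k = max(wins, trials - wins)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    # Double the tail for a two-sided test and cap at 1.
    return min(1.0, 2 * tail)


# Five jobs, as in the review; the actual split is not reported,
# so every possible outcome is shown.
TRIALS = 5
for wins in range(TRIALS + 1):
    print(f"{wins}/{TRIALS} preferred engine A: p = {sign_test_p_value(wins, TRIALS):.4f}")
```

    Under these assumptions, even a unanimous 5-to-0 preference yields p = 0.0625, so five jobs cannot settle the question at the conventional 0.05 level. That is consistent with the reviewer treating the accuracy gap as marginal.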

    For how long have I used the solution?

    I have been using Deepgram for a year. I use the solution's latest version. I am an end-user of the tool.

    What do I think about the stability of the solution?

    I don't use all of the features in the tool because Deepgram adds new features every few months. The features that I have been using have been very stable. I have never had any issues with the tool, and it has never crashed. In our company, we just update to the latest version and see no issues until the next update.

    What do I think about the scalability of the solution?

    I use the tool as a software developer and transcriber. Though I don't know the exact number of transcribers who work in our company, I can say that it runs into the hundreds.

    How are customer service and support?

    I contacted the solution's technical support for help, received decent service, and had no complaints at all. Updating Deepgram yourself is currently a very easy process: you just pick the version of the tool and pull it from Docker. If you want to update the model, however, you have to contact support. I haven't contacted the support team very often, perhaps three times, and only to update to new versions of the model.

    Which solution did I use previously and why did I switch?

    My company still uses multiple products in parallel with Deepgram. For example, in the transcript business, some clients want the transcript super fast, within a few hours. You then need to produce the machine transcript in a few minutes and give it to the transcriber, who should start working on it immediately. In such cases, Deepgram is a huge improvement, which is a major advantage.

    How was the initial setup?

    Comparing Deepgram with the other engines I have used, I would say that ease of installation is one of the product's strong points. Compared to all other engines, the installation of Deepgram has been simpler and far more stable. It just gets updated, and it runs properly. The good thing is that the tool is modular: the modules and the engine itself are separate objects. If you want to update only a module, there is no need to redeploy anything. Most of the other engines that we have in our company are on an on-premises model.

    The solution is deployed in on-premises and cloud environments. It is deployed in the private section of our company's cloud. We aren't using the API and prefer to use our own deployment.

    What's my experience with pricing, setup cost, and licensing?

    When using Deepgram, you pay for the hours or minutes of audio transcribed. The more hours you commit to in advance, the cheaper the price. It is slightly cheaper than the other engines we used. You should also take into account that you usually pay for cloud resources, even if you are doing just an on-premises deployment. Since transcription is fast, you can theoretically save twice: once on the usage fee and a second time on cloud resources. If you can finish transcribing in a minute instead of an hour and your pipeline scales down afterward, you only pay for that minute.
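
    The "save twice" reasoning can be sketched with simple arithmetic. Every price below is a hypothetical placeholder chosen for illustration, not a Deepgram or AWS rate. The only figure taken from the review is the idea that an hour of audio can finish in about a minute, which is modeled here as a speedup of 60.

```python
# Hypothetical compute price per instance-hour; not an actual cloud rate.
COMPUTE_PRICE_PER_HOUR = 3.00


def job_cost(audio_hours: float, speedup: float, price_per_audio_hour: float) -> dict:
    """Split a transcription job's cost into usage fee and compute time.

    speedup is how many hours of audio the engine processes per hour of
    wall-clock time (60 means an hour of audio finishes in a minute).
    The model assumes the pipeline scales down as soon as the job ends,
    so compute is billed only while transcription is running.
    """
    usage = audio_hours * price_per_audio_hour
    compute_hours = audio_hours / speedup
    compute = compute_hours * COMPUTE_PRICE_PER_HOUR
    return {"usage": usage, "compute": compute, "total": usage + compute}


# Assumed usage fees: a slower engine at 0.60 per audio hour and a faster,
# slightly cheaper engine at 0.50 per audio hour.
slow = job_cost(audio_hours=100, speedup=1, price_per_audio_hour=0.60)
fast = job_cost(audio_hours=100, speedup=60, price_per_audio_hour=0.50)

for name, cost in (("slow engine", slow), ("fast engine", fast)):
    print(f"{name}: usage={cost['usage']:.2f} compute={cost['compute']:.2f} total={cost['total']:.2f}")
```

    With these placeholder numbers, the faster engine saves a little on the usage fee and far more on compute, because the instances run for a fraction of the time. The real split depends on your own rates and on whether your pipeline actually scales down when idle.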

    What other advice do I have?

    Whether you should use the tool depends on your use cases. If maintainability, meaning how easy the engine is to maintain, is your priority, you should definitely go with Deepgram.

    Speaking about how Deepgram handles large volumes of audio data without compromising accuracy, it is always a trade-off. As I said, Deepgram is not as accurate as the other engines we're using, but the difference is marginal, and if speed is more important to you, you should go with Deepgram. If the accuracy of the transcript is far more important, I think other engines give better results.

    For an end user, the tool offers on-premises deployment, and it also has an API. Using the API is super easy since you just log in to the site, create an account, and start using it. If you want to deploy it on-premises, you need to have a basic understanding of the cloud and how it works. The tool has a very nice step-by-step guide on its website, and I still use it because it is super simple and easy to understand.

    We haven't actually used the AI features so far because we're not quite sure how we can use them in our company. We work on legal transcripts, and those have to be produced word for word, even if the speaker says certain things incorrectly. We have to maintain 100 percent accuracy. I am not sure how we can apply the tool's AI features, but we are always looking at the AI aspect in our company.

    Deepgram offers some clear advantages over other applications. If you want a tool that produces transcripts very quickly, nothing comes even close to Deepgram.

    I rate the tool an eight or nine out of ten.

    Which deployment model are you using for this solution?

    Hybrid Cloud