 
AssemblyAI
Reviews from AWS customers
0 AWS reviews
- 5 star: 0
- 4 star: 0
- 3 star: 0
- 2 star: 0
- 1 star: 0
External reviews
                                
90 reviews
External reviews are not included in the AWS star rating for the product.
High-quality speech recognition with robust diarization and smart API design
What do you like best about the product?
AssemblyAI impresses with its high transcription quality, even when dealing with messy or low-quality audio inputs. The diarization capabilities are particularly strong—accurately distinguishing between speakers in less-than-perfect recordings. The API suite is fast, well-documented, and returns a rich, detailed output format that makes post-processing straightforward and powerful. I also found the Word Boost feature especially helpful: being able to prioritize tricky or uncommon words significantly improves recognition accuracy in niche use cases. Overall, it’s a developer-friendly platform that balances precision with flexibility.
What do you dislike about the product?
Honestly, there’s little to complain about. The pricing model is reasonable for the level of quality and features provided, and I haven’t encountered any significant drawbacks in my usage.
What problems is the product solving and how is that benefiting you?
Transcription and diarization of complex audio.
                        
Great transcription for Spanish, quicker than other providers
What do you like best about the product?
It's really great for Spanish specifically and for speaker diarization. It's also quick compared to the Speechmatics API, which is really slow, so kudos on that, and it's been really cost-effective. I must have transcribed 800-1000 calls with the free credits, so that's really great. Overall, super solid.
What do you dislike about the product?
I think the worst part about Assembly has been that the API itself is a bit complicated to work with: with recordings, you've got to turn them into links first and then send the links and transcript IDs to a separate endpoint. I can still work with it and have done lots of things, but for recordings it would be easier if a single API call did all of this in the background.
What problems is the product solving and how is that benefiting you?
It is the only API we've found that reliably transcribes some of our lower-quality or foreign-accented calls in Spanish with correct diarization. We haven't found another API that does this well after trying most of the popular APIs (e.g. Deepgram, Speechmatics).
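The multi-step flow this reviewer describes (upload a recording to get a link, submit the link, then fetch the result by transcript ID) can be wrapped in one helper. The following is a minimal Python sketch, not official client code: the base URL, the `/upload` and `/transcript` paths, the `authorization` header, and the `upload_url`, `audio_url`, `id`, and `status` fields are assumptions based on AssemblyAI's public REST documentation and should be checked against the current docs. `http_call` and `transcribe_recording` are hypothetical names.

```python
import json
import time
import urllib.request

# Assumed base URL; verify against the current AssemblyAI documentation.
BASE_URL = "https://api.assemblyai.com/v2"


def http_call(method, url, api_key, body=None, content_type=None):
    """Send one HTTP request and decode the JSON response."""
    headers = {"authorization": api_key}
    if content_type:
        headers["content-type"] = content_type
    req = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def transcribe_recording(audio_bytes, api_key, call=http_call,
                         poll_seconds=3.0, max_polls=200):
    """Upload audio, submit it for transcription, and wait for the result."""
    # Step 1: upload the raw recording to obtain a link (assumed field: upload_url).
    upload = call("POST", f"{BASE_URL}/upload", api_key,
                  audio_bytes, "application/octet-stream")

    # Step 2: submit that link as a transcription job (assumed field: audio_url).
    payload = json.dumps({"audio_url": upload["upload_url"]}).encode("utf-8")
    job = call("POST", f"{BASE_URL}/transcript", api_key,
               payload, "application/json")

    # Step 3: poll the job by transcript ID until it finishes or fails.
    for _ in range(max_polls):
        result = call("GET", f"{BASE_URL}/transcript/{job['id']}", api_key)
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("transcript was not ready in time")
```

The `call` parameter exists so the three steps can be exercised with a stub instead of the network.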
                        
Opens new doors for text analysis research
What do you like best about the product?
I'm an academic. I recently started using AssemblyAI for a project I've been interested in doing for years; I just didn't have a good way to generate transcripts from videos. I've been using it extensively over the past few weeks, and I imagine I will use it a lot in brief spurts over the coming months and years.
I reached out with a question about academic use and was surprised by how quickly AAI responded (but, please recognize .edu as a valid work e-mail).
I started working with AssemblyAI on the free credits (which is a great way to "test drive"). It took me a while to get things just as I wanted, but once I got there, it has been smooth sailing, and its integration into my research workflow is largely automated. I've found the transcription quite accurate (this is the standard model, not the fancy new one). Processing time is fast, and everything is readily scriptable. The documentation is rather nice.
What do you dislike about the product?
I think there are two things I would like to see in the future.
First, I think the documentation is kind of balkanized. It would be nice if it was more streamlined. In my case, this really goes for formatting the output. More sample scripts for the output would be great. This would have made initial implementation a fair bit easier (I'd call it a 5/10 difficulty... and I'd call myself an ok-ish Python user).
Second, I would like to see interruption/overlay detection. I get that might be hard without multiple microphones. For this one, I'm just going to hold out hope for the steady march of progress.
What problems is the product solving and how is that benefiting you?
In my research, I'm keen to build transcripts for text analysis. I'm dealing with a corpus that isn't written down; it exists only as audio/video recordings. AAI is helping me construct those documents. I've always been excited by my research, but I am REALLY excited by where AAI can help me take it!
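As one illustration of the kind of output-formatting sample script this reviewer asks for, here is a small Python sketch that turns diarized utterances into a readable transcript. It assumes each utterance is a dict with `speaker`, `start` (in milliseconds), and `text` keys; those field names and both function names are assumptions for illustration, not a confirmed description of the API's response.

```python
def format_timestamp(ms):
    """Convert a millisecond offset into an HH:MM:SS string."""
    total_seconds = ms // 1000
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def format_utterances(utterances):
    """Render one line per utterance: [time] Speaker X: text."""
    lines = []
    for u in utterances:
        stamp = format_timestamp(u["start"])
        lines.append(f"[{stamp}] Speaker {u['speaker']}: {u['text']}")
    return "\n".join(lines)


# Hypothetical sample data in the assumed shape.
sample = [
    {"speaker": "A", "start": 0, "text": "Hello."},
    {"speaker": "B", "start": 61500, "text": "Hi."},
]
print(format_utterances(sample))
```

A plain-text layout like this is easy to load into text-analysis tools; the same loop could write CSV rows instead if per-speaker columns are needed.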
                        
I am happy with the speed and the quality of the text recognition
What do you like best about the product?
Speaker recognition and the precision of the word recognition
What do you dislike about the product?
That the live speech-to-text doesn’t recognize speakers.
What problems is the product solving and how is that benefiting you?
We use the transcript to translate meetings and then pass it to an LLM to produce a summary.
                        
Analysis of conversations from our Customer Service and Sales
What do you like best about the product?
We use AssemblyAI to transcribe WhatsApp audio messages and calls so that we can analyze the conversations with another AI.
What do you dislike about the product?
I have no negative comments about the tool.
What problems is the product solving and how is that benefiting you?
Call analysis and audio analysis, to validate the quality of technical support.
                        
Call center: transcribing calls
What do you like best about the product?
Precision and confidence are great for my use.
What do you dislike about the product?
Some features are not available for my language (pt_BR).
What problems is the product solving and how is that benefiting you?
It lets us recognize problems in phone calls without listening to every recording.
                        
API is amazing for quick implementation. Cost is very reasonable.
What do you like best about the product?
Ease of implementation. I have not encountered any issues so far. Responses are somewhat faster than Whisper speech-to-text.
What do you dislike about the product?
Difficulties with multi-speaker detection for people with an English accent. I wish there were a calibration feature for individual voices.
What problems is the product solving and how is that benefiting you?
I use it for recording notes and conversations. Because the transcription is accurate, it is easy to record a whole conversation and then apply a backend algorithm to summarize the details.
                        
Great Trial period | Easy API to Work with | Accurate transcription
What do you like best about the product?
- Easy to configure due to good documentation
- I am not a developer but figured it out
- Integrated into n8n for my automation
- Nano model is very cost-effective
- Great speaker detection
What do you dislike about the product?
- Took a little testing to get my settings correct, but the good documentation helped
- Works flawlessly now that I am off the free tier; I was throttled before that, which is understandable for a free account
What problems is the product solving and how is that benefiting you?
I wanted clearly identified speakers from the WAV files that are recorded in my CRM/ATS. I also wanted an automation so that when I drop a file in a folder, a transcription is returned to the same folder. n8n and AssemblyAI made this possible.
                        
Using AssemblyAI to get podcast episode transcripts
What do you like best about the product?
I use AssemblyAI to get transcripts of my podcast episodes, and the accuracy is pretty good.
The timestamps associated with each word allow us to easily connect the transcript to the podcast audio and jump right to where we need to be.
Customer support has been great.
What do you dislike about the product?
Nothing to complain about.
Sometimes it's a bit tricky when the podcaster spells out the promo code he uses.
For example, if the promo code is SUMMER, I may get S-U-M-M-E-R, which is not easy to work with. But it's an edge case.
What problems is the product solving and how is that benefiting you?
We get the podcast episode transcripts, with each word associated with a timestamp.
This gives us a lot of insight into what podcasters are saying and how they are promoting our promo codes.
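The spelled-out promo code edge case and the per-word timestamps described in this review can be handled together in post-processing. Below is a minimal Python sketch: it assumes each transcript word is a dict with `text` and `start` (milliseconds) keys, which is an assumption about the output shape, and `collapse_spelled` and `find_mentions` are hypothetical helper names.

```python
import re

# Matches three or more single letters joined by hyphens, e.g. "S-U-M-M-E-R".
SPELLED = re.compile(r"\b(?:[A-Za-z]-){2,}[A-Za-z]\b")


def collapse_spelled(token):
    """Turn 'S-U-M-M-E-R' into 'SUMMER'; leave ordinary hyphenated words alone."""
    return SPELLED.sub(lambda m: m.group(0).replace("-", "").upper(), token)


def find_mentions(words, code):
    """Return the start times (ms) of every word that matches the promo code."""
    target = code.upper()
    hits = []
    for w in words:
        # Normalize: collapse spelled-out letters, drop edge punctuation, ignore case.
        cleaned = collapse_spelled(w["text"]).strip(".,!?;:\"'").upper()
        if cleaned == target:
            hits.append(w["start"])
    return hits


# Hypothetical sample words in the assumed shape.
words = [
    {"text": "Use", "start": 0},
    {"text": "code", "start": 400},
    {"text": "S-U-M-M-E-R.", "start": 900},
    {"text": "summer", "start": 2000},
]
print(find_mentions(words, "SUMMER"))
```

The returned offsets can be used to seek the audio player to each mention. Requiring at least three hyphen-joined letters keeps words such as "well-known" or "x-ray" untouched.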
                        
A great solution to build into your product
What do you like best about the product?
We recently started using the AssemblyAI API to transcribe videos from our educational channels. The API works quickly and reliably. So far we have never run into any limitations of the platform, although our videos are quite large. The quality of recognition is very high, and the price is about the same as with OpenAI analogs, but there is no limit of 25 minutes per video fragment.
What do you dislike about the product?
I wish the price was even lower, since we have so many more videos to process. It is also not quite clear how formatting into paragraphs works: through the API we get the text without paragraphs, although in the version available for free through the web interface the recognized text is already formatted.
What problems is the product solving and how is that benefiting you?
We are using the AssemblyAI API to transcribe videos from our educational channels in order to build a RAG system.
                        
                            
                    