
Overview

AssemblyAI offers Speech AI models via an API that product teams and developers can use to build powerful AI solutions based on voice data. Thousands of developers build on AssemblyAI's Speech AI models every day to run Speech-to-Text on multilingual speech and to apply Large Language Models to that voice data, including answering questions, generating content, and extracting metadata in seconds. AssemblyAI offers async transcription, with most audio files completing in well under 45 seconds regardless of audio duration, as well as real-time transcription with high accuracy and under 600 ms of latency.
AssemblyAI gives you access to state-of-the-art Speech AI models and capabilities for real-world use cases, so you can build smarter applications in a fraction of the time. Models and features include:
- Speech recognition
- Speaker diarization
- Auto punctuation and casing
- Auto language detection
- Summarization
- Content moderation
- Sentiment analysis
- Auto highlights
- PII redaction
- Topic detection (IAB classification)
- Entity detection
- Auto chapters
- Custom spelling
- Custom vocabulary
- Dual channel transcription
- Export SRT or VTT caption files
- Filler word filtering
- Profanity filtering
In addition, LeMUR lets users apply Large Language Models to audio transcripts from one or many audio files, for tasks like summarization, question answering, and AI coaching feedback.
Our Speech AI products support 33 different audio and video file types and 99+ languages. Our models are used by thousands of breakthrough startups and dozens of global enterprises for mission-critical workloads.
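As a rough illustration of how the features listed above might be switched on per request, the sketch below builds a JSON request body with one boolean flag per feature. The flag names (`speaker_labels`, `language_detection`, and so on) and the `audio_url` field are assumptions based on the public AssemblyAI transcript API as commonly documented; this listing does not state them, so check them against the current API reference. The helper `build_transcript_request` and the example URL are hypothetical, and nothing is sent over the network.

```python
import json

# Assumed mapping from feature names in this listing to request flags.
# These flag names are NOT given in the listing; verify them against the
# current AssemblyAI API reference before relying on them.
FEATURE_FLAGS = {
    "Speaker diarization": "speaker_labels",
    "Auto language detection": "language_detection",
    "Sentiment analysis": "sentiment_analysis",
    "Entity detection": "entity_detection",
    "Profanity filtering": "filter_profanity",
}


def build_transcript_request(audio_url, features):
    """Return a JSON-serializable request body with the chosen features enabled.

    Hypothetical helper: it only assembles a dictionary and performs no I/O.
    """
    unknown = [name for name in features if name not in FEATURE_FLAGS]
    if unknown:
        raise ValueError(f"Unsupported feature(s): {unknown}")

    body = {"audio_url": audio_url}
    for name in features:
        body[FEATURE_FLAGS[name]] = True
    return body


if __name__ == "__main__":
    request_body = build_transcript_request(
        "https://example.com/call.mp3",  # placeholder URL
        ["Speaker diarization", "Sentiment analysis"],
    )
    print(json.dumps(request_body, indent=2))
```

Keeping the mapping in one dictionary means a renamed or newly added flag only has to be changed in one place.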
Highlights
- Unparalleled Human-Level Accuracy: Our multilingual speech recognition AI models deliver industry-leading performance with the lowest word error rates on the market, outperforming competitors by over 60% when recognizing challenging content like rare words and proper nouns. Trusted by more than 3,000 innovative companies, including Zoom, our platform provides the foundation for mission-critical speech applications at scale.
- Enterprise-Grade Performance and Security: Our APIs deliver unmatched scalability for high-concurrency applications. Security is embedded with SOC 2 Type 2, PCI DSS, and GDPR compliance. For healthcare applications, AssemblyAI offers Business Associate Agreements (BAAs). Choose flexible hosting options in both US and EU regions.
- Comprehensive Audio Intelligence Suite: Our advanced models summarize conversations, identify speakers through diarization, analyze sentiment, moderate content, automatically redact PII, and much more, all in a single platform. Our LeMUR framework seamlessly connects spoken data with large language models, enabling unlimited possibilities for voice-powered applications.
Details

Features and programs
Financing for AWS Marketplace purchases
Pricing
Free trial
Dimension | Description | Cost/month |
---|---|---|
Pay As You Go | State-of-the-art production-ready AI models | $0.00 |
Slam_1_STT | Slam-1 speech-to-text (core) | $0.37 |
haiku3_5_input | Claude 3.5 Haiku 1k token input (LeMUR) | $0.001 |
haiku3_5_output | Claude 3.5 Haiku 1k token output (LeMUR) | $0.004 |
sonnet3_7_input | Claude 3.7 Sonnet 1k token input (LeMUR) | $0.003 |
sonnet3_7_output | Claude 3.7 Sonnet 1k token output (LeMUR) | $0.015 |
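The per-1k-token LeMUR prices in the table above can be turned into a per-request estimate with simple arithmetic. The sketch below is illustrative only: the function name `lemur_cost` and the dictionary layout are hypothetical, the prices are copied from the table, and it assumes billing is strictly proportional to token count with no rounding or minimums.

```python
from decimal import Decimal

# Prices per 1,000 tokens, copied from the table above (USD).
LEMUR_PRICES_PER_1K = {
    "haiku3_5": {"input": Decimal("0.001"), "output": Decimal("0.004")},
    "sonnet3_7": {"input": Decimal("0.003"), "output": Decimal("0.015")},
}


def lemur_cost(model, input_tokens, output_tokens):
    """Estimate the cost of one LeMUR request in USD.

    Assumes cost scales linearly with tokens; actual billing rules
    (rounding, minimums) are not stated in the listing.
    """
    prices = LEMUR_PRICES_PER_1K[model]
    input_cost = Decimal(input_tokens) / 1000 * prices["input"]
    output_cost = Decimal(output_tokens) / 1000 * prices["output"]
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 10,000-token transcript in, a 2,000-token answer out.
    for model in LEMUR_PRICES_PER_1K:
        print(model, lemur_cost(model, 10_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding noise.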
The following dimensions are not included in the contract terms and will be charged based on your usage.
Dimension | Cost/unit |
---|---|
Async Transcription (core) | $0.37 |
Nano Speech-to-Text (core) | $0.12 |
Real-Time Transcription (core) | $0.47 |
Auto Chapters (Audio Intelligence) | $0.08 |
Content Moderation (Audio Intelligence) | $0.15 |
Entity Detection (Audio Intelligence) | $0.08 |
Key Phrases (Auto Highlights) | $0.01 |
PII Redaction (Audio Intelligence) | $0.08 |
PII Audio Redaction (Audio Intelligence) | $0.05 |
Sentiment Analysis (Audio Intelligence) | $0.02 |
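To see how the usage-based dimensions combine on a bill, the following sketch multiplies consumed units by the per-unit prices from the table above and sums the result. The listing does not say what one "unit" is, so the code treats it as an opaque billing unit; the function name `estimate_usage_cost` and the sample quantities are hypothetical.

```python
from decimal import Decimal

# Per-unit prices copied from the table above (USD). The listing does not
# define the unit, so it is treated here as an opaque billing unit.
USAGE_PRICES = {
    "Async Transcription (core)": Decimal("0.37"),
    "Nano Speech-to-Text (core)": Decimal("0.12"),
    "Real-Time Transcription (core)": Decimal("0.47"),
    "Auto Chapters (Audio Intelligence)": Decimal("0.08"),
    "Content Moderation (Audio Intelligence)": Decimal("0.15"),
    "Entity Detection (Audio Intelligence)": Decimal("0.08"),
    "Key Phrases (Auto Highlights)": Decimal("0.01"),
    "PII Redaction (Audio Intelligence)": Decimal("0.08"),
    "PII Audio Redaction (Audio Intelligence)": Decimal("0.05"),
    "Sentiment Analysis (Audio Intelligence)": Decimal("0.02"),
}


def estimate_usage_cost(units_by_dimension):
    """Return (line_items, total) for a mapping of dimension name -> units used."""
    line_items = {}
    for dimension, units in units_by_dimension.items():
        if dimension not in USAGE_PRICES:
            raise KeyError(f"Unknown dimension: {dimension}")
        # Convert via str so float inputs such as 2.5 stay exact.
        line_items[dimension] = Decimal(str(units)) * USAGE_PRICES[dimension]
    total = sum(line_items.values(), Decimal("0"))
    return line_items, total


if __name__ == "__main__":
    items, total = estimate_usage_cost({
        "Async Transcription (core)": 10,
        "Sentiment Analysis (Audio Intelligence)": 10,
    })
    for name, cost in items.items():
        print(f"{name}: ${cost}")
    print(f"Total: ${total}")
```

Add-on dimensions such as sentiment analysis are listed separately from core transcription, so an estimate for one workload typically needs one entry per dimension used.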
Vendor refund policy
All fees are non-refundable and non-cancellable except as required by law.
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Resources
Vendor resources
Support
Vendor support
Support is available via chat and email 24/7. Email: support@assemblyai.com
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.



FedRAMP
GDPR
HIPAA
ISO/IEC 27001
PCI DSS
SOC 2 Type 2
Standard contract
Customer reviews
My experience with AssemblyAI API
Developer-Friendly and Accurate Transcripts
Great, cost effective solution
Accurate and Cheap!
AssemblyAI STT: Simple, Affordable, but Not Without Tradeoffs
✅ Ridiculously easy to use – The API is straightforward and well-documented. I was up and running in minutes without needing to dig into edge-case docs.
🔧 Effortless integration – Plugged it right into our existing STT pipeline with minimal changes. It felt like it was designed to just fit in.
💸 Cost-effective – It gave us solid transcription quality at a much lower price point compared to other providers, which made it a no-brainer from a budgeting standpoint.
🕒 Inconsistent response times – We noticed variability in transcription latency, especially during higher-load windows. This made it tricky to rely on for real-time-ish workflows.
⚙️ Limited customization – The API didn’t offer much flexibility in tailoring the model to domain-specific vocab or acoustic quirks. If you're working in a niche industry or need fine-tuned accuracy, you're boxed in a bit.
We’re leveraging AssemblyAI to automate transcription of all our cold calls, and it’s solving a very specific but critical pain point:
📞 Manual note-taking is dead – No more wasting time jotting down call summaries or missing important details. Every conversation is accurately logged.
🧠 Instant access to customer insights – Having clean, searchable transcripts helps our sales and marketing teams quickly analyze conversations, spot objections, and refine messaging.
🔄 Improved workflow automation – Transcriptions feed into our CRM and internal tools, enabling follow-ups, QA, and even training analysis without human bottlenecks.
The real win? Time savings, better visibility, and a more scalable cold-calling process.