Pricing overview
Deepgram's pricing model is primarily usage-based: Speech-to-Text and Audio Intelligence (Analyze) are billed per minute of audio processed, and Text-to-Speech (Aura) is billed per character. This model is designed to scale with application demand, from initial development to high-volume production deployments. The official Deepgram pricing page outlines the available tiers and associated rates.
The core components of Deepgram's pricing are:
- Speech-to-Text: Billed per minute of audio transcribed. Different models (e.g., Base, Enhanced) and features (e.g., diarization, sentiment analysis) may have varying per-minute rates.
- Text-to-Speech (Aura): Billed per character of text converted into speech.
- Audio Intelligence (Analyze): Billed per minute of audio analyzed, often in conjunction with Speech-to-Text services. This includes features like summarization, topic detection, and entity recognition.
- Customization: Training custom speech models may incur additional costs, either as a one-time fee or a higher per-minute rate for inference.
Deepgram also offers volume discounts that automatically apply as usage increases, reducing the per-minute cost at higher consumption levels. For very large-scale or specialized requirements, custom enterprise plans are available.
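A minimal sketch of how an automatic volume discount could be modeled. The tier thresholds and rates in `ASSUMED_TIERS` are placeholders chosen for illustration, not Deepgram's published schedule, and the sketch assumes that the discounted rate applies to all minutes in the month rather than only to the minutes above each threshold. `Decimal` is used so that small per-minute rates do not pick up floating-point error.

```python
from decimal import Decimal

# Hypothetical discount schedule: (minimum monthly minutes, rate per minute).
# These thresholds and rates are placeholders, not Deepgram's published tiers.
ASSUMED_TIERS = [
    (0, Decimal("0.0040")),
    (50_000, Decimal("0.0035")),
    (250_000, Decimal("0.0030")),
]


def rate_for_volume(minutes: int) -> Decimal:
    """Return the per-minute rate of the highest tier the usage reaches."""
    rate = ASSUMED_TIERS[0][1]
    for threshold, tier_rate in ASSUMED_TIERS:
        if minutes >= threshold:
            rate = tier_rate
    return rate


def monthly_cost(minutes: int) -> Decimal:
    """Bill every minute at the single rate selected by total volume (assumption)."""
    return minutes * rate_for_volume(minutes)


for minutes in (6_000, 60_000, 300_000):
    print(f"{minutes:>7} min -> ${monthly_cost(minutes):.2f}")
```

If a provider instead discounts only the minutes above each threshold, `monthly_cost` would need to sum the cost of each band separately, so it is worth confirming which scheme applies before budgeting.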
Plans and tiers
Deepgram organizes its pricing into four tiers, each with different per-minute rates and feature access. The tiers are structured around usage volume and technical requirements.
| Plan | Starting Price (per minute) | Key Features & Limits | Best For |
|---|---|---|---|
| Free | $0 | 10,000 requests per month. Access to core models. | Prototyping, small projects, initial testing |
| Standard | $0.004 | Base models, basic features. Volume discounts apply automatically. | Startups, growing applications, general transcription needs |
| Premium | Custom | Enhanced models, advanced features (e.g., custom models, more granular audio intelligence). Dedicated support. | High-volume applications, specialized use cases, applications requiring higher accuracy or specific features |
| Enterprise | Custom | All Deepgram features, custom SLAs, dedicated infrastructure, on-premise deployment options. | Large enterprises, highly regulated industries, mission-critical applications |
Specific pricing for Premium and Enterprise tiers requires direct contact with Deepgram sales, as these plans are tailored to individual client needs, including factors like custom model training, dedicated resources, and specific service level agreements (SLAs).
Free tier and limits
Deepgram provides a free tier that lets developers experiment with its APIs at no upfront cost. The tier includes 10,000 requests per month, which is sufficient for initial development, proof-of-concept projects, and testing the API's capabilities across different models and audio types.
Key aspects of the free tier:
- Usage Allowance: The free tier grants a fixed monthly allowance (currently listed as 10,000 requests). Once the allowance is exceeded, usage is billed at the standard paid tier rates, provided a payment method is on file.
- Feature Access: The free tier generally provides access to Deepgram's core speech models and basic transcription features. Advanced features or specialized models might be limited or require a paid subscription.
- Account Management: Users can monitor their usage through the Deepgram developer console to track consumption against free tier limits.
Developers should review the current free tier terms on the Deepgram pricing page, as specific allowances can be subject to change.
Real-world cost examples
To illustrate Deepgram's usage-based pricing, consider several common scenarios:
Scenario 1: Small-scale application with 100 hours of audio per month
- Usage: 100 hours (6,000 minutes) of audio for basic Speech-to-Text.
- Tier: Standard (assuming usage exceeds free tier).
- Estimated Cost: 6,000 minutes * $0.004/minute = $24.00 per month.
- Notes: This assumes no advanced features are used and that no volume discount applies at this usage level, so the Standard starting rate is used as-is.
Scenario 2: Medium-sized application with 1,000 hours of audio and diarization
- Usage: 1,000 hours (60,000 minutes) of audio for Speech-to-Text with diarization.
- Tier: Standard (with volume discount). Diarization might add a small per-minute surcharge.
- Estimated Cost: If the base rate drops to $0.0035/minute due to volume and diarization adds $0.0005/minute, the effective rate is $0.004/minute. 60,000 minutes * $0.004/minute = $240.00 per month.
- Notes: Volume discounts are automatically applied. The exact surcharge for features like diarization is detailed on the Deepgram pricing page.
Scenario 3: Large-scale call center transcription with 5,000 hours of audio, custom models, and audio intelligence
- Usage: 5,000 hours (300,000 minutes) of audio.
- Tier: Likely Premium or Enterprise due to custom model requirements and advanced audio intelligence.
- Estimated Cost: This scenario would involve custom pricing. If a blended rate of $0.003/minute is negotiated, the usage cost would be 300,000 minutes * $0.003/minute = $900.00 per month. Custom model training and, potentially, dedicated support would be charged on top of that figure.
- Notes: Enterprise solutions often include more than just per-minute costs, such as setup fees for custom models or dedicated infrastructure.
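The arithmetic in the three Speech-to-Text scenarios above can be reproduced with a small helper. This is a sketch under the same assumptions as the scenarios: the rates are the illustrative figures from the text, and the function name `stt_monthly_cost` and its `fixed_fees` parameter are hypothetical conveniences, not part of any Deepgram SDK.

```python
from decimal import Decimal


def stt_monthly_cost(hours, base_rate, surcharges=(), fixed_fees=Decimal("0")):
    """Estimate a monthly Speech-to-Text bill.

    hours       -- audio hours processed in the month
    base_rate   -- per-minute rate, as a Decimal
    surcharges  -- per-minute add-on rates (e.g. diarization), as Decimals
    fixed_fees  -- flat monthly charges such as support (hypothetical)
    """
    minutes = hours * 60
    per_minute = base_rate + sum(surcharges, Decimal("0"))
    return minutes * per_minute + fixed_fees


# Rates below are the illustrative figures used in the scenarios.
scenario_1 = stt_monthly_cost(100, Decimal("0.004"))
scenario_2 = stt_monthly_cost(1_000, Decimal("0.0035"), [Decimal("0.0005")])
scenario_3 = stt_monthly_cost(5_000, Decimal("0.003"))

for label, cost in [("Scenario 1", scenario_1),
                    ("Scenario 2", scenario_2),
                    ("Scenario 3", scenario_3)]:
    print(f"{label}: ${cost:.2f} per month")
```

Passing a non-zero `fixed_fees` to the third call is one way to explore how setup or support charges change the Enterprise total.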
Scenario 4: Text-to-Speech for a voice assistant
- Usage: Generating 10 million characters of speech per month.
- Tier: Standard Text-to-Speech.
- Estimated Cost: If the rate is $0.000015 per character, 10,000,000 characters * $0.000015/character = $150.00 per month.
- Notes: Text-to-Speech pricing is separate from Speech-to-Text and is based on character count.
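Because Text-to-Speech is metered per character, the same estimate can be run in both directions: from characters to cost, and from a budget to a character allowance. The sketch below uses the illustrative rate of $0.000015 per character from Scenario 4; the function names are hypothetical.

```python
from decimal import Decimal

# Illustrative rate from Scenario 4, not a quoted price.
ASSUMED_TTS_RATE = Decimal("0.000015")


def tts_monthly_cost(characters: int, rate: Decimal = ASSUMED_TTS_RATE) -> Decimal:
    """Cost of synthesizing the given number of characters."""
    return characters * rate


def characters_for_budget(budget: Decimal, rate: Decimal = ASSUMED_TTS_RATE) -> int:
    """Largest whole number of characters that fits within the budget."""
    return int(budget // rate)


cost = tts_monthly_cost(10_000_000)
allowance = characters_for_budget(Decimal("100"))
print(f"10,000,000 characters cost ${cost:.2f}")
print(f"A $100 budget covers {allowance:,} characters")
```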
How the pricing compares
When evaluating Deepgram's pricing, it is useful to compare it against other Speech-to-Text and audio intelligence providers. Competitors such as AssemblyAI, Rev.ai, and cloud providers like AWS Transcribe offer similar services with varying pricing structures. The AWS Transcribe pricing documentation provides a detailed breakdown of its usage-based costs, which are metered per second of audio processed.
Key comparison points often include:
- Per-minute rates: Deepgram's starting rate of $0.004 per minute for standard Speech-to-Text is competitive. Some alternatives may offer lower initial rates but might have higher costs for advanced features or less aggressive volume discounts.
- Feature pricing: The cost of advanced features like diarization, sentiment analysis, custom models, and real-time processing can vary significantly. Some providers bundle these, while others charge them as add-ons.
- Free tier generosity: Deepgram's 10,000 requests per month free tier provides a substantial allowance for testing compared to some competitors who might offer fewer minutes or a shorter trial period. For example, AssemblyAI's pricing page details a free tier that typically includes a set number of free hours.
- Volume discounts: Deepgram's automatic volume discounts are a factor for scaling applications. High-volume users should compare the discount tiers across providers.
- Text-to-Speech pricing: When comparing Text-to-Speech services, the per-character cost, available voices, and language support are crucial.
- Enterprise options: For large-scale deployments, custom pricing, SLAs, and dedicated support become important. The ability to deploy on-premise or in specific cloud environments can also influence the total cost of ownership.
Developers often conduct a total cost of ownership (TCO) analysis, factoring in not just the per-minute or per-character rates, but also the developer experience, ease of integration (via SDKs and documentation), and the accuracy and latency of the transcription and audio intelligence services for their specific use case.
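The rate-related part of such a comparison can be sketched in a few lines. The example below shows how a provider with a lower base rate can become more expensive once add-on features are included. Both providers and all of their rates are invented for illustration and do not represent any real vendor's pricing.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class ProviderQuote:
    name: str
    base_rate: Decimal                          # per-minute rate
    addons: dict = field(default_factory=dict)  # feature -> per-minute surcharge

    def effective_rate(self, features) -> Decimal:
        # Features missing from `addons` are treated as bundled at no extra cost.
        return self.base_rate + sum(
            (self.addons.get(f, Decimal("0")) for f in features), Decimal("0")
        )


def cheapest(quotes, minutes, features):
    """Return the name of the lowest-cost provider and the cost per provider."""
    costs = {q.name: minutes * q.effective_rate(features) for q in quotes}
    return min(costs, key=costs.get), costs


# Hypothetical quotes: provider_a bundles diarization, provider_b charges for it.
quotes = [
    ProviderQuote("provider_a", Decimal("0.0040")),
    ProviderQuote("provider_b", Decimal("0.0035"), {"diarization": Decimal("0.0010")}),
]

plain_winner, plain_costs = cheapest(quotes, 60_000, [])
diarized_winner, diarized_costs = cheapest(quotes, 60_000, ["diarization"])
print(plain_winner, diarized_winner)
```

Accuracy, latency, and integration effort do not reduce to a per-minute figure, so a calculation like this covers only one input to the overall TCO analysis.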