What is Deepgram's basic pricing model?

Deepgram primarily uses a usage-based pricing model, where costs are calculated per minute of audio processed for Speech-to-Text and Audio Intelligence, and per character for Text-to-Speech. Volume discounts apply automatically as usage increases.

Does Deepgram offer a free tier?

Yes, Deepgram provides a free tier that includes 10,000 requests per month, allowing developers to test and prototype applications without an initial cost.

How much does Deepgram's Speech-to-Text cost per minute?

The starting paid tier for Deepgram's Standard Speech-to-Text service is $0.004 per minute. This rate can decrease with higher usage due to volume discounts.

Are there additional costs for advanced features like diarization or custom models?

Yes, advanced features like diarization, sentiment analysis, or the use of custom-trained models may incur additional per-minute surcharges or custom fees, which are detailed on the official pricing page or through a sales consultation.

How is Deepgram's Text-to-Speech priced?

Deepgram's Text-to-Speech (Aura) service is typically priced per character generated, separate from the Speech-to-Text costs. The specific rate depends on the volume of characters.

What is the difference between Deepgram's Standard and Enterprise plans?

The Standard plan offers core models and features with volume discounts, suitable for growing applications. Enterprise plans are custom-quoted for large-scale needs, offering enhanced features, custom SLAs, dedicated support, and potential on-premise deployment options.

How do I monitor my Deepgram usage and costs?

Users can monitor their API usage and track costs against their free tier limits or paid subscriptions through the Deepgram developer console.

Deepgram Pricing: Usage-Based Speech AI Costs (2026)

Deepgram offers a usage-based pricing model for its speech-to-text, text-to-speech, and audio intelligence APIs, starting with a free tier and scaling with consumption. Costs are calculated per minute of audio processed for speech-to-text and audio intelligence, and per character for text-to-speech, with volume discounts at higher usage and custom-quoted enterprise plans.

Pricing overview

Deepgram's pricing model is primarily usage-based, with costs calculated per minute of audio processed for its Speech-to-Text and Audio Intelligence (Analyze) services, and per character for Text-to-Speech (Aura). This model is designed to scale with application demand, from initial development to high-volume production deployments. The official Deepgram pricing page outlines the available tiers and associated rates.

The core components of Deepgram's pricing are:

Speech-to-Text: Billed per minute of audio transcribed. Different models (e.g., Base, Enhanced) and features (e.g., diarization, sentiment analysis) may have varying per-minute rates.
Text-to-Speech (Aura): Billed per character of text converted into speech.
Audio Intelligence (Analyze): Billed per minute of audio analyzed, often in conjunction with Speech-to-Text services. This includes features like summarization, topic detection, and entity recognition.
Customization: Training custom speech models may incur additional costs, either as a one-time fee or a higher per-minute rate for inference.

Deepgram also offers volume discounts that automatically apply as usage increases, reducing the per-minute cost at higher consumption levels. For very large-scale or specialized requirements, custom enterprise plans are available.
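
As a rough illustration of how a volume-discounted, per-minute bill could be computed, the sketch below uses made-up tier boundaries and discounted rates. Only the $0.004 starting rate comes from this article; the thresholds, the lower rates, and the choice to bill all minutes at the rate of the tier reached (rather than marginally) are assumptions, not Deepgram's published schedule.

```python
from decimal import Decimal

# Hypothetical volume tiers: (minimum monthly minutes, rate per minute).
# Only the $0.004 starting rate is taken from the article; the other
# thresholds and rates are placeholders for illustration.
HYPOTHETICAL_TIERS = [
    (0, Decimal("0.004")),
    (50_000, Decimal("0.0035")),
    (250_000, Decimal("0.003")),
]


def rate_for_volume(minutes: int) -> Decimal:
    """Return the per-minute rate of the highest tier the usage reaches."""
    rate = HYPOTHETICAL_TIERS[0][1]
    for threshold, tier_rate in HYPOTHETICAL_TIERS:
        if minutes >= threshold:
            rate = tier_rate
    return rate


def monthly_stt_cost(minutes: int) -> Decimal:
    """Bill every minute at the rate of the tier reached (an assumption)."""
    return minutes * rate_for_volume(minutes)


if __name__ == "__main__":
    for minutes in (6_000, 60_000, 300_000):
        print(minutes, "minutes ->", monthly_stt_cost(minutes), "USD")
```

Decimal is used instead of float so that very small per-minute rates multiply out without binary rounding noise.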

Plans and tiers

Deepgram organizes its pricing into four tiers, each with its own per-minute rate and feature access. The tiers differ mainly in usage volume, available models, and support level.

Plan	Starting Price (per minute)	Key Features & Limits	Best For
Free	$0	10,000 requests per month. Access to core models.	Prototyping, small projects, initial testing
Standard	$0.004	Base models, basic features. Volume discounts apply automatically.	Startups, growing applications, general transcription needs
Premium	Custom	Enhanced models, advanced features (e.g., custom models, more granular audio intelligence). Dedicated support.	High-volume applications, specialized use cases, applications requiring higher accuracy or specific features
Enterprise	Custom	All Deepgram features, custom SLAs, dedicated infrastructure, on-premise deployment options.	Large enterprises, highly regulated industries, mission-critical applications

Specific pricing for Premium and Enterprise tiers requires direct contact with Deepgram sales, as these plans are tailored to individual client needs, including factors like custom model training, dedicated resources, and specific service level agreements (SLAs).

Free tier and limits

Deepgram provides a free tier that lets developers experiment with its APIs without upfront cost. The tier includes 10,000 requests per month, which is sufficient for initial development, proof-of-concept projects, and testing the API's capabilities across different models and audio types.

Key aspects of the free tier:

Usage Allowance: The free tier covers 10,000 requests each month. Once that limit is exceeded, further usage is billed at the Standard tier rates, provided a payment method is on file.
Feature Access: The free tier generally provides access to Deepgram's core speech models and basic transcription features. Advanced features or specialized models might be limited or require a paid subscription.
Account Management: Users can monitor their usage through the Deepgram developer console to track consumption against free tier limits.

Developers should review the current free tier terms on the Deepgram pricing page, as specific allowances can be subject to change.
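
A small sketch of how an application might track its own request count against the monthly allowance. The 10,000-request figure is the one quoted in this article; the `FreeTierStatus` type and `free_tier_status` function are hypothetical helpers, not part of any Deepgram SDK, and the request count is assumed to come from the application's own logs or the developer console.

```python
from dataclasses import dataclass

# Monthly allowance as quoted in this article; check the pricing page
# for the current figure before relying on it.
FREE_REQUESTS_PER_MONTH = 10_000


@dataclass(frozen=True)
class FreeTierStatus:
    used: int
    remaining: int
    exceeded: bool


def free_tier_status(
    requests_used: int, allowance: int = FREE_REQUESTS_PER_MONTH
) -> FreeTierStatus:
    """Summarize how much of the monthly free allowance is left."""
    remaining = max(allowance - requests_used, 0)
    return FreeTierStatus(
        used=requests_used,
        remaining=remaining,
        exceeded=requests_used > allowance,
    )


if __name__ == "__main__":
    print(free_tier_status(2_500))
    print(free_tier_status(12_000))
```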

Real-world cost examples

To illustrate Deepgram's usage-based pricing, consider several common scenarios:

Scenario 1: Small-scale application with 100 hours of audio per month

Usage: 100 hours (6,000 minutes) of audio for basic Speech-to-Text.
Tier: Standard (assuming usage exceeds free tier).
Estimated Cost: 6,000 minutes * $0.004/minute = $24.00 per month.
Notes: This assumes no advanced features are used and applies the $0.004 starting rate with no volume discount.

Scenario 2: Medium-sized application with 1,000 hours of audio and diarization

Usage: 1,000 hours (60,000 minutes) of audio for Speech-to-Text with diarization.
Tier: Standard (with volume discount). Diarization might add a small per-minute surcharge.
Estimated Cost: Using illustrative figures, if the base rate drops to $0.0035/minute due to volume and diarization adds $0.0005/minute, the effective rate is $0.004/minute. 60,000 minutes * $0.004/minute = $240.00 per month.
Notes: Volume discounts are automatically applied. The exact surcharge for features like diarization is detailed on the Deepgram pricing page.
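
The arithmetic in Scenarios 1 and 2 can be reproduced with a short helper. This is a sketch only: `stt_monthly_cost` is a hypothetical function, and the $0.0035 discounted rate and $0.0005 diarization surcharge are the illustrative figures from the scenario, not published prices.

```python
from decimal import Decimal


def stt_monthly_cost(hours, base_rate, surcharges=()):
    """Monthly Speech-to-Text cost in USD.

    hours      -- audio processed per month
    base_rate  -- per-minute rate, passed as a string to keep it exact
    surcharges -- per-minute add-ons (e.g. diarization), also as strings
    """
    minutes = hours * 60
    effective_rate = Decimal(base_rate) + sum(
        (Decimal(s) for s in surcharges), Decimal("0")
    )
    return minutes * effective_rate


if __name__ == "__main__":
    # Scenario 1: 100 hours at the starting rate, no add-ons.
    print(stt_monthly_cost(100, "0.004"))
    # Scenario 2: 1,000 hours, assumed discounted rate plus assumed surcharge.
    print(stt_monthly_cost(1_000, "0.0035", ["0.0005"]))
```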

Scenario 3: Large-scale call center transcription with 5,000 hours of audio, custom models, and audio intelligence

Usage: 5,000 hours (300,000 minutes) of audio.
Tier: Likely Premium or Enterprise due to custom model requirements and advanced audio intelligence.
Estimated Cost: This scenario would involve custom pricing. If a blended rate of $0.003/minute is negotiated, the usage cost would be 300,000 minutes * $0.003/minute = $900.00 per month. Custom model training and any dedicated support would be charged on top of this figure.
Notes: Enterprise solutions often include more than just per-minute costs, such as setup fees for custom models or dedicated infrastructure.

Scenario 4: Text-to-Speech for a voice assistant

Usage: Generating 10 million characters of speech per month.
Tier: Standard Text-to-Speech.
Estimated Cost: If the rate is $0.000015 per character, 10,000,000 characters * $0.000015/character = $150.00 per month.
Notes: Text-to-Speech pricing is separate from Speech-to-Text and is based on character count.
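
A minimal sketch of the per-character calculation in Scenario 4. The $0.000015 rate is the illustrative figure used above, the function names are hypothetical, and counting characters with Python's `len` (Unicode code points, whitespace included) is an assumption; the provider's own metering rules may differ.

```python
from decimal import Decimal

# Illustrative per-character rate from Scenario 4, not a published price.
ASSUMED_RATE_PER_CHAR = Decimal("0.000015")


def tts_cost_for_characters(characters, rate_per_char=ASSUMED_RATE_PER_CHAR):
    """Cost in USD for a given number of synthesized characters."""
    return characters * rate_per_char


def tts_cost_for_texts(texts, rate_per_char=ASSUMED_RATE_PER_CHAR):
    """Count characters across a batch of strings and price them.

    Returns (total_characters, cost). Characters are counted with len(),
    which is an assumption about how usage is metered.
    """
    total = sum(len(text) for text in texts)
    return total, tts_cost_for_characters(total, rate_per_char)


if __name__ == "__main__":
    print(tts_cost_for_characters(10_000_000))
    print(tts_cost_for_texts(["Hello", "world!"]))
```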

How the pricing compares

When evaluating Deepgram's pricing, it is useful to compare it against other Speech-to-Text and audio intelligence providers. Competitors such as AssemblyAI, Rev.ai, and cloud providers like AWS Transcribe offer similar services with varying pricing structures. The AWS Transcribe pricing documentation provides a detailed breakdown of its usage-based costs, which are billed in per-second increments of audio processed.

Key comparison points often include:

Per-minute rates: Deepgram's starting rate of $0.004 per minute for standard Speech-to-Text is competitive. Some alternatives may offer lower initial rates but might have higher costs for advanced features or less aggressive volume discounts.
Feature pricing: The cost of advanced features like diarization, sentiment analysis, custom models, and real-time processing can vary significantly. Some providers bundle these, while others charge them as add-ons.
Free tier generosity: Deepgram's 10,000 requests per month free tier provides a substantial allowance for testing compared to some competitors who might offer fewer minutes or a shorter trial period. For example, AssemblyAI's pricing page details a free tier that typically includes a set number of free hours.
Volume discounts: Deepgram's automatic volume discounts are a factor for scaling applications. High-volume users should compare the discount tiers across providers.
Text-to-Speech pricing: When comparing Text-to-Speech services, the per-character cost, available voices, and language support are crucial.
Enterprise options: For large-scale deployments, custom pricing, SLAs, and dedicated support become important. The ability to deploy on-premise or in specific cloud environments can also influence the total cost of ownership.

Developers often conduct a total cost of ownership (TCO) analysis, factoring in not just the per-minute or per-character rates, but also the developer experience, ease of integration (via SDKs and documentation), and the accuracy and latency of the transcription and audio intelligence services for their specific use case.
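
The cost side of such a comparison can be sketched as follows. "Provider A" and "Provider B" and all of their rates are invented placeholders, not quotes from any real vendor; the point is only that a lower per-minute rate with a fixed monthly fee becomes cheaper past a break-even volume. Qualitative factors such as accuracy and latency are outside this sketch.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Quote:
    name: str
    base_rate: Decimal      # USD per minute
    addon_rate: Decimal     # USD per minute for required add-on features
    fixed_monthly: Decimal  # USD per month (support, amortized setup, etc.)

    def monthly_cost(self, minutes: int) -> Decimal:
        return minutes * (self.base_rate + self.addon_rate) + self.fixed_monthly


# Entirely hypothetical quotes for illustration.
QUOTES = [
    Quote("Provider A", Decimal("0.004"), Decimal("0.0005"), Decimal("0")),
    Quote("Provider B", Decimal("0.003"), Decimal("0.0005"), Decimal("50")),
]


def cheapest(quotes, minutes: int) -> Quote:
    """Return the quote with the lowest monthly cost at this volume."""
    return min(quotes, key=lambda q: q.monthly_cost(minutes))


if __name__ == "__main__":
    for minutes in (6_000, 60_000, 300_000):
        best = cheapest(QUOTES, minutes)
        print(minutes, "minutes ->", best.name, best.monthly_cost(minutes))
```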
