Deepgram added Aura-2 voice controls to its text-to-speech API, enabling fine-grained adjustments to speaking speed (0.7x–1.5x) and pronunciation overrides using IPA notation. Both REST and WebSocket …
Deepgram added the `InjectAgentMessage` feature to its Voice Agent, allowing servers to inject agent statements during live conversations. The feature supports two behaviors: `default` (waits for sile…
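As a rough illustration of the entry above, the sketch below serializes an `InjectAgentMessage` payload that a server could send over an already-open Voice Agent WebSocket. The `message` key and the helper name `build_inject_agent_message` are assumptions made for illustration. The behavior selector is left out because the summary does not give its field name.

```python
import json


def build_inject_agent_message(text: str) -> str:
    """Serialize a payload asking the agent to speak `text` (hypothetical helper)."""
    if not text.strip():
        raise ValueError("message text must not be empty")
    # "type" comes from the feature name in the entry above.
    # Assumption: the text to speak travels under a "message" key.
    return json.dumps({"type": "InjectAgentMessage", "message": text})


payload = build_inject_agent_message("Are you still there?")
print(payload)
```

The resulting string would be sent as a text frame on the agent socket; check the field names against the official message reference before relying on them.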
Deepgram’s Voice Agent now supports a broader range of LLM providers and models, including OpenAI’s latest GPT-5 series, Anthropic’s Claude 4 models, Google’s Gemini 3 series, Groq’s GPT OSS 20B, and …
Deepgram’s Voice Agent now supports NVIDIA’s Nemotron-3-Nano-30B-A3B model under the `nvidia` provider type, alongside existing OpenAI, Anthropic, Google, Groq, and AWS Bedrock options. The update als…
Deepgram’s Voice Agent API now supports multiple TTS providers beyond its native models, including managed Cartesia TTS and third-party options like OpenAI, ElevenLabs, Amazon Polly, and Cartesia. Use…
Deepgram launched Flux, a conversational ASR model with model-native turn detection for voice agents, and Nova-3, a high-accuracy general-purpose model with 54.2% lower WER in streaming and 47.4% in b…
Deepgram introduced a built-in MCP server in its `dg` CLI, enabling AI coding assistants like Claude Code, Cursor, and Windsurf to directly access Deepgram APIs for transcription, speech synthesis, te…
Deepgram released a multi-agent architecture for voice agents, replacing single-agent systems with a phased approach using specialized agents (Qualifier, Advisor, Closer) for focused tasks. The system…
Deepgram launched reusable agent configurations via API, allowing users to store and reference agent blocks by UUID instead of repeating full configurations in Settings messages. The feature supports …
Deepgram updated its CLI installation process with expanded support for Homebrew, pipx, and uv, alongside existing script-based and pip methods. Homebrew now auto-installs dependencies like ffmpeg, si…
Pricing updated for Deepgram:
- Custom: allowance changed from "$4K+ / year" to "For businesses with large volumes, data or deployment requirements, or support needs"
- New tier: Free ($0/mo — promo: $200…
Cartesia deprecated its speed and emotion controls feature for text-to-speech, previously available via API and playground. The feature was experimental and subject to breaking changes, with controls …
Deepgram introduced Flux Multilingual (flux-general-multi), a single real-time streaming model handling 10 languages with automatic detection and code-switching. A new `language_hint` parameter biases…
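A minimal sketch of how a client might select this model and pass hints when opening a stream. The model id and the `language_hint` parameter name come from the entry above. The host, the use of the `/v2/listen` path, and repeating `language_hint` once per language are assumptions, and `build_flux_multi_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: Flux streaming is served from this WebSocket endpoint.
FLUX_WS_URL = "wss://api.deepgram.com/v2/listen"


def build_flux_multi_url(language_hints=()):
    """Build a streaming URL for flux-general-multi with optional language hints."""
    params = [("model", "flux-general-multi")]
    # Assumption: one language_hint query parameter per hinted language.
    params += [("language_hint", code) for code in language_hints]
    return f"{FLUX_WS_URL}?{urlencode(params)}"


url = build_flux_multi_url(["es", "en"])
print(url)
```

With no hints the URL carries only the model id, which matches the automatic-detection mode the entry describes.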
Deepgram released a six-phase roadmap for implementing real-time transcription from proof-of-concept to production, covering requirements, benchmarking, integration, accuracy tuning, scaling, and comp…
A detailed comparison of Deepgram, Speechmatics, and Rev AI highlights architectural differences in concurrency limits, pricing models, latency, and compliance. Deepgram leads in managed real-time sca…
Deepgram introduced Flux Multilingual, a single conversational speech model supporting 10 languages (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, Dutch) with monoli…
Deepgram released Flux Multilingual, a general-availability conversational speech recognition model supporting 10 languages with monolingual-grade accuracy and real-time language switching. The model …
Deepgram introduced OpenAI's GPT-5.5 as a managed LLM in its Voice Agent API and added speed control for Cartesia TTS. The Llama Nemotron Super 49B model was removed due to poor performance.
Deepgram introduced a new Voice Agent API enabling real-time conversational AI with integrated LLM capabilities. The API supports multi-language audio input/output, configurable models (e.g., OpenAI's…
Deepgram’s CLI text-to-speech tool now supports additional output formats (WAV, MP3, FLAC), voice selection via `--voice` and `--list-voices`, language selection, and low-latency streaming via WebSock…
Deepgram expanded its CLI with text intelligence capabilities including sentiment analysis, topic detection, summarization, and intent recognition. Users can now analyze documents, URLs, or piped inpu…
Deepgram introduced shell completion scripts for its CLI, enabling tab-completion for commands like `dg listen` and options such as `--mic` or `-o json`. Users can generate and install completions…
Deepgram’s CLI now supports a plugin system allowing users to install, update, and uninstall Python-based plugins that add new commands. Plugins run in isolated environments, can access Deepgram confi…
Deepgram’s Nova-3 speech-to-text model now supports Gujarati (language codes `gu`, `gu-IN`). Users can access this by setting `model="nova-3"` and the relevant language code in API requests.
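The request described above could look roughly like the following sketch, which only constructs the HTTP request and does not send it. The `model` and `language` values come from the entry. The endpoint URL, the `Token` authorization scheme, and the `audio/wav` content type are assumptions, and `build_gujarati_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: pre-recorded transcription endpoint; not stated in the entry above.
LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_gujarati_request(api_key: str, audio: bytes, language: str = "gu") -> Request:
    """Prepare (but do not send) a Nova-3 transcription request for Gujarati audio."""
    if language not in {"gu", "gu-IN"}:
        raise ValueError("language must be 'gu' or 'gu-IN'")
    query = urlencode({"model": "nova-3", "language": language})
    return Request(
        f"{LISTEN_URL}?{query}",
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",  # assumed auth scheme
            "Content-Type": "audio/wav",          # assumed audio format
        },
        method="POST",
    )


req = build_gujarati_request("YOUR_API_KEY", b"", "gu-IN")
print(req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` with real audio bytes and a valid key would perform the call.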
Deepgram published a comparative analysis of its speech-to-text (STT) service against Google Cloud and Azure, focusing on real total cost of ownership (TCO), latency, compliance, and deployment flexib…
Deepgram published a practitioner-level guide on building and deploying Voice AI agents, focusing on real-time distributed systems that coordinate speech, reasoning, and audio under strict latency con…
Deepgram, Speechmatics, and AssemblyAI were compared across latency, pricing, language support, and deployment for speech-to-text (STT) workloads. Deepgram excels in real-time voice agent infrastructu…
Deepgram introduced Keyterm Prompting, a runtime vocabulary customization feature in its Nova-3 model, to address out-of-vocabulary (OOV) terms in large vocabulary speech recognition. Keyterm Promptin…
Deepgram released Self-Hosted version 260416, introducing Flux Multilingual for real-time multilingual conversational speech-to-text (STT) with code-switching support across 10 languages. The update r…
Deepgram introduced Flux Multilingual (`flux-general-multi`), a single model supporting 10 languages with near-monolingual accuracy when language hints are provided. The model auto-detects languages i…
Deepgram introduced a real-time `Configure` control message for its Flux streaming speech recognition system, enabling mid-stream adjustments to key recognition parameters without disconnecting. This …
Deepgram launched Flux, the first conversational speech recognition model designed specifically for voice agents, moving beyond traditional speech-to-text (STT) by understanding conversational flow an…
Deepgram introduced a new real-time conversational speech recognition API endpoint, `/v2/listen`, designed for natural voice conversations with contextual turn detection. The API enables developers to…
Deepgram highlights its Voice AI platform's adoption across diverse industries, including startups, NASA, and contact centers like Five9. The company emphasizes its ability to process millions of audi…
Deepgram has announced a new funding round and the acquisition of Of.One, marking a significant expansion in its voice AI capabilities. The company highlights its growing influence in the voice AI eco…
Deepgram introduces a unified Voice Agent API that consolidates speech-to-text, text-to-speech, and LLM orchestration into a single interface, reducing complexity, latency, and cost for businesses. Th…
Deepgram has launched a new Voice Agent API endpoint, enabling developers to build real-time conversational voice agents using a WebSocket-based interface. The API introduces a bidirectional message s…
NVIDIA updated the API reference documentation for its Llama 3.3 Nemotron Super 49B v1.5 model, introducing new parameters for text generation control. The changes include support for token generation…
NVIDIA updated the API reference documentation for its Nemotron-3-Nano-30B-A3B-Infer model, clarifying key parameters for text generation. The changes specify the required structure of conversation me…
Deepgram and Five9 have integrated Deepgram’s Nova-2 automatic speech recognition (ASR) model into Five9’s Intelligent Virtual Agent (IVA) Studio 7 to enhance contact center AI capabilities. The integ…
NASA has adopted Deepgram’s AI Speech Platform, Tailored Speech Models, and Audio Search to address critical challenges in space mission communications. The primary change is NASA’s shift from manual …
Cartesia has expanded its Sonic-3 text-to-speech (TTS) model with new controls for volume, speed, and emotion, enabling more expressive and customizable speech generation. Users can now adjust these p…
Deepgram’s advanced speech-to-text (STT) technology has been integrated into MaxContact’s cloud contact center platform to improve transcription accuracy, particularly for mono recorded calls. This in…
Speech recognition technology converts spoken language into text, with production-grade systems requiring more than benchmark accuracy to handle real-world conditions like noise, accents, and domain-s…
Deepgram introduced an agentic menu integration pipeline to normalize unstructured and chaotic menu data from restaurant POS systems, enabling AI-driven voice ordering. The tool ingests raw POS data, …
Speech recognition has evolved from a multi-stage traditional ASR pipeline to a single neural network model that maps audio directly to text, simplifying the stack but making architecture choice centr…
Deepgram has raised a Series C funding round and acquired Of.One, marking a significant milestone in its decade-long journey to dominate the voice AI ecosystem. The company now powers the majority of …
Deepgram and AWS announced a joint webinar demonstrating an AI-powered outbound dialing architecture for healthcare, specifically targeting patient outreach challenges like clinical trial recruitment,…
Deepgram has launched Flux Multilingual, a single speech-to-text model supporting 10 languages (English, Spanish, French, German, Hindi, Russian, Portuguese, Japanese, Italian, and Dutch) with convers…
Deepgram introduced reusable agent configurations and template variables via API, allowing users to store and reference agent setups by UUID instead of resending full configurations per session. The c…
Deepgram introduced reusable agent configurations via its API, allowing users to store and manage agent setups and template variables by UUID instead of resending full configurations per WebSocket ses…
Deepgram has added NVIDIA as a supported LLM provider for its Voice Agent API, introducing two new models—`llama-nemotron-super-49B` and `nemotron-3-nano-30B-A3B`—available in the Standard pricing tie…
Deepgram’s April 2, 2026 Self-Hosted release (260402) introduces two key changes: a fix to the Engine certificate endpoint path to align with other container images and the addition of a canonical_nam…
Deepgram’s Voice Agent API has added an optional `thought_signature` field to function call messages, specifically for Google’s Gemini 3.0 and 3.1 models, to address degraded function calling performa…