Exclusive: Deepgram launches voice agent API that brings AI conversations to life

Deepgram Inc., the developer of a speech recognition engine that provides its service via application programming interfaces, today announced a powerful addition to its platform that enables natural-sounding conversations between humans and artificial intelligence agents at large scale in real time.

Deepgram’s voice agent systems use speech recognition and voice synthesis AI models to deliver human-like responsiveness. In this release, the company is offering a system that packages all the pieces together under a single API.

All users have to do is set up a prompt that tells the system what they want it to do, and the system manages the rest. In the past, developers using Deepgram would have had to connect multiple parts of the system themselves, such as hooking in a large language model provider, the company’s speech-to-text recognition model and its speech synthesis model.
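To make the contrast concrete, here is a minimal sketch of the three-stage loop that developers previously had to wire up by hand: speech recognition, then a language model, then speech synthesis. Every name in it (`transcribe`, `generate_reply`, `synthesize`, `VoiceAgent`) is hypothetical, and each stage is a stand-in stub. It is not Deepgram’s actual API, only an illustration of the plumbing that a single API would hide behind one prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the three stages. A real system would call a
# speech recognition model, an LLM provider and a speech synthesis model.
def transcribe(audio: bytes) -> str:
    """Stub speech-to-text: treats the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")

def generate_reply(prompt: str, history: List[Tuple[str, str]], user_text: str) -> str:
    """Stub LLM: echoes the user's words instead of calling a model."""
    return f"You said: {user_text}"

def synthesize(text: str) -> bytes:
    """Stub text-to-speech: returns the text encoded as bytes."""
    return text.encode("utf-8")

@dataclass
class VoiceAgent:
    """Bundles the three stages behind one call, configured by a single prompt."""
    prompt: str
    stt: Callable[[bytes], str] = transcribe
    llm: Callable[[str, List[Tuple[str, str]], str], str] = generate_reply
    tts: Callable[[str], bytes] = synthesize
    history: List[Tuple[str, str]] = field(default_factory=list)

    def respond(self, audio_in: bytes) -> bytes:
        user_text = self.stt(audio_in)                          # speech -> text
        reply = self.llm(self.prompt, self.history, user_text)  # text -> reply
        self.history.append((user_text, reply))                 # keep conversation context
        return self.tts(reply)                                  # reply -> speech

agent = VoiceAgent(prompt="You are a helpful call center assistant.")
audio_out = agent.respond(b"What are your opening hours?")
print(audio_out.decode("utf-8"))
```

Because each stage is passed in as a plain callable, swapping the language model or the voice in this sketch means replacing one argument rather than rewriting the loop.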

“We have a big shift that’s happening in the world right now,” Scott Stephenson, co-founder and chief executive of Deepgram, said in an interview with SiliconANGLE. “AI went mainstream over the last two years and voice AI has gone mainstream over the last two to six months. There’s a fundamental shift around the nature of how work is going to be done.”

Deepgram’s system allows users to listen to AI-synthesized speech and reply just like they’re talking to another human being. It’s also highly responsive on a conversational level, waits for appropriate times to break in and doesn’t interrupt the train of thought. It’s interruptible just like another person and doesn’t lose track of the conversation, allowing for smooth interactions.

Stephenson said that voice interactivity fits anywhere a device has a microphone and a speaker, such as websites, phones, mobile devices, AI pendants and even the drive-through. One place where AI agents are already in use across the industry is call centers, where they can pick up the phone quickly, so customers face little to no wait time to have questions answered or easy situations resolved.

“If you can service a customer’s need without having them to talk to a live agent that can save costs and that leads to a very satisfied customer,” said Stephenson. “If they can call in and they’re instantly connected with an AI agent and that agent can immediately ask questions, get information and get the conversation going, essentially filling out CRM information so that when a live agent is available now, they’re contextualized. Now they can complete their job in one minute.”

Developers can choose any LLM they want to connect the API with, including models from OpenAI, Anthropic PBC and Meta Platforms Inc. That makes it easy for them to choose what they want to run the underlying AI experience. Deepgram’s voice synthesis options include 12 different voices for customers to choose from.

“As we watch our children use their smartphones, it’s obvious that voice-to-voice will become a standard method of human and machine interactions,” said Kevin Petrie, vice president of research at BARC US. “Deepgram’s Voice Agent API addresses this market opportunity and makes customer service, already a top use case for gen AI, easier by converting text conversations to speech. Deepgram also broadens the market opportunity by integrating with a wide array of large language models.”

This year saw the launch of several LLMs that can deliver natural voice conversation capabilities. The biggest examples include OpenAI’s GPT-4o, Gemini Live from Google LLC and Tenyx Voice from Tenyx Inc.

Stephenson said that Deepgram doesn’t necessarily need to be voice-to-voice. It can also integrate easily with text-to-voice, allowing people to maintain privacy, for example when they are wearing a headset on a crowded train and just want to type on their phone and listen to a reply. Not everyone will want to have one-sided conversations with their phones, he said. On the other hand, some people might dive into long-winded talks with AI models.

“The initial phase will be adding the voice option to text boxes,” Stephenson said. “Once people realize you can have a human-like interruptible talking experience with a voice agent, we think that people will use it a lot.”

Source: siliconangle.com
