Baidu Releases End-to-End Speech-Language Large Model, Claims Cost Reduction of Up to 90%

March 31, 2025 - At today's Baidu AI DAY, Baidu released what it describes as the industry's first end-to-end speech-language large model based on Cross-Attention. The company announced ultra-low latency and ultra-low cost: in voice Q&A scenarios over telephone voice channels, call costs are about 50%-90% lower than the industry average.


On the same day, Wen Xiaoyan announced a brand refresh and became the first product to integrate the model. It also introduced upgraded features such as multi-model fusion scheduling and picture Q&A. With the model integrated, Wen Xiaoyan supports more human-like voice chat as well as regional dialects including those of Chongqing, Guangxi, Henan, Guangdong, and Shandong. According to reports, the speech model has very low training and usage costs and very fast inference response. In voice interaction, it can cut user waiting time from the 3-5 seconds common in the industry to about 1 second.

The updated Wen Xiaoyan also supports "multi-model fusion scheduling". It integrates Baidu's self-developed models such as Wenxin X1 and Wenxin 4.5, and connects to high-quality third-party models such as DeepSeek-R1, enabling multiple models to work together. Users can choose "automatic mode" to call the optimal combination of models with one click, or select a single model for a specific task as needed, improving response speed and task-handling capability.

1AI learned from the event that Wen Xiaoyan has also enhanced its photo Q&A feature. Users take or upload a picture and ask a question by text or voice to get an in-depth analysis directly. For example, photographing a math problem can produce a real-time solution and a video explanation, and uploading several product images lets the app compare specifications and prices to support shopping decisions.

In addition, Wen Xiaoyan added a picture trivia feature. Users can preset personalized perspectives such as "history scholar" or "science and technology expert" to get multi-dimensional interpretations of the same picture. For example, when a user asks about "the cat-by-the-window mystery: what is the science behind why cats love sitting by windows?", Wen Xiaoyan can answer from the perspectives of hunting instinct, energy acquisition, territorial awareness, and so on.

Jia Lei, chief architect of Baidu Speech, said the model is the first end-to-end speech-language large model in the industry based on a new Cross-Attention mechanism. "In voice scenarios that meet certain interaction metrics, the cost of large model calls is 50%-90% lower than the industry average. In addition, the inference response is extremely fast, compressing the waiting time in voice interaction to about 1 second and greatly improving the fluency of interaction. At the same time, with the support of the large model, streaming, word-by-word, LLM-driven multi-emotional speech synthesis is realized, with full, realistic, and human-like emotion, and the listening experience in interaction is greatly improved."

