-
ModelBest Releases 0.5B-Parameter Voice Model with Human-Like Sound
September 19: Yesterday afternoon, ModelBest announced a new addition to its "Small Steel Cannon" series: VoxCPM, a base model for speech generation with 0.5B parameters, launched in collaboration with the Human-Computer Speech Interaction Lab at Tsinghua University's Shenzhen International Graduate School. The model reaches industry SOTA levels in speech naturalness, speaker similarity, and prosody. Inference performance: RTF ≈ 0.17, with streaming output supported. VoxCPM in Seed-TTS-EV..
-
Microsoft Launches Its First Self-Developed AI Models: MAI-Voice-1 Generates Audio in Seconds, MAI-1-preview Targets Copilot Text Scenarios
August 29: On Thursday, Microsoft's artificial intelligence division officially launched its first two in-house AI models: the MAI-Voice-1 speech model and the MAI-1-preview general-purpose model. According to Microsoft, MAI-Voice-1 needs only a single GPU to generate a minute of audio in under a second, while MAI-1-preview "gives users a glimpse of Copilot's future functionality". Currently, Microsoft has made the MAI-Voice-...- 1.2k
-
OpenAI Releases New Generation of Speech Models to Help AI Agents Speak More Naturally
March 21: In a blog post published yesterday (March 20), OpenAI announced new speech-to-text and text-to-speech models intended to improve voice processing, help developers build more accurate and customizable voice interaction systems, and further the commercial adoption of AI voice technology. For speech-to-text, OpenAI has launched gpt-4o-transcribe and gpt-4o-mini-transcribe...- 2.1k
-
MiniMax Hailuo Speech AI Product Launched: Supports 17 Languages and Up to 10,000 Characters
January 21: MiniMax announced yesterday that it has released the upgraded T2A-01 series of speech models and launched its Hailuo AI product globally. According to the announcement, Hailuo AI uses the T2A-01 series to turn input text into natural, fluent, highly human-like speech, and accepts input of up to 10,000 characters. Users can also freely configure the emotion, speech rate, and pitch of the output voice, and even adjust its timbre, to meet the fine-grained needs of complex scenarios. 1AI notes that Hailuo Voice supports Chinese,...- 4.3k
-
Zhipu Qingyan Launches Emotional Speech Model GLM-4-Voice: Understanding Emotions, Emotional Expression and Empathy
Zhipu announced the launch of GLM-4-Voice, an end-to-end emotional speech model. According to the company, GLM-4-Voice can understand emotions, express and resonate with them, adjust its own speech rate, support multiple languages and dialects, respond with lower latency, and be interrupted at any time. Users can try it in the "Zhipu Qingyan" app starting now. According to the announcement, GLM-4-Voice has the following features: Emotional expression and emotional resonance: the voice conveys different emotions and subtle variations, such as happy, sad, angry, and scared. Adjustable speech rate: within the same round of conversation, you can ask it to speak faster or slower...- 9.4k
-
Alibaba Releases New Voice Model Qwen2-Audio, Surpassing OpenAI Whisper
Alibaba recently launched Qwen2-Audio, a new open-source voice model that builds on its Qwen-Audio. The model performs well in speech recognition, translation, and audio analysis, and improves significantly on its predecessor in both features and performance. Qwen2-Audio comes in a base version and an instruction-tuned version. Users can put questions to the model by voice and have it recognize and analyze audio content. For example, given a recording of a woman speaking a passage, Qwen2-Audio can estimate her age or analyze her emotions; if a noisy voice is input…- 11.8k
-
Claiming to Be Better Than XTTS! VoiceCraft: A Voice Model That Supports Voice Cloning and Editing the Text of Original Audio
A voice model called VoiceCraft has recently attracted widespread attention in the industry. According to its official announcement, the model outperforms XTTS, which would mark a notable advance in AI audio processing. Project address: https://github.com/jasonppy/VoiceCraft The biggest highlight of VoiceCraft is its strong voice cloning ability. Users only need to provide a sample of original audio, and VoiceCraft uses deep learning to generate new audio that closely resembles it.- 4.4k






