Microsoft unlocks new AI voiceover skills: generate up to 90 seconds of multi-character narration with more lifelike voices

August 31, 2025 - Tech media outlet Windows Latest reported in a blog post published on August 29 that Microsoft has introduced a new AI speech generation tool, Copilot Audio Expressions. Its Emotive and Story modes can be used to generate more emotionally expressive English speech.

Note: Copilot Audio Expressions is an AI voice generation tool designed to make its output sound closer to a real person, adding creative touches as needed. Users can try it without registering and can download the audio in MP3 format for playback on any device.

The tool currently offers Emotive and Story modes.

To test Emotive mode, the outlet chose the "Oak" voice and the "narration" style, then entered a script simulating a train station.

The generated audio not only reads the text aloud, but also automatically adds details and adjusts the wording to make the expression more vivid. The maximum length of a single audio clip is 59 seconds, and more than ten combinations of voices and styles are supported.

In Story mode, the system selects the voice and style automatically, and the user only needs to provide a topic prompt.

For example, if you type in "tell a story about a cat stalking for food in the dark", the AI generates a 90-second multi-character narrative: the narrator speaks with an American accent, the cat's dialogue is in a British accent, and the exchanges are interwoven to produce natural, smooth dialogue.

The test results show that Story mode excels at plot construction, character differentiation and voice integration. The output sounds less like a monotonous machine reading and more like a collaboration between voice actors, which makes the tool suitable not only for simple recitation but also for creative productions with multiple characters.

The tool currently supports English only, so users of Chinese and other languages cannot directly generate audio in their native language. Microsoft has not yet said whether it will add multi-language support.
