Xiaomi releases and fully open-sources MiDashengLM-7B, its large model for sound understanding, setting new best scores on 22 public evaluation sets

August 4 news: Xiaomi's self-developed large model for sound understanding, MiDashengLM-7B, was officially released today and fully open-sourced.

According to Xiaomi's official introduction, MiDashengLM-7B achieves a double breakthrough in speed and accuracy: its time to first token for a single sample is only 1/4 that of comparable models, and it supports more than 20 times the concurrency under the same GPU memory. It also sets new state-of-the-art (SOTA) results for multimodal large models on 22 public evaluation sets.
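
To make the two efficiency ratios concrete, the following minimal Python sketch applies them to a hypothetical baseline. The baseline figures (800 ms to first token, 8 concurrent requests) and the `project` helper are invented for illustration and do not come from Xiaomi; only the 1/4 and 20x ratios are taken from the announcement.

```python
# Ratios quoted in the announcement.
TTFT_RATIO = 1 / 4        # first-token latency relative to comparable models
CONCURRENCY_FACTOR = 20   # "more than 20 times" concurrency at equal GPU memory


def project(baseline_ttft_ms: float, baseline_concurrency: int) -> tuple[float, int]:
    """Apply the quoted ratios to a baseline.

    Returns (projected first-token latency in ms, lower bound on concurrency).
    """
    return baseline_ttft_ms * TTFT_RATIO, baseline_concurrency * CONCURRENCY_FACTOR


# Hypothetical baseline values, NOT measurements of any real model.
ttft_ms, concurrency = project(800.0, 8)
print(f"first-token latency: {ttft_ms:.0f} ms, concurrency: at least {concurrency}")
```

Under these assumed numbers, a comparable model that needs 800 ms to emit its first token and serves 8 requests at once would correspond to about 200 ms and at least 160 concurrent requests for MiDashengLM-7B on the same GPU memory.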

Built on Xiaomi Dasheng as the audio encoder and the Qwen2.5-Omni-7B Thinker as the autoregressive decoder, MiDashengLM-7B achieves unified understanding of speech, environmental sound, and music through an innovative general audio captioning training strategy.

In 2024, Xiaomi released the Xiaomi Dasheng sound foundation model, which was the first internationally to exceed 50 mAP on AudioSet. It established a leading position across the three HEAR Benchmark domains of environmental sound, speech, and music, and has held it to this day.

Xiaomi Dasheng has more than 30 deployed applications in Xiaomi's smart home and in-car scenarios. It is the core algorithm behind capabilities such as the industry's first defense against wake-up from outside the vehicle, round-the-clock monitoring of abnormal sounds on phones and speakers, IoT control linked to ambient sounds such as a finger snap, and the scratch detection in the enhanced sentry mode of the Xiaomi YU7.

MiDashengLM's training data consists entirely of publicly available data, and the model is released under the permissive Apache License 2.0, which supports both academic and commercial use.

Xiaomi says that, unlike models such as Qwen2.5-Omni that do not disclose details of their training data, MiDashengLM fully discloses the detailed mixing ratios of its 77 data sources, and the technical report describes the full pipeline from audio encoder pre-training to instruction fine-tuning.

As a key technology in Xiaomi's "human-car-home ecosystem" strategy, MiDashengLM unifies cross-domain understanding of speech, environmental sound, and music. It can therefore not only understand what is happening around the user but also analyze the implicit meaning behind those events, improving how well its understanding of user scenarios generalizes.

Models based on MiDashengLM interact with users in natural language and provide more human-like communication and feedback. For example, they can give pronunciation feedback and draw up targeted improvement plans when users practice singing or a foreign language, or answer questions about ambient sounds in real time while users are driving.

MiDashengLM, with the Xiaomi Dasheng audio encoder as its core component, is an important upgrade to the Xiaomi Dasheng series of models. Building on the current version, Xiaomi has begun further improving the model's computational efficiency, aiming for offline deployment on end devices, and is rounding out features such as sound editing based on users' natural-language prompts.

MiDashengLM open-source links:

GitHub homepage: https://github.com/xiaomi-research/dasheng-lm
Technical report: https://github.com/xiaomi-research/dasheng-lm/tree/main/technical_report
Model weights (Hugging Face): https://huggingface.co/mispeech/midashenglm-7b
Model weights (ModelScope community): https://modelscope.cn/models/midasheng/midashenglm-7b
Web demo: https://xiaomi-research.github.io/dasheng-lm
Interactive demo: https://huggingface.co/spaces/mispeech/MiDashengLM
