{"id":36901,"date":"2025-06-07T12:42:57","date_gmt":"2025-06-07T04:42:57","guid":{"rendered":"https:\/\/www.1ai.net\/?p=36901"},"modified":"2025-06-07T12:42:57","modified_gmt":"2025-06-07T04:42:57","slug":"%e9%9d%a2%e5%a3%81%e6%99%ba%e8%83%bd%e5%8f%91%e5%b8%83%e7%ab%af%e4%be%a7%e5%a4%a7%e6%a8%a1%e5%9e%8b%e5%89%8d%e8%bf%9b%e5%9b%9bminicpm-4-0%ef%bc%8c%e5%8f%b7%e7%a7%b0%e6%80%a7%e8%83%bd","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/36901.html","title":{"rendered":"Wall-Facing Intelligence Releases On-Device Large Model \"Forward Four\" MiniCPM 4.0, Claimed to Be the King of Performance for Its Size"},"content":{"rendered":"<p>June 7 news: <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%9d%a2%e5%a3%81%e6%99%ba%e8%83%bd\" title=\"View articles tagged with Wall-Facing Intelligence\" target=\"_blank\" >Wall-Facing Intelligence<\/a> released the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e7%ab%af%e4%be%a7%e5%a4%a7%e6%a8%a1%e5%9e%8b\" title=\"View articles tagged with on-device large model\" target=\"_blank\" >on-device large model<\/a> MiniCPM 4.0 on the evening of June 6. The company says the new model achieves up to a 220x speedup in extreme scenarios and a 5x speedup in regular scenarios through its self-developed CPM.cu inference framework, and supports deployment in frameworks such as vLLM, SGLang, and LlamaFactory.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-36902\" title=\"97a7349cj00sxgyei006od000u000gwp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/06\/97a7349cj00sxgyei006od000u000gwp.jpg\" alt=\"97a7349cj00sxgyei006od000u000gwp\" width=\"1080\" height=\"608\" \/><\/p>\n<p>This release includes the 8B Lightning Sparse Edition, which uses an innovative sparse architecture for high efficiency, and a 0.5B model described as \"the lightest and most agile small steel cannon\".<\/p>\n<p>According to the official introduction, the MiniCPM 4.0 series comes in <strong>two parameter scales, 8B and 0.5B<\/strong>. To address the difficulty of a single architecture serving both long- and short-text scenarios, MiniCPM 4.0-8B adopts an \"efficient dual-mode gear-shift\" mechanism that can <strong>automatically switch attention modes based on task characteristics<\/strong>: it enables sparse attention to reduce computational complexity on long-text and deep-reasoning tasks, and switches to dense attention to preserve accuracy on short texts, responding efficiently as workloads alternate between long and short text.<\/p>\n<p>According to 1AI, MiniCPM 4.0 supports deployment in <strong>vLLM, SGLang, LlamaFactory, and XTuner<\/strong>, and ships with the self-developed CPM.cu high-speed on-device inference framework. Through innovations in speculative sampling, model compression and quantization, and the on-device deployment framework, it delivers a 90% reduction in model size together with speed gains; officially, on-device inference is claimed to be silky smooth from the start.<\/p>","protected":false},"excerpt":{"rendered":"<p>June 7 news: Wall-Facing Intelligence released the on-device large model MiniCPM 4.0 on the evening of June 6. The company says the new model achieves up to a 220x speedup in extreme scenarios and a 5x speedup in regular scenarios through its self-developed CPM.cu inference framework, and supports deployment in frameworks such as vLLM, SGLang, and LlamaFactory. This release includes the 8B Lightning Sparse Edition, which uses an innovative sparse architecture, and a 0.5B model described as \"the lightest and most agile small steel cannon\". 
According to the official introduction, the MiniCPM 4.0 series comes in 8B and 0.5B parameter scales and is designed to address the difficulty of a single architecture serving both long- and short-text scenarios.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[6850,2184],"collection":[],"class_list":["post-36901","post","type-post","status-publish","format-standard","hentry","category-news","tag-6850","tag-2184"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/36901","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=36901"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/36901\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=36901"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=36901"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=36901"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=36901"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}