{"id":26942,"date":"2025-01-16T19:23:10","date_gmt":"2025-01-16T11:23:10","guid":{"rendered":"https:\/\/www.1ai.net\/?p=26942"},"modified":"2025-01-16T19:23:10","modified_gmt":"2025-01-16T11:23:10","slug":"%e9%9d%a2%e5%a3%81%e6%99%ba%e8%83%bd%e5%8f%91%e5%b8%83-minicpm-o-2-6-%e5%85%a8%e6%a8%a1%e6%80%81%e6%a8%a1%e5%9e%8b%ef%bc%8c%e5%8f%b7%e7%a7%b0%e7%ab%af%e4%be%a7-gpt-4o","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/26942.html","title":{"rendered":"ModelBest Releases MiniCPM-o 2.6 Full-Modal Model, Billed as an \"End-Side GPT-4o\""},"content":{"rendered":"<p>January 16 news: <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%9d%a2%e5%a3%81%e6%99%ba%e8%83%bd\" title=\"[View articles tagged with ModelBest]\" target=\"_blank\" >ModelBest<\/a> announced today the launch of MiniCPM-o 2.6, an end-side <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%85%a8%e6%a8%a1%e6%80%81%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with full-modal model]\" target=\"_blank\" >full-modal model<\/a> with 8B parameters, claiming performance comparable to GPT-4o and Claude 3.5 Sonnet.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-26943\" title=\"02e36597j00sq6i9n00jid000u000sdp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/01\/02e36597j00sq6i9n00jid000u000sdp.jpg\" alt=\"02e36597j00sq6i9n00jid000u000sdp\" width=\"1080\" height=\"1021\" \/><\/p>\n<p>The model adopts an end-to-end multimodal architecture that can simultaneously process multiple types of data, including text, images, audio, and video, and generate high-quality text and speech output. 
According to the official description, it has 8B total parameters, and its visual, speech, and multimodal streaming capabilities <strong>reach the level of GPT-4o-202405<\/strong>, making it one of the open-source models with the richest modal support and strongest overall performance.<\/p>\n<p>MiniCPM-o 2.6 supports <strong>bilingual voice dialog with configurable voices<\/strong>, and also offers advanced capabilities such as emotion\/speech-rate\/style control, end-to-end voice cloning, role-playing, and more.<\/p>\n<p>According to the official introduction, MiniCPM-o 2.6 is also <strong>the first multimodal large model to support real-time multimodal streaming interaction on end-side devices such as the iPad<\/strong>. With an average score of 70.2 on the OpenCompass leaderboard (averaging 8 mainstream multimodal benchmarks), it outperforms mainstream commercial closed-source multimodal large models such as GPT-4o-202405, Gemini 1.5 Pro, and Claude 3.5 Sonnet in single-image understanding at the 8B scale.<\/p>\n<p>1AI attaches the open-source addresses:<\/p>\n<ul>\n<li>GitHub: https:\/\/github.com\/OpenBMB\/MiniCPM-o<\/li>\n<li>Hugging Face: https:\/\/huggingface.co\/openbmb\/MiniCPM-o-2_6<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>January 16 news: ModelBest announced today the launch of MiniCPM-o 2.6, an end-side full-modal model with 8B parameters, claiming performance comparable to GPT-4o and Claude 3.5 Sonnet. It adopts an end-to-end multimodal architecture that can simultaneously process multiple types of data, such as text, images, audio, and video, generating high-quality text and speech output. 
According to the official description, it has 8B total parameters, and its visual, speech, and multimodal streaming capabilities reach the level of GPT-4o-202405, making it one of the open-source models with the richest modal support and strongest overall performance. MiniCPM-o 2.6 supports bilingual voice dialog with configurable voices, and also offers emotion\/speech-rate\/style<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[5569,2184],"collection":[],"class_list":["post-26942","post","type-post","status-publish","format-standard","hentry","category-news","tag-5569","tag-2184"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/26942","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=26942"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/26942\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=26942"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=26942"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=26942"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=26942"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}