{"id":14906,"date":"2024-07-06T08:57:15","date_gmt":"2024-07-06T00:57:15","guid":{"rendered":"https:\/\/www.1ai.net\/?p=14906"},"modified":"2024-07-06T08:57:15","modified_gmt":"2024-07-06T00:57:15","slug":"%e4%ba%a4%e4%ba%92%e6%95%88%e6%9e%9c%e5%af%b9%e6%a0%87-gpt-4o%ef%bc%8c%e5%95%86%e6%b1%a4%e5%8f%91%e5%b8%83%e5%9b%bd%e5%86%85%e9%a6%96%e4%b8%aa%e6%89%80%e8%a7%81%e5%8d%b3%e6%89%80%e5%be%97%e6%a8%a1","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/14906.html","title":{"rendered":"SenseTime releases China&#039;s first WYSIWYG model &quot;SenseNova 5o&quot; with interaction benchmarked against GPT-4o"},"content":{"rendered":"<p data-vmark=\"e69a\"><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%95%86%e6%b1%a4\" title=\"[View articles tagged with [SenseTime]]\" target=\"_blank\" >SenseTime<\/a> has released the \u201c<strong><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%97%a5%e6%97%a5%e6%96%b0\" title=\"[View articles tagged with [SenseNova]]\" target=\"_blank\" >SenseNova<\/a> 5.5<\/strong>\u201d <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%a7%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [large models]]\" target=\"_blank\" >large model<\/a> system and, alongside it, China's first WYSIWYG model, \u201c<strong>SenseNova 5o<\/strong>\u201d, whose interaction is benchmarked against <a href=\"https:\/\/www.1ai.net\/en\/tag\/gpt-4o\" title=\"[View articles tagged with [GPT-4o]]\" target=\"_blank\" >GPT-4o<\/a>.<\/p>\n<p data-vmark=\"5720\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-14907\" title=\"2351b0b1-536b-47c3-aca4-4a285caff95f\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/07\/2351b0b1-536b-47c3-aca4-4a285caff95f.png\" alt=\"2351b0b1-536b-47c3-aca4-4a285caff95f\" width=\"1257\" height=\"407\" \/><\/p>\n<p data-vmark=\"2f38\">By integrating cross-modal information across sound, text, images, and video, &quot;SenseNova 5o&quot; introduces a new AI interaction mode: real-time streaming multimodal 
interaction.<\/p>\n<p data-vmark=\"4ed5\">According to the company, &quot;SenseNova 5o&quot; can listen, see, and find topics of conversation, much like &quot;chatting with a real person&quot;. This interaction mode suits applications such as real-time conversation and speech recognition: the same model can naturally handle multiple tasks and adaptively adjust its behavior and output to different contexts.<\/p>\n<p data-vmark=\"965f\"><strong>SenseNova 5.5 is the first officially released streaming native multimodal interaction model in China<\/strong>. It was trained on more than\u00a0<strong>10TB of tokens<\/strong>\u00a0of high-quality data, including a large amount of high-quality synthetic data, to build high-order chains of thought. The model adopts a hybrid device-cloud collaborative architecture with\u00a0<strong>600 billion parameters<\/strong>, making full use of cloud-edge-device collaboration to reach an inference speed of\u00a0<strong>109.5 words\/second<\/strong>.<\/p>\n<p data-vmark=\"0d88\">SenseTime also released its first large model for &quot;controllable&quot; character video generation, <strong>Vimi<\/strong>. From a single photo in any style, Vimi can generate a character video that follows a target motion, and it supports multiple driving methods: it can be driven by existing character videos, animation, audio, and text.<\/p>","protected":false},"excerpt":{"rendered":"<p>SenseTime released the \"SenseNova 5.5\" large model system, along with China's first WYSIWYG model \"SenseNova 5o\", whose interaction is benchmarked against GPT-4o. 
By integrating cross-modal information across sound, text, image, and video, \"SenseNova 5o\" brings a new AI interaction mode: real-time streaming multimodal interaction. The model can listen, see, and find topics of conversation, much like chatting with a real person; this interaction mode suits real-time dialog and speech recognition applications, and the same model can naturally handle multiple tasks and adaptively adjust its behavior and output to different contexts. SenseNova 5.5 is the first streaming native multimodal interaction model officially released in China.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[2582,2391,216,3358],"collection":[],"class_list":["post-14906","post","type-post","status-publish","format-standard","hentry","category-news","tag-gpt-4o","tag-2391","tag-216","tag-3358"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/14906","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=14906"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/14906\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/
wp-json\/wp\/v2\/media?parent=14906"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=14906"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=14906"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=14906"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}