{"id":39088,"date":"2025-07-09T11:39:20","date_gmt":"2025-07-09T03:39:20","guid":{"rendered":"https:\/\/www.1ai.net\/?p=39088"},"modified":"2025-07-09T11:39:20","modified_gmt":"2025-07-09T03:39:20","slug":"%e6%98%86%e4%bb%91%e4%b8%87%e7%bb%b4%e5%8f%91%e5%b8%83%e5%b9%b6%e5%bc%80%e6%ba%90-skywork-r1v-3-0%ef%bc%8c%e5%a4%9a%e6%a8%a1%e6%80%81%e6%8e%a8%e7%90%86%e8%83%bd%e5%8a%9b%e9%80%bc%e8%bf%91%e4%ba%ba","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/39088.html","title":{"rendered":"Kunlun Wanwei Releases and Open-Sources Skywork-R1V 3.0, With Multimodal Reasoning Approaching Human Expert Level"},"content":{"rendered":"<p>July 9 news: <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%98%86%e4%bb%91%e4%b8%87%e7%bb%b4\" title=\"[View articles tagged with [Kunlun Wanwei]]\" target=\"_blank\" >Kunlun Wanwei<\/a> has just announced the release of the latest Skywork-R1V 3.0, which is also <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >open source<\/a>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-39089\" title=\"63484cb5j00sz44sb003hd000u000idp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/07\/63484cb5j00sz44sb003hd000u000idp.jpg\" alt=\"63484cb5j00sz44sb003hd000u000idp\" width=\"1080\" height=\"661\" \/><\/p>\n<p>According to Kunlun Wanwei, Skywork-R1V 3.0 uses a reinforcement learning strategy in the post-training phase to draw out the model's cross-modal reasoning ability, achieving a leap in both complex logical modeling and cross-disciplinary generalization.<\/p>\n<p>Skywork-R1V 3.0 uses data distilled from the previous-generation reasoning model, Skywork-R1V 2.0, for a \"cold start\": a high-quality multimodal reasoning training set is constructed through rejection sampling and used to teach the open-source large vision model InternVL-38B (38B parameters) the basic format 
and methods of multimodal reasoning.<\/p>\n<p>Subsequently, the reinforcement learning algorithm GRPO (Group Relative Policy Optimization) was introduced to further unlock the model's reasoning potential, successfully transferring reasoning capability between the image and text modalities and significantly improving its understanding and analysis in cross-modal, multi-disciplinary scenarios.<\/p>\n<p>According to the announcement, <a href=\"https:\/\/www.1ai.net\/en\/tag\/skywork-r1v-3-0\" title=\"[View articles tagged with [Skywork R1V 3.0]]\" target=\"_blank\" >Skywork R1V 3.0<\/a> was trained efficiently with only about 12,000 supervised fine-tuning samples and 13,000 reinforcement learning samples, demonstrating the advantage of \"small data unlocking large capability\".<\/p>\n<p>In terms of performance, the model scores 76.0 on the comprehensive multimodal benchmark MMMU, the highest among open-source models, surpassing closed-source models such as Claude-3.7-Sonnet (75.0) and GPT-4.5 (74.4) and approaching the level of junior human experts (76.2).<\/p>\n<p>Kunlun Wanwei says that R1V 3.0's performance on high school math is close to that of several top closed-source models and is the best among open-source multimodal reasoning models, demonstrating its real-world problem-solving ability and stable cross-scenario generalization.<\/p>\n<p>On the more demanding visual reasoning benchmark\u00a0<strong>EMMA-Mini (CoT)<\/strong>, it scores an open-source-leading\u00a0<strong>40.3<\/strong>, outperforming larger models such as Qwen2.5-VL-72B-Instruct and InternVL3-78B and narrowing the gap with the closed-source model Claude-3.7-Sonnet.<\/p>\n<p>In covering primary and 
secondary school knowledge points on\u00a0<strong>MMK12<\/strong>, R1V 3.0 scores\u00a0<strong>78.5<\/strong>, again leading the open-source camp and surpassing open-source models such as Qwen2.5-VL-72B-Instruct and InternVL3-78B as well as closed-source models such as GPT-4.5 and GPT-4o.<\/p>\n<p>Compared with the previous-generation model, Skywork-R1V 3.0 achieves significant performance improvements in several key areas, including physics and logic, making it one of the strongest open-source multimodal reasoning models:<\/p>\n<ul>\n<li><strong>Physical reasoning:<\/strong>\u00a0On the physics benchmarks\u00a0<strong>PhyX-MC-Text-Minimal<\/strong>\u00a0and\u00a0<strong>SeePhys<\/strong>, Skywork-R1V 3.0 scores\u00a0<strong>52.8<\/strong>\u00a0and\u00a0<strong>31.5<\/strong>\u00a0respectively, the\u00a0<strong>best open-source performance<\/strong>, demonstrating its ability in multimodal physics reasoning. The model not only accurately understands basic physics concepts such as mechanics and electromagnetism, but also handles complex physics problems that combine graphics and text (e.g., analyzing technical diagrams such as force diagrams and circuit schematics). Its physical reasoning significantly exceeds that of current mainstream open-source models as well as some closed-source models such as GPT-4.5 and Gemini 2 Flash.<\/li>\n<\/ul>\n<ul>\n<li><strong>Logical reasoning:<\/strong>\u00a0Skywork-R1V 3.0 also excels on several logical reasoning benchmarks: it scores\u00a0<strong>59.7<\/strong>\u00a0on\u00a0<strong>LogicVista<\/strong>\u00a0and\u00a0<strong>28.5<\/strong>\u00a0on\u00a0<strong>VisuLogic<\/strong>. 
On\u00a0<strong>MME-Reasoning<\/strong>, Skywork-R1V 3.0 scores\u00a0<strong>42.8<\/strong>, surpassing the closed-source model Claude-4-Sonnet and demonstrating its leading capabilities in multimodal logical consistency, conditional reasoning, and cross-modal causal modeling.<\/li>\n<\/ul>\n<ul>\n<li><strong>Mathematical reasoning:<\/strong>\u00a0R1V 3.0 demonstrates strong problem-solving skills on math problems. On the leading math benchmarks MathVista, MathVerse, and MathVision, R1V 3.0 scores 77.1, 59.6, and 52.6, respectively, ahead of open-source models such as Qwen2.5-VL-72B-Instruct, InternVL3-78B, and QVQ-72B-Preview.<\/li>\n<\/ul>\n<p>Skywork-R1V 3.0 download:<\/p>\n<ul>\n<li>HuggingFace: https:\/\/huggingface.co\/Skywork\/Skywork-R1V3-38B<\/li>\n<li>GitHub: https:\/\/github.com\/SkyworkAI\/Skywork-R1V<\/li>\n<li>Technical report: https:\/\/github.com\/SkyworkAI\/Skywork-R1V\/blob\/main\/Skywork_R1V3.pdf<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>July 9, 2025 - Kunlun Wanwei has just announced the release of the latest Skywork-R1V 3.0, which is also open source. According to Kunlun Wanwei, Skywork-R1V 3.0 uses reinforcement learning strategies in the post-training phase to draw out the model's cross-modal reasoning ability, achieving a leap in both complex logical modeling and cross-disciplinary generalization. Skywork-R1V 3.0 uses data distilled from the previous-generation reasoning model, Skywork-R1V 2.0, for a \"cold start\": a high-quality multimodal reasoning training set is constructed through rejection sampling and used to teach the open-source large vision model InternVL-38B (38B parameters) the basic format and methods of multimodal reasoning. 
Subsequently, a reinforcement learning algorithm is introduced<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[7154,219,1050],"collection":[],"class_list":["post-39088","post","type-post","status-publish","format-standard","hentry","category-news","tag-skywork-r1v-3-0","tag-219","tag-1050"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/39088","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=39088"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/39088\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=39088"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=39088"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=39088"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=39088"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}