{"id":9647,"date":"2024-05-01T08:42:41","date_gmt":"2024-05-01T00:42:41","guid":{"rendered":"https:\/\/www.1ai.net\/?p=9647"},"modified":"2024-04-30T22:43:22","modified_gmt":"2024-04-30T14:43:22","slug":"%e6%9c%80%e6%96%b0%e4%b8%ad%e6%96%87%e5%a4%a7%e6%a8%a1%e5%9e%8b%e6%b5%8b%e8%af%84%ef%bc%9a%e7%99%be%e5%b7%9d%e6%99%ba%e8%83%bd-baichuan-3-%e5%9b%bd%e5%86%85%e7%ac%ac%e4%b8%80","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/9647.html","title":{"rendered":"Latest Chinese large model evaluation: Baichuan Intelligence&#039;s Baichuan 3 ranks first in China"},"content":{"rendered":"<p data-vmark=\"df56\">Today, the domestic <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%a7%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [large models]]\" target=\"_blank\" >large model<\/a> evaluation organization SuperCLUE released the &quot;Chinese Large Model Benchmark Evaluation April 2024 Report&quot;, which evaluates the April versions of 32 representative domestic and overseas large models across multiple dimensions to assess the current state of large model development. The report shows that <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e7%99%be%e5%b7%9d%e6%99%ba%e8%83%bd\" title=\"[View articles tagged with [Baichuan Intelligent]]\" target=\"_blank\" >Baichuan Intelligence<\/a>&#039;s <strong>Baichuan 3 ranks first among domestic large models<\/strong>, followed by Zhipu GLM-4, Tongyi Qianwen 2.1, Wenxin Yiyan 4.0, Moonshot (Kimi) and other large models. 
Globally, the overseas models GPT-4 and Claude3 scored higher.<\/p>\n<p data-vmark=\"f049\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-9648\" title=\"f3610b0b-1361-428e-8cb5-52d305f1c356\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/f3610b0b-1361-428e-8cb5-52d305f1c356.png\" alt=\"f3610b0b-1361-428e-8cb5-52d305f1c356\" width=\"1080\" height=\"737\" \/><\/p>\n<p data-vmark=\"c85b\">SuperCLUE is a comprehensive evaluation benchmark for general-purpose large models in China. Its predecessor is the third-party Chinese language understanding benchmark CLUE (The Chinese Language Understanding Evaluation). Unlike traditional evaluations built on multiple-choice questions, SuperCLUE also includes open-ended subjective questions. Using a multi-dimensional, multi-perspective, multi-level evaluation system in dialogue form, it simulates the application scenarios of large models to examine their generation capabilities realistically. SuperCLUE also constructs multi-round dialogue scenarios to examine more deeply how large models perform in real multi-round conversations, comprehensively evaluating their context, memory, and dialogue capabilities.<\/p>\n<p data-vmark=\"222e\">According to reports, the SuperCLUE evaluation consists of ten basic tasks, including logical reasoning, code, language comprehension, long text, and role-playing. The questions are multi-round, open-ended short-answer questions, and the evaluation set contains 2,194 questions in total.<\/p>\n<p data-vmark=\"182d\">The test results show that <strong>Baichuan 3 is well balanced across the arts and sciences. 
In knowledge and encyclopedia ability, Baichuan 3 surpassed GPT-4-Turbo with a score of 82, ranking first among all 32 domestic and overseas large models in the evaluation.<\/strong> In &quot;logical reasoning&quot;, an ability regarded as representing the intelligence of large models, it surpassed Claude3-Opus with a score of 68.60 and also took first place among domestic large models. In addition, Baichuan 3 performed well in computing, code, and tool usage, ranking among the top three in China.<\/p>\n<p data-vmark=\"0c5f\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-9649\" title=\"b9f5fa4a-3b14-481d-9fad-3ac1a431fb85\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/b9f5fa4a-3b14-481d-9fad-3ac1a431fb85.png\" alt=\"b9f5fa4a-3b14-481d-9fad-3ac1a431fb85\" width=\"1080\" height=\"195\" \/><\/p>","protected":false},"excerpt":{"rendered":"<p>Today, SuperCLUE, a domestic large model evaluation organization, released the \"Chinese Large Model Benchmark Evaluation April 2024 Report\", which evaluates the April versions of 32 representative domestic and overseas large models across multiple dimensions to assess the current state of large model development. According to the report, Baichuan Intelligence's Baichuan 3 ranked first among domestic large models, followed by Zhipu GLM-4, Tongyi Qianwen 2.1, Wenxin Yiyan 4.0, Moonshot (Kimi) and other large models. From a global perspective, the overseas models GPT-4 and Claude3 scored higher. 
SuperCLUE is a comprehensive evaluation benchmark for general-purpose large models in China.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[216,428],"collection":[],"class_list":["post-9647","post","type-post","status-publish","format-standard","hentry","category-news","tag-216","tag-428"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/9647","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=9647"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/9647\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=9647"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=9647"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=9647"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=9647"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}