{"id":31104,"date":"2025-03-19T20:01:13","date_gmt":"2025-03-19T12:01:13","guid":{"rendered":"https:\/\/www.1ai.net\/?p=31104"},"modified":"2025-03-19T20:01:13","modified_gmt":"2025-03-19T12:01:13","slug":"%e4%b8%ad%e5%9b%bd%e4%bf%a1%e9%80%9a%e9%99%a2%e5%90%af%e5%8a%a8-ai%e5%a4%a7%e6%a8%a1%e5%9e%8b%e5%b9%bb%e8%a7%89%e8%af%84%e6%b5%8b%ef%bc%8c%e6%80%bb%e4%bd%93%e6%b6%89%e5%8f%8a%e4%ba%94%e7%a7%8d","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/31104.html","title":{"rendered":"CAICT launches AI large model hallucination evaluation, covering five test dimensions overall"},"content":{"rendered":"<p>March 19, 2025 news - 1AI learned from the official WeChat account of <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e4%b8%ad%e5%9b%bd%e4%bf%a1%e9%80%9a%e9%99%a2\" title=\"View articles with this tag\" target=\"_blank\" >CAICT<\/a> (the China Academy of Information and Communications Technology) that, in order to map out the current state of large model hallucination and to promote deeper, more practical applications of large models, CAICT's Artificial Intelligence Institute has launched a large model <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%b9%bb%e8%a7%89%e6%b5%8b%e8%af%95\" title=\"View articles with this tag\" target=\"_blank\" >hallucination test<\/a>, building on its earlier AI Safety Benchmark measurement work.<\/p>\n<p>Large model hallucination (AI Hallucination) refers to a model, when generating content or answering questions, producing output that appears reasonable but is in fact inconsistent with the user's input (faithfulness hallucination) or contrary to fact (factual hallucination). 
With the wide application of large models in key areas such as healthcare and finance, the potential risks posed by large model hallucination are growing and drawing widespread attention across the industry.<\/p>\n<p>This round of hallucination testing targets large language models and <strong>covers both factual and faithfulness hallucination types<\/strong>. The specific measurement system is as follows:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-31106\" title=\"e7d87710j00stddcg006xd000o100dgp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/03\/e7d87710j00stddcg006xd000o100dgp.jpg\" alt=\"e7d87710j00stddcg006xd000o100dgp\" width=\"865\" height=\"484\" \/><\/p>\n<p><strong>The test data contains more than 7,000 Chinese test samples<\/strong>. The questions come in two formats: information extraction and knowledge reasoning questions, corresponding to faithfulness hallucination detection, and factual discrimination questions, corresponding to factual hallucination detection. <strong>Five test dimensions are covered overall: humanities, social sciences, natural sciences, applied sciences, and formal sciences.<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-31105\" title=\"71c877e2j00stddcv008vd000oo00alp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/03\/71c877e2j00stddcv008vd000oo00alp.jpg\" alt=\"71c877e2j00stddcv008vd000oo00alp\" width=\"888\" height=\"381\" \/><\/p>\n<p>The China Academy of Information and Communications Technology (CAICT) invites all relevant enterprises to participate in the model evaluation and jointly advance the secure application of large models.<\/p>","protected":false},"excerpt":{"rendered":"<p>March 19 news: 1AI learned from the official WeChat account of the China Academy of Information and Communications Technology (CAICT) that, in order to map out the current state of large model hallucination, 
and to promote deeper, more practical applications of large models, the Artificial Intelligence Institute of the China Academy of Information and Communications Technology has launched a large model hallucination test, building on its earlier AI Safety Benchmark measurement work. Large model hallucination (AI Hallucination) refers to a model, when generating content or answering questions, producing output that seems reasonable but is in fact inconsistent with the user's input (faithfulness hallucination) or contrary to fact (factual hallucination). With the wide application of large models in critical areas such as healthcare and finance, the potential risks posed by large model hallucination are growing and drawing widespread attention across the industry. This round of hallucination testing work will be based on large language models as the<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[148,146],"tags":[433,4620,6019],"collection":[],"class_list":["post-31104","post","type-post","status-publish","format-standard","hentry","category-headline","category-news","tag-ai","tag-4620","tag-6019"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/31104","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=31104"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/31104\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=31104"}],"wp:term":[{"taxonomy":"category","embeddab
le":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=31104"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=31104"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=31104"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}