{"id":6816,"date":"2024-04-01T09:27:45","date_gmt":"2024-04-01T01:27:45","guid":{"rendered":"https:\/\/www.1ai.net\/?p=6816"},"modified":"2024-04-01T09:27:45","modified_gmt":"2024-04-01T01:27:45","slug":"%e9%98%b2%e6%ad%a2%e8%81%8a%e5%a4%a9%e6%9c%ba%e5%99%a8%e4%ba%ba%e9%80%a0%e8%b0%a3%ef%bc%8c%e8%b0%b7%e6%ad%8c-deepmind%e3%80%81%e6%96%af%e5%9d%a6%e7%a6%8f%e5%a4%a7%e5%ad%a6%e7%a0%94","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/6816.html","title":{"rendered":"To prevent chatbots from &quot;spreading rumors&quot;, Google Deepmind and Stanford University researchers launched AI fact-checking tools"},"content":{"rendered":"<p data-vmark=\"dcf8\">Regardless of the current AI <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%81%8a%e5%a4%a9%e6%9c%ba%e5%99%a8%e4%ba%ba\" title=\"[View articles tagged with [chatbot]]\" target=\"_blank\" >Chatbots<\/a>No matter how powerful AI is, it will have a behavior that is often criticized - providing users with answers that are inconsistent with the facts in a way that looks convincing. In simple terms, AI sometimes &quot;talks nonsense&quot; or even &quot;spreads rumors&quot; in its answers.<\/p>\n<p data-vmark=\"9fe3\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-6817\" title=\"6325020e-4bb8-49c4-97a8-73d6c3dc0ea4\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/6325020e-4bb8-49c4-97a8-73d6c3dc0ea4.jpg\" alt=\"6325020e-4bb8-49c4-97a8-73d6c3dc0ea4\" width=\"640\" height=\"350\" \/><\/p>\n<p>Image source: Pixabay<\/p>\n<p data-vmark=\"c758\">Preventing large AI models from behaving in this way is not easy and is a technical challenge. However, according to foreign media Marktechpost,<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%b0%b7%e6%ad%8c\" title=\"[View articles tagged with [Google]]\" target=\"_blank\" >Google<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/deepmind\" title=\"_Other Organiser\" target=\"_blank\" >DeepMind<\/a> and<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%96%af%e5%9d%a6%e7%a6%8f%e5%a4%a7%e5%ad%a6\" title=\"[Sees articles with tags on Stanford University]\" target=\"_blank\" >Stanford University<\/a>It seems some kind of workaround has been found.<\/p>\n<p data-vmark=\"1f32\">Researchers have introduced a tool based on a large language model -<span class=\"accentTextColor\">\u00a0Search Enhanced Fact Evaluator<\/span>The results of the study, along with the experimental code and dataset, have been published.<a href=\"https:\/\/arxiv.org\/abs\/2403.18802\" target=\"_blank\" rel=\"noopener\">Click here to view<\/a><\/p>\n<p data-vmark=\"43ac\"><span class=\"accentTextColor\">The system analyzes, processes, and evaluates responses generated by the chatbot in four steps:<\/span>, to verify accuracy and truthfulness: split the answer into individual fact checks, correct them, and compare them with Google search results. The system then checks the relevance of each fact to the original question.<\/p>\n<p data-vmark=\"2310\">To evaluate its performance, the researchers created a dataset called LongFact containing about 16,000 facts and tested the system on 13 large language models from Claude, Gemini, GPT, and PaLM-2. The results showed that in a focused analysis of 100 controversial facts, SAFE&#039;s judgments were correct with a rate of 76% under further review. 
To evaluate its performance, the researchers created a dataset called LongFact, containing about 16,000 facts, and tested the system on 13 large language models from the Claude, Gemini, GPT, and PaLM-2 families. In a focused analysis of 100 disputed facts, SAFE's judgments proved correct 76% of the time under further review. The framework also has an economic advantage: it is more than 20 times cheaper than manual annotation.