{"id":11399,"date":"2024-05-27T10:57:50","date_gmt":"2024-05-27T02:57:50","guid":{"rendered":"https:\/\/www.1ai.net\/?p=11399"},"modified":"2024-05-27T10:57:50","modified_gmt":"2024-05-27T02:57:50","slug":"gpt-4%e8%a2%ab%e8%af%81%e5%ae%9e%e6%9c%89%e4%ba%ba%e7%b1%bb%e5%bf%83%e6%99%ba%ef%bc%81%e7%bd%91%e5%8f%8b%ef%bc%9a%e8%bf%9eai%e9%83%bd%e5%8f%af%e4%bb%a5%e7%9c%8b%e5%87%ba%e4%bb%96%e5%9c%a8%e5%98%b2","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/11399.html","title":{"rendered":"GPT-4 is proven to have human mind! Netizen: Even AI can see that he is mocking you"},"content":{"rendered":"<p>New research published in the journal Nature shows that the<a href=\"https:\/\/www.1ai.net\/en\/tag\/gpt-4\" title=\"[SEE ARTICLES WITH [GPT-4] LABELS]\" target=\"_blank\" >GPT-4<\/a>Performance in Theory of Mind (ToM) is comparable to humans and even exceeds them in some areas. The study was conducted by James W. A. Strachan et al. They evaluated the performance of GPT-4, GPT-3.5, Llama2, and human participants through a series of tests and compared them.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-11400\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/05\/6385240375159498856112958.png\" alt=\"\" width=\"749\" height=\"624\" \/><\/p>\n<p>The following are the key findings of the study.<\/p>\n<p>Theory of Mind Performance:Theory of Mind is the ability to understand the mental states of others and is critical to social interaction.The GPT-4 performs as well as humans in Theory of Mind and even outperforms humans in detecting sarcasm and innuendo.<\/p>\n<p>Test Items:The study included five test items, namely false beliefs, irony, gaffes, innuendo, and strange stories.The GPT-4 significantly outperformed humans on three tests, namely irony, innuendo, and strange stories, and was on par with humans on the false beliefs test, and only underperformed humans on the gaffes test.<\/p>\n<p>Conservatism:The GPT-4's low 
scores on the gaffes test are not due to a lack of comprehension, but to a conservative strategy of not readily committing to definitive judgments.<\/p>\n<p>Gaffe likelihood test: In the gaffe likelihood test, GPT-4 performed flawlessly, showing that it could infer the speaker's mental state and determine that an unintentional offense was more likely than an intentional insult.<\/p>\n<p>Separation of ability and performance: The research suggests that GPT models may have the capacity to compute mentalistic inferences, but that they perform differently from humans under uncertainty: humans tend to resolve uncertainty, whereas GPT models do not spontaneously commit to an inference in order to reduce it.<\/p>\n<p>Cautious behavior: GPT-4's conservatism on the gaffes test may stem from mitigation measures in its underlying design that are intended to improve factuality and to keep users from over-relying on the model.<\/p>\n<p>The results of this study suggest that GPT-4's ability to understand human mental states may be underestimated. The researchers call for a \"machine psychology\" that applies the tools and paradigms of experimental psychology to systematically study the capabilities and limitations of large language models.<\/p>\n<p>Paper address: https:\/\/www.nature.com\/articles\/s41562-024-01882-z<\/p>","protected":false},"excerpt":{"rendered":"<p>New research published in the journal Nature shows that GPT-4's performance in Theory of Mind (ToM) is comparable to, and in some respects exceeds, that of humans. The study was conducted by James W. A. Strachan et al., who assessed and compared the performance of GPT-4, GPT-3.5, Llama2 and human participants through a series of tests. The following are the key findings of the study: Theory of Mind performance: Theory of Mind is the ability to understand the mental states of others and is essential for social interaction.
GPT-4's performance in Theory of Mind is no different from that of humans, and it is even better at detecting sarcasm and innuendo. Test items: the study included five test items<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[510],"collection":[],"class_list":["post-11399","post","type-post","status-publish","format-standard","hentry","category-news","tag-gpt-4"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/11399","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=11399"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/11399\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=11399"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=11399"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=11399"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=11399"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}