{"id":7047,"date":"2024-04-03T09:44:19","date_gmt":"2024-04-03T01:44:19","guid":{"rendered":"https:\/\/www.1ai.net\/?p=7047"},"modified":"2024-04-03T09:44:19","modified_gmt":"2024-04-03T01:44:19","slug":"%e7%a0%94%e7%a9%b6%e5%8f%91%e7%8e%b0%ef%bc%9agpt-4%e5%9c%a8%e4%b8%b4%e5%ba%8a%e6%8e%a8%e7%90%86%e4%b8%ad%e8%a1%a8%e7%8e%b0%e4%bc%98%e4%ba%8e%e5%8c%bb%e7%94%9f%ef%bc%8c%e4%bd%86%e4%b9%9f%e6%9b%b4","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/7047.html","title":{"rendered":"GPT-4 outperforms doctors in clinical reasoning, but also makes mistakes more often, study finds"},"content":{"rendered":"<p>In a new study, scientists at Beth Israel Deaconess Medical Center (BIDMC) compared the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e4%b8%b4%e5%ba%8a%e6%8e%a8%e7%90%86\" title=\"[Sees articles with [clinical reasoning] labels]\" target=\"_blank\" >clinical reasoning<\/a> abilities of a large language model with those of human doctors. The researchers used the revised IDEA (r-IDEA) score, a commonly used tool for assessing clinical reasoning ability.<\/p>\n<p>In the study, a chatbot powered by <a href=\"https:\/\/www.1ai.net\/en\/tag\/gpt-4\" title=\"[SEE ARTICLES WITH [GPT-4] LABELS]\" target=\"_blank\" >GPT-4<\/a>, 21 attending physicians, and 18 residents were each given 20 clinical cases and asked to construct diagnostic reasoning and work through the problems. The r-IDEA scores of the three groups' answers were then evaluated. 
The researchers found that the chatbot actually earned the highest r-IDEA score, reflecting impressive diagnostic reasoning; however, the authors also noted that the chatbot was \u201coften completely wrong\u201d more frequently than the physicians.<\/p>\n<p class=\"article-content__img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-7048\" title=\"202307051434452205_0\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/202307051434452205_0.jpg\" alt=\"202307051434452205_0\" width=\"1000\" height=\"752\" \/><\/p>\n<p>Source Note: The image is AI-generated and licensed by Midjourney<\/p>\n<p>&quot;Further research is needed to determine how large language models can best be integrated into clinical practice,&quot; explains lead author Dr. Stephanie Cabral. \u201cIn summary, the results showed reasonable reasoning by the chatbot, but also significant errors; this further supports the view that, at its current level of maturity, this AI-driven system is best suited as a tool to augment physicians\u2019 practice rather than replace their diagnostic abilities.\u201d<\/p>\n<p>As medical leaders and technologists often explain, the practice of medicine is not based solely on the output of rule-based algorithms, but on deep reasoning and clinical intuition, which an <a href=\"https:\/\/www.1ai.net\/en\/tag\/llm\" title=\"[SEE ARTICLES WITH [LLM] LABELS]\" target=\"_blank\" >LLM<\/a> cannot fully replicate. However, tools like this that provide diagnostic or clinical support can still be extremely powerful assets in the physician\u2019s workflow. For example, if a system can reliably provide a \u201cfirst-pass\u201d or preliminary diagnosis suggestion, it may save doctors considerable time in the diagnostic process. 
In addition, there may be opportunities to increase efficiency if these tools can enhance doctors\u2019 workflow and improve their ability to process the large amounts of clinical information in medical records.<\/p>\n<p>Many organizations are taking advantage of these potential clinical enhancements. For example, AI-driven transcription technologies that leverage natural language processing are helping physicians complete clinical documentation more efficiently. Enterprise search tools are integrating with organizational and electronic medical record systems to help physicians search large amounts of data, promote data interoperability, and gain faster and deeper insights into existing patient data. Other systems may even help provide preliminary diagnoses; for example, tools are emerging in the fields of radiology and dermatology that can suggest potential diagnoses by analyzing uploaded photos.<\/p>\n<p>However, there is still much work to be done in this area. In short, although these AI systems are not yet ready for clinical diagnosis, this technology can still be used to enhance clinical workflows, provided that safe and accurate processes are ensured and human oversight is maintained.<\/p>","protected":false},"excerpt":{"rendered":"<p>In a new study, scientists at Beth Israel Deaconess Medical Center (BIDMC) in the United States compared the clinical reasoning ability of a large language model with that of human physicians. The researchers used the revised IDEA (r-IDEA) score, a commonly used tool to assess clinical reasoning ability. The study consisted of giving a chatbot powered by GPT-4, 21 attending physicians, and 18 residents 20 clinical cases to build diagnostic reasoning and solve problems. The r-IDEA scores of the answers from these three groups were then evaluated. The researchers found that the chatbot actually received the highest r-IDEA score, which is quite impressive in terms of diagnostic reasoning. 
However, the authors also noted that the chatbot<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[510,473,2055],"collection":[],"class_list":["post-7047","post","type-post","status-publish","format-standard","hentry","category-news","tag-gpt-4","tag-llm","tag-2055"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/7047","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=7047"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/7047\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=7047"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=7047"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=7047"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=7047"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}