{"id":10359,"date":"2024-05-15T09:29:01","date_gmt":"2024-05-15T01:29:01","guid":{"rendered":"https:\/\/www.1ai.net\/?p=10359"},"modified":"2024-05-15T09:29:01","modified_gmt":"2024-05-15T01:29:01","slug":"openai-%e6%9b%be%e7%a7%98%e5%af%86%e6%b5%8b%e8%af%95-gpt-4o%ef%bc%8c%e5%8a%9b%e5%8e%8b%e7%be%a4%e9%9b%84%e7%99%bb%e9%a1%b6%e8%81%8a%e5%a4%a9%e6%9c%ba%e5%99%a8%e4%ba%ba%e7%ab%9e%e6%8a%80%e5%9c%ba","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/10359.html","title":{"rendered":"OpenAI secretly tested GPT-4o, and it topped the Chatbot Arena rankings"},"content":{"rendered":"<p data-vmark=\"8fe3\"><a href=\"https:\/\/www.1ai.net\/en\/tag\/openai\" title=\"[View articles tagged with [OpenAI]]\" target=\"_blank\" >OpenAI<\/a> employee William Fedus confirmed on social media platform X on Monday that the mysterious <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%81%8a%e5%a4%a9%e6%9c%ba%e5%99%a8%e4%ba%ba\" title=\"[View articles tagged with [chatbot]]\" target=\"_blank\" >chatbot<\/a> &quot;gpt2-chatbot&quot;, which performed well in the LMSYS Chatbot Arena, is <a href=\"https:\/\/www.1ai.net\/en\/tag\/gpt-4o\" title=\"[View articles tagged with [GPT-4o]]\" target=\"_blank\" >GPT-4o<\/a>, the new artificial intelligence model the company just released. Fedus also revealed that GPT-4o topped the Arena leaderboard in testing, achieving the highest score ever recorded there.<\/p>\n<p data-vmark=\"1d21\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-10360\" title=\"3b1f8ce9-ffac-4298-933a-47afa81ceaf9\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/05\/3b1f8ce9-ffac-4298-933a-47afa81ceaf9.jpg\" alt=\"3b1f8ce9-ffac-4298-933a-47afa81ceaf9\" width=\"1160\" height=\"653\" \/><\/p>\n<p data-vmark=\"3733\">\u201cGPT-4o is our most advanced cutting-edge model,\u201d Fedus wrote on X. 
\u201cWe\u2019ve been testing a version of it in the Arena under the name \u2018im-also-a-good-gpt2-chatbot\u2019.\u201d<\/p>\n<p data-vmark=\"e04b\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-10361\" title=\"e6750580-2291-49a1-bae8-2cbb10544b06\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/05\/e6750580-2291-49a1-bae8-2cbb10544b06.jpg\" alt=\"e6750580-2291-49a1-bae8-2cbb10544b06\" width=\"1200\" height=\"700\" \/><\/p>\n<p data-vmark=\"1a95\">Chatbot Arena is a website where visitors can talk to two randomly selected AI language models at the same time, without knowing which is which, and then choose the model that provides the better response.<\/p>\n<p data-vmark=\"d51d\">Starting in April this year, OpenAI tested multiple versions of GPT-4o in the arena. The model first appeared under the name &quot;gpt2-chatbot&quot;, then became &quot;im-a-good-gpt2-chatbot&quot;, and finally &quot;im-also-a-good-gpt2-chatbot&quot;.<\/p>\n<p data-vmark=\"8aa3\">Since GPT-4o was released today, multiple sources have revealed that the model has topped LMSYS\u2019s internal leaderboard by a wide margin, surpassing the previous top-ranked models Claude 3 Opus and GPT-4 Turbo.<\/p>\n<p data-vmark=\"1059\">The official account of <span class=\"link-text-start-with-http\">lmsys.org<\/span> shared a chart and wrote: &quot;The &#039;gpt2-chatbot&#039; series of models has just soared to the top of the list, surpassing all other models by a significant margin (about 50 Elo), and has become the most powerful model in the arena. This is an internal screenshot. 
The public version of &#039;gpt-4o&#039; has now entered the arena and will soon appear on the public leaderboard!&quot;<\/p>\n<p data-vmark=\"0723\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-10362\" title=\"a07771dc-a6b3-4890-a377-e553877ce2a1\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/05\/a07771dc-a6b3-4890-a377-e553877ce2a1.jpg\" alt=\"a07771dc-a6b3-4890-a377-e553877ce2a1\" width=\"1280\" height=\"1006\" \/><\/p>\n<p data-vmark=\"b850\">As of press time, &quot;im-also-a-good-gpt2-chatbot&quot; has an Elo score of 1309, ahead of GPT-4-Turbo-2024-04-09 with 1253 and Claude 3 Opus with 1246. Claude 3 Opus and GPT-4 Turbo had been competing for the top spot on the leaderboard until the three &quot;gpt2-chatbot&quot; variants showed up and upended the rankings.<\/p>","protected":false},"excerpt":{"rendered":"<p>OpenAI employee William Fedus confirmed on Monday on social platform X that the mysterious chatbot \u201cgpt2-chatbot\u201d, which recently performed excellently in the LMSYS Chatbot Arena, is GPT-4o, the new artificial intelligence model the company had just released. Fedus also revealed that GPT-4o achieved the highest score in the Arena\u2019s history. 
\u201cGPT-4o is our most advanced cutting-edge model,\u201d Fedus wrote on X. \u201cWe\u2019ve been testing a version of the model in the arena under the name \u2018im-also-a-good-gpt2-chatbot\u2019.\u201d Chat<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[2582,190,275],"collection":[],"class_list":["post-10359","post","type-post","status-publish","format-standard","hentry","category-news","tag-gpt-4o","tag-openai","tag-275"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/10359","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=10359"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/10359\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=10359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=10359"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=10359"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=10359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}