{"id":10768,"date":"2024-05-21T09:31:59","date_gmt":"2024-05-21T01:31:59","guid":{"rendered":"https:\/\/www.1ai.net\/?p=10768"},"modified":"2024-05-21T09:31:59","modified_gmt":"2024-05-21T01:31:59","slug":"%e5%a3%b0%e7%a7%b0%e5%aa%b2%e7%be%8e%e4%ba%ba%e7%b1%bb%e4%b8%93%e5%ae%b6%ef%bc%8c%e8%b0%b7%e6%ad%8c-gemini-1-5-pro-%e6%95%b0%e5%ad%a6%e7%89%88%e6%8f%90%e6%99%ba","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/10768.html","title":{"rendered":"Claiming to be &quot;comparable to human experts&quot;, Google Gemini 1.5 Pro Mathematics Edition &quot;improves intelligence&quot;: MATH benchmark accuracy rate is 91.1%"},"content":{"rendered":"<p data-vmark=\"f48d\"><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%b0%b7%e6%ad%8c\" title=\"[View articles tagged with [Google]]\" target=\"_blank\" >Google<\/a> released a technical report last week saying that the <a href=\"https:\/\/www.1ai.net\/en\/tag\/gemini\" title=\"[View articles tagged with [Gemini]]\" target=\"_blank\" >Gemini<\/a> 1.5 Pro model significantly improved its math scores after specialized math training, <strong>and successfully solved some International Mathematical Olympiad problems.<\/strong><\/p>\n<p data-vmark=\"eec7\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-10769\" title=\"91b9c12e-099a-492c-8b6d-9fc90f51576d\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/05\/91b9c12e-099a-492c-8b6d-9fc90f51576d.jpg\" alt=\"91b9c12e-099a-492c-8b6d-9fc90f51576d\" width=\"728\" height=\"410\" \/><\/p>\n<p data-vmark=\"0897\">Google trained the Gemini 1.5 Pro model specifically for mathematical scenarios and tested it on the MATH benchmark, the American Invitational Mathematics Examination (AIME), and Google&#039;s internal HiddenMath benchmark.<\/p>\n<p data-vmark=\"3784\">According to Google, the math-specialized Gemini 1.5 Pro performs \u201con par with human experts\u201d on math benchmarks, solving significantly more problems on the 
AIME benchmark than the standard, non-math Gemini 1.5 Pro, and also achieving improved scores on other benchmarks.<\/p>\n<p data-vmark=\"bbaf\">Of the three examples shared by Google, two were solved by the math-specific Gemini 1.5 Pro, while one was solved incorrectly by the standard Gemini 1.5 Pro variant. These problems typically require solvers to recall basic algebra formulas and to break them down, together with other math rules, to arrive at the correct answer.<\/p>\n<p data-vmark=\"9b8c\">In addition to the questions, Google also shared important details about the Gemini 1.5 Pro benchmarks, which show that Gemini 1.5 Pro is ahead of GPT-4 Turbo and Anthropic&#039;s Claude in all five benchmark scores.<\/p>\n<p data-vmark=\"00aa\">Google said that the math-specialized variant of Gemini 1.5 Pro achieves a single-sample accuracy of 80.6% on the MATH benchmark, and that accuracy reaches 91.1% when 256 solutions are sampled and a candidate answer is selected (rm@256).<\/p>","protected":false},"excerpt":{"rendered":"<p>Google released a technical report last week, stating that the Gemini 1.5 Pro model, after specialized math training, had significantly improved its math performance and successfully solved some International Mathematical Olympiad problems. Google trained the Gemini 1.5 Pro model specifically for mathematical scenarios and tested it on the MATH benchmark, the American Invitational Mathematics Examination (AIME), and Google's internal HiddenMath benchmark. 
According to Google, the math-specialized Gemini 1.5 Pro performs \u201con par with human experts\u201d on math benchmarks; compared with the standard, non-math Gemini 1.5 Pro, the math-specialized Gemini 1.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[436,281],"collection":[],"class_list":["post-10768","post","type-post","status-publish","format-standard","hentry","category-news","tag-gemini","tag-281"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/10768","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=10768"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/10768\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=10768"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=10768"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=10768"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=10768"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}