AI vs. Peking University Chemistry Students: The Top Model Only Matches the Average Lower-Year Undergraduate

December 29, according to Xinhua: Peking University recently released the results of SUPERChem, a multimodal deep-reasoning benchmark for chemistry, developed by a team from the university's Computing Center, School of Computer Science, and Yuanpei College.


The team recently used this "Peking University exam paper" as a yardstick to measure the true boundaries of AI's scientific reasoning.

According to the report, the exam was taken not only by models including GPT, Gemini, DeepSeek, and Qwen, but also by two lower-year undergraduates from Peking University's College of Chemistry and Molecular Engineering.

According to the report, the SUPERChem question bank consists of 500 questions deeply adapted from difficult exam problems and cutting-edge specialist literature, rather than drawn from public online question banks. It is designed to present AI with questions it has "not seen", so that models must rely on genuine reasoning ability.

In this carefully designed exam, the human participants showed sophisticated scientific intuition. As a baseline, the undergraduates from Peking University's College of Chemistry and Molecular Engineering who took the test achieved an average accuracy of 40.3%.

AI, by contrast, did not fare as well:

Even the best model tested performed only at about the level of the average lower-year undergraduate. According to the leaderboard, the top scorer, GPT-5 (High), reached an accuracy of 39.6%, which is below the human baseline.

Not only was the accuracy unimpressive, but in some respects the models' behavior also puzzled the team:

The language of chemistry is graphical: molecular structures and reaction mechanism diagrams carry key information. For some models, however, accuracy fell rather than rose when image information was introduced. This suggests that current AI still faces a significant perception bottleneck in translating visual information into chemical semantics.

Even when a model chooses the right answer, its reasoning may not hold up. The team found that AI reasoning chains tended to break down on high-level tasks such as product structure prediction, reaction mechanism identification, and structure-activity relationship analysis. Despite their vast stores of knowledge, current top models still struggle with hardcore chemistry problems that require rigorous logic and deep understanding.

According to the report, the team released these results not to expose AI's weaknesses but to push the field forward. SUPERChem is like a signpost. It reminds us:

There is still a long way to go from a general-purpose chatbot to a professional scientific assistant that can understand structure-activity relationships and work through reaction mechanisms. It is the distance between "memorizing knowledge" and "understanding the physical world."
