In recent years, artificial intelligence (AI) has made significant progress in many areas, with large language models (LLMs) capable of generating human-level text and even exceeding human performance on some tasks. However, researchers have questioned LLMs' reasoning ability: they found that these models make mistakes on simple mathematical problems after just a few minor changes, suggesting that they may not be capable of true logical reasoning.

On Thursday, a group of researchers at Apple published a paper titled "Understanding the Limitations of Mathematical Reasoning in Large Language Models," revealing that LLMs are susceptible to interference when solving mathematical problems. IT House notes that the researchers tested the reasoning ability of LLMs by making small changes to math problems, such as adding irrelevant information. It turns out that the performance of these models drops dramatically in the face of such changes.
For example, the researchers posed a simple math question: "Oliver picks 44 kiwis on Friday and 58 kiwis on Saturday. On Sunday, he picks twice as many kiwis as on Friday. How many kiwis did Oliver pick?" The LLM was able to calculate the answer correctly. However, when the researchers added an irrelevant detail, "On Sunday, he picked twice as many kiwis as on Friday, but five of them were a bit smaller than average," the LLM answered incorrectly. For example, GPT-o1-mini responded: "... on Sunday, 5 of these kiwis were smaller than average. We need to subtract them from the Sunday total: 88 - 5 = 83."
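The arithmetic itself is straightforward; a short Python check contrasting the correct reading with the faulty one reported for GPT-o1-mini makes the error concrete (the variable names below are ours, chosen for illustration):

```python
# Correct reading: the remark about smaller kiwis does not change the count.
friday = 44
saturday = 58
sunday = 2 * friday                       # twice Friday's pick: 88

correct_total = friday + saturday + sunday
print(correct_total)                      # 190 kiwis

# Faulty reading reported for GPT-o1-mini: subtracting the five
# smaller kiwis from Sunday's total as if they did not count.
faulty_sunday = sunday - 5                # 88 - 5 = 83
print(friday + saturday + faulty_sunday)  # 185, the wrong answer
```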
The above is just one simple example. The researchers modified hundreds of questions in this way, and almost all of the modifications resulted in a significant drop in the models' success rate.
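As a rough illustration of the kind of evaluation this implies, here is a minimal sketch in the spirit of the experiment described above; `query_model` is a hypothetical stand-in for whatever API is used to call an LLM, not code from the paper:

```python
# Sketch of a perturbation-style evaluation: run each problem in its
# original form and with an irrelevant clause added, then compare accuracy.
def query_model(prompt: str) -> int:
    """Hypothetical: send `prompt` to an LLM and parse an integer answer."""
    raise NotImplementedError("replace with a real model call")

def evaluate(problems: list[tuple[str, str, int]]) -> tuple[float, float]:
    """Each item is (original, perturbed_with_irrelevant_clause, answer).

    Returns accuracy on the originals and on the perturbed variants;
    the gap between the two measures sensitivity to irrelevant details.
    """
    base_ok = sum(query_model(orig) == ans for orig, _, ans in problems)
    pert_ok = sum(query_model(pert) == ans for _, pert, ans in problems)
    n = len(problems)
    return base_ok / n, pert_ok / n
```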
According to the researchers, this phenomenon suggests that LLMs do not really understand math problems; instead, they make predictions based on patterns in their training data. When genuine reasoning is required, such as deciding whether the smaller kiwis should still be counted, they produce strange and implausible results.
This finding has important implications for the development of AI. Although LLMs perform well in many areas, their reasoning ability still has clear limitations. In the future, researchers will need to explore how to improve LLMs' reasoning so that they can better understand and solve complex problems.