Tongyi Qianwen math model Qwen2-Math demo released, 72B version beats GPT-4

Alibaba's."Thousand Questions on Tongyi"The team has made another big news! They've just released the Qwen2Math Demo, whichMathematical modelIt's just a little monster.GPT-4All were trampled under its feet.

The model can not only handle math problems typed in as text, it can also read formulas from pictures and screenshots. Imagine snapping a photo of an equation and getting the answer back. It is simply a magic tool for math class! (Of course, we do not encourage cheating.)

Qwen2-Math comes in three sizes: 72B, 7B and 1.5B. The 72B version is a genuine math whiz: it scored 7 points higher than GPT-4 on the MATH dataset, a 9.6% improvement. That is like scoring 145 on the college entrance exam math paper while the top student next to you gets 132.

What's more, the 7B version uses less than a tenth of the parameters yet outperforms NuminaMath, the 72B open-source math model that won the world's first AIMO, a prize presented by none other than the "big name" of the math world, Terence Tao (Tao Zhexuan), himself.

Alibaba senior algorithm expert Junyang Lin excitedly announced that they had turned the Qwen2 model into a math whiz. How did they do it? With a special "math brain booster": a carefully designed math-specific corpus containing large amounts of high-quality mathematical web text, books, code, exam questions, and even math problems synthesized by the Qwen2 model itself.

The result? On classic math benchmarks such as GSM8K and MATH, Qwen2-Math-72B left the 405B Llama-3.1 behind. These benchmarks are no joke: they cover algebra, geometry, probability, number theory and more.

Not only that, Qwen2-Math also took on the Chinese dataset CMATH and college entrance exam questions. On the Chinese dataset, even the 1.5B version beats the 70B Llama-3.1, and every size shows a clear improvement over the Qwen2 base model of the same scale.

It seems Tongyi Qianwen has really raised a math genius this time! Can we hand our math homework over to it in the future? Remember, though, it is just a tool: don't be fooled by its cleverness, you still need to practice your own math skills.

Try it online: https://huggingface.co/spaces/Qwen/Qwen2-Math-Demo
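For readers who would rather call the model directly than use the web demo, here is a minimal sketch of how one might query a Qwen2-Math instruct checkpoint with the Hugging Face transformers library. The repository id "Qwen/Qwen2-Math-7B-Instruct" and the example prompt are assumptions for illustration; check the model hub for the exact names before running.

# Minimal sketch: querying a Qwen2-Math instruct model via Hugging Face transformers.
# The repo id below is an assumption based on the announced sizes; verify it on the hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-Math-7B-Instruct"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful math assistant."},
    {"role": "user", "content": "Solve for x: 2x + 3 = 11. Show your steps."},
]

# Qwen2-style chat models ship a chat template; apply it, then generate.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))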
