According to news from Alibaba's Tongyi team, the internationally recognized large-model evaluation platform Chatbot Arena recently released its latest leaderboard. Qwen3-235B-A22B-Instruct-2507 scored 1433 points, surpassing the top closed-source models Grok4, Claude4, and GPT4.1, and Qwen3 ranked "third in the world" on the overall list.

Chatbot Arena, which uses a blind-test evaluation mechanism, is said to be one of the most influential leaderboards in the field of large AI models.
Qwen3's score of 1433 is the highest ever achieved by an open-source large model worldwide, and the highest ever achieved by a Chinese large model. Qwen3 also ranked "No. 1 in the world" in five key categories: math, coding, hard prompts, longer query, and instruction following.
In addition to the Qwen3 Instruct model, a number of models from the Qwen3 family also achieved excellent results:
The reasoning model Qwen3-235B-A22B-Thinking-2507 also broke into the top ten of the list and tied for first place worldwide in math;
The programming model Qwen3-Coder tied for first place with Gemini2.5 Pro, DeepSeek-R1, and Claude4 in WebDev Arena, the Chatbot Arena sublist that specializes in evaluating programming capabilities.