{"id":49264,"date":"2026-01-27T11:53:21","date_gmt":"2026-01-27T03:53:21","guid":{"rendered":"https:\/\/www.1ai.net\/?p=49264"},"modified":"2026-01-27T11:53:21","modified_gmt":"2026-01-27T03:53:21","slug":"%e9%98%bf%e9%87%8c%e5%8f%91%e5%b8%83%e5%8d%83%e9%97%ae%e6%97%97%e8%88%b0%e6%8e%a8%e7%90%86%e6%a8%a1%e5%9e%8b-qwen3-max-thinking%ef%bc%9a%e6%80%bb%e5%8f%82%e6%95%b0%e8%b6%85%e4%b8%87%e4%ba%bf%ef%bc%8c","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/49264.html","title":{"rendered":"Qwen3-Max-Thinking: The total parameter is over trillions of dollars, and is called performance GPT-5.2"},"content":{"rendered":"<p>January 27th.<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%98%bf%e9%87%8c\" title=\"[View articles tagged with [Ali]]\" target=\"_blank\" >Ali<\/a>release<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%8d%83%e9%97%ae\" title=\"[See articles with [thousands of questions] labels]\" target=\"_blank\" >Questions<\/a>Flagship<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%8e%a8%e7%90%86%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [inference model]]\" target=\"_blank\" >inference model<\/a> Qwen3-Max-Thinking. According to official presentations, it achieved significant improvements in several critical dimensions, including:<strong>Factual knowledge, complex reasoning, command compliance, human preferences and intelligence capabilities<\/strong>I don't know. Performance in 19 authoritative benchmark tests<strong>Optimistic top model GPT-5.2-Thinking, Claude-Opus-4.5 and Gemini 3 Pro<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-49265\" title=\"2706432aj00t9i841008kd000u000t7p\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2026\/01\/2706432aj00t9i841008kd000u000t7p.jpg\" alt=\"2706432aj00t9i841008kd000u000t7p\" width=\"1080\" height=\"1051\" \/><\/p>\n<p>New model<strong>Total parameters over trillion<\/strong>Larger-scale intensive post-learning training was undertaken and, through a series of innovations in reasoning techniques, a significant leap in model performance was finalized. Qwen3-Max-Thinking has also significantly enhanced the original Agent capacity of the autonomous call tool in several key energy benchmarking tests<strong>Think like a professional<\/strong>The answer is more user-friendly, intelligent and fluid. At the same time, model illusions have been significantly reduced, laying the groundwork for real and complex tasks\u3002<\/p>\n<p>According to the official presentation, Qwen3-Max-Thinking has updated several Best Performance (SOTA) records, particularly in key performance benchmark tests such as Scientific Knowledge (GPQA Diamond), Mathematical Logic (IMO-AnswerBench), and LiveCodeBench, to achieve international lead\u3002<\/p>\n<p>Qwen3-Max-Thinking is now online Qwen Chat, and users can interact directly with the model and its self-adaptation tool call function. 
Qwen3-Max-Thinking is reported to feature two core innovations:

- Adaptive tool calling, which invokes the search engine and code interpreter on demand, now live on Qwen Chat.
- Test-time scaling, which markedly improves reasoning performance, surpassing Gemini 3 Pro on key reasoning benchmarks.

The official description is as follows:

- Adaptive tool-calling capability: Qwen3-Max-Thinking can freely choose and invoke its built-in search, memory, and code-interpreter functions within a conversation, unlike earlier approaches that required users to select tools manually. This ability comes from a specially designed training process: after initial tool-use fine-tuning, the model is further trained on diverse tasks using rule-based and model-based feedback. Experiments show that the search and memory tools effectively mitigate hallucinations, provide real-time information access, and support more personalized responses, while the code interpreter lets the model execute code snippets and apply computational reasoning to complex problems. Together, these functions provide a fluid and powerful conversational experience.
- Test-time scaling: Test-time scaling refers to allocating extra compute at inference time to improve model performance. We propose a cumulative, multi-round serial scaling approach. Rather than simply increasing the number of parallel reasoning traces N (which often produces redundancy), we cap N and spend the saved compute on iterative self-reflection guided by an "experience library" mechanism. The mechanism distills key insights from past reasoning, letting the model avoid re-deriving known findings and focus instead on unresolved uncertainties. Crucially, compared with referencing the raw reasoning trajectories directly, it uses the context more efficiently and integrates historical information more fully within the same context window. At roughly the same token consumption, the method consistently outperforms standard parallel sampling-and-aggregation: GPQA (90.3 → 92.8), HLE (34.1 → 36.5), LiveCodeBench v6 (88.0 → 91.4), IMO-AnswerBench (89.5 → 91.5), and HLE (with tools) (55.8 → 58.3).
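The adaptive tool calling described above, where the model itself decides whether to search or answer directly, can be pictured with a generic function-calling loop. The sketch below uses the OpenAI-style tools protocol; the web_search schema, its stub implementation, and the dispatch loop are illustrative assumptions, not Qwen Chat's actual internals.

```python
# Illustrative sketch of model-driven ("adaptive") tool calling using the
# generic OpenAI-style function-calling protocol. Tool names and the loop
# are hypothetical stand-ins, not the Qwen Chat implementation.
import json
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint is already configured

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",  # hypothetical tool name
            "description": "Search the web for up-to-date information.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
]

def web_search(query: str) -> str:
    # Placeholder; a real deployment would call an actual search API.
    return f"(stub results for: {query})"

def run(user_message: str, model: str = "qwen3-max-2026-01-23") -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = client.chat.completions.create(model=model, messages=messages, tools=TOOLS)
        msg = reply.choices[0].message
        if not msg.tool_calls:          # the model chose to answer directly
            return msg.content
        messages.append(msg)            # keep the assistant turn that requested tools
        for call in msg.tool_calls:     # the model chose to use a tool
            args = json.loads(call.function.arguments)
            result = web_search(**args)
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```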
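The "experience library" mechanism in the test-time scaling description can likewise be pictured as a small serial loop: each round feeds distilled insights from earlier attempts back into the prompt instead of replaying full reasoning trajectories, keeping the token budget roughly flat. The sketch below is a rough illustration under that reading; the prompt wording, the ANSWER:/INSIGHT: convention, and the function names are all hypothetical.

```python
# Minimal sketch of serial test-time scaling guided by an "experience library":
# a few rounds of self-reflection, each conditioned on distilled insights from
# previous rounds rather than on the raw reasoning traces. Everything here is
# an illustrative assumption, not Qwen's actual implementation.
from typing import Callable

def solve_with_experience_library(
    question: str,
    generate: Callable[[str], str],  # any text-in/text-out model call
    rounds: int = 3,
) -> str:
    experience: list[str] = []       # distilled insights, not full trajectories
    answer = ""
    for _ in range(rounds):
        notes = "\n".join(f"- {e}" for e in experience) or "- (none yet)"
        prompt = (
            f"Question: {question}\n"
            f"Insights from previous attempts:\n{notes}\n"
            "Reason step by step, avoid re-deriving known findings, focus on the "
            "remaining uncertainties, and finish with 'ANSWER:' and 'INSIGHT:' lines."
        )
        output = generate(prompt)
        # Pull the final answer and one distilled insight out of this round.
        for line in output.splitlines():
            if line.startswith("ANSWER:"):
                answer = line.removeprefix("ANSWER:").strip()
            elif line.startswith("INSIGHT:"):
                experience.append(line.removeprefix("INSIGHT:").strip())
    return answer
```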