{"id":31761,"date":"2025-03-28T10:49:32","date_gmt":"2025-03-28T02:49:32","guid":{"rendered":"https:\/\/www.1ai.net\/?p=31761"},"modified":"2025-03-28T10:49:32","modified_gmt":"2025-03-28T02:49:32","slug":"%e9%98%bf%e9%87%8c%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae%e6%8e%a8%e5%87%ba%e8%a7%86%e8%a7%89%e6%8e%a8%e7%90%86%e6%a8%a1%e5%9e%8b-qvq-max%ef%bc%9a%e5%8f%af%e5%88%86%e6%9e%90%e3%80%81%e6%8e%a8%e7%90%86","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/31761.html","title":{"rendered":"Ali Tongyi Qianwen Launches Visual Reasoning Model QVQ-Max: Analyzes and Reasons About Image and Video Content"},"content":{"rendered":"<p>March 28 news: early this morning, the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%98%bf%e9%87%8c\" title=\"[View articles tagged with [Ali]]\" target=\"_blank\" >Ali<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae\" title=\"[View articles tagged with [Tongyi Thousand Questions]]\" target=\"_blank\" >Tongyi Qianwen<\/a> team announced the launch of its next-generation <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%a7%86%e8%a7%89%e6%8e%a8%e7%90%86%e6%a8%a1%e5%9e%8b\" title=\"[Sees articles with visual reasoning labels]\" target=\"_blank\" >visual reasoning model<\/a>, <a href=\"https:\/\/www.1ai.net\/en\/tag\/qvq-max\" title=\"[See article with [QVQ-Max] label]\" target=\"_blank\" >QVQ-Max<\/a>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-31762\" title=\"fb9e6e60j00sttbtf002rd000gi00g6p\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/03\/fb9e6e60j00sttbtf002rd000gi00g6p.jpg\" alt=\"fb9e6e60j00sttbtf002rd000gi00g6p\" width=\"594\" height=\"582\" \/><\/p>\n<p>According to the official introduction, QVQ-Max not only understands image and video content, but can also analyze and reason about that information. 
Beyond analysis and reasoning, QVQ-Max can also design illustrations, generate short video scripts, and even create role-playing content based on user needs.<\/p>\n<p>Core capabilities: from observation to reasoning<\/p>\n<p>QVQ-Max's capabilities can be summarized in three areas: careful observation, in-depth reasoning, and flexible application. Here is how it performs in each.<\/p>\n<ul>\n<li><strong>Careful Observation: Capturing Every Detail<\/strong>. QVQ-Max excels at parsing images. Whether it is a complex diagram or a casual everyday photo, it can quickly recognize the key elements. For example, it can tell you what items appear in a photo, what text and logos are present, and even point out small details you might have missed.<\/li>\n<li><strong>In-Depth Reasoning: Not Just \"Seeing\" but \"Thinking\"<\/strong>. Recognizing what is in a picture is not enough; QVQ-Max can analyze that information further and draw conclusions using background knowledge. For example, for a geometry problem it can deduce the answer from the figure accompanying the question, and for a video it can speculate about what might happen next based on what is shown.<\/li>\n<li><strong>Flexible Application: From Answering Questions to Creating<\/strong>. In addition to analyzing and reasoning, QVQ-Max can do some more creative things, such as helping you design illustrations, generating short video scripts, and even creating role-playing content to suit your needs. 
If you upload a rough draft, it can help you refine it into a finished work; upload an everyday photo, and it can turn into a sharp-tongued critic or a fortune teller.<\/li>\n<\/ul>\n<p>QVQ-Max has a wide range of applications and can come in handy at school, at work, and in everyday life.<\/p>\n<ul>\n<li><strong>Career Tools<\/strong>: At work, QVQ-Max can assist with tasks such as analyzing data, organizing information, and writing code.<\/li>\n<li><strong>Learning Assistant<\/strong>: For students, QVQ-Max can help with difficult questions in subjects such as math and physics, especially those involving diagrams. It also makes learning easier by explaining complex concepts in an intuitive way.<\/li>\n<li><strong>Life's Little Helper<\/strong>: QVQ-Max can also offer practical advice in daily life. For example, it can recommend what to wear based on photos of your closet, or show you how to cook a new dish from pictures of a recipe.<\/li>\n<\/ul>\n<p>1AI notes that the model is now available on Qwen Chat: users can tap QVQ-Max's reasoning power by uploading any image or video, asking a question, and clicking the \"Thinking\" button.<\/p>\n<p>Alibaba said this is just one stage in the model's evolution, and that it will continue to optimize performance and expand functionality in the future.<\/p>","protected":false},"excerpt":{"rendered":"<p>March 28 news: early this morning, the Ali Tongyi Qianwen team announced the launch of its new-generation visual reasoning model QVQ-Max. According to the official introduction, QVQ-Max not only understands image and video content, but can also analyze and reason about that information. In addition to analyzing and reasoning, QVQ-Max can also design illustrations, generate short video scripts, and even create role-playing content according to users' needs. 
Core Capabilities: From Observation to Reasoning. QVQ-Max's capabilities can be summarized in three areas: careful observation, in-depth reasoning, and flexible application. Here is how it performs in each. Careful Observation: Capturing Every Detail. QVQ-Max is very good at analyzing images, whether they are complex charts or everyday objects.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[6090,5299,331,1759],"collection":[],"class_list":["post-31761","post","type-post","status-publish","format-standard","hentry","category-news","tag-qvq-max","tag-5299","tag-331","tag-1759"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/31761","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=31761"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/31761\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=31761"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=31761"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=31761"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=31761"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}