{"id":28892,"date":"2025-02-17T10:27:41","date_gmt":"2025-02-17T02:27:41","guid":{"rendered":"https:\/\/www.1ai.net\/?p=28892"},"modified":"2025-02-17T10:27:41","modified_gmt":"2025-02-17T02:27:41","slug":"deepseek-%e7%ad%89%e7%a7%92%e5%8f%98%e6%93%8d%e6%8e%a7%e7%94%b5%e8%84%91-ai%e6%99%ba%e8%83%bd%e4%bd%93%ef%bc%8c%e5%be%ae%e8%bd%af%e5%bc%80%e6%ba%90%e5%b7%a5%e5%85%b7-omniparser-v2-0-%e5%8f%91%e5%b8%83","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/28892.html","title":{"rendered":"Microsoft releases open-source tool OmniParser V2.0, which turns DeepSeek and other models into computer-controlling AI agents in seconds"},"content":{"rendered":"<p>February 17: <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%be%ae%e8%bd%af\" title=\"[View articles tagged with [Microsoft]]\" target=\"_blank\" >Microsoft<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/omniparser\" title=\"_Other Organiser\" target=\"_blank\" >OmniParser<\/a> is an AI tool that parses the screen and recognizes interactive on-screen icons for purely vision-based GUI agents. It was previously paired with GPT-4V, significantly enhancing recognition capabilities.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-28893\" title=\"fed3f772j00srt2sx005hd000bz00bpp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/02\/fed3f772j00srt2sx005hd000bz00bpp.jpg\" alt=\"fed3f772j00srt2sx005hd000bz00bpp\" width=\"431\" height=\"421\" \/><\/p>\n<p>On February 12, Microsoft released\u00a0<strong>OmniParser V2.0, the latest version<\/strong>, on its official website. It can turn OpenAI (4o \/ o1 \/ o3-mini), <a href=\"https:\/\/www.1ai.net\/en\/tag\/deepseek\" title=\"[View articles tagged with [DeepSeek]]\" target=\"_blank\" >DeepSeek<\/a> (R1), Qwen (2.5VL) and Anthropic (Sonnet) models into AI agents that can operate computers.<\/p>\n<p>Compared to V1, OmniParser V2 was trained on a larger set of interactive element detection data and icon 
functional caption data, so it detects smaller interactable UI elements more accurately and runs inference faster, with latency reduced by 60%.<\/p>\n<p>On ScreenSpot Pro, a high-resolution agent benchmark, <strong>V2+GPT-4o reached an accuracy of 39.6%<\/strong>, while GPT-4o on its own scored only 0.8%.<\/p>\n<p>To make it faster to experiment with different agent setups, <strong>Microsoft has also open-sourced OmniTool, a Dockerized Windows system that integrates a set of basic tools needed by agents<\/strong>, covering functions such as screen understanding, localization, action planning and execution. It is a key tool for turning large models into agents.<\/p>","protected":false},"excerpt":{"rendered":"<p>February 17: Microsoft OmniParser is an AI tool that parses the screen and recognizes interactive on-screen icons for purely vision-based GUI agents. It was previously paired with GPT-4V, significantly enhancing recognition. On February 12, Microsoft released the updated OmniParser V2.0 on its official website, which turns models such as OpenAI (4o \/ o1 \/ o3-mini), DeepSeek (R1), Qwen (2.5VL) and Anthropic (Sonnet) into AI agents capable of operating computers. 
OmniParser V2 uses larger-scale interactive element detection data compared to V1<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[786,3606,5737,280],"collection":[],"class_list":["post-28892","post","type-post","status-publish","format-standard","hentry","category-news","tag-ai","tag-deepseek","tag-omniparser","tag-280"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/28892","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=28892"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/28892\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=28892"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=28892"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=28892"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=28892"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}