{"id":29629,"date":"2025-02-26T11:34:26","date_gmt":"2025-02-26T03:34:26","guid":{"rendered":"https:\/\/www.1ai.net\/?p=29629"},"modified":"2025-02-26T11:34:26","modified_gmt":"2025-02-26T03:34:26","slug":"%e5%be%ae%e8%bd%af%e5%bc%80%e6%ba%90%e5%a4%9a%e6%a8%a1%e6%80%81-ai-agentmagma%ef%bc%9a%e8%b4%ad%e7%89%a9%e6%97%b6%e5%8f%af%e8%87%aa%e5%8a%a8%e4%b8%8b%e5%8d%95%ef%bc%8c%e8%bf%98","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/29629.html","title":{"rendered":"Microsoft open-sources multimodal AI Agent \"Magma\": it can place shopping orders automatically and predict the behavior of people in videos"},"content":{"rendered":"<p>Feb. 26, 2025 - Early this morning, Beijing time, <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%be%ae%e8%bd%af\" title=\"[View articles tagged with [Microsoft]]\" target=\"_blank\" >Microsoft<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >open-sourced<\/a> on its official website a <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%9a%e6%a8%a1%e6%80%81\" title=\"[View articles tagged with [multimodal]]\" target=\"_blank\" >multimodal<\/a> AI <a href=\"https:\/\/www.1ai.net\/en\/tag\/agent\" title=\"[View articles tagged with [Agent]]\" target=\"_blank\" >Agent<\/a> base model, <a href=\"https:\/\/www.1ai.net\/en\/tag\/magma\" title=\"[See articles with [Magma] labels]\" target=\"_blank\" >Magma<\/a>. 
Compared to traditional Agents, Magma has <strong>multimodal capabilities spanning the digital and physical worlds<\/strong> and can automatically process different types of data such as images, video, and text. In addition, Magma has built-in psychological prediction capabilities that enhance its ability to understand the spatial and temporal dynamics of future video frames and to accurately predict the intentions and future behavior of people or objects in the video.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-29630\" title=\"5118906cj00ss9tw600l1d000v900e1p\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/02\/5118906cj00ss9tw600l1d000v900e1p.jpg\" alt=\"5118906cj00ss9tw600l1d000v900e1p\" width=\"1125\" height=\"505\" \/><\/p>\n<p>Users can use Magma to <strong>automatically place e-commerce orders and check the weather<\/strong>, to <strong>automatically operate physical robots<\/strong>, or to get help when playing real chess.<\/p>\n<p>According to the official description, Magma is able to help AI-driven assistants or robots understand their surroundings and act accordingly. 
For example, it can help domestic robots <strong>learn how to organize items they have never seen before<\/strong>, or help virtual assistants <strong>generate step-by-step user interface navigation instructions for unfamiliar tasks<\/strong>.<\/p>\n<p>Magma is a VLA (IT House Note: Vision-Language-Action) foundation model capable of adapting to new tasks in digital and physical environments. It learns effectively from massive amounts of publicly available visual and language data, fusing linguistic, spatial, and temporal intelligence to cope with complex tasks and environments in the digital and physical worlds.<\/p>\n<p>Open source link: https:\/\/microsoft.github.io\/Magma\/<\/p>","protected":false},"excerpt":{"rendered":"<p>February 26 news: early this morning, Beijing time, Microsoft open-sourced on its official website the multimodal AI Agent base model Magma. Compared with traditional Agents, Magma has multimodal capabilities spanning the digital and physical worlds and can automatically process different types of data such as images, video, and text. In addition, Magma has built-in psychological prediction capabilities that enhance its ability to understand the spatial and temporal dynamics of future video frames, allowing it to accurately predict the intentions and future behaviors of people or objects in the video. Users can use Magma to automate e-commerce orders, check the weather, operate physical robots, or get help when playing real chess. 
According to the official description, Magma can help AI-driven assistants or robots to<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1405,5813,592,219,280],"collection":[],"class_list":["post-29629","post","type-post","status-publish","format-standard","hentry","category-news","tag-agent","tag-magma","tag-592","tag-219","tag-280"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/29629","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=29629"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/29629\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=29629"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=29629"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=29629"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=29629"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}