{"id":26549,"date":"2025-01-10T21:02:27","date_gmt":"2025-01-10T13:02:27","guid":{"rendered":"https:\/\/www.1ai.net\/?p=26549"},"modified":"2025-01-10T21:02:27","modified_gmt":"2025-01-10T13:02:27","slug":"%e9%93%b6%e6%b2%b3%e9%80%9a%e7%94%a8%e5%8f%91%e5%b8%83%e5%85%a8%e7%90%83%e9%a6%96%e4%b8%aa%e7%ab%af%e5%88%b0%e7%ab%af%e5%85%b7%e8%ba%ab%e6%8a%93%e5%8f%96%e5%9f%ba%e7%a1%80%e5%a4%a7%e6%a8%a1%e5%9e%8b-g","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/26549.html","title":{"rendered":"Galaxy General releases the world's first end-to-end embodied grasping foundation model, GraspVLA, pre-trained on one billion frames of \"vision-language-action\" pairs"},"content":{"rendered":"<p>January 10, 2025 - Galaxy General announced yesterday (January 9) that, together with the Beijing Academy of Artificial Intelligence (BAAI) and researchers from Peking University and the University of Hong Kong, it has released the world's first fully generalized end-to-end embodied grasping foundation model <a href=\"https:\/\/www.1ai.net\/en\/tag\/graspvla\" title=\"[See articles with [GraspVLA] labels]\" target=\"_blank\" >GraspVLA<\/a>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-26550\" title=\"2e93d121j00spviuq00cbd000u000gpp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/01\/2e93d121j00spviuq00cbd000u000gpp.jpg\" alt=\"2e93d121j00spviuq00cbd000u000gpp\" width=\"1080\" height=\"601\" \/><\/p>\n<p>Note: \"<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%85%b7%e8%ba%ab%e6%99%ba%e8%83%bd\" title=\"[Sees articles with [smart] labels]\" target=\"_blank\" >embodied intelligence<\/a>\" refers to the integration of artificial intelligence into physical entities such as robots, giving them the ability to perceive, learn, and interact dynamically with their environment.<\/p>\n<p>According to the announcement, GraspVLA's training consists of two stages: pre-training and post-training. 
The pre-training stage is based entirely on synthetic data, with the largest training set of its kind to date:\u00a0<strong>one billion frames of \"vision-language-action\" pairs<\/strong>. From this data the model masters generalized closed-loop grasping, reaching the level of a foundation model.<\/p>\n<p>After pre-training, the model can be tested zero-shot via Sim2Real (note: transfer from simulation to reality) on unseen, ever-changing real-world scenes and objects, which the company claims meets the needs of most products. For special requirements, post-training can transfer the base capabilities to specific scenarios with only a small number of learning samples, maintaining a high degree of generalization while developing specialized skills tailored to the product.<\/p>\n<p>The company also announced seven generalization \"gold standards\" that a VLA model must meet to qualify as a foundation model: generalization across lighting, background, planar position, spatial height, action strategy, dynamic interference, and object category.<\/p>","protected":false},"excerpt":{"rendered":"<p>On January 10, Galaxy General announced that on January 9 it had joined with the Beijing Academy of Artificial Intelligence (BAAI) and researchers from Peking University and the University of Hong Kong to release the world's first fully generalized end-to-end embodied grasping foundation model, GraspVLA. Note: \"embodied intelligence\" refers to the integration of artificial intelligence into physical entities such as robots, giving them the ability to perceive, learn, and interact dynamically with their environment. GraspVLA's training consists of two stages: pre-training and post-training. Pre-training is based entirely on synthetic data, with the largest data volume ever: one billion frames of \"vision-language-action\" pairs, giving the model generalized closed-loop grasping capabilities at foundation-model level. 
After pre-training, the model can go straight to Sim2Real<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[5497,5498,4783],"collection":[],"class_list":["post-26549","post","type-post","status-publish","format-standard","hentry","category-news","tag-graspvla","tag-5498","tag-4783"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/26549","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=26549"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/26549\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=26549"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=26549"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=26549"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=26549"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}