{"id":4025,"date":"2024-02-18T07:52:04","date_gmt":"2024-02-17T23:52:04","guid":{"rendered":"https:\/\/www.1ai.net\/?p=4025"},"modified":"2024-02-18T07:52:04","modified_gmt":"2024-02-17T23:52:04","slug":"meta-%e6%8e%a8%e5%87%ba-v-jepa-%e6%a8%a1%e5%9e%8b%ef%bc%8c%e5%88%a9%e7%94%a8-ai-%e9%ab%98%e6%95%88%e8%a1%a5%e5%85%85%e8%a7%86%e9%a2%91%e5%8f%97%e9%81%ae%e8%94%bd%e9%83%a8%e5%88%86","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/4025.html","title":{"rendered":"Meta launches V-JEPA model, using AI to efficiently supplement the obscured parts of videos"},"content":{"rendered":"<p data-vmark=\"71cc\"><a href=\"https:\/\/www.1ai.net\/en\/tag\/meta\" title=\"View articles tagged with Meta\" target=\"_blank\" >Meta<\/a> Chief AI Scientist Yann LeCun introduced the JEPA (Joint Embedding Predictive Architectures) model architecture in 2022.<span class=\"accentTextColor\">The following year, an \u201cI-JEPA\u201d image prediction model was developed based on the JEPA architecture, and now a new <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%a7%86%e9%a2%91%e9%a2%84%e6%b5%8b%e6%a8%a1%e5%9e%8b\" title=\"View articles tagged with Video Prediction Model\" target=\"_blank\" >video prediction model<\/a> called \u201c<a href=\"https:\/\/www.1ai.net\/en\/tag\/v-jepa\" title=\"View articles tagged with V-JEPA\" target=\"_blank\" >V-JEPA<\/a>\u201d has been launched<\/span>.<\/p>\n<p data-vmark=\"f762\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-4026\" title=\"10b927c8-224d-4ddb-8a34-5c09f045d968\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/02\/10b927c8-224d-4ddb-8a34-5c09f045d968.png\" alt=\"10b927c8-224d-4ddb-8a34-5c09f045d968\" width=\"1440\" height=\"714\" \/><\/p>\n<p data-vmark=\"409d\">It is reported that the JEPA architecture and the I-JEPA\/V-JEPA models focus on &quot;predictive ability&quot;, claiming that they can use abstraction to efficiently predict and generate the obscured parts of images\/videos in a 
&quot;human-understandable&quot; way.<\/p>\n<p data-vmark=\"e3bf\">IT Home notes that the researchers trained the I-JEPA\/V-JEPA models on a series of specially masked videos. The model is required to fill in the missing content using an &quot;abstract method&quot;, learning the scene as it fills in the gaps and going on to predict future events or actions, thereby achieving a deeper understanding of the world.<\/p>\n<p data-vmark=\"2074\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-4027\" title=\"68f01275-3c70-4020-afb5-4269004a5d3b\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/02\/68f01275-3c70-4020-afb5-4269004a5d3b.png\" alt=\"68f01275-3c70-4020-afb5-4269004a5d3b\" width=\"1440\" height=\"721\" \/><\/p>\n<p data-vmark=\"1c8f\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-4029\" title=\"b8258a82-17a8-44a7-966d-ad63a738cfb5\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/02\/b8258a82-17a8-44a7-966d-ad63a738cfb5.png\" alt=\"b8258a82-17a8-44a7-966d-ad63a738cfb5\" width=\"1600\" height=\"849\" \/><\/p>\n<p>\u25b2 Image source: Meta official press release (the same below)<\/p>\n<p data-vmark=\"619d\">The researchers said this training method allows the model to focus on the high-level concepts of the video, rather than &quot;getting bogged down in details that are unimportant for downstream tasks.&quot;<span class=\"accentTextColor\">The researchers gave an example: &quot;When humans watch a video containing trees, they don&#039;t particularly care about the movement of leaves.&quot; A model built around such abstract concepts is therefore more efficient than competing models in the industry.<\/span><\/p>\n<p data-vmark=\"e19c\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-4028\" title=\"3a8d924f-49ca-4679-9e1a-5791fe158c15\" 
src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/02\/3a8d924f-49ca-4679-9e1a-5791fe158c15.png\" alt=\"3a8d924f-49ca-4679-9e1a-5791fe158c15\" width=\"1312\" height=\"706\" \/><\/p>\n<p data-vmark=\"98b0\">The researchers also mentioned that V-JEPA uses a design called &quot;Frozen Evaluations&quot;, which means that &quot;the core part of the model will not change after pre-training&quot;, so only a small specialized layer needs to be added to adapt it to new tasks, making it more versatile.<\/p>","protected":false},"excerpt":{"rendered":"<p>Meta's chief AI scientist, Yann LeCun, introduced the JEPA (Joint Embedding Predictive Architectures) model architecture in 2022. The following year, an \"I-JEPA\" image prediction model was developed based on the JEPA architecture, and now a video prediction model called \"V-JEPA\" has been launched. According to the introduction, the JEPA architecture and the I-JEPA \/ V-JEPA models focus on \"predictive power\", claiming they can use abstraction to efficiently predict and generate the occluded parts of images \/ videos in a \"human-understandable\" way. 
IT Home notes that the researchers trained the models on a series of specially masked videos.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[297,1262,1261],"collection":[],"class_list":["post-4025","post","type-post","status-publish","format-standard","hentry","category-news","tag-meta","tag-v-jepa","tag-1261"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/4025","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=4025"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/4025\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=4025"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=4025"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=4025"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=4025"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}