{"id":41156,"date":"2025-08-12T10:48:13","date_gmt":"2025-08-12T02:48:13","guid":{"rendered":"https:\/\/www.1ai.net\/?p=41156"},"modified":"2025-08-12T10:48:13","modified_gmt":"2025-08-12T02:48:13","slug":"%e6%98%86%e4%bb%91%e4%b8%87%e7%bb%b4%e4%b8%8a%e7%ba%bf%e5%85%a8%e6%96%b0%e3%80%8c%e6%95%b0%e5%ad%97%e4%ba%ba%e3%80%8d%e6%a8%a1%e5%9e%8b%ef%bc%8c%e6%ad%a3%e5%bc%8f%e5%8f%91%e5%b8%83skyreels-a3%e6%a8%a1","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/41156.html","title":{"rendered":"Kunlun Wanwei Launches New \"Digital Human\" Model, Officially Releases SkyReels-A3"},"content":{"rendered":"<p>On August 11, <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%98%86%e4%bb%91%e4%b8%87%e7%bb%b4\" title=\"[View articles tagged Kunlun Wanwei]\" target=\"_blank\" >Kunlun Wanwei<\/a> officially released the SkyReels-A3 model.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-41157\" title=\"b92960fej00t0v12d00gkd000u000ckm\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/08\/b92960fej00t0v12d00gkd000u000ckm.jpg\" alt=\"b92960fej00t0v12d00gkd000u000ckm\" width=\"1080\" height=\"452\" \/><\/p>\n<p>According to the official announcement, the SkyReels-A3 model is built on \"a DiT (Diffusion Transformer) video diffusion model + a frame-interpolation model for video extension + reinforcement-learning-based motion optimization + controllable camera movement\", enabling fully audio-driven <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%95%b0%e5%ad%97%e4%ba%ba\" title=\"[View articles tagged digital human]\" target=\"_blank\" >digital human<\/a> creation of any duration.<\/p>\n<p>The SkyReels-A3 model is said to bring users new experiences in the following four directions:<\/p>\n<p>Text prompts can drive on-screen changes;<\/p>\n<p>More natural motion interaction, including interacting with products and hand gestures while speaking;<\/p>\n<p>More advanced use and control of camera movement, giving artistic scenes such as music videos (MVs) a higher level of artistic polish;<\/p>\n<p>It can generate minute-level single-scene videos with output of up to 60 seconds, while multi-scene videos support unlimited duration.<\/p>\n<p>In quantitative evaluations across different audio-driven scenarios, SkyReels-A3 was compared against methods such as the advanced open-source model OmniAvatar and the closed-source model OmniHuman. <strong>The results show that SkyReels-A3 outperforms these methods on most metrics, especially in lip synchronization (Sync-C and Sync-D).<\/strong><\/p>\n<p>In addition, Kunlun Wanwei also conducted human evaluations to more fully assess the quality of the model's output. The results show that SkyReels-A3 achieves the best results in face and subject stability and in naturalness of motion, is comparable to the best methods in lip synchronization, and also holds clear advantages in audio-video synchronization and video quality.<\/p>\n<p><strong>\ud83d\udd17 SkyReels-A3 project homepage:<\/strong>https:\/\/skyworkai.github.io\/skyreels-a3.github.io\/<\/p>\n<p><strong>\ud83d\udd17 SkyReels official website:<\/strong>https:\/\/www.skyreels.ai\/home<\/p>\n<p><strong>\ud83d\udd17 SkyReels series open-source models:<\/strong>https:\/\/huggingface.co\/Skywork<\/p>","protected":false},"excerpt":{"rendered":"<p>On August 11, Kunlun Wanwei officially released the SkyReels-A3 model. According to the official announcement, the SkyReels-A3 model is built on \"a DiT (Diffusion Transformer) video diffusion model + a frame-interpolation model for video extension + reinforcement-learning-based motion optimization + controllable camera movement\", enabling fully audio-driven digital human creation of any duration. 
It is reported that the SkyReels-A3 model brings users new experiences in four directions: text prompts can drive on-screen changes; more natural motion interaction, including interacting with products and hand gestures while speaking; more advanced use and control of camera movement, giving artistic scenes such as music videos (MVs) a higher level of artistic polish; and minute-level single-scene videos of up to 60 seconds, with unlimited duration for multi-scene output.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1252,1050],"collection":[],"class_list":["post-41156","post","type-post","status-publish","format-standard","hentry","category-news","tag-1252","tag-1050"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/41156","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=41156"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/41156\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=41156"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=41156"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=41156"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=41156"}],"curies":[{"name":"wp","href":"https
:\/\/api.w.org\/{rel}","templated":true}]}}