According to a December 5 announcement on Kling's official public account, its new-generation Digital Human 2.0 went fully live that day. In three steps — upload a character image, add a voice, and describe the character's performance — users can generate a digital human that speaks and acts.

According to the announcement, the update delivers three breakthroughs over the previous version: fuller expressiveness, accurate hand and lip movements, and support for content up to 5 minutes long. Upgraded body movements, gestures, facial expressions, and camera work allow for livelier emotional communication.
According to 1AI, Kling AI launched its digital human feature in September this year. At that time, a character image plus text or audio could generate a 1080p / 48 fps digital human video of up to 1 minute. The feature is built on a multimodal understanding model deeply integrated with the video generation model, achieving precise lip synchronization and fine-grained control of emotional movements. The Transformer-based DiT architecture has particular strengths in processing time-series information and fine-grained control: it can finely decode facial features, understand audio semantics, and infer appropriate facial expressions and micro-movements from the voice content, ensuring that the generated digital human remains consistent throughout the video.
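Kling has not published its model internals, but the audio-to-expression mapping described above is commonly built on cross-attention, where each video-frame token attends over audio-feature tokens so that mouth and expression features can follow the speech signal over time. The minimal sketch below illustrates only that general mechanism; all shapes, projections, and names here are illustrative assumptions, not Kling's actual implementation.

```python
# Illustrative sketch only -- Kling's DiT internals are not public.
# Demonstrates cross-attention: each video-frame token attends over
# all audio tokens, producing audio-informed frame features (the
# general mechanism behind lip-sync in Transformer-based generators).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(frame_tokens, audio_tokens, d_k=16, seed=0):
    """Hypothetical single-head cross-attention with random weights."""
    rng = np.random.default_rng(seed)
    # In a real model these projections are learned; here they are random.
    Wq = rng.normal(size=(frame_tokens.shape[-1], d_k))
    Wk = rng.normal(size=(audio_tokens.shape[-1], d_k))
    Wv = rng.normal(size=(audio_tokens.shape[-1], d_k))
    Q, K, V = frame_tokens @ Wq, audio_tokens @ Wk, audio_tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n_frames, n_audio_steps)
    return attn @ V                          # audio-conditioned frame features

rng = np.random.default_rng(1)
frames = rng.normal(size=(48, 32))   # e.g. 48 frame tokens, dim 32
audio = rng.normal(size=(100, 24))   # e.g. 100 audio steps, dim 24
out = cross_attention(frames, audio)
print(out.shape)  # (48, 16): one audio-informed feature per frame token
```

In a full DiT, blocks like this are stacked with self-attention over the frame sequence, which is what gives the architecture its strength on temporal consistency.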