Alibaba Tongyi Open-Sources "Spatial Audio Generation" Model

Yesterday, Alibaba's Tongyi large model team announced OmniAudio, a "spatial audio generation" model. According to the Tongyi team, OmniAudio can generate spatial audio directly from 360° video. Its training is divided into two stages: "self-supervised coarse-to-fine flow matching pre-training" and "supervised fine-tuning based on dual-branch video representations". OmniAudio is now available on GitHub, where the team has open-sourced the code and data repository and published the related technical paper. Project homepage: https://omniaudio-360v2sa.github.io/
