{"id":16332,"date":"2024-07-24T08:52:03","date_gmt":"2024-07-24T00:52:03","guid":{"rendered":"https:\/\/www.1ai.net\/?p=16332"},"modified":"2024-07-24T08:52:03","modified_gmt":"2024-07-24T00:52:03","slug":"stability-ai%e5%bc%80%e6%ba%90%e9%9f%b3%e9%a2%91%e7%94%9f%e6%88%90%e6%a8%a1%e5%9e%8bstable-audio-open%ef%bc%8c%e5%8f%af%e7%94%9f%e6%88%9047%e7%a7%92%e7%9a%84%e7%ab%8b%e4%bd%93%e5%a3%b0%e9%9f%b3","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/16332.html","title":{"rendered":"Stability AI open-sources Stable Audio Open, an audio generation model that can generate up to 47 seconds of stereo audio"},"content":{"rendered":"<p data-pm-slice=\"0 0 []\">Recently, the <a href=\"https:\/\/www.1ai.net\/en\/tag\/stability-ai\" title=\"_Other Organiser\" target=\"_blank\" >Stability AI<\/a> team launched a new <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90%e9%9f%b3%e9%a2%91%e7%94%9f%e6%88%90%e6%a8%a1%e5%9e%8b\" title=\"[Sees articles with [Open Audio Generation Model] labels]\" target=\"_blank\" >open source audio generation model<\/a> named <a href=\"https:\/\/www.1ai.net\/en\/tag\/stable-audio-open\" title=\"[See article with [Stable Audio Open] label]\" target=\"_blank\" >Stable Audio Open<\/a>. What\u2019s special about this model is that it can generate up to 47 seconds of stereo audio from text prompts, with a sampling rate of up to 44.1kHz.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-16333\" title=\"get-783\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/07\/get-783.jpg\" alt=\"get-783\" width=\"1070\" height=\"836\" \/><\/div>\n<p data-track=\"48\">Unlike many currently popular <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%9f%b3%e9%a2%91%e7%94%9f%e6%88%90%e6%a8%a1%e5%9e%8b\" title=\"[Sees articles with tags]\" target=\"_blank\" >audio generation models<\/a>, Stable Audio Open has open weights, which means that anyone can view, modify and extend the 
model. This open design not only advances scientific research but also gives developers more to build on. More importantly, the model is trained only on audio files licensed under Creative Commons, which helps ensure the data is used legally and reduces potential copyright issues, reflecting close attention to the ethical use of data.<\/p>\n<p data-track=\"49\">In terms of technical architecture, Stable Audio Open is designed for high-fidelity text-to-audio generation. It produces high-quality stereo audio, giving users clear and realistic sound. During training, the model is exposed to a wide variety of audio samples, which helps it learn a richer range of soundscapes and makes the generated audio more realistic and diverse.<\/p>\n<p data-track=\"50\">In addition, to check that the new model&#039;s performance is comparable to the industry&#039;s top models, the development team conducted a comprehensive performance evaluation. On the key evaluation metric FDopenl3, the researchers found that the model performed well at generating high-quality audio, comparable to other leading models in the industry. This comparison supports the quality and practicality of Stable Audio Open.<\/p>\n<p data-track=\"51\">The launch of Stable Audio Open emphasizes openness and high-quality audio synthesis, and it gives researchers, artists and developers an important tool.<\/p>","protected":false},"excerpt":{"rendered":"<p>Recently, the Stability AI team introduced a new open source audio generation model called Stable Audio Open, which is unique in its ability to generate up to 47 seconds of stereo audio from text prompts, at a sampling rate of up to 44.1kHz. 
Unlike many of the popular audio generation models available today, Stable Audio Open's weights are open, meaning that anyone can view, modify, and extend the model. This design philosophy not only advances scientific research, but also opens up more possibilities for developers. What's more, the model is trained using only Creative Commons-licensed audio files, which not only ensures that the<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[388,718,2951,219,3682,3681],"collection":[],"class_list":["post-16332","post","type-post","status-publish","format-standard","hentry","category-news","tag-stability","tag-stability-ai","tag-stable-audio-open","tag-219","tag-3682","tag-3681"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/16332","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=16332"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/16332\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=16332"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=16332"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=16332"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=16332"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}