{"id":17899,"date":"2024-08-13T19:13:00","date_gmt":"2024-08-13T11:13:00","guid":{"rendered":"https:\/\/www.1ai.net\/?p=17899"},"modified":"2024-08-13T19:13:00","modified_gmt":"2024-08-13T11:13:00","slug":"%e9%98%bf%e9%87%8c%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae%e5%bc%80%e6%ba%90-qwen2-audio-7b-%e8%af%ad%e9%9f%b3%e4%ba%a4%e4%ba%92%e5%a4%a7%e6%a8%a1%e5%9e%8b%ef%bc%9a%e8%87%aa%e7%94%b1%e4%ba%92%e5%8a%a8","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/17899.html","title":{"rendered":"Alibaba's Tongyi Qianwen open-sources Qwen2-Audio 7B voice interaction model: free-form interaction without text input"},"content":{"rendered":"<p><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%98%bf%e9%87%8c\" title=\"[View articles tagged with [Ali]]\" target=\"_blank\" >Alibaba<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae\" title=\"[View articles tagged with [Tongyi Thousand Questions]]\" target=\"_blank\" >Tongyi Qianwen<\/a> has <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >open-sourced<\/a>\u00a0two models in the Qwen2-Audio series: Qwen2-Audio-7B and Qwen2-Audio-7B-Instruct.<\/p>\n<p>As a large-scale audio language model, Qwen2-Audio accepts a variety of audio signal inputs and can either analyze the audio or respond directly in text to spoken instructions. 
It supports two audio interaction modes:<\/p>\n<ul>\n<li>Voice chat: users can interact freely with Qwen2-Audio by voice, <strong>with no text input required<\/strong><\/li>\n<li>Audio analysis: users can provide audio together with text instructions during the interaction to have the audio analyzed<\/li>\n<\/ul>\n<p>According to the official evaluation on a series of benchmark datasets, Qwen2-Audio outperformed the previous best models.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-17900\" title=\"0e995a13j00si5lsl00bdd0014000wim\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/0e995a13j00si5lsl00bdd0014000wim.jpg\" alt=\"0e995a13j00si5lsl00bdd0014000wim\" width=\"1440\" height=\"1170\" \/><\/p>\n<p>The relevant links are as follows:<\/p>\n<ul>\n<li><strong>Demo:<\/strong> https:\/\/huggingface.co\/spaces\/Qwen\/Qwen2-Audio-Instruct-Demo<\/li>\n<li><strong>Paper:<\/strong> https:\/\/arxiv.org\/abs\/2407.10759<\/li>\n<li><strong>Evaluation benchmark:<\/strong> https:\/\/github.com\/OFA-Sys\/AIR-Bench<\/li>\n<li><strong>Open-source code:<\/strong> https:\/\/github.com\/QwenLM\/Qwen2-Audio<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Alibaba's Tongyi Qianwen has open-sourced two models in the Qwen2-Audio series: Qwen2-Audio-7B and Qwen2-Audio-7B-Instruct. 
As a large-scale audio language model, Qwen2-Audio accepts a variety of audio signal inputs and can either analyze the audio or respond directly in text to spoken instructions. It supports two audio interaction modes. Voice chat: users can interact freely with Qwen2-Audio by voice, with no text input required. Audio analysis: users can provide audio together with text instructions during the interaction to have the audio analyzed. According to the official evaluation on a series of benchmark datasets, Qwen2-Audio outperformed the previous best models.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[219,3995,331,1759],"collection":[],"class_list":["post-17899","post","type-post","status-publish","format-standard","hentry","category-news","tag-219","tag-3995","tag-331","tag-1759"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17899","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=17899"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17899\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=17899"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=17899"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=17899"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=17899"}],"curies":[{"name":"wp","href":"htt
ps:\/\/api.w.org\/{rel}","templated":true}]}}