{"id":6685,"date":"2024-03-30T08:46:20","date_gmt":"2024-03-30T00:46:20","guid":{"rendered":"https:\/\/www.1ai.net\/?p=6685"},"modified":"2024-03-30T08:46:20","modified_gmt":"2024-03-30T00:46:20","slug":"%e9%98%bf%e9%87%8c%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae%e5%bc%80%e6%ba%90qwen1-5-moe-a2-7b%e6%a8%a1%e5%9e%8b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/6685.html","title":{"rendered":"Ali Tongyi Qianwen open source Qwen1.5-MoE-A2.7B model"},"content":{"rendered":"<p><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae\" title=\"[View articles tagged with [Tongyi Thousand Questions]]\" target=\"_blank\" >Thousand Questions on Tongyi<\/a>The team has launched the Qwen series of<span class=\"spamTxt\">The first<\/span>\u00a0MoE model, named Qwen1.5-MoE-A2.7B. This model has only 2.7 billion activation parameters but performs as well as current<span class=\"spamTxt\">First<\/span>The Qwen1.5-MoE-A2.7B model has only 2 billion non-embedded parameters compared to the Qwen1.5-7B model. Compared to Qwen1.5-7B, Qwen1.5-MoE-A2.7B has only 2 billion non-embedded parameters, which is about one-third of the size of the original model. In addition, Qwen1.5-MoE-A2.7B reduces the training cost by 75% and improves the inference speed by a factor of 1.74 compared to Qwen1.5-7B.<\/p>\n<p class=\"article-content__img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-6686\" title=\"202310311416147098_0\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/03\/202310311416147098_0.jpg\" alt=\"202310311416147098_0\" width=\"776\" height=\"501\" \/><\/p>\n<p>The Qwen1.5-MoE model utilizes a specially designed MoE architecture. Unlike traditional MoE methods, Qwen1.5-MoE uses 64 finegrained experts and introduces new routing mechanisms DeepSeek-MoE and DBRX.This finegrained experts design aims to generate more experts without increasing the number of parameters. The Qwen1.5-MoE model performs well in terms of training cost and inference efficiency, with a performance close to that of the<span class=\"spamTxt\">First<\/span>The 7B model.<\/p>\n<p>The Qwen1.5-MoE-A2.7B model has 1.43 billion activation parameters and 200 million non-embedded parameters, and reduces the training cost by 75%.In experiments, the inference speed of Qwen1.5-MoE-A2.7B is improved by about 1.74 times when tested with a single NVIDIA A100-80G GPU.The Qwen1.5-MoE model has been open-sourced in the ModelScope community and can be downloaded and used directly.<\/p>\n<p>In addition to performance and efficiency, the Qwen1.5-MoE model will continue to be updated with support for third-party frameworks, including llama.cpp, MLX, and others.<\/p>\n<p>Overall, the Qwen1.5-MoE model achieves significant advantages in terms of performance, efficiency, and inference speed, and is the inference training<span class=\"spamTxt\">optimal<\/span>One of the practices.<\/p>\n<p><strong>Qwen1.5-MoE<\/strong>Link to experience.<\/p>\n<p>https:\/\/modelscope.cn\/studios\/qwen\/qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4-demo<\/p>","protected":false},"excerpt":{"rendered":"<p>The Tongyi Qianqian team has released the first MoE model in the Qwen family, named Qwen1.5-MoE-A2.7B, which has only 2.7 billion activation parameters, but with performance comparable to the current state-of-the-art 7 billion parameter model. Compared to Qwen1.5-7B, Qwen1.5-MoE-A2.7B has only 2 billion non-embedded parameters, which is about one-third the size of the original model. 
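Since the checkpoints are published in the ModelScope community, a minimal loading sketch could look like the following. The model ID "qwen/Qwen1.5-MoE-A2.7B-Chat", the need for a recent transformers build with Qwen1.5-MoE support, and the chat-template call are assumptions based on the usual Qwen release pattern, so check the official model card before relying on them.

```python
# Sketch: download the open-sourced chat model from ModelScope and run one prompt.
# The model ID and version requirements are assumptions; see the model card for details.
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = snapshot_download("qwen/Qwen1.5-MoE-A2.7B-Chat")   # local cache path
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts models in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The GPTQ-Int4 demo linked above serves the same chat model in quantized form, so the hosted demo is an alternative to running the checkpoint locally.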