{"id":7179,"date":"2024-04-04T09:38:16","date_gmt":"2024-04-04T01:38:16","guid":{"rendered":"https:\/\/www.1ai.net\/?p=7179"},"modified":"2024-04-04T09:38:16","modified_gmt":"2024-04-04T01:38:16","slug":"%e5%85%83%e8%b1%a1%e5%8f%91%e5%b8%83-xverse-moe-a4-2b-%e5%a4%a7%e6%a8%a1%e5%9e%8b-%e5%8f%af%e5%85%8d%e8%b4%b9%e5%95%86%e7%94%a8","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/7179.html","title":{"rendered":"Yuanxiang releases XVERSE-MoE-A4.2B large model for free commercial use"},"content":{"rendered":"<p><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%85%83%e8%b1%a1\" title=\"[Sees articles with [earth] labels]\" target=\"_blank\" >Yuanxiang<\/a>Released XVERSE-MoE-A4.2B\u00a0<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%a7%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [large models]]\" target=\"_blank\" >Large Model<\/a>, using a hybrid expert model architecture with an activation parameter of 4.2B, the effect is comparable to the 13B model.<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >Open Source<\/a>,<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%85%8d%e8%b4%b9%e5%95%86%e7%94%a8\" title=\"[See articles with [free commercial] labels]\" target=\"_blank\" >Free for commercial use<\/a>, which can be used by a large number of small and medium-sized enterprises, researchers and developers to promote low-cost deployment.<\/p>\n<p>The model has<span class=\"spamTxt\">Extreme<\/span>Compression and extraordinary performance are two major advantages. Sparse activation technology is used, and the effect exceeds many top models in the industry and is close to super large models. Yuanxiang MoE technology is self-developed and innovative, and efficient fusion operators, fine-grained expert design, load balancing loss terms, etc. are developed. Finally, the architecture setting corresponding to Experiment 4 is adopted.<\/p>\n<p class=\"article-content__img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-7180\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/6384775113342083014263156.jpg\" alt=\"\" width=\"691\" height=\"515\" \/><\/p>\n<p>In terms of commercial applications, Yuanxiang Big Model has carried out in-depth cooperation with multiple Tencent products to provide innovative user experience for the fields of culture, entertainment, tourism and finance.<\/p>\n<ul>\n<li>Hugging Face: https:\/\/huggingface.co\/xverse\/XVERSE-MoE-A4.2B<\/li>\n<li>ModelScope Magic: https:\/\/modelscope.cn\/models\/xverse\/XVERSE-MoE-A4.2B<\/li>\n<li>Github: https:\/\/github.com\/xverse-ai\/XVERSE-MoE-A4.2B<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Yuanxiang has released the XVERSE-MoE-A4.2B large model, which adopts a hybrid expert model architecture with 4.2B activation parameters, comparable to a 13B model. The model is fully open source, free for commercial use, and can be used by a large number of SMEs, researchers and developers to promote low-cost deployment. The model has two major advantages: extreme compression and extraordinary performance. Using sparse activation technology, the model surpasses many top stream models in the industry and is close to mega models. Meta-Elephant MoE technology is self-researching and innovative, developing efficient fusion operators, fine-grained expert design, load-balancing loss terms, etc., and finally adopting the architecture setup corresponding to Experiment 4. 
In terms of commercial applications, Yuanxiang's large models have been integrated in depth with multiple Tencent products, delivering new user experiences in culture, entertainment, tourism, and finance.

The model is available at:

- Hugging Face: https://huggingface.co/xverse/XVERSE-MoE-A4.2B
- ModelScope: https://modelscope.cn/models/xverse/XVERSE-MoE-A4.2B
- GitHub: https://github.com/xverse-ai/XVERSE-MoE-A4.2B
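For developers who want to try the model, a minimal sketch of loading it with the Hugging Face transformers library might look like the following; the exact supported usage (dtype, custom-code requirements, hardware needs) should be confirmed against the repository README linked above.

```python
# Hypothetical sketch of loading XVERSE-MoE-A4.2B with transformers.
# trust_remote_code=True is an assumption (XVERSE repositories typically
# ship custom modeling code); verify against the official README.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "xverse/XVERSE-MoE-A4.2B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",      # use the checkpoint's native dtype
    trust_remote_code=True,
)
model.eval()

inputs = tokenizer("Introduce the XVERSE-MoE-A4.2B model:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```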