{"id":12546,"date":"2024-06-07T09:49:41","date_gmt":"2024-06-07T01:49:41","guid":{"rendered":"https:\/\/www.1ai.net\/?p=12546"},"modified":"2024-06-07T09:49:41","modified_gmt":"2024-06-07T01:49:41","slug":"%e9%98%bf%e9%87%8c%e4%ba%91%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae%e7%b3%bb%e5%88%97-ai-%e5%bc%80%e6%ba%90%e6%a8%a1%e5%9e%8b%e5%8d%87%e8%87%b3-qwen2%ef%bc%9a5-%e4%b8%aa%e5%b0%ba%e5%af%b8%e3%80%81","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/12546.html","title":{"rendered":"Alibaba Cloud Tongyi Qianwen series AI open source model upgraded to Qwen2: 5 sizes, context length supports up to 128K tokens"},"content":{"rendered":"<p data-vmark=\"64cf\"><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e9%80%9a%e4%b9%89%e5%8d%83%e9%97%ae\" title=\"[View articles tagged with [Tongyi Thousand Questions]]\" target=\"_blank\" >Thousand Questions on Tongyi<\/a>(Qwen) announced today that after months of hard work, the Qwen series models have been significantly upgraded from Qwen1.5 to Qwen2.<strong>And it has been synchronized on Hugging Face and ModelScope<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >Open Source<\/a>.<\/strong><\/p>\n<p data-vmark=\"9406\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12547\" title=\"a43d45ed-dcc4-42ff-8d61-7d45b232166c\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/06\/a43d45ed-dcc4-42ff-8d61-7d45b232166c.jpg\" alt=\"a43d45ed-dcc4-42ff-8d61-7d45b232166c\" width=\"1440\" height=\"577\" \/><\/p>\n<p data-vmark=\"b794\">Attached is Qwen 2.0. The main contents are as follows:<\/p>\n<ul class=\"list-paddingleft-2\">\n<li>\n<p data-vmark=\"f579\">Pre-trained and fine-tuned models in 5 sizes, including Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B<\/p>\n<\/li>\n<li>\n<p data-vmark=\"eb41\">In addition to Chinese and English, high-quality data related to 27 languages has been added to the training data;<\/p>\n<\/li>\n<li>\n<p data-vmark=\"c0fc\">Leading performance on multiple benchmarks;<\/p>\n<\/li>\n<li>\n<p data-vmark=\"7114\">Significant improvement in coding and math skills;<\/p>\n<\/li>\n<li>\n<p data-vmark=\"0110\">Increased the context length support, up to 128K tokens (Qwen2-72B-Instruct).<\/p>\n<\/li>\n<\/ul>\n<h3 data-vmark=\"e845\">Basic information of the model<\/h3>\n<p data-vmark=\"bfc2\">The Qwen2 series includes pre-trained and instruction fine-tuned models of 5 sizes, including Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B.<\/p>\n<table>\n<thead>\n<tr class=\"firstRow\">\n<th>Model<\/th>\n<th>Qwen2-0.5B<\/th>\n<th>Qwen2-1.5B<\/th>\n<th>Qwen2-7B<\/th>\n<th>Qwen2-57B-A14B<\/th>\n<th>Qwen2-72B<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Parameter quantity<\/td>\n<td>0.49B<\/td>\n<td>1.54B<\/td>\n<td>7.07B<\/td>\n<td>57.41B<\/td>\n<td>72.71B<\/td>\n<\/tr>\n<tr>\n<td>Non-Embedding Parameters<\/td>\n<td>0.35B<\/td>\n<td>1.31B<\/td>\n<td>5.98B<\/td>\n<td>56.32B<\/td>\n<td>70.21B<\/td>\n<\/tr>\n<tr>\n<td>GQA<\/td>\n<td>True<\/td>\n<td>True<\/td>\n<td>True<\/td>\n<td>True<\/td>\n<td>True<\/td>\n<\/tr>\n<tr>\n<td>Tie Embedding<\/td>\n<td>True<\/td>\n<td>True<\/td>\n<td>False<\/td>\n<td>False<\/td>\n<td>False<\/td>\n<\/tr>\n<tr>\n<td>Context length<\/td>\n<td>32K<\/td>\n<td>32K<\/td>\n<td>128K<\/td>\n<td>64K<\/td>\n<td>128K<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p data-vmark=\"88fb\">In the Qwen1.5 series, only 32B and 110B models used GQA. 
### Model evaluation

Compared with Qwen1.5, Qwen2 achieves a significant performance improvement on large-scale models. We conducted a comprehensive evaluation of Qwen2-72B.

In evaluations of pre-trained language models, Qwen2-72B significantly surpasses the current leading open-source models, such as Llama-3-70B and Qwen1.5's largest model, Qwen1.5-110B, across many capabilities, including natural language understanding, knowledge, code, mathematics, and multilingual ability.
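Since the checkpoints are published on Hugging Face, the quickest way to try one is through the transformers library. A minimal sketch, assuming the "Qwen/Qwen2-7B-Instruct" checkpoint ID and a transformers release with Qwen2 support:

```python
# Minimal sketch of loading a Qwen2 instruct model from Hugging Face.
# Assumes the "Qwen/Qwen2-7B-Instruct" Hub ID; device_map="auto"
# requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Format a chat-style prompt with the model's own chat template.
messages = [{"role": "user", "content": "Give me a short introduction to Qwen2."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```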