{"id":12421,"date":"2024-06-06T09:39:16","date_gmt":"2024-06-06T01:39:16","guid":{"rendered":"https:\/\/www.1ai.net\/?p=12421"},"modified":"2024-06-06T09:39:16","modified_gmt":"2024-06-06T01:39:16","slug":"%e6%99%ba%e8%b0%b1ai%e5%ae%a3%e5%b8%83%e5%bc%80%e6%ba%90-glm-%e7%ac%ac%e5%9b%9b%e4%bb%a3%e6%a8%a1%e5%9e%8b-glm-4-9b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/12421.html","title":{"rendered":"Zhipu AI announces the open source of GLM fourth-generation model GLM-4-9B"},"content":{"rendered":"<p>GLM technical team on March 14, 2023<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >Open Source<\/a>ChatGLM-6B was released, which attracted wide attention and recognition. ChatGLM3-6B was later released, and developers are looking forward to the open source of the fourth generation of GLM models. After nearly half a year of exploration, the GLM technical team launched the fourth generation of GLM series open source models: GLM-4-9B.<\/p>\n<p>In terms of pre-training, GLM-4-9B introduced a large language model for data screening and obtained 10T of high-quality multilingual data, which is more than 3 times the data volume of ChatGLM3-6B. At the same time, FP8 technology was used for efficient pre-training, which increased the training efficiency by 3.5 times. In the case of limited video memory, the performance limit was explored and it was found that the performance of the 6B model was limited. 
Considering the video memory available to most users, the model size was increased to 9B and the pre-training compute was increased 5-fold.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-12422\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/06\/6385319293807863504149550.jpg\" alt=\"\" width=\"1000\" height=\"510\" \/><\/p>\n<p>The GLM-4-9B model offers stronger reasoning performance, longer-context processing, and multilingual, multimodal, and All Tools capabilities. The release includes the base version GLM-4-9B (8K), the chat version GLM-4-9B-Chat (128K), the extra-long-context version GLM-4-9B-Chat-1M (1M), and the multimodal version GLM-4V-9B-Chat (8K).<\/p>\n<p>GLM-4-9B capabilities include:<\/p>\n<p>1. Basic capabilities: The comprehensive performance of the model in Chinese and English is 40% higher than that of ChatGLM3-6B;<\/p>\n<p>2. Long-text capability: The context is extended from 128K to 1M tokens, roughly the length of 2 copies of Dream of the Red Chamber or 125 academic papers;<\/p>\n<p>3. Multilingual capability: Supports 26 languages, with the vocabulary expanded to 150k tokens and encoding efficiency improved by 30%;<\/p>\n<p>4. Function-calling capability: Excellent performance on the Berkeley Function-Calling Leaderboard;<\/p>\n<p>5. All Tools capability: The model can call external tools to complete tasks;<\/p>\n<p>6. Multimodal capability: A multimodal model joins the series for the first time, with remarkable performance.<\/p>\n<p>Code:<\/p>\n<p>GitHub: https:\/\/github.com\/THUDM\/GLM-4<\/p>\n<p>Model:<\/p>\n<p>Hugging Face: https:\/\/huggingface.co\/collections\/THUDM\/glm-4-665fcf188c414b03c2f7e3b7<\/p>\n<p>ModelScope Community: https:\/\/modelscope.cn\/organization\/ZhipuAI<\/p>","protected":false},"excerpt":{"rendered":"<p>The GLM technical team's March 14, 2023 open-source release of ChatGLM-6B attracted widespread attention and recognition.
ChatGLM3-6B was then open-sourced, and developers looked forward to an open-source fourth-generation GLM model. After nearly six months of exploration, the GLM technical team launched the fourth generation of the GLM series open-source models: GLM-4-9B. For pre-training, GLM-4-9B introduced a large language model for data screening and obtained 10T of high-quality multilingual data, more than three times that of ChatGLM3-6B. At the same time, FP8 technology was used for efficient pre-training, yielding a 3.5-fold increase in training efficiency. Under the constraint of limited video memory, the performance ceiling was explored, and 6B models were found<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[219,379],"collection":[],"class_list":["post-12421","post","type-post","status-publish","format-standard","hentry","category-news","tag-219","tag-ai"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/12421","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=12421"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/12421\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=12421"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=12421"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=12421"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/
v2\/collection?post=12421"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}