{"id":18778,"date":"2024-08-28T09:46:52","date_gmt":"2024-08-28T01:46:52","guid":{"rendered":"https:\/\/www.1ai.net\/?p=18778"},"modified":"2024-08-28T09:46:52","modified_gmt":"2024-08-28T01:46:52","slug":"%e6%99%ba%e8%b0%b1ai%ef%bc%9aglm-4-flash%e5%a4%a7%e6%a8%a1%e5%9e%8bapi%e6%8e%a5%e5%8f%a3%e5%85%8d%e8%b4%b9%e5%90%91%e5%85%ac%e4%bc%97%e5%bc%80%e6%94%be","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/18778.html","title":{"rendered":"Zhipu AI: GLM-4-Flash large model API interface is open to the public for free"},"content":{"rendered":"<p>Beijing Zhipu Huazhang Technology Co., Ltd. recently announced that it will open the <a href=\"https:\/\/www.1ai.net\/en\/tag\/api\" title=\"API\" target=\"_blank\" >API<\/a> interface of its GLM-4-Flash large language model to the public free of charge, in order to promote the popularization and application of large-model technology.<\/p>\n<p>The GLM-4-Flash model shows significant advantages in both speed and performance, especially in inference speed. By adopting optimization measures such as adaptive weight quantization, parallel processing, batch processing strategies, and speculative sampling, it achieves a stable speed of up to 72.14 tokens\/s, which is outstanding among comparable models.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-18779\" title=\"613ed12cj00siwnko0009d000fc00a2m\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/613ed12cj00siwnko0009d000fc00a2m.jpg\" alt=\"613ed12cj00siwnko0009d000fc00a2m\" width=\"552\" height=\"362\" \/><\/p>\n<p>In terms of performance optimization, the GLM-4-Flash model uses 10TB of high-quality multilingual data in the pre-training stage, which enables it not only to handle tasks such as multi-turn dialogue, web search, and tool calling, but also to support long-text reasoning, with a maximum context length of up to 128K. 
In addition, the model supports 26 languages, including Chinese, English, Japanese, Korean, and German, demonstrating strong multilingual capabilities.<\/p>\n<p>To meet the specific needs of different users, <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%99%ba%e8%b0%b1ai\" title=\"Zhipu AI\" target=\"_blank\" >Zhipu AI<\/a> also provides a model fine-tuning function to help users better adapt the GLM-4-Flash model to various application scenarios.<\/p>\n<p>Interface address: https:\/\/open.bigmodel.cn\/dev\/api#glm-4<\/p>","protected":false},"excerpt":{"rendered":"<p>Beijing Zhipu Huazhang Technology Co., Ltd. recently announced that it will open the API interface of its GLM-4-Flash large language model to the public free of charge, in order to promote the popularization and application of large-model technology. The GLM-4-Flash model shows significant advantages in both speed and performance. In particular, by adopting optimization measures such as adaptive weight quantization, parallel processing, batch processing strategies, and speculative sampling, it achieves a stable inference speed of up to 72.14 tokens\/s, which is outstanding among comparable models. 
In terms of performance optimization, the GLM-4-Flash model uses 10TB of high-quality multilingual data in the pre-training phase, which enables it not only to handle tasks such as multi-turn conversations, web searches, and tool invocations, but also to support long-text reasoning, with a maximum context length of up to 128K.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1033,379],"collection":[],"class_list":["post-18778","post","type-post","status-publish","format-standard","hentry","category-news","tag-api","tag-ai"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/18778","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=18778"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/18778\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=18778"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=18778"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=18778"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=18778"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}