{"id":8920,"date":"2024-04-25T09:37:24","date_gmt":"2024-04-25T01:37:24","guid":{"rendered":"https:\/\/www.1ai.net\/?p=8920"},"modified":"2024-04-25T09:37:24","modified_gmt":"2024-04-25T01:37:24","slug":"%e8%8b%b9%e6%9e%9c%e5%8f%91%e5%b8%83-openelm%ef%bc%8c%e5%9f%ba%e4%ba%8e%e5%bc%80%e6%ba%90%e8%ae%ad%e7%bb%83%e5%92%8c%e6%8e%a8%e7%90%86%e6%a1%86%e6%9e%b6%e7%9a%84%e9%ab%98%e6%95%88%e8%af%ad%e8%a8%80","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/8920.html","title":{"rendered":"Apple releases OpenELM, an efficient language model based on an open source training and inference framework"},"content":{"rendered":"<p data-vmark=\"d063\">Before WWDC24,<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%8b%b9%e6%9e%9c\" title=\"[View articles tagged with [apple]]\" target=\"_blank\" >apple<\/a>An \u201cefficient language model with an open source training and inference framework\u201d was released on the Hugging Face platform, called <a href=\"https:\/\/www.1ai.net\/en\/tag\/openelm\" title=\"[See articles with [OpenEBM] label]\" target=\"_blank\" >OpenELM<\/a>.<\/p>\n<p data-vmark=\"f102\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8921\" title=\"e4b31489-0712-44b2-b3b6-61fbdb082fe4\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/e4b31489-0712-44b2-b3b6-61fbdb082fe4.jpg\" alt=\"e4b31489-0712-44b2-b3b6-61fbdb082fe4\" width=\"680\" height=\"367\" \/><\/p>\n<p data-vmark=\"42e0\">Of course, this is an open source language model, and its source code, pre-trained model weights, and training recipes are available in Apple&#039;s Github repository.<\/p>\n<p data-vmark=\"a589\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8922\" title=\"84901084-6587-4a3f-b861-65c38f4b2690\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/84901084-6587-4a3f-b861-65c38f4b2690.png\" alt=\"84901084-6587-4a3f-b861-65c38f4b2690\" width=\"902\" height=\"862\" \/><\/p>\n<p data-vmark=\"08ac\">The 
official introduction is translated as follows:<\/p>\n<blockquote>\n<p data-vmark=\"950f\">Reproducibility and transparency of large language models are critical to advancing open research, ensuring the trustworthiness of results, and investigating data and model biases and potential risks. To this end, we release OpenELM, a state-of-the-art open source language model.<\/p>\n<p data-vmark=\"8df9\">OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the Transformer model, thereby improving accuracy. For example, at roughly 1 billion parameters, OpenELM improves accuracy by 2.36% over OLMo while requiring only half as many pre-training tokens.<\/p>\n<p data-vmark=\"0934\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8923\" title=\"b7f063e4-8461-40fa-ae4f-9621f796ef08\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/b7f063e4-8461-40fa-ae4f-9621f796ef08.jpg\" alt=\"b7f063e4-8461-40fa-ae4f-9621f796ef08\" width=\"1440\" height=\"636\" \/><\/p>\n<p data-vmark=\"539f\">Unlike prior practice of releasing only model weights and inference code, with pre-training done on private datasets, our release includes the complete framework for training and evaluating language models on public datasets, including training logs, multiple checkpoints, and pre-training configurations.<\/p>\n<p data-vmark=\"e3d9\">We also release code to convert models to the MLX library for inference and fine-tuning on Apple devices. This comprehensive release is intended to empower and strengthen the open research community and pave the way for future open research work.<\/p>\n<\/blockquote>","protected":false},"excerpt":{"rendered":"<p>Prior to WWDC24, Apple released an \"efficient language model with an open source training and inference framework\" on the Hugging Face platform called OpenELM. 
Notably, this is an open source language model: the source code, as well as pre-trained model weights and training recipes, are available in Apple's GitHub repository. The official synopsis, translated, reads as follows: Reproducibility and transparency of large language models are critical to advancing open research, ensuring the credibility of results, and investigating data and model biases and potential risks. To this end, we release OpenELM, a state-of-the-art open source language model. OpenELM uses a layer-wise scaling strategy that efficiently allocates parameters within each layer of the Transformer model.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[2368,345],"collection":[],"class_list":["post-8920","post","type-post","status-publish","format-standard","hentry","category-news","tag-openelm","tag-345"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/8920","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=8920"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/8920\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=8920"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=8920"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/
wp\/v2\/tags?post=8920"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=8920"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}