{"id":2646,"date":"2024-01-07T09:15:42","date_gmt":"2024-01-07T01:15:42","guid":{"rendered":"https:\/\/www.1ai.net\/?p=2646"},"modified":"2024-01-07T09:15:42","modified_gmt":"2024-01-07T01:15:42","slug":"%e8%bf%b7%e4%bd%a0ai%e6%a8%a1%e5%9e%8btinyllama%e5%8f%91%e5%b8%83%ef%bc%9a%e9%ab%98%e6%80%a7%e8%83%bd%e3%80%81%e4%bb%85637mb","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/2646.html","title":{"rendered":"\u200bMini AI model TinyLlama released: high performance, only 637MB"},"content":{"rendered":"<p>After some anticipation,<a href=\"https:\/\/www.1ai.net\/en\/tag\/tinyllama\" title=\"_Other Organiser\" target=\"_blank\" >TinyLlama<\/a>The project released a striking<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90%e6%a8%a1%e5%9e%8b\" title=\"[See articles with [open source model] labels]\" target=\"_blank\" >Open Source Model<\/a>The project started last September, with developers working to train a small model on trillions of tokens. After some hard work and some setbacks, the TinyLlama team has now released the model. The model has 1 billion parameters and took about three epochs, or three cycles through the training data.<\/p>\n<p class=\"article-content__img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2647\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/01\/6384013560858214868901966.png\" alt=\"\" width=\"660\" height=\"363\" \/><\/p>\n<p>The final version of TinyLlama outperforms existing open source language models of similar size, including Pythia-1.4B, OPT-1.3B, and MPT-1.3B. This marks a milestone and opens up new possibilities for the field of language models.<\/p>\n<p>Not only is this model small, but its superior performance makes it ideal for deployment on edge devices as it only takes up 637MB of storage space. 
Even more exciting, TinyLlama can also be used to assist in the speculative decoding of larger models, offering a more flexible solution for tasks that rely on large models. The team cited a tutorial by Andrej Karpathy, formerly Director of AI at Tesla and now at OpenAI, highlighting TinyLlama&#039;s potential in this area.<\/p>\n<p>The TinyLlama team designed it as a compact version of Meta&#039;s open source language model Llama 2, sharing the same architecture and tokenizer. This means it can be easily embedded into projects built on Llama, giving researchers and practitioners an &quot;attractive&quot; platform for language model research. Despite its small size, TinyLlama has already demonstrated a wide range of uses in multi-domain language model research.<\/p>\n<p>In practical applications, Awni Hannun, a machine learning research scientist at Apple, fine-tuned TinyLlama with LoRA on an 8GB Mac Mini using MLX (Apple&#039;s open source machine learning framework), demonstrating the model&#039;s flexibility and adaptability across scenarios. The team said, &quot;With its compact architecture and excellent performance, TinyLlama can enable end-user applications on mobile devices and serve as a lightweight platform for testing innovative ideas related to language models.&quot;<\/p>\n<p>With the release of TinyLlama, the team said they plan to release \u201cimproved versions\u201d aimed at expanding its performance and versatility. This opens up more possibilities for future language model research.<\/p>\n<p>TinyLlama also reflects a broader trend toward small <a href=\"https:\/\/www.1ai.net\/en\/tag\/ai%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [AI models]]\" target=\"_blank\" >AI models<\/a>: some companies have begun to focus on building relatively small but capable models to reduce hardware operating costs. 
Microsoft&#039;s Phi project is one of them: its Phi-2 model outperforms models up to 25 times its size, showing the potential of small models. Google has also announced Gemini Nano, a small version of its new flagship base model, expected to be about 3.2 billion parameters in size.<\/p>\n<p>These small models achieve outsized performance partly by training on synthetic data generated by larger models. This trend is driving innovation in the field of artificial intelligence and has enabled many small models to perform comparably to cutting-edge models like OpenAI\u2019s GPT.<\/p>\n<p>Project URL: https:\/\/github.com\/jzhang38\/TinyLlama<\/p>","protected":false},"excerpt":{"rendered":"<p>After much anticipation, the TinyLlama project has released an impressive open source model. The project launched last September, with developers committed to training a small model on trillions of tokens. After hard work and some setbacks, the TinyLlama team has now released the model. The model has 1 billion parameters and was trained for about three epochs, or three passes through the training data. The final version of TinyLlama outperforms existing open source language models of similar size, including Pythia-1.4B, OPT-1.3B and MPT-1.3B. This marks a milestone and opens up new possibilities for the field of language models. 
This model is not just small<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[167,861,862],"collection":[],"class_list":["post-2646","post","type-post","status-publish","format-standard","hentry","category-news","tag-ai","tag-tinyllama","tag-862"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/2646","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=2646"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/2646\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=2646"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=2646"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=2646"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=2646"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}