{"id":10153,"date":"2024-05-11T12:04:57","date_gmt":"2024-05-11T04:04:57","guid":{"rendered":"https:\/\/www.1ai.net\/?p=10153"},"modified":"2024-05-11T12:04:57","modified_gmt":"2024-05-11T04:04:57","slug":"%e5%9c%a8%e5%af%8c%e5%b2%b3%e8%b6%85%e7%ae%97%e4%b8%8a%e8%ae%ad%e7%bb%83%e5%a4%a7%e6%a8%a1%e5%9e%8b%ef%bc%8c%e6%97%a5%e6%9c%ac%e8%81%94%e5%90%88%e7%a0%94%e7%a9%b6%e5%9b%a2%e9%98%9f%e5%8f%91%e5%b8%83-f","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/10153.html","title":{"rendered":"Training large models on Fugaku supercomputer, Japanese joint research team releases Fugaku-LLM"},"content":{"rendered":"<p data-vmark=\"bb49\">A <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%97%a5%e6%9c%ac\" title=\"_Other Organiser\" target=\"_blank\" >Japanese<\/a> joint research team made up of multiple companies and institutions released the Fugaku-LLM <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%a7%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [large models]]\" target=\"_blank\" >large model<\/a> yesterday. The defining feature of the model is that it was trained on the Arm-architecture supercomputer &quot;Fugaku&quot;.<\/p>\n<p data-vmark=\"7cdc\">Development of the Fugaku-LLM model began in May 2023, with initial participants including Fujitsu (co-developer of the Fugaku supercomputer), Tokyo Institute of Technology, Tohoku University, and RIKEN (the Institute of Physical and Chemical Research).<\/p>\n<p data-vmark=\"86a6\">In August 2023, three more partners joined the development effort: Nagoya University, CyberAgent (also the parent company of game developer Cygames), and HPC-AI startup Kotoba Technologies.<\/p>\n<p data-vmark=\"4046\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-10154\" title=\"7e6716a8-e352-40f3-8e34-721b309280c7\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/05\/7e6716a8-e352-40f3-8e34-721b309280c7.jpg\" alt=\"7e6716a8-e352-40f3-8e34-721b309280c7\" width=\"800\" height=\"600\" 
\/><\/p>\n<p>\u25b2 Fugaku supercomputer. Image source: Fujitsu press release<\/p>\n<p data-vmark=\"7c7d\">In a press release issued yesterday, the research team said it had fully exploited the performance of the Fugaku supercomputer, speeding up matrix multiplication by a factor of 6 and communication by a factor of 3, <span class=\"accentTextColor\">demonstrating that large CPU-only supercomputers can also be used to train large models<\/span>.<\/p>\n<p data-vmark=\"54a1\"><span class=\"accentTextColor\">The Fugaku-LLM model has 13B parameters<\/span>, making it the largest large language model in Japan.<\/p>\n<p data-vmark=\"2589\">Training used 13,824 Fugaku nodes and 380 billion tokens. Of the training data, 60% was Japanese, and the remaining 40% included English, mathematics, and code.<\/p>\n<p data-vmark=\"8603\">The research team says that Fugaku-LLM can naturally use distinctive features of Japanese, such as honorifics, in conversation.<\/p>\n<p data-vmark=\"5789\">On the Japanese MT-Bench benchmark, the model achieved an average score of 5.5, ranking first among open models trained on Japanese corpus data, and scored 9.18 in the humanities and social sciences category.<\/p>\n<p data-vmark=\"39d4\">The Fugaku-LLM model is now publicly available on GitHub and Hugging Face. External researchers and engineers can use the model for academic and commercial purposes as long as they comply with the license agreement.<\/p>","protected":false},"excerpt":{"rendered":"<p>A Japanese joint research team of multiple companies and institutions released the Fugaku-LLM large model yesterday. The defining feature of the model is that it was trained on the Arm-architecture supercomputer Fugaku. 
Development of the Fugaku-LLM model began in May 2023, with initial participants including Fujitsu (co-developer of the Fugaku supercomputer), Tokyo Institute of Technology, Tohoku University, and RIKEN (the Institute of Physical and Chemical Research). In August 2023, three more partners joined the development effort: Nagoya University, CyberAgent (also the parent company of game developer Cygames), and HPC-AI startup Kotoba Technologies.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[216,226],"collection":[],"class_list":["post-10153","post","type-post","status-publish","format-standard","hentry","category-news","tag-216","tag-226"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/10153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=10153"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/10153\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=10153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=10153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=10153"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=10153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}