{"id":18152,"date":"2024-08-17T09:45:48","date_gmt":"2024-08-17T01:45:48","guid":{"rendered":"https:\/\/www.1ai.net\/?p=18152"},"modified":"2024-08-17T09:45:48","modified_gmt":"2024-08-17T01:45:48","slug":"%e5%b0%8f%e8%80%8c%e5%bc%ba%e6%82%8d%ef%bc%8110%e4%ba%ba%e5%9b%a2%e9%98%9f%e7%82%bc%e5%87%ba%e9%a6%96%e4%b8%aa%e5%be%ae%e8%b0%83llama-3-1-405b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/18152.html","title":{"rendered":"Small but powerful! A 10-person team builds the first fine-tuned Llama 3.1 405B"},"content":{"rendered":"<p data-pm-slice=\"0 0 []\">A team of only 10 people dares to challenge the tech giant Meta, a real-life version of David\u2019s victory over Goliath!<\/p>\n<p data-track=\"24\">The company is <a href=\"https:\/\/www.1ai.net\/en\/tag\/nous-research\" title=\"_Other Organiser\" target=\"_blank\" >Nous Research<\/a>, a <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%88%9d%e5%88%9b%e5%85%ac%e5%8f%b8\" title=\"[Sees articles with labels]\" target=\"_blank\" >startup<\/a> that is far from unknown. It has just launched <a href=\"https:\/\/www.1ai.net\/en\/tag\/hermes3\" title=\"[Sees articles with [Hermes3] label]\" target=\"_blank\" >Hermes 3<\/a>, a fine-tune of the <a href=\"https:\/\/www.1ai.net\/en\/tag\/llama\" title=\"_Other Organiser\" target=\"_blank\" >Llama<\/a> 3.1 405B <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%a8%a1%e5%9e%8b\" title=\"_Other Organiser\" target=\"_blank\" >model<\/a>. The team may be small, but it should not be underestimated. This ten-person dream team has successfully fine-tuned multiple models, including Mistral, Yi, and Llama, with over 33 million downloads. 
It is like a hit-making machine of the AI world!<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-18153\" title=\"get-319\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/get-319.jpg\" alt=\"get-319\" width=\"979\" height=\"365\" \/><\/div>\n<p data-track=\"25\">The arrival of Hermes 3 is a shot in the arm for the AI world. Even after FP8 quantization, its performance remains impressive. Quantization not only greatly reduces the model's VRAM and disk requirements but also lets Hermes 3 run on a single node, which is a boon for developers!<\/p>\n<p data-track=\"26\">Hermes 3 is an all-rounder in conversation. Whether the task involves long-term memory, multi-turn dialogue, role-playing, or internal monologue, it handles it with ease. Thanks to Llama 3.1&#039;s 128K context window, Hermes 3 keeps a conversation coherent like an experienced diplomat.<\/p>\n<p data-track=\"27\">But Hermes 3 is more than that. It demonstrates a range of advanced capabilities beyond traditional language modeling and can understand and assess the quality of generated text in a sophisticated, nuanced way. In other words, it is not only eloquent but also a rigorous text critic!<\/p>\n<p data-track=\"28\">More strikingly, Hermes 3 also combines several agentic capabilities, including structured outputs, intermediate steps, and internal monologue, to make its decision-making transparent. It is like giving the AI a \"transparent brain\" so that we can see how it thinks.<\/p>\n<p data-track=\"29\">Hermes 3's training process could be called the AI world's \"grueling boot camp\". It went through two stages: supervised fine-tuning (SFT) and direct preference optimization (DPO). 
The team spent five months filtering and building the SFT dataset, a level of focus and patience that is truly admirable.<\/p>\n<p data-track=\"30\">Nous Research, a private applied research team founded in 2023 and based in New York, is a true \"disruptor\" of the AI world. The team firmly believes in the power of open source and is committed to challenging the innovation limits of closed technology. The company's slogan is loud and clear: \u201cWe challenge the assumption that closed technology will always hold the peak of innovation; instead, we provide powerful open-source code.\u201d<\/p>\n<p data-track=\"31\">In just over a year, Nous Research has released 5 datasets and 89 models. Such productivity seems to declare to the world: size does not matter, and capability is king!<\/p>\n<p data-track=\"32\">Paper: https:\/\/nousresearch.com\/wp-content\/uploads\/2024\/08\/Hermes-3-Technical-Report.pdf<\/p>\n<p data-track=\"33\">Official introduction: https:\/\/nousresearch.com\/freedom-at-the-frontier-hermes-3\/<\/p>\n<p>&nbsp;<\/p>","protected":false},"excerpt":{"rendered":"<p>A team of only 10 people dared to challenge the tech giant Meta, a real-life version of David\u2019s victory over Goliath. The startup Nous Research is far from unknown. Hermes 3, which it has just launched, is a fine-tune of the Llama 3.1 405B model. The team may be small, but it should not be underestimated. 
This ten-person dream team has successfully fine-tuned many models, including Mistral, Yi, and Llama, with over 33 million downloads, and it is like a \"hit-making machine\" of the AI world<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[4043,184,4042,309,1489],"collection":[],"class_list":["post-18152","post","type-post","status-publish","format-standard","hentry","category-news","tag-hermes3","tag-llama","tag-nous-research","tag-309","tag-1489"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/18152","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=18152"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/18152\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=18152"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=18152"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=18152"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=18152"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}