Small but tough! A 10-person team builds the first fine-tuned Llama 3.1 405B

A team of just 10 people has dared to take on the tech giant Meta, in a real-life version of "David versus Goliath"!

The startup is called Nous Research, and it is no unknown. It has just launched Hermes 3, a model fine-tuned from Llama 3.1 405B. The team may be small, but its strength should not be underestimated: this ten-person team has already fine-tuned a number of models, including Mistral, Yi, and Llama, with more than 33 million downloads in total, making it a "hit-making machine" of the AI world!

The arrival of Hermes 3 is a shot in the arm for the AI world. Even after FP8 quantization, its performance remains impressive. Quantization greatly reduces the model's VRAM and disk requirements and allows Hermes 3 to run on a single node, which is a boon for developers!
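To see why FP8 matters at this scale, a rough weight-memory estimate helps. The sketch below is only back-of-the-envelope arithmetic: it assumes 2 bytes per parameter for BF16 and 1 byte for FP8, counts the weights alone (no KV cache, activations, or runtime overhead), and uses a hypothetical node of 8 accelerators with 80 GB each, which is an illustrative assumption rather than a figure from the announcement.

```python
# Back-of-the-envelope weight-memory estimate for a 405B-parameter model.
# Counts weights only: no KV cache, activations, or runtime overhead.

PARAMS = 405e9  # parameter count implied by the "405B" in the model name

# Storage cost per parameter for each number format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}

# Assumption for illustration: a node with 8 accelerators of 80 GB each.
NODE_MEMORY_GB = 8 * 80


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_on_node(num_params: float, dtype: str, node_gb: float = NODE_MEMORY_GB) -> bool:
    """True if the weights alone fit in the node's total accelerator memory."""
    return weight_memory_gb(num_params, dtype) <= node_gb


if __name__ == "__main__":
    for dtype in ("bf16", "fp8"):
        needed = weight_memory_gb(PARAMS, dtype)
        print(f"{dtype}: ~{needed:.0f} GB of weights, "
              f"fits in a {NODE_MEMORY_GB} GB node: {fits_on_node(PARAMS, dtype)}")
```

Under these assumptions the BF16 weights need about 810 GB and exceed the 640 GB node, while the FP8 weights need about 405 GB and leave headroom, which is consistent with the single-node claim above.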

Hermes 3 is an all-rounder in conversation. Whether the task is long-term memory, multi-turn dialogue, role-playing, or internal monologue, it handles it with ease. Thanks to the 128K context window of Llama 3.1, Hermes 3 keeps a conversation coherent like an experienced diplomat.

But there is more to Hermes 3 than that. It demonstrates a range of advanced capabilities beyond traditional language modeling and can assess the quality of generated text in a sophisticated, nuanced way. In other words, it is not only eloquent but also a rigorous text critic!

Even more striking, Hermes 3 also integrates several agentic capabilities, including structured output, output of intermediate steps, and generation of internal monologues for transparent decision-making. It is like giving an AI a "transparent brain" that lets us glimpse its thinking process.

The training process of Hermes 3 was a grueling boot camp by AI standards. It went through two phases: supervised fine-tuning (SFT) and direct preference optimization (DPO). The team spent five months curating and building the SFT dataset, a level of dedication and patience that is nothing short of awe-inspiring.
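For readers unfamiliar with the second phase, here is a minimal sketch of the standard DPO objective for a single preference pair: the negative log-sigmoid of beta times the difference between the policy's and the reference model's log-probability margins on the chosen and rejected responses. This is the generic textbook form, not Nous Research's training code; the function name, the beta value, and the example log-probabilities are all illustrative assumptions.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative value, not taken from the article
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) equals softplus(-z).
    return softplus(-z)


if __name__ == "__main__":
    # Made-up log-probabilities, purely for illustration.
    print(dpo_loss(-10.0, -12.0, -11.0, -11.0))  # policy favors the chosen response
    print(dpo_loss(-12.0, -10.0, -11.0, -11.0))  # policy favors the rejected response
```

When the policy matches the reference, the loss is log 2; it falls below that as the policy shifts probability toward the preferred response and rises above it in the opposite case.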

Nous Research, a private applied research group founded in 2023 and based in New York City, is the "barbarian invader" of the AI world. The team believes in the power of open source and has vowed to challenge the idea that closed technology sets the limits of innovation. The company's slogan is a stirring one: "We challenge the assumption that closed technologies will always be at the top of the innovation ladder; instead, we provide powerful open source code."

In just over a year, Nous Research has released 5 datasets and 89 models. That level of output seems to declare to the world: size does not matter, strength is king!

Paper address: https://nousresearch.com/wp-content/uploads/2024/08/Hermes-3-Technical-Report.pdf

Official introduction: https://nousresearch.com/freedom-at-the-frontier-hermes-3/
