Grok 4.1 Tops the LMArena Leaderboard, Jumping from 33rd to 1st Place

November 18: Elon Musk's AI company xAI announced yesterday (November 17) the launch of its latest large language model, Grok 4.1, which is already available to all users of the X platform and the mobile apps (iOS and Android).

The update aims to improve Grok's overall usability in real-world use. According to xAI, Grok 4.1 retains the intelligence and reliability of the previous generation while making significant gains in creativity, emotional understanding, and collaborative interaction, allowing it to pick up on subtle user intent more accurately and to deliver a more engaging and coherent conversational experience.

Grok 4.1's performance puts it at the top of the industry. On the LMArena text leaderboard, the version with thinking enabled (codename: quasarflux) ranks first with 1483 Elo points, 31 points ahead of the highest-scoring non-xAI model.

More strikingly, its "instant response" version, which does not use extended thinking, ranks second with 1465 Elo points, outperforming even the full-reasoning configurations of every other model. This is a major leap over the previous generation, Grok 4 (33rd place), and points to a clear advantage in the underlying model's capability.
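To put these Elo gaps in perspective, a rating difference can be converted into an expected head-to-head win rate. The following is a minimal sketch that assumes the conventional Elo formula with a 400-point scale; the article does not describe LMArena's exact scoring method, so the function name and the scale are illustrative assumptions rather than the leaderboard's actual implementation.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected win rate of A against B under the conventional Elo formula.

    The 400-point scale is the standard Elo convention, assumed here for
    illustration; it is not taken from the article.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Scores quoted in the article: thinking version 1483, instant version 1465.
gap_between_grok_versions = expected_score(1483, 1465)
print(f"{gap_between_grok_versions:.3f}")  # about 0.526
```

Under this assumption, an 18-point gap corresponds to only a slight edge in pairwise preference votes, so small differences near the top of the leaderboard should be read with caution.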

Beyond general capability benchmarks, Grok 4.1 also made significant progress in "soft skills". The new model performed well on EQ-Bench3, which measures a model's emotional intelligence, and on Creative Writing v3, which assesses creative ability.

On EQ-Bench3, which assesses emotional understanding, insight, and interpersonal skills, Grok 4.1's reasoning and non-reasoning models took the top two spots.

In creative writing, according to the Creative Writing v3 benchmark results, the two Grok 4.1 models rank second and third, behind only an early GPT-5.1 model.

This means Grok 4.1 can not only handle complex logic but also better understand and respond to human emotional cues and produce imaginative content, making it feel more "human" in conversation.

Another key improvement is a significant reduction in the model's hallucination rate. Fast-response models equipped with search tools can make factual errors because their reasoning depth and tool-call budget are limited.

During the post-training stage of Grok 4.1, xAI focused on reducing factual hallucinations, particularly for information-seeking prompts. Based on an evaluation using a sample of real-world search queries, the new model's hallucination rate has dropped significantly, giving users more reliable and accurate information.
