Genre GLM-5-Turbo tops, bytes, millimetres, four models, top 10 worldwide

31 March, Agent Evaluation Agency ClawBench It was released yesterdayLarge ModelChecklist, covering 30 complex Agent missions, covering five core business scenarios of office collaboration, information retrieval, content creation, data processing and software engineering。

Genre GLM-5-Turbo tops, bytes, millimetres, four models, top 10 worldwide

This list includes over 40 major mainstream models, and the top 10 of the world's top four national production models, i.e., spectra, byte, and millimetres。

GLM-5-Turbo, with 93.9 points of CLAW SCORE at the top of the list, is the most highly performing model of the evaluation

Byte beat Doubao-Seed-2.0-lite is second in 93.1 with only $0.33, the lowest in the list

MiMo-V2-Omni is ranked 9th in 91.2 and runs at the fastest speed and takes only 848 seconds to complete the full task flow。

From the overall list, OpenAI GPT-54 ranks third in 92.2, Claude Opus 4.5 ranks seventh in 91.5, and Ali Qwen3.5-35B-A3B stands eighth in 91.4。

ClawBench uses a sandbox enforcement mechanism, where each model is designed to perform its tasks in a genuinely simulated business development environment and deliberately embeds engineering challenges such as "unsatisfactory name " "missing directory" "date trap"。

In terms of scoring, ClawBench introduced a “triple scoring mechanism” with automated script assertions based on the type of task, front-line LLM acting as “expert assessor” and a mixed rating that combines the weighting of the two, with a view to more accurately reflecting the actual deployment capacity of the model in a complex workflow。

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.
Information

OpenAI close Sora details: losing millions a day, Disney executives learned an hour before the announcement

2026-3-31 11:37:51

Information

BLOOMBERG: AI IS STEALING THE FIRST JOB OF A YOUNG MAN IN LONDON

2026-3-31 11:42:32

Search