September 20: xAI, Elon Musk's company, today launched a new AI model, Grok 4 Fast. It delivers performance close to Grok 4 on enterprise and consumer tasks while using 40% fewer reasoning tokens on average.
In benchmark tests, Grok 4 Fast matched Grok 4 on reasoning benchmarks while using 40% fewer reasoning tokens on average, at a price about 98% lower.

For example, on the AIME 2025 test without tools, it reached 92.0% accuracy, surpassing Grok 3 Mini, and it showed a clear advantage on mathematical reasoning tasks such as HMMT 2025.
In search and information retrieval, Grok 4 Fast shows leading multi-hop search performance. In LMArena's Search Arena, grok-4-fast-search ranked first with an Elo score of 1163, 17 points ahead of the runner-up; on Chinese-language search and cross-platform data integration tasks, its accuracy was significantly higher than that of comparable models.
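To give a rough sense of what a gap of a few Elo points means, here is a minimal sketch. It assumes the standard Elo expected-score formula (logistic curve, base 10, scale 400); LMArena's exact scoring method may differ, and treating the 17 as the rating gap between first and second place is a reading of the figure above, not something the leaderboard numbers here confirm in detail.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score of the higher-rated side under the standard Elo formula.

    Assumption: base-10 logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))


# Assumed reading: a 17-point gap between first and second place.
gap = 17
print(f"{elo_expected_score(gap):.3f}")  # prints 0.524
```

Under these assumptions, a 17-point lead corresponds to the leader being preferred in roughly 52% of head-to-head comparisons, which is a modest but measurable edge.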
In LMArena's Text Arena, grok-4-fast (alias tahoe) ranked 8th, on par with grok-4-0709, highlighting its remarkable intelligence density. Notably, all other models of comparable size rank 18th or lower.
Architecturally, Grok 4 Fast for the first time unifies the long chain-of-thought reasoning mode and the fast-response mode in a single model, switching between them dynamically through system prompts. This not only reduces latency but also further lowers token costs, making the model suitable for scenarios such as real-time search and code execution. Developers can adjust the reasoning depth to suit different needs through the xAI API.
Grok 4 Fast is now available to all users (including free users), and is free for a limited time on OpenRouter and Vercel AI Gateway.
For API calls, input costs $0.20 per 1 million tokens (note: about RMB 1.4 at the current exchange rate), and output costs $0.50 per 1 million tokens (about RMB 3.6 at the current exchange rate).
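As a rough illustration of the per-token pricing, the sketch below estimates the cost of a single request. It assumes the two quoted rates apply as flat prices to every token; any tiering, caching discounts, or tool-call fees are outside what is stated here, and the function name and sample workload are hypothetical.

```python
# Prices quoted above, in US dollars per 1 million tokens.
INPUT_USD_PER_MILLION = 0.20
OUTPUT_USD_PER_MILLION = 0.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming flat per-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
cost = estimate_cost_usd(200_000, 50_000)
print(f"${cost:.3f}")  # prints $0.065
```

At these rates, a full 1 million input tokens plus 1 million output tokens would come to $0.70.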