{"id":51286,"date":"2026-03-19T12:14:16","date_gmt":"2026-03-19T04:14:16","guid":{"rendered":"https:\/\/www.1ai.net\/?p=51286"},"modified":"2026-03-19T12:14:16","modified_gmt":"2026-03-19T04:14:16","slug":"kimi-%e9%a6%96%e6%8a%ab-k2-5-%e6%8a%80%e6%9c%af%e8%b7%af%e7%ba%bf%e5%9b%be%ef%bc%9a%e4%b8%89%e5%a4%a7%e5%ba%95%e5%b1%82%e9%87%8d%e6%9e%84%ef%bc%8c%e9%a9%ac%e6%96%af%e5%85%8b%e7%82%b9%e8%b5%9e%e3%80%8c","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/51286.html",
"title":{"rendered":"Kimi Unveils K2.5 Technology Roadmap: Three Low-Level Reconstructions, Praised by Musk as \"Impressive\""},
"content":{"rendered":"<p>March 19 news: yesterday, the founder of Moonshot AI's <a href=\"https:\/\/www.1ai.net\/en\/tag\/kimi\" title=\"[View articles tagged with [Kimi]]\" target=\"_blank\" >Kimi<\/a>, <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%9d%a8%e6%a4%8d%e9%ba%9f\" title=\"[View articles tagged with [Yang Zhilin]]\" target=\"_blank\" >Yang Zhilin<\/a>, delivered the keynote speech \"How We Scaled Kimi K2.5\" at NVIDIA's GTC 2026 conference, systematically revealing for the first time Kimi's complete technology roadmap, built around three low-level reconstructions:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-51287\" title=\"82a8220j00tc4p2m001fd000u000idm\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2026\/03\/82a82220j00tc4p2m001fd000u000idm.jpg\" alt=\"82a8220j00tc4p2m001fd000u000idm\" width=\"1080\" height=\"661\" \/><\/p>\n<p><strong>MuonClip Optimizer<\/strong>: Replacing the Adam optimizer, in use since 2014, the team built on the Muon optimizer and introduced Newton-Schulz iteration and the QK-Clip mechanism, solving the logits-explosion problem in hundred-billion-parameter-scale training and achieving twice the compute efficiency of the traditional AdamW.<\/p>\n<p><strong>Kimi Linear<\/strong>: A hybrid linear attention architecture based on the KDA structure that challenges the convention that \"all layers must use full attention\", delivering a 5 to 6 times decoding speedup in 128K and 1M ultra-long-context scenarios.<\/p>\n<p><strong>Attention Residual<\/strong>: Targeting the residual connection, standard for the past decade, the traditional additive formulation is replaced with cross-layer Softmax attention, allowing each layer to proactively and selectively extract information from earlier layers.<\/p>\n<p>Of the three, the release of Attention Residual has drawn the widest industry attention: Musk called it \"impressive\", former OpenAI co-founder Karpathy remarked that \"it seems we should not have taken 'Attention Is All You Need' so literally\", and the main inventor of OpenAI o1 called it the beginning of \"Deep Learning 2.0\".<\/p>\n<p>Yang Zhilin said Kimi will stick to the open-source path, contributing the low-level innovations of MuonClip, Kimi Linear, and Attention Residual to the open-source community.<\/p>","protected":false},
"excerpt":{"rendered":"<p>March 19 news: yesterday, Moonshot AI's 
Yang Zhilin, founder of Kimi, delivered the keynote speech \"How We Scaled Kimi K2.5\" at NVIDIA's GTC 2026 conference, systematically revealing for the first time Kimi's complete technology roadmap, built around three low-level reconstructions. MuonClip Optimizer: replacing the Adam optimizer in use since 2014, the team built on the Muon optimizer and introduced Newton-Schulz iteration and the QK-Clip mechanism, addressing the logits-explosion problem in trillion-parameter training and achieving twice the compute efficiency of the traditional AdamW<\/p>","protected":false},
"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1814,4994],"collection":[],"class_list":["post-51286","post","type-post","status-publish","format-standard","hentry","category-news","tag-kimi","tag-4994"],"acf":[],
"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/51286","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=51286"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/51286\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=51286"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=51286"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=51286"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=51286"}],"curies":[{"name":"wp","href":"https:\/\/api.w.o
rg\/{rel}","templated":true}]}}