{"id":30740,"date":"2025-03-14T20:53:15","date_gmt":"2025-03-14T12:53:15","guid":{"rendered":"https:\/\/www.1ai.net\/?p=30740"},"modified":"2025-03-14T20:53:15","modified_gmt":"2025-03-14T12:53:15","slug":"%e6%b8%85%e5%8d%8e%e5%9b%a2%e9%98%9f%e5%bc%80%e6%ba%90%e5%a4%a7%e6%a8%a1%e5%9e%8b%e6%8e%a8%e7%90%86%e5%bc%95%e6%93%8e%e8%b5%a4%e5%85%94-chitu%ef%bc%8c%e5%ae%9e%e7%8e%b0-deepseek","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/30740.html","title":{"rendered":"Tsinghua team open-sources large model inference engine \"Chitu\", halving DeepSeek inference cost and doubling performance"},"content":{"rendered":"<p>March 14th. Professor Zhai Jidong's team at the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%b8%85%e5%8d%8e\" title=\"View articles tagged Tsinghua\" target=\"_blank\" >Tsinghua University<\/a> Institute for High Performance Computing and Tsinghua-based startup Qingcheng Jizhi jointly announced today that the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%a7%e6%a8%a1%e5%9e%8b%e6%8e%a8%e7%90%86%e5%bc%95%e6%93%8e\" title=\"View articles tagged Large Model Inference Engine\" target=\"_blank\" >large model inference engine<\/a> \u201c<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%b5%a4%e5%85%94\" title=\"View articles tagged Chitu\" target=\"_blank\" >Chitu<\/a>\u201d is now open source.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-30741\" title=\"85d7fc1dj00st46f80041d000fa0088p\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/03\/85d7fc1dj00st46f80041d000fa0088p.jpg\" alt=\"85d7fc1dj00st46f80041d000fa0088p\" width=\"550\" height=\"296\" \/><\/p>\n<p>According to the introduction, the engine is the first to natively run FP8-precision models on GPUs outside NVIDIA's Hopper architecture and on various domestic chips, halving the cost and doubling the performance of DeepSeek inference. 
Positioned as a \"production-grade large model inference engine\", it offers the following features:<\/p>\n<ul>\n<li>Diversified computing power adaptation: supports NVIDIA products from the latest flagships to older series, and provides optimized support for domestic chips.<\/li>\n<li>Full-scenario scalability: from CPU-only and single-GPU deployments to large-scale cluster deployments, the Chitu engine provides scalable solutions.<\/li>\n<li>Long-term stable operation: suitable for real production environments and stable enough to carry concurrent business traffic.<\/li>\n<\/ul>\n<p>Officially, in tests on an A800 cluster, the open-source Chitu engine deployed with the full DeepSeek-R1-671B model achieved a 3.15x increase in inference speed while reducing GPU usage by 50%, compared with some foreign open-source frameworks.<\/p>\n<p>1AI notes the open-source address: https:\/\/github.com\/thu-pacman\/chitu<\/p>","protected":false},"excerpt":{"rendered":"<p>On March 14th, Professor Zhai Jidong's team at Tsinghua University's Institute for High Performance Computing and Tsinghua-based startup Qingcheng Jizhi jointly announced that the large model inference engine \u201cChitu\u201d is now open source. The engine is described as the first to natively run FP8-precision models on GPUs outside NVIDIA's Hopper architecture and on various domestic chips, halving DeepSeek inference cost and doubling performance. Positioned as a \u201cproduction-grade large model inference engine\u201d, it provides the following features: Diversified computing power adaptation: supports NVIDIA products from the latest flagships to older series, with optimized support for domestic chips. 
Full-scenario scalability: from CPU-only and single-GPU deployments to large-scale cluster deployments, Chitu<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[5980,391,841,5981],"collection":[],"class_list":["post-30740","post","type-post","status-publish","format-standard","hentry","category-news","tag-5980","tag-391","tag-841","tag-5981"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/30740","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=30740"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/30740\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=30740"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=30740"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=30740"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=30740"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}