{"id":28612,"date":"2025-02-12T11:50:53","date_gmt":"2025-02-12T03:50:53","guid":{"rendered":"https:\/\/www.1ai.net\/?p=28612"},"modified":"2025-02-12T11:50:53","modified_gmt":"2025-02-12T03:50:53","slug":"%e5%8d%95%e6%9c%ba%e5%8d%b3%e5%8f%af%e9%83%a8%e7%bd%b2%e8%bf%90%e8%a1%8c-deepseek-r1-671b-%e6%a8%a1%e5%9e%8b%ef%bc%8c%e6%b5%aa%e6%bd%ae%e4%bf%a1%e6%81%af%e6%8e%a8%e5%87%ba%e5%85%83%e8%84%91-r1","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/28612.html","title":{"rendered":"DeepSeek R1 671B model can be deployed and run on a single machine: Inspur Information launches Metabrain R1 inference server"},"content":{"rendered":"<p>February 12 news: <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%b5%aa%e6%bd%ae%e4%bf%a1%e6%81%af\" title=\"_Other Organiser\" target=\"_blank\" >Inspur Information<\/a> today announced the launch of the Metabrain R1 <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%8e%a8%e7%90%86%e6%9c%8d%e5%8a%a1%e5%99%a8\" title=\"[Sees articles with [Deference Server] labels]\" target=\"_blank\" >inference server<\/a>. Through system innovation and hardware-software co-optimization, <strong>a <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%8d%95%e6%9c%ba\" title=\"_Other Organiser\" target=\"_blank\" >single machine<\/a> can deploy and run the <a href=\"https:\/\/www.1ai.net\/en\/tag\/deepseek\" title=\"[View articles tagged with [DeepSeek]]\" target=\"_blank\" >DeepSeek<\/a> R1 671B model<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-28613\" title=\"5f3452d8j00srjxbu00b2d000pa00mop\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/02\/5f3452d8j00srjxbu00b2d000pa00mop.jpg\" alt=\"5f3452d8j00srjxbu00b2d000pa00mop\" width=\"910\" height=\"816\" \/><\/p>\n<p>Note: DeepSeek has open-sourced multiple versions of the model. Among them, the <strong>DeepSeek R1 671B model is the full-parameter base large model<\/strong>, which provides stronger generalization, higher accuracy and better contextual understanding 
than the distilled models, but also places higher demands on the system's GPU memory capacity, GPU memory bandwidth, interconnect bandwidth and latency:<\/p>\n<p><strong>At least 800GB of GPU memory is required at FP8 precision, and 1.4TB or more at FP16 \/ BF16 precision<\/strong>.<\/p>\n<p>In addition, DeepSeek R1 is a typical long chain-of-thought model whose workloads feature short inputs and long outputs, and the decoding phase of inference relies on higher GPU memory bandwidth and very low communication latency.<\/p>\n<p>The Metabrain R1 inference server NF5688G7 is natively equipped with an FP8 compute engine and <strong>provides 1128GB of HBM3e memory<\/strong>. This meets the 671B model's requirement of no less than 800GB of GPU memory at FP8 precision and leaves sufficient KV cache space when a single machine runs inference on the full model. The machine's GPU memory bandwidth reaches up to 4.8 TB\/s.<\/p>\n<p>In terms of communication, GPU P2P bandwidth reaches 900GB\/s, and with the latest inference frameworks a single machine can support 20 to 30 concurrent users. In addition, each NF5688G7 is equipped with a 3200Gbps lossless scale-out network, so capacity can be expanded quickly as users' business demand grows, and a turnkey R1 server cluster solution is available.<\/p>\n<p>The Metabrain R1 inference server NF5868G8 is a high-throughput inference server designed for large reasoning models. <strong>It is the industry's first to support 16 standard PCIe double-width cards in a single machine<\/strong>, providing up to 1536GB of GPU memory and supporting single-machine deployment of the DeepSeek 671B model at FP16\/BF16 precision.<\/p>\n<p>The machine adopts a 16-card fully interconnected topology based on a PCIe fabric, in which the P2P communication bandwidth between any two cards reaches 128GB\/s, reducing communication latency by more than 60%. 
Through hardware-software co-optimization, the NF5868G8 improves inference performance on the DeepSeek 671B model by nearly 40% compared with the traditional two-machine, 8-card PCIe configuration, and it currently supports a range of AI accelerator card options.<\/p>","protected":false},"excerpt":{"rendered":"<p>On February 12, Inspur Information announced the launch of the Metabrain R1 inference server, which, through system innovation and hardware-software co-optimization, allows the DeepSeek R1 671B model to be deployed and run on a single machine. Note: DeepSeek has open-sourced multiple versions of the model. Among them, the DeepSeek R1 671B model is the full-parameter base large model, with stronger generalization, higher accuracy and better contextual understanding than the distilled models, but it also places higher demands on the system's GPU memory capacity, GPU memory bandwidth, interconnect bandwidth and latency: at least 800GB of GPU memory at FP8 precision and 1.4TB or more at FP16\/BF16 precision. 
Besides, Dee<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[3606,5696,5698,5697],"collection":[],"class_list":["post-28612","post","type-post","status-publish","format-standard","hentry","category-news","tag-deepseek","tag-5696","tag-5698","tag-5697"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/28612","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=28612"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/28612\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=28612"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=28612"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=28612"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=28612"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}