{"id":27256,"date":"2025-01-17T18:47:46","date_gmt":"2025-01-17T10:47:46","guid":{"rendered":"https:\/\/www.1ai.net\/?p=27256"},"modified":"2025-01-17T18:47:46","modified_gmt":"2025-01-17T10:47:46","slug":"%e7%a0%94%e7%a9%b6%e5%85%ac%e5%8f%b8%e5%85%ac%e5%b8%83-swiftkv-%e6%8a%80%e6%9c%af%ef%bc%9a%e4%bc%98%e5%8c%96%e5%a4%a7%e6%a8%a1%e5%9e%8b%e6%8f%90%e7%a4%ba%e8%af%8d%e5%a4%84%e7%90%86%e8%bf%87%e7%a8%8b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/27256.html","title":{"rendered":"Research Firm Announces SwiftKV Technology: Optimizes Large Model Prompt Processing, Cuts AI Inference Time by 50%"},"content":{"rendered":"<p>Jan. 17 (Bloomberg) -- Research firm <a href=\"https:\/\/www.1ai.net\/en\/tag\/snowflake\" title=\"[See articles with [Snowflake] labels]\" target=\"_blank\" >Snowflake<\/a> has announced an <a href=\"https:\/\/www.1ai.net\/en\/tag\/ai%e6%a8%a1%e5%9e%8b%e8%b0%83%e6%a0%a1\" title=\"[SEES ARTICLES WITH [AI MODEL] LABELS]\" target=\"_blank\" >AI model tuning<\/a> technique called \"<a href=\"https:\/\/www.1ai.net\/en\/tag\/swiftkv\" title=\"_Other Organiser\" target=\"_blank\" >SwiftKV<\/a>\". The company has also open-sourced three Llama 3.1 AI models tuned with \"SwiftKV\" technology on Hugging Face (<a href=\"https:\/\/huggingface.co\/collections\/Snowflake\/swiftkv-models-674f7d7474eb789e185d31cb\">click here to visit<\/a>).<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-27257\" title=\"506de03cj00sq8ba5007dd000v900php\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/01\/506de03cj00sq8ba5007dd000v900php.jpg\" alt=\"506de03cj00sq8ba5007dd000v900php\" width=\"1125\" height=\"917\" \/><\/p>\n<p>1AI has learned that the core of SwiftKV technology is optimizing the model's prompt processing. 
The researchers note that the most computationally intensive part of running a large model is typically processing the prompts that users enter, and that many organizations write extremely long custom prompts for their models, which on average are said to be \"about 10 times as long as the output of the generated content\".<\/p>\n<p>According to Snowflake, the \"SwiftKV\" model tuning technique is specifically optimized for this prefill stage of prompt processing. It is said to go beyond traditional key-value (KV) cache compression, and it also introduces model restructuring and knowledge-preserving self-distillation into the model inference process, thereby improving model throughput and reducing latency and computing costs. It is claimed to help AI models significantly cut inference time, <strong>reducing model inference time by 50%<\/strong>.<\/p>\n<p>Experimental results show that after the 8-billion- and 70-billion-parameter Llama 3.1 models are optimized with SwiftKV, their overall throughput can be tripled, and the models also perform well at code auto-completion and text summarization.<\/p>","protected":false},"excerpt":{"rendered":"<p>On January 17, research firm Snowflake announced an AI model tuning technique called \u201cSwiftKV\u201d and open-sourced three Llama 3.1 AI models tuned with \u201cSwiftKV\u201d technology on Hugging Face. 1AI has learned that the core of SwiftKV technology is optimizing the model's prompt processing. 
Researchers have pointed out that the most resource-intensive part of a large model is generally processing the prompts that users enter, and that many enterprises write very long custom prompts for their models, which are said to average \u201cabout 10 times as long as the output of the generated content\u201d. According to Snowflake, this \"SwiftKV\"<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[5578,5579,5577],"collection":[],"class_list":["post-27256","post","type-post","status-publish","format-standard","hentry","category-news","tag-ai","tag-snowflake","tag-swiftkv"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/27256","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=27256"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/27256\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=27256"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=27256"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=27256"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=27256"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}