{"id":17475,"date":"2024-08-07T17:55:22","date_gmt":"2024-08-07T09:55:22","guid":{"rendered":"https:\/\/www.1ai.net\/?p=17475"},"modified":"2024-08-07T17:55:22","modified_gmt":"2024-08-07T09:55:22","slug":"%e6%9c%88%e4%b9%8b%e6%9a%97%e9%9d%a2-kimi-%e5%bc%80%e6%94%be%e5%b9%b3%e5%8f%b0%e4%b8%8a%e4%b8%8b%e6%96%87%e7%bc%93%e5%ad%98-cache-%e5%ad%98%e5%82%a8%e8%b4%b9%e7%94%a8%e9%99%8d%e4%bb%b7-50%ef%bc%9a","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/17475.html","title":{"rendered":"Dark Side of the Moon Cuts Kimi Open Platform Context Cache Storage Fee by 50%: Now 5 Yuan\/1M Tokens\/Min"},"content":{"rendered":"<p data-track=\"1\" data-pm-slice=\"0 0 []\">AI unicorn company <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%9c%88%e4%b9%8b%e6%9a%97%e9%9d%a2\" title=\"[Sees articles with labels]\" target=\"_blank\" >Dark Side of the Moon<\/a> announced today that the <a href=\"https:\/\/www.1ai.net\/en\/tag\/kimi\" title=\"[View articles tagged with [Kimi]]\" target=\"_blank\" >Kimi<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%94%be%e5%b9%b3%e5%8f%b0\" title=\"_Other Organiser\" target=\"_blank\" >open platform<\/a> has cut its context cache (Cache) storage fee by 50%, from 10 yuan \/ 1M tokens \/ min to 5 yuan \/ 1M tokens \/ min, effective immediately.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-17476\" title=\"get-216\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/get-216.jpg\" alt=\"get-216\" width=\"1380\" height=\"736\" \/><\/div>\n<p data-track=\"9\">On July 1, the Kimi open platform opened its context caching (Context Caching) feature for public beta. 
The company says that, with API prices unchanged, the technology can cut developers' usage costs for its long-text flagship large model by up to 90% while also improving model response speed.<\/p>\n<p data-track=\"10\">IT Home reports the details of the Kimi open platform context caching public beta as follows:<\/p>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"11\">Technical Brief<\/h1>\n<p data-track=\"12\">Context caching is described as a data management technique that allows a system to pre-store large amounts of data or information that will be requested frequently. When a user requests the same information, the system can serve it directly from the cache instead of recomputing it or retrieving it from the original data source.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-17477\" title=\"get-217\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/get-217.jpg\" alt=\"get-217\" width=\"1080\" height=\"608\" \/><\/div>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"13\">Applicable scenarios<\/h1>\n<p data-track=\"14\">Context caching suits scenarios with frequent requests that repeatedly reference a large initial context; it can reduce the cost of long-text models and improve efficiency. Officially, costs can be cut by up to 90% and first-token latency by up to 83%. Applicable business scenarios include:<\/p>\n<ul>\n<li data-track=\"15\">QA bots with lots of preset content, e.g. the Kimi API helper<\/li>\n<li data-track=\"16\">Frequent queries against a fixed set of documents, such as Q&amp;A tools for public company disclosures<\/li>\n<li data-track=\"17\">Periodic analysis of static codebases or knowledge bases, such as various Copilot Agents<\/li>\n<li data-track=\"18\">Viral AI apps with huge instantaneous traffic, e.g. 
Coax Simulator, LLM Riddles<\/li>\n<li data-track=\"19\">Agent-style applications with complex interaction rules, etc.<\/li>\n<\/ul>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"20\">Billing Instructions<\/h1>\n<p data-track=\"21\">The Context Caching billing model has three main parts:<\/p>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"22\">Cache creation fee<\/h1>\n<ul>\n<li data-track=\"23\">When the Cache creation interface is called and a Cache is created successfully, a fee is charged on the actual number of tokens in the Cache: 24 yuan \/ M tokens<\/li>\n<\/ul>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"24\">Cache storage fee<\/h1>\n<ul>\n<li data-track=\"25\">Storage is billed by the minute for as long as the Cache is alive: 10 yuan \/ M tokens \/ minute (halved to 5 yuan as of today's announcement).<\/li>\n<\/ul>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"26\">Cache call fees<\/h1>\n<ul>\n<li data-track=\"27\">Incremental tokens in a Cache call: billed at each model's standard price<\/li>\n<li data-track=\"28\">Per-call fee: while a Cache is alive, when a user requests a successfully created Cache through the chat interface and the chat message content matches the live Cache, a fee is charged per successful Cache call: 
0.02 yuan \/ call<\/li>\n<\/ul>\n<h1 class=\"pgc-h-arrow-right\" spellcheck=\"false\" data-track=\"29\">Public beta period and eligibility<\/h1>\n<ul>\n<li data-track=\"30\">Public beta period: 3 months from the feature's launch; prices may be adjusted at any time during the beta.<\/li>\n<li data-track=\"31\">Public beta eligibility: during the beta, Context Caching is open to Tier 5 users, with access for other users to be opened at a later date.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>","protected":false},"excerpt":{"rendered":"<p>AI unicorn company Dark Side of the Moon announced today that the Kimi open platform's context cache (Cache) storage fee has been cut by 50%, from 10 yuan \/ 1M tokens \/ min to 5 yuan \/ 1M tokens \/ min, effective immediately. On July 1, the Kimi open platform opened its context caching (Context Caching) feature for public beta. The company says that, with API prices unchanged, the technology can cut developers' usage costs for its long-text flagship large model by up to 90% while also improving model response speed. 
IT Home reports the details of the Kimi open platform Context Caching public beta as follows: Technical Brief<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1814,3220,1168],"collection":[],"class_list":["post-17475","post","type-post","status-publish","format-standard","hentry","category-news","tag-kimi","tag-3220","tag-1168"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17475","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=17475"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17475\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=17475"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=17475"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=17475"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=17475"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}