{"id":34921,"date":"2025-05-10T12:26:07","date_gmt":"2025-05-10T04:26:07","guid":{"rendered":"https:\/\/www.1ai.net\/?p=34921"},"modified":"2025-05-10T12:26:16","modified_gmt":"2025-05-10T04:26:16","slug":"12gb-%e6%98%be%e5%ad%98%e5%8f%af%e5%ae%9e%e7%8e%b0-128k-%e4%b8%8a%e4%b8%8b%e6%96%87-5-%e5%b9%b6%e5%8f%91%e4%bc%9a%e8%af%9d%ef%bc%8cibm-%e9%a2%84%e8%a7%88-granite-4-0-tiny-%e6%a8%a1%e5%9e%8b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/34921.html","title":{"rendered":"12GB Video Memory Enables 128K Context and 5 Concurrent Sessions: IBM Previews Granite 4.0 Tiny Model"},"content":{"rendered":"<p>May 10 news: <a href=\"https:\/\/www.1ai.net\/en\/tag\/ibm\" title=\"View articles tagged IBM\" target=\"_blank\" >IBM<\/a> introduced on May 2 a preview of one of the smallest members of its <a href=\"https:\/\/www.1ai.net\/en\/tag\/granite\" title=\"View articles tagged Granite\" target=\"_blank\" >Granite<\/a> 4.0 model family: Granite 4.0 Tiny.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-34922\" title=\"e9de93b8j00sw12xw002td000v900hdp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/05\/e9de93b8j00sw12xw002td000v900hdp.jpg\" alt=\"e9de93b8j00sw12xw002td000v900hdp\" width=\"1125\" height=\"625\" \/><\/p>\n<p>The Granite 4.0 Tiny Preview's <strong>advantages include high computational efficiency and <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e4%bd%8e%e5%86%85%e5%ad%98\" title=\"View articles tagged low memory\" target=\"_blank\" >low memory<\/a> requirements<\/strong>: at FP8 precision, running five concurrent sessions with 128K-token context windows requires only 12GB of video memory, which an NVIDIA GeForce RTX 3060 12GB consumer graphics card with a suggested retail price of $329 (note: about RMB 2,383 at current exchange rates) can provide.<\/p>\n<p>Granite 4.0 Tiny is planned to be trained on at least 15T tokens, and the current preview has been trained on only 2.5T. Even so, it <strong>already delivers performance comparable to Granite 3.3 2B Instruct, which was trained on 12T tokens<\/strong>, while cutting memory requirements by about 72% for 16 concurrent sessions with 128K-token context windows; its final performance is expected to be comparable to Granite 3.3 8B Instruct.<\/p>\n<p>The Granite 4.0 Tiny Preview has 7B total parameters with 1B active parameters, and is built on the hybrid Mamba-2 \/ Transformer architecture adopted across the entire Granite 4.0 family, combining the speed of the former with the accuracy of the latter to lower memory consumption without a significant loss of performance.<\/p>\n<p>The Granite 4.0 Tiny preview is now available on Hugging Face under the standard Apache 2.0 license, and the Tiny, Small, and Medium versions of the Granite 4.0 model family will be <strong>officially launched by IBM this summer<\/strong>.<\/p>","protected":false},"excerpt":{"rendered":"<p>May 10 news: IBM on May 2 introduced a preview of one of the smallest versions of its Granite 4.0 family of models: Granite 4.0 Tiny. The advantages of the Granite 4.0 Tiny Preview are its high computational efficiency and low memory requirements: at FP8 precision, running five concurrent sessions with 128K-token context windows requires only 12GB of video memory, which an NVIDIA GeForce RTX 3060 12GB consumer graphics card with a suggested retail price of $329 (note: about RMB 2,383 at current exchange rates) can provide. Granite 4.0 Tiny is planned to be trained on at least 15T tokens.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[4705,373,6552,216],"collection":[],"class_list":["post-34921","post","type-post","status-publish","format-standard","hentry","category-news","tag-granite","tag-ibm","tag-6552","tag-216"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/34921","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=34921"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/34921\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=34921"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=34921"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=34921"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=34921"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}