{"id":39280,"date":"2025-07-11T20:44:46","date_gmt":"2025-07-11T12:44:46","guid":{"rendered":"https:\/\/www.1ai.net\/?p=39280"},"modified":"2025-07-11T20:44:46","modified_gmt":"2025-07-11T12:44:46","slug":"%e5%be%ae%e8%bd%af%e5%8f%91%e5%b8%83-phi-4-mini-flash-reasoning-%e7%ab%af%e4%be%a7ai%e6%a8%a1%e5%9e%8b%ef%bc%9a10-%e5%80%8d%e5%90%9e%e5%90%90%e9%87%8f%ef%bc%8c%e6%8e%a8%e7%90%86%e8%83%bd%e5%8a%9b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/39280.html","title":{"rendered":"Microsoft releases Phi-4-mini-flash-reasoning end-side AI model: 10x throughput, upgraded inference capabilities"},"content":{"rendered":"<p>July 11 - Technology media outlet NeoWin published a blog post yesterday (July 10) reporting that <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%be%ae%e8%bd%af\" title=\"View articles tagged with Microsoft\" target=\"_blank\" >Microsoft<\/a> has launched the Phi-4-mini-flash-reasoning small language model, <strong>which focuses on improving the mathematical and logical reasoning capabilities of <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e7%ab%af%e4%be%a7ai%e6%a8%a1%e5%9e%8b\" title=\"View articles tagged with end-side AI model\" target=\"_blank\" >end-side AI models<\/a>.<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-39281\" title=\"d403fa6fj00sz8jde00akd000sg00g0p\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/07\/d403fa6fj00sz8jde00akd000sg00g0p.jpg\" alt=\"d403fa6fj00sz8jde00akd000sg00g0p\" width=\"1024\" height=\"576\" \/><\/p>\n<p>The main advantage of Phi-4-mini-flash-reasoning is that it brings advanced reasoning capabilities to resource-constrained scenarios such as edge devices, mobile applications and embedded systems.<\/p>\n<p>Architecturally, Phi-4-mini-flash-reasoning introduces the new SambaY architecture, one of whose highlights is a component called the Gated Memory Unit (GMU), which shares information efficiently within the model, thus
improving the efficiency of the model.<\/p>\n<p>These improvements allow the model to generate answers and complete tasks faster, even when faced with very long inputs; the model can also process large amounts of data and understand very long texts or conversations.<\/p>\n<p>In terms of performance, Phi-4-mini-flash-reasoning delivers up to 10 times the throughput of other Phi models, meaning it can complete more work in a given amount of time.<\/p>\n<p>It can process 10 times as many requests or generate 10 times as much text in the same amount of time, a significant improvement for real-world applications; in addition, latency is reduced to between one-half and one-third of that of other Phi models.<\/p>\n<p>The new Phi-4-mini-flash-reasoning model is available on Azure AI Foundry, the NVIDIA API Catalog and Hugging Face.<\/p>","protected":false},"excerpt":{"rendered":"<p>July 11 - Technology media outlet NeoWin published a blog post yesterday (July 10) reporting that Microsoft has introduced the Phi-4-mini-flash-reasoning small language model, which focuses on improving the mathematical and logical reasoning capabilities of end-side AI models. The main advantage of Phi-4-mini-flash-reasoning is that it brings advanced reasoning capabilities to resource-constrained scenarios such as edge devices, mobile applications and embedded systems.
In terms of architecture, Phi-4-mini-flash-reasoning introduces the innovative SambaY architecture, which is highlighted by the Gated Memory Unit (GMU).<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[280,7170],"collection":[],"class_list":["post-39280","post","type-post","status-publish","format-standard","hentry","category-news","tag-280","tag-ai"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/39280","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=39280"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/39280\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=39280"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=39280"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=39280"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=39280"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}