July 11 - Technology media outlet NeoWin published a blog post yesterday, July 10, reporting that Microsoft has launched Phi-4-mini-flash-reasoning, a small language model focused on improving the math and logical reasoning capabilities of on-device AI models.

The main advantage of Phi-4-mini-flash-reasoning is that it brings advanced reasoning capabilities to resource-constrained scenarios such as edge devices, mobile applications, and embedded systems.
In terms of architecture, Phi-4-mini-flash-reasoning introduces the new SambaY architecture. One of its highlights is a component called the Gated Memory Unit (GMU), which efficiently shares information between the model's internal layers and thereby improves inference efficiency.
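To make the idea concrete, here is a minimal sketch of what an element-wise gated memory unit could look like in PyTorch. This is an illustrative assumption, not Microsoft's published GMU implementation: the sigmoid gating, the projection layout, and the dimensions are all placeholders.

```python
import torch
import torch.nn as nn

class GatedMemoryUnit(nn.Module):
    """Illustrative sketch: the current hidden state computes an
    element-wise gate over a memory tensor shared from an earlier
    layer, so later layers can cheaply reuse that memory instead
    of recomputing attention over the full context."""

    def __init__(self, d_model: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, d_model, bias=False)
        self.out_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, hidden: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # hidden: this layer's activations, shape (batch, seq, d_model)
        # memory: representations shared from an earlier layer, same shape
        gate = torch.sigmoid(self.gate_proj(hidden))  # gate values in (0, 1)
        return self.out_proj(gate * memory)           # element-wise gating

# Smoke test with made-up sizes
gmu = GatedMemoryUnit(d_model=64)
hidden = torch.randn(2, 16, 64)
memory = torch.randn(2, 16, 64)
print(gmu(hidden, memory).shape)  # torch.Size([2, 16, 64])
```

The intuition behind this kind of design is that once one layer has computed a memory over a long context, gated reuse in later layers avoids repeating that expensive computation, which is where speedups on long inputs would come from.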
These improvements allow the model to generate answers and complete tasks faster, even when faced with very long inputs, and they also let it process large amounts of data and understand very long texts or conversations.
In terms of performance, Phi-4-mini-flash-reasoning delivers up to 10 times the throughput of other Phi models: in the same amount of time it can process ten times as many requests or generate ten times as much text, a substantial gain for real-world applications. In addition, latency drops to between one-half and one-third of that of other Phi models.
The new Phi-4-mini-flash-reasoning model is available on Azure AI Foundry, the NVIDIA API Catalog, and Hugging Face.
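For readers who want to experiment, models published on Hugging Face can typically be loaded with the transformers library. The snippet below is a sketch under two assumptions: that the repository ID is microsoft/Phi-4-mini-flash-reasoning and that the checkpoint ships a standard chat template; both should be verified on the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID assumed from the announcement; confirm on Hugging Face.
model_id = "microsoft/Phi-4-mini-flash-reasoning"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",       # keep the checkpoint's native precision
    device_map="auto",        # place weights on GPU when available
    trust_remote_code=True,   # custom architectures often require this
)

# A simple math prompt, formatted with the model's chat template
messages = [{"role": "user", "content": "What is the sum of the first 20 odd numbers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```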