{"id":18441,"date":"2024-08-22T09:19:12","date_gmt":"2024-08-22T01:19:12","guid":{"rendered":"https:\/\/www.1ai.net\/?p=18441"},"modified":"2024-08-22T09:19:12","modified_gmt":"2024-08-22T01:19:12","slug":"%e5%be%ae%e8%bd%af%e5%8f%91%e5%b8%83-phi-3-5-%e7%b3%bb%e5%88%97-ai-%e6%a8%a1%e5%9e%8b%ef%bc%9a%e4%b8%8a%e4%b8%8b%e6%96%87%e7%aa%97%e5%8f%a3-128k%ef%bc%8c%e9%a6%96%e6%ac%a1%e5%bc%95%e5%85%a5%e6%b7%b7","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/18441.html","title":{"rendered":"Microsoft releases Phi-3.5 series AI models: 128K context window, first introduction of a Mixture-of-Experts model"},"content":{"rendered":"<p><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%be%ae%e8%bd%af\" title=\"[View articles tagged with [Microsoft]]\" target=\"_blank\" >Microsoft<\/a> has released the Phi-3.5 series of <a href=\"https:\/\/www.1ai.net\/en\/tag\/ai%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [AI models]]\" target=\"_blank\" >AI models<\/a>. <strong>The most notable of these is Phi-3.5-MoE, the first Mixture-of-Experts (MoE) model in the series.<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-18442\" title=\"e59523e4j00silib9000fd000l400bwm\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/e59523e4j00silib9000fd000l400bwm.jpg\" alt=\"e59523e4j00silib9000fd000l400bwm\" width=\"760\" height=\"428\" \/><\/p>\n<p>The newly released Phi-3.5 series includes three lightweight AI models: Phi-3.5-MoE, Phi-3.5-vision, and Phi-3.5-mini. All three were trained on synthetic data and filtered public web data, support a 128K context window, and are now available on Hugging Face under the MIT license. IT Home has attached the relevant descriptions below:<\/p>\n<p>Phi-3.5-MoE: the first Mixture-of-Experts model<\/p>\n<p>Phi-3.5-MoE is the first model in the Phi family to use the Mixture-of-Experts (MoE) technique. 
It comprises 16 experts of 3.8B parameters each and activates 2 experts per token, for only 6.6 billion active parameters, and was trained on 4.9T tokens using 512 H100 GPUs.<\/p>\n<p>The Microsoft research team designed the model from scratch to further improve its performance. On standard AI benchmarks, Phi-3.5-MoE outperforms Llama-3.1 8B, Gemma-2-9B, and Gemini-1.5-Flash, and comes close to the current leader, GPT-4o-mini.<\/p>\n<p>Phi-3.5-vision: enhanced multi-frame image understanding<\/p>\n<p>With a total of 4.2 billion parameters, Phi-3.5-vision was trained on 500B tokens using 256 A100 GPUs and now supports multi-frame image understanding and reasoning.<\/p>\n<p>Phi-3.5-vision has improved performance on MMMU (from 40.2 to 43.0), MMBench (from 80.5 to 81.9), and the document understanding benchmark TextVQA (from 70.9 to 72.0).<\/p>\n<p>Phi-3.5-mini: lightweight yet capable<\/p>\n<p>Phi-3.5-mini is a 3.8 billion parameter model that surpasses Llama-3.1 8B and Mistral 7B, and even rivals Mistral NeMo 12B.<\/p>\n<p>The model was trained on 3.4T tokens using 512 H100 GPUs. With only 3.8B parameters, it is competitive on multilingual tasks with LLMs that have many more parameters.<\/p>\n<p>In addition, Phi-3.5-mini now supports a 128K context window, while its main competitor, the Gemma-2 series, supports only 8K.<\/p>","protected":false},"excerpt":{"rendered":"<p>Microsoft has released the Phi-3.5 series of AI models, the most noteworthy of which is Phi-3.5-MoE, the first Mixture-of-Experts (MoE) model in the series. The newly published Phi-3.5 series includes three lightweight AI models, Phi-3.5-MoE, Phi-3.5-vision, and Phi-3.5-mini, trained on synthetic data and filtered public web data with a 128K context window; all models are now available on Hugging Face under the MIT license. 
Phi-3.5-MoE: the first Mixture-of-Experts model<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[167,280],"collection":[],"class_list":["post-18441","post","type-post","status-publish","format-standard","hentry","category-news","tag-ai","tag-280"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/18441","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=18441"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/18441\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=18441"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=18441"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=18441"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=18441"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}