{"id":19802,"date":"2024-09-12T09:44:56","date_gmt":"2024-09-12T01:44:56","guid":{"rendered":"https:\/\/www.1ai.net\/?p=19802"},"modified":"2024-09-12T09:44:56","modified_gmt":"2024-09-12T01:44:56","slug":"pixtral-12b-%e5%8f%91%e5%b8%83%ef%bc%9amistral%e5%bc%80%e6%ba%90%e9%a6%96%e4%b8%aa%e5%a4%9a%e6%a8%a1%e6%80%81ai%e5%a4%a7%e6%a8%a1%e5%9e%8b","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/19802.html","title":{"rendered":"Pixtral 12B Released: Mistral Open Sources Its First Multimodal AI Large Model"},"content":{"rendered":"<p>TechCrunch, a technology media outlet, reported yesterday (September 11) that <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%b3%95%e5%9b%bd\" title=\"View articles tagged France\" target=\"_blank\" >French<\/a> AI <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%88%9d%e5%88%9b%e5%85%ac%e5%8f%b8\" title=\"View articles tagged Startups\" target=\"_blank\" >startup<\/a> <a href=\"https:\/\/www.1ai.net\/en\/tag\/mistral\" title=\"View articles tagged Mistral\" target=\"_blank\" >Mistral<\/a> has released Pixtral 12B, <strong>the company's first multimodal AI large model, capable of processing images and text simultaneously.<\/strong><\/p>\n<p>The Pixtral 12B model has 12 billion parameters and is about 24 GB in size. Parameter count roughly corresponds to a model's problem-solving ability, and models with more parameters generally perform better than those with fewer.<\/p>\n<p>The Pixtral 12B model is built on Mistral's text model Nemo 12B and can answer questions about any number of images of arbitrary size.<\/p>\n<p>Like other multimodal models such as Anthropic's Claude series and OpenAI's GPT-4o, Pixtral 12B should in principle be able to perform tasks such as captioning images and counting the objects in a photo.<\/p>\n<p>Users can download, fine-tune, and use the Pixtral 12B model under the Apache 2.0 license.<\/p>\n<p>Pixtral 12B will soon be available for open beta testing 
on Mistral's chatbot and API service platforms, Le Chat and La Plateforme, said Sophia Yang, Mistral's head of developer relations, in a post on X.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19803\" title=\"b1169173j00sjof9k003xd000hb00a4m\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/09\/b1169173j00sjof9k003xd000hb00a4m.jpg\" alt=\"b1169173j00sjof9k003xd000hb00a4m\" width=\"623\" height=\"364\" \/><\/p>\n<p>Pixtral 12B's technical specifications are equally impressive: a 40-layer network, a hidden dimension of 14,336, 32 attention heads, and a dedicated 400M-parameter vision encoder that supports images at up to 1024 x 1024 resolution.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-19804\" title=\"c23e1309j00sjof9l004rd000gs007ym\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/09\/c23e1309j00sjof9l004rd000gs007ym.jpg\" alt=\"c23e1309j00sjof9l004rd000gs007ym\" width=\"604\" height=\"286\" \/><\/p>\n<p>On benchmarks such as MMMU, MathVista, ChartQA, and DocVQA, it outperforms a number of well-known multimodal models, including Phi-3 and Qwen2-7B, demonstrating its strength.<\/p>\n<p>Hugging Face address:<\/p>\n<p>https:\/\/huggingface.co\/mistral-community\/pixtral-12b-240910<\/p>","protected":false},"excerpt":{"rendered":"<p>TechCrunch reported yesterday (September 11) that French AI startup Mistral has released Pixtral 12B, the company's first multimodal AI large model, capable of processing images and text simultaneously. The Pixtral 12B model has 12 billion parameters and is about 24 GB in size; parameter count roughly corresponds to a model's problem-solving ability, with models with more parameters generally performing better than those with fewer. The Pixtral 12B model is based on the text model Nemo 12B and can answer questions about any number of images of arbitrary size. 
It is similar to Anthropic's Claude series and OpenAI's<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[559,309,602,2529],"collection":[],"class_list":["post-19802","post","type-post","status-publish","format-standard","hentry","category-news","tag-mistral","tag-309","tag-602","tag-2529"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/19802","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=19802"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/19802\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=19802"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=19802"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=19802"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=19802"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}