{"id":17063,"date":"2024-08-03T09:01:33","date_gmt":"2024-08-03T01:01:33","guid":{"rendered":"https:\/\/www.1ai.net\/?p=17063"},"modified":"2024-08-03T09:01:33","modified_gmt":"2024-08-03T01:01:33","slug":"ai%e5%9b%be%e5%83%8f%e7%94%9f%e6%88%90%e8%bf%8e%e6%9d%a5%e6%96%b0%e9%9c%b8%e4%b8%bb%ef%bc%81%e5%bc%80%e6%ba%90%e6%a8%a1%e5%9e%8bflux-1%e6%a8%aa%e7%a9%ba%e5%87%ba%e4%b8%96%ef%bc%8cmidjourney%e3%80%81da","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/17063.html","title":{"rendered":"AI image generation has a new leader! The open source model FLUX.1 has been released. Are Midjourney and DALL\u00b7E 3 nervous?"},"content":{"rendered":"<p data-pm-slice=\"0 0 []\">In artificial intelligence, disruptive change can arrive on any given day. Just one day after Midjourney shipped a major update, the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90\" title=\"[View articles tagged with [open source]]\" target=\"_blank\" >open source<\/a> image generation field gained an eye-catching dark horse: <a href=\"https:\/\/www.1ai.net\/en\/tag\/flux\" title=\"_OTHER ORGANISER\" target=\"_blank\" >FLUX<\/a>.1. This unexpected newcomer claims not only to substantially outperform closed-source models such as DALL\u00b7E 3 and Midjourney V6, but also to beat the open-source SD3 series across the board, and it caused an immediate stir in the AI community.<\/p>\n<p data-track=\"23\">Let us first meet the person behind FLUX.1. Its founder, Robin Rombach, is no newcomer but an authority on diffusion models. His representative works include VQGAN, Taming Transformers, and Latent Diffusion. He served as chief scientist of Stability AI and led the world-renowned Stable Diffusion series of projects.
In short, Robin Rombach is a veteran among veterans in the field of <a href=\"https:\/\/www.1ai.net\/en\/tag\/ai%e5%9b%be%e5%83%8f\" title=\"[View articles tagged with [AI images]]\" target=\"_blank\" >AI image<\/a> generation.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-17064\" title=\"get-49\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/get-49.jpg\" alt=\"get-49\" width=\"777\" height=\"639\" \/><\/div>\n<p data-track=\"24\">In March this year, Robin left Stability AI amid internal turmoil. After four months of development, he returned with FLUX.1, a new family of open source large models. Even more striking, the project launched with a $32 million seed round led by the well-known venture capital firm Andreessen Horowitz, which gives the future development of FLUX.1 a strong boost.<\/p>\n<p data-track=\"25\">So what is special about FLUX.1? It is built on a transformer architecture, is trained with flow matching, and uses rotary positional embeddings and parallel attention layers to improve model performance and hardware efficiency.
This 12-billion-parameter model has been released in three versions:<\/p>\n<ul>\n<li data-track=\"26\"><strong>Pro version:<\/strong> Available only through an API; it offers the strongest performance.<\/li>\n<li data-track=\"27\"><strong>Dev version:<\/strong> A guidance-distilled model for non-commercial use that retains most of the performance of the Pro version.<\/li>\n<li data-track=\"28\"><strong>Schnell version:<\/strong> An open source model that can be used commercially and still performs very well.<\/li>\n<\/ul>\n<p data-track=\"29\">According to test data from the FLUX.1 team, even the open source Schnell version surpasses mainstream models such as Midjourney v6.0, DALL\u00b7E 3 (HD) and SD3-Ultra in prompt fidelity, image quality, motion consistency, coherence and diversity. FLUX.1 shows a particularly clear advantage in rendering text within images.<\/p>\n<p data-track=\"43\">Of course, the ambitions behind FLUX.1 do not stop there. The team said that text-to-image generation is just the beginning, and that it also plans to launch text-to-video models to challenge leading products such as Sora, Gen-3, and Luma.<\/p>\n<p data-track=\"44\">For developers and AI enthusiasts, the arrival of FLUX.1 is clearly good news. The Schnell version is fully open source and is already supported by ComfyUI. If you have more than 36 GB of video memory, you can even run the fp16 version of the T5 text encoder. Note, however, that the text encoder files (t5xxl_fp16.safetensors and clip_l.safetensors) and the VAE must be downloaded separately.<\/p>\n<p data-track=\"45\">The arrival of FLUX.1 not only brings new hope to open source AI image generation, but also injects new vitality into the AI industry as a whole. Its strong performance and open source availability are likely to accelerate the adoption of, and innovation in, AI image generation technology.
For ordinary users, this means we may soon be able to run AI image generation models on home computers that match or even surpass Midjourney.<\/p>\n<p data-track=\"46\">Project repository: https:\/\/github.com\/black-forest-labs\/flux<\/p>\n<p data-track=\"47\">Try it online: https:\/\/replicate.com\/black-forest-labs\/flux-pro<\/p>\n<p data-track=\"48\">ComfyUI workflow: https:\/\/comfyanonymous.github.io\/ComfyUI_examples\/flux\/<\/p>\n<p>&nbsp;<\/p>","protected":false},"excerpt":{"rendered":"<p>In artificial intelligence, disruptive change can arrive on any given day. Just one day after Midjourney shipped a major update, an eye-catching dark horse, FLUX.1, arrived in open-source image generation. This unexpected newcomer claims not only to substantially outperform closed-source models such as DALL\u00b7E 3 and Midjourney V6, but also to beat the open-source SD3 series across the board, and it caused an immediate stir in the AI community. Let us start with the person behind FLUX.1. Its founder, Robin Rombach, is no newcomer but an authority on diffusion models.
His work includes VQGAN, Taming Transformers and Latent Diffu<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[148,146],"tags":[524,3853,219],"collection":[],"class_list":["post-17063","post","type-post","status-publish","format-standard","hentry","category-headline","category-news","tag-ai","tag-flux","tag-219"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17063","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=17063"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17063\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=17063"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=17063"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=17063"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=17063"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}