{"id":8617,"date":"2024-04-21T10:18:09","date_gmt":"2024-04-21T02:18:09","guid":{"rendered":"https:\/\/www.1ai.net\/?p=8617"},"modified":"2024-04-21T10:18:09","modified_gmt":"2024-04-21T02:18:09","slug":"ai%e5%9b%be%e7%89%87%e5%b7%a5%e5%85%b7instantid%e6%95%99%e7%a8%8b%ef%bc%8c%e4%b8%80%e5%bc%a0%e7%85%a7%e7%89%87%e7%a7%92%e7%94%9f%e4%b8%8d%e5%90%8c%e9%a3%8e%e6%a0%bc%e7%9a%84%e5%9b%be%e7%89%87","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/8617.html","title":{"rendered":"AI photo tool InstantID tutorial, one photo can generate pictures of different styles in seconds"},"content":{"rendered":"<p data-track=\"271\" data-pm-slice=\"0 0 []\">AI portrait generation has recently become very popular. This article introduces <a href=\"https:\/\/www.1ai.net\/en\/tag\/instantid\" title=\"[See articles with [InstantID] labels]\" target=\"_blank\" >InstantID<\/a>, which achieves personalized image synthesis from a single reference face image while preserving identity with high fidelity, and supports a variety of styles.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8618\" title=\"get-662\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-662.jpg\" alt=\"get-662\" width=\"1080\" height=\"690\" \/><\/div>\n<blockquote class=\"pgc-blockquote-abstract\">\n<p style=\"text-align: left;\" data-track=\"219\"><strong>Project homepage<\/strong>: https:\/\/instantid.github.io\/<\/p>\n<p style=\"text-align: left;\" data-track=\"220\"><strong>Code<\/strong>: https:\/\/github.com\/InstantID\/InstantID<\/p>\n<p style=\"text-align: left;\" data-track=\"221\"><strong>Online demo<\/strong>: https:\/\/huggingface.co\/spaces\/InstantX\/InstantID<\/p>\n<\/blockquote>\n<p data-track=\"222\"><strong>1. 
Introduction to InstantID<\/strong><\/p>\n<p data-track=\"223\">InstantID was presented in the paper \u201cInstantID: Zero-shot Identity-Preserving Generation in Seconds\u201d.<\/p>\n<p data-track=\"224\">InstantID is a solution based on diffusion models. Its plug-and-play module handles image personalization in various styles using only a single face image while maintaining high fidelity. At its core is a novel IdentityNet that combines a face image and a landmark image with text prompts, guiding image generation by imposing semantic and weak spatial conditions.<\/p>\n<p data-track=\"225\">Given only one reference ID image, InstantID aims to generate customized images in various poses or styles while ensuring high fidelity. It consists of three key components:<\/p>\n<p data-track=\"226\">(1) An ID embedding that captures semantic face information;<\/p>\n<p data-track=\"227\">(2) A lightweight adapter module with decoupled cross-attention, which allows an image to be used as a visual prompt;<\/p>\n<p data-track=\"228\">(3) IdentityNet, which encodes detailed features of the reference face image with additional spatial control.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8619\" title=\"get-663\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-663.jpg\" alt=\"get-663\" width=\"1080\" height=\"664\" \/><\/div>\n<p data-track=\"229\"><strong>2. 
InstantID Features<\/strong><\/p>\n<p data-track=\"230\"><strong>Feature 1: Generate images in any style from a single face<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8620\" title=\"get-664\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-664.jpg\" alt=\"get-664\" width=\"1080\" height=\"607\" \/><\/div>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8622\" title=\"get-666\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-666.jpg\" alt=\"get-666\" width=\"1080\" height=\"615\" \/><\/div>\n<p data-track=\"231\"><strong>Feature 2: Editability<\/strong><\/p>\n<p data-track=\"232\">You can edit the generated images through text prompts, for example by changing the subject\u2019s expression, the background, or other elements of the image. You can also use ControlNet to control the details of image generation more precisely and customize the result.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8621\" title=\"get-665\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-665.jpg\" alt=\"get-665\" width=\"1080\" height=\"596\" \/><\/div>\n<p data-track=\"233\"><strong>Feature 3: Multiple references<\/strong><\/p>\n<p data-track=\"234\">Multiple reference images can be used to generate a new image, which enhances the richness and diversity of the results.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8623\" title=\"get-667\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-667.jpg\" alt=\"get-667\" width=\"1080\" height=\"392\" \/><\/div>\n<p data-track=\"235\">For multiple reference images, the average of their ID embeddings is used as the image prompt. 
InstantID achieves good results even with only one reference image.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8624\" title=\"get-668\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-668.jpg\" alt=\"get-668\" width=\"1080\" height=\"473\" \/><\/div>\n<p data-track=\"236\">InstantID is also flexible enough to apply identity attributes to non-human characters.<\/p>\n<p data-track=\"237\"><strong>3. Comparison between InstantID and similar tools<\/strong><\/p>\n<p data-track=\"238\"><strong>Comparison 1: InstantID vs. IP-Adapter\/IP-Adapter-FaceID\/PhotoMaker<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8625\" title=\"get-669\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-669.jpg\" alt=\"get-669\" width=\"1080\" height=\"623\" \/><\/div>\n<p data-track=\"239\">Here InstantID is compared with IP-Adapter (IPA), IP-Adapter-FaceID, and the more recent PhotoMaker. Of these, PhotoMaker needs to train the LoRA parameters of the UNet. Both PhotoMaker and IP-Adapter-FaceID achieve good fidelity, but their text control degrades noticeably. In contrast, InstantID achieves better fidelity while retaining good text editability (faces and styles blend better).<\/p>\n<p data-track=\"240\"><strong>Comparison 2: InstantID vs. LoRA<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8626\" title=\"get-670\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-670.jpg\" alt=\"get-670\" width=\"1080\" height=\"458\" \/><\/div>\n<p data-track=\"241\">InstantID achieves results competitive with LoRA without any training.<\/p>\n<p data-track=\"242\"><strong>Comparison 3: InstantID vs. 
InsightFace Swapper<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8627\" title=\"get-671\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-671.jpg\" alt=\"get-671\" width=\"1080\" height=\"375\" \/><\/div>\n<p data-track=\"243\">In non-realistic styles, InstantID blends the face and the background more flexibly.<\/p>\n<p data-track=\"244\"><strong>4. Trying InstantID<\/strong><\/p>\n<p data-track=\"245\">Next, let\u2019s try it on the Hugging Face website.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8628\" title=\"get-672\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-672.jpg\" alt=\"get-672\" width=\"1080\" height=\"826\" \/><\/div>\n<p data-track=\"246\">The steps are explained at the top of the page, and the core workflow takes only 4 steps.<\/p>\n<p data-track=\"247\"><strong>[Step 1]: Upload a photo of a person<\/strong><\/p>\n<p data-track=\"248\">For images with multiple people, only the largest face is detected. Make sure the face is not too small and not significantly occluded or blurred.<\/p>\n<p data-track=\"249\">For example, we upload a photo of Fairy Zixia here.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8629\" title=\"get-673\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-673.jpg\" alt=\"get-673\" width=\"721\" height=\"778\" \/><\/div>\n<p data-track=\"250\"><strong>[Step 2]: (Optional) Upload an image of another person as a pose reference<\/strong><\/p>\n<p data-track=\"251\">If none is uploaded, the landmarks are extracted from the first image. 
If a cropped face was used in step 1, uploading a pose reference image is recommended so that a new pose can be extracted.<\/p>\n<p data-track=\"252\"><strong>[Step 3]: Write the prompt<\/strong><\/p>\n<p data-track=\"253\">Prompt: A beautiful woman was sitting on the grass in the park<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8630\" title=\"get-674\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-674.jpg\" alt=\"get-674\" width=\"723\" height=\"177\" \/><\/div>\n<p data-track=\"254\"><strong>[Step 4]: Generate the image<\/strong><\/p>\n<p data-track=\"255\">Choose a style, then click the Submit button to generate an image. Let\u2019s look at the results in different styles.<\/p>\n<p data-track=\"256\"><strong>Style 1: Watercolor<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8631\" title=\"get-675\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-675.jpg\" alt=\"get-675\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"257\"><strong>Style 2: Film Noir (black and white film)<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8632\" title=\"get-676\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-676.jpg\" alt=\"get-676\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"258\"><strong>Style 3: Neon<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8633\" title=\"get-677\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-677.jpg\" alt=\"get-677\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"259\"><strong>Style 4: Jungle<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8636\" title=\"get-680\" 
src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-680.jpg\" alt=\"get-680\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"260\"><strong>Style 5: Mars<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8634\" title=\"get-678\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-678.jpg\" alt=\"get-678\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"261\"><strong>Style 6: Vibrant Color<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8635\" title=\"get-679\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-679.jpg\" alt=\"get-679\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"262\"><strong>Style 7: Snow<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8637\" title=\"get-681\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-681.jpg\" alt=\"get-681\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"263\"><strong>Style 8: Line art<\/strong><\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8639\" title=\"get-683\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-683.jpg\" alt=\"get-683\" width=\"832\" height=\"1280\" \/><\/div>\n<p data-track=\"264\">Judging from the generated images, the character stays consistent across styles and closely resembles the original photo.<\/p>\n<p data-track=\"265\"><strong>5. 
Related Notes<\/strong><\/p>\n<p data-track=\"266\">(1) If you are not satisfied with the similarity, increase the weights of controlnet_conditioning_scale (IdentityNet) and ip_adapter_scale (Adapter) as appropriate.<\/p>\n<div class=\"pgc-img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-8638\" title=\"get-682\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/04\/get-682.jpg\" alt=\"get-682\" width=\"706\" height=\"342\" \/><\/div>\n<p data-track=\"267\">(2) If the generated image is oversaturated, reduce the weight of ip_adapter_scale. If that does not work, reduce the weight of controlnet_conditioning_scale.<\/p>\n<p data-track=\"268\">(3) If the result does not follow the text prompt as expected, reduce the weight of ip_adapter_scale.<\/p>\n<p data-track=\"269\">(4) It is important to choose a good base model.<\/p>\n<p data-track=\"270\">That\u2019s all for today. If you are interested, give it a try.<\/p>","protected":false},"excerpt":{"rendered":"<p>AI portrait generation has recently become very popular. This article introduces InstantID, which achieves personalized image synthesis from a single reference face image while preserving identity with high fidelity, and supports a variety of styles. Project homepage: https:\/\/instantid.github.io\/ Code: https:\/\/github.com\/InstantID\/InstantID Online demo: https:\/\/huggingface.co\/spaces\/InstantX\/InstantID 1. 
Introduction to InstantID<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[144],"tags":[338,1221],"collection":[],"class_list":["post-8617","post","type-post","status-publish","format-standard","hentry","category-baike","tag-ai","tag-instantid"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/8617","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=8617"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/8617\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=8617"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=8617"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=8617"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=8617"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}