{"id":2705,"date":"2024-01-09T09:25:23","date_gmt":"2024-01-09T01:25:23","guid":{"rendered":"https:\/\/www.1ai.net\/?p=2705"},"modified":"2024-01-09T09:25:23","modified_gmt":"2024-01-09T01:25:23","slug":"%e6%96%b0ai%e5%9b%be%e5%83%8f%e5%88%86%e5%89%b2%e6%96%b9%e6%b3%95gensam%ef%bc%9a%e4%b8%80%e4%b8%aa%e6%8f%90%e7%a4%ba%e5%ae%9e%e7%8e%b0%e6%89%b9%e9%87%8f%e5%9b%be%e7%89%87%e5%88%86%e5%89%b2","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/2705.html","title":{"rendered":"New AI image segmentation method GenSAM: one prompt for batch image segmentation"},"content":{"rendered":"<p>Recently, researchers proposed a new image segmentation model called Generalizable SAM (<a href=\"https:\/\/www.1ai.net\/en\/tag\/gensam\" title=\"_Other Organiser\" target=\"_blank\" >GenSAM<\/a>). The model is designed to segment target objects in images from a single generic task description, eliminating the reliance on sample-specific prompts. Given a task description such as &quot;camouflaged sample segmentation&quot;, the model must accurately segment the camouflaged animals in each image based on that description alone, without manually supplied prompts for individual images.<\/p>\n<p class=\"article-content__img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-2706\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/01\/6384033018159870244646112.jpg\" alt=\"\" width=\"768\" height=\"522\" \/><\/p>\n<p>To solve this problem, GenSAM introduces two frameworks: Cross-modal Chains of Thought Prompting (CCTP) and Progressive Mask Generation (PMG). CCTP maps the task's shared text prompt to every image under that task, generating a personalized consensus heatmap of the object of interest and its background, which yields reliable visual prompts to guide segmentation. 
To adapt at test time, the PMG framework iteratively applies the generated heatmap as a weighting over the original image, guiding the model from coarse to fine toward likely target regions.<\/p>\n<p>Experiments show that GenSAM outperforms both baseline and weakly supervised methods on the camouflaged sample segmentation task and generalizes well. The model marks an important step toward the practical application of prompt-based segmentation methods such as SAM.<\/p>\n<p>The key innovation of this research is that, given a single generic task description, GenSAM can batch-process the unlabeled images of all relevant tasks without manually supplied prompts for each image, making the model more efficient and scalable on large amounts of data.<\/p>\n<p>In the future, GenSAM's approach may offer new ideas and solutions for image segmentation tasks in other fields. The researchers hope that this task-description-guided segmentation method will advance computer vision and improve segmentation accuracy in complex scenes.<\/p>\n<ul>\n<li>Paper link: https:\/\/arxiv.org\/pdf\/2312.07374.pdf<\/li>\n<li>Project link: https:\/\/lwpyh.github.io\/GenSAM\/<\/li>\n<\/ul>","protected":false},"excerpt":{"rendered":"<p>Recently, researchers have proposed a novel image segmentation method called Generalizable SAM (GenSAM) model. The design goal of this model is to achieve targeted segmentation of images through generalized task descriptions, free from the reliance on sample-specific cues. In a specific task, given a task description, e.g., \"camouflage sample segmentation\", the model needs to accurately segment camouflaged animals in an image based on the task description, without relying on manually providing specific cues for each image. 
To address this problem, the GenSAM model introduces Cross-modal Chains of Thought Prompting (CCTP) thought chains and Progressive Mask Generat<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[875],"collection":[],"class_list":["post-2705","post","type-post","status-publish","format-standard","hentry","category-news","tag-gensam"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/2705","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=2705"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/2705\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=2705"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=2705"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=2705"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=2705"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}