On April 27, Step Star (StepFun) announced the open-source release of Step1X-Edit, an image editing large model that reaches open-source SOTA performance. The model has 19B total parameters (7B MLLM + 12B DiT) and offers three key capabilities: precise semantic parsing, identity consistency preservation, and high-precision region-level control. It supports 11 high-frequency image editing task types, such as text replacement, style transfer, material transformation, and portrait retouching.

1AI has attached the open-source links below:
- GitHub: https://github.com/stepfun-ai/Step1X-Edit
- HuggingFace: https://huggingface.co/stepfun-ai/Step1X-Edit
- ModelScope: https://www.modelscope.cn/models/stepfun-ai/Step1X-Edit/summary
- Technical Report: https://arxiv.org/pdf/2504.17761

According to the official announcement, Step1X-Edit offers the following core capabilities for natural-language image editing tasks:
- Precise semantic parsing: Supports complex combinations of instructions described in natural language, with no templates required, so it can flexibly handle multi-round and multi-task editing. It also supports recognizing, replacing, and reconstructing text within images;
- Identity consistency preservation: After editing, it stably retains facial, pose, and identity features, making it suitable for high-consistency scenarios such as avatars, e-commerce models, and social media images;
- High-precision region-level control: Supports targeted editing of text, material, color, and other attributes within a specified region, keeping the image style unified while allowing finer control.