According to Stability AI's official press release, the company recently launched a new-generation text-to-image model. The model is built on the Würstchen architecture and is claimed to be easy to train and fine-tune on consumer-grade hardware.

▲ Image source: Stability AI official press release (the same below)
Stability AI claims that, compared with the SDXL model familiar to the industry, the new Stable Cascade model offers improved performance and output quality. The relevant materials for Stable Cascade have now been made public on its GitHub page, but only non-commercial use is allowed.
IT Home notes that after a user enters a text prompt, Stable Cascade first converts it into a small 24×24 block of compressed data. The model then decodes this compact data into an image and progressively upscales it to high resolution. Because these steps are separated from one another, each part of the model can be given additional training and fine-tuning on its own.
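To give a sense of how small that intermediate representation is, the short Python sketch below computes the compression implied by a 24×24 grid. The 1024×1024 output resolution is an assumption made purely for illustration, since the press release summary does not state one, and the function name `spatial_compression` is hypothetical.

```python
def spatial_compression(image_side: int, latent_side: int) -> float:
    """Return how many times shorter each side of the latent grid is than the image."""
    if image_side <= 0 or latent_side <= 0:
        raise ValueError("sides must be positive")
    return image_side / latent_side


# ASSUMPTION: 1024 px output side, chosen for illustration only.
ASSUMED_IMAGE_SIDE = 1024
# 24 is the intermediate grid size cited in the article.
LATENT_SIDE = 24

per_side = spatial_compression(ASSUMED_IMAGE_SIDE, LATENT_SIDE)
# Ratio of pixel positions in the image to positions in the 24x24 grid.
positions_ratio = per_side ** 2

print(f"per-side compression: {per_side:.1f}x")
print(f"pixel positions per grid position: {positions_ratio:.0f}")
```

Under that assumed resolution, the text-conditioned stage works on a grid far smaller than the final image, which is consistent with the article's point that the later decoding and upscaling steps are handled separately.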

Stability AI said that because Stable Cascade adopts this "modular" design, it can effectively reduce the video memory used for inference, and claims that only 20 GB of video memory is needed to run it.
Stability AI also compared the Stable Cascade model with other industry competitors such as Playground v2, SDXL, SDXL Turbo, and Würstchen v2, claiming that Stable Cascade is "almost always the best performing model" in terms of prompt alignment and generated image details. In terms of inference speed, even though the largest Stable Cascade model has 1.4 billion more parameters than Stable Diffusion XL, it still has a faster inference speed.

On this basis, Stability AI believes that Stable Cascade has a better architectural design and can keep inference fast while maintaining high-quality output.
Images generated by the model:
