Stable Diffusion 3.5: Enhanced Prompt Response and Increased Diversity in Character Generation

Stability AI has released Stable Diffusion 3.5, the latest version of its open-source alternative to AI image generators such as Midjourney and DALL-E. The update responds to criticism of the earlier Stable Diffusion 3 Medium, which was widely panned. Stability AI claims that the 3.5 models offer improved prompt adherence and compete with larger models in image quality. They are also designed to produce a diverse range of styles, skin tones, and features without extensive prompting.

The new model is available in three versions:

1. Stable Diffusion 3.5 Large: This is the most powerful variant. According to Stability AI, it delivers the highest quality of the three, leads the industry in prompt adherence, and is suitable for professional use at 1 megapixel resolution.

2. Stable Diffusion 3.5 Large Turbo: A streamlined version of the Large model, this variant prioritizes efficiency while still generating high-quality images with strong prompt adherence in just four sampling steps.

3. Stable Diffusion 3.5 Medium: Designed for consumer hardware, this model balances quality and accessibility and can generate images between 0.25 and 2 megapixels. Unlike the first two models, which are available now, Medium will not be released until October 29.
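
To make the megapixel figures above concrete, the following minimal sketch converts a pixel budget and aspect ratio into a width and height. It is an illustration, not Stability AI's code: the function name `dimensions_for`, the choice of 1024 x 1024 as "1 megapixel", and the rounding to multiples of 64 are all assumptions made here.

```python
import math
from typing import Tuple

# Range quoted in the article for Stable Diffusion 3.5 Medium.
MIN_MEGAPIXELS = 0.25
MAX_MEGAPIXELS = 2.0


def dimensions_for(megapixels: float,
                   aspect_ratio: float = 1.0,
                   multiple: int = 64) -> Tuple[int, int]:
    """Return a (width, height) pair close to the requested pixel budget.

    Assumptions (not from the article): one megapixel is treated as
    1024 * 1024 pixels, and both sides are snapped to a multiple of
    `multiple`, a block size chosen only for illustration.
    """
    if not MIN_MEGAPIXELS <= megapixels <= MAX_MEGAPIXELS:
        raise ValueError("megapixels must be between 0.25 and 2")
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")

    total_pixels = megapixels * 1024 * 1024
    # width / height == aspect_ratio and width * height == total_pixels
    height = math.sqrt(total_pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one block.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for mp in (0.25, 1.0, 2.0):
        print(mp, "MP ->", dimensions_for(mp))
    print("1 MP at 16:9 ->", dimensions_for(1.0, 16 / 9))
```

Under these assumptions, the 1 megapixel figure quoted for the Large model corresponds to a 1024 x 1024 square image, and the lower bound quoted for Medium corresponds to 512 x 512.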

The 3.5 release follows the troubled launch of Stable Diffusion 3 Medium in June, when the model produced grotesque images in response to straightforward prompts. Stability AI acknowledged that the earlier version “didn’t fully meet our standards or our communities’ expectations,” which helps explain the strong focus on prompt adherence in the current release.

Moreover, the 3.5 series includes new filters that aim to better represent human diversity, showing a range of skin tones and features without extensive prompting. The change comes after earlier missteps in representation elsewhere in the industry, such as Google’s controversy earlier this year, when its Gemini model generated historically inaccurate images. The backlash from that incident led Google to suspend the generation of images of people for six months.

With these improvements, we hope Stable Diffusion 3.5 can effectively capture the nuances of human diversity and historical contexts in its outputs.
