Stability.ai is making Stable Diffusion, its own AI model for generating images from text, publicly available. What sets Stable Diffusion apart is that the model requires less than 10 GB of video memory and runs on consumer graphics cards, producing a 512×512-pixel image from a text prompt in a few seconds.
“This will allow both researchers and soon the public to run this under a range of conditions, democratizing image generation. We look forward to the open ecosystem that will emerge around this and further models to truly explore the boundaries of latent space,” say the developers.
The model draws on insights from earlier work, including DALL-E 2 by OpenAI and Imagen by Google Brain, among others. It has been tried out by more than 10,000 beta testers, who generate 1.7 million images per day. The developers warn that because the model was trained on a large number of image-text pairs drawn from a broad sample of the Internet, it may reproduce some of the biases present in that data. The company encourages everyone to take part in discussing these biases.
“We hope that everyone will use it ethically, morally, and legally, and contribute to the community and the discourse around it,” the developers note.
You can join the Stable Diffusion community on Discord and learn more about the model on Hugging Face.