So-called text-to-image generators such as DALL-E 2 or Google's Imagen are currently in vogue. They take text as input, whether keywords or whole sentences, and use artificial intelligence to generate an image from it. Another generator, Stable Diffusion, is currently doing the rounds because it produces good results and its basic version is free. We tried it and show you what you need to do:
Here’s How To Try Stable Diffusion For Yourself
The easiest method is probably to use the generator in the browser via DreamStudio. To do that, you just go to https://beta.dreamstudio.ai/ and create an account or log in. A short tutorial then explains how the interface works.
I recommend reading through the quick guide beforehand so you don't waste any of your free images. They are limited to about 150 standard-sized images, but that's enough to try it out. If that's not enough for you, you can use a paid account or run Stable Diffusion on your own hardware.
The easiest way to do this is with the NMKD Stable Diffusion GUI, developed by a Reddit user, which you can download from its website. Then extract the files into a folder, but not into a protected folder and not into your Program Files directory.
Then run the StableDiffusionGui.exe file on Windows and follow the program’s instructions. Please note that the program currently only works with an Nvidia graphics card with at least 10 gigabytes of memory. It will not work on AMD cards in this version.
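If you'd rather run Stable Diffusion locally from a script than through the NMKD GUI, here's a minimal sketch using Hugging Face's diffusers library. This is an alternative route, not the tool described above; the model ID, parameter values, and file names are illustrative assumptions, and you need to accept the model licence on huggingface.co first. As with the GUI, an Nvidia graphics card is required.

```python
def make_image(prompt, steps=50, guidance=7.5, out="out.png"):
    """Generate one image from a text prompt with Stable Diffusion."""
    # Heavy imports stay inside the function so the file can be read
    # on machines without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",  # assumed model ID; licence must be accepted
        torch_dtype=torch.float16,        # half precision to save GPU memory
    )
    pipe = pipe.to("cuda")                # needs an Nvidia GPU, as noted above
    image = pipe(prompt,
                 num_inference_steps=steps,   # more steps: generally better results
                 guidance_scale=guidance      # higher: sticks closer to the prompt
                 ).images[0]
    image.save(out)
    return out

if __name__ == "__main__":
    make_image("a knight with a sword, oil painting, highly detailed")
```

The `steps` and `guidance` parameters correspond to the sliders DreamStudio exposes in the browser, so you can experiment with the same knobs locally.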
Is it really that easy to make art?
This Twitter user, for example, shows what's possible with Stable Diffusion by sending cats on a time journey through human history:
You can find many more examples in the Lexica library. It shows images created with Stable Diffusion along with the "prompts" that were used, that is, the text that was entered to create each image.
Many of the images look impressive, some even realistic. Of course, we wondered whether it really is that easy to create beautiful images with the text-to-image generator. As editors, for example, it would save us searching for images for our articles, since we could generate them simply by entering text.
For our test run, we used Stable Diffusion via DreamStudio. You can also run the AI on your own computer, but for that you need an Nvidia graphics card with at least 10 gigabytes of memory. Since I currently only have my ultrabook with an integrated Intel Iris Xe, I tested the browser version.
Basically, you use the generator here just as you would on your own PC. The difference is that the service is not completely free. After logging in to DreamStudio for the first time, you are given a quota of approximately 150 image generations; you have to pay for more. If the AI runs on your own hardware, you can use it for free.
The DreamStudio version lets you set how high and wide the image should be, and how closely the result should stick to your input, so you can give the AI some "artistic freedom". You also set the number of steps in which the image is generated; more steps generally lead to a better result.
In addition, you set how many images should be generated at the same time and which sampler should be used. There is also a "Prompt Guide" for beginners like me, which provides tips and instructions on which prompts work well with the generator.
That’s what came out for us
For the test I had a total of about 100 images generated and I don’t want to withhold the results from you. The first images I made were just useless. But that was mostly because I intuitively entered prompts like “man with a ball” or “woman in the park”.
After reading the Prompt Guide and trying out some of Lexica's prompts, the results were nicer. Some of the images were even really good, like this small selection:
I have the impression that Stable Diffusion works best when the software doesn’t have to display faces of people or animals. There are often inconsistencies here. Other images look good at first glance, but have obvious flaws, such as the three-eared panda or the long-necked deer with slightly odd antlers:
The electronics store also looks good at first glance, but the details are lacking. There were no absolute failures in my test. With fairly reasonable input, images always came out that were at least presentable.
However, the AI has problems with proportions, especially on human faces, and faces often come out simply smeared, as in these images:
It also happened once or twice that the image was completely blurry, just not sharp. Of course, this is especially annoying if you’re paying for the service.
In any case, after a first test of Stable Diffusion with about 100 images, I can say that the AI is not yet ready to generate realistic images at the touch of a button. The quality of the images also varies widely: sometimes I was amazed at how good an image looked, other times I wondered what kind of monster I had created.
Overall, I love what the AI can do. Especially considering that the input from me and others who use the AI is often minimal. The AI must not only understand what words I’m typing there, but also put them into context.
If I want an image of a "knight with a sword" from the AI, it needs to know what a knight is, what a sword is, and also what a knight usually does with a sword. For example, is he holding it in his hand or carrying it in its scabbard? The AI seems quite capable of dealing with such information.
Since Stable Diffusion is free, you should definitely give the generator a try if you’re interested in the subject!