Top Free Methods for Creating AI Images Using Stable Diffusion
Have you ever wished you could generate AI images without relying on online tools? Many free online image generators cap the number of outputs and push subscriptions after just a few attempts. Enter Stable Diffusion: a free, open-source AI image generator that lets you create images at home without those limits.
What is Stable Diffusion?
Stable Diffusion is a free, open-source model that turns text descriptions into images. It isn’t a standalone application; rather, it’s the underlying technology that many image-generation apps build on. When it comes to generative AI for image creation, Stable Diffusion remains one of the top contenders. This guide highlights three approaches to using Stable Diffusion, ranging from beginner-friendly to more complex, each with its own strengths.
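Every tool covered below wraps the same core process in a friendlier interface. If you’re curious what that looks like in code, here is a minimal sketch using the Hugging Face diffusers library; this is purely an optional aside (none of the methods in this guide require it), and the model ID is only an example you would swap for a checkpoint you actually use.

```python
# A rough sketch of what Stable Diffusion front ends do behind the scenes.
# Assumes Python with the torch, diffusers, and transformers packages installed;
# the model ID is an example, not a requirement of any method in this guide.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint ID
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # use "mps" on Apple Silicon, or "cpu" as a slow fallback

prompt = "a lighthouse on a cliff at sunset, realistic, detailed"
image = pipe(prompt).images[0]  # run the text-to-image pipeline
image.save("lighthouse.png")
```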
System Requirements
Here are the recommended specifications for a successful experience:
- macOS: Apple Silicon (M series chip)
- Windows or Linux: NVIDIA or AMD GPU
- RAM: 16GB for optimal performance
- GPU VRAM: at least 4GB (8GB preferred)
- Storage: 60-70GB available space
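Not sure how much VRAM your GPU has? If you already happen to have Python with PyTorch installed (an assumption, not something the methods below require you to set up by hand), a quick check looks like this:

```python
# Quick hardware check; assumes the torch package is installed.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
elif torch.backends.mps.is_available():
    print("Apple Silicon GPU (MPS) detected; VRAM is shared with system RAM")
else:
    print("No supported GPU detected; image generation will fall back to the CPU")
```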
1. Using Automatic1111 WebUI
The first approach uses the AUTOMATIC1111 Web UI to run Stable Diffusion, and it works on all major operating systems.
Begin by downloading Python 3.10.6, the release the AUTOMATIC1111 project recommends, as newer versions may not work with its dependencies. Run the installer and make sure you select Add python.exe to PATH before clicking Install Now.
Next, head to the AUTOMATIC1111 Web UI repository on GitHub, click Code, and select Download ZIP. Once the download completes, unzip the file and note where you extracted the Web UI.
Install a Model
Before you begin using the Web UI, you need to install at least one model. These models are pretrained checkpoints that largely determine the artistic style of your generated images. To pick one, visit Civitai and browse until you find a model that appeals to you.
After finding your preferred model, click the download button. Once completed, transfer the ‘.safetensors’ checkpoint file to the correct folder. Navigate to the download directory for your Automatic1111 WebUI, then move to webui -> models -> Stable-diffusion. Paste the downloaded model file in this directory, and you’re ready to go.
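If you prefer to handle this step from a script rather than your file manager, a tiny helper could look like the sketch below. Both paths are hypothetical examples you would adjust to wherever you downloaded the file and extracted the Web UI.

```python
# Move a downloaded checkpoint into the Web UI's model folder.
# Both paths are hypothetical examples; adjust them to your own setup.
import shutil
from pathlib import Path

downloaded = Path.home() / "Downloads" / "myModel.safetensors"  # example file name
target_dir = Path.home() / "stable-diffusion-webui" / "models" / "Stable-diffusion"

target_dir.mkdir(parents=True, exist_ok=True)
shutil.move(str(downloaded), str(target_dir / downloaded.name))
print(f"Moved {downloaded.name} to {target_dir}")
```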
Run and Configure WebUI
Now, you can execute and use Stable Diffusion directly in your web browser.
On macOS, open your “stable-diffusion-webui” folder in Terminal and run the command ./webui.sh. Windows users should run webui-user.bat instead; if you have an NVIDIA GPU, you can add --xformers to the COMMANDLINE_ARGS line in that file for faster generation. Once the script finishes loading, copy the URL shown next to “Running on local URL,” which typically appears as http://127.0.0.1:7860.
Paste the URL into your browser’s address bar and press Enter, and the Web UI will load locally. Although the initial interface may appear overwhelming, you won’t need to adjust many settings at first.
Start by adjusting the Width and Height parameters and setting the batch size to 4, which will generate four distinct images for each prompt.
Next, enter any creative prompt in the txt2img tab. Be specific about the details you want in the image, separating various descriptors with commas. Additionally, describe the artistic style using terms such as ‘realistic’, ‘detailed’, or ‘close-up portrait’.
In the box for negative prompts, include any elements that you wish to exclude from your image. Consider modifying the “CFG Scale” setting; a higher value causes the generator to adhere more closely to your given prompts, while a lower value allows for more creative outputs.
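These settings map directly onto parameters of the underlying Stable Diffusion pipeline. As an optional aside (not part of the Web UI workflow), here is roughly how the same knobs look when scripted with the diffusers library; the model ID is only an example.

```python
# The Web UI settings expressed as diffusers pipeline parameters (optional aside).
# The model ID is an example; substitute whichever checkpoint you downloaded.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

images = pipe(
    prompt="close-up portrait of an astronaut, realistic, detailed",
    negative_prompt="blurry, extra fingers, low quality",  # elements to exclude
    guidance_scale=7.5,       # CFG Scale: higher sticks closer to the prompt
    width=512,
    height=512,
    num_images_per_prompt=4,  # batch size: four distinct images per prompt
).images

for i, image in enumerate(images):
    image.save(f"result_{i}.png")
```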
Leave the remaining settings unchanged and click Generate at the top to begin the image generation process. Afterward, you can click on the thumbnail images to view them and decide if they meet your expectations. If they don’t, feel free to adjust the CFG Scale and your prompts. During this stage, your GPU will be heavily utilized.
If you find an image you like but wish to refine or fix issues (like distorted features), click on Send to img2img or Send to inpaint. This option will transfer your image and prompts to their respective tabs for further enhancement.
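Behind the Send to img2img button is an image-to-image pass in which a denoising strength setting controls how much of the original picture survives. For the curious, a hedged diffusers sketch of that step looks like this; the model ID and the input file name are example assumptions.

```python
# Refine an existing image with an image-to-image pass (optional aside).
# The model ID and "draft.png" are example assumptions, not fixed requirements.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("draft.png").convert("RGB")
refined = pipe(
    prompt="close-up portrait of an astronaut, realistic, detailed",
    image=init_image,
    strength=0.4,  # lower keeps more of the original, higher changes more
).images[0]
refined.save("refined.png")
```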
2. Exploring Fooocus: The Easiest AI Image Generator
Fooocus stands out as one of the simplest and most effective AI image generation tools available. Its intuitive interface makes it accessible for beginners who want to experiment with AI image creation before diving into more intricate methods.
Download the Fooocus archive from its GitHub page and extract it once the download is finished. Next, head over to Civitai to pick a checkpoint you like. After downloading the checkpoint, navigate to your Fooocus folder, open Fooocus -> models -> checkpoints, and place the checkpoint file you downloaded there.
You can also download LoRAs from Civitai. These are much smaller files that teach an existing base model new concepts or styles. Unlike checkpoints, which can be several gigabytes, LoRAs simply add distinctive elements to the final images while relying on a checkpoint you already have.
If you choose to use a LoRA to enhance your AI images’ visual style, return to the models folder in your Fooocus directory and paste the LoRA file in the loras folder.
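Under the hood, applying a LoRA just means loading its extra weights on top of the base checkpoint. As an optional aside, here is roughly how that looks in the diffusers library; the model ID, folder, and LoRA file name are all hypothetical examples.

```python
# Apply a LoRA on top of a base checkpoint (optional diffusers aside).
# The model ID, folder, and LoRA file name are hypothetical examples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load the small LoRA weight file over the base model.
pipe.load_lora_weights("path/to/loras", weight_name="my_style_lora.safetensors")

image = pipe("a lighthouse on a cliff at sunset, watercolor style").images[0]
image.save("lighthouse_lora.png")
```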
Running Fooocus
It’s time to start generating images in Fooocus. Navigate to the folder where you extracted the software and double-click run.bat. The command prompt will appear and automatically load the Fooocus interface in your web browser.
On the opening screen, make sure to check the Advanced option at the bottom, which will reveal additional settings. Here, you can select the desired aspect ratio, the number of images Fooocus will generate per prompt, and choose the image file format.
To start, set the performance option to Speed, which significantly shortens generation time at the cost of a little quality. At the bottom, input negative prompts for any unwanted elements.
Hover over each style to preview it. Then, navigate to the Models tab, where you can select the base model you’ve placed in your Fooocus folder. Directly below that, choose a LoRA if you have any installed.
All that’s left is to click the Generate button and watch Fooocus create your desired images. While it may not be the most powerful image generator available, Fooocus certainly proves to be the most straightforward method, allowing for easy adjustments of styles, checkpoints, and LoRAs to create your ideal images.
Utilizing AI Face Swap in Fooocus
Fooocus even features a FaceSwap function, which lets you transplant a face from one image onto another. First, check the Input Image option at the bottom, then select Image Prompt and upload the image containing the face you want to use. Scroll down, click Advanced again, and choose FaceSwap from the options.
Next, click the Inpaint or Outpaint tab beside the Image Prompt section and upload the image whose face you want to replace, then outline the face and hair. Go to the Advanced tab in the top right corner, activate Developer Debug Mode, click Control, and check the box for Mixing Image Prompt and Inpaint.
Once done, clear the prompt box and click Generate. Fooocus will perform the face swap using your selected image, though results can vary from run to run.
After generating your images, you may wish to enhance them using some top-tier AI image upscaling tools to improve their resolution.
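If you would rather keep the upscaling step local too, the diffusers library ships an upscaling pipeline. The sketch below is an optional aside that assumes Stability AI’s x4 upscaler as an example model and a hypothetical input file name.

```python
# Upscale a generated image locally (optional aside; one of several options).
# Uses Stability AI's x4 upscaler as an example model; "result_0.png" is hypothetical.
import torch
from diffusers import StableDiffusionUpscalePipeline
from PIL import Image

upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

low_res = Image.open("result_0.png").convert("RGB")  # large inputs can exhaust VRAM
upscaled = upscaler(
    prompt="close-up portrait, realistic, detailed",  # a short description guides the upscaler
    image=low_res,
).images[0]
upscaled.save("result_0_upscaled.png")
```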
3. Generating AI Images with ComfyUI
ComfyUI is another popular way to leverage Stable Diffusion for AI image creation. Its node-based workflow is more flexible, but also more complex. To begin, download and extract ComfyUI from GitHub.
You’re likely familiar with checkpoints and LoRAs at this point. As before, download a checkpoint file (and a LoRA file if desired) and place them in the matching folders inside ComfyUI’s models directory: checkpoints go in models -> checkpoints and LoRAs in models -> loras. Then, in your ComfyUI directory, open the update folder and run update_comfyui.bat to bring everything up to date.
Now, it’s time to run the ComfyUI AI image generator. Navigate back to your ComfyUI directory, where you should see two batch files. If you have an Nvidia GPU, double-click run_nvidia_gpu.bat; otherwise, run run_cpu.bat.
Once ComfyUI launches in your browser, you’ll see its default workflow, which includes several interconnected nodes. Although it may look complex initially, these nodes represent various steps in the AI image generation process.
The multiple nodes allow you to create a tailored workflow, integrating different nodes, models, LoRAs, and refiners, granting users extensive control over the final output. However, this complexity can make ComfyUI hard to navigate and master.
Running ComfyUI
To get started, select a checkpoint in the Load Checkpoint node. Proceed to the CLIP Text Encode (Prompt) node, where you’ll input your text prompt for the image. Below that is a corresponding negative prompt node for unwanted descriptors. In the Empty Latent Image node, you can adjust the width, height, and the number of images you wish to generate.
Once your prompts are in place, look at the KSampler node, which holds the sampling settings; its steps value controls how many denoising passes are run, and about 20 to 30 steps usually yields a good-quality image. Finally, hit the Queue Prompt button and let ComfyUI do the work.
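Everything the Queue Prompt button does can also be driven from a script, since ComfyUI exposes a small HTTP API on the same local address. The sketch below is a hedged example: it assumes ComfyUI is running on its default port, that you exported your workflow with the Save (API Format) option (available once Dev mode is enabled in the settings), and that the requests package is installed.

```python
# Queue a ComfyUI workflow from a script (optional aside).
# Assumes ComfyUI is running locally on its default port 8188, that
# "workflow_api.json" was exported via "Save (API Format)", and that the
# requests package is installed.
import json
import requests

with open("workflow_api.json") as f:
    workflow = json.load(f)

# Node IDs depend on your exported workflow; this tweak is purely illustrative.
# workflow["6"]["inputs"]["text"] = "a lighthouse on a cliff at sunset, detailed"

response = requests.post("http://127.0.0.1:8188/prompt", json={"prompt": workflow})
print(response.json())  # includes the ID of the queued prompt
```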
Using LoRAs in ComfyUI
To add a LoRA in ComfyUI, right-click an empty spot near the Checkpoint node and choose Add Node -> loaders -> Load LoRA, then pick a LoRA file you placed in the models -> loras folder.
However, keep in mind that each time you add a LoRA node, you’ll need to rewire the connections. Drag the MODEL output of the Checkpoint node into the LoRA node’s model input on the left (instead of straight into the KSampler), then connect the LoRA node’s MODEL output to the KSampler’s model input.
Re-route the CLIP connection the same way: feed the Checkpoint node’s CLIP output into the LoRA node’s clip input, then connect the LoRA node’s CLIP output to both the positive and negative prompt nodes.
By understanding the default workflow and progressively adding custom nodes, you’ll become proficient in utilizing ComfyUI for your AI image generation needs.
Frequently Asked Questions
How do Stable Diffusion, DALL-E, and Midjourney differ?
All three AI systems can produce images from text prompts, but only Stable Diffusion is entirely free and open-source. You can install and run it on your computer without any cost, whereas DALL-E and Midjourney are proprietary software.
What exactly is a model in Stable Diffusion?
A model is a file containing the neural network weights produced by training on a particular set of images and captions. Different models excel at different kinds of output: some are optimized for realistic human depictions, while others are better suited to 2D illustrations or particular artistic styles.
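In practice, a model is usually a single multi-gigabyte ‘.safetensors’ (or older ‘.ckpt’) file, which is also why you can load one of those downloads directly when scripting. The sketch below is an optional aside and assumes a Stable Diffusion 1.5-style checkpoint with a hypothetical file name; SDXL checkpoints need a different pipeline class.

```python
# Load a checkpoint file downloaded from Civitai directly (optional aside).
# "myModel.safetensors" is a hypothetical file name; an SD 1.5-style checkpoint
# is assumed here (SDXL checkpoints use StableDiffusionXLPipeline instead).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "myModel.safetensors", torch_dtype=torch.float16
).to("cuda")

image = pipe("a 2D illustration of a fox in a forest").images[0]
image.save("fox.png")
```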
Image credit: Feature image by Stable Diffusion. All screenshots provided by Brandon Li and Samarveer Singh.