This script uses Stable Diffusion v1.5 through Hugging Face's Diffusers library to generate variations of an input image, guided by a text prompt. It loads and preprocesses the image with PIL, runs the img2img pipeline with torch, and saves the results as PNG files.
You can clone this repo to get the code: https://github.com/alexander-uspenskiy/image_variations
Source code:
    import torch
    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image
    import requests
    from io import BytesIO

    def load_image(image_path, target_size=(768, 768)):
        """
        Load and preprocess the input image
        """
        if image_path.startswith('http'):
            response = requests.get(image_path)
            image = Image.open(BytesIO(response.content))
        else:
            image = Image.open(image_path)

        # Resize and preserve aspect ratio
        image = image.convert("RGB")
        image.thumbnail(target_size, Image.Resampling.LANCZOS)

        # Create new image with padding to reach target size
        new_image = Image.new("RGB", target_size, (255, 255, 255))
        new_image.paste(image, ((target_size[0] - image.size[0]) // 2,
                                (target_size[1] - image.size[1]) // 2))

        return new_image

    def generate_image_variation(
        input_image_path,
        prompt,
        model_id="stable-diffusion-v1-5/stable-diffusion-v1-5",
        num_images=1,
        strength=0.75,
        guidance_scale=7.5,
        seed=None
    ):
        """
        Generate variations of an input image using a specified prompt

        Parameters:
        - input_image_path: Path or URL to the input image
        - prompt: Text prompt to guide the image generation
        - model_id: Hugging Face model ID
        - num_images: Number of variations to generate
        - strength: How much to transform the input image (0-1)
        - guidance_scale: How closely to follow the prompt
        - seed: Random seed for reproducibility

        Returns:
        - List of generated images
        """
        # Set random seed if provided
        if seed is not None:
            torch.manual_seed(seed)

        # Load the model
        device = "cuda" if torch.cuda.is_available() else "cpu"
        pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
            model_id,
            torch_dtype=torch.float16 if device == "cuda" else torch.float32
        ).to(device)

        # Load and preprocess the input image
        init_image = load_image(input_image_path)

        # Generate images
        result = pipe(
            prompt=prompt,
            image=init_image,
            num_images_per_prompt=num_images,
            strength=strength,
            guidance_scale=guidance_scale
        )

        return result.images

    def save_generated_images(images, output_prefix="generated"):
        """
        Save the generated images with sequential numbering
        """
        for i, image in enumerate(images):
            image.save(f"images-out/{output_prefix}_{i}.png")

    # Example usage
    if __name__ == "__main__":
        # Example parameters
        input_image = "images-in/Image_name.jpg"  # or URL
        prompt = "Draw the image in modern art style, photorealistic and detailed."

        # Generate variations
        generated_images = generate_image_variation(
            input_image,
            prompt,
            num_images=3,
            strength=0.75,
            seed=42  # Optional: for reproducibility
        )

        # Save the results
        save_generated_images(generated_images)
How It Works:
Load & Preprocess the Input Image
Accepts both local file paths and URLs.
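Converts the image to RGB and scales it down to fit within 768×768 while preserving its aspect ratio (smaller images are not enlarged).
Centers the result on a white 768×768 canvas, so the padding fills whatever space is left.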
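The shrink-and-center geometry described above can be sketched without any image library. This is a minimal illustration, not the script's code: `fit_and_center` is a hypothetical helper, and Pillow's `thumbnail` may round the scaled size slightly differently.

```python
def fit_and_center(src_size, target_size=(768, 768)):
    """Return (scaled_size, paste_offset) for a shrink-to-fit, centered layout.

    Hypothetical helper that approximates what thumbnail() followed by
    a centered paste() does; it only computes numbers, no pixels.
    """
    src_w, src_h = src_size
    tgt_w, tgt_h = target_size

    # Pick the scale that makes the image fit in both dimensions.
    # The 1.0 cap means small images are never enlarged.
    scale = min(tgt_w / src_w, tgt_h / src_h, 1.0)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))

    # Split the leftover space evenly so the image sits in the middle.
    offset = ((tgt_w - new_w) // 2, (tgt_h - new_h) // 2)
    return (new_w, new_h), offset


if __name__ == "__main__":
    print(fit_and_center((1536, 1024)))  # landscape photo, scaled down
    print(fit_and_center((400, 300)))    # small image, only padded
```

One consequence worth noting: a small input keeps its original pixel size and ends up surrounded by a wide white border, which the model then treats as part of the picture.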
Initialize Stable Diffusion v1.5
Loads the model on CUDA in float16 (if a GPU is available) or falls back to CPU in float32.
Uses StableDiffusionImg2ImgPipeline to process the input image.
Generate AI-Modified Image Variations
Takes in a text prompt to guide the transformation.
Parameters like strength (0-1; higher values change the input image more) and guidance_scale (higher = stricter prompt adherence) allow customization.
Supports multiple output images per prompt.
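To get a feel for what strength does, it helps to know that in img2img it also controls how many denoising steps actually run. The sketch below mirrors that relationship as I understand it from the Diffusers img2img pipeline; `effective_denoising_steps` is a hypothetical helper, and the default of 50 inference steps is an assumption about the library default, since the script does not set it.

```python
def effective_denoising_steps(num_inference_steps=50, strength=0.75):
    """Estimate how many denoising steps an img2img run performs.

    Hypothetical helper: the input image is noised to a level set by
    `strength`, and only the remaining part of the schedule is denoised.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


if __name__ == "__main__":
    for s in (0.25, 0.5, 0.75, 1.0):
        print(f"strength={s}: {effective_denoising_steps(strength=s)} steps")
```

Under these assumptions, lower strength values both stay closer to the input image and finish faster, because fewer steps are executed.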
Save Results to the images-out Directory
Outputs generated images with a sequential naming scheme (generated_0.png, generated_1.png, etc.).
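The script writes straight into images-out/, so saving fails if that directory does not exist yet. A minimal sketch of one way to guard against that, using a hypothetical `build_output_paths` helper that is not part of the original script:

```python
from pathlib import Path


def build_output_paths(count, output_prefix="generated", output_dir="images-out"):
    """Create the output directory if needed and return one path per image.

    Hypothetical helper: it follows the script's naming scheme
    (<prefix>_<index>.png) but does not save anything itself.
    """
    out_dir = Path(output_dir)
    # parents=True and exist_ok=True make this safe to call repeatedly.
    out_dir.mkdir(parents=True, exist_ok=True)
    return [out_dir / f"{output_prefix}_{i}.png" for i in range(count)]


if __name__ == "__main__":
    for path in build_output_paths(3):
        print(path)
```

Each returned path could then be paired with a generated image, for example by looping over `zip(images, build_output_paths(len(images)))` and calling `image.save(path)`.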
Example Use Case
You can transform an image of a person into a medieval king using a prompt like:
prompt = "Draw this person as a powerful king, photorealistic and detailed, in a medieval setting."
Initial image:
Result:
Cons & Pros
Cons:
- Can be slow, especially when it falls back to CPU.
- Output quality is limited by the relatively small model.
Pros:
- Runs locally (no need for cloud services).
- Customizable parameters for fine-tuning output.
- Reproducibility with optional random seed.