Build a simple image-generating chatbot
Today I’m guiding you through building an image-generating chatbot from scratch.
We’ll follow the script I used to build my existing chatbot, awesome-tiny-sd: make sure to check it out and leave a ⭐ on GitHub!
First of all, we need to install all the necessary packages:
```bash
python3 -m pip install gradio==4.25.0 diffusers==0.27.2 torch==2.1.2 pydantic==2.6.4 accelerate transformers trl peft
```
Once that’s done, set up your folder like this:
```
./
|__ app.py
|__ imgen.py
```
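If you’re starting from an empty directory, the layout above can be created in one go (the project folder name `awesome-tiny-sd` here is just an example):

```bash
# create the project folder with the two empty script files
mkdir -p awesome-tiny-sd
touch awesome-tiny-sd/app.py awesome-tiny-sd/imgen.py
```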
And let’s begin coding!
Block 1: import your favorite stable-diffusion model in imgen.py
- Import necessary dependencies:
```python
from diffusers import DiffusionPipeline
import torch
```
- Define the image-generating pipeline (this will automatically download the stable-diffusion model you specified and all its related components):
```python
pipeline = DiffusionPipeline.from_pretrained("segmind/small-sd", torch_dtype=torch.float32)
```
We chose to use segmind/small-sd because it’s small and CPU-friendly.
Block 2: Define chatbot essentials in app.py
- Import necessary dependencies:
```python
import gradio as gr
import time
from imgen import *
```
- A simple function to print likes and dislikes from users:
```python
def print_like_dislike(x: gr.LikeData):
    print(x.index, x.value, x.liked)
```
- The function that appends new messages and/or uploaded files to the chatbot history:
```python
def add_message(history, message):
    if len(message["files"]) > 0:
        history.append((message["files"], None))
    if message["text"] is not None and message["text"] != "":
        history.append((message["text"], None))
    return history, gr.MultimodalTextbox(value=None, interactive=False)
```
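To see what this produces, here is a minimal sketch of the tuple-based history format that Gradio’s Chatbot uses, in plain Python with no Gradio dependency (`append_user_message` is just an illustrative stand-in for the function above): each history entry pairs the user content with a `None` placeholder for the bot’s reply, text prompts are stored as strings, and uploaded files as tuples.

```python
# Simplified sketch of how the chat history grows:
# each entry is (user_content, bot_reply); the reply starts as None.
def append_user_message(history, message):
    if len(message["files"]) > 0:
        history.append((tuple(message["files"]), None))  # files become a tuple
    if message["text"]:
        history.append((message["text"], None))  # text stays a plain string
    return history

history = []
history = append_user_message(history, {"text": "a red panda", "files": []})
history = append_user_message(history, {"text": "", "files": ["photo.png"]})
print(history)
# [('a red panda', None), (('photo.png',), None)]
```

This tuple-vs-string distinction is exactly what `bot()` checks below to decide whether the last user turn was a text prompt or a file upload.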
- The function that generates an image from the text prompt:
```python
def bot(history):
    if type(history[-1][0]) != tuple:  ## text prompt
        try:
            prompt = history[-1][0]
            image = pipeline(prompt).images[0]  ## call the model
            image.save("generated_image.png")
            response = ("generated_image.png",)
            history[-1][1] = response
            yield history  ## return the image
        except Exception as e:
            response = f"Sorry, the error '{e}' occurred while generating the response; check [troubleshooting documentation](https://astrabert.github.io/awesome-tiny-sd/#troubleshooting) for more"
            history[-1][1] = ""
            for character in response:
                history[-1][1] += character
                time.sleep(0.05)
                yield history
    if type(history[-1][0]) == tuple:  ## input are files
        response = "Sorry, this version still does not support uploaded files :("  ## We will see how to add this functionality in the future
        history[-1][1] = ""
        for character in response:
            history[-1][1] += character
            time.sleep(0.05)
            yield history
```
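The error and file-upload branches above use a simple character-by-character streaming trick to simulate typing. Stripped of Gradio, the idea looks like this (`stream_text` is just an illustrative name, and the per-character delay defaults to zero here instead of the 0.05 s used above):

```python
import time

def stream_text(response, delay=0.0):
    """Yield progressively longer prefixes of response, one character at a time."""
    partial = ""
    for character in response:
        partial += character
        time.sleep(delay)  # bot() sleeps 0.05 s per character for a typing effect
        yield partial

print(list(stream_text("Hi!")))
# ['H', 'Hi', 'Hi!']
```

Because `bot()` is a generator and is wired to the Chatbot output below, Gradio re-renders the chat window on every `yield`, which is what makes each partial string appear on screen.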
Block 3: build the actual chatbot
- Define the chatbot blocks with Gradio:
```python
with gr.Blocks() as demo:
    chatbot = gr.Chatbot(
        [[None, ("Hi, I am awesome-tiny-sd, a little stable diffusion model that lets you generate images:blush:\nJust write me a prompt, I'll generate what you ask for:heart:",)]],  ## the first argument is the chat history
        label="awesome-tiny-sd",
        elem_id="chatbot",
        bubble_full_width=False,
    )  ## this is the base chatbot architecture
    chat_input = gr.MultimodalTextbox(interactive=True, file_types=["png", "jpg", "jpeg"], placeholder="Enter your image-generating prompt...", show_label=False)  ## types of supported input
    chat_msg = chat_input.submit(add_message, [chatbot, chat_input], [chatbot, chat_input])  ## receive a message
    bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response")  ## send a message
    bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input])
    chatbot.like(print_like_dislike, None, None)
    clear = gr.ClearButton(chatbot)  ## show clear button
```
- Launch the chatbot:
```python
demo.queue()

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", share=False)
```
- Run the script:
```bash
python3 app.py
```
Once the stable diffusion pipeline has loaded, the chatbot should be running on localhost:7860 (or 0.0.0.0:7860 on Linux-like OSes).
You can give it a try on this Hugging Face Space: https://huggingface.co/spaces/as-cle-bert/awesome-tiny-sd
Otherwise, you can pull the awesome-tiny-sd Docker image and run it as a container:
```bash
docker pull ghcr.io/astrabert/awesome-tiny-sd:latest
docker run -p 7860:7860 ghcr.io/astrabert/awesome-tiny-sd:latest
```
Give it a try, you won’t be disappointed!!!
Do not forget to sponsor the project on GitHub: if we get far enough with sponsoring, we will upgrade the HF space to a GPU-powered one in order to make image generation faster.
What will be the first image you generate with awesome-tiny-sd? Let me know in the comments below!
Cover image by Google DeepMind