Image search with Streamlit in Snowflake (3 Part Series)
1 Image search with Streamlit in Snowflake (SiS) Part 1 – Creating an image gallery app –
2 Image search with Streamlit in Snowflake (SiS) Part 2 – Automatically generate image captions –
3 Image search with Streamlit in Snowflake (SiS) Part 3 – Add vector search function for images –
Introduction
This article is Part 3. It’s a continuation of Part 1 and Part 2, so if you haven’t read them yet, please check out the following links first:
In Part 1, we created an image gallery app using Streamlit in Snowflake to display images stored in the app’s default internal stage. In Part 2, we added a feature to generate captions for each image based on the image gallery app.
Finally, in Part 3, we’ll complete the application by adding an image search feature! We’ll use vector search for image searching, allowing for relevant search results even with ambiguous keywords.
Note: This article is my personal publication. Please understand that it does not represent official statements from Snowflake.
Feature Overview
Goals
- (Done) Display image data with Streamlit in Snowflake
- (Done) Add descriptions to images with Streamlit in Snowflake
- *Generate vector data based on image descriptions
- *Perform image searches with Streamlit in Snowflake
*: Areas to be implemented in Part 3
Features to be Implemented in Part 3
- Function to generate vector data from image captions
- Function to perform fuzzy searches on the image gallery
Final Image for Part 3
Prerequisites
- Snowflake
- A Snowflake account
- Streamlit in Snowflake installation package
- boto3 1.28.64
- AWS
- An AWS account with access to Amazon Bedrock (we’ll be using the us-east-1 region in this guide to use Claude 3.5 Sonnet)
Basic Confirmation
Vectorization Options in Snowflake
In this article, we’ll implement a search function by creating vector data from image captions. While vectorization might seem conceptually and technically challenging, Snowflake allows for easy implementation of vectorization and vector search. I hope this article will help you realize that “Vector search is easier than I thought and quite useful!”
For details on Snowflake’s vectorization methods and performance, please refer to my separate article:
https://zenn.dev/tsubasa_tech/articles/c0a2b8793a5d1f
Steps
(Omitted) Create a Streamlit in Snowflake app and upload images
If you haven’t done this yet, please follow the steps in Part 1’s article first.
(Omitted) Enable access to Amazon Bedrock from the Streamlit in Snowflake app
To automatically add captions to images, there are several options to consider:
- Implement processing using Python image processing libraries
- Create an ML model for image recognition to generate captions for images
- Use existing AI models for images like BLIP-2 to generate captions
- Pass images to a multimodal GenAI to generate captions
- Use a SaaS service for caption generation
Since we’ve previously introduced how to connect to Amazon Bedrock, we’ll use Amazon Bedrock’s anthropic.claude-3-5-sonnet
as a multimodal GenAI (option 4) to generate image captions.
For instructions on setting up access to Amazon Bedrock, please refer to the Calling Amazon Bedrock directly from Streamlit in Snowflake (SiS).
Run the Streamlit in Snowflake App
In the Streamlit in Snowflake app editing screen, simply copy and paste the following code:
<span>import</span> <span>streamlit</span> <span>as</span> <span>st</span><span>import</span> <span>pandas</span> <span>as</span> <span>pd</span><span>import</span> <span>os</span><span>import</span> <span>base64</span><span>import</span> <span>boto3</span><span>import</span> <span>json</span><span>from</span> <span>snowflake.snowpark.context</span> <span>import</span> <span>get_active_session</span><span>from</span> <span>snowflake.snowpark.functions</span> <span>import</span> <span>col</span><span>,</span> <span>when_matched</span><span>,</span> <span>when_not_matched</span><span>,</span> <span>lit</span><span>,</span> <span>call_udf</span><span>import</span> <span>_snowflake</span><span>from</span> <span>PIL</span> <span>import</span> <span>Image</span><span>import</span> <span>io</span><span># Set custom theme </span><span>st</span><span>.</span><span>set_page_config</span><span>(</span><span>page_title</span><span>=</span><span>"</span><span>Image Gallery</span><span>"</span><span>,</span><span>layout</span><span>=</span><span>"</span><span>wide</span><span>"</span><span>,</span><span>initial_sidebar_state</span><span>=</span><span>"</span><span>expanded</span><span>"</span><span>,</span><span>)</span><span># Add custom CSS </span><span>st</span><span>.</span><span>markdown</span><span>(</span><span>"""</span><span> <style> .reportview-container { background: #f0f2f6; } .main .block-container { padding-top: 2rem; padding-bottom: 2rem; padding-left: 5rem; padding-right: 5rem; } .stButton>button { background-color: #4CAF50; color: white; padding: 10px 20px; border: none; border-radius: 5px; cursor: pointer; transition: background-color 0.3s; } .stButton>button:hover { background-color: #45a049; } .stTextInput>div>div>input { border-radius: 5px; } .stSelectbox>div>div>select { border-radius: 5px; } h1, h2, h3 { color: #2c3e50; } .stProgress > div > div > div > div { background-color: #4CAF50; } </style> </span><span>"""</span><span>,</span> <span>unsafe_allow_html</span><span>=</span><span>True</span><span>)</span><span># Image folder path </span><span>IMAGE_FOLDER</span> <span>=</span> <span>"</span><span>image</span><span>"</span><span># Get Snowflake session </span><span>session</span> <span>=</span> <span>get_active_session</span><span>()</span><span># Create table (only on first run) </span><span>@st.cache_resource</span><span>def</span> <span>create_table_if_not_exists</span><span>():</span><span>session</span><span>.</span><span>sql</span><span>(</span><span>"""</span><span> CREATE TABLE IF NOT EXISTS IMAGE_METADATA ( FILE_NAME STRING, DESCRIPTION STRING, VECTOR VECTOR(FLOAT, 1024) ) </span><span>"""</span><span>).</span><span>collect</span><span>()</span><span>create_table_if_not_exists</span><span>()</span><span># Function to get AWS credentials </span><span>def</span> <span>get_aws_credentials</span><span>():</span><span>aws_key_object</span> <span>=</span> <span>_snowflake</span><span>.</span><span>get_username_password</span><span>(</span><span>'</span><span>bedrock_key</span><span>'</span><span>)</span><span>region</span> <span>=</span> <span>'</span><span>us-east-1</span><span>'</span><span>return</span> <span>{</span><span>'</span><span>aws_access_key_id</span><span>'</span><span>:</span> <span>aws_key_object</span><span>.</span><span>username</span><span>,</span><span>'</span><span>aws_secret_access_key</span><span>'</span><span>:</span> <span>aws_key_object</span><span>.</span><span>password</span><span>,</span><span>'</span><span>region_name</span><span>'</span><span>:</span> <span>region</span><span>},</span> <span>region</span><span># Set up Bedrock client </span><span>boto3_session_args</span><span>,</span> <span>region</span> <span>=</span> <span>get_aws_credentials</span><span>()</span><span>boto3_session</span> <span>=</span> <span>boto3</span><span>.</span><span>Session</span><span>(</span><span>**</span><span>boto3_session_args</span><span>)</span><span>bedrock</span> <span>=</span> <span>boto3_session</span><span>.</span><span>client</span><span>(</span><span>'</span><span>bedrock-runtime</span><span>'</span><span>,</span> <span>region_name</span><span>=</span><span>region</span><span>)</span><span># Get image data </span><span>@st.cache_data</span><span>def</span> <span>get_image_data</span><span>():</span><span>image_files</span> <span>=</span> <span>[</span><span>f</span> <span>for</span> <span>f</span> <span>in</span> <span>os</span><span>.</span><span>listdir</span><span>(</span><span>IMAGE_FOLDER</span><span>)</span> <span>if</span> <span>f</span><span>.</span><span>lower</span><span>().</span><span>endswith</span><span>((</span><span>'</span><span>.png</span><span>'</span><span>,</span> <span>'</span><span>.jpg</span><span>'</span><span>,</span> <span>'</span><span>.jpeg</span><span>'</span><span>,</span> <span>'</span><span>.gif</span><span>'</span><span>))]</span><span>return</span> <span>[{</span><span>"</span><span>FILE_NAME</span><span>"</span><span>:</span> <span>f</span><span>,</span> <span>"</span><span>IMG_PATH</span><span>"</span><span>:</span> <span>os</span><span>.</span><span>path</span><span>.</span><span>join</span><span>(</span><span>IMAGE_FOLDER</span><span>,</span> <span>f</span><span>)}</span> <span>for</span> <span>f</span> <span>in</span> <span>image_files</span><span>]</span><span># Get metadata </span><span>@st.cache_data</span><span>def</span> <span>get_metadata</span><span>():</span><span>return</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>select</span><span>(</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>).</span><span>to_pandas</span><span>()</span><span># Convert image to thumbnail and encode in base64 </span><span>@st.cache_data</span><span>def</span> <span>get_thumbnail_base64</span><span>(</span><span>img_path</span><span>,</span> <span>max_size</span><span>=</span><span>(</span><span>300</span><span>,</span> <span>300</span><span>)):</span><span>with</span> <span>Image</span><span>.</span><span>open</span><span>(</span><span>img_path</span><span>)</span> <span>as</span> <span>img</span><span>:</span><span>img</span><span>.</span><span>thumbnail</span><span>(</span><span>max_size</span><span>)</span><span>buffered</span> <span>=</span> <span>io</span><span>.</span><span>BytesIO</span><span>()</span><span>img</span><span>.</span><span>save</span><span>(</span><span>buffered</span><span>,</span> <span>format</span><span>=</span><span>"</span><span>JPEG</span><span>"</span><span>)</span><span>return</span> <span>base64</span><span>.</span><span>b64encode</span><span>(</span><span>buffered</span><span>.</span><span>getvalue</span><span>()).</span><span>decode</span><span>(</span><span>'</span><span>utf-8</span><span>'</span><span>)</span><span># Initialize image data and metadata </span><span>if</span> <span>'</span><span>img_df</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>=</span> <span>get_image_data</span><span>()</span><span>if</span> <span>'</span><span>metadata_df</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span><span># Display image gallery </span><span>def</span> <span>show_image_gallery</span><span>():</span><span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>️ Image Gallery</span><span>"</span><span>)</span><span># Add search box </span> <span>search_query</span> <span>=</span> <span>st</span><span>.</span><span>text_input</span><span>(</span><span>"</span><span>Search images (top 10 most relevant results will be displayed)</span><span>"</span><span>,</span> <span>""</span><span>)</span><span>if</span> <span>search_query</span><span>:</span><span># Escape search query (basic SQL injection prevention) </span> <span>escaped_query</span> <span>=</span> <span>search_query</span><span>.</span><span>replace</span><span>(</span><span>"'"</span><span>,</span> <span>"''"</span><span>)</span><span># Vectorize search query and calculate similarity </span> <span>search_results</span> <span>=</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>f</span><span>"""</span><span> WITH search_vector AS ( SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, </span><span>'</span><span>{</span><span>escaped_query</span><span>}</span><span>'</span><span>) as embedding ) SELECT i.FILE_NAME, i.DESCRIPTION, VECTOR_COSINE_SIMILARITY(i.VECTOR, s.embedding) as similarity FROM IMAGE_METADATA i, search_vector s WHERE i.VECTOR IS NOT NULL ORDER BY similarity DESC LIMIT 10 </span><span>"""</span><span>).</span><span>collect</span><span>()</span><span># Display search results </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Search Results</span><span>"</span><span>)</span><span>for</span> <span>result</span> <span>in</span> <span>search_results</span><span>:</span><span>file_name</span> <span>=</span> <span>result</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span><span>description</span> <span>=</span> <span>result</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>]</span><span>similarity</span> <span>=</span> <span>result</span><span>[</span><span>'</span><span>SIMILARITY</span><span>'</span><span>]</span><span>img_path</span> <span>=</span> <span>next</span><span>((</span><span>img</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>]</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>if</span> <span>img</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span> <span>==</span> <span>file_name</span><span>),</span> <span>None</span><span>)</span><span>if</span> <span>img_path</span><span>:</span><span>col1</span><span>,</span> <span>col2</span> <span>=</span> <span>st</span><span>.</span><span>columns</span><span>([</span><span>1</span><span>,</span> <span>3</span><span>])</span><span>with</span> <span>col1</span><span>:</span><span>st</span><span>.</span><span>image</span><span>(</span><span>img_path</span><span>,</span> <span>width</span><span>=</span><span>150</span><span>)</span><span>with</span> <span>col2</span><span>:</span><span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>File name: </span><span>{</span><span>file_name</span><span>}</span><span>"</span><span>)</span><span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>Description: </span><span>{</span><span>description</span><span>}</span><span>"</span><span>)</span><span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>Match rate: </span><span>{</span><span>similarity</span><span>:</span><span>.</span><span>1</span><span>%</span><span>}</span><span>"</span><span>)</span><span>st</span><span>.</span><span>markdown</span><span>(</span><span>"</span><span>---</span><span>"</span><span>)</span><span>else</span><span>:</span><span># Normal gallery display </span> <span>num_columns</span> <span>=</span> <span>st</span><span>.</span><span>slider</span><span>(</span><span>"</span><span>Width:</span><span>"</span><span>,</span> <span>min_value</span><span>=</span><span>1</span><span>,</span> <span>max_value</span><span>=</span><span>5</span><span>,</span> <span>value</span><span>=</span><span>4</span><span>)</span><span>cols</span> <span>=</span> <span>st</span><span>.</span><span>columns</span><span>(</span><span>num_columns</span><span>)</span><span>for</span> <span>i</span><span>,</span> <span>img</span> <span>in</span> <span>enumerate</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>):</span><span>with</span> <span>cols</span><span>[</span><span>i</span> <span>%</span> <span>num_columns</span><span>]:</span><span>st</span><span>.</span><span>image</span><span>(</span><span>img</span><span>[</span><span>"</span><span>IMG_PATH</span><span>"</span><span>],</span> <span>caption</span><span>=</span><span>None</span><span>,</span> <span>use_column_width</span><span>=</span><span>True</span><span>)</span><span># Edit image descriptions </span><span>def</span> <span>edit_image_descriptions</span><span>():</span><span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>️ Edit Image Captions</span><span>"</span><span>)</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span><span># Add new images to metadata </span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>:</span><span>if</span> <span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>].</span><span>values</span><span>:</span><span>new_row</span> <span>=</span> <span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>"</span><span>FILE_NAME</span><span>"</span><span>:</span> <span>[</span><span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]],</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>:</span> <span>[</span><span>""</span><span>]})</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>pd</span><span>.</span><span>concat</span><span>([</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>,</span> <span>new_row</span><span>],</span> <span>ignore_index</span><span>=</span><span>True</span><span>)</span><span>merged_df</span> <span>=</span> <span>pd</span><span>.</span><span>merge</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>,</span> <span>pd</span><span>.</span><span>DataFrame</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>),</span> <span>on</span><span>=</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>how</span><span>=</span><span>"</span><span>left</span><span>"</span><span>)</span><span>with</span> <span>st</span><span>.</span><span>form</span><span>(</span><span>"</span><span>edit_descriptions</span><span>"</span><span>):</span><span>for</span> <span>_</span><span>,</span> <span>row</span> <span>in</span> <span>merged_df</span><span>.</span><span>iterrows</span><span>():</span><span>col1</span><span>,</span> <span>col2</span> <span>=</span> <span>st</span><span>.</span><span>columns</span><span>([</span><span>1</span><span>,</span> <span>3</span><span>])</span><span>with</span> <span>col1</span><span>:</span><span>st</span><span>.</span><span>image</span><span>(</span><span>row</span><span>[</span><span>"</span><span>IMG_PATH</span><span>"</span><span>],</span> <span>width</span><span>=</span><span>100</span><span>)</span><span>with</span> <span>col2</span><span>:</span><span>new_description</span> <span>=</span> <span>st</span><span>.</span><span>text_input</span><span>(</span><span>f</span><span>"</span><span>File name: </span><span>{</span><span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span><span>}</span><span>"</span><span>,</span> <span>value</span><span>=</span><span>row</span><span>[</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>],</span> <span>key</span><span>=</span><span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>])</span><span>merged_df</span><span>.</span><span>loc</span><span>[</span><span>merged_df</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>==</span> <span>row</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>],</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>]</span> <span>=</span> <span>new_description</span><span>submit_button</span> <span>=</span> <span>st</span><span>.</span><span>form_submit_button</span><span>(</span><span>"</span><span>Save Changes</span><span>"</span><span>)</span><span>if</span> <span>submit_button</span><span>:</span><span>update_snowflake_table</span><span>(</span><span>merged_df</span><span>[[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>,</span> <span>'</span><span>DESCRIPTION</span><span>'</span><span>]])</span><span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Changes saved successfully!</span><span>"</span><span>)</span><span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span><span># Function to generate image description </span><span>def</span> <span>generate_description</span><span>(</span><span>image_path</span><span>):</span><span>image_base64</span> <span>=</span> <span>get_thumbnail_base64</span><span>(</span><span>image_path</span><span>)</span><span>prompt</span> <span>=</span> <span>"""</span><span> Please describe this image in English within 400 characters in a single line. No need for a response, just output the image description. </span><span>"""</span><span>request_body</span> <span>=</span> <span>{</span><span>"</span><span>anthropic_version</span><span>"</span><span>:</span> <span>"</span><span>bedrock-2023-05-31</span><span>"</span><span>,</span><span>"</span><span>max_tokens</span><span>"</span><span>:</span> <span>200000</span><span>,</span><span>"</span><span>messages</span><span>"</span><span>:</span> <span>[</span><span>{</span><span>"</span><span>role</span><span>"</span><span>:</span> <span>"</span><span>user</span><span>"</span><span>,</span><span>"</span><span>content</span><span>"</span><span>:</span> <span>[</span><span>{</span><span>"</span><span>type</span><span>"</span><span>:</span> <span>"</span><span>image</span><span>"</span><span>,</span><span>"</span><span>source</span><span>"</span><span>:</span> <span>{</span><span>"</span><span>type</span><span>"</span><span>:</span> <span>"</span><span>base64</span><span>"</span><span>,</span><span>"</span><span>media_type</span><span>"</span><span>:</span> <span>"</span><span>image/jpeg</span><span>"</span><span>,</span><span>"</span><span>data</span><span>"</span><span>:</span> <span>image_base64</span><span>}</span><span>},</span><span>{</span><span>"</span><span>type</span><span>"</span><span>:</span> <span>"</span><span>text</span><span>"</span><span>,</span><span>"</span><span>text</span><span>"</span><span>:</span> <span>prompt</span><span>}</span><span>]</span><span>}</span><span>]</span><span>}</span><span>response</span> <span>=</span> <span>bedrock</span><span>.</span><span>invoke_model</span><span>(</span><span>body</span><span>=</span><span>json</span><span>.</span><span>dumps</span><span>(</span><span>request_body</span><span>),</span><span>modelId</span><span>=</span><span>"</span><span>anthropic.claude-3-5-sonnet-20240620-v1:0</span><span>"</span><span>,</span><span>accept</span><span>=</span><span>'</span><span>application/json</span><span>'</span><span>,</span><span>contentType</span><span>=</span><span>'</span><span>application/json</span><span>'</span><span>)</span><span>response_body</span> <span>=</span> <span>json</span><span>.</span><span>loads</span><span>(</span><span>response</span><span>.</span><span>get</span><span>(</span><span>'</span><span>body</span><span>'</span><span>).</span><span>read</span><span>())</span><span>return</span> <span>response_body</span><span>[</span><span>"</span><span>content</span><span>"</span><span>][</span><span>0</span><span>][</span><span>"</span><span>text</span><span>"</span><span>]</span><span># Function to update Snowflake table </span><span>def</span> <span>update_snowflake_table</span><span>(</span><span>update_df</span><span>):</span><span>snow_df</span> <span>=</span> <span>session</span><span>.</span><span>create_dataframe</span><span>(</span><span>update_df</span><span>)</span><span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>merge</span><span>(</span><span>snow_df</span><span>,</span><span>(</span><span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>FILE_NAME</span> <span>==</span> <span>snow_df</span><span>.</span><span>FILE_NAME</span><span>),</span><span>[</span><span>when_matched</span><span>().</span><span>update</span><span>({</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>:</span> <span>snow_df</span><span>.</span><span>DESCRIPTION</span><span>}),</span><span>when_not_matched</span><span>().</span><span>insert</span><span>({</span><span>"</span><span>FILE_NAME</span><span>"</span><span>:</span> <span>snow_df</span><span>.</span><span>FILE_NAME</span><span>,</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>:</span> <span>snow_df</span><span>.</span><span>DESCRIPTION</span><span>})</span><span>]</span><span>)</span><span># Generate image descriptions </span><span>def</span> <span>generate_image_descriptions</span><span>():</span><span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>🤖 Automatic Image Caption Generation</span><span>"</span><span>)</span><span>if</span> <span>'</span><span>generated_description</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>=</span> <span>None</span><span>if</span> <span>'</span><span>selected_image</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span> <span>=</span> <span>None</span><span># Generate description for individual image </span> <span>with</span> <span>st</span><span>.</span><span>form</span><span>(</span><span>"</span><span>generate_description</span><span>"</span><span>):</span><span>selected_image</span> <span>=</span> <span>st</span><span>.</span><span>selectbox</span><span>(</span><span>"</span><span>Select an image:</span><span>"</span><span>,</span> <span>options</span><span>=</span><span>[</span><span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>])</span><span>generate_button</span> <span>=</span> <span>st</span><span>.</span><span>form_submit_button</span><span>(</span><span>"</span><span>Generate Image Caption</span><span>"</span><span>)</span><span>if</span> <span>generate_button</span><span>:</span><span>image_info</span> <span>=</span> <span>next</span><span>(</span><span>img</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>if</span> <span>img</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span> <span>==</span> <span>selected_image</span><span>)</span><span>generated_description</span> <span>=</span> <span>generate_description</span><span>(</span><span>image_info</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>])</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>=</span> <span>generated_description</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span> <span>=</span> <span>selected_image</span><span>st</span><span>.</span><span>image</span><span>(</span><span>image_info</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>],</span> <span>width</span><span>=</span><span>300</span><span>)</span><span>st</span><span>.</span><span>write</span><span>(</span><span>"</span><span>Generated Caption:</span><span>"</span><span>)</span><span>st</span><span>.</span><span>write</span><span>(</span><span>generated_description</span><span>)</span><span>if</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>is</span> <span>not</span> <span>None</span><span>:</span><span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Save Caption</span><span>"</span><span>):</span><span>update_snowflake_table</span><span>(</span><span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>'</span><span>FILE_NAME</span><span>'</span><span>:</span> <span>[</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span><span>],</span> <span>'</span><span>DESCRIPTION</span><span>'</span><span>:</span> <span>[</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span><span>]}))</span><span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Caption saved successfully</span><span>"</span><span>)</span><span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>=</span> <span>None</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span> <span>=</span> <span>None</span><span># Batch process images without descriptions </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Batch Caption Generation for Uncaptioned Images</span><span>"</span><span>)</span><span>images_without_description</span> <span>=</span> <span>[</span><span>img</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>if</span> <span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>].</span><span>notna</span><span>()</span> <span>&</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>]</span> <span>!=</span> <span>""</span><span>)</span><span>][</span><span>"</span><span>FILE_NAME</span><span>"</span><span>].</span><span>values</span><span>]</span><span>if</span> <span>images_without_description</span><span>:</span><span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>{</span><span>len</span><span>(</span><span>images_without_description</span><span>)</span><span>}</span><span> images don</span><span>'</span><span>t have captions.</span><span>"</span><span>)</span><span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Generate Captions in Batch</span><span>"</span><span>):</span><span>progress_bar</span> <span>=</span> <span>st</span><span>.</span><span>progress</span><span>(</span><span>0</span><span>)</span><span>for</span> <span>i</span><span>,</span> <span>img</span> <span>in</span> <span>enumerate</span><span>(</span><span>images_without_description</span><span>):</span><span>generated_description</span> <span>=</span> <span>generate_description</span><span>(</span><span>img</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>])</span><span>update_snowflake_table</span><span>(</span><span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>'</span><span>FILE_NAME</span><span>'</span><span>:</span> <span>[</span><span>img</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]],</span> <span>'</span><span>DESCRIPTION</span><span>'</span><span>:</span> <span>[</span><span>generated_description</span><span>]}))</span><span>progress_bar</span><span>.</span><span>progress</span><span>((</span><span>i</span> <span>+</span> <span>1</span><span>)</span> <span>/</span> <span>len</span><span>(</span><span>images_without_description</span><span>))</span><span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Captions generated and saved for all images!</span><span>"</span><span>)</span><span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span><span>else</span><span>:</span><span>st</span><span>.</span><span>write</span><span>(</span><span>"</span><span>All images have captions.</span><span>"</span><span>)</span><span># Display debug information </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Metadata Information</span><span>"</span><span>)</span><span>st</span><span>.</span><span>write</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>)</span><span># Function to generate vector data using Cortex LLM's Embedding function </span><span>def</span> <span>generate_embedding</span><span>(</span><span>text</span><span>):</span><span>if</span> <span>text</span> <span>and</span> <span>text</span><span>.</span><span>strip</span><span>():</span><span>result</span> <span>=</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>f</span><span>"</span><span>SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, </span><span>'</span><span>{</span><span>text</span><span>}</span><span>'</span><span>) as embedding</span><span>"</span><span>).</span><span>collect</span><span>()</span><span>return</span> <span>result</span><span>[</span><span>0</span><span>][</span><span>'</span><span>EMBEDDING</span><span>'</span><span>]</span><span>return</span> <span>None</span><span># Function to generate and save vector data </span><span>def</span> <span>generate_and_save_vectors</span><span>():</span><span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>🧬 Automatic Vector Data Generation</span><span>"</span><span>)</span><span># Get metadata (including vector data information) </span> <span>full_metadata</span> <span>=</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>select</span><span>(</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>,</span> <span>"</span><span>VECTOR</span><span>"</span><span>).</span><span>to_pandas</span><span>()</span><span># Extract images without vector data </span> <span>images_without_vector</span> <span>=</span> <span>full_metadata</span><span>[</span><span>(</span><span>full_metadata</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>].</span><span>notna</span><span>())</span> <span>&</span><span>(</span><span>full_metadata</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>]</span> <span>!=</span> <span>""</span><span>)</span> <span>&</span><span>(</span><span>full_metadata</span><span>[</span><span>'</span><span>VECTOR</span><span>'</span><span>].</span><span>isna</span><span>())</span> <span># Only rows without vector data </span> <span>]</span><span>if</span> <span>images_without_vector</span><span>.</span><span>empty</span><span>:</span><span>st</span><span>.</span><span>write</span><span>(</span><span>"</span><span>All images have vector data.</span><span>"</span><span>)</span><span>else</span><span>:</span><span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>Vector data can be generated for </span><span>{</span><span>len</span><span>(</span><span>images_without_vector</span><span>)</span><span>}</span><span> images.</span><span>"</span><span>)</span><span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Generate Vector Data</span><span>"</span><span>):</span><span>progress_bar</span> <span>=</span> <span>st</span><span>.</span><span>progress</span><span>(</span><span>0</span><span>)</span><span>for</span> <span>i</span><span>,</span> <span>(</span><span>_</span><span>,</span> <span>row</span><span>)</span> <span>in</span> <span>enumerate</span><span>(</span><span>images_without_vector</span><span>.</span><span>iterrows</span><span>()):</span><span>params_df</span> <span>=</span> <span>session</span><span>.</span><span>create_dataframe</span><span>([[</span><span>row</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>],</span> <span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]]],</span> <span>schema</span><span>=</span><span>[</span><span>"</span><span>description</span><span>"</span><span>,</span> <span>"</span><span>file_name</span><span>"</span><span>])</span><span>session</span><span>.</span><span>sql</span><span>(</span><span>"""</span><span> UPDATE IMAGE_METADATA SET VECTOR = SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, description) WHERE FILE_NAME = file_name AND VECTOR IS NULL </span><span>"""</span><span>).</span><span>join</span><span>(</span><span>params_df</span><span>).</span><span>collect</span><span>()</span><span>progress_bar</span><span>.</span><span>progress</span><span>((</span><span>i</span> <span>+</span> <span>1</span><span>)</span> <span>/</span> <span>len</span><span>(</span><span>images_without_vector</span><span>))</span><span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Vector data generated and saved for all target images!</span><span>"</span><span>)</span><span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span><span># Display debug information </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Metadata Information</span><span>"</span><span>)</span><span>updated_full_metadata</span> <span>=</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>select</span><span>(</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>,</span> <span>"</span><span>VECTOR</span><span>"</span><span>).</span><span>to_pandas</span><span>()</span><span>st</span><span>.</span><span>write</span><span>(</span><span>updated_full_metadata</span><span>)</span><span># Main application execution </span><span>if</span> <span>__name__</span> <span>==</span> <span>"</span><span>__main__</span><span>"</span><span>:</span><span>st</span><span>.</span><span>sidebar</span><span>.</span><span>title</span><span>(</span><span>"</span><span>Navigation</span><span>"</span><span>)</span><span>page</span> <span>=</span> <span>st</span><span>.</span><span>sidebar</span><span>.</span><span>radio</span><span>(</span><span>"</span><span>Select a feature to use:</span><span>"</span><span>,</span><span>[</span><span>"</span><span>Image Gallery</span><span>"</span><span>,</span> <span>"</span><span>Edit Captions</span><span>"</span><span>,</span> <span>"</span><span>Auto-generate Captions</span><span>"</span><span>,</span> <span>"</span><span>Auto-generate Vector Data</span><span>"</span><span>]</span><span>)</span><span>if</span> <span>page</span> <span>==</span> <span>"</span><span>Image Gallery</span><span>"</span><span>:</span><span>show_image_gallery</span><span>()</span><span>elif</span> <span>page</span> <span>==</span> <span>"</span><span>Edit Captions</span><span>"</span><span>:</span><span>edit_image_descriptions</span><span>()</span><span>elif</span> <span>page</span> <span>==</span> <span>"</span><span>Auto-generate Captions</span><span>"</span><span>:</span><span>generate_image_descriptions</span><span>()</span><span>elif</span> <span>page</span> <span>==</span> <span>"</span><span>Auto-generate Vector Data</span><span>"</span><span>:</span><span>generate_and_save_vectors</span><span>()</span><span>import</span> <span>streamlit</span> <span>as</span> <span>st</span> <span>import</span> <span>pandas</span> <span>as</span> <span>pd</span> <span>import</span> <span>os</span> <span>import</span> <span>base64</span> <span>import</span> <span>boto3</span> <span>import</span> <span>json</span> <span>from</span> <span>snowflake.snowpark.context</span> <span>import</span> <span>get_active_session</span> <span>from</span> <span>snowflake.snowpark.functions</span> <span>import</span> <span>col</span><span>,</span> <span>when_matched</span><span>,</span> <span>when_not_matched</span><span>,</span> <span>lit</span><span>,</span> <span>call_udf</span> <span>import</span> <span>_snowflake</span> <span>from</span> <span>PIL</span> <span>import</span> <span>Image</span> <span>import</span> <span>io</span> <span># Set custom theme </span><span>st</span><span>.</span><span>set_page_config</span><span>(</span> <span>page_title</span><span>=</span><span>"</span><span>Image Gallery</span><span>"</span><span>,</span> <span>layout</span><span>=</span><span>"</span><span>wide</span><span>"</span><span>,</span> <span>initial_sidebar_state</span><span>=</span><span>"</span><span>expanded</span><span>"</span><span>,</span> <span>)</span> <span># Add custom CSS </span><span>st</span><span>.</span><span>markdown</span><span>(</span><span>"""</span><span> <style> .reportview-container { background: #f0f2f6; } .main .block-container { padding-top: 2rem; padding-bottom: 2rem; padding-left: 5rem; padding-right: 5rem; } .stButton>button { background-color: #4CAF50; color: white; padding: 10px 20px; border: none; border-radius: 5px; cursor: pointer; transition: background-color 0.3s; } .stButton>button:hover { background-color: #45a049; } .stTextInput>div>div>input { border-radius: 5px; } .stSelectbox>div>div>select { border-radius: 5px; } h1, h2, h3 { color: #2c3e50; } .stProgress > div > div > div > div { background-color: #4CAF50; } </style> </span><span>"""</span><span>,</span> <span>unsafe_allow_html</span><span>=</span><span>True</span><span>)</span> <span># Image folder path </span><span>IMAGE_FOLDER</span> <span>=</span> <span>"</span><span>image</span><span>"</span> <span># Get Snowflake session </span><span>session</span> <span>=</span> <span>get_active_session</span><span>()</span> <span># Create table (only on first run) </span><span>@st.cache_resource</span> <span>def</span> <span>create_table_if_not_exists</span><span>():</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>"""</span><span> CREATE TABLE IF NOT EXISTS IMAGE_METADATA ( FILE_NAME STRING, DESCRIPTION STRING, VECTOR VECTOR(FLOAT, 1024) ) </span><span>"""</span><span>).</span><span>collect</span><span>()</span> <span>create_table_if_not_exists</span><span>()</span> <span># Function to get AWS credentials </span><span>def</span> <span>get_aws_credentials</span><span>():</span> <span>aws_key_object</span> <span>=</span> <span>_snowflake</span><span>.</span><span>get_username_password</span><span>(</span><span>'</span><span>bedrock_key</span><span>'</span><span>)</span> <span>region</span> <span>=</span> <span>'</span><span>us-east-1</span><span>'</span> <span>return</span> <span>{</span> <span>'</span><span>aws_access_key_id</span><span>'</span><span>:</span> <span>aws_key_object</span><span>.</span><span>username</span><span>,</span> <span>'</span><span>aws_secret_access_key</span><span>'</span><span>:</span> <span>aws_key_object</span><span>.</span><span>password</span><span>,</span> <span>'</span><span>region_name</span><span>'</span><span>:</span> <span>region</span> <span>},</span> <span>region</span> <span># Set up Bedrock client </span><span>boto3_session_args</span><span>,</span> <span>region</span> <span>=</span> <span>get_aws_credentials</span><span>()</span> <span>boto3_session</span> <span>=</span> <span>boto3</span><span>.</span><span>Session</span><span>(</span><span>**</span><span>boto3_session_args</span><span>)</span> <span>bedrock</span> <span>=</span> <span>boto3_session</span><span>.</span><span>client</span><span>(</span><span>'</span><span>bedrock-runtime</span><span>'</span><span>,</span> <span>region_name</span><span>=</span><span>region</span><span>)</span> <span># Get image data </span><span>@st.cache_data</span> <span>def</span> <span>get_image_data</span><span>():</span> <span>image_files</span> <span>=</span> <span>[</span><span>f</span> <span>for</span> <span>f</span> <span>in</span> <span>os</span><span>.</span><span>listdir</span><span>(</span><span>IMAGE_FOLDER</span><span>)</span> <span>if</span> <span>f</span><span>.</span><span>lower</span><span>().</span><span>endswith</span><span>((</span><span>'</span><span>.png</span><span>'</span><span>,</span> <span>'</span><span>.jpg</span><span>'</span><span>,</span> <span>'</span><span>.jpeg</span><span>'</span><span>,</span> <span>'</span><span>.gif</span><span>'</span><span>))]</span> <span>return</span> <span>[{</span><span>"</span><span>FILE_NAME</span><span>"</span><span>:</span> <span>f</span><span>,</span> <span>"</span><span>IMG_PATH</span><span>"</span><span>:</span> <span>os</span><span>.</span><span>path</span><span>.</span><span>join</span><span>(</span><span>IMAGE_FOLDER</span><span>,</span> <span>f</span><span>)}</span> <span>for</span> <span>f</span> <span>in</span> <span>image_files</span><span>]</span> <span># Get metadata </span><span>@st.cache_data</span> <span>def</span> <span>get_metadata</span><span>():</span> <span>return</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>select</span><span>(</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>).</span><span>to_pandas</span><span>()</span> <span># Convert image to thumbnail and encode in base64 </span><span>@st.cache_data</span> <span>def</span> <span>get_thumbnail_base64</span><span>(</span><span>img_path</span><span>,</span> <span>max_size</span><span>=</span><span>(</span><span>300</span><span>,</span> <span>300</span><span>)):</span> <span>with</span> <span>Image</span><span>.</span><span>open</span><span>(</span><span>img_path</span><span>)</span> <span>as</span> <span>img</span><span>:</span> <span>img</span><span>.</span><span>thumbnail</span><span>(</span><span>max_size</span><span>)</span> <span>buffered</span> <span>=</span> <span>io</span><span>.</span><span>BytesIO</span><span>()</span> <span>img</span><span>.</span><span>save</span><span>(</span><span>buffered</span><span>,</span> <span>format</span><span>=</span><span>"</span><span>JPEG</span><span>"</span><span>)</span> <span>return</span> <span>base64</span><span>.</span><span>b64encode</span><span>(</span><span>buffered</span><span>.</span><span>getvalue</span><span>()).</span><span>decode</span><span>(</span><span>'</span><span>utf-8</span><span>'</span><span>)</span> <span># Initialize image data and metadata </span><span>if</span> <span>'</span><span>img_df</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>=</span> <span>get_image_data</span><span>()</span> <span>if</span> <span>'</span><span>metadata_df</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span> <span># Display image gallery </span><span>def</span> <span>show_image_gallery</span><span>():</span> <span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>️ Image Gallery</span><span>"</span><span>)</span> <span># Add search box </span> <span>search_query</span> <span>=</span> <span>st</span><span>.</span><span>text_input</span><span>(</span><span>"</span><span>Search images (top 10 most relevant results will be displayed)</span><span>"</span><span>,</span> <span>""</span><span>)</span> <span>if</span> <span>search_query</span><span>:</span> <span># Escape search query (basic SQL injection prevention) </span> <span>escaped_query</span> <span>=</span> <span>search_query</span><span>.</span><span>replace</span><span>(</span><span>"'"</span><span>,</span> <span>"''"</span><span>)</span> <span># Vectorize search query and calculate similarity </span> <span>search_results</span> <span>=</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>f</span><span>"""</span><span> WITH search_vector AS ( SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, </span><span>'</span><span>{</span><span>escaped_query</span><span>}</span><span>'</span><span>) as embedding ) SELECT i.FILE_NAME, i.DESCRIPTION, VECTOR_COSINE_SIMILARITY(i.VECTOR, s.embedding) as similarity FROM IMAGE_METADATA i, search_vector s WHERE i.VECTOR IS NOT NULL ORDER BY similarity DESC LIMIT 10 </span><span>"""</span><span>).</span><span>collect</span><span>()</span> <span># Display search results </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Search Results</span><span>"</span><span>)</span> <span>for</span> <span>result</span> <span>in</span> <span>search_results</span><span>:</span> <span>file_name</span> <span>=</span> <span>result</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span> <span>description</span> <span>=</span> <span>result</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>]</span> <span>similarity</span> <span>=</span> <span>result</span><span>[</span><span>'</span><span>SIMILARITY</span><span>'</span><span>]</span> <span>img_path</span> <span>=</span> <span>next</span><span>((</span><span>img</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>]</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>if</span> <span>img</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span> <span>==</span> <span>file_name</span><span>),</span> <span>None</span><span>)</span> <span>if</span> <span>img_path</span><span>:</span> <span>col1</span><span>,</span> <span>col2</span> <span>=</span> <span>st</span><span>.</span><span>columns</span><span>([</span><span>1</span><span>,</span> <span>3</span><span>])</span> <span>with</span> <span>col1</span><span>:</span> <span>st</span><span>.</span><span>image</span><span>(</span><span>img_path</span><span>,</span> <span>width</span><span>=</span><span>150</span><span>)</span> <span>with</span> <span>col2</span><span>:</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>File name: </span><span>{</span><span>file_name</span><span>}</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>Description: </span><span>{</span><span>description</span><span>}</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>Match rate: </span><span>{</span><span>similarity</span><span>:</span><span>.</span><span>1</span><span>%</span><span>}</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>markdown</span><span>(</span><span>"</span><span>---</span><span>"</span><span>)</span> <span>else</span><span>:</span> <span># Normal gallery display </span> <span>num_columns</span> <span>=</span> <span>st</span><span>.</span><span>slider</span><span>(</span><span>"</span><span>Width:</span><span>"</span><span>,</span> <span>min_value</span><span>=</span><span>1</span><span>,</span> <span>max_value</span><span>=</span><span>5</span><span>,</span> <span>value</span><span>=</span><span>4</span><span>)</span> <span>cols</span> <span>=</span> <span>st</span><span>.</span><span>columns</span><span>(</span><span>num_columns</span><span>)</span> <span>for</span> <span>i</span><span>,</span> <span>img</span> <span>in</span> <span>enumerate</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>):</span> <span>with</span> <span>cols</span><span>[</span><span>i</span> <span>%</span> <span>num_columns</span><span>]:</span> <span>st</span><span>.</span><span>image</span><span>(</span><span>img</span><span>[</span><span>"</span><span>IMG_PATH</span><span>"</span><span>],</span> <span>caption</span><span>=</span><span>None</span><span>,</span> <span>use_column_width</span><span>=</span><span>True</span><span>)</span> <span># Edit image descriptions </span><span>def</span> <span>edit_image_descriptions</span><span>():</span> <span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>️ Edit Image Captions</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span> <span># Add new images to metadata </span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>:</span> <span>if</span> <span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>].</span><span>values</span><span>:</span> <span>new_row</span> <span>=</span> <span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>"</span><span>FILE_NAME</span><span>"</span><span>:</span> <span>[</span><span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]],</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>:</span> <span>[</span><span>""</span><span>]})</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>pd</span><span>.</span><span>concat</span><span>([</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>,</span> <span>new_row</span><span>],</span> <span>ignore_index</span><span>=</span><span>True</span><span>)</span> <span>merged_df</span> <span>=</span> <span>pd</span><span>.</span><span>merge</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>,</span> <span>pd</span><span>.</span><span>DataFrame</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>),</span> <span>on</span><span>=</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>how</span><span>=</span><span>"</span><span>left</span><span>"</span><span>)</span> <span>with</span> <span>st</span><span>.</span><span>form</span><span>(</span><span>"</span><span>edit_descriptions</span><span>"</span><span>):</span> <span>for</span> <span>_</span><span>,</span> <span>row</span> <span>in</span> <span>merged_df</span><span>.</span><span>iterrows</span><span>():</span> <span>col1</span><span>,</span> <span>col2</span> <span>=</span> <span>st</span><span>.</span><span>columns</span><span>([</span><span>1</span><span>,</span> <span>3</span><span>])</span> <span>with</span> <span>col1</span><span>:</span> <span>st</span><span>.</span><span>image</span><span>(</span><span>row</span><span>[</span><span>"</span><span>IMG_PATH</span><span>"</span><span>],</span> <span>width</span><span>=</span><span>100</span><span>)</span> <span>with</span> <span>col2</span><span>:</span> <span>new_description</span> <span>=</span> <span>st</span><span>.</span><span>text_input</span><span>(</span><span>f</span><span>"</span><span>File name: </span><span>{</span><span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span><span>}</span><span>"</span><span>,</span> <span>value</span><span>=</span><span>row</span><span>[</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>],</span> <span>key</span><span>=</span><span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>])</span> <span>merged_df</span><span>.</span><span>loc</span><span>[</span><span>merged_df</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>==</span> <span>row</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>],</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>]</span> <span>=</span> <span>new_description</span> <span>submit_button</span> <span>=</span> <span>st</span><span>.</span><span>form_submit_button</span><span>(</span><span>"</span><span>Save Changes</span><span>"</span><span>)</span> <span>if</span> <span>submit_button</span><span>:</span> <span>update_snowflake_table</span><span>(</span><span>merged_df</span><span>[[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>,</span> <span>'</span><span>DESCRIPTION</span><span>'</span><span>]])</span> <span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Changes saved successfully!</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span> <span># Function to generate image description </span><span>def</span> <span>generate_description</span><span>(</span><span>image_path</span><span>):</span> <span>image_base64</span> <span>=</span> <span>get_thumbnail_base64</span><span>(</span><span>image_path</span><span>)</span> <span>prompt</span> <span>=</span> <span>"""</span><span> Please describe this image in English within 400 characters in a single line. No need for a response, just output the image description. </span><span>"""</span> <span>request_body</span> <span>=</span> <span>{</span> <span>"</span><span>anthropic_version</span><span>"</span><span>:</span> <span>"</span><span>bedrock-2023-05-31</span><span>"</span><span>,</span> <span>"</span><span>max_tokens</span><span>"</span><span>:</span> <span>200000</span><span>,</span> <span>"</span><span>messages</span><span>"</span><span>:</span> <span>[</span> <span>{</span> <span>"</span><span>role</span><span>"</span><span>:</span> <span>"</span><span>user</span><span>"</span><span>,</span> <span>"</span><span>content</span><span>"</span><span>:</span> <span>[</span> <span>{</span> <span>"</span><span>type</span><span>"</span><span>:</span> <span>"</span><span>image</span><span>"</span><span>,</span> <span>"</span><span>source</span><span>"</span><span>:</span> <span>{</span> <span>"</span><span>type</span><span>"</span><span>:</span> <span>"</span><span>base64</span><span>"</span><span>,</span> <span>"</span><span>media_type</span><span>"</span><span>:</span> <span>"</span><span>image/jpeg</span><span>"</span><span>,</span> <span>"</span><span>data</span><span>"</span><span>:</span> <span>image_base64</span> <span>}</span> <span>},</span> <span>{</span> <span>"</span><span>type</span><span>"</span><span>:</span> <span>"</span><span>text</span><span>"</span><span>,</span> <span>"</span><span>text</span><span>"</span><span>:</span> <span>prompt</span> <span>}</span> <span>]</span> <span>}</span> <span>]</span> <span>}</span> <span>response</span> <span>=</span> <span>bedrock</span><span>.</span><span>invoke_model</span><span>(</span> <span>body</span><span>=</span><span>json</span><span>.</span><span>dumps</span><span>(</span><span>request_body</span><span>),</span> <span>modelId</span><span>=</span><span>"</span><span>anthropic.claude-3-5-sonnet-20240620-v1:0</span><span>"</span><span>,</span> <span>accept</span><span>=</span><span>'</span><span>application/json</span><span>'</span><span>,</span> <span>contentType</span><span>=</span><span>'</span><span>application/json</span><span>'</span> <span>)</span> <span>response_body</span> <span>=</span> <span>json</span><span>.</span><span>loads</span><span>(</span><span>response</span><span>.</span><span>get</span><span>(</span><span>'</span><span>body</span><span>'</span><span>).</span><span>read</span><span>())</span> <span>return</span> <span>response_body</span><span>[</span><span>"</span><span>content</span><span>"</span><span>][</span><span>0</span><span>][</span><span>"</span><span>text</span><span>"</span><span>]</span> <span># Function to update Snowflake table </span><span>def</span> <span>update_snowflake_table</span><span>(</span><span>update_df</span><span>):</span> <span>snow_df</span> <span>=</span> <span>session</span><span>.</span><span>create_dataframe</span><span>(</span><span>update_df</span><span>)</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>merge</span><span>(</span> <span>snow_df</span><span>,</span> <span>(</span><span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>FILE_NAME</span> <span>==</span> <span>snow_df</span><span>.</span><span>FILE_NAME</span><span>),</span> <span>[</span> <span>when_matched</span><span>().</span><span>update</span><span>({</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>:</span> <span>snow_df</span><span>.</span><span>DESCRIPTION</span> <span>}),</span> <span>when_not_matched</span><span>().</span><span>insert</span><span>({</span> <span>"</span><span>FILE_NAME</span><span>"</span><span>:</span> <span>snow_df</span><span>.</span><span>FILE_NAME</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>:</span> <span>snow_df</span><span>.</span><span>DESCRIPTION</span> <span>})</span> <span>]</span> <span>)</span> <span># Generate image descriptions </span><span>def</span> <span>generate_image_descriptions</span><span>():</span> <span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>🤖 Automatic Image Caption Generation</span><span>"</span><span>)</span> <span>if</span> <span>'</span><span>generated_description</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>=</span> <span>None</span> <span>if</span> <span>'</span><span>selected_image</span><span>'</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>:</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span> <span>=</span> <span>None</span> <span># Generate description for individual image </span> <span>with</span> <span>st</span><span>.</span><span>form</span><span>(</span><span>"</span><span>generate_description</span><span>"</span><span>):</span> <span>selected_image</span> <span>=</span> <span>st</span><span>.</span><span>selectbox</span><span>(</span><span>"</span><span>Select an image:</span><span>"</span><span>,</span> <span>options</span><span>=</span><span>[</span><span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span><span>])</span> <span>generate_button</span> <span>=</span> <span>st</span><span>.</span><span>form_submit_button</span><span>(</span><span>"</span><span>Generate Image Caption</span><span>"</span><span>)</span> <span>if</span> <span>generate_button</span><span>:</span> <span>image_info</span> <span>=</span> <span>next</span><span>(</span><span>img</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>if</span> <span>img</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]</span> <span>==</span> <span>selected_image</span><span>)</span> <span>generated_description</span> <span>=</span> <span>generate_description</span><span>(</span><span>image_info</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>])</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>=</span> <span>generated_description</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span> <span>=</span> <span>selected_image</span> <span>st</span><span>.</span><span>image</span><span>(</span><span>image_info</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>],</span> <span>width</span><span>=</span><span>300</span><span>)</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>"</span><span>Generated Caption:</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>generated_description</span><span>)</span> <span>if</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>is</span> <span>not</span> <span>None</span><span>:</span> <span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Save Caption</span><span>"</span><span>):</span> <span>update_snowflake_table</span><span>(</span><span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>'</span><span>FILE_NAME</span><span>'</span><span>:</span> <span>[</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span><span>],</span> <span>'</span><span>DESCRIPTION</span><span>'</span><span>:</span> <span>[</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span><span>]}))</span> <span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Caption saved successfully</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>generated_description</span> <span>=</span> <span>None</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>selected_image</span> <span>=</span> <span>None</span> <span># Batch process images without descriptions </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Batch Caption Generation for Uncaptioned Images</span><span>"</span><span>)</span> <span>images_without_description</span> <span>=</span> <span>[</span> <span>img</span> <span>for</span> <span>img</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>img_df</span> <span>if</span> <span>img</span><span>[</span><span>"</span><span>FILE_NAME</span><span>"</span><span>]</span> <span>not</span> <span>in</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>].</span><span>notna</span><span>()</span> <span>&</span> <span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>[</span><span>"</span><span>DESCRIPTION</span><span>"</span><span>]</span> <span>!=</span> <span>""</span><span>)</span> <span>][</span><span>"</span><span>FILE_NAME</span><span>"</span><span>].</span><span>values</span> <span>]</span> <span>if</span> <span>images_without_description</span><span>:</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>{</span><span>len</span><span>(</span><span>images_without_description</span><span>)</span><span>}</span><span> images don</span><span>'</span><span>t have captions.</span><span>"</span><span>)</span> <span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Generate Captions in Batch</span><span>"</span><span>):</span> <span>progress_bar</span> <span>=</span> <span>st</span><span>.</span><span>progress</span><span>(</span><span>0</span><span>)</span> <span>for</span> <span>i</span><span>,</span> <span>img</span> <span>in</span> <span>enumerate</span><span>(</span><span>images_without_description</span><span>):</span> <span>generated_description</span> <span>=</span> <span>generate_description</span><span>(</span><span>img</span><span>[</span><span>'</span><span>IMG_PATH</span><span>'</span><span>])</span> <span>update_snowflake_table</span><span>(</span><span>pd</span><span>.</span><span>DataFrame</span><span>({</span><span>'</span><span>FILE_NAME</span><span>'</span><span>:</span> <span>[</span><span>img</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]],</span> <span>'</span><span>DESCRIPTION</span><span>'</span><span>:</span> <span>[</span><span>generated_description</span><span>]}))</span> <span>progress_bar</span><span>.</span><span>progress</span><span>((</span><span>i</span> <span>+</span> <span>1</span><span>)</span> <span>/</span> <span>len</span><span>(</span><span>images_without_description</span><span>))</span> <span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Captions generated and saved for all images!</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span> <span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span> <span>=</span> <span>get_metadata</span><span>()</span> <span>else</span><span>:</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>"</span><span>All images have captions.</span><span>"</span><span>)</span> <span># Display debug information </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Metadata Information</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>st</span><span>.</span><span>session_state</span><span>.</span><span>metadata_df</span><span>)</span> <span># Function to generate vector data using Cortex LLM's Embedding function </span><span>def</span> <span>generate_embedding</span><span>(</span><span>text</span><span>):</span> <span>if</span> <span>text</span> <span>and</span> <span>text</span><span>.</span><span>strip</span><span>():</span> <span>result</span> <span>=</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>f</span><span>"</span><span>SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, </span><span>'</span><span>{</span><span>text</span><span>}</span><span>'</span><span>) as embedding</span><span>"</span><span>).</span><span>collect</span><span>()</span> <span>return</span> <span>result</span><span>[</span><span>0</span><span>][</span><span>'</span><span>EMBEDDING</span><span>'</span><span>]</span> <span>return</span> <span>None</span> <span># Function to generate and save vector data </span><span>def</span> <span>generate_and_save_vectors</span><span>():</span> <span>st</span><span>.</span><span>title</span><span>(</span><span>"</span><span>🧬 Automatic Vector Data Generation</span><span>"</span><span>)</span> <span># Get metadata (including vector data information) </span> <span>full_metadata</span> <span>=</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>select</span><span>(</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>,</span> <span>"</span><span>VECTOR</span><span>"</span><span>).</span><span>to_pandas</span><span>()</span> <span># Extract images without vector data </span> <span>images_without_vector</span> <span>=</span> <span>full_metadata</span><span>[</span> <span>(</span><span>full_metadata</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>].</span><span>notna</span><span>())</span> <span>&</span> <span>(</span><span>full_metadata</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>]</span> <span>!=</span> <span>""</span><span>)</span> <span>&</span> <span>(</span><span>full_metadata</span><span>[</span><span>'</span><span>VECTOR</span><span>'</span><span>].</span><span>isna</span><span>())</span> <span># Only rows without vector data </span> <span>]</span> <span>if</span> <span>images_without_vector</span><span>.</span><span>empty</span><span>:</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>"</span><span>All images have vector data.</span><span>"</span><span>)</span> <span>else</span><span>:</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>f</span><span>"</span><span>Vector data can be generated for </span><span>{</span><span>len</span><span>(</span><span>images_without_vector</span><span>)</span><span>}</span><span> images.</span><span>"</span><span>)</span> <span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Generate Vector Data</span><span>"</span><span>):</span> <span>progress_bar</span> <span>=</span> <span>st</span><span>.</span><span>progress</span><span>(</span><span>0</span><span>)</span> <span>for</span> <span>i</span><span>,</span> <span>(</span><span>_</span><span>,</span> <span>row</span><span>)</span> <span>in</span> <span>enumerate</span><span>(</span><span>images_without_vector</span><span>.</span><span>iterrows</span><span>()):</span> <span>params_df</span> <span>=</span> <span>session</span><span>.</span><span>create_dataframe</span><span>([[</span><span>row</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>],</span> <span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]]],</span> <span>schema</span><span>=</span><span>[</span><span>"</span><span>description</span><span>"</span><span>,</span> <span>"</span><span>file_name</span><span>"</span><span>])</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>"""</span><span> UPDATE IMAGE_METADATA SET VECTOR = SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, description) WHERE FILE_NAME = file_name AND VECTOR IS NULL </span><span>"""</span><span>).</span><span>join</span><span>(</span><span>params_df</span><span>).</span><span>collect</span><span>()</span> <span>progress_bar</span><span>.</span><span>progress</span><span>((</span><span>i</span> <span>+</span> <span>1</span><span>)</span> <span>/</span> <span>len</span><span>(</span><span>images_without_vector</span><span>))</span> <span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Vector data generated and saved for all target images!</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span> <span># Display debug information </span> <span>st</span><span>.</span><span>subheader</span><span>(</span><span>"</span><span>Metadata Information</span><span>"</span><span>)</span> <span>updated_full_metadata</span> <span>=</span> <span>session</span><span>.</span><span>table</span><span>(</span><span>"</span><span>IMAGE_METADATA</span><span>"</span><span>).</span><span>select</span><span>(</span><span>"</span><span>FILE_NAME</span><span>"</span><span>,</span> <span>"</span><span>DESCRIPTION</span><span>"</span><span>,</span> <span>"</span><span>VECTOR</span><span>"</span><span>).</span><span>to_pandas</span><span>()</span> <span>st</span><span>.</span><span>write</span><span>(</span><span>updated_full_metadata</span><span>)</span> <span># Main application execution </span><span>if</span> <span>__name__</span> <span>==</span> <span>"</span><span>__main__</span><span>"</span><span>:</span> <span>st</span><span>.</span><span>sidebar</span><span>.</span><span>title</span><span>(</span><span>"</span><span>Navigation</span><span>"</span><span>)</span> <span>page</span> <span>=</span> <span>st</span><span>.</span><span>sidebar</span><span>.</span><span>radio</span><span>(</span> <span>"</span><span>Select a feature to use:</span><span>"</span><span>,</span> <span>[</span><span>"</span><span>Image Gallery</span><span>"</span><span>,</span> <span>"</span><span>Edit Captions</span><span>"</span><span>,</span> <span>"</span><span>Auto-generate Captions</span><span>"</span><span>,</span> <span>"</span><span>Auto-generate Vector Data</span><span>"</span><span>]</span> <span>)</span> <span>if</span> <span>page</span> <span>==</span> <span>"</span><span>Image Gallery</span><span>"</span><span>:</span> <span>show_image_gallery</span><span>()</span> <span>elif</span> <span>page</span> <span>==</span> <span>"</span><span>Edit Captions</span><span>"</span><span>:</span> <span>edit_image_descriptions</span><span>()</span> <span>elif</span> <span>page</span> <span>==</span> <span>"</span><span>Auto-generate Captions</span><span>"</span><span>:</span> <span>generate_image_descriptions</span><span>()</span> <span>elif</span> <span>page</span> <span>==</span> <span>"</span><span>Auto-generate Vector Data</span><span>"</span><span>:</span> <span>generate_and_save_vectors</span><span>()</span>import streamlit as st import pandas as pd import os import base64 import boto3 import json from snowflake.snowpark.context import get_active_session from snowflake.snowpark.functions import col, when_matched, when_not_matched, lit, call_udf import _snowflake from PIL import Image import io # Set custom theme st.set_page_config( page_title="Image Gallery", layout="wide", initial_sidebar_state="expanded", ) # Add custom CSS st.markdown(""" <style> .reportview-container { background: #f0f2f6; } .main .block-container { padding-top: 2rem; padding-bottom: 2rem; padding-left: 5rem; padding-right: 5rem; } .stButton>button { background-color: #4CAF50; color: white; padding: 10px 20px; border: none; border-radius: 5px; cursor: pointer; transition: background-color 0.3s; } .stButton>button:hover { background-color: #45a049; } .stTextInput>div>div>input { border-radius: 5px; } .stSelectbox>div>div>select { border-radius: 5px; } h1, h2, h3 { color: #2c3e50; } .stProgress > div > div > div > div { background-color: #4CAF50; } </style> """, unsafe_allow_html=True) # Image folder path IMAGE_FOLDER = "image" # Get Snowflake session session = get_active_session() # Create table (only on first run) @st.cache_resource def create_table_if_not_exists(): session.sql(""" CREATE TABLE IF NOT EXISTS IMAGE_METADATA ( FILE_NAME STRING, DESCRIPTION STRING, VECTOR VECTOR(FLOAT, 1024) ) """).collect() create_table_if_not_exists() # Function to get AWS credentials def get_aws_credentials(): aws_key_object = _snowflake.get_username_password('bedrock_key') region = 'us-east-1' return { 'aws_access_key_id': aws_key_object.username, 'aws_secret_access_key': aws_key_object.password, 'region_name': region }, region # Set up Bedrock client boto3_session_args, region = get_aws_credentials() boto3_session = boto3.Session(**boto3_session_args) bedrock = boto3_session.client('bedrock-runtime', region_name=region) # Get image data @st.cache_data def get_image_data(): image_files = [f for f in os.listdir(IMAGE_FOLDER) if f.lower().endswith(('.png', '.jpg', '.jpeg', '.gif'))] return [{"FILE_NAME": f, "IMG_PATH": os.path.join(IMAGE_FOLDER, f)} for f in image_files] # Get metadata @st.cache_data def get_metadata(): return session.table("IMAGE_METADATA").select("FILE_NAME", "DESCRIPTION").to_pandas() # Convert image to thumbnail and encode in base64 @st.cache_data def get_thumbnail_base64(img_path, max_size=(300, 300)): with Image.open(img_path) as img: img.thumbnail(max_size) buffered = io.BytesIO() img.save(buffered, format="JPEG") return base64.b64encode(buffered.getvalue()).decode('utf-8') # Initialize image data and metadata if 'img_df' not in st.session_state: st.session_state.img_df = get_image_data() if 'metadata_df' not in st.session_state: st.session_state.metadata_df = get_metadata() # Display image gallery def show_image_gallery(): st.title("️ Image Gallery") # Add search box search_query = st.text_input("Search images (top 10 most relevant results will be displayed)", "") if search_query: # Escape search query (basic SQL injection prevention) escaped_query = search_query.replace("'", "''") # Vectorize search query and calculate similarity search_results = session.sql(f""" WITH search_vector AS ( SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024('voyage-multilingual-2', '{escaped_query}') as embedding ) SELECT i.FILE_NAME, i.DESCRIPTION, VECTOR_COSINE_SIMILARITY(i.VECTOR, s.embedding) as similarity FROM IMAGE_METADATA i, search_vector s WHERE i.VECTOR IS NOT NULL ORDER BY similarity DESC LIMIT 10 """).collect() # Display search results st.subheader("Search Results") for result in search_results: file_name = result['FILE_NAME'] description = result['DESCRIPTION'] similarity = result['SIMILARITY'] img_path = next((img['IMG_PATH'] for img in st.session_state.img_df if img['FILE_NAME'] == file_name), None) if img_path: col1, col2 = st.columns([1, 3]) with col1: st.image(img_path, width=150) with col2: st.write(f"File name: {file_name}") st.write(f"Description: {description}") st.write(f"Match rate: {similarity:.1%}") st.markdown("---") else: # Normal gallery display num_columns = st.slider("Width:", min_value=1, max_value=5, value=4) cols = st.columns(num_columns) for i, img in enumerate(st.session_state.img_df): with cols[i % num_columns]: st.image(img["IMG_PATH"], caption=None, use_column_width=True) # Edit image descriptions def edit_image_descriptions(): st.title("️ Edit Image Captions") st.session_state.metadata_df = get_metadata() # Add new images to metadata for img in st.session_state.img_df: if img["FILE_NAME"] not in st.session_state.metadata_df["FILE_NAME"].values: new_row = pd.DataFrame({"FILE_NAME": [img["FILE_NAME"]], "DESCRIPTION": [""]}) st.session_state.metadata_df = pd.concat([st.session_state.metadata_df, new_row], ignore_index=True) merged_df = pd.merge(st.session_state.metadata_df, pd.DataFrame(st.session_state.img_df), on="FILE_NAME", how="left") with st.form("edit_descriptions"): for _, row in merged_df.iterrows(): col1, col2 = st.columns([1, 3]) with col1: st.image(row["IMG_PATH"], width=100) with col2: new_description = st.text_input(f"File name: {row['FILE_NAME']}", value=row["DESCRIPTION"], key=row['FILE_NAME']) merged_df.loc[merged_df["FILE_NAME"] == row["FILE_NAME"], "DESCRIPTION"] = new_description submit_button = st.form_submit_button("Save Changes") if submit_button: update_snowflake_table(merged_df[['FILE_NAME', 'DESCRIPTION']]) st.success("Changes saved successfully!") st.cache_data.clear() st.session_state.metadata_df = get_metadata() # Function to generate image description def generate_description(image_path): image_base64 = get_thumbnail_base64(image_path) prompt = """ Please describe this image in English within 400 characters in a single line. No need for a response, just output the image description. """ request_body = { "anthropic_version": "bedrock-2023-05-31", "max_tokens": 200000, "messages": [ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": "image/jpeg", "data": image_base64 } }, { "type": "text", "text": prompt } ] } ] } response = bedrock.invoke_model( body=json.dumps(request_body), modelId="anthropic.claude-3-5-sonnet-20240620-v1:0", accept='application/json', contentType='application/json' ) response_body = json.loads(response.get('body').read()) return response_body["content"][0]["text"] # Function to update Snowflake table def update_snowflake_table(update_df): snow_df = session.create_dataframe(update_df) session.table("IMAGE_METADATA").merge( snow_df, (session.table("IMAGE_METADATA").FILE_NAME == snow_df.FILE_NAME), [ when_matched().update({ "DESCRIPTION": snow_df.DESCRIPTION }), when_not_matched().insert({ "FILE_NAME": snow_df.FILE_NAME, "DESCRIPTION": snow_df.DESCRIPTION }) ] ) # Generate image descriptions def generate_image_descriptions(): st.title("🤖 Automatic Image Caption Generation") if 'generated_description' not in st.session_state: st.session_state.generated_description = None if 'selected_image' not in st.session_state: st.session_state.selected_image = None # Generate description for individual image with st.form("generate_description"): selected_image = st.selectbox("Select an image:", options=[img["FILE_NAME"] for img in st.session_state.img_df]) generate_button = st.form_submit_button("Generate Image Caption") if generate_button: image_info = next(img for img in st.session_state.img_df if img['FILE_NAME'] == selected_image) generated_description = generate_description(image_info['IMG_PATH']) st.session_state.generated_description = generated_description st.session_state.selected_image = selected_image st.image(image_info['IMG_PATH'], width=300) st.write("Generated Caption:") st.write(generated_description) if st.session_state.generated_description is not None: if st.button("Save Caption"): update_snowflake_table(pd.DataFrame({'FILE_NAME': [st.session_state.selected_image], 'DESCRIPTION': [st.session_state.generated_description]})) st.success("Caption saved successfully") st.cache_data.clear() st.session_state.metadata_df = get_metadata() st.session_state.generated_description = None st.session_state.selected_image = None # Batch process images without descriptions st.subheader("Batch Caption Generation for Uncaptioned Images") images_without_description = [ img for img in st.session_state.img_df if img["FILE_NAME"] not in st.session_state.metadata_df[ st.session_state.metadata_df["DESCRIPTION"].notna() & (st.session_state.metadata_df["DESCRIPTION"] != "") ]["FILE_NAME"].values ] if images_without_description: st.write(f"{len(images_without_description)} images don't have captions.") if st.button("Generate Captions in Batch"): progress_bar = st.progress(0) for i, img in enumerate(images_without_description): generated_description = generate_description(img['IMG_PATH']) update_snowflake_table(pd.DataFrame({'FILE_NAME': [img['FILE_NAME']], 'DESCRIPTION': [generated_description]})) progress_bar.progress((i + 1) / len(images_without_description)) st.success("Captions generated and saved for all images!") st.cache_data.clear() st.session_state.metadata_df = get_metadata() else: st.write("All images have captions.") # Display debug information st.subheader("Metadata Information") st.write(st.session_state.metadata_df) # Function to generate vector data using Cortex LLM's Embedding function def generate_embedding(text): if text and text.strip(): result = session.sql(f"SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024('voyage-multilingual-2', '{text}') as embedding").collect() return result[0]['EMBEDDING'] return None # Function to generate and save vector data def generate_and_save_vectors(): st.title("🧬 Automatic Vector Data Generation") # Get metadata (including vector data information) full_metadata = session.table("IMAGE_METADATA").select("FILE_NAME", "DESCRIPTION", "VECTOR").to_pandas() # Extract images without vector data images_without_vector = full_metadata[ (full_metadata['DESCRIPTION'].notna()) & (full_metadata['DESCRIPTION'] != "") & (full_metadata['VECTOR'].isna()) # Only rows without vector data ] if images_without_vector.empty: st.write("All images have vector data.") else: st.write(f"Vector data can be generated for {len(images_without_vector)} images.") if st.button("Generate Vector Data"): progress_bar = st.progress(0) for i, (_, row) in enumerate(images_without_vector.iterrows()): params_df = session.create_dataframe([[row['DESCRIPTION'], row['FILE_NAME']]], schema=["description", "file_name"]) session.sql(""" UPDATE IMAGE_METADATA SET VECTOR = SNOWFLAKE.CORTEX.EMBED_TEXT_1024('voyage-multilingual-2', description) WHERE FILE_NAME = file_name AND VECTOR IS NULL """).join(params_df).collect() progress_bar.progress((i + 1) / len(images_without_vector)) st.success("Vector data generated and saved for all target images!") st.cache_data.clear() # Display debug information st.subheader("Metadata Information") updated_full_metadata = session.table("IMAGE_METADATA").select("FILE_NAME", "DESCRIPTION", "VECTOR").to_pandas() st.write(updated_full_metadata) # Main application execution if __name__ == "__main__": st.sidebar.title("Navigation") page = st.sidebar.radio( "Select a feature to use:", ["Image Gallery", "Edit Captions", "Auto-generate Captions", "Auto-generate Vector Data"] ) if page == "Image Gallery": show_image_gallery() elif page == "Edit Captions": edit_image_descriptions() elif page == "Auto-generate Captions": generate_image_descriptions() elif page == "Auto-generate Vector Data": generate_and_save_vectors()
Enter fullscreen mode Exit fullscreen mode
Explanation of Some Code Parts
The following section applies custom CSS to slightly customize the design. Streamlit in Snowflake allows customization of web applications using HTML
, CSS
, and JavaScript
. For more details, please check this documentation.
<span># Add custom CSS </span><span>st</span><span>.</span><span>markdown</span><span>(</span><span>"""</span><span> <style> .reportview-container { background: #f0f2f6; } .main .block-container { padding-top: 2rem; padding-bottom: 2rem; padding-left: 5rem; padding-right: 5rem; } .stButton>button { background-color: #4CAF50; color: white; padding: 10px 20px; border: none; border-radius: 5px; cursor: pointer; transition: background-color 0.3s; } .stButton>button:hover { background-color: #45a049; } .stTextInput>div>div>input { border-radius: 5px; } .stSelectbox>div>div>select { border-radius: 5px; } h1, h2, h3 { color: #2c3e50; } .stProgress > div > div > div > div { background-color: #4CAF50; } </style> </span><span>"""</span><span>,</span> <span>unsafe_allow_html</span><span>=</span><span>True</span><span>)</span><span># Add custom CSS </span><span>st</span><span>.</span><span>markdown</span><span>(</span><span>"""</span><span> <style> .reportview-container { background: #f0f2f6; } .main .block-container { padding-top: 2rem; padding-bottom: 2rem; padding-left: 5rem; padding-right: 5rem; } .stButton>button { background-color: #4CAF50; color: white; padding: 10px 20px; border: none; border-radius: 5px; cursor: pointer; transition: background-color 0.3s; } .stButton>button:hover { background-color: #45a049; } .stTextInput>div>div>input { border-radius: 5px; } .stSelectbox>div>div>select { border-radius: 5px; } h1, h2, h3 { color: #2c3e50; } .stProgress > div > div > div > div { background-color: #4CAF50; } </style> </span><span>"""</span><span>,</span> <span>unsafe_allow_html</span><span>=</span><span>True</span><span>)</span># Add custom CSS st.markdown(""" <style> .reportview-container { background: #f0f2f6; } .main .block-container { padding-top: 2rem; padding-bottom: 2rem; padding-left: 5rem; padding-right: 5rem; } .stButton>button { background-color: #4CAF50; color: white; padding: 10px 20px; border: none; border-radius: 5px; cursor: pointer; transition: background-color 0.3s; } .stButton>button:hover { background-color: #45a049; } .stTextInput>div>div>input { border-radius: 5px; } .stSelectbox>div>div>select { border-radius: 5px; } h1, h2, h3 { color: #2c3e50; } .stProgress > div > div > div > div { background-color: #4CAF50; } </style> """, unsafe_allow_html=True)
Enter fullscreen mode Exit fullscreen mode
The following part vectorizes the user’s search string and calculates its similarity with the vector data of the images. The higher the similarity, the closer the search string is to the image, so we retrieve the top 10 results based on similarity.
<span># Vectorize the search query and calculate similarity </span> <span>search_results</span> <span>=</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>f</span><span>"""</span><span> WITH search_vector AS ( SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, </span><span>'</span><span>{</span><span>escaped_query</span><span>}</span><span>'</span><span>) as embedding ) SELECT i.FILE_NAME, i.DESCRIPTION, VECTOR_COSINE_SIMILARITY(i.VECTOR, s.embedding) as similarity FROM IMAGE_METADATA i, search_vector s WHERE i.VECTOR IS NOT NULL ORDER BY similarity DESC LIMIT 10 </span><span>"""</span><span>).</span><span>collect</span><span>()</span><span># Vectorize the search query and calculate similarity </span> <span>search_results</span> <span>=</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>f</span><span>"""</span><span> WITH search_vector AS ( SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, </span><span>'</span><span>{</span><span>escaped_query</span><span>}</span><span>'</span><span>) as embedding ) SELECT i.FILE_NAME, i.DESCRIPTION, VECTOR_COSINE_SIMILARITY(i.VECTOR, s.embedding) as similarity FROM IMAGE_METADATA i, search_vector s WHERE i.VECTOR IS NOT NULL ORDER BY similarity DESC LIMIT 10 </span><span>"""</span><span>).</span><span>collect</span><span>()</span># Vectorize the search query and calculate similarity search_results = session.sql(f""" WITH search_vector AS ( SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024('voyage-multilingual-2', '{escaped_query}') as embedding ) SELECT i.FILE_NAME, i.DESCRIPTION, VECTOR_COSINE_SIMILARITY(i.VECTOR, s.embedding) as similarity FROM IMAGE_METADATA i, search_vector s WHERE i.VECTOR IS NOT NULL ORDER BY similarity DESC LIMIT 10 """).collect()
Enter fullscreen mode Exit fullscreen mode
The following section generates vector data from image captions. Note how this is achieved with a simple 3-line SQL query.
<span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Generate Vector Data</span><span>"</span><span>):</span><span>progress_bar</span> <span>=</span> <span>st</span><span>.</span><span>progress</span><span>(</span><span>0</span><span>)</span><span>for</span> <span>i</span><span>,</span> <span>(</span><span>_</span><span>,</span> <span>row</span><span>)</span> <span>in</span> <span>enumerate</span><span>(</span><span>images_without_vector</span><span>.</span><span>iterrows</span><span>()):</span><span>params_df</span> <span>=</span> <span>session</span><span>.</span><span>create_dataframe</span><span>([[</span><span>row</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>],</span> <span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]]],</span> <span>schema</span><span>=</span><span>[</span><span>"</span><span>description</span><span>"</span><span>,</span> <span>"</span><span>file_name</span><span>"</span><span>])</span><span>session</span><span>.</span><span>sql</span><span>(</span><span>"""</span><span> UPDATE IMAGE_METADATA SET VECTOR = SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, description) WHERE FILE_NAME = file_name AND VECTOR IS NULL </span><span>"""</span><span>).</span><span>join</span><span>(</span><span>params_df</span><span>).</span><span>collect</span><span>()</span><span>progress_bar</span><span>.</span><span>progress</span><span>((</span><span>i</span> <span>+</span> <span>1</span><span>)</span> <span>/</span> <span>len</span><span>(</span><span>images_without_vector</span><span>))</span><span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Vector data generated and saved for all target images!</span><span>"</span><span>)</span><span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span><span>if</span> <span>st</span><span>.</span><span>button</span><span>(</span><span>"</span><span>Generate Vector Data</span><span>"</span><span>):</span> <span>progress_bar</span> <span>=</span> <span>st</span><span>.</span><span>progress</span><span>(</span><span>0</span><span>)</span> <span>for</span> <span>i</span><span>,</span> <span>(</span><span>_</span><span>,</span> <span>row</span><span>)</span> <span>in</span> <span>enumerate</span><span>(</span><span>images_without_vector</span><span>.</span><span>iterrows</span><span>()):</span> <span>params_df</span> <span>=</span> <span>session</span><span>.</span><span>create_dataframe</span><span>([[</span><span>row</span><span>[</span><span>'</span><span>DESCRIPTION</span><span>'</span><span>],</span> <span>row</span><span>[</span><span>'</span><span>FILE_NAME</span><span>'</span><span>]]],</span> <span>schema</span><span>=</span><span>[</span><span>"</span><span>description</span><span>"</span><span>,</span> <span>"</span><span>file_name</span><span>"</span><span>])</span> <span>session</span><span>.</span><span>sql</span><span>(</span><span>"""</span><span> UPDATE IMAGE_METADATA SET VECTOR = SNOWFLAKE.CORTEX.EMBED_TEXT_1024(</span><span>'</span><span>voyage-multilingual-2</span><span>'</span><span>, description) WHERE FILE_NAME = file_name AND VECTOR IS NULL </span><span>"""</span><span>).</span><span>join</span><span>(</span><span>params_df</span><span>).</span><span>collect</span><span>()</span> <span>progress_bar</span><span>.</span><span>progress</span><span>((</span><span>i</span> <span>+</span> <span>1</span><span>)</span> <span>/</span> <span>len</span><span>(</span><span>images_without_vector</span><span>))</span> <span>st</span><span>.</span><span>success</span><span>(</span><span>"</span><span>Vector data generated and saved for all target images!</span><span>"</span><span>)</span> <span>st</span><span>.</span><span>cache_data</span><span>.</span><span>clear</span><span>()</span>if st.button("Generate Vector Data"): progress_bar = st.progress(0) for i, (_, row) in enumerate(images_without_vector.iterrows()): params_df = session.create_dataframe([[row['DESCRIPTION'], row['FILE_NAME']]], schema=["description", "file_name"]) session.sql(""" UPDATE IMAGE_METADATA SET VECTOR = SNOWFLAKE.CORTEX.EMBED_TEXT_1024('voyage-multilingual-2', description) WHERE FILE_NAME = file_name AND VECTOR IS NULL """).join(params_df).collect() progress_bar.progress((i + 1) / len(images_without_vector)) st.success("Vector data generated and saved for all target images!") st.cache_data.clear()
Enter fullscreen mode Exit fullscreen mode
Conclusion
With the foundation for utilizing images by generating captions laid in previous parts, this article has enabled searching images even with ambiguous keywords. Using Snowflake’s powerful vector search mechanism, it’s possible to retrieve search results instantly even with vast amounts of image data.
Some ideas for further developing this app include:
- Enabling similar image searches based on a specified image, not just user search strings
- Extending search capabilities to other unstructured data types like documents and music
I’m sure you have many ideas of your own! I’d be delighted if you could use this article as a reference to bring your ideas to life.
Promotion
Sharing Snowflake’s What’s New on X
I’m sharing updates on Snowflake’s What’s New on X. I’d be happy if you could follow:
English Version
Snowflake What’s New Bot (English Version)
Japanese Version
Snowflake’s What’s New Bot (Japanese Version)
Change History
(20240924) Initial post
Original Japanese Article
https://zenn.dev/tsubasa_tech/articles/64722ba45947a9
Image search with Streamlit in Snowflake (3 Part Series)
1 Image search with Streamlit in Snowflake (SiS) Part 1 – Creating an image gallery app –
2 Image search with Streamlit in Snowflake (SiS) Part 2 – Automatically generate image captions –
3 Image search with Streamlit in Snowflake (SiS) Part 3 – Add vector search function for images –
原文链接:Image search with Streamlit in Snowflake (SiS) Part 3 – Add vector search function for images –
暂无评论内容