Do you sometimes feel overwhelmed by too many research papers and struggle to find the right one? You’re not alone, my friend. There’s so much academic literature that it’s becoming hard to keep up with the latest discoveries in your field. But what if you could make this process easier and tailor it to your needs? What if you could build an AI Research Agent and make it your personal assistant that finds the most relevant papers for you?
You will build this agent using powerful libraries like Streamlit to create a user-friendly web app, OpenAI’s GPT-4o-mini for advanced language understanding, MultiOn to access Arxiv and retrieve the latest research data, and Mem0 to provide a personalized memory layer that learns from your preferences. With these tools, you’ll be able to navigate academic research like never before.
With this, you’ll spend less time searching and more time focusing on your work. In this guide, you’ll see that building your own AI research agent with memory, one that uses GPT-4o-mini and a vector database to surface research papers relevant to your interests, is easier than you think.
Even if you are just starting out, you can set up this powerful tool.
Here’s a step-by-step guide to get you started.
Project Environment Setup
When working in Visual Studio Code (VS Code), start by creating a new Python file for your project. It’s helpful to keep separate files for different parts of a project.
Create a new Python application. To do this, start by opening VS Code and creating a new folder:

Step 1: Open VS Code.
Step 2: Create a new folder for the project.
Step 3: Create a new file called app.py in the newly created folder.
Installing required Python Libraries
First, open your terminal and run the following command:
```bash
pip install streamlit openai multion mem0
```
Importing the necessary Libraries
In your Python script, import the following libraries:
```python
import streamlit as st
import os
from openai import OpenAI
from multion.client import MultiOn
from mem0 import Memory
```
- Streamlit: Used for building the web app.
- OpenAI: Utilized for GPT-4o-mini.
- MultiOn: Accesses Arxiv and retrieves data.
- Mem0: Provides a personalized memory layer.
Setting Up the Streamlit App
Configure the basic layout of your Streamlit app:
```python
st.title("AI Research Agent with Memory")

# Collect both API keys as masked text inputs
api_keys = {
    k: st.text_input(f"{k.capitalize()} API Key", type="password")
    for k in ['openai', 'multion']
}
```
Initializing services with API Keys
Set up the services by configuring Mem0 with Qdrant as the vector store, and initialize the MultiOn and OpenAI clients:
```python
# Only initialize the services once both API keys have been provided
if all(api_keys.values()):
    os.environ['OPENAI_API_KEY'] = api_keys['openai']

    # Mem0 configuration: Qdrant running locally as the vector store
    config = {
        "vector_store": {
            "provider": "qdrant",
            "config": {
                "model": "gpt-4o-mini",
                "host": "localhost",
                "port": 6333,
            },
        },
    }

    memory = Memory.from_config(config)
    multion = MultiOn(api_key=api_keys['multion'])
    openai_client = OpenAI(api_key=api_keys['openai'])
```
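The config above assumes a Qdrant instance is already running locally on port 6333. If you don’t have one yet, the quickest way to start one (an assumption on my part, using the standard Qdrant Docker image rather than anything specific to this project) is:

```bash
# Start a local Qdrant vector store on the default port 6333
docker run -p 6333:6333 qdrant/qdrant
```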
Creating user input and search query fields
Add a sidebar for user input and a search query field:
```python
user_id = st.sidebar.text_input("Enter your Username")
search_query = st.text_input("Research paper search query")
```
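One optional safeguard, not part of the original code: the search and memory calls below depend on user_id, so you can stop the script early while the username field is still empty:

```python
# Optional guard (my addition): require a username before any memory lookups
if not user_id:
    st.warning("Please enter a username in the sidebar to personalize your results.")
    st.stop()
```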
Defining a function to process search results with GPT-4o-mini
Create a function to process search results into a readable format:
```python
def process_with_gpt4(result):
    # Ask GPT-4o-mini to reshape the raw arXiv search result into readable markdown
    prompt = f"""Based on the following arXiv search result, provide a proper structured output in markdown that is readable by the users.
Each paper should have a title, authors, abstract, and link.
Search Result: {result}
Output Format: Table with the following columns: [{{"title": "Paper Title", "authors": "Author Names", "abstract": "Brief abstract", "link": "arXiv link"}}, ...]"""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
    )
    return response.choices[0].message.content
```
Implementing paper search functionality
Build the core functionality to search and display research papers:
```python
if st.button('Search for Papers'):
    with st.spinner('Searching and Processing...'):
        # Pull the user's stored preferences to personalize the search
        relevant_memories = memory.search(search_query, user_id=user_id, limit=3)
        prompt = f"Search for arXiv papers: {search_query}\nUser background: {' '.join(mem['text'] for mem in relevant_memories)}"
        # Let MultiOn browse arXiv, then format the raw result with GPT-4o-mini
        result = process_with_gpt4(multion.browse(cmd=prompt, url="https://arxiv.org/"))
        st.markdown(result)
```
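Note that the code above only reads from memory; nothing is ever written to it, so the personalization has nothing to learn from. A minimal sketch of closing that loop, assuming Mem0’s memory.add API, would be to store each query as a preference signal:

```python
# Sketch (assumption): record each query so future searches can use it as context.
# Place this inside the button handler, right after st.markdown(result).
memory.add(f"Searched arXiv for: {search_query}", user_id=user_id)
```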
Adding memory viewing feature
To view your stored memories:
```python
if st.sidebar.button("View Memory"):
    # List every stored memory for the current user as markdown bullets
    st.sidebar.write("\n".join([f"- {mem['text']}" for mem in memory.get_all(user_id=user_id)]))
```
Running the Application
To see your AI research agent in action, paste the above code into your IDE (VS Code or PyCharm) and run the following command:
```bash
streamlit run app.py
```
This will launch your Streamlit app, where you can search for research papers and manage your personalized memory layer.
And there you have it! You now have the power to tame the beast of academic literature and make research a whole lot easier. With your AI Research Agent by your side, you’ll be able to find the perfect papers, stay on top of the latest discoveries, and focus on what really matters – your work.
It’s like having your own personal research assistant, minus the coffee breaks. So, what are you waiting for? Dive in, start building, and discover a whole new world of stress-free research.
If you find this guide useful, please share it.
Original article: AI Research Agent with memory using GPT-4o-mini: Step-by-Step Guide.