Question about RAG Langchain Tutorial: isn’t embedding repeated?

Hi everyone,

I’m rather new to Streamlit and I’m trying to understand this tutorial: https://blog.streamlit.io/langchain-tutorial-4-build-an-ask-the-doc-app/

Do I understand correctly that the generate_response method is run on every form submission?

Wouldn’t that mean the text is embedded and stored in a vector store for each new question the user enters? That seems very inefficient; isn’t the point of the vector store that I can store the embeddings once and avoid repeating a process that costs both time and (potentially) money every time?

I’m probably missing something, and I would appreciate it if anyone could help me on my learning journey here…

Thank you!
Chris

You’re absolutely right about the efficiency issue. The tutorial was written to introduce beginners to building a simple app with as little code as possible, so that they can then build on top of that basic app.

With the current tutorial code, the generate_response function is indeed called on every form submission. So for each new question a user asks, the entire document is re-ingested: it is split into chunks and every chunk is re-embedded and stored in Chroma, even if nothing has changed.
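To make that concrete, the core of generate_response in the tutorial looks roughly like this (reproduced from memory, so exact parameters may differ slightly); every single call re-runs the whole pipeline, including the embedding step:

```python
from langchain.llms import OpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

def generate_response(uploaded_file, openai_api_key, query_text):
    # Read the uploaded document.
    documents = [uploaded_file.read().decode()]
    # Split it into chunks -- happens again on every submission.
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.create_documents(documents)
    # Re-embed every chunk and rebuild the Chroma store -- again on every submission.
    embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
    db = Chroma.from_documents(texts, embeddings)
    # Build the retrieval QA chain and answer the question.
    qa = RetrievalQA.from_chain_type(
        llm=OpenAI(openai_api_key=openai_api_key),
        retriever=db.as_retriever(),
    )
    return qa.run(query_text)
```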

There are ways to mitigate this, though they introduce more code to handle the added complexity. The main idea is to decouple the document ingestion pipeline from the Q&A/retrieval process. For example, the app could assign an ID to each uploaded document; before each run it checks whether that document has been ingested before, and if so, it reuses the existing vector store for retrieval and passes only the retrieved chunks to the LLM. If the document is not present, the full ingestion pipeline is run end-to-end. This is just one of many ways to solve it, as sketched below.
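Here is a minimal sketch of that idea, assuming Streamlit’s st.cache_resource serves as the “has this document been ingested before?” check and a content hash serves as the document ID. The function name build_vector_store and the overall wiring are my own illustration, not taken from the tutorial:

```python
import hashlib
import streamlit as st
from langchain.llms import OpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

@st.cache_resource
def build_vector_store(doc_id: str, _text: str, _openai_api_key: str):
    # Cached on doc_id only (underscore-prefixed args are skipped by the hasher),
    # so a given document is split and embedded exactly once per process.
    splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    chunks = splitter.create_documents([_text])
    embeddings = OpenAIEmbeddings(openai_api_key=_openai_api_key)
    return Chroma.from_documents(chunks, embeddings)

uploaded_file = st.file_uploader("Upload a document", type="txt")
openai_api_key = st.text_input("OpenAI API key", type="password")

with st.form("qa_form"):
    query_text = st.text_input("Ask a question about the document")
    submitted = st.form_submit_button("Submit")

if submitted and uploaded_file and openai_api_key and query_text:
    text = uploaded_file.read().decode()
    # A content hash acts as the document ID: re-submitting a question
    # (or re-uploading the same file) reuses the cached vector store.
    doc_id = hashlib.sha256(text.encode()).hexdigest()
    db = build_vector_store(doc_id, text, openai_api_key)
    # Only the retrieval + LLM call happens per question.
    qa = RetrievalQA.from_chain_type(
        llm=OpenAI(openai_api_key=openai_api_key),
        retriever=db.as_retriever(),
    )
    st.write(qa.run(query_text))
```

If you want the embeddings to survive app restarts rather than just reruns, you could additionally persist the Chroma collection to disk (e.g. via its persist_directory option) and look it up by the same document ID.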