Summary
I built a simple app that uploads a PDF file, creates embeddings, connects to an LLM, and generates a response.
However, every time I ask a new question, the app repeats every step from the beginning. For large PDF files, creating the embeddings can take several minutes.
I am trying to build logic that creates the embeddings only once and passes the ‘embeddings engine’ to the next function to answer the question. I haven't found a solution yet.
Steps to reproduce
Code snippet:
# upload file
file = st.file_uploader("Upload your PDF", type="pdf")
if file is not None:
    if st.button('Generate engine: '):
        with st.spinner('Model generating the response engine...'):
            # ------------
            # LOADING THE FILE AND GENERATING THE EMBEDDINGS
            # ------------
            qa = RetrievalQA.from_chain_type(
                llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True
            )
    st.header('Ask your data')
    user_q = st.text_area('Enter your questions here: ')
    if st.button('Get response'):
        try:
            with st.spinner('Model is working on it...'):
                result = qa({"query": user_q})
                st.subheader('Response: ')
                st.write(result['result'])
                st.subheader('Source pages: ')
                st.write(result['source_documents']['metadata'])
        except Exception as e:
            st.error(f" An error: {e}")
Expected behavior:
I would like the app to reuse the ‘qa’ object without regenerating the embeddings.
Actual behavior:
The code snippet above raises an error: An error: local variable ‘qa’ referenced before assignment
So I guess the qa object is not being passed along. How can I solve that?
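A likely cause, as far as I understand Streamlit's execution model: the whole script reruns on every interaction, and st.button returns True only for the run triggered by that click. So when 'Get response' is clicked, the 'Generate engine' branch is skipped and qa is never assigned in that run. One possible direction is to keep the object in st.session_state, which persists across reruns. Below is a minimal sketch of that build-once pattern. The helper name get_or_build, the key "qa", and the stand-in build_engine are hypothetical, and a plain dict stands in for st.session_state so the snippet runs without Streamlit.

```python
from typing import Any, Callable, MutableMapping


def get_or_build(
    state: MutableMapping[str, Any], key: str, build: Callable[[], Any]
) -> Any:
    """Return state[key], calling build() only when the key is missing."""
    if key not in state:
        state[key] = build()
    return state[key]


# Hypothetical stand-in for the expensive step (loading the PDF,
# creating embeddings, building the RetrievalQA chain).
build_calls = []


def build_engine() -> str:
    build_calls.append(1)  # record each time the expensive step runs
    return "qa-engine"


# A plain dict plays the role of st.session_state here (assumption:
# in the app you would pass st.session_state instead).
session_state: dict = {}

# Simulate three script reruns: the engine is built on the first only.
for _ in range(3):
    qa = get_or_build(session_state, "qa", build_engine)

print(len(build_calls))  # prints 1
```

In the app itself, the 'Generate engine' branch would store the chain with get_or_build(st.session_state, "qa", ...), and the 'Get response' branch would read it back from st.session_state, showing a warning if the key is missing. If different PDFs can be uploaded in one session, including the file name in the key would avoid reusing a stale engine. Wrapping the build function in st.cache_resource may be an alternative worth trying.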