Summary
I have a CSV that is processed into a vector store for use by an LLM chat model.
I use st.file_uploader to let the user upload the file, but 99% of the time they upload the same file. I’d like to default to the last uploaded file (or a default file) and save the user some time.
I have used @st.cache_data to cache the processing of the file:
import streamlit as st

@st.cache_data
def get_csv_text(csv_docs):
    # reads the CSV into a dataframe, does some stuff and returns a clean dataframe
    ...

@st.cache_data
def get_vectorstorecsv(_data):
    # reads the returned dataframe and turns it into a vector store
    ...
These do what they’re supposed to do, but they only trigger when the user uploads a file and processes it. I can’t figure out any way to restore the cached result of either function without having the user upload a file.
I tried something like this to attempt to retrieve the cached file:
csv_docs = None
get_csv_text(csv_docs)
This obviously doesn’t work because the argument isn’t the same, so the cached result isn’t found, but the only way to pass the same argument is to upload the same file…
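To illustrate why the None call misses, here is a toy sketch of argument-keyed caching. It uses a plain dictionary as a stand-in and is not Streamlit's actual implementation; the names `_memo`, `calls` and `cached_clean` are hypothetical.

```python
import hashlib

_memo = {}   # hypothetical stand-in for the cache: argument hash -> result
calls = []   # records every time the expensive body actually runs


def cached_clean(csv_bytes):
    """Toy memoized function whose cache key is derived from its argument."""
    key = hashlib.sha256(repr(csv_bytes).encode()).hexdigest()
    if key not in _memo:
        calls.append(key)                       # cache miss: do the work
        _memo[key] = f"cleaned:{csv_bytes!r}"   # placeholder for real processing
    return _memo[key]


first = cached_clean(b"a,b\n1,2\n")    # miss: body runs
second = cached_clean(b"a,b\n1,2\n")   # hit: same argument, same key
third = cached_clean(None)             # miss: None maps to a different key
```

The second call reuses the stored result, while the call with None runs the body again because nothing was ever stored under that key.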
I’m sure there is an easy way to store and restore data like this, but I just can’t find it. Any help would be appreciated.
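One direction that might work, as a rough sketch rather than a tested solution: keep a copy of the most recent upload on disk and fall back to it when nothing is uploaded. The path `last_upload.csv` and the helper `resolve_csv_bytes` are hypothetical names, and the sketch assumes get_csv_text can be changed to accept the raw bytes of the file.

```python
from pathlib import Path
from typing import Optional

DEFAULT_CSV = Path("last_upload.csv")  # hypothetical location for the saved copy


def resolve_csv_bytes(uploaded: Optional[bytes],
                      saved: Path = DEFAULT_CSV) -> Optional[bytes]:
    """Return the bytes to process: the new upload, else the last saved copy."""
    if uploaded is not None:
        saved.write_bytes(uploaded)   # remember this upload for next time
        return uploaded
    if saved.exists():
        return saved.read_bytes()     # no upload: reuse the previous file
    return None                       # nothing uploaded yet, nothing saved


# In the Streamlit app this might be wired up roughly as follows
# (assumes get_csv_text is adapted to take bytes instead of the upload object):
#   uploaded_file = st.file_uploader("Upload CSV", type="csv")
#   raw = resolve_csv_bytes(uploaded_file.getvalue() if uploaded_file else None)
#   if raw is not None:
#       df = get_csv_text(raw)
```

Because the same bytes are passed whether they come from a fresh upload or from the saved copy, a cached function keyed on its argument should see an identical input either way.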