Hi, I’m using file_uploader in my app like this:
st.file_uploader(
    "Load files for processing",
    type="pdf",
    accept_multiple_files=True,
    key="uploaded_files",
    on_change=uploaded_files_changed,
)
The loaded files are then saved in session state under the uploaded_files key. After deploying the app to Streamlit Cloud and testing further, I came across an issue: after adding and removing files a few times, the list of uploaded files (st.session_state['uploaded_files']) can contain an instance of DeletedFile. I’m wondering how I can avoid this situation. I came up with a workaround where I check whether every item in the list is an instance of UploadedFile, but I’m not convinced this is the right approach.
Is it happening repeatedly on the deployed app? Could you also screen-record this behavior and share it with us so we can look into it? Thanks again for sharing this.
Unfortunately, I was not able to reproduce this issue myself. It originally happened to one of my clients while testing the demo app. I only noticed the errors in the logs produced by the deployed app.
The errors arose while processing the uploaded files (PDFs): the PDF parser could not load DeletedFile instances, as they have no file data.
Since I was passing the st.session_state['uploaded_files'] list to the PDF parser as a whole, I assumed the list must have contained instances of DeletedFile. This wouldn’t be an issue by itself, as I can always filter the list to get rid of unwanted DeletedFile instances, but I was mainly surprised because the API docs make no mention of DeletedFile instances.
Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 548, in _run_script
    self._session_state.on_script_will_rerun(rerun_data.widget_states)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 68, in on_script_will_rerun
    self._state.on_script_will_rerun(latest_widget_states)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 486, in on_script_will_rerun
    self._call_callbacks()
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 499, in _call_callbacks
    self._new_widget_state.call_callback(wid)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 251, in call_callback
    callback(*args, **kwargs)
  File "/mount/src/deployed-app/app.py", line 35, in uploaded_files_changed
    [assume_file_type(file) for file in st.session_state['uploaded_files']]
  File "/mount/src/deployed-app/app.py", line 35, in <listcomp>
    [assume_file_type(file) for file in st.session_state['uploaded_files']]
  File "/mount/src/deployed-app/helper_functions.py", line 32, in assume_file_type
    body = get_text_from_pdf(file)
  File "/mount/src/deployed-app/pdf_parser.py", line 10, in get_text_from_pdf
    reader = PdfReader(file)
  File "/home/adminuser/venv/lib/python3.9/site-packages/pypdf/_reader.py", line 332, in __init__
    self.read(stream)
  File "/home/adminuser/venv/lib/python3.9/site-packages/pypdf/_reader.py", line 1553, in read
    self._basic_validation(stream)
  File "/home/adminuser/venv/lib/python3.9/site-packages/pypdf/_reader.py", line 1592, in _basic_validation
    stream.seek(0, os.SEEK_SET)
AttributeError: 'DeletedFile' object has no attribute 'seek'
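For anyone hitting the same traceback, here is a minimal sketch of the filtering workaround mentioned above. It does not explain why DeletedFile ends up in the list. It only keeps entries that look like readable file objects, on the assumption (suggested by the AttributeError) that real uploads expose seek and read while DeletedFile does not. The helper name readable_uploads and the FakeDeletedFile class are made up for illustration and are not part of Streamlit.

```python
from io import BytesIO
from typing import Any, Iterable, List, Optional


def readable_uploads(items: Optional[Iterable[Any]]) -> List[Any]:
    """Keep only entries that behave like readable file objects.

    Hypothetical helper: it duck-types on seek/read instead of importing
    Streamlit's internal classes, so it does not depend on their module path.
    """
    return [
        item
        for item in (items or [])
        if callable(getattr(item, "seek", None))
        and callable(getattr(item, "read", None))
    ]


class FakeDeletedFile:
    """Hypothetical stand-in for a placeholder entry that carries no file data."""

    def __init__(self, file_id: str) -> None:
        self.file_id = file_id


# BytesIO objects stand in for real uploads in this self-contained demo.
sample = [BytesIO(b"%PDF-1.7"), FakeDeletedFile("abc"), BytesIO(b"%PDF-1.4")]
kept = readable_uploads(sample)
print(len(kept))
```

In the callback from the traceback, this would mean iterating over readable_uploads(st.session_state['uploaded_files']) rather than over the raw list before calling assume_file_type. An isinstance check against UploadedFile, as described in the first post, should work as well, but it ties the app to wherever that class lives in the installed Streamlit version.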