File uploader returns instances of DeletedFile

Hi, I’m using file_uploader in my app like this:

st.file_uploader(
    "Load files for processing",
    type="pdf",
    accept_multiple_files=True,
    key="uploaded_files",
    on_change=uploaded_files_changed,
)

Loaded files are then saved in session state under the uploaded_files key. After deploying the app to Streamlit Cloud and testing further, I ran into an issue: after adding and removing files a few times, the list of uploaded files (st.session_state['uploaded_files']) can contain an instance of DeletedFile. I’m wondering how I can avoid this. I came up with a workaround where I check whether each item in the list is an instance of UploadedFile, but I’m not convinced this is the right approach:

isinstance(file, st.runtime.uploaded_file_manager.UploadedFile)
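For illustration, here is a minimal sketch of that filtering idea. It assumes UploadedFile is a subclass of io.BytesIO (true in the Streamlit versions I have looked at, but worth verifying for yours), so it checks against io.BytesIO rather than importing a private Streamlit module. FakeDeletedFile and keep_uploaded are hypothetical names used only for this example; they are not part of Streamlit.

```python
import io
from typing import Iterable, List, NamedTuple


class FakeDeletedFile(NamedTuple):
    """Hypothetical stand-in for the DeletedFile placeholder (not Streamlit's class)."""
    file_id: str


def keep_uploaded(files: Iterable[object]) -> List[io.BytesIO]:
    """Keep only file-like uploads and drop any placeholder objects.

    Assumption: real uploads are io.BytesIO subclasses.
    """
    return [f for f in files if isinstance(f, io.BytesIO)]


# Simulated session-state list: one real upload and one placeholder.
files = [io.BytesIO(b"%PDF-1.4"), FakeDeletedFile("abc")]
clean = keep_uploaded(files)
```

In the app, the same helper could be applied to st.session_state['uploaded_files'] at the start of the callback, before anything is passed to the parser.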

While testing the app locally, I didn’t encounter this issue, but that might just be chance.

Hi @soyrubio,

Thanks for posting!

Is it happening repeatedly on the deployed app? Could you also screen-record this behavior and share it with us so we can look into it? Thanks again for sharing this.

Unfortunately, I was not able to reproduce this issue myself. It originally happened to one of my clients while testing the demo app; I only noticed the errors in the logs produced by the deployed app.

The errors arose while processing the uploaded files (PDFs): the PDF parser could not load DeletedFile instances, since they hold no file data and are not file-like objects.

Since I was passing the st.session_state['uploaded_files'] list to the PDF parser as a whole, I assumed the list must have contained instances of DeletedFile. This wouldn’t be an issue by itself, as I can always filter the list to get rid of unwanted DeletedFile instances, but I was mainly surprised because DeletedFile is not mentioned anywhere in the API docs.

I can, however, provide the error logs:

Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 548, in _run_script
    self._session_state.on_script_will_rerun(rerun_data.widget_states)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/safe_session_state.py", line 68, in on_script_will_rerun
    self._state.on_script_will_rerun(latest_widget_states)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 486, in on_script_will_rerun
    self._call_callbacks()
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 499, in _call_callbacks
    self._new_widget_state.call_callback(wid)
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/state/session_state.py", line 251, in call_callback
    callback(*args, **kwargs)
  File "/mount/src/deployed-app/app.py", line 35, in uploaded_files_changed
    [assume_file_type(file) for file in st.session_state['uploaded_files']]
  File "/mount/src/deployed-app/app.py", line 35, in <listcomp>
    [assume_file_type(file) for file in st.session_state['uploaded_files']]
  File "/mount/src/deployed-app/helper_functions.py", line 32, in assume_file_type
    body = get_text_from_pdf(file)
  File "/mount/src/deployed-app/pdf_parser.py", line 10, in get_text_from_pdf
    reader = PdfReader(file)
  File "/home/adminuser/venv/lib/python3.9/site-packages/pypdf/_reader.py", line 332, in __init__
    self.read(stream)
  File "/home/adminuser/venv/lib/python3.9/site-packages/pypdf/_reader.py", line 1553, in read
    self._basic_validation(stream)
  File "/home/adminuser/venv/lib/python3.9/site-packages/pypdf/_reader.py", line 1592, in _basic_validation
    stream.seek(0, os.SEEK_SET)
AttributeError: 'DeletedFile' object has no attribute 'seek'
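
Since the traceback fails at stream.seek, one defensive option is to test for a callable seek before handing each item to the parser, and to keep track of what was skipped so it can be logged. This is only a sketch under that assumption: process_readable and the handler argument are hypothetical names, and the handler stands in for something like get_text_from_pdf.

```python
import io
from typing import Callable, Iterable, List, Tuple


def process_readable(
    files: Iterable[object],
    handler: Callable[[object], object],
) -> Tuple[List[object], List[object]]:
    """Run handler on seekable items only; return (results, skipped items)."""
    results: List[object] = []
    skipped: List[object] = []
    for f in files:
        # Duck-typing check: a parser that calls stream.seek() needs this method.
        if callable(getattr(f, "seek", None)):
            results.append(handler(f))
        else:
            skipped.append(f)
    return results, skipped


# Example: one in-memory file and one object that is not file-like.
placeholder = object()
results, skipped = process_readable(
    [io.BytesIO(b"abc"), placeholder],
    handler=lambda f: f.read(),
)
```

Returning the skipped items separately makes it possible to log how often placeholders show up, which might help with reproducing the problem.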

Thank you for the logs and detailed explanation of the issue. I will get back to you soon.
