Cannot read PDF from st.file_uploader with PyMuPDF

  • running locally
  • PyMuPDF version 1.24.4
  • Python version 3.11.2
  • Streamlit version 1.34

PyMuPDF opens the PDF, but returns empty text and prints this error message:
MuPDF error: format error: non-page object in page tree

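Full code:
import pymupdf
import streamlit as st

uploaded_file = st.file_uploader("pdf: ", type=["pdf"])

# The def line was not in the original post; the name extract_text is assumed.
def extract_text(uploaded_file):
    # this is not working!
    doc = pymupdf.open(stream=uploaded_file.read(), filetype="pdf")
    text = ""
    try:
        # Concatenate the text of every page.
        for page in doc:
            text += page.get_text()
        if not text:
            st.error("No text found, PyMuPDF did not work")
    finally:
        doc.close()
    return text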

It is similar to this similar issue, but the solution given there does not work for me.

I've searched thoroughly for this issue, but can't seem to resolve it.

PyPDF works!! But I'm switching to PyMuPDF for the features.

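This worked: call uploaded_file.seek(0, 0) right before the line

doc = pymupdf.open(stream=uploaded_file.read(), filetype="pdf")

The seek rewinds the upload to its first byte, so read() returns the whole file. Presumably the stream position had already been moved by an earlier read.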
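For anyone landing here later, a minimal sketch of the rewind fix. The helper names read_from_start and extract_text are made up for illustration, and an io.BytesIO stands in for the uploaded file (Streamlit's uploaded file object behaves like a BytesIO, as far as I know). The PyMuPDF import sits inside the function so the rewind part runs even without PyMuPDF installed.

```python
import io


def read_from_start(file_like):
    """Return the full contents of a file-like object, whatever its current position."""
    file_like.seek(0)  # rewind; an earlier read() may have moved the position
    return file_like.read()


def extract_text(file_like):
    """Extract text from a PDF held in a file-like object (needs PyMuPDF installed)."""
    import pymupdf  # imported lazily so the rewind helper works without PyMuPDF

    # Open from bytes read after rewinding, then join the text of every page.
    with pymupdf.open(stream=read_from_start(file_like), filetype="pdf") as doc:
        return "".join(page.get_text() for page in doc)


# Stand-in for the uploaded file (hypothetical placeholder bytes, not a real PDF).
fake_upload = io.BytesIO(b"%PDF-1.4 placeholder bytes")
fake_upload.read()                      # something consumed the stream earlier
leftover = fake_upload.read()           # a second read() returns nothing
rewound = read_from_start(fake_upload)  # after seek(0) the full content is back
```

In the Streamlit app this would be called as extract_text(uploaded_file) once uploaded_file is not None. If other code in the script reads the upload first, every later reader needs the same rewind.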

