Hello @Soumyadip_Sarkar, I think you were missing the read() call, which returns the uploaded file's contents as bytes that PyMuPDF can then consume.
For future reference, the following works:
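import fitz  # PyMuPDF
import streamlit as st

uploaded_pdf = st.file_uploader("Load pdf: ", type=['pdf'])
if uploaded_pdf is not None:
    # read() returns the raw bytes; filetype tells PyMuPDF how to parse them
    with fitz.open(stream=uploaded_pdf.read(), filetype="pdf") as doc:
        text = ""
        for page in doc:
            # getText() is called get_text() in newer PyMuPDF releases
            text += page.getText()
        st.write(text)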
I'm not sure the fitz.open() context manager always closes the file: I got an AttributeError: 'Document' object has no attribute 'isClosed' error with it, so here is a version that closes the document manually instead:
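import fitz  # PyMuPDF
import streamlit as st

uploaded_pdf = st.file_uploader("Load pdf: ", type=['pdf'])
if uploaded_pdf is not None:
    doc = fitz.open(stream=uploaded_pdf.read(), filetype="pdf")
    text = ""
    for page in doc:
        # getText() is called get_text() in newer PyMuPDF releases
        text += page.getText()
    st.write(text)
    # Close explicitly instead of relying on the context manager
    doc.close()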
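If you want the close to happen even when a page fails to extract, one option is to wrap the document in contextlib.closing from the standard library, which only needs a close() method. The sketch below is just that, a sketch: page_text and extract_text are helper names I made up for illustration, and I have only assumed (not verified on every PyMuPDF version) that looking up get_text first and falling back to getText covers both old and new releases.

```python
from contextlib import closing


def page_text(page):
    """Return a page's text under either PyMuPDF method name (hypothetical helper)."""
    # Newer PyMuPDF releases name the method get_text(); older ones use getText().
    extract = getattr(page, "get_text", None) or getattr(page, "getText")
    return extract()


def extract_text(doc):
    """Concatenate the text of every page, then close the document (hypothetical helper)."""
    # closing() calls doc.close() on exit, even if a page raises.
    with closing(doc):
        return "".join(page_text(page) for page in doc)


# Intended use in the Streamlit app (not executed here):
# doc = fitz.open(stream=uploaded_pdf.read(), filetype="pdf")
# st.write(extract_text(doc))
```

This keeps the explicit close() from the second snippet but moves it out of the happy path, so an exception in the loop does not leave the document open.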