PDF Reader problems

Are there different types of PDF file?

I’ve created a simple Streamlit PDF reader app, using the file_uploader. This works some of the time & displays the PDF. I’m using pdf display code I found in the forum to do this. Slightly modified. The code I’m using is:

import streamlit as st
import base64

image = st.sidebar.file_uploader("Please browse for a pdf file")
st.sidebar.write("Only one file at a time!")


if image is not None:

    fn = image.name  

    
    if fn[-4:] ==".pdf" or fn[-4:] == ".PDF"  :
        
        with open(fn,"rb") as f:
            base64_pdf = base64.b64encode(f.read()).decode('utf-8')
        pdf_display = f'<embed src="data:application/pdf;base64,{base64_pdf}" width="800" height="1000" type="application/pdf"></embed>'
        # pdf_display = f'<iframe src="data:application/pdf;base64,{base64_pdf}" width="800" height="1000" type="application/pdf"></iframe>'
        st.markdown(pdf_display, unsafe_allow_html=True)
    
    else:
        "Invalid File type (must be .pdf)  "

    #  *****************************  Function SHOW_PDF to read PDFs
        # def show_pdf(file_path):
        #     with open(file_path,"rb") as f:
        #         base64_pdf = base64.b64encode(f.read()).decode('utf-8')
        #     pdf_display = f'<iframe src="data:application/pdf;base64,{base64_pdf}" width="800" height="800" type="application/pdf"></iframe>'
        #     st.markdown(pdf_display, unsafe_allow_html=True)
        
        #  *****************************

END OF CODE

With some files I get a File Not Found error & no display. This is the error message in Terminal:

    2023-01-13 16:51:08.507 Uncaught app exception
    Traceback (most recent call last):
      File "/Users/timkendal/Desktop/Python Projects/Streamlit_testing/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
        exec(code, module.__dict__)
      File "/Users/timkendal/Desktop/Python Projects/Streamlit_testing/pdf_reader.py", line 15, in <module>
        with open(fn,"rb") as f:
             ^^^^^^^^^^^^^
    FileNotFoundError: [Errno 2] No such file or directory: 'Hidcote Map.pdf'

It is always the same files that fail to display, and they are actually found, as the name appears under the file browser (in the Sidebar in my case). All the files I’m testing with are in the same folder. The code and the files are all local.

You will see that there are 2 lines in the code which are identical except that one uses ‘embed’ and the other ‘iframe’. I found a post that said one of these worked where the other did not. Neither works for me.

The salmon coloured error box in the Browser is this:

              FileNotFoundError: [Errno 2] No such file or directory: 'Worcs Beacon Path.pdf'
              
              Traceback:
              
              ```
              File "/Users/timkendal/Desktop/Python Projects/Streamlit_testing/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
                  exec(code, module.__dict__)
              File "/Users/timkendal/Desktop/Python Projects/Streamlit_testing/pdf_reader.py", line 15, in <module>
                  with open(fn,"rb") as f:
                       ^^^^^^^^^^^^^
              ```

In summary, why will the code above read and display only some PDF files?

Debug info

  • Streamlit version: 1.17.0 (updated to this today. Previous version had the same problem.)
  • Python version: 3.11.1
  • Using Conda? PipEnv? PyEnv? Pex? NO
  • OS version: MacOS Catalina 10.15.7
  • Browser version: Chrome Version 108.0.5359.124

Requirements file none

I hope someone can help here! It is frustrating!

Tim Kendal

Calling file_uploader() won’t create a file in the server’s filesystem. Instead it returns a file-like object from which you can read the bytes directly.

So do not try to open a file and then read it; just use image.read() as the argument to b64encode.
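
To illustrate the point, here is a minimal sketch that uses `io.BytesIO` as a stand-in for the object returned by `file_uploader`. That stand-in, the file name, and the bytes are assumptions made so the example runs without Streamlit. Only the name travels with the upload, so `open(name)` succeeds only if a file of that name happens to exist in the directory the app was started from.

```python
import base64
import io
import os
import tempfile

# Stand-in for the object returned by st.file_uploader (assumption:
# like the real upload, it holds the bytes in memory and has a .name).
fake_upload = io.BytesIO(b"%PDF-1.4 example bytes")
fake_upload.name = "Hidcote Map.pdf"  # hypothetical file name


def opens_from_disk(name: str) -> bool:
    """Return True if `name` can be opened relative to the current directory."""
    try:
        with open(name, "rb"):
            return True
    except FileNotFoundError:
        return False


# Run the check from an empty directory: the name alone does not locate the file.
original_dir = os.getcwd()
with tempfile.TemporaryDirectory() as empty_dir:
    os.chdir(empty_dir)
    try:
        found_on_disk = opens_from_disk(fake_upload.name)
    finally:
        os.chdir(original_dir)

# The bytes are still available directly from the upload object.
base64_pdf = base64.b64encode(fake_upload.read()).decode("utf-8")
print(found_on_disk, base64_pdf[:8])
```

This would be consistent with the same files always failing: those whose names also exist in the app's working directory open fine, and the rest raise FileNotFoundError.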

Thanks Goyo. I’m afraid your reply doesn’t help me much - I’m not an expert on these things, and I don’t know how to implement what you say.

If you could suggest exactly what my code should be, that would be great.

The point here is that the code I have works with some pdf files but not others. As I said in the question, why do some pdfs display as expected but others don’t?

All the pdfs I have tried display properly with Adobe Reader etc

I could upload 2 files (1 working & 1 not) if it would help but the uploader won’t let me send a pdf for some reason (file names except .jpg greyed out)

Thanks again for your reply

Do not try to open the file and use this instead:

base64_pdf = base64.b64encode(image.read()).decode('utf-8')
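
Put together, the original script with that one change might look like the sketch below. The helper names (`is_pdf_name`, `pdf_embed_html`, `main`) are made up for illustration; the Streamlit calls are the same ones used in the question.

```python
import base64


def is_pdf_name(filename: str) -> bool:
    """Case-insensitive check of the file extension."""
    return filename.lower().endswith(".pdf")


def pdf_embed_html(pdf_bytes: bytes, width: int = 800, height: int = 1000) -> str:
    """Build the same <embed> tag as in the question, from in-memory bytes."""
    base64_pdf = base64.b64encode(pdf_bytes).decode("utf-8")
    return (
        f'<embed src="data:application/pdf;base64,{base64_pdf}" '
        f'width="{width}" height="{height}" type="application/pdf"></embed>'
    )


def main() -> None:
    # Imported here so the helpers above can be used without Streamlit installed.
    import streamlit as st

    uploaded = st.sidebar.file_uploader("Please browse for a pdf file")
    st.sidebar.write("Only one file at a time!")

    if uploaded is not None:
        if is_pdf_name(uploaded.name):
            # Read the bytes from the upload itself; there is no open() call.
            st.markdown(pdf_embed_html(uploaded.read()), unsafe_allow_html=True)
        else:
            st.write("Invalid file type (must be .pdf)")


# To use this as the app script, add a final line that calls main().
```

Because nothing is opened from disk, this version should no longer raise FileNotFoundError, whichever folder the PDF comes from. It does not change how the browser renders the embedded data.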

Thanks Goyo

Not working yet, but I’ll keep trying!

As a new user of Streamlit, I think it’s disappointing that displaying a pdf file is so complicated!

It is not that complicated and you got that part right anyway. You seem to be struggling with file / data management in python instead.

Take a look at the example I just deployed.

https://display-pdf.streamlit.app/

Thanks Goyo

I’ve tried your code and of course it works - BUT only with some pdf files!! This was my original problem.

It’s always the same files that don’t display, and they all display with Adobe Reader, and that’s why I asked if all pdfs were the same

I attach 2 sample pdfs to this email (I can’t see how to upload files in the forum)

The one that works is Blenheim Palace Map.pdf. The other one, Worcs Beacon Path.pdf, does not display, though it is loaded and the name appears on-screen.

Thanks again for your trouble

(Attachment Blenheim Palace Map.pdf is missing)

(Attachment Worcs Beacon Path.pdf is missing)

Just had an auto response from Streamlit saying my 2 pdfs are not authorised so have been rejected. Presumably on security grounds.

Not sure how to get round this, or if you want to see them!

Github, Google Drive, OneDrive…

Google Drive:
https://drive.google.com/file/d/18gOWImu4O9VnrRwwraUav3IbS8ZdLHCw/view?usp=share_link

https://drive.google.com/file/d/18gIq3XAS8LFxfqvYiL3ny_KVQ7LBEryc/view?usp=share_link

I hope these work for you!

Nope. I get access denied. When sharing the file, grant General access with Viewer permissions to Anyone with the link.


Thanks Goyo - sorry to mess you about

I’ve revisited Google & got new links (may be the same as the old ones), with sharing now correct (I Hope!)

https://drive.google.com/file/d/18gIq3XAS8LFxfqvYiL3ny_KVQ7LBEryc/view?usp=share_link

https://drive.google.com/file/d/18gOWImu4O9VnrRwwraUav3IbS8ZdLHCw/view?usp=share_link

I hope this works this time!

Works for me using Gnome Web. Maybe it is a browser issue?

Having a very similar issue here. The PDF will actually not display at all in Chrome, but works just fine in Safari and Firefox

Thanks Jordan. I’ve only tried Chrome so far as a browser, but I’ll give Safari a go, and report back

Just tried Safari & it doesn’t work either - shame. I’ve also tried Opera - same result.

It’s perhaps worth mentioning again that it is only some pdfs that fail to display, not all. Indeed most work ok.
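
Since only some files fail, one way to narrow things down is to compare a working and a failing file on a few simple properties. The sketch below reports the version string from the PDF header (PDF files start with `%PDF-` followed by a version), the raw size, and the length of the data URI that the embed/iframe approach generates. The helper name and the report fields are made up for illustration, and whether any of these properties explains the failures is not established in this thread.

```python
import base64

DATA_URI_PREFIX = "data:application/pdf;base64,"


def describe_pdf(pdf_bytes: bytes) -> dict:
    """Collect a few properties of a PDF that can be compared across files."""
    header = pdf_bytes[:8].decode("latin-1")
    # A PDF header looks like "%PDF-1.7"; anything else is reported as None.
    version = header[5:8] if header.startswith("%PDF-") else None
    data_uri_chars = len(DATA_URI_PREFIX) + len(base64.b64encode(pdf_bytes))
    return {
        "pdf_version": version,
        "size_bytes": len(pdf_bytes),
        "data_uri_chars": data_uri_chars,
    }


# Example with made-up bytes; in the app, pass the bytes read from the upload.
sample = b"%PDF-1.7\n" + b"0" * 91
print(describe_pdf(sample))
```

Running this on Blenheim Palace Map.pdf and Worcs Beacon Path.pdf and comparing the two reports may show whether the failing files differ in version or size from the ones that display.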