Streamlit App - Converting an Uploaded PDF to Separate Images for Downloading

Summary

Create a Streamlit app that lets the user upload a PDF file and then download the PDF pages as separate PNG files.

Expected behavior:

Upload the PDF file, hit Confirm, the PDF pages appear as images within the app, and a button to download each page appears above each image.

Actual behavior:

Upload the PDF file, hit Confirm, the first PDF page appears as an image, and then this error is raised:
RuntimeError: Invalid binary data format: <class 'PIL.PpmImagePlugin.PpmImageFile'>

I’ve uploaded my code into the repository here.

The code is also included below. I managed to download the ‘image’ once, but it doesn’t open as an image; it comes through as an invalid format.

import streamlit as st
import pdf2image
import zipfile
import os

pdf_uploaded = st.file_uploader("Select a file", type="pdf")
button = st.button("Confirm")
image_down = []
if button and pdf_uploaded is not None:

    if pdf_uploaded.type == "application/pdf":
        images = pdf2image.convert_from_bytes(pdf_uploaded.read(), poppler_path="poppler/library/bin")
        for i, page in enumerate(images):
            st.image(page, use_column_width=True)
            st.download_button("Download", data=page, file_name=f"Image_{i}.png")
            image_down.append(page)

Hi @Thomas_Ellyatt !

Quite interesting use case. After some debugging I see that the issue is caused by the line:

st.download_button("Download", data=page, file_name=f"Image_{i}.png")

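The Streamlit download_button expects data to be a str, bytes, or a file-like object. Here it receives a PIL image object (page) instead, which is what triggers the RuntimeError.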

A solution for that issue can be found on the forum here, or in the above-mentioned Stack Overflow link: How to download image? - #10 by Javier_Jaramillo
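
In case it helps, here is a minimal sketch of that conversion as a standalone helper. The function name page_to_png_bytes is just something I made up for illustration; it assumes the page object is a Pillow image (which is what pdf2image returns) and relies only on its save method.

```python
from io import BytesIO


def page_to_png_bytes(page):
    """Encode a PIL image as PNG and return the raw bytes.

    `page` is assumed to be a Pillow image, e.g. one element of the list
    returned by pdf2image.convert_from_bytes.
    """
    buf = BytesIO()
    # Write the encoded PNG into the in-memory buffer instead of a file.
    page.save(buf, format="PNG")
    return buf.getvalue()


# Intended use inside the Streamlit loop (not executed here):
#   st.download_button("Download", data=page_to_png_bytes(page),
#                      file_name=f"Image_{i}.png", mime="image/png")
```

The returned bytes object is one of the types download_button accepts, and saving as PNG keeps the content consistent with the .png file name.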

And it works!

https://tomjohnh-streamlit-pdf2image-main-xilovz.streamlit.app/

My github fork: here.

import streamlit as st
import pdf2image
import zipfile
import os
from io import BytesIO

# https://discuss.streamlit.io/t/how-to-download-image/3358/10


pdf_uploaded = st.file_uploader("Select a file", type="pdf")
button = st.button("Confirm")
image_down = []
st.write("test1")
if button and pdf_uploaded is not None:
    st.write("test2")
    if pdf_uploaded.type == "application/pdf":
        st.write("test3")
        images = pdf2image.convert_from_bytes(pdf_uploaded.read())
        for i, page in enumerate(images):
            st.write(i)
            st.write(page)
            st.image(page, use_column_width=True)
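            # download_button needs str, bytes, or a file-like object,
            # so encode the PIL image into an in-memory PNG first.
            buf = BytesIO()
            page.save(buf, format="PNG")
            byte_im = buf.getvalue()
            st.download_button("Download", data=byte_im, file_name=f"Image_{i}.png", mime="image/png")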
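
One more thought: the original script imports zipfile and collects pages in image_down without using either. I am only guessing that the plan was a single "download all" option; if so, a rough sketch could look like the following. The helper name pages_to_zip_bytes and the prefix parameter are hypothetical, and the pages are again assumed to be Pillow images.

```python
import zipfile
from io import BytesIO


def pages_to_zip_bytes(pages, prefix="Image"):
    """Pack each page as a PNG into one in-memory ZIP archive.

    `pages` is assumed to be a list of Pillow images, such as the
    image_down list from the original script.
    """
    archive = BytesIO()
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for i, page in enumerate(pages):
            buf = BytesIO()
            page.save(buf, format="PNG")
            # One archive entry per page, named like the single downloads.
            zf.writestr(f"{prefix}_{i}.png", buf.getvalue())
    return archive.getvalue()


# Intended use after the loop (not executed here):
#   st.download_button("Download all", data=pages_to_zip_bytes(image_down),
#                      file_name="pages.zip", mime="application/zip")
```

Everything stays in memory, so nothing has to be written to disk on the server.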

Hey Tom,

Thanks so much for this! I’ve tweaked it a little and it works like a charm :).

https://thomasellyatt-pdf-to-png-converter-main-u4v3jk.streamlit.app/
