Creating a PDF file generator

Hi, I do data analysis work, so I generate a lot of different plots. Now I want to create a tool that helps visualize data and also export the charts into a PDF file. Is it possible to create that kind of app with Streamlit?

Hi @Pavan_Cheyutha,

You sure can!
Streamlit doesn't support PDF generation out of the box, but you can look into PyFPDF; coupled with Streamlit it can do the job. Adding a snippet that I just tried for reference.

import streamlit as st
from fpdf import FPDF
import base64

report_text = st.text_input("Report Text")


export_as_pdf = st.button("Export Report")

def create_download_link(val, filename):
    b64 = base64.b64encode(val)  # val looks like b'...'
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{filename}.pdf">Download file</a>'

if export_as_pdf:
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font('Arial', 'B', 16)
    pdf.cell(40, 10, report_text)
    
    html = create_download_link(pdf.output(dest="S").encode("latin-1"), "test")

    st.markdown(html, unsafe_allow_html=True)

Will give you this.
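
A slightly more defensive variant of the link helper above, as a minimal sketch: it uses the `application/pdf` MIME type and escapes the file name and label before placing them in the HTML. The name `create_pdf_download_link` and the `label` parameter are assumptions made for illustration; they are not part of Streamlit or PyFPDF.

```python
import base64
import html


def create_pdf_download_link(pdf_bytes: bytes, filename: str, label: str = "Download file") -> str:
    """Return an HTML anchor that embeds the PDF as a base64 data URI."""
    # Base64 output is plain ASCII, so it is safe inside the href attribute.
    b64 = base64.b64encode(pdf_bytes).decode("ascii")
    # Escape user-supplied text so quotes or angle brackets cannot break the tag.
    safe_name = html.escape(filename, quote=True)
    safe_label = html.escape(label)
    return (
        f'<a href="data:application/pdf;base64,{b64}" '
        f'download="{safe_name}.pdf">{safe_label}</a>'
    )


if __name__ == "__main__":
    link = create_pdf_download_link(b"%PDF-1.4 example bytes", "test")
    print(link[:60] + "...")
```

In the snippet above it would receive `pdf.output(dest="S").encode("latin-1")` as `pdf_bytes`, and the returned string would go to `st.markdown(..., unsafe_allow_html=True)` as before.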

Wow, this has given me hope. Thank you very much! I am starting my work today and will let you know what challenges I face.

Hi, I wrote the code for data visualization. Now I want to export all the graphs into a PDF file, but I don't know how to proceed. Please help me out.

import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
from fpdf import FPDF
import base64

dataset = st.file_uploader("upload file here", type = ['csv'])
if dataset is not None:
    df = pd.read_csv(dataset)
    st.sidebar.write(' ### Rows and Columns:',df.shape)
       
# Graph
    cols = list(df.columns)
    st.sidebar.write('### Columns:')
    check_boxes = [st.sidebar.checkbox(col, key=col) for col in cols]
    col1 = [col for col, checked in zip(cols, check_boxes) if checked]
    
    
    for i in col1:
        st.write('### ', i)
        st.line_chart(df[i])

Hi @Pavan_Cheyutha,

I think you have two options here:

  1. Use matplotlib to create the charts; it will be easier to save them as images.
  2. Use Altair to create the charts (this is what Streamlit uses internally); saving them as images needs additional dependencies (altair_saver and a Selenium backend with ChromeDriver).

Once you have the images written to a temporary file, use it with

pdf.add_image(temp_file.name)

And you should see the images in your pdf :slight_smile:

Hope it helps !

Hey, thank you very much for your reply. I am now using matplotlib to plot the graphs, but the result is not in image format; it shows up as a list type. I tried to convert it into an image using "from PIL import Image", but it is not working. How can I convert it to image format and save it in a temporary file?

img = plt.plot(df['sepal_length'])
st.pyplot()
pic = Image.open(img)
st.image(pic, caption="The caption", use_column_width=True)

Hey @Pavan_Cheyutha,

You can use the plot’s savefig function somewhat like this,

import tempfile

with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
    plt.savefig(tmpfile.name, format="png")

Then you can use tmpfile.name to write it into pdf.

Hope it helps !
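
Because `delete=False` leaves the file on disk, it may be worth removing it once the image has been written into the PDF. A minimal sketch, assuming a hypothetical helper `with_temp_png` that takes a `save` callable accepting a path (for example `fig.savefig`) and a `use` callable that consumes the file (for example a function that calls `pdf.image`). It also assumes the PDF library reads the image file at the moment `use` is called.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def with_temp_png(save: Callable[[str], None], use: Callable[[str], T]) -> T:
    """Write a PNG to a temporary path, hand the path to `use`, then delete it."""
    # Reserve a unique path and close the handle right away, so the file
    # can be reopened by name on platforms that lock open files (Windows).
    with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
        path = tmpfile.name
    try:
        save(path)        # e.g. fig.savefig(path)
        return use(path)  # e.g. pdf.image(path, 10, 10, 100)
    finally:
        # Clean up even if saving or consuming the file raised an error.
        os.remove(path)
```

With matplotlib and PyFPDF this might be called as `with_temp_png(fig.savefig, lambda p: pdf.image(p, 10, 10, 100))`.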

I am getting an error saying that the pdf.add_image() function does not exist in the FPDF library.

Hey, I found the function: it is pdf.image(). Now it is working, thank you. I will let you know if I face any problems again. Thank you very much.

I am able to download the PDF file with graphs, but the new problem is that all the graphs end up in one graph. I want each graph individually.

import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
from fpdf import FPDF
import base64
from PIL import Image
import numpy as np
import cv2 as cv
import tempfile
st.set_option('deprecation.showPyplotGlobalUse', False)
dataset = st.file_uploader("upload file here", type = ['csv'])
if dataset is not None:
    df = pd.read_csv(dataset)
    st.sidebar.write(' ### Rows and Columns:',df.shape)
# Graph
cols = list(df.columns)
st.sidebar.write('### Columns:')
check_boxes = [st.sidebar.checkbox(col, key=col) for col in cols]
col1 = [col for col, checked in zip(cols, check_boxes) if checked]


for i in col1:
    st.write('### ', i)
    #st.line_chart(df[i])
    plt.plot(df[i])

# Create a temporary file to store the chart as an image
for j in range(len(col1)):  
    with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
        plt.savefig(tmpfile.name, format="png")
        st.pyplot()
        
# Creating export and download link for pdf file.
export_as_pdf = st.button("Export Report")

def create_download_link(val, filename):
    b64 = base64.b64encode(val)  # val looks like b'...'
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{filename}.pdf">Download file</a>'


if export_as_pdf:
     pdf = FPDF()
     pdf.add_page()
     pdf.image(tmpfile.name,10,10,100)
     
     html = create_download_link(pdf.output(dest="S").encode("latin-1"), "test")

     st.markdown(html, unsafe_allow_html=True)

That's probably happening because you are plotting all of the graphs on one figure.
Try plotting it somewhat like this; it's just pseudocode.

axes = []
for i in col1:
    fig, ax = plt.subplots()
    ax.plot(data_goes_here)
    axes.append(ax)

if export_as_pdf:
   pdf = FPDF()
   # add page etc.
   for ax in axes:
       with NamedTemporaryFile(... ) as tmpfile:
              ax.savefig(tmpfile.name)
              pdf.image(tmpfile.name)
   html = create_download_link(pdf.output(dest="S"))
   st.markdown(html)

Hope it helps !

I am getting an error like this: AttributeError: 'AxesSubplot' object has no attribute 'savefig'.
And I think you missed the tempfile. prefix before NamedTemporaryFile(...).

Hi @Pavan_Cheyutha,

I think you could add a new page for each figure; otherwise you will have to adjust the x, y, w, h of the figures accordingly.
The code that I added was pseudocode, which is why you got the errors.
Attaching a working version now.

import streamlit as st
import matplotlib.pyplot as plt
from fpdf import FPDF
import base64
import numpy as np
from tempfile import NamedTemporaryFile

from sklearn.datasets import load_iris

def create_download_link(val, filename):
    b64 = base64.b64encode(val)  # val looks like b'...'
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{filename}.pdf">Download file</a>'


df = load_iris(as_frame=True)["data"]


figs = []

for col in df.columns:
    fig, ax = plt.subplots()
    ax.plot(df[col])
    st.pyplot(fig)
    figs.append(fig)

export_as_pdf = st.button("Export Report")

if export_as_pdf:
    pdf = FPDF()
    for fig in figs:
        pdf.add_page()
        with NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
            fig.savefig(tmpfile.name)
            pdf.image(tmpfile.name, 10, 10, 200, 100)
    html = create_download_link(pdf.output(dest="S").encode("latin-1"), "testfile")
    st.markdown(html, unsafe_allow_html=True)

Produces this,

Hope it helps !
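
If several figures should share one page instead, the x, y, w, h values mentioned above can be computed rather than hand-tuned. A minimal sketch, assuming FPDF's default A4 portrait page measured in millimetres (210 wide) and matplotlib's default 4:3 figure shape; `grid_positions` and its parameters are hypothetical names, not part of PyFPDF.

```python
from typing import List, Tuple


def grid_positions(
    n_images: int,
    cols: int = 2,
    page_w: float = 210.0,
    margin: float = 10.0,
    gap: float = 5.0,
    aspect: float = 0.75,
) -> List[Tuple[float, float, float, float]]:
    """Return (x, y, w, h) in mm for n_images laid out row by row."""
    # Width left for each image after removing margins and gaps between columns.
    cell_w = (page_w - 2 * margin - (cols - 1) * gap) / cols
    cell_h = cell_w * aspect
    boxes = []
    for i in range(n_images):
        row, col = divmod(i, cols)
        x = margin + col * (cell_w + gap)
        y = margin + row * (cell_h + gap)
        boxes.append((x, y, cell_w, cell_h))
    return boxes
```

Each tuple could then be passed along as `pdf.image(path, x, y, w, h)`. The sketch does not check whether the rows still fit on the page, so a real report would need to call `pdf.add_page()` once `y + h` exceeds the page height.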

Hi @ash2shukla

I'm having a similar issue. I upload a PDF file and lock it with a password, and then I want to download it again, but I couldn't get that to work.

Kindly help me

Here is my code:

https://pastebin.com/29umspGf

Thanks in advance

Hi @Sunil_Aleti, sorry for the very late reply; I have been a little busy with work lately.
I went through your code. I think you can create an empty placeholder component and update it once you have locked the file,
something like this:

download_link_ph = st.empty()
# code to lock the file
locked_file_content = locked_pdf.output(dest="S").encode("latin-1")
html = create_download_link(locked_file_content, "some_file_name")  # the helper appends ".pdf"
download_link_ph.markdown(html, unsafe_allow_html=True)

Hope it helps !

Hi @ash2shukla, the PDF generator worked for me in the beginning and still works in my local environment. Unfortunately, the link stopped working in the live app after it went public, and I would like to know why.

    # save the pdf with name .pdf
    report = pdf.output(dest="S").encode("latin-1")

    b64 = base64.b64encode(report)  # val looks like b'...'
    my_bar.progress(100)
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{code}.pdf">Download file</a>'

report = pg.generate_report(sharecode,time_period, detail, subject, options, finOptions)
download_link_ph = st.empty()
download_link_ph.markdown(report, unsafe_allow_html=True)

Hi @malcolmrite, welcome to the community!

I can't really tell why the link stopped working without more information. Can you look up the network requests in the dev tools and let me know what the status code is? Is it Resource Not Available or Resource Forbidden? If it is the latter, there might be something wrong with the proxy rules or origin config.

Hope it helps!

Hi @ash2shukla Thanks for the welcome and the feedback!

I got this from the console in the network section:

Download is disallowed. The frame initiating or instantiating the download is sandboxed, but the flag ‘allow-downloads’ is not set. See Chrome Platform Status for more details

Looks like it's a browser feature that is disallowing this type of download. On closer inspection, I was able to download the PDF by right-clicking the link and choosing "Save link as".

Thanks again for the help. :slight_smile:

Hello, please help me out. I would like to download a Streamlit table as a PDF. I really need this and I don't know how to do it.

@Loubna_Massaoudi I think you'd be best off saving the table as XLSX and then using Popen to run a (PowerShell) script that converts it to PDF (see this). Afterwards, read the PDF, base64-encode it, and show a download URL as described in this discussion thread. Presumably this approach will be limited to Windows with Excel installed, so it may not work for you.

I had a quick look at the PyFPDF package, and there are code template examples that might be useful too. PDF content is positioned at fixed page coordinates, a bit like drawing on a bitmap, so tables are quite complex to handle.
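
For a table, one way to use PyFPDF is to draw one cell per value. A minimal sketch of the layout part only, in plain Python: `column_widths` is a hypothetical helper that splits the usable page width in proportion to the longest text in each column. The default of 190 mm assumes an A4 page with 10 mm margins on both sides.

```python
from typing import List, Sequence


def column_widths(rows: Sequence[Sequence[object]], total_width: float = 190.0) -> List[float]:
    """Split total_width across columns in proportion to their longest cell text."""
    if not rows:
        return []
    n_cols = len(rows[0])
    # Longest rendered text per column; fall back to 1 so empty columns keep some width.
    longest = [max(len(str(row[c])) for row in rows) or 1 for c in range(n_cols)]
    scale = total_width / sum(longest)
    return [length * scale for length in longest]
```

Each row could then be written with one `pdf.cell(w, 8, str(value), border=1)` call per column, followed by `pdf.ln()`. For a pandas DataFrame the rows might come from `[list(df.columns)] + df.values.tolist()`. Character count is only a rough proxy for rendered width, so long values may still need truncating.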

The simplest workaround is to use the browser’s print-and-save-as-PDF option if the client computer has Acrobat or Foxit PDF reader installed.

I've seen that you duplicated your question elsewhere, but have you seen this post? It will help with saving as Excel and providing a download link to that file (not PDF, though).