Creating a PDF file generator

Hi, I do data analysis work, so I generate a lot of different plots. Now I want to create a tool that helps visualize the data and also exports the charts to a PDF file. Is it possible to build that kind of app with Streamlit?

Hi @Pavan_Cheyutha,

You sure can!
Although Streamlit doesn't support PDF generation out of the box, for obvious reasons, you can look into PyFPDF, which can do the job when coupled with Streamlit. Here is a snippet that I just tried, for reference.

import streamlit as st
from fpdf import FPDF
import base64

report_text = st.text_input("Report Text")


export_as_pdf = st.button("Export Report")

def create_download_link(val, filename):
    b64 = base64.b64encode(val)  # val looks like b'...'
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{filename}.pdf">Download file</a>'

if export_as_pdf:
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font('Arial', 'B', 16)
    pdf.cell(40, 10, report_text)
    
    html = create_download_link(pdf.output(dest="S").encode("latin-1"), "test")

    st.markdown(html, unsafe_allow_html=True)

Will give you this.
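
For anyone wondering how the link itself works: the PDF bytes are base64-encoded and embedded directly in the href as a data: URI, so nothing has to be written to disk on the server. Below is a minimal, standard-library-only sketch of that idea. The name build_download_link is hypothetical, and escaping the filename with html.escape is an addition of this sketch, not part of the snippet above.

```python
import base64
import html


def build_download_link(pdf_bytes: bytes, filename: str) -> str:
    """Return an HTML anchor that embeds pdf_bytes as a base64 data URI."""
    payload = base64.b64encode(pdf_bytes).decode("ascii")
    # Escape the filename so quotes or angle brackets cannot break the attribute.
    safe_name = html.escape(filename, quote=True)
    return (
        f'<a href="data:application/octet-stream;base64,{payload}" '
        f'download="{safe_name}.pdf">Download file</a>'
    )


link = build_download_link(b"%PDF-1.3 demo bytes", "test")
# Recover the payload to confirm the bytes survive the round trip.
encoded = link.split("base64,")[1].split('"')[0]
assert base64.b64decode(encoded) == b"%PDF-1.3 demo bytes"
```

In the app you would pass the returned string to st.markdown(..., unsafe_allow_html=True), as in the snippet above. Since the whole file travels inside the page, this is probably best suited to small reports.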


Wow, this has given me hope. Thank you very much. I am starting my work today, and I will let you know what challenges I run into. Thank you very much.

Hi, I wrote the code for data visualization. Now I want to export all the graphs into a PDF file, but I don't know how to proceed. Please help me out.

import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
from fpdf import FPDF
import base64

dataset = st.file_uploader("upload file here", type = ['csv'])
if dataset is not None:
    df = pd.read_csv(dataset)
    st.sidebar.write(' ### Rows and Columns:',df.shape)
       
# Graph
    cols = list(df.columns)
    st.sidebar.write('### Columns:')
    check_boxes = [st.sidebar.checkbox(col, key=col) for col in cols]
    col1 = [col for col, checked in zip(cols, check_boxes) if checked]
    
    
    for i in col1:
        st.write('### ', i)
        st.line_chart(df[i])

Hi @Pavan_Cheyutha,

You have two options here, I think:

  1. Use matplotlib to create the charts; it will be easier to save them as images.
  2. Use Altair to create the charts (this is what Streamlit uses internally); saving them as images needs additional dependencies (altair_saver and a Selenium backend with ChromeDriver).

Once you have the images written to a temporary file, use it with

pdf.add_image(temp_file.name)

And you should see the images in your pdf :slight_smile:

Hope it helps !

Hey, thank you very much for your reply. I am now using matplotlib to plot the graphs, but the result is not in image format; it shows up as a list type. I tried to convert it into an image using
from PIL import Image, but it is not working. How can I convert it to image format, and how do I save it to a temporary file?
img = plt.plot(df['sepal_length'])
st.pyplot()
pic = Image.open(img)
st.image(pic, caption="The caption", use_column_width=True)

Hey @Pavan_Cheyutha,

You can use matplotlib's savefig function, somewhat like this:

import tempfile

with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
    plt.savefig(tmpfile.name, format="png")

Then you can use tmpfile.name to write it into pdf.

Hope it helps !
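
One thing to keep in mind with delete=False: the temporary file stays on disk until you remove it yourself. Here is a rough sketch of the write, use, clean-up cycle using only the standard library. The names write_temp_image, use_and_remove and consume are made up for illustration, and the placeholder bytes stand in for what plt.savefig would write.

```python
import os
import tempfile


def write_temp_image(image_bytes: bytes, suffix: str = ".png") -> str:
    """Write image_bytes to a named temp file and return its path."""
    # delete=False keeps the file after the with-block closes it,
    # so another library can reopen it by name.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmpfile:
        tmpfile.write(image_bytes)
    return tmpfile.name


def use_and_remove(path: str, consume) -> None:
    """Call consume(path), then delete the file even if consume raises."""
    try:
        consume(path)
    finally:
        os.remove(path)


path = write_temp_image(b"placeholder image bytes")
use_and_remove(path, lambda p: print("would call pdf.image on", p))
```

In the real app, consume would be whatever reads the image by path, for example a small function that adds it to the PDF.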

I am getting an error saying that the pdf.add_image() function does not exist in the FPDF library.

Hey, I found the function: it is pdf.image(). It is working now, thank you. I will let you know if I face any problems again. Thank you very much.

I am able to download the PDF file with graphs, but the new problem is that all the graphs end up in a single graph. I want them as individual graphs.

import streamlit as st
import pandas as pd
import matplotlib.pyplot as plt
from fpdf import FPDF
import base64
from PIL import Image
import numpy as np
import cv2 as cv
import tempfile
st.set_option('deprecation.showPyplotGlobalUse', False)
dataset = st.file_uploader("upload file here", type = ['csv'])
if dataset is not None:
    df = pd.read_csv(dataset)
    st.sidebar.write(' ### Rows and Columns:',df.shape)
# Graph
cols = list(df.columns)
st.sidebar.write('### Columns:')
check_boxes = [st.sidebar.checkbox(col, key=col) for col in cols]
col1 = [col for col, checked in zip(cols, check_boxes) if checked]


for i in col1:
    st.write('### ', i)
    #st.line_chart(df[i])
    plt.plot(df[i])

# Creating a temporary file to store the chart as an image
for j in range(len(col1)):  
    with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
        plt.savefig(tmpfile.name, format="png")
        st.pyplot()
        
# Creating export and download link for pdf file.
export_as_pdf = st.button("Export Report")

def create_download_link(val, filename):
    b64 = base64.b64encode(val)  # val looks like b'...'
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{filename}.pdf">Download file</a>'


if export_as_pdf:
     pdf = FPDF()
     pdf.add_page()
     pdf.image(tmpfile.name,10,10,100)
     
     html = create_download_link(pdf.output(dest="S").encode("latin-1"), "test")

     st.markdown(html, unsafe_allow_html=True)

That's probably happening because you are plotting all of the graphs on one figure.
Try plotting them somewhat like this; it's just pseudocode.

axes = []
for i in col1:
    fig, ax = plt.subplots()
    ax.plot(data_goes_here)
    axes.append(ax)

if export_as_pdf:
   pdf = FPDF()
   # add page etc.
   for ax in axes:
       with NamedTemporaryFile(... ) as tmpfile:
              ax.savefig(tmpfile.name)
              pdf.image(tmpfile.name)
   html = create_download_link(pdf.output(dest="S"))
   st.markdown(html)

Hope it helps !

I am getting this error: AttributeError: 'AxesSubplot' object has no attribute 'savefig'.
And I think you missed the tempfile. prefix before NamedTemporaryFile(...).

Hi @Pavan_Cheyutha,

I think you could add a new page for each image; otherwise you will have to adjust the x, y, w, h of the figures accordingly.
The code that I added was pseudocode; that's why you got the errors.
Here is a working version.

import streamlit as st
import matplotlib.pyplot as plt
from fpdf import FPDF
import base64
import numpy as np
from tempfile import NamedTemporaryFile

from sklearn.datasets import load_iris

def create_download_link(val, filename):
    b64 = base64.b64encode(val)  # val looks like b'...'
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{filename}.pdf">Download file</a>'


df = load_iris(as_frame=True)["data"]


figs = []

for col in df.columns:
    fig, ax = plt.subplots()
    ax.plot(df[col])
    st.pyplot(fig)
    figs.append(fig)

export_as_pdf = st.button("Export Report")

if export_as_pdf:
    pdf = FPDF()
    for fig in figs:
        pdf.add_page()
        with NamedTemporaryFile(delete=False, suffix=".png") as tmpfile:
            fig.savefig(tmpfile.name)
            pdf.image(tmpfile.name, 10, 10, 200, 100)
    html = create_download_link(pdf.output(dest="S").encode("latin-1"), "testfile")
    st.markdown(html, unsafe_allow_html=True)

Produces this,

Hope it helps !
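
If you would rather put several charts on one page instead of giving each its own page, the x, y, w, h mentioned above have to be worked out per image. Here is a small sketch of that arithmetic, assuming FPDF's default A4 portrait page measured in millimetres (210 x 297). The function stack_positions and its margin and gap values are illustrative only; FPDF does not provide them.

```python
from typing import List, Tuple

A4_HEIGHT_MM = 297  # A4 portrait height in mm


def stack_positions(
    count: int,
    img_h: float,
    margin: float = 10,
    gap: float = 5,
    page_h: float = A4_HEIGHT_MM,
) -> List[Tuple[int, float]]:
    """Return (page_index, y) for each image, stacked top to bottom."""
    if img_h > page_h - 2 * margin:
        raise ValueError("image is taller than the printable area")
    positions = []
    page, y = 0, margin
    for _ in range(count):
        # Start a new page when the next image would cross the bottom margin.
        if y + img_h > page_h - margin:
            page, y = page + 1, margin
        positions.append((page, y))
        y += img_h + gap
    return positions


# Four charts, each 100 mm tall: two fit on each A4 page.
print(stack_positions(4, img_h=100))
```

In the export loop you would call pdf.add_page() whenever the page index changes and pass the computed y as the third argument of pdf.image.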


Hi @ash2shukla

I'm having a similar issue. I uploaded a PDF file and locked it with a password, and now I want to download it again, but I couldn't get it to work.

Kindly help me.

Here is my code:

https://pastebin.com/29umspGf

Thanks in advance

Hi @Sunil_Aleti, sorry for the very late reply; I have been a little busy with work lately.
I went through your code. I think you can create an empty placeholder component and update it once you have locked the file,
something like this:

download_link_ph = st.empty()
# code to lock the file
locked_file_content = locked_pdf.output(dest="S").encode("latin-1")
html = create_download_link(locked_file_content, "some_file_name")  # the helper appends ".pdf"
download_link_ph.markdown(html, unsafe_allow_html=True)

Hope it helps !
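
A note on the .encode("latin-1") step: with the original PyFPDF on Python 3, output(dest="S") returns a str in which each character stands for one byte, and Latin-1 maps code points 0 to 255 onto the same byte values. As far as I know, the newer fpdf2 fork returns a bytearray instead, so treat that part as an assumption and check your installed version. The helper to_pdf_bytes below is hypothetical and simply accepts either form.

```python
from typing import Union


def to_pdf_bytes(raw: Union[str, bytes, bytearray]) -> bytes:
    """Normalize PDF output to bytes, whichever form the library returned."""
    if isinstance(raw, str):
        # Latin-1 encodes code points 0-255 as the identical byte values.
        return raw.encode("latin-1")
    return bytes(raw)


# Every possible byte value survives the str -> bytes conversion.
all_bytes_as_str = "".join(chr(i) for i in range(256))
assert to_pdf_bytes(all_bytes_as_str) == bytes(range(256))
assert to_pdf_bytes(bytearray(b"%PDF")) == b"%PDF"
```

The result can then be handed to base64.b64encode in create_download_link without caring which library version produced it.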

Hi @ash2shukla, the PDF generator worked for me in the beginning and still works in my local environment. Unfortunately, the link stopped working in the live app after it went public, and I would like to know why.

    # save the pdf with name .pdf
    report = pdf.output(dest="S").encode("latin-1")

    b64 = base64.b64encode(report)  # val looks like b'...'
    my_bar.progress(100)
    return f'<a href="data:application/octet-stream;base64,{b64.decode()}" download="{code}.pdf">Download file</a>'

report = pg.generate_report(sharecode,time_period, detail, subject, options, finOptions)
download_link_ph = st.empty()
download_link_ph.markdown(report, unsafe_allow_html=True)

Hi @malcolmrite welcome to the community!

I can't really tell why the link stopped working without more information. Can you look at the network requests in the dev tools and let me know what the status code is? Is it Resource Not Available or Resource Forbidden? If it is the latter, then there might be something wrong with the proxy rules or origin config.

Hope it helps!

Hi @ash2shukla Thanks for the welcome and the feedback!

I got this from the console in the network section

Download is disallowed. The frame initiating or instantiating the download is sandboxed, but the flag 'allow-downloads' is not set. See Download in Sandboxed Iframes - Chrome Platform Status for more details

Looks like it's a browser feature that's disallowing this type of download. On closer inspection, I was able to download the PDF by right-clicking the link and choosing "Save link as".

Thanks again for the help. :slight_smile: