Issue with Plotly Figure to PDF Conversion in Streamlit App

I have encountered an issue with the write_image function in my Streamlit app when trying to save a Plotly figure as a PDF. The app is intended to display an electricity market plot, and I’ve used Plotly to create the plot and Streamlit to provide an option to download it as a PDF. However, when I click the “Download PDF” button, the downloaded PDF file is empty, and the plot is not displayed in the app.

Details:

  • Running Environment: I am running the Streamlit app locally.
  • Code: Here is a snippet of the relevant code from my app:
import io

import plotly.graph_objects as go
import streamlit as st

market_fig = go.Figure()
# ... (Plotly figure setup)
buffer = io.BytesIO()
market_fig.write_image(file=buffer, format="pdf")
st.download_button(
    label="Download PDF",
    data=buffer,
    file_name="market_figure.pdf",
    mime="application/pdf",
)
st.plotly_chart(market_fig, use_container_width=True)
  • Troubleshooting Attempt: To further investigate the issue, I created a separate test script with a simplified Plotly figure and used the write_image function to save it as a PDF. The test script ran successfully and generated a PDF file with the plot, so it seems like the issue is specific to my Streamlit app.

Request for Assistance:

I am seeking help in identifying the root cause of this problem and finding a solution to ensure that the Plotly figure is correctly saved as a PDF and displayed in the Streamlit app. Any insights, suggestions, or guidance on how to resolve this issue would be greatly appreciated.

Additional Information:

  • Full Error Message: I have not received any error messages, but the downloaded PDF is empty.

Thank you in advance for your assistance!



This is the full function that I want to export to PDF:

# Define the new function for the electricity market plot
def electricity_market():
    st.title("Elektriciteit")

    # Load the dataset from Parquet file
    @st.cache_data
    def load_data(file_path):
        data = pd.read_parquet(file_path)
        return data
    
    parquet_file_path = r"C:\Users\.parquet"
    data = load_data(parquet_file_path)

    # Convert to datetime with explicit format
    date_format = "%d-%m-%y %H:%M"  # Adjust to match your data
    data['Tijd (CET)'] = pd.to_datetime(data['Tijd (CET)'], format=date_format)
    
    # Select relevant columns and drop NA values that might cause issues in the plot
    market_data = data[['Tijd (CET)', 'PrijsDynamisch', 'Prijzen vast_variabel']].dropna()

    # Resample the data, using the mean of each price column.
    # Currently set to daily ('D'); change the rule to use another frequency.
    monthly_data = market_data.resample('D', on='Tijd (CET)').mean()

    # Skip the periods where the 'PrijsDynamisch' value is zero
    monthly_data = monthly_data[monthly_data['PrijsDynamisch'] != 0]

    # Check if the resampled data has any rows
    if monthly_data.empty:
        st.error("No data available for plotting after resampling and filtering.")
        return
    
    # Create a new figure for the market plot
    market_fig = go.Figure()
    market_fig.add_trace(go.Scatter(
        x=monthly_data.index, y=monthly_data['PrijsDynamisch'],
        mode='lines+markers', name='PrijsDynamisch', marker=dict(size=1)))
    market_fig.add_trace(go.Scatter(
        x=monthly_data.index, y=monthly_data['Prijzen vast_variabel'],
        mode='lines+markers', name='Prijzen vast_variabel', marker=dict(size=1)))

    # Update the layout to format the x-axis with months
    market_fig.update_layout(
        xaxis=dict(
            tickformat='%b %Y',  # abbreviated month name with year
            type='date'  # Ensures that the x-axis is treated as a date
        ),
        title='Marktprijs versus vast/variable',
        xaxis_title='Maand',
        yaxis_title='Prijs (€)',
        legend=dict(
            orientation='h',
            xanchor='center',
            x=0.20,
            y=-0.3  # Adjust this value as needed to position the legend below the plot
        )
    )
    # Create an in-memory buffer
    buffer = io.BytesIO()

    # Save the figure as a PDF to the buffer
    #market_fig.write_image(file=buffer, format="pdf")

    # Download the PDF from the buffer
    st.download_button(
        label="Download PDF",
        data=buffer,
        file_name="market_figure.pdf",
        mime="application/pdf",
    )

    # Display the Plotly chart
    st.plotly_chart(market_fig, use_container_width=True)

Why is the write_image line commented out?


Would it make a difference to use a context manager for the BytesIO object?

# Create an in-memory buffer
with io.BytesIO() as buffer:

    # Save the figure as a PDF to the buffer
    market_fig.write_image(file=buffer, format="pdf")

    # Download the PDF from the buffer
    st.download_button(
        label="Download PDF",
        data=buffer,
        file_name="market_figure.pdf",
        mime="application/pdf",
    )

# Display the Plotly chart
st.plotly_chart(market_fig, use_container_width=True)

That one is the problem. Everything else works, but I commented it out because when I uncomment it the app loads forever. (So the download button etc. works, but the downloaded file is an empty PDF.) My guess right now is that this has to do with the data not being hardcoded (I tried some examples with hardcoded data points and they do work when converting to PDF), but I am not sure.

My bad, I assumed the empty PDF was just because of the empty buffer.
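For reference, here is a small sketch of why an untouched buffer downloads as an empty file, and why the read position matters once something has been written. It uses only the standard library; fake_pdf is a made-up placeholder standing in for whatever write_image would produce, not a real PDF.

```python
import io

# An untouched buffer holds zero bytes, so handing it to a download
# button would produce a 0-byte file.
empty_buffer = io.BytesIO()
empty_size = len(empty_buffer.getvalue())

# Placeholder for the bytes an image export would write (assumption:
# this is NOT a valid PDF, it only stands in for real output).
fake_pdf = b"%PDF-1.4 placeholder"
buffer = io.BytesIO()
buffer.write(fake_pdf)

# After writing, the position sits at the end, so read() returns nothing...
read_without_rewind = buffer.read()

# ...while getvalue() returns the full contents regardless of position.
all_bytes = buffer.getvalue()

# Rewinding first makes read() return everything as well.
buffer.seek(0)
read_after_rewind = buffer.read()
```

If in doubt, passing data=buffer.getvalue() to st.download_button hands over plain bytes, so the result does not depend on the buffer's position or on whether the buffer is still open.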


Plotly should use kaleido for those static image exports, but it could be that the package is not installed and Plotly is defaulting to the other engine, orca. Does passing engine='kaleido' to the write_image method raise an error saying you need to pip install kaleido?

market_fig.write_image(file=buffer, format="pdf", engine="kaleido")

Nope, no error, just the eternal loading of the page… Could this be a dependency problem then? I installed Kaleido both on my local machine and in the venv.
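One way to turn the endless loading into something visible might be to run the export call with a time limit. Below is a minimal sketch under that assumption; run_with_timeout is a hypothetical helper (not part of Streamlit or Plotly), and it only stops waiting: the stuck call keeps running in a background daemon thread, so this is a diagnostic aid rather than a fix.

```python
import threading


def run_with_timeout(func, timeout_s, *args, **kwargs):
    """Call func(*args, **kwargs) and give up waiting after timeout_s seconds."""
    outcome = {}

    def target():
        # Store either the return value or the exception for the caller.
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception as exc:
            outcome["error"] = exc

    # Daemon thread: a stuck call will not keep the interpreter alive on exit.
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        raise TimeoutError(f"call did not finish within {timeout_s} seconds")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


# Possible usage inside the app (assumes market_fig is the Plotly figure):
#   pdf_bytes = run_with_timeout(market_fig.to_image, 30, format="pdf")
```

If the call times out, the app could show st.error(...) instead of spinning, which at least confirms that the image export itself is what hangs.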

The following script, which I found on the community board, does work when I load the app locally. I tried copying its approach into my function, but that didn’t work out. The main difference is that the dataset is hardcoded in the example, but I am not sure whether that is what causes the eternal loading of the other function.


import streamlit as st
import pandas as pd
import plotly.graph_objects as go
import io

# Load the data
@st.cache_data
def load_data():
    return pd.DataFrame(
        {
            "Fruit": ["Apples", "Oranges", "Bananas", "Apples", "Oranges", "Bananas"],
            "Contestant": ["Alex", "Alex", "Alex", "Jordan", "Jordan", "Jordan"],
            "Number Eaten": [2, 1, 3, 1, 3, 2],
        }
    )

# Create and cache a Plotly figure
@st.cache_data
def create_figure(df):
    fig = go.Figure()
    for contestant, group in df.groupby("Contestant"):
        fig.add_trace(
            go.Bar(
                x=group["Fruit"],
                y=group["Number Eaten"],
                name=contestant,
                hovertemplate="Contestant=%s<br>Fruit=%%{x}<br>Number Eaten=%%{y}<extra></extra>"
                % contestant,
            )
        )
    fig.update_layout(legend_title_text="Contestant")
    fig.update_xaxes(title_text="Fruit")
    fig.update_yaxes(title_text="Number Eaten")
    return fig

df = load_data()
fig = create_figure(df)

# Create an in-memory buffer
buffer = io.BytesIO()

# Save the figure as a pdf to the buffer
fig.write_image(file=buffer, format="pdf")

# Download the pdf from the buffer
st.download_button(
    label="Download PDF",
    data=buffer,
    file_name="figure.pdf",
    mime="application/pdf",
)

st.plotly_chart(fig)

lol, that example isn’t working either rn…

Maybe? The small example works for me running:

  • python 3.11.7
  • streamlit 1.30.0
  • plotly 5.18.0
  • kaleido 0.2.1
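To compare against the versions listed above, a quick sketch like the following could be run inside the same venv that launches the app. It uses only the standard library; installed_versions is a hypothetical helper name, and the package list is just the three packages mentioned in this thread.

```python
import sys
from importlib import metadata


def installed_versions(packages=("streamlit", "plotly", "kaleido")):
    """Return {package: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in the interpreter that runs this script.
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```

Running it with the same interpreter that starts Streamlit matters here, since a package installed globally is not necessarily visible inside the venv.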

Updated Streamlit from 1.29 to 1.30 but still no progress.

I used the following:

fig.write_image("gas_market_plot.png")

import base64

import pdfkit

# HTML-to-PDF setup: point pdfkit at the local wkhtmltopdf executable
config = pdfkit.configuration(wkhtmltopdf=r'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe')

def generate_pdf(html_content):
    options = {'enable-local-file-access': True}
    # Use the 'config' variable defined at the top level of your script
    pdfkit.from_string(html_content, 'dashboard_report.pdf', options=options, configuration=config)

# function to create html content
def create_html_content():
    html_content = """
    <!doctype html>
    <html>
    <head>
        <title>Dashboard Report</title>
    </head>
    <body>
        <h1>Dashboard Report</h1>
        <h2>Electricity Market Plot</h2>
        <img src="electricity_market_plot.png" alt="Electricity Market Plot">
        <h2>Gas Market Plot</h2>
        <img src="gas_market_plot.png" alt="Gas Market Plot">
    </body>
    </html>
    """
    return html_content


# function to create a download link for the pdf
def create_download_link(filename):
    with open(filename, 'rb') as f:
        pdf_file = f.read()
    b64 = base64.b64encode(pdf_file).decode()
    href = f'<a href="data:application/octet-stream;base64,{b64}" download="{filename}">download report</a>'
    return href

# Example usage
html_content = create_html_content()
generate_pdf(html_content)

Everything is fine and a PDF with the HTML content is created, until I use:

fig.write_image("gas_market_plot.png")

very strange…