Plotly chart performance with datetime x-axis

Spilopmageren · September 10, 2024, 12:27pm

I noticed plotly charts were kind of slow at rendering even with small amounts of data and decided to dive into it. Specifically I found that charts with datetimes on the x-axis were slow.

I started to make a minimal reproducible example and, in doing so, found that the issue stopped after I saved my data to CSV files and read it back into pandas. What I then found was that if the x-axis column in my pandas dataframe was of the object/string type when generating the figure, the Streamlit rendering was faster, apparently without loss of functionality in the plots.

In my tests, rendering is more than 10x faster when using string-typed datetimes.

My question is whether this is expected, or whether I am doing something weird in the first place. Furthermore, is there any reason why I might not want to always do this when making Plotly charts? In any case, I wanted to share this potential performance increase.

I include a small example which shows both the total time and the time for rendering alone. Generating the figure can easily be cached, so some may primarily need the fast rendering.

I'm running my app locally with Streamlit version 1.38.0 and Python version 3.12.4.

small example code:

from time import time

import pandas as pd
import plotly.graph_objects as go
import streamlit as st


def st_fig_show(fig):
    st.plotly_chart(fig, use_container_width=True)

def gen_fig_sample(convert_to_string_dates):
    data1 = pd.DataFrame({'date': pd.date_range(start='2020-01-01', end='2024-06-01', freq='h'), 'value': 1})
    data2 = pd.DataFrame({'date': pd.date_range(start='2020-01-01', end='2024-06-01', freq='h'), 'value': 2})
    if convert_to_string_dates:
        data1.date = data1.date.astype(str)
        data2.date = data2.date.astype(str)
    fig = go.Figure()

    fig.add_trace(
        go.Scatter(
            x=data1.date, y=data1.value,
            mode='lines',
            name='data_value'
        )
    )
    fig.add_trace(
        go.Scatter(
            x=data2.date, y=data2.value,
            mode='lines',
            name='data_value2'
        )
    )
    return fig


ts = time()
fig_sample_string = gen_fig_sample(convert_to_string_dates=True)
st.write("generating figure using string-datetimes: %2.4f seconds" % (time() - ts))

ts1 = time()
st_fig_show(fig_sample_string)
st.write("rendering using string-datetimes: %2.4f seconds" % (time() - ts1))
st.write("total for generating and rendering string-datetimes: %2.4f seconds" % (time() - ts))

ts2 = time()
fig_sample_datetime = gen_fig_sample(convert_to_string_dates=False)
st.write("generating figure using datetimes: %2.4f seconds" % (time() - ts2))

ts3 = time()
st_fig_show(fig_sample_datetime)
st.write("rendering using datetimes: %2.4f seconds" % (time() - ts3))
st.write("total for generating and rendering datetimes: %2.4f seconds" % (time() - ts2))

edsaac · September 11, 2024, 2:46pm

That is interesting. I do not have an answer for the reason why that happens, but I was able to replicate your findings so I thought it would be good to report. Indeed st.plotly_chart takes 10x longer to render a figure with datetimes.

Code

import functools
import time

import streamlit as st
import pandas as pd
import plotly.express as px
from plotly.graph_objects import Figure


def timer(func):
    """Decorator to time a function execution.
    See: https://realpython.com/python-timer/#creating-a-python-timer-decorator
    """

    @functools.wraps(func)
    def wrapper_timer(*args, **kwargs):
        tic = time.perf_counter()
        value = func(*args, **kwargs)
        toc = time.perf_counter()
        elapsed_time = toc - tic
        st.write(f"Elapsed time: **{elapsed_time:0.4f} seconds**")
        return value

    return wrapper_timer


@st.cache_resource
def generate_dataframe(convert_to_string_dates: bool) -> pd.DataFrame:
    """Generate a dataframe with datetimes and values. The dataframe is cached
    to avoid re-generating it every time the app is run.
    """

    df = pd.DataFrame(
        {"date": pd.date_range(start="2015-01-01", end="2024-06-01", freq="h")}
    )

    df["value"] = df["date"].apply(lambda x: x.month)

    if convert_to_string_dates:
        df["date"] = df["date"].astype(str)

    return df


@st.cache_resource
def generate_figure(df: pd.DataFrame) -> Figure:
    """Generate a figure with a line chart using the dataframe. The figure is
    cached to avoid re-generating it every time the app is run.
    """
    fig = px.line(df, x="date", y="value")
    return fig


@timer
def st_fig_show(fig: Figure) -> None:
    """Render a plotly figure using streamlit. It will print the elapsed time
    of the function execution, which is only the time it takes Streamlit to
    render the figure.
    """
    st.plotly_chart(fig, use_container_width=True)


def main():
    cols = st.columns(2)

    with cols[0]:
        "## Render figure using datetimes"
        df_datetimes = generate_dataframe(convert_to_string_dates=False)
        dtypes_str = "\n"
        for label, dtype in df_datetimes.dtypes.items():
            dtypes_str += f"- {label}: `{dtype}`\n"
        f"**Data types** {dtypes_str}"
        f"Size of dataframe: {df_datetimes.memory_usage().sum() / 1024:.2f} KB"

        figure = generate_figure(df_datetimes)
        st_fig_show(figure)

    with cols[1]:
        "## Render figure using string-datetimes"
        df_strings = generate_dataframe(convert_to_string_dates=True)
        dtypes_str = "\n"
        for label, dtype in df_strings.dtypes.items():
            dtypes_str += f"- {label}: `{dtype}`\n"
        f"**Data types** {dtypes_str}"
        f"Size of dataframe: {df_strings.memory_usage().sum() / 1024:.2f} KB"

        figure = generate_figure(df_strings)
        st_fig_show(figure)

    if st.button("Rerun `st.plotly_chart`", use_container_width=True):
        st.rerun()


if __name__ == "__main__":
    main()

I would guess that Streamlit makes a copy of the data contained in the Plotly Figure, transforms the datetime data into strings (to make valid JSON), adds other custom bits to that JSON, and then passes that to Plotly.
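The serialization step in that guess can be illustrated without Streamlit or Plotly. The sketch below is a stand-in under stated assumptions, not Plotly's actual encoder: it uses the standard-library `json` module with a hypothetical `iso_default` hook that calls `isoformat()` on each `datetime`, and compares that against dumping strings that were formatted ahead of time and so need no hook call per element.

```python
import json
from datetime import datetime, timedelta
from time import perf_counter


def iso_default(obj):
    """Fallback hook for json.dumps: called once per object it cannot encode."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def time_dumps(values, **kwargs):
    """Return (elapsed seconds, JSON text) for a single json.dumps call."""
    tic = perf_counter()
    text = json.dumps(values, **kwargs)
    return perf_counter() - tic, text


# Hourly timestamps, similar in spirit to the examples in this thread.
start = datetime(2020, 1, 1)
as_datetimes = [start + timedelta(hours=i) for i in range(50_000)]
as_strings = [d.isoformat() for d in as_datetimes]

t_datetimes, json_datetimes = time_dumps(as_datetimes, default=iso_default)
t_strings, json_strings = time_dumps(as_strings)

print(f"datetimes via default hook: {t_datetimes:.4f} s")
print(f"pre-formatted strings:      {t_strings:.4f} s")
```

Both calls produce the same JSON text, so any difference in the printed timings comes from the per-element hook. How closely this mirrors what happens inside `st.plotly_chart` is an assumption.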

------- df_datetimes -------
         2978924 function calls (2813219 primitive calls) in 1.480 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.507    0.253    0.709    0.355 {method '__deepcopy__' of 'numpy.ndarray' objects}
 165549/0    0.213    0.000    0.000          copy.py:118(deepcopy)
    82539    0.127    0.000    0.440    0.000 utils.py:85(default)
    82537    0.079    0.000    0.079    0.000 {method 'isoformat' of 'datetime.datetime' objects}
   578731    0.070    0.000    0.070    0.000 {method 'get' of 'dict' objects}
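
A listing like the one above can be collected with the standard library's `cProfile` and `pstats` modules. This is a minimal sketch and not necessarily how the profile in this post was produced; `profile_call` and `busy_work` are hypothetical names used only for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=5, **kwargs):
    """Run func under cProfile and return (result, report text).

    The report is sorted by internal time (tottime) and limited to `top` rows.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(top)
    return result, buffer.getvalue()


def busy_work(n):
    """Hypothetical stand-in for the call being measured."""
    return sum(i * i for i in range(n))


result, report = profile_call(busy_work, 100_000)
print(report)
```

In a Streamlit app, the same helper could presumably wrap the rendering call, for example `profile_call(st.plotly_chart, fig, use_container_width=True)`, with the report shown through `st.text`.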

PS:

I tested converting the Plotly Figure to HTML and rendering it using streamlit.components.v1.html. The performance difference is still the same, so perhaps the issue is not within Streamlit but within Plotly.

Spilopmageren · September 12, 2024, 8:19am

Thanks for the reply and for taking the time to look into this. I couldn't quite figure out how to test whether Plotly alone was causing this, so thanks a lot.

While this may be caused by Plotly, I probably do not care as much about performance when working with Plotly in other contexts as I do when creating a dashboard, where the responsiveness can really be felt by the user. So even though this may not be (and probably is not) a bug, I hope that this post may help others optimize their dashboards.

One further note: I was working on my dashboard in PyCharm using the debugger, and there the issue becomes much worse (string-typed dates render more than 100x faster, compared to the 10x found here). I know this is not as widely interesting, but I wanted to mention it because it may trick others (as it did me) into thinking that their app is slower than it really is when deployed.
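
To avoid being misled by timings taken under a debugger, an app could flag when one appears to be attached. The sketch below is a best-effort heuristic and not an official PyCharm or Streamlit API; `debugger_probably_attached` is a hypothetical helper, and checking for the `pydevd` module assumes the debugger is the pydevd-based one that PyCharm ships.

```python
import sys


def debugger_probably_attached():
    """Best-effort heuristic: True if a tracer or pydevd is detected."""
    # Many debuggers (and coverage tools) install a trace function.
    if sys.gettrace() is not None:
        return True
    # PyCharm's debugger is built on pydevd; on newer Python versions it may
    # not rely on sys.settrace, so also check whether the module is loaded.
    return "pydevd" in sys.modules


if debugger_probably_attached():
    print("Debugger detected: timings are likely inflated.")
else:
    print("No debugger detected.")
```

A `True` result only suggests that measured render times should be re-checked in a normal run; it says nothing about how large the slowdown is.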

Balvalat · September 13, 2024, 5:50pm

You should probably open an issue here: Issues · plotly/plotly.py · GitHub

system · March 12, 2025, 5:50pm

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.

edsaac · April 25, 2025, 3:58pm

It seems this was fixed in Plotly and the performance is pretty much the same now with datetimes or strings. (Thanks to @Matthias3 for testing again).
