I noticed that Plotly charts were slow to render even with small amounts of data, so I decided to dig into it. Specifically, I found that charts with datetimes on the x-axis were slow.
I started to build a minimal reproducible example, and in doing so I found that the issue went away after saving my data to CSV files and reading it back into pandas. I then found that if the x-axis column in my pandas dataframe was of object/string type when the figure was generated, the Streamlit rendering was faster, apparently without any loss of functionality in the plots.
I have found that rendering is more than 10x faster when using string-typed datetimes.
My question is whether this is expected, or whether I am doing something weird in the first place. Furthermore, is there any reason why I might not want to always do this when making Plotly charts? In any case, I wanted to share this potential performance increase.
I include a small example which shows both the total time and the time for rendering alone. Generating the figure can easily be cached, so some may primarily care about fast rendering.
I’m running my app locally with Streamlit version 1.38.0 and Python version 3.12.4.
Small example code:
from time import time

import pandas as pd
import plotly.graph_objects as go
import streamlit as st


def st_fig_show(fig):
    st.plotly_chart(fig, use_container_width=True)


def gen_fig_sample(convert_to_string_dates):
    data1 = pd.DataFrame({'date': pd.date_range(start='2020-01-01', end='2024-06-01', freq='h'), 'value': 1})
    data2 = pd.DataFrame({'date': pd.date_range(start='2020-01-01', end='2024-06-01', freq='h'), 'value': 2})

    if convert_to_string_dates:
        data1.date = data1.date.astype(str)
        data2.date = data2.date.astype(str)

    fig = go.Figure()
    fig.add_trace(
        go.Scatter(
            x=data1.date, y=data1.value,
            mode='lines',
            name='data_value'
        )
    )
    fig.add_trace(
        go.Scatter(
            x=data2.date, y=data2.value,
            mode='lines',
            name='data_value2'
        )
    )
    return fig


ts = time()
fig_sample_string = gen_fig_sample(convert_to_string_dates=True)
st.write("generating figure using string-datetimes: %2.4f seconds" % (time() - ts))
ts1 = time()
st_fig_show(fig_sample_string)
st.write("rendering using string-datetimes: %2.4f seconds" % (time() - ts1))
st.write("total for generating and rendering string-datetimes: %2.4f seconds" % (time() - ts))

ts2 = time()
fig_sample_datetime = gen_fig_sample(convert_to_string_dates=False)
st.write("generating figure using datetimes: %2.4f seconds" % (time() - ts2))
ts3 = time()
st_fig_show(fig_sample_datetime)
st.write("rendering using datetimes: %2.4f seconds" % (time() - ts3))
st.write("total for generating and rendering datetimes: %2.4f seconds" % (time() - ts2))
That is interesting. I do not have an answer for why that happens, but I was able to replicate your findings, so I thought it would be good to report. Indeed, st.plotly_chart takes 10x longer to render a figure with datetimes.
import functools
import time

import streamlit as st
import pandas as pd
import plotly.express as px
from plotly.graph_objects import Figure


def timer(func):
    """Decorator to time a function execution.
    See: https://realpython.com/python-timer/#creating-a-python-timer-decorator
    """

    @functools.wraps(func)
    def wrapper_timer(*args, **kwargs):
        tic = time.perf_counter()
        value = func(*args, **kwargs)
        toc = time.perf_counter()
        elapsed_time = toc - tic
        st.write(f"Elapsed time: **{elapsed_time:0.4f} seconds**")
        return value

    return wrapper_timer


@st.cache_resource
def generate_dataframe(convert_to_string_dates: bool) -> pd.DataFrame:
    """Generate a dataframe with datetimes and values. The dataframe is cached
    to avoid re-generating it every time the app is run.
    """
    df = pd.DataFrame(
        {"date": pd.date_range(start="2015-01-01", end="2024-06-01", freq="h")}
    )
    df["value"] = df["date"].apply(lambda x: x.month)

    if convert_to_string_dates:
        df["date"] = df["date"].astype(str)

    return df


@st.cache_resource
def generate_figure(df: pd.DataFrame) -> Figure:
    """Generate a figure with a line chart using the dataframe. The figure is
    cached to avoid re-generating it every time the app is run.
    """
    fig = px.line(df, x="date", y="value")
    return fig


@timer
def st_fig_show(fig: Figure) -> None:
    """Render a plotly figure using streamlit. It will print the elapsed time
    of the function execution, which is only the time it takes Streamlit to
    render the figure.
    """
    st.plotly_chart(fig, use_container_width=True)


def main():
    cols = st.columns(2)

    with cols[0]:
        "## Render figure using datetimes"
        df_datetimes = generate_dataframe(convert_to_string_dates=False)

        dtypes_str = "\n"
        for label, dtype in df_datetimes.dtypes.items():
            dtypes_str += f"- {label}: `{dtype}`\n"

        f"**Data types** {dtypes_str}"
        f"Size of dataframe: {df_datetimes.memory_usage().sum() / 1024:.2f} KB"

        figure = generate_figure(df_datetimes)
        st_fig_show(figure)

    with cols[1]:
        "## Render figure using string-datetimes"
        df_strings = generate_dataframe(convert_to_string_dates=True)

        dtypes_str = "\n"
        for label, dtype in df_strings.dtypes.items():
            dtypes_str += f"- {label}: `{dtype}`\n"

        f"**Data types** {dtypes_str}"
        f"Size of dataframe: {df_strings.memory_usage().sum() / 1024:.2f} KB"

        figure = generate_figure(df_strings)
        st_fig_show(figure)

    if st.button("Rerun `st.plotly_chart`", use_container_width=True):
        st.rerun()


if __name__ == "__main__":
    main()
I would guess that Streamlit makes a copy of the data contained in the Plotly Figure, transforms the datetime data into strings (to make valid JSON), adds other custom bits to that JSON, and then passes it to Plotly.
I tested converting the Plotly Figure to HTML and rendering it using streamlit.components.v1.html. The performance difference stays the same, so perhaps the issue is not within Streamlit but within Plotly.
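As a small illustration of why some datetime-to-string step has to happen before the data reaches the browser: Python's standard json module cannot encode datetime objects directly, so they have to be turned into strings first (ISO 8601 here). This is only a stdlib sketch of the general idea, not the code path that Streamlit or Plotly actually use.

```python
import json
from datetime import datetime, timedelta

# Three hourly timestamps, similar in spirit to the x-axis data in this thread.
start = datetime(2020, 1, 1)
stamps = [start + timedelta(hours=i) for i in range(3)]

# The stdlib encoder rejects raw datetime objects.
try:
    json.dumps({"x": stamps})
    raised = False
except TypeError:
    raised = True

# Converting each value to a string first makes the payload valid JSON.
# With many points, this per-value conversion is extra work that
# pre-stringified data does not need.
payload = json.dumps({"x": [ts.isoformat() for ts in stamps]})

print(raised)
print(payload)
```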
Thanks for the reply and for taking the time to look into this. I couldn’t quite figure out how to test whether Plotly alone was causing this, so thanks a lot.
While this may be caused by Plotly, I do not care as much about performance when working with Plotly in other contexts as I do when creating a dashboard, where the user can really feel the responsiveness. So even though this may not be (and probably is not) a bug, I hope that this post helps others optimize their dashboards.
One further note: I was working on my dashboard in PyCharm using the debugger, and there the issue becomes much worse (the string-typed version is more than 100x faster, compared to the 10x found here). I know this is not as widely interesting, but I wanted to mention it because it may trick others (as it did me) into thinking that their app is slower than it really is when deployed.
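For anyone wanting to try this in a dashboard, here is a minimal sketch of a helper that converts datetime columns to strings just before plotting, while leaving the original dataframe (and its datetime dtype) untouched for filtering or resampling. The name stringify_datetime_columns and the format string are assumptions for illustration, not anything provided by Plotly or Streamlit.

```python
import pandas as pd


def stringify_datetime_columns(
    df: pd.DataFrame, fmt: str = "%Y-%m-%d %H:%M:%S"
) -> pd.DataFrame:
    """Return a copy of df with every datetime column formatted as strings.

    Hypothetical helper: the input dataframe is not modified.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_datetime64_any_dtype(out[col]):
            out[col] = out[col].dt.strftime(fmt)
    return out


if __name__ == "__main__":
    df = pd.DataFrame(
        {"date": pd.date_range("2020-01-01", periods=3, freq="D"), "value": [1, 2, 3]}
    )
    plot_df = stringify_datetime_columns(df)
    print(plot_df.dtypes)  # "date" now holds strings
    print(df.dtypes)       # original "date" is still a datetime column
```

The converted copy can then be passed to px.line or go.Scatter as in the examples above. Whether the string version behaves identically for every chart type is something to check per app.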