Hi, is there a way to speed up my scatter plots?
I have 5 of them now and will have around 10 in total, but it's slow.
With a small CSV file (up to 1k rows and 25 columns) it's fast.
With 22k rows my timer reports 0.7421250219573975 as the load time, but it actually takes 2 to 4 seconds: I can still see the chart from the previous click (session state with a radio button) fade away before the scatter plot appears.
My scatter plots are laid out in rows, with no tabs and no columns.
With 200k rows the timer reports around 3.25, but loading takes another two to three times that, and the page sometimes becomes less responsive or unresponsive for a moment until everything is loaded.
And the more plots I add, the slower it gets.
Does plotting affect the speed and performance of a Streamlit app (a line chart with 200k rows takes ages to load, but the code is only a few words), and does writing the code in-row affect Streamlit performance?
I mean like this:
I can’t see the full code, but if you’re performing data transformations before plotting, you should use the st.cache_data decorator. This ensures that such operations are not repeated unnecessarily and should speed up the display of your scatter plots.
Let me know how it goes. I’m happy to review the code too if you like.
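Before deciding what to cache, it may help to time each stage separately, since the reported load time and the perceived delay differ so much. The following is only a minimal sketch using the standard library; `timed` and `timings` are hypothetical helper names, and the two workloads are stand-ins for the real groupby and the real px.* call.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> elapsed seconds (hypothetical helper state)


@contextmanager
def timed(stage):
    """Record how long the wrapped block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Stand-in workloads; in the app these would be the data transformation
# and the figure construction.
with timed("transform"):
    totals = sum(i * i for i in range(100_000))

with timed("build_figure"):
    points = [(i, i % 7) for i in range(50_000)]

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.4f} s")
```

Note that timers like this only measure the Python side. The time spent sending the figure to the browser and drawing it there is not captured, which may account for some of the gap between the measured and the perceived delay.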
I do have some data transformation, but it's within @st.cache.
The one thing that is not in @st.cache is the filtering.
The filter result is assigned to filtered_df, and my scatter plots, graphs, map, etc. change dynamically based on this filtered_df.
I also have 10 radio buttons mapped to st.session_state, and 8 of them are tied to filtered_df.
This is my scatter plot code:
elif st.session_state.type_filter == 'Filtered Time Series':
    # Time Series Weekly
    st.subheader("Filtered Weekly Earnings")
    linechartwrate = pd.DataFrame(filtered_df.groupby(filtered_df["week_year"].dt.strftime("%Y : %U"))["Rate"].sum()).reset_index()
    laiko_juosta_savaite = px.bar(linechartwrate, x="week_year", y="Rate", labels={"Rate": "Amount"}, height=600, width=1000, template="gridon")
    st.plotly_chart(laiko_juosta_savaite, use_container_width=True)
    csv = linechartwrate.to_csv(index=False).encode('utf-8')
    st.download_button("Download Time Series Weekly Data", data=csv, file_name="TimeSeriesWeekly.csv", mime="text/csv", help='Click here to download the file as a CSV file')
    with st.expander("View Time Series Weekly Data"):
        st.write(linechartwrate.T.style.background_gradient(cmap="Blues"))

    # Time Series Monthly
    st.subheader("Filtered Monthly Earnings")
    # Note: this adds month_year to df; filtered_df must already contain
    # this column for the groupby below to work.
    df["month_year"] = df["PuDate"].dt.to_period("M")
    linechartrate = pd.DataFrame(filtered_df.groupby(filtered_df["month_year"].dt.strftime("%Y : %m"))["Rate"].sum()).reset_index()
    laiko_juosta_menuo = px.bar(linechartrate, x="month_year", y="Rate", labels={"Rate": "Amount"}, height=600, width=1000, template="gridon")
    st.plotly_chart(laiko_juosta_menuo, use_container_width=True)
    csv = linechartrate.to_csv(index=False).encode('utf-8')
    st.download_button("Download Time Series Monthly Data", data=csv, file_name="TimeSeriesMonthly.csv", mime="text/csv", help='Click here to download the file as a CSV file')
    with st.expander("View Time Series Monthly Data"):
        st.write(linechartrate.T.style.background_gradient(cmap="Blues"))
I see. I would definitely cache the data filtering function.
The new st.cache_data decorator is used for caching functions that return data, such as dataframes, text, or computations with basic types.
You could try something along these lines:
import streamlit as st
import pandas as pd
import plotly.express as px

# Cache the data loading function using st.cache_data
@st.cache_data
def load_data():
    # Load your data here
    return df

# Cache the data filtering function using st.cache_data
@st.cache_data
def filter_data(df, filter_conditions):
    # Apply your filtering logic here based on the filter_conditions
    filtered_df = df[filter_conditions]
    return filtered_df

# Load the data (this will be cached)
data = load_data()

# Define your filter conditions based on user input or other criteria
filter_conditions = ...

# Filter the data (this will be cached)
filtered_df = filter_data(data, filter_conditions)

# Now use the filtered_df for your plotting
if st.session_state.type_filter == 'Scatter Plot':
    fig = px.scatter(
        filtered_df,
        x="RPM",
        y="Weight",
        color="Price",
        color_continuous_scale="gnbu"
    )
    st.plotly_chart(fig, theme="streamlit", use_container_width=True)
Can you please help me understand the point of using st.cache_data when the data is filtered with drop-down options? Whenever the user selects a new value in the drop-down, the whole script is rerun. I may have misunderstood, but it seems to me that caching dynamic data unnecessarily increases memory usage and slows down performance.
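One rough way to look at this: the script does rerun on every selection, but a cached function only executes its body when it is called with arguments it has not seen before. The sketch below uses Python's functools.lru_cache as a stand-in for st.cache_data (they are not the same implementation), and ROWS, filter_rows, and calls are hypothetical names used only for illustration.

```python
from functools import lru_cache

# Hypothetical data: (region, rate) rows standing in for the real DataFrame.
ROWS = (("north", 10.0), ("south", 4.5), ("north", 2.5), ("east", 7.0))

calls = {"count": 0}  # counts how often the filter body actually runs


@lru_cache(maxsize=2)  # bounded cache, loosely analogous to max_entries=2
def filter_rows(region):
    """Return the rows for one region; the body runs only on a cache miss."""
    calls["count"] += 1
    return tuple(row for row in ROWS if row[0] == region)


filter_rows("north")  # miss: computed
filter_rows("south")  # miss: computed
filter_rows("north")  # hit: served from the cache, body not run
filter_rows("east")   # miss: computed, least recently used entry is evicted

print(calls["count"])                    # 3
print(filter_rows.cache_info().currsize)  # 2
```

The memory concern is a fair one, since each distinct filter value adds an entry. st.cache_data accepts max_entries and ttl to bound this. It also has to hash its arguments on every call, so caching a cheap filter over a large DataFrame may not pay off; it tends to help most when the cached step is expensive and users often come back to the same selections.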