Hi @optimvs
Have you looked into caching your data using `st.cache_data` (st.cache_data - Streamlit Docs)?
Also see this post that suggests using an option for persisting the cache to disk (`st.cache_data(persist="disk")`).
Another way to improve performance is to check whether the entire dataset is actually needed. Here are some questions to consider:
- Are all columns needed?
- Or are only certain columns used? If so, you can load just those with the `usecols` parameter of `pd.read_csv`.
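As a quick illustration of `usecols`, here's a small self-contained sketch. The column names and the inline CSV are made up for the example; with a real file you would pass its path instead of the `StringIO` object:

```python
import io

import pandas as pd

# Stand-in for a real CSV file with four columns (made-up data).
csv_text = "id,name,price,notes\n1,apple,0.5,fresh\n2,pear,0.75,ripe\n"

# Only the listed columns are parsed and kept in memory;
# "name" and "notes" are skipped while reading.
df = pd.read_csv(io.StringIO(csv_text), usecols=["id", "price"])

print(df.columns.tolist())  # ['id', 'price']
```

On wide files this can cut both load time and memory use, since the unused columns never make it into the DataFrame.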
Another great read is the blog post by @randyzwitch on building performant apps (6 Tips for Improving Your App Performance | Streamlit), in particular sections 4 and 5: *4. Remove unused data* and *5. Optimize data storage formats*.
Hope this helps!