Caching results of class methods

Hi all! I’ve decided to build a Streamlit app to demonstrate how my library works, and I’ve got stuck on some performance issues.

I have a class with several methods that return Plotly objects, for example:

vh = VisualisationsHandler(data1, data2)
plt = vh.build_plot(args)

The methods are slow because of the preprocessing needed for the visualisations, so I thought I could just cache their results and use st.plotly_chart for fast rendering. I came up with the following solution for caching the method results:

@st.cache
def building_plot(vh, args):
    return vh.build_plot(args)

But it didn’t improve timings much: it still takes a considerable amount of time. Additionally, every time I change any of the args I pass to the plotting methods, all the plots are redrawn.

I’ve tried to check for cache misses by inserting st.write(f"Cache miss: expensive_computation({vh_outer})") inside the cached function. Caching seems to be working correctly, but it is still very slow. I suspect this is connected to the hashing of my class.

Do you have any ideas how I can optimise that without rewriting my class?

Hey @Sergey_Titov - welcome to Streamlit!

@st.cache won’t make a function run more quickly; it just ensures that two calls to the same function with the same arguments don’t result in the function running twice. In other words:

import time
import streamlit as st

@st.cache
def expensive_function(seconds):
    time.sleep(seconds)

start = time.time()

expensive_function(5)

st.write("Total seconds: ", round(time.time() - start, 2))

^ On a cold cache, this example takes ~5 seconds to run: with only one call there is nothing to reuse, so @st.cache provides no speed benefit (but it doesn’t hurt performance either).

If you have two or more calls to expensive_function(5), those subsequent calls should be very quick, because the function itself won’t actually be run:

import time
import streamlit as st

@st.cache
def expensive_function(seconds):
    time.sleep(seconds)

start = time.time()

expensive_function(5)
expensive_function(5)
expensive_function(5)

st.write("Total seconds: ", round(time.time() - start, 2))

^ This example should also take ~5 seconds to run, because expensive_function(5) is only being run the first time.

However, @st.cache will also re-run your function when any of the args change. For example:

import time
import streamlit as st

@st.cache
def expensive_function(seconds):
    time.sleep(seconds)

start = time.time()

expensive_function(4)
expensive_function(3)
expensive_function(4)
expensive_function(3)

st.write("Total seconds: ", round(time.time() - start, 2))

^ This will take ~7 seconds to run, because expensive_function(4) and expensive_function(3) both need to be run once (the subsequent calls will already be cached).

So, all this is to say that st.cache does not make your function faster; it just prevents it from being called again when you pass the same arguments multiple times.
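
If you want to check that rule without waiting on time.sleep, here is a minimal sketch. It uses functools.lru_cache from the standard library as a stand-in for @st.cache (an assumption for illustration only; Streamlit’s cache is implemented differently) and records how often the function body really runs. The names call_log and expensive_function_stub are made up for this example.

```python
from functools import lru_cache

# Records every argument for which the function body actually executed.
call_log = []

@lru_cache(maxsize=None)  # stand-in for @st.cache, NOT Streamlit's implementation
def expensive_function_stub(seconds):
    call_log.append(seconds)  # only reached on a cache miss
    return seconds

# Same call pattern as the last example: 4, 3, 4, 3.
for arg in (4, 3, 4, 3):
    expensive_function_stub(arg)

# The body ran once per distinct argument; the repeated calls were cache hits.
simulated_seconds = sum(call_log)
print(call_log, simulated_seconds)
```

Only the two distinct arguments reach the function body, which is why the sleeping version adds up to roughly 4 + 3 seconds rather than 14.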

(Apologies if all this is offensively elementary and basic, and I’m misunderstanding your question! If you’re saying that st.cache itself is adding lots of additional overhead, this may have to do with the complexity of the arguments that you’re passing to your function, because Streamlit needs to hash those arguments in order to generate the proper cache key.)
