Using st.cache for Matplotlib figures (or hash_funcs for complex objects)

I posted this on https://twitter.com/andfanilo/status/1421886366510616585 and figured it may be of interest to some of you :slight_smile:

You sometimes create a function that builds a complex Matplotlib figure out of a dataframe, and you don’t want it to rerun every time you interact with a Streamlit widget, so you decide to decorate it with @st.cache, only to be greeted by a hashing error.

Part of the solution is provided in the trace: Streamlit does not know how to hash Spines and needs you to tell it how through hash_funcs.
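A minimal sketch of such a hash function, assuming the type Streamlit complains about is matplotlib.spines.Spines (Matplotlib 3.4 or later), which behaves like a mapping from spine name to spine. The name hash_spines is made up for illustration, and the st.cache wiring appears only as comments, with build_figure as a placeholder:

```python
def hash_spines(spines):
    """Reduce a Spines-like mapping to a hashable tuple of (name, edge color).

    Illustrative helper, not part of Streamlit or Matplotlib. It only needs
    each value to expose get_edgecolor(), as Matplotlib spines do.
    """
    return tuple(
        (name, tuple(spine.get_edgecolor())) for name, spine in spines.items()
    )


# Assumed wiring with the legacy st.cache API (build_figure is a placeholder):
#
#   import matplotlib.spines
#   import streamlit as st
#
#   @st.cache(hash_funcs={matplotlib.spines.Spines: hash_spines})
#   def build_figure(df):
#       ...
```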

Now, building a hash from the color of each spine in the Spines collection is a silly example, but it shows you can hash it any way you want.

Why silly, you ask? Try it yourself and see that:

  1. it is supppper slooow: maybe interacting with spines is slow, maybe other variables internal to the Matplotlib Figure are also slow to hash.
  2. what if you edit the underlying Matplotlib plotting function and change the width of the spines? Your hash would keep the same value, as it does not look at the width when it is computed. You could actually end up with a stale cached result after changing the plotting code during your live-coding session. So make sure your hash identifies your object as specifically as possible!
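The second point can be reproduced without Matplotlib at all. In this sketch, FakeSpine is a hypothetical stand-in for a real spine, and the two key functions are illustrative names, not Streamlit API:

```python
from dataclasses import dataclass


@dataclass
class FakeSpine:
    """Hypothetical stand-in for a Matplotlib spine (not a real Matplotlib class)."""
    color: str
    linewidth: float


def color_only_key(spines):
    # Looks at colors only, like the silly example above.
    return tuple((name, s.color) for name, s in spines.items())


def color_and_width_key(spines):
    # Also looks at the width, so a width change produces a different key.
    return tuple((name, s.color, s.linewidth) for name, s in spines.items())


before = {"left": FakeSpine("black", 1.0), "bottom": FakeSpine("black", 1.0)}
after = {"left": FakeSpine("black", 3.0), "bottom": FakeSpine("black", 3.0)}

# Same key although the spines changed: a cache keyed on this would go stale.
print(color_only_key(before) == color_only_key(after))            # True
# Different key: the change is detected.
print(color_and_width_key(before) == color_and_width_key(after))  # False
```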

There are multiple ways to handle this in the Streamlit docs (definitely have a look!), each with its own pros and cons.

  • If Python’s built-in hash can handle the object, hash the whole thing! Streamlit has a particular way of hashing objects to make sure every referenced variable has weight in the hash, but for a rendered Matplotlib figure it may be overkill. In other cases though, the Streamlit magic hashing may be faster than Python’s native way, so test carefully.
  • If creating the figure is only part of a bigger cached function, you may want to disable its hashing by mapping its type to the lambda _: None anonymous function in hash_funcs.
  • If the figure is the return value of the function, allow_output_mutation=True deactivates hashing of the resulting figure, so the cached result only depends on the provided inputs and the body of the function. Just be careful not to mutate the output plot in your Streamlit app: Streamlit no longer tracks such mutations, so they could mess up the cache!
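As a rough sketch, the three options above could look like this with the legacy st.cache API the post is about. All function names are made up for illustration, and the example assumes the figure comes from plt.subplots(), so its type is matplotlib.figure.Figure:

```python
def make_cached_plotters():
    """Return one cached plotting function per option (illustrative names).

    Imports are local so the definitions can be read without Streamlit
    installed; call this from a script launched with `streamlit run`.
    """
    import matplotlib.pyplot as plt
    import streamlit as st
    from matplotlib.figure import Figure

    # Option 1: let Python's built-in hash() handle the figure.
    @st.cache(hash_funcs={Figure: hash})
    def plot_with_python_hash(values):
        fig, ax = plt.subplots()
        ax.plot(values)
        return fig

    # Option 2: disable hashing of the figure type altogether.
    @st.cache(hash_funcs={Figure: lambda _: None})
    def plot_without_figure_hash(values):
        fig, ax = plt.subplots()
        ax.plot(values)
        return fig

    # Option 3: skip hashing of the returned value, whatever its type.
    @st.cache(allow_output_mutation=True)
    def plot_without_output_hash(values):
        fig, ax = plt.subplots()
        ax.plot(values)
        return fig

    return plot_with_python_hash, plot_without_figure_hash, plot_without_output_hash
```

In an app you would pick one of the returned functions, call it with your data, and hand the figure to st.pyplot. Which option is fastest for a given figure is worth measuring rather than assuming.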

So yeah, go give it a try!


This wraps up discussions in a number of issues, which you can check for further details:

Hope you liked it. I did this one in a rush, but given the positive feedback, I’ll probably do a few more of these, in less of a rush, based on recurring questions I see floating around :laughing:. In the meantime, happy Streamlitin’!

Fanilo :balloon:

BEAUTIFUL @andfanilo!!! :heart_eyes_cat:

+ve feedback * 20!
