Caching Function With an Unhashable Argument

Hey all, I’m working on a small app that connects to Mailchimp (code below). I cached my fetch_mailchimp_lists() function with @st.cache_data to avoid refetching data every time Streamlit reruns the script.

The function takes a Mailchimp client object as a parameter. Here’s the issue I ran into:

When a user connected their own Mailchimp account with their API key, the lists returned were from my demo account, not theirs. After some digging, I realized this happened because the function was cached and its sole argument was an unhashable Mailchimp Client object. Streamlit threw an error when it tried to hash the client, so I added an underscore prefix to the argument name to exclude it from hashing. I believe that, because of this, Streamlit can’t tell when to use the cached value and when to call the function again: the argument isn’t hashed or compared, so even when a different Client object was passed in, Streamlit still returned the cached value.

After removing @st.cache_data from the function, everything worked as expected, but of course it now calls the function more often than I want it to.

Just posting here to confirm: am I understanding this behavior correctly? That is, Streamlit couldn’t detect that the argument (a Mailchimp client object) had changed because it’s unhashable and therefore excluded from hashing, so it always returns the cached result?

If so, how can I get around this?

Info:
App deployed on Community Cloud
Python version: 3.12.10
Streamlit version: 1.44.1

From the docs:

A function’s arguments must be hashable to cache it. If you have an unhashable argument (like a database connection) or an argument you want to exclude from caching, use an underscore prefix in the argument name. In this case, Streamlit will return a cached value when all other arguments match a previous function call. Alternatively, you can declare custom hashing functions with hash_funcs.

See also Excluding input parameters.
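
To make the quoted behavior concrete, here is a toy model in plain Python. It is not Streamlit's implementation; toy_cache and FakeClient are made-up names used only to illustrate the idea. The cache key is built solely from parameters whose names don't start with an underscore, so a function whose only parameter is _client always maps to the same key:

```python
import inspect
from functools import wraps


def toy_cache(func):
    """Toy stand-in for a cache that ignores underscore-prefixed parameters."""
    store = {}
    sig = inspect.signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # Build the key only from parameters whose names do not start with "_".
        key = tuple(
            (name, value)
            for name, value in bound.arguments.items()
            if not name.startswith("_")
        )
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


class FakeClient:
    """Hypothetical stand-in for an API client (not the Mailchimp SDK)."""

    def __init__(self, account):
        self.account = account


@toy_cache
def fetch_lists(_client):
    # The only parameter is excluded, so the cache key is always empty.
    return [f"{_client.account}-list"]


demo = fetch_lists(FakeClient("demo"))
user = fetch_lists(FakeClient("user"))  # served from the "demo" entry
```

With no other arguments to compare, the second call hits the entry created by the first one, which matches what was described in the question.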

One method I’ve used to deal with a similar situation is to pass something that is hashable, that uniquely identifies this particular call.

For example, in your case, you might pass the API key into the cached function even though it isn’t used in the function body. A user with a different API key gets a different client, so the differing key tells st.cache_data that it needs to compute and return a different result.

So, something like fetch_mailchimp_lists(_client: Client, api_key: str): ..., or any other hashable value that uniquely identifies the particular client that is trying to connect.
