Using `st.cache()` with `CachedObjectMutationWarning:`

bismo · December 24, 2020, 4:22pm

Hello! I am using st.cache() when pulling data from the web and it is making my app perform 100x better than it did without it. I am using the following function to get data from the web:

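import pandas as pd
import streamlit as st

@st.cache
def get_data():
    # Columns to load (placeholder kept from the original post)
    col_list = [*list of columns here*]
    # 'url' stands in for the real CSV address
    data = pd.read_csv('url', low_memory=False, usecols=col_list)
    return data

data = get_data()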

When I run this, I get the warning:

CachedObjectMutationWarning: Return value of get_data() was mutated between runs.

By default, Streamlit's cache should be treated as immutable, or it may behave in unexpected ways. You received this warning because Streamlit detected that an object returned by get_data() was mutated outside of get_data().

The warning also suggests using @st.cache(allow_output_mutation=True) to allow this, but I don’t want my data to be messed up somehow.

I believe I am getting this warning because, after pulling the data, I transform it (mainly with pandas functions) based on certain user inputs. I just want to make sure this is okay and that I am not at risk of corrupting my data somewhere. The app works wonderfully when I use @st.cache(), but I am not sure it is worth the potential errors or incorrect results it may cause in my data. I even tried creating a copy of the fetched data like so:

...
data = get_data()
data1 = data.copy()

But I still get the warning. If I use @st.cache(allow_output_mutation=True), is it okay if I am altering my data after pulling it? If not, how are we supposed to alter data we pull from the web without messing it up?

To be clear, I just want to pull this data once and store it in a cache, and then based on user input, do certain things to it. Without this cache decorator, it seems like the app is pulling the data after every user input, which takes a couple seconds and isn’t desired.
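The "pull once, then transform per user input" pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: `functools.lru_cache` stands in for `st.cache`, a list of dicts stands in for the DataFrame, and `get_data`, `filter_rows`, `FETCH_COUNT`, and the sample rows are all hypothetical names and values, not part of the original app.

```python
import copy
from functools import lru_cache

FETCH_COUNT = 0  # counts how often the "download" actually runs


@lru_cache(maxsize=None)
def get_data():
    """Hypothetical stand-in for the slow pd.read_csv call."""
    global FETCH_COUNT
    FETCH_COUNT += 1
    return [{"region": "north", "cases": 10}, {"region": "south", "cases": 25}]


def filter_rows(region):
    """Work on a deep copy so the cached rows are never modified."""
    rows = copy.deepcopy(get_data())
    for row in rows:
        row["cases_doubled"] = row["cases"] * 2  # mutation stays on the copy
    return [row for row in rows if row["region"] == region]


north = filter_rows("north")
south = filter_rows("south")
```

The loader runs once, and every later transformation touches only the copy. In the pandas case, `data.copy()` is a deep copy by default, so the equivalent is to make sure every later operation works on the copy and never on `data` itself.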

Thanks!

Marisa_Smith · December 24, 2020, 5:14pm

Hey @bismo,

When you said you’re pulling data from the web, what do you mean? I ask because when I read this, I have an inkling that each time you “pull data” from the web it has the potential to change (depending on what you’re scraping).

The way @st.cache works is that it remembers the output of the function you’re running so that you don’t have to actually run the function again. It seems that when you read this CSV file from that URL, the data itself is different each time, and that is what triggers this warning.
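To make "remembers the output" concrete, here is a simplified model of a cache that stores a function's result and fingerprints it, so it can later tell whether the stored object was changed. It is a sketch under assumptions, not Streamlit's actual implementation: `CheckedCache` and `fingerprint` are hypothetical names, and hashing a pickled snapshot is just one simple way to detect a change.

```python
import hashlib
import pickle


def fingerprint(value):
    """Hash a pickled snapshot of the value."""
    return hashlib.sha256(pickle.dumps(value)).hexdigest()


class CheckedCache:
    """Remembers one function's output and notices if it was later mutated."""

    def __init__(self, func):
        self.func = func
        self.value = None
        self.digest = None
        self.calls = 0  # how many times func really ran

    def get(self):
        if self.digest is None:
            # First call: run the function and record a fingerprint.
            self.calls += 1
            self.value = self.func()
            self.digest = fingerprint(self.value)
            return self.value, False
        # Later calls: reuse the stored value and compare fingerprints.
        mutated = fingerprint(self.value) != self.digest
        return self.value, mutated


cache = CheckedCache(lambda: {"rows": [1, 2, 3]})
first, mutated_first = cache.get()
first["rows"].append(4)  # caller changes the cached object in place
second, mutated_second = cache.get()
```

In this model the function body runs only once. The second lookup returns the same object and reports it as mutated because the caller changed it in place, which matches the wording of the warning quoted earlier in the thread ("mutated outside of get_data()").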

I wouldn’t expect mutations you make to the data further down in your app to cause this.

Can you check that nothing is changing from this website somehow?

Thanks!
Marisa

bismo · December 24, 2020, 5:54pm

Hey @Marisa_Smith! I am actually pulling data from two links. They are CSV files from a GitHub repository. One is updated by the host weekly, the other daily, so the data does indeed change over time as new data is added to the files (which I am aware of and expect). Is that okay? Sorry for the confusing terminology in my initial post.

EDIT: Just some more info: one of the files is updated overnight, so the data rarely differs between runs of the app.
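Since the source files change on a daily or weekly schedule, one way to keep a cache from serving stale data is to let entries expire after a fixed time. The sketch below shows that idea in plain Python. `TTLCache`, the fake clock, and the one-day lifetime are assumptions made for illustration. Streamlit's caching decorators also accept a `ttl` argument, which may be the more direct route in a real app.

```python
import time


class TTLCache:
    """Caches one function's result and refetches it once it is too old."""

    def __init__(self, func, ttl_seconds, clock=time.monotonic):
        self.func = func
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.value = None
        self.fetched_at = None
        self.fetches = 0  # how many times func really ran

    def get(self):
        now = self.clock()
        expired = (
            self.fetched_at is None
            or now - self.fetched_at >= self.ttl_seconds
        )
        if expired:
            self.value = self.func()
            self.fetched_at = now
            self.fetches += 1
        return self.value


ONE_DAY = 24 * 60 * 60
fake_now = [0.0]  # injectable clock so the example does not need to sleep

cache = TTLCache(lambda: "csv contents", ttl_seconds=ONE_DAY,
                 clock=lambda: fake_now[0])
cache.get()
cache.get()
fetches_same_day = cache.fetches

fake_now[0] += ONE_DAY  # pretend a day has passed
cache.get()
fetches_next_day = cache.fetches
```

Repeated lookups on the same day reuse the stored value, and the first lookup after the lifetime has passed fetches again.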

Thanks!
