Hello! I am using st.cache() when pulling data from the web and it is making my app perform 100x better than it did without it. I am using the following function to get data from the web:
@st.cache
def get_data():
    col_list = [*list of columns here*]
    data = pd.read_csv('url', low_memory=False, usecols=col_list)
    return data

data = get_data()
When I run this, I get the warning:
CachedObjectMutationWarning: Return value of get_data() was mutated between runs.
By default, Streamlit's cache should be treated as immutable, or it may behave in unexpected ways. You received this warning because Streamlit detected that an object returned by get_data() was mutated outside of get_data().
The warning also suggests I use @st.cache(allow_output_mutation=True) to allow this. However, I don’t want my data to be messed up somehow.
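To show the kind of problem I am worried about, here is a minimal sketch. It does not use Streamlit at all: `functools.lru_cache` from the standard library is only a stand-in for a cache that hands back the same object on every call, and I am assuming (not claiming) that the risk with `allow_output_mutation=True` is similar. The function `load_rows` and its contents are hypothetical.

```python
import functools


@functools.lru_cache(maxsize=None)
def load_rows():
    """Hypothetical stand-in for the expensive web download."""
    return [{"city": "A", "sales": 10}, {"city": "B", "sales": 20}]


rows = load_rows()

# In-place edit: this changes the very object that the cache holds.
rows[0]["sales"] = 999

# A later call returns the same, now altered, object.
rows_again = load_rows()
same_object = rows_again is rows
mutated_value = rows_again[0]["sales"]
```

After the in-place edit, every later call sees the altered value, because the cache never re-runs the function to restore the original.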
I believe I am getting this warning because I modify the data after pulling it (mainly with pandas functions), based on certain user inputs. I just want to make sure this is okay and that I am not at risk of corrupting my data somewhere. The app works wonderfully when I use @st.cache(), but I am not sure it is worth the potential errors or incorrectness it may cause in my data. I even tried creating a copy of the fetched data like so:
...
data = get_data()
data1 = data.copy()
But I still get the warning. If I use @st.cache(allow_output_mutation=True), is it okay if I alter my data after pulling it? If not, how are we supposed to alter data we pull from the web without messing it up?
To be clear, I just want to pull this data once and store it in a cache, and then based on user input, do certain things to it. Without this cache decorator, it seems like the app is pulling the data after every user input, which takes a couple seconds and isn’t desired.
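The behavior I am after can be sketched roughly as follows. Again, this is plain Python with `functools.lru_cache` as a stand-in cache, not Streamlit's actual mechanism, and `load_rows`, `rows_for_input`, and `min_sales` are hypothetical names I made up for the example. Whether `st.cache` can be used the same way is exactly what I am asking.

```python
import copy
import functools

load_count = 0  # counts how often the "download" actually runs


@functools.lru_cache(maxsize=None)
def load_rows():
    """Hypothetical stand-in for the slow web download."""
    global load_count
    load_count += 1
    return [{"city": "A", "sales": 10}, {"city": "B", "sales": 20}]


def rows_for_input(min_sales):
    """Build a per-input result without touching the cached rows."""
    working = copy.deepcopy(load_rows())  # private copy for this run
    selected = [row for row in working if row["sales"] >= min_sales]
    for row in selected:
        row["selected"] = True  # edits only the copy
    return selected


first = rows_for_input(15)   # simulates one user input
second = rows_for_input(5)   # simulates a different user input
cached = load_rows()
```

Here the download runs once, each user input gets its own result, and the cached rows stay exactly as they were fetched.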
Thanks!