I wonder if there are any memory limits to the cache that Streamlit uses to store cached results with cache_data. I think I read it’s 1 GB, but I’m not too sure.
In my app, I have a function decorated with st.cache_data. It returns a relatively large output: a dataframe of around 1 MB. I’m afraid that if this function gets called with many different parameters during a session, many results will be cached and whatever resource stores them will fill up. If so, what is the consequence? Will a warning appear? Does the app take a performance hit? Will older results get overwritten?
The answer will help me decide whether to use st.cache_data on said function.
Hi @wangp22, I don’t believe there is an inherent limit with st.cache_data. However, the total amount of memory available to Community Cloud-hosted apps is limited. See Manage your app - Streamlit Docs for more details, but you are correct that you get 1 GB of RAM.
You are also correct that, if you call cache_data on a function 100 different times with 100 different parameters, you can expect the total amount of memory used to climb approximately 100x. If you do end up using up all of the memory on your instance, it may well crash and need to be rebooted.
The easiest workaround is to limit how many different results are remembered with max_entries=10, or whatever number is reasonable. See st.cache_data - Streamlit Docs for more details.
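In your app the change is just writing the decorator as @st.cache_data(max_entries=10). If you want to see the bounded-cache idea in action without Streamlit, functools.lru_cache from the standard library behaves the same way: once the limit is hit, the least-recently-used entry gets evicted and that input is recomputed on its next call. (Streamlit’s exact eviction order may differ; this is just an illustration of the bound.)

```python
from functools import lru_cache

call_count = 0  # counts how often the "expensive" body actually runs

@lru_cache(maxsize=3)  # analogous to @st.cache_data(max_entries=3)
def expensive(param):
    global call_count
    call_count += 1
    return param * 2

for p in [1, 2, 3, 1]:  # fourth call hits the cache: 1 is still stored
    expensive(p)
assert call_count == 3

expensive(4)  # cache is full, so the least-recently-used entry (2) is evicted
expensive(2)  # recomputed: one more real call
assert call_count == 5
```

With a bound in place, memory use stays proportional to max_entries times the size of one result, instead of growing with every distinct parameter combination seen during the session.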
You also might check whether the cache is really necessary – 1 MB is not a terribly large amount for some purposes, and if the dataframe is quick to generate, caching might not actually make a big difference in performance. It all depends on how you are getting/generating that df.
You also might want to consider separating out generating the “base data” and any transformations on it. It’s hard to speak in terribly general terms, but I often end up doing something like this:
# Get data from a database, or some other source.
# This function often doesn't need many parameters.
# I may or may not need to cache this data, depending on how slow it is.
def fetch_some_data():
    ...

def transform_the_data(filter1, filter2):
    base_data = fetch_some_data()
    # Do the transformations and return them
    ...
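Fleshing that pattern out a bit: cache only the slow fetch, and leave the cheap per-filter transform uncached, so you don’t store one copy of the data per filter combination. Here’s a runnable Streamlit-free sketch (the data, the filter logic, and the lru_cache stand-in for @st.cache_data are all made up for illustration):

```python
from functools import lru_cache

fetch_calls = 0  # counts how often the slow fetch actually runs

@lru_cache(maxsize=1)  # stand-in for @st.cache_data on the slow fetch
def fetch_some_data():
    global fetch_calls
    fetch_calls += 1
    # Pretend this is a slow database query returning rows.
    return tuple({"city": c, "pop": p}
                 for c, p in [("Oslo", 1), ("Lima", 11), ("Kyiv", 3)])

def transform_the_data(min_pop):
    # Cheap filtering: no need to cache a separate result per filter value.
    return [row for row in fetch_some_data() if row["pop"] >= min_pop]

transform_the_data(2)
transform_the_data(5)
assert fetch_calls == 1  # the expensive part ran only once
```

The nice property is that calling transform_the_data with many different filters only ever stores one cached copy of the base data, rather than one cached dataframe per filter combination.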