Hi,
I wonder whether there is a memory limit on the cache Streamlit uses to store results when using cache_data. I think I read it's 1 GB, but I'm not sure.
In my app, I have a function that uses the st.cache_data decorator. It returns a relatively large output, a dataframe of around 1 MB. I'm afraid that if this function gets called with many different parameters during a session, many results will be cached and whatever resource stores them will fill up. If that's true, what is the consequence? Will a warning appear? Does the app suffer a performance hit? Will the older results get overwritten?
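For context, the pattern looks roughly like this (the function and parameter names are only illustrative, not my real code):

import pandas as pd
import streamlit as st

@st.cache_data
def build_report(ticker: str, year: int) -> pd.DataFrame:
    # Placeholder for expensive work that produces a roughly 1 MB dataframe
    return pd.DataFrame({"ticker": [ticker], "year": [year]})

Each distinct (ticker, year) combination would get its own cached copy, so the cache grows with the number of unique calls made during a session.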
The answer will help me decide whether to use st.cache_data on said function.
Thanks
Hi @wangp22, I don’t believe there is an inherent limit with st.cache_data. However, the total amount of memory available to Community Cloud-hosted apps is limited. See Manage your app - Streamlit Docs for more details, but you are correct that you get 1GB of RAM.
You are also correct that, if you call a cache_data-decorated function with 100 different parameter combinations, you can expect the memory used by the cache to grow roughly 100x. If you do end up using all of the memory on your instance, it may well crash and need to be rebooted.
The easiest workaround is to limit how many results are kept with max_entries=10, or whatever number is reasonable. See st.cache_data - Streamlit Docs for more details.
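A minimal sketch of what that could look like (the function name and body are placeholders, not from your app):

import pandas as pd
import streamlit as st

# Keep at most 10 cached results; when the cache is full, the oldest entry
# is dropped to make room. ttl is optional and additionally expires entries
# after an hour.
@st.cache_data(max_entries=10, ttl=3600)
def load_filtered_data(filter_value: str) -> pd.DataFrame:
    # Placeholder for the expensive work the real function does
    return pd.DataFrame({"filter": [filter_value], "value": [42]})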
You might also check whether the cache is really necessary – 1 MB is not a terribly large amount for some purposes, and if the dataframe is quick to generate, caching might not make a noticeable difference in performance. It all depends on how you are getting/generating that df.
You might also want to consider separating the generation of the “base data” from any transformations on it. It’s hard to speak in very general terms, but I often end up doing something like this:
import streamlit as st

@st.cache_data
def fetch_some_data():
    # Get the data from a database, or some other source.
    # This function often doesn't need many parameters.
    ...

# I may or may not need to cache this, depending on how slow it is
def transform_the_data(filter1, filter2):
    base_data = fetch_some_data()
    # Do the transformations on base_data and return them
    ...
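With this split, only fetch_some_data holds a cached copy of the raw data; a call like transform_the_data("foo", "bar") just re-runs the (presumably cheap) filtering on that shared base dataframe, so the cache doesn't grow with every new filter combination.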