Streamlit cache uses too much RAM

Summary

I’ve deployed Streamlit in a container and RAM usage keeps growing to many GBs. I’ve set every cache to persist to disk, so it’s unclear why RAM keeps growing.

I have two related questions:

  1. Does setting persist="disk" in the cache decorator still use RAM?
  2. How can I point the Streamlit cache location to a dedicated container volume?

Thanks!

Hi @CharlesFr

There are a few things that you can do.

Firstly, see if recommendations in the following blogs might help:

After use, cache can be cleared via:

Profiling the app can also show which individual processes are slowing it down:
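For the RAM side of the question specifically, one option is the standard library's `tracemalloc`. The sketch below is only an illustration under assumptions: `build_table` is a hypothetical stand-in for one of the app's data functions, and `measure_peak_memory` is a helper written for this example, not a Streamlit API.

```python
import tracemalloc


def build_table(rows: int) -> list[dict]:
    """Hypothetical stand-in for an expensive data function in the app."""
    return [{"id": i, "value": str(i) * 10} for i in range(rows)]


def measure_peak_memory(func, *args, **kwargs):
    """Run func and return (result, current_bytes, peak_bytes) as traced by tracemalloc."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # current: bytes still allocated now; peak: highest value seen since start()
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, current, peak


table, current, peak = measure_peak_memory(build_table, 10_000)
print(f"current={current / 1e6:.2f} MB, peak={peak / 1e6:.2f} MB")
```

Wrapping each data function this way gives a rough picture of which ones allocate the most, which helps separate memory held by ordinary variables from memory held by the cache.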

Hope these resources are helpful.

Thanks @dataprofessor - that is helpful indeed.

Would you mind advising on the second question: how can I provide a path for persisting the cache to disk?

@dataprofessor - would appreciate some insight from you:

  1. Can I prioritise caching to disk (rather than RAM)?
  2. How do I customise the cache path?

Thanks!

Hi @CharlesFr

Currently, you can persist to disk via st.cache_data as in:

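import pandas as pd  # assumption: the data is a CSV that pandas can read
import streamlit as st

@st.cache_data(persist="disk")
def fetch_and_clean_data(url):
    # Fetch the data from the URL, then clean it up (here: drop incomplete rows).
    data = pd.read_csv(url).dropna()
    return data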

However, specifying a custom path is not yet possible.

Please refer to the Docs page for more info:

Thanks!

All my data functions are decorated with @st.cache_data(persist=True) but I still get very large memory usage - am I correct in assuming that it’s very likely due to memory used by variables?

Hi @CharlesFr, just wanted to add that there is a memory leak that will be fixed in version 1.32 of Streamlit, and in the meantime, you can use this wheel file.

Our team has a few other suggestions for reducing memory usage in this thread (I’ll paste below).

To lower the memory usage, there are two quick things you could try:

  1. Use this wheel file which fixes a memory leak. This leak was fixed last week and will be released with 1.32.
  2. Deactivate the backend storage of forward messages in config.toml via (requires at least 1.30):
[global]
storeCachedForwardMessagesInMemory = false

This storage is a bit of an old artifact that most likely doesn’t have any use, and it’s a bit of a problem if there is a spike in user sessions.

These two changes might help with the issue, but there are other memory inefficiencies we are currently investigating. To give you more specific help, it would be great if you could tell us which of the following features your app is using:
st.file_uploader, st.image, st.video, st.audio, st.pyplot, st.download_button, long-running sessions using st.rerun, large dataframes/charts, or any of the caching decorators?

For my understanding, what does Streamlit use ephemeral/disk storage for? Are cached and pickled objects stored on disk or in memory?

I think everything in a normal Streamlit setup is stored in memory. However, you can configure certain aspects to store on disk (e.g. via st.cache_data(persist="disk")). But you are probably not doing that, are you?
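Since `st.cache_data` requires return values to be pickleable and stores a serialized copy, the pickled size of a value gives a rough idea of what one cache entry costs. This is a sketch only: `pickled_size_mb` and `sample` are hypothetical names for illustration, and the number is an estimate rather than what Streamlit itself reports.

```python
import pickle


def pickled_size_mb(obj) -> float:
    """Return the size of obj's pickled form in megabytes (10**6 bytes)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)) / 1e6


# Hypothetical value standing in for what a cached data function returns.
sample = {"rows": list(range(100_000))}
print(f"approx. cache entry size: {pickled_size_mb(sample):.2f} MB")
```

Multiplying that figure by the number of distinct argument combinations a function is called with gives a ballpark for how large its cache can grow.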

1.32.0 is now available, including the memory leak fix!
