Streamlit memory issue

Our team is experiencing an issue with the memory usage of our Streamlit app when deployed on Heroku. Allocated memory keeps growing and appears never to be released. At a certain threshold (in line with the behaviour described at https://stackoverflow.com/questions/25348251/when-do-heroku-dynos-use-swap-memory ) swap memory starts to fill. The application quickly reaches the memory limit and “R14 quota exceeded” warnings are triggered. RSS and swap memory keep growing past 170% memory usage, at which point an “R15 memory quota vastly exceeded” error is triggered and the application soon crashes. After every restart the pattern repeats.

RELEVANT INFO:

  • Python version: 3.10.5
  • Streamlit version: 1.37.0 (see next comment)
  • The Streamlit version was updated recently, from 1.32.2 to 1.37.0. We started experiencing the problem after this update.
  • The application uses a wide variety of Streamlit elements, including the caching mechanisms.
  • Running the application locally (on macOS) works smoothly with the expected memory behaviour (allocate and release). The following screenshot shows memory allocation when reproducing locally:

WHAT WE TRIED:

There are a good number of recent threads about memory problems with Streamlit applications deployed on Linux, which sounds like our scenario as well (again, running locally on macOS works perfectly), but no relevant explanations or solutions have been suggested so far:


Furthermore, what would you suggest as a short-term workaround for this problem?
The production environment is impacted and the problem is serious; we need to put the app in a safer spot asap.

Possible changes that come to mind:

  1. Scale up - Increase Heroku memory (dyno size/number of dynos): expensive, and it would effectively only postpone the crash (we do have a daily restart in place, though).
  2. Switch to st.cache_data - Refactor the code to stop handling the cache directly through st.session_state (a dictionary) and use st.cache_data decorators (and similar) instead, possibly specifying ttl/max_entries: the refactor would still take some time and there is no guarantee that the problem would be solved.
  3. Switch from Heroku to AWS - Change hosting platform: there is evidence that the same problem might occur on AWS as well (Memory leak when hosting streamlit v1.34 app on AWS EC2 instance via Linux docker container).
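
A stopgap in the spirit of option 2 that avoids a full refactor could be to bound whatever is kept in st.session_state. The following is only a minimal sketch under stated assumptions: `cached_load`, `CACHE_KEY` and all parameter names are hypothetical, a plain dict stands in for st.session_state, and the store is assumed to behave like a standard mutable mapping. It would cap what the cache itself holds; it does not explain the growth.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, MutableMapping

CACHE_KEY = "_bounded_cache"  # hypothetical key under which the cache lives


def cached_load(
    store: MutableMapping[str, Any],
    name: str,
    loader: Callable[[], Any],
    max_entries: int = 3,
    ttl_seconds: float = 600.0,
    now: Callable[[], float] = time.monotonic,
) -> Any:
    """Return the value for `name`, loading it at most once per TTL window.

    Entries older than `ttl_seconds` are dropped, and the least recently
    used entries are evicted once more than `max_entries` are held.
    """
    cache: OrderedDict = store.setdefault(CACHE_KEY, OrderedDict())
    current = now()

    # Drop every entry whose load time is older than the TTL.
    expired = [k for k, (stamp, _) in cache.items() if current - stamp > ttl_seconds]
    for key in expired:
        del cache[key]

    if name in cache:
        cache.move_to_end(name)  # mark as most recently used
        return cache[name][1]

    value = loader()
    cache[name] = (current, value)

    # Evict least recently used entries beyond the size limit.
    while len(cache) > max_entries:
        cache.popitem(last=False)
    return value


if __name__ == "__main__":
    session_state: dict = {}  # stand-in for st.session_state (assumption)
    for table in ["a", "b", "c", "d"]:
        cached_load(session_state, table, lambda t=table: [t] * 5, max_entries=2)
    print(list(session_state[CACHE_KEY]))  # names of the entries still held
```

The TTL is measured from load time rather than last access, which keeps the sketch short; st.cache_data's own ttl/max_entries options would replace all of this after the refactor.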

Any other solution or comment on the proposed ones would be really appreciated :pray:

cache_data would be the best option, as it reuses the cache across all users, unlike session_state, which creates a duplicate for each active session on the server.


Thank you, @SiddhantSadangi, for the valuable tip. That is indeed something we are already planning in our roadmap.

However, it is worth noting that the memory issue we are experiencing occurs with just one active session; it is not a matter of multiple concurrent users.


Hey @Pietro! I don’t have a good answer in the form of a fix for you yet, but I just wanted to acknowledge the issue. I will bring it up with the team. Do you have multiple apps where you experience this, or just this single one? Is it possible to share a minimal example that still reproduces the issue for you? And just to confirm: the same app worked fine without any issues in 1.32, but after the update to 1.37 you started to observe the issue, right?

Greetings @raethlein!
No, this is our only product based on Streamlit. The application is pretty complex and we cannot share details. However, we understand that being able to reproduce the problem would be extremely valuable for the analysis. We will try to set up a minimal example that reflects our codebase and reproduce the problem to share with you (might require some days).

Furthermore, I want to clarify our thoughts on the version update: our application changes and evolves quite often, and it could be just a coincidence that we started experiencing the problem around the time of the version update. Indeed, other tickets (listed above) report the same issue with versions prior to 1.32.

Thank you for the support. I will keep you posted in case other evidence comes up and will try to provide you with an example. Lastly, I want to reiterate that the issue is not related to multi-user usage and that we are still stuck with this problem in production.


Thanks for your support @Pietro! I hope together we can get to the bottom of this :slightly_smiling_face:
Since you reported that it works just fine on macOS, it sounds like it's not an app issue though, so I am curious to learn what the culprit is.


Hi @raethlein @SiddhantSadangi, we have reproduced the issue here: https://github.com/Tamarix-Technologies/streamlit-issue.

This simple application loads some random tables and stores them in the session state on page 1 and page 2. Page 3 displays a tab that lets you: A) upload more custom user data; B) see all the data in the cache for the user. Also, as soon as the user / user data name changes, the session state gets cleared.
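
In outline, the session-state handling described above could look like the sketch below. This is an illustration only, not the code from the repository: `switch_user`, `store_table`, `tables_for_user` and the key names are hypothetical, and a plain dict stands in for st.session_state.

```python
from typing import Any, Dict, MutableMapping

USER_KEY = "_active_user"   # hypothetical key remembering the current user
TABLE_PREFIX = "table:"     # hypothetical prefix for cached tables


def switch_user(state: MutableMapping[str, Any], user: str) -> bool:
    """Clear everything held for the previous user when the user changes.

    Returns True if the state was cleared, False if the user is unchanged.
    """
    if state.get(USER_KEY) == user:
        return False
    state.clear()
    state[USER_KEY] = user
    return True


def store_table(state: MutableMapping[str, Any], name: str, table: Any) -> None:
    """Keep a table in the per-session state (what pages 1 and 2 do)."""
    state[TABLE_PREFIX + name] = table


def tables_for_user(state: MutableMapping[str, Any]) -> Dict[str, Any]:
    """List everything cached for the current user (what page 3 shows)."""
    return {
        key[len(TABLE_PREFIX):]: value
        for key, value in state.items()
        if key.startswith(TABLE_PREFIX)
    }


if __name__ == "__main__":
    session_state: dict = {}  # stand-in for st.session_state (assumption)
    switch_user(session_state, "alice")
    store_table(session_state, "page1", [[0.1, 0.2], [0.3, 0.4]])
    print(sorted(tables_for_user(session_state)))
    switch_user(session_state, "bob")  # user changed: state is cleared
    print(sorted(tables_for_user(session_state)))
```

After the clear, nothing in the session state references the old tables any more, so the expectation is that their memory becomes reclaimable.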

The memory behaviour locally (macOS Sonoma 14.6.1, M1 Pro, 16 GB memory) looks quite normal (profiled using the Python library memory_profiler): memory usage goes up and down, and in particular drops close to 0 when the user is changed (session state is cleared):

However, when deployed on Heroku https://streamlit-issue-e50b58e3c566.herokuapp.com/ it keeps going up and, after an hour of inactivity, it still remains very high:
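
To compare the two environments with the same yardstick, the process can log its own resident set size around a fill-and-clear cycle. The following is a rough sketch using only the standard library rather than memory_profiler; `rss_mib` and `measure_clear` are hypothetical helper names, a plain dict stands in for the session state, and the `resource` module is Unix-only. Note the fallback outside Linux reports peak rather than current RSS.

```python
import gc
import resource
import sys
from pathlib import Path
from typing import Any, MutableMapping, Tuple


def rss_mib() -> float:
    """Resident set size of this process in MiB.

    On Linux this is the current RSS from /proc/self/status. Elsewhere it
    falls back to the peak RSS from getrusage, which never decreases.
    """
    status = Path("/proc/self/status")
    if status.exists():
        for line in status.read_text().splitlines():
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # value is in kB
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return peak / (1024 * 1024) if sys.platform == "darwin" else peak / 1024


def measure_clear(
    state: MutableMapping[str, Any], n_tables: int = 20, rows: int = 200_000
) -> Tuple[float, float, float]:
    """Fill `state` with dummy tables, clear it, and report RSS at each step."""
    before = rss_mib()
    for i in range(n_tables):
        state[f"table_{i}"] = [float(j) for j in range(rows)]
    filled = rss_mib()
    state.clear()
    gc.collect()  # make sure unreachable objects are collected before measuring
    after = rss_mib()
    return before, filled, after


if __name__ == "__main__":
    before, filled, after = measure_clear({})
    print(f"before={before:.1f} MiB filled={filled:.1f} MiB after={after:.1f} MiB")
```

Whether the last number drops back toward the first depends on the platform and allocator; the sketch only makes the numbers comparable between the local run and the Heroku dyno.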

This application resembles ours; hopefully this replica will help you understand what is going on.
Thank you for your support, please keep us posted.