Streamlit memory leak on Ubuntu?

Hi,

I have created a Streamlit app that I have been running both locally on my MacBook Pro and on a Linux server with Ubuntu 20.04.1 LTS. I noticed a clear difference: on the Ubuntu server, memory usage increases continuously over time while the app is used, whereas on my Mac it also increases during use but drops back down after a while. The only other difference is that the Ubuntu version runs on a GPU instead of the CPU. Could that be the issue?
Unfortunately, I cannot share the source code, but I use Detectron2, which is built on PyTorch, for object detection.

EDIT: I just realised that I should also check the behaviour when using only the CPU on the Linux server, to see whether the GPU is the problem, but the problem remains even when the Linux server uses only the CPU. I use Streamlit 0.71.0.
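
To show what I mean by "increasing continuously", a small snippet like this can log the growth per rerun (a minimal sketch, assuming psutil is installed; run_prediction is a hypothetical stand-in for the actual Detectron2 call):

```python
import os

import psutil
import streamlit as st

def run_prediction():
    """Hypothetical stand-in for the actual Detectron2 prediction call."""
    ...

run_prediction()

# Log this process's resident set size after each rerun; if it climbs on
# every interaction and never drops, something is accumulating.
rss_mb = psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2
st.write(f"Process RSS: {rss_mb:.1f} MB")
```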

Best regards
Tomas

Hello @tomas, welcome to the forum!

Is there a chance that the memory leak comes from Detectron2, or at least from the way it is used inside Streamlit?

One way to check whether Streamlit is to blame would be to create a small Python script that does the following:

  1. At the beginning of the script, prepare as many predictions as you ran in Streamlit to trigger the memory leak
  2. Run them in a for loop
  3. Add a long time.sleep at the end, and check whether the memory is still held

The goal here is to reproduce Streamlit's behavior, that is, to run multiple predictions in a single Python process.

If you notice a memory leak with such a Python script, some resources are probably not being freed; Detectron2's GitHub issues may already contain reports of similar leaks.
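
For concreteness, a minimal sketch of such a script (the model-zoo config, image path, and iteration count are placeholders; substitute whatever your app actually does):

```python
import time

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Example model setup; replace with the config/weights your app uses.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.DEVICE = "cpu"  # match the CPU-only test you described
predictor = DefaultPredictor(cfg)

image = cv2.imread("test.jpg")  # any image you used in the Streamlit app

# Run roughly as many predictions as it took to trigger the leak in Streamlit.
for _ in range(100):
    outputs = predictor(image)

# Keep the process alive so you can watch its memory from outside
# (e.g. with top/htop) and see whether it is ever released.
time.sleep(600)
```

If memory keeps growing across the loop iterations and stays high during the sleep, the leak is in the prediction pipeline itself rather than in Streamlit.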

I came across this post from @tim (Streamlit Team):

The situation there is somewhat similar to yours, and I suppose his reply applies to your case as well.

After ruling out Detectron2 as the cause, I have found the reason for the huge memory usage. Because of earlier warnings that I was mutating cached data, I made a deep copy of my data (including images), as the warning recommended. Every time the script ran, a new copy of the cached data was created. When I removed the deep copying, the huge memory usage stopped. However, I still wonder why the deep copies are kept in memory after each execution, since as far as I can tell no reference to them is kept in the code.
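
To make the pattern concrete, here is a minimal sketch of what I mean (load_data and the array sizes are hypothetical stand-ins for my real cached loader, on Streamlit 0.71.0):

```python
import copy

import numpy as np
import streamlit as st

# Hypothetical stand-in for the real cached loader.
@st.cache
def load_data():
    return [np.zeros((1080, 1920, 3), dtype=np.uint8) for _ in range(10)]

data = load_data()

# What I had: a deepcopy on every rerun, added to silence the
# "mutated cached data" warning. Each rerun allocates tens of MB of
# fresh copies here, and memory usage kept growing.
data = copy.deepcopy(data)

# What stopped the growth: removing the deepcopy. An alternative, if the
# cached data really must be mutated, is
#
#   @st.cache(allow_output_mutation=True)
#
# which tells Streamlit not to hash and compare the output, and
# suppresses the mutation warning.
```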