I was doing some memory profiling on an application of mine that uses caching, and I realised the memory was never being released.
So I dug into it and put together a simple project to run some tests.
import streamlit as st
import time
import pandas as pd
import numpy as np
from streamlit import caching
import gc

@st.cache(allow_output_mutation=True, ttl=30)
def expensive_computation(a, b):
    time.sleep(5)
    df = pd.DataFrame(np.random.randn(a, b))
    return df

def main():
    if st.button("Clear Cache"):
        caching.clear_cache()
    a = 10000
    b = 20
    res = expensive_computation(a, b)
    st.dataframe(res)
    del res

if __name__ == "__main__":
    gc.enable()
    main()
    gc.collect()
At first I was only using the cache decorator with the ttl option. But when I did some memory profiling, the memory didn't go down after the 30 seconds had elapsed. Each time I hit Refresh, the memory kept increasing.
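My understanding (an assumption on my part, not something I verified in Streamlit's source) is that TTL caches often evict lazily: an expired entry is only dropped the next time it is looked up, so memory does not shrink the moment the ttl elapses. A toy model of that behaviour:

```python
import time

# Toy model of a lazily-evicting TTL cache (NOT Streamlit's actual
# implementation): an expired entry stays in memory until the next
# lookup for its key triggers a recompute that overwrites it.
class LazyTTLCache:
    def __init__(self, ttl):
        self.ttl = ttl
        self._store = {}  # key -> (timestamp, value)

    def get(self, key, compute):
        now = time.time()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        # Expired or missing: recompute and overwrite in place.
        value = compute()
        self._store[key] = (now, value)
        return value

    def live_entries(self):
        # Counts everything still held, including expired entries.
        return len(self._store)
```

If Streamlit's cache works this way, the 30-second ttl alone would never shrink memory while nobody is requesting the key.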
So I tried caching.clear_cache(), but memory wasn't decreasing either.
Same with the garbage collector.
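To convince myself of what del plus gc.collect() can and cannot do, I wrote a small demonstration (plain Python, nothing Streamlit-specific): the garbage collector cannot free an object while another reference, such as a cache, still points to it.

```python
import gc
import weakref

# A cache that holds a reference to our object, like st.cache would.
cache = {}

class Payload:
    pass

obj = Payload()
cache["key"] = obj
ref = weakref.ref(obj)  # lets us observe whether the object is alive

del obj          # our local name is gone...
gc.collect()
assert ref() is not None  # ...but the cache still keeps it alive

cache.clear()    # drop the cache's reference
gc.collect()
assert ref() is None      # now the object is actually reclaimed
```

So del res in my app can only ever release the memory if the cache itself has also let go of the DataFrame.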
And even with caching removed entirely, the memory kept increasing each time I hit refresh.
Is this normal behavior? I was expecting that when the user refreshes the page, the session is killed and the variables are released, either by Python or by Streamlit, but it seems to keep them in memory. I was hoping it was only memory allocation that could be reused by the program, but it seems not.
Since I deploy my apps with Docker and very limited memory, the container often crashes, even with a simple app like the one above, just from spamming F5.
Here are some profiling runs. The first one uses no caching at all, just the raw function and display.
As you can see, the memory goes up each time I hit refresh (F5).
Here I used the code above: caching, del on my data, and the garbage collector. I even pressed the button to clear the cache at approximately 80 seconds.
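For anyone who wants to reproduce the measurements without a full profiler, here is a rough sketch using the standard library's tracemalloc (a plain list stands in for the DataFrame, since the exact numbers are not the point):

```python
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Stand-in for the 10000 x 20 DataFrame the app builds on each run.
data = [0.0] * (10000 * 20)

after_alloc = tracemalloc.take_snapshot()
grown = sum(s.size_diff for s in after_alloc.compare_to(baseline, "lineno"))
print(f"retained after allocation: {grown / 1024:.0f} KiB")

# Releasing the only reference lets Python reclaim the memory...
del data
after_del = tracemalloc.take_snapshot()
left = sum(s.size_diff for s in after_del.compare_to(baseline, "lineno"))
print(f"retained after del: {left / 1024:.0f} KiB")
# ...which is what I expected to see in the app, but did not.
```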
Is there something I'm doing wrong or missing?