Run when idle

Is there a way to annotate the code to run when the page has loaded?

e.g.

import streamlit as st

@st.cache()
@st.run_when_idle()
def load_data(...):
    df = ...
    return df

data = load_data() # not executed yet

st.title('Hello')

st.markdown("# Introduction: ...")
st.write(data) # waits until the rest of the page loads
st.markdown("# Conclusion: ...")

Hello @mkleinbort

I’m interested in this idea, but I want to be sure of what you’re trying to accomplish with the given annotation…
Do you want the st.markdown("# Conclusion: ...") to be displayed first, while load_data runs in the background, and then, once loading is over, have the dataframe displayed above the # Conclusion part?

For that, you could use the placeholder widget:

import pandas as pd
import streamlit as st
import time


def load_data():
    time.sleep(5) # simulate a 5 seconds download
    return pd.read_csv(
        "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
    )


st.markdown("# Introduction: ...")
container_for_dataframe = st.empty()  # placeholder for your displayed dataframe for later
container_for_dataframe.warning("Loading data") # in the meantime display a warning
st.markdown("# Conclusion: ...")

df = load_data()
container_for_dataframe.dataframe(df) # now data is loaded, overwrite placeholder


If I’ve misunderstood your problem, don’t hesitate to correct me :slight_smile:

Happy Streamlitin’
Fanilo :balloon:

I think placeholders probably work for the example I gave, but I didn’t express myself well.

I’m building an interactive report and some fairly large objects need to be loaded to offer full functionality.

To illustrate it better:

import streamlit as st

@st.cache
def load_model(model_id: str) -> Model: # <- this takes ~3 sec to run
    # say something slow happens, like training the model or calculating shap values
    return model

model_ids = ['A', 'B', 'C', ...., 'Z']
model_objects = {}

current_model_id = st.selectbox('What model do you want to use?', options=model_ids)

current_model = model_objects.get(current_model_id, load_model(current_model_id))

# some stuff that lets users use the model.

So, my situation is:

[1] Loading all the objects would make my app unresponsive for ~90 seconds.

i.e. doing
model_objects = {model_id: load_model(model_id) for model_id in model_ids}

would be slow (but not a memory problem)

[2] Loading a particular object takes ~3 seconds, so it’d be good to be able to do it ahead of time, “when nothing else is happening”.

Is this niche? Yes, probably, and caching means it only needs to happen when I restart the app/clear the cache.

Ok, just writing some thoughts here without having tested anything :sweat_smile:

  • If your “load_model” is mostly IO bound, I think that’s a good fit for asyncio (which then makes it more of a Python problem than a Streamlit problem :)). Using a coroutine would load the model in the background when nothing else is happening.
  • Now, I don’t know about caching an async method, though; there is a draft of something here: Caching asyncio functions, but I personally haven’t had time to play with it. There are other posts about asyncio (and eventually threading) in Streamlit if you want to check the search bar.
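If load_model is not purely IO bound, a plain background thread that warms a cache while the script stays responsive is another option. Below is a minimal, Streamlit-free sketch of that idea; load_model, the sleep, and the model ids are all stand-ins, and a real Streamlit app would still need st.cache or session state on top of this to survive reruns:

```python
import threading
import time


# Hypothetical stand-in for a slow model load (~3 s in the real app).
def load_model(model_id: str) -> str:
    time.sleep(0.1)  # simulate slow work
    return f"model-{model_id}"


model_ids = ["A", "B", "C"]
model_cache: dict[str, str] = {}
cache_lock = threading.Lock()


def preload_all():
    # Runs in a background thread so the main script stays responsive.
    for model_id in model_ids:
        model = load_model(model_id)
        with cache_lock:
            model_cache.setdefault(model_id, model)


# Kick off preloading in the background at script start.
threading.Thread(target=preload_all, daemon=True).start()


def get_model(model_id: str) -> str:
    # Serve from the cache if the preloader got there first; otherwise load now.
    with cache_lock:
        cached = model_cache.get(model_id)
    return cached if cached is not None else load_model(model_id)
```

The selectbox callback would then call get_model(current_model_id): the first few selections may still pay the load cost, but once the preloader has finished, every model is served instantly.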

I’ll try to think about this more later on; just wanted to share something that might get you on a new track :wink:

Hi @andfanilo @mkleinbort

The ttl argument for st.cache is what you need.

  • ttl ( float or None ) – The maximum number of seconds to keep an entry in the cache, or None if cache entries should not expire. The default is None.
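To make the ttl semantics concrete, here is a minimal, Streamlit-free sketch of what time-based expiry does; st.cache handles all of this internally when you pass ttl=..., so the helper below is only an illustration, not Streamlit API:

```python
import time

# Maps a cache key to (timestamp, value).
_cache: dict[str, tuple[float, object]] = {}


def cached_load(key: str, ttl: float, loader):
    # Return the cached value unless the entry is older than ttl seconds,
    # in which case call loader() again and refresh the entry.
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and now - entry[0] < ttl:
        return entry[1]
    value = loader()
    _cache[key] = (now, value)
    return value
```

With ttl=None (the st.cache default) entries never expire; a finite ttl trades staleness for freshness by re-running the decorated function once the entry ages out.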