Streamlit data apps can often take a while to load on a “cold start”, i.e. when the TTL for your @st.cache_data functions has expired and the next user has to wait through a refresh cycle. The proper way to solve this problem would probably be to spend the time building a hot-store database that gets refreshed by some orchestrator and that Streamlit then reads from… however, if you are just making a quick Streamlit page, this hacky solution may just help you!
Method: Use a headless browser via Selenium on a schedule to visit your Streamlit app(s) and trigger @st.cache_data refreshes.
Py script:
from selenium import webdriver
import time
# Globals
PORT = 8501 # Default port for your streamlit app
# Configs
options = webdriver.FirefoxOptions()
options.add_argument('--headless') # Run in headless mode
def refresh_page(url, duration_buffer):
    """
    url (str): path to Streamlit app e.g. "http://localhost:{PORT}/" or "http://localhost:{PORT}/my_specific_page"
    duration_buffer (int): The number of seconds the headless browser will wait on the page for the refresh. This should be greater than the expected time for the cache to load.
    """
    with webdriver.Firefox(options=options) as driver:
        # Open URL
        driver.get(url)
        # Give the page time to load and the cache time to refresh
        time.sleep(duration_buffer)
# Example usage for 1 page:
url = f"http://localhost:{PORT}/my_st_page_with_a_cache"
duration_buffer=30 # seconds
print(f'refreshing page {url} ... @ {duration_buffer} seconds', flush=True)
refresh_page(url, duration_buffer)
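If you have several pages to warm up, the single-page example extends naturally to a loop. Below is a minimal sketch, not part of the original script: `build_page_urls` and `refresh_all` are hypothetical helper names, the page names are placeholders, and the `refresh` argument is assumed to be the `refresh_page` function from the script. Errors are caught per page so that one broken page does not stop the others from being refreshed.

```python
from typing import Callable, Iterable, List

PORT = 8501  # Default Streamlit port, same value as in the script


def build_page_urls(pages: Iterable[str], port: int = PORT, host: str = "localhost") -> List[str]:
    """Build one URL per page name; an empty name means the app's main page."""
    return [f"http://{host}:{port}/{page.lstrip('/')}" for page in pages]


def refresh_all(urls: Iterable[str], refresh: Callable[[str, int], None], duration_buffer: int) -> List[str]:
    """Call refresh(url, duration_buffer) for each URL and return the URLs that failed."""
    failed = []
    for url in urls:
        print(f"refreshing page {url} ... @ {duration_buffer} seconds", flush=True)
        try:
            refresh(url, duration_buffer)
        except Exception as exc:  # keep going so one broken page does not block the rest
            print(f"failed to refresh {url}: {exc}", flush=True)
            failed.append(url)
    return failed
```

Usage would then be something like `refresh_all(build_page_urls(["", "my_st_page_with_a_cache"]), refresh_page, 30)`. Pages are visited one after another, so the total run time is roughly the number of pages times `duration_buffer`.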
Then kick off the Python script with a crontab-scheduled bash command on the same machine that the Streamlit app is running on.
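To make the script easier to call from cron, it can take the URL and wait time as command-line arguments. The sketch below is an assumption about how you might wire that up, not something from the original script: `parse_args`, the `--url` and `--buffer` flags, the 30-minute interval, and the file paths in the crontab comment are all hypothetical.

```python
import argparse
from typing import List, Optional

# A crontab entry (edit with `crontab -e`) that runs the script every 30 minutes
# might look like this; the interpreter and script paths are placeholders:
#
#   */30 * * * * /usr/bin/python3 /path/to/refresh_cache.py --url http://localhost:8501/ --buffer 30


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the URL to visit and how many seconds to wait on the page."""
    parser = argparse.ArgumentParser(description="Visit a Streamlit page to warm its cache.")
    parser.add_argument("--url", default="http://localhost:8501/", help="Streamlit page to visit")
    parser.add_argument("--buffer", type=int, default=30, help="seconds to wait on the page")
    return parser.parse_args(argv)


# In the script itself you would then call, for example:
#   args = parse_args()
#   refresh_page(args.url, args.buffer)
```

You would probably want the cron interval to line up with the TTL on your cached functions, so that the scheduled visit is the one that pays for the reload rather than a real user.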
Note: If your Streamlit app is pulling a lot of data into memory, I recommend using st.cache_data(persist="disk").