Combine st.cache_data with st.session_state

Hi,
I want to cache a function without passing the args every time; instead, they would be read from session_state. Is there a native way to do this?

current behavior

@st.cache_data
def get_organization(user):
    return "organization from the user"

# write these 2 lines every time in the app
user = st.session_state['user']
organization = get_organization(user)

desired behavior

@st.cache_data_state
def get_organization(user):
    return "organization from the user"

# the user is inherited from session_state automatically
organization = get_organization()

proposal
Something like the code below would do, but I'm wondering if there's already something native.

def cache_data_kwargs(state_keys: list, **cache_options):
    def decorator(function):
        cached_function = st.cache_data(**cache_options)(function)

        def wrapped_function(**function_args):
            state_values = {key: st.session_state[key] for key in state_keys}
            state_values.update(function_args)
            return cached_function(**state_values)

        return wrapped_function

    return decorator


@cache_data_kwargs(['user'])
def get_organization(user):
    return "organization from the user"

organization = get_organization()
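
The same idea can be tried outside Streamlit. This is a minimal sketch, not a native Streamlit feature: `cache_from_state` and `fake_state` are hypothetical names, a plain dict stands in for `st.session_state`, and `functools.lru_cache` stands in for `st.cache_data`. The point it illustrates is that the state is read at call time, so a changed value produces a new cache key.

```python
from functools import lru_cache, wraps
from typing import Any, Callable, Mapping, Sequence


def cache_from_state(
    state: Mapping[str, Any],
    state_keys: Sequence[str],
    cache: Callable[[Callable], Callable] = lru_cache(maxsize=None),
):
    """Hypothetical helper: fill keyword arguments from `state` at call time."""

    def decorator(function):
        cached_function = cache(function)

        @wraps(function)
        def wrapped_function(**function_args):
            # Read the state on every call, so later changes are picked up.
            state_values = {key: state[key] for key in state_keys}
            state_values.update(function_args)  # explicit arguments win
            return cached_function(**state_values)

        return wrapped_function

    return decorator


# A plain dict stands in for st.session_state in this sketch.
fake_state = {"user": "alice"}
calls = []


@cache_from_state(fake_state, ["user"])
def get_organization(user):
    calls.append(user)  # record real executions to make cache hits visible
    return f"organization of {user}"
```

In an app, one would presumably pass `st.session_state` as `state` and `st.cache_data` (or `st.cache_data(ttl=...)`) as `cache`; that wiring is an assumption here and is not exercised by the sketch.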

Session state is a cache of sorts (session-specific), so I'm not sure what you intend by trying to "add caching to a cache". I think you should revisit this requirement in your design and ask yourself why you need it and whether the implementation is appropriate. Session state is intended to be mutated in different contexts throughout your application, whereas data-cached functions are intended to encapsulate state that isn't. Tying the two concepts together is a code smell, in my opinion.

Consider an app on a server that can be accessed by multiple users.

cache_data is shared across all users, while session_state is specific to a session (a different state in each tab).

What I want is to cache certain functions, not globally but per user.
Following the example above, I don't want to compute "get_organization" every single time (imagine it needs to access a slow database and is called several times in the app).
Nor do I want to return the same organization for all users (that's a data leak).

So, to get this, what I need to do is:

email=st.session_state['email']
get_organization(email)

My request is to see if there's a less verbose way to do it (in a single line, instead of several lines every time).

Here I can (maybe) see the "explicit is better than implicit" problem, but not the code smell. Maybe I'm missing something?

Why doesn't the simple current behavior work then? Just wrap the two lines in an auxiliary function call.

def get_organization():
  # 24 hour lifetime
  @st.cache_data(ttl=86400)
  def _get_organization(user):
      return "organization from the user"
  user = st.session_state['user']
  organization = _get_organization(user)
  return organization

organization = get_organization()

It does work.

My question is whether there's a native decorator (like cache_data_kwargs above)
to do this (or whether it's planned),
so you don't need to write a func for each _func.

Perhaps this will work, because the func params are already a kwargs dict:

@st.cache_data()
def get_organization(email: str = st.session_state["user"]):
    assert email, "user email must be in session state"
    return "organization from the user"

organization = get_organization()

It doesn't seem so.

Script

import streamlit as st

st.session_state["x"] = 'a'

@st.cache_data
def ff(x: str = st.session_state['x']):
    print("run")
    return x

st.write(ff())
st.session_state["x"] = 'b'
st.write(ff())

returns

a
a

instead of (a,b)
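
The (a, a) result is consistent with how Python handles default arguments: the default expression is evaluated once, when the `def` statement runs, not on each call. A minimal sketch without Streamlit, assuming a plain dict in place of `st.session_state` and `functools.lru_cache` in place of `st.cache_data`:

```python
from functools import lru_cache

state = {"x": "a"}  # plain dict standing in for st.session_state


@lru_cache(maxsize=None)  # stand-in for st.cache_data
def ff(x: str = state["x"]):
    # The default was evaluated once, when `def` ran, and is fixed at "a".
    return x


first = ff()
state["x"] = "b"
second = ff()              # still uses the default captured at definition time
explicit = ff(state["x"])  # passing the value explicitly picks up the change
```

The same would happen with no caching decorator at all, so the cache itself is not what freezes the value; the function simply never sees the updated state unless it is passed in or looked up inside a wrapper at call time.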

I tried the hash_funcs param on cache_data and that didn't work, so apparently it's not possible to have params in a cached function and then expect to be able to call it without any. The cache_data decorator has no new information to decide that its cache should be busted.

from functools import cache

@cache
def ff(x:str=st.session_state['x']):
     return x is None

How about this?

P.S.: I'm not sure whether this would work for multiple users, as Streamlit uses multiple threads.
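
On the multi-user question: `functools.cache` keeps one cache per process, shared by every thread, and keys entries by the call arguments. A rough sketch of what that implies, assuming each session passes its user explicitly; the thread pool below only simulates concurrent sessions and is not Streamlit's actual session handling, and `session` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import cache


@cache
def get_organization(user: str) -> str:
    # Placeholder for a slow lookup; the result depends only on `user`.
    return f"organization of {user}"


def session(user: str) -> str:
    # Each simulated session passes its own user explicitly.
    return get_organization(user)


users = ["alice", "bob", "alice", "carol"]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(session, users))
```

Because the user is part of the cache key, each user gets their own entry even though the cache is shared. The default-argument form shown in the post would still have its default fixed at definition time, and `functools.cache` offers no TTL or per-entry expiry.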