My original problem was posted in the previous thread about the new cache primitives,
though it has since been solved by using st.session_state instead.
In my original case, my intention was to create a session-specific cache, and session state is exactly what that is for.
So I currently don’t have any specific problems, but I would like to say that I’m concerned there may be edge cases where users want to memoize unpicklable objects.
I also recommend stating in the migration guide that st.session_state can be an alternative to st.cache in some specific cases.
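A minimal sketch of the session-specific cache described above. The helper name `get_or_create` is hypothetical, and a plain dict stands in for one session's state so the snippet runs outside Streamlit. In an app, `st.session_state` (which is dict-like) would be passed instead. A `threading.Lock` is used as the cached object because it cannot be pickled.

```python
from collections.abc import MutableMapping
from threading import Lock
from typing import Any, Callable


def get_or_create(state: MutableMapping, key: str, factory: Callable[[], Any]) -> Any:
    """Return state[key], building it with factory() only on first access."""
    if key not in state:
        state[key] = factory()
    return state[key]


# In a Streamlit script this would be called with st.session_state, e.g.
#   lock = get_or_create(st.session_state, "lock", Lock)
# Here a plain dict stands in for one session's state (an assumption of this sketch).
session = {}
lock_a = get_or_create(session, "lock", Lock)
lock_b = get_or_create(session, "lock", Lock)  # second access reuses the stored object
```

Because the object is stored as-is and never serialized, this pattern also covers objects that a pickling cache would reject.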
I have found singleton and memo much more intuitive to use. I also like them because our team has some general library/package code that helps users create and use common resources such as database connections (which makes use of singleton), so users only need to worry about the simpler caching via memo in their dashboards.
One thing the docs could make even clearer is the level at which caching happens. Comments like the one in Expiring the experimental_singleton cache at regular intervals - #6 by ksdaftari could state more explicitly that singleton and memo are global caches (not specific to a user, as session_state is), unless I am misunderstanding the docs here. I say this because I now have to train some of the people making dashboards to be careful about what they cache, especially when the output could differ between users, e.g. when hitting an API whose authorization differs depending on the user.
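To illustrate the risk described above, here is a small sketch that uses `functools.lru_cache` as a stand-in for any process-wide memo cache. It is not Streamlit's implementation, and the `PERMISSIONS` table, `current_user` variable, and function names are all hypothetical. The point is only that a global cache returns one user's result to another user unless the user is part of the cache key.

```python
from functools import lru_cache

# Hypothetical permission table standing in for an API with per-user authorization.
PERMISSIONS = {"alice": ["sales", "finance"], "bob": ["sales"]}
current_user = "alice"  # stands in for "whoever is running the script right now"


@lru_cache(maxsize=None)  # process-wide cache, shared by every caller
def visible_reports_unsafe() -> tuple:
    # The result depends on current_user, but the cache key does not.
    return tuple(PERMISSIONS[current_user])


@lru_cache(maxsize=None)
def visible_reports(user: str) -> tuple:
    # The user is part of the cache key, so each user gets their own entry.
    return tuple(PERMISSIONS[user])


first = visible_reports_unsafe()   # computed while alice is the current user
current_user = "bob"
leaked = visible_reports_unsafe()  # cache hit: bob receives alice's result
safe = visible_reports("bob")      # keyed by user, so bob gets his own result
```

Passing whatever distinguishes users (a user ID, a token, a role) as an argument of the cached function is one way to keep a global cache from mixing results.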
Do you like st.experimental_memo and st.experimental_singleton?
Yes. Personally, I only use the memoization part of st.cache, so having the memoization and the singleton-pattern parts in separate functions is very nice, and it’s way faster. I also love that one can finally clear the cache in code.
What do you dislike?
nothing so far
Are there any killer features in st.cache that your apps can’t live without?
Do you understand when to use memo and when to use singleton or is this confusing?
I understood it after reading the documentation once, to me it was clear, and I started using streamlit a month ago!
I hope that st.cache gets deprecated, I still don’t understand why st.cache is still accessible in the current version if there is no way to clear it in the code.
I think this is a very good practice, but we need to decide carefully whether to discard st.cache, because frequent changes to mainstream functions will make people lose trust, and the cost of maintenance will increase accordingly. Thank you.
Use case: I have many data gathering functions, and when I clear caches I don’t need ALL of them refreshed.
I am using memo_decorated_function.clear() which seems to work for that specific function’s cache alone. Is that correct?
If not, could we have a unique key for each cache decorator, which would then allow each cache to be cleared individually? The key name could default to the name of the decorated function, or be overridden with a key parameter.
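The behaviour being asked about can be sketched with a tiny stand-in memoizer. This is not Streamlit's code; the `memo` decorator and the two loader functions are hypothetical and only show what "each decorated function owns its own cache, and `.clear()` empties that one alone" looks like.

```python
import functools


def memo(func):
    """Tiny memoizer whose cache belongs to the decorated function alone."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.clear = cache.clear  # clears only this function's entries
    return wrapper


calls = {"prices": 0, "weather": 0}  # counts how often each loader really runs


@memo
def load_prices(symbol):
    calls["prices"] += 1
    return f"prices for {symbol}"


@memo
def load_weather(city):
    calls["weather"] += 1
    return f"weather for {city}"


load_prices("ACME")
load_weather("Oslo")
load_prices.clear()    # refresh just one data source
load_prices("ACME")    # recomputed
load_weather("Oslo")   # still served from its own cache
```

With this design no separate key parameter is needed, because the decorated function object itself identifies the cache.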
Functionally, the current code solves the most important cases, but with some pain. A function init(str, str) is marked as singleton in my app, but Streamlit ran it 6 times before crashing due to OOM. Two tabs were open and running init(). The app looked frozen, so I refreshed. init() finished and loaded the page. Opening a new tab called init() again. You get the idea. I look forward to the next version. Thank you all.
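For reference, the behaviour expected in the report above (one initialization, with concurrent callers waiting for it) can be sketched with a lock-guarded lookup. This says nothing about how Streamlit implements singleton; `get_singleton`, the `_instances` dict, and the stand-in `init` are assumptions made for illustration.

```python
import threading
import time

_instances = {}
_lock = threading.Lock()
init_calls = 0


def init(name: str, mode: str) -> dict:
    """Stand-in for an expensive initializer; counts how often it really runs."""
    global init_calls
    init_calls += 1
    time.sleep(0.05)  # simulate slow loading
    return {"name": name, "mode": mode}


def get_singleton(name: str, mode: str) -> dict:
    key = (name, mode)
    # Holding the lock during creation makes concurrent callers wait
    # for the first initialization instead of starting their own.
    with _lock:
        if key not in _instances:
            _instances[key] = init(name, mode)
        return _instances[key]


# Six concurrent callers, similar to several tabs or refreshes hitting the app at once.
results = []
threads = [
    threading.Thread(target=lambda: results.append(get_singleton("model", "prod")))
    for _ in range(6)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A single lock is the simplest choice here. Its cost is that unrelated keys also wait on each other while one of them initializes.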
I’d forgotten all about st.cache. However, its name was far better, and the organization is now confusing. I recommend asking users about the naming going forward: st.cache, st.memo, st.singleton, st.session_state. My concern is that users will struggle to remember the nuances, and the names don’t make clear what each one does.
The challenge you will run into is that weak programmers will want to use Streamlit. I don’t mean kids. Those who understand state machines, multi-processing, and multi-tenancy will breeze through the docs, but imagine the data analysts, business intelligence, marketing, and sales departments of a company. They have more math skills than architecture-design skills. You will have to educate them in if-then terms to cover the basics.
This suggestion is also me asking you for help. Over the last 2 years, I have presented many demos and prototypes using Streamlit. People always want to run them locally. For every 20 technical people, there will always be someone asking for help.
The suggestion is to remove the interface for memo and singleton. Leave two interfaces to the same backend variables: a wrapper, and a normal function or dict-like object. In both cases, have variables that indicate the rules (e.g. the hashable key = tuple(global/local, user UUID, func name, args), enable serialize result, compress serialized result, FIFO length, is global or local key, max RAM, max disk, purge or raise error on OOM, expiration timestamp, …). With the rules explicitly set, validation should become easy. You could then give better error messages for each scenario. Then no one needs to know how it works. In my proposal, users simply communicate their expectations through variables, and your code infers the best way to get there. Future versions would keep the same interface while the backend gets upgraded. Raise an exception if the request doesn’t make sense.
TLDR: Suggest changing API to
st.cache(..., serialize_result=True, compression='DEFLATE', queue='FIFO', history_length='inf',
is_local=False, max_ram='inf', min_free_ram='1g', max_disk='1g', raise_on_oom=False,
ts_expiration=time.now()+'1d', can_purge=True, verbose=2) and st.session/st.globals as dict-like
Delete all the others marked as experimental.
I really like this interface, but the functionality was the problem. You isolated the problems into modules. I look forward to seeing all of these unified.
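The "explicit rules make validation easy" idea from the proposal above could look roughly like the sketch below. It is not an existing Streamlit API. The class name `CacheRules` is hypothetical, the field names are borrowed from the proposed signature, and both validation rules are assumptions chosen only to show the kind of error message that explicit rules make possible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CacheRules:
    """Hypothetical rule set; field names follow the proposal above."""
    serialize_result: bool = True
    compression: Optional[str] = "DEFLATE"
    is_local: bool = False
    raise_on_oom: bool = False
    can_purge: bool = True

    def __post_init__(self):
        # Assumed rule: compression operates on serialized bytes.
        if self.compression is not None and not self.serialize_result:
            raise ValueError("compression requires serialize_result=True")
        # Assumed rule: on memory pressure the cache must either purge or raise.
        if not self.raise_on_oom and not self.can_purge:
            raise ValueError("set raise_on_oom=True or can_purge=True")


defaults = CacheRules()  # the proposed defaults are consistent with each other

try:
    CacheRules(serialize_result=False)  # compression is still "DEFLATE"
    error = None
except ValueError as exc:
    error = str(exc)  # a scenario-specific message instead of a failure later on
```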
Thanks for the feedback everyone! Our main takeaway from here and other talks with users was:
Splitting caching into two separate decorators is the right way to go!
The names memo and singleton are too difficult to understand for a lot of users.
So the solution we’re now leaning towards is:
Rename st.experimental_memo to st.cache_data. This command should be used to cache any data objects, e.g. pandas dataframes, numpy arrays, str/int/float, or lists and dicts containing such data objects. Example use cases are dataframe transformations, API queries, ML inference, etc. Behavior will stay the same as for st.experimental_memo, i.e. you always get a fresh copy of the return object at every rerun. This is also the default command you should use in 90% of all cases.
Rename st.experimental_singleton to st.cache_resource. This command should be used to cache any global resources that will be shared across all reruns and sessions, e.g. database connections or ML models, for example if you’re initializing a connection or loading an ML model from disk. We’re also working on a more specific st.connection command, which will allow you to connect to databases in a single line of code and should abstract away caching and similar details (see our roadmap blog post). We’re also considering whether we can do something similar for initializing ML models (e.g. an st.model; comment if you have ideas!). In the long run, we see st.cache_resource as an advanced command that most users won’t need to touch.
Do a much better job in the docs to explain these two commands, what their differences are, and in which situation you should use what.
We are now implementing the new commands and a few other adjustments. We want to release them in December/January and will start the deprecation of st.cache then. We’re doing the deprecation in a very very careful way! Specifically, we won’t remove st.cache at least until 2.0 to prevent breakage and we’ll give a lot of guidance (both in the app and in the docs) on how to move over to the new commands. In most situations, it should just be a small name change and you’re good to go.
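The difference between the two commands described above (a fresh copy on every call versus one shared object) can be sketched in plain Python. The decorators `cache_data_like` and `cache_resource_like` are hypothetical stand-ins, not the Streamlit implementations, and using pickle as the copying mechanism is an assumption of this sketch.

```python
import functools
import pickle


def cache_data_like(func):
    """Store results as pickled bytes and hand back a fresh copy on every call."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = pickle.dumps(func(*args))
        return pickle.loads(store[args])

    return wrapper


def cache_resource_like(func):
    """Store the object itself and hand the same instance to every caller."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = func(*args)
        return store[args]

    return wrapper


@cache_data_like
def load_rows():
    return [1, 2, 3]


@cache_resource_like
def get_connection():
    return {"open": True}  # stand-in for a database connection


rows = load_rows()
rows.append(4)             # mutating the copy does not touch the cached value
fresh_rows = load_rows()   # a new copy of the original data
same_connection = get_connection() is get_connection()
```

In this sketch the data-style cache only works for return values that can be pickled, while the resource-style cache accepts any object but shares every mutation with all callers.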
“... will let you connect to external databases and APIs with a single line of code. st.database will launch a small database alongside every Streamlit app, so you can permanently store data without any setup.”
Please consider using duckdb or a redis cache for Streamlit’s small database. It would be helpful in many other scenarios as well.
A friendly thought/suggestion…
I’ve seen a few users looking for a user-specific cache (across sessions) and a session-specific cache. They’re resorting to hacks to accomplish this, and if they’re building multitenant apps, they risk leaking data, which is quite unsafe.