Keyed widget state persistence - discussion, possible fixes

Hi all!

TL;DR: Take a look at the example app and let us know how it impacts you, and what you think about how to fix this issue!

Discussion on GitHub here: #6074

One of the current Streamlit behaviors that causes confusion for many of us is how widgets with key='foo' lose their session_state value when they are not rendered in a given run. It seems to come up most often with multipage apps that have some state dependency between pages, but it can also happen in a single-page app with conditional rendering. A minimal example:

view = st.radio("View", ["view1", "view2"])

if view == "view1":
    st.text_input("Text1", key="text1")
    "☝️ Enter some text, then click on view2 above"
elif view == "view2":
    "☝️ Now go back to view1 and see if your text is still there"

I’m working with @tim on whether / how to update this behavior so it’s more intuitive, or at least possible to get the intuitive behavior easily.

I put together a simple example app showing the behavior and linking some of the prior issues about it:
https://keyed-widget-state-issue.streamlit.app/

Questions

  • Do you have a full working public app where you had to hack around this (such as duplicating or using “shadow keys”)? Please share with us!
  • Do you have an app or use case where you rely on the current behavior (widget has key set but needs value to reset after not rendering), that will break if we change this? We really want to hear about these! Please share!

Possible fixes

It seems like most developers we hear from find the current behavior very unintuitive, and we see a lot more use cases that have to work around this rather than benefit from it.

  • Given that, should we just do a breaking change to the default behavior to persist, like most of us seem to expect?
  • If so: We should still provide a path to clear keyed widget state when needed. Need to figure out how to best do this.

If not a breaking change: most proposals seem to prefer something like a persist= key. For example, both @whitphx and @mathcatsand propose something like this in #4458. Thoughts on this approach, or other ideas?
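
For reference, the "shadow key" workaround mentioned in the questions above can be sketched roughly like this. It is only an illustration: save_widget_value, restore_widget_value, and the "_shadow_" prefix are made-up names, and state is a plain dict standing in for st.session_state so the sketch runs outside Streamlit.

```python
from typing import Any, MutableMapping

SHADOW_PREFIX = "_shadow_"  # hypothetical naming convention, not a Streamlit feature


def save_widget_value(state: MutableMapping[str, Any], key: str) -> None:
    """Copy the widget's current value into a non-widget ("shadow") key."""
    if key in state:
        state[SHADOW_PREFIX + key] = state[key]


def restore_widget_value(state: MutableMapping[str, Any], key: str, default: Any = "") -> None:
    """Before rendering the widget, re-seed its key from the shadow copy if the key is missing."""
    if key not in state:
        state[key] = state.get(SHADOW_PREFIX + key, default)


# In an app this might be wired up as follows (an assumption, not tested here):
#   restore_widget_value(st.session_state, "text1")
#   st.text_input("Text1", key="text1",
#                 on_change=save_widget_value, args=(st.session_state, "text1"))

# Simulate one round trip with a plain dict standing in for st.session_state.
state: dict = {}
restore_widget_value(state, "text1")  # first render: key seeded with ""
state["text1"] = "hello"              # user types something
save_widget_value(state, "text1")     # on_change callback fires
del state["text1"]                    # widget not rendered, so its key is cleaned up
restore_widget_value(state, "text1")  # widget rendered again
print(state["text1"])                 # prints: hello
```

A persist= style keyword would presumably make both helper calls unnecessary, which is the appeal of that proposal.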

8 Likes

Hi @jcarroll I am quite sure that I encountered vanishing key session states during the development of https://adventure.streamlit.app/. My final solution was to use callback and temp storage.

To be honest, final code probably is not the best use case as it “only” clears text input, but hey, you asked for any examples :wink:

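# Callback (referenced below as game_def.clear): stash the submitted text, then blank the input.
def clear(ss_variable):
    st.session_state["temp"] = st.session_state[ss_variable]
    st.session_state[ss_variable] = ""


# On the page: the text input calls the callback whenever its value changes.
directions_container.text_input(
    "What to do?",
    key="introSceneActions",
    on_change=game_def.clear,
    args=["introSceneActions"],
)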
1 Like

My instinct: avoid a breaking change that sets the default behavior to keep keys after a widget is destroyed. Bad scenarios I imagine:

  1. I can imagine having widgets generated in a loop or having some real-time app that is doing a lot of refreshes. If this is combined with having something like a random or time-stamped key to forcibly destroy and recreate widgets over and over, it will create a mess of session state. I know there have been instances on the forum where I have advised “you can set the key to include a timestamp so that it is forcibly destroyed and starts over from the beginning.”

    Here’s an example where that advice came up: Input widgets inside a while loop

    Maybe this was bad advice and manually setting the value of the key would be better, but it is one approach that could have been taken that relies on the current default cleanup. However, if there are a lot of widgets involved, a batch prefix in the form of a timestamp for each key is a really easy way to reset all of them.

  2. There may be cases when different pages use the same key on widgets that are not intended to connect back up. This may especially be the case if each page is a kind of template or variation meant to start fresh, be independent, and maybe have some consistent naming conventions. I can’t imagine how many people were not trying to connect widgets on different pages and have overlap of simple names for keys.

    Certainly there would have to be a carve-out for st.button, st.download_button, st.file_uploader, and st.form, which cannot be set using st.session_state.

    A recent thread in the forums discussed creating an aggregating app from individual, standalone apps. I would think this case would be ripe to have a conflict with keys forcibly maintained across pages: Multi-page App from Single Apps

  3. This is probably a moot point as “does not apply” or “easily not included,” but I’d want to confirm the case for generated keys. If you did do a breaking change, I imagine by necessity that it could only apply to manually assigned keys and never generated keys. Currently, we get an error with the following on one page:

    st.text_input('A')
    st.text_input('A')
    

    DuplicateWidgetID: There are multiple identical st.text_input widgets with the same generated key.

    If any generated key was kept (extending the unique label restriction across all pages in the absence of a distinct key), I think it would cause a lot of breakage with the same kinds of widgets having the same labels on different pages and no intent to connect them.
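
To make scenario 1 concrete, here is a toy model in plain Python (no Streamlit) of what happens to session state when widget keys get a fresh prefix on every reset. The simulate function and its persist flag are invented for this sketch; it only mimics the cleanup rule as described in this thread and says nothing about how Streamlit is actually implemented.

```python
def simulate(resets: int, widgets_per_batch: int, persist: bool) -> int:
    """Return how many keys remain in the toy state after `resets` batches."""
    state: dict = {}
    for batch in range(resets):
        # A counter stands in for the timestamp prefix.
        rendered = {f"{batch}_field{i}" for i in range(widgets_per_batch)}
        for key in rendered:
            state[key] = ""
        if not persist:
            # Cleanup as described above: keys of widgets that were
            # not rendered in this run are dropped.
            for key in list(state):
                if key not in rendered:
                    del state[key]
    return len(state)


print(simulate(resets=100, widgets_per_batch=5, persist=False))  # 5
print(simulate(resets=100, widgets_per_batch=5, persist=True))   # 500
```

With cleanup, only the latest batch survives; with persistence by default, every abandoned batch stays behind unless the developer deletes it by hand.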

For me the default cleanup behavior makes a lot of sense as a safe way to avoid clogging up the system. Sure, it might not be the first thing people think of when they want to carry info between pages, but I think the “clean by default” might be a little underappreciated in favor of the current problem that draws our attention. Hence, I like adding another keyword to change what/how much is exempted from cleanup.

If you want to get super slick about it,

  1. keep the default behavior,
  2. add a keyword to one-off change the behavior,
  3. and provide an option in the app configuration file to set a different default for the whole app.
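
A rough sketch of how those three layers could resolve, with entirely hypothetical names (resolve_persist, widget_arg, app_default); nothing like this exists in Streamlit today, it just spells out the precedence I have in mind.

```python
from typing import Optional

# Today's behavior: widget state is not kept once the widget stops rendering.
LIBRARY_DEFAULT = False


def resolve_persist(widget_arg: Optional[bool], app_default: Optional[bool]) -> bool:
    """Widget-level keyword wins, then the app-wide config, then the library default."""
    if widget_arg is not None:
        return widget_arg
    if app_default is not None:
        return app_default
    return LIBRARY_DEFAULT


print(resolve_persist(None, None))   # False: nothing set, current behavior
print(resolve_persist(None, True))   # True: app config flips the default
print(resolve_persist(False, True))  # False: one-off override on a single widget
```
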
3 Likes

This might be what you are doing, but I remember multi-selects being especially difficult. I might as well chime in with my solution

Explanation:
Since “non-widget session state key-values” are preserved, and because a “widget session state key” has the value of the current selection, I store the selection as a default session state key and then on each page refresh assign that as the default.
I made use of the “on_change” parameter of a widget to update a “non-widget session state key-value” to be the same as the “widget session state key-value”. This on_change function callback can be used in general for many widgets, so long as they have a default option.

Place this in a page in a multi-page app.

import streamlit as st

st.header(
    "This is a minimal reproducible example of how to create a persistent multiselect"
)


st.text(
    """
    -If you leave this page and return, note that the multiselect's key,
    "st.session_state["multiselect_key"]" does not exist
    
    -The non-widget key in session state,
    "st.session_state["multiselect_default"]" does, though.
    
    -The multiselect widget has an on_change function
    This stores the multiselect values into a non-widget key 
    When the widget is recreated it will retain previous values selected
    
    -The values selected on the multiselect can be accessed on other pages via 
    "st.session_state["multiselect_default"]" """
)
st.write(st.session_state)

st.header("multiselect")
st.write("On the first run, the default is assigned as empty")
# The multiselect options will persist by updating the default value, but the default value needs to be initialized
if "multiselect_default" not in st.session_state:
    st.session_state["multiselect_default"] = []
# We can have a callback function.
def update_ms():
    st.session_state["multiselect_default"] = st.session_state["multiselect_key"]


st.multiselect(
    label="Persistent Multipage Multiselect",
    options=[2, 3, 5, 7, 9, 11, 13],
    default=st.session_state["multiselect_default"],
    key="multiselect_key",
    on_change=update_ms,
)

st.header("Writing the session state at the end of the page")
st.write(st.session_state)

2 Likes

And if what I wrote is helpful, and you want to expand it further, I generalized that solution into one that elegantly (I hope?) lets you use multiple widgets, at least multiselects. Hopefully it is clear how one could store all the widgets.

import streamlit as st

st.header("Writing session state here. It will be empty on first app run.")
st.write(st.session_state)


multi_selects = [
    {
        "def_key": "def0",
        "sel": [],
        "ms_key": "ms0",
        "options": [0, 1, 2, 3, 4, 5],
        "label": "first multiselect",
    },
    {
        "def_key": "def1",
        "sel": [],
        "ms_key": "ms1",
        "options": [10, 11, 12, 13, 14, 15],
        "label": "second multiselect",
    },
]
for multi_select in multi_selects:
    if multi_select["def_key"] not in st.session_state:
        st.write("The default key for this multiselect doesn't exist yet, so setting it to empty")
        st.session_state[multi_select["def_key"]] = []

def update_widget(ms):
    st.write(ms)
    st.session_state[ms["def_key"]] = st.session_state[ms["ms_key"]]

st.header("Here we make multi-selects")
for ms in multi_selects:
    # st.write(ms)
    st.multiselect(
        label=ms["label"],
        options=ms["options"],
        default=st.session_state[ms["def_key"]],
        key=ms["ms_key"],
        on_change=update_widget,
        args=(ms,),
    )

st.header("Here we write the session state at the end of each refresh")
st.write(st.session_state)
1 Like

Hi there,

Thank you for considering this change. This issue is probably my biggest usability pet peeve about how Streamlit works right now, so this post might be biased.

Do you have a full working public app where you had to hack around this (such as duplicating or using “shadow keys”)? Please share with us!

Not anything public unfortunately, but all of my older apps have callbacks to save session_state on widget change, like:

def on_change_number():
    st.session_state["number"] = st.session_state["number"]

number = st.number_input("number", key="number", on_change=on_change_number)

or a blank st.session_state.update(st.session_state) between pages.

In my more recent apps, I just stopped using the key parameter altogether. Instead, I save the variables I want saved in st.session_state directly. It’s more predictable and once you factor in correcting for the broken behavior of key, it’s about the same number of lines.

Given that, should we just do a breaking change to the default behavior to persist, like most of us seem to expect?

IMO, unequivocal yes, but then again I am biased as I have never understood the reason for doing it differently in the first place.

The base argument goes something like this: if I put something inside a dict or a list or any other data structure, I would expect it to stay there until it is explicitly removed, not depending on the visual state of the frontend.

The use cases go something like this:

  • An entry form with several steps. You do not expect the previous steps to be cleared just because the user moved on.
  • A complex computation with tons of parameters, grouped by tabs or pages. You do not expect everything to be reset just because you change the page.
  • A conditional widget where you don’t want to make the user re-enter data just because they switched some toggle back and forth.
  • et cetera.

My ideal use pattern would be:

  1. Save something in state,
  2. If you want to remove it at some point, do that yourself.

The current use pattern goes like this:

  1. Save something in state,
  2. Think hard about whether it might possibly vanish silently,
  3. Save it again so that it doesn’t.

Both can be managed, but in the first one, you only need to think about what you save and remove in the code. In the second, you need to also consider all possible side effects of the auto-deletion mechanism, or just save everything twice to be sure.

I would say “explicit is better than implicit” and if someone explicitly provides a key parameter to save the widget state, it should not be implicitly cleared.

Finally, I imagine maintaining one state instead of two separate ones plus a bridge between them would simplify the Streamlit internals a bit? :wink:

If so: We should still provide a path to clear keyed widget state when needed. Need to figure out how to best do this.

To me, this seems like a documentation issue more than implementation issue. If a widget state and session_state were stored in the same data structure, would the normal Python methods not be enough?

I understand the session_state can get busy, and accidental collisions could be one argument for clearing it between pages. In fact, recently I’ve taken to grouping individual parameters inside data classes and saving those inside session_state. So if this went into effect, maybe grouping session states from different pages into objects would be a good idea. Or maybe the users with few keys could keep going as they do, and users with many keys can make grouping objects themselves.
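
Roughly what I mean by grouping, as a sketch only: PageParams, its fields, and get_params are made-up names, and state is a plain dict standing in for st.session_state.

```python
from dataclasses import dataclass, field
from typing import Any, MutableMapping


@dataclass
class PageParams:
    """Hypothetical group of parameters belonging to one page."""
    threshold: float = 0.5
    columns: list = field(default_factory=list)


def get_params(state: MutableMapping[str, Any], page: str) -> PageParams:
    """Fetch the page's parameter object, creating it on first use."""
    if page not in state:
        state[page] = PageParams()
    return state[page]


state: dict = {}  # stand-in for st.session_state
params = get_params(state, "analysis_page")
params.threshold = 0.8
params.columns.append("price")

# A later lookup returns the same object, so the edits are still there.
print(get_params(state, "analysis_page"))

# The whole group can be dropped with a single delete.
del state["analysis_page"]
```

One key per page keeps the top level of session_state small, and clearing a page becomes a one-liner.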

But namespace pollution, inelegant as it is, takes a backseat to programming logic which IMO is kind of broken right now.

For me, this would work just as well! But I imagine there would be quite a maintenance burden with that, and explaining to everyone that widget state is kind of like session state but not really, unless you change some configuration option, would IMO make it even more newcomer-unfriendly.

Edit: the strongest argument I can see against making a breaking change is that it would be, well, breaking. That in itself might be enough not to do it, if there are enough users relying on current behavior, even though personally I would be all for it.

I think a decent compromise would be to include a persist argument ASAP, and to re-work state to only have a single persistent session_state, maybe with page-specific namespaces, in Streamlit 2.0 whenever that comes.

As for the implementation in that case, I would suggest to keep it simple and make it a boolean. There should really be one dependable mechanism to preserve widget state and multiplying the ‘preserve, but only sometimes, and also maybe in some other cases’ options will create even more confusion.

One more point: I think in any case, the documentation needs to be way more explicit about what is saved where. One can get pretty far using Streamlit and never realizing that this issue exists until it bites them.

Consider Multipage apps - Streamlit Docs and Add statefulness to apps - Streamlit Docs. These pages make it sound like widget state can be saved in session_state, and session_state is persisted across pages! In fact the only page I am aware of that makes this distinction is Widget behavior - Streamlit Docs, but it’s not an easy one to find.

1 Like

As I’ve kept mulling over this, a thought occurred to me. Supposing there is an up-welling of support for changing the default behavior, I’d suggest this nuance:


Option A: When someone does a manual assignment of a value in a callback, through initialization, or anywhere in the program to st.session_state['my_key'] then mark ‘my_key’ as protected from deletion upon widget cleanup for the entirety of the session, across all pages. (If a user manually deletes the key, the exemption should also be removed so it’s like it never existed.) However, if a key is only present because it was assigned to a widget, but not manually assigned anywhere, leave the default cleanup procedure unless a setting is declared (at the widget, page, or app level) to say otherwise.


Option B: Alternately, you could make the distinction of whether or not a key was created manually or via a widget, regardless of any read/write commands that happened to it later. Hence, if a user creates a key and then assigns it to a widget, it stays. If a key is created with the widget either through automatic generation or assignment to the key keyword, it gets deleted same as current.
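
To pin down the difference between the two options, here is a toy model in plain Python. ToyState and its methods are invented for illustration and are not how Streamlit tracks keys internally; the model only encodes the rules as I described them above.

```python
class ToyState:
    """Toy model of the two proposals; not Streamlit's real implementation."""

    def __init__(self, option: str) -> None:
        self.option = option      # "A" or "B"
        self.values = {}
        self.widget_keys = set()  # keys that have been bound to a widget
        self.protected = set()    # keys exempt from widget cleanup

    def assign(self, key, value):
        """Manual assignment, like st.session_state['my_key'] = value in user code."""
        created_here = key not in self.values
        self.values[key] = value
        # Option A protects on any manual assignment.
        # Option B protects only if the manual assignment created the key.
        if self.option == "A" or created_here:
            self.protected.add(key)

    def delete(self, key):
        """Manual deletion also removes the exemption."""
        self.values.pop(key, None)
        self.protected.discard(key)
        self.widget_keys.discard(key)

    def render_widget(self, key, default=None):
        """A widget with this key is rendered; it creates the key if needed."""
        self.widget_keys.add(key)
        self.values.setdefault(key, default)

    def end_of_run(self, rendered):
        """Drop unprotected widget keys whose widget was not rendered this run."""
        for key in list(self.widget_keys):
            if key not in rendered and key not in self.protected:
                self.values.pop(key, None)
                self.widget_keys.discard(key)


def survives(option: str) -> bool:
    state = ToyState(option)
    state.render_widget("my_key", default="")  # key created by the widget
    state.assign("my_key", "edited")           # later written manually
    state.end_of_run(rendered=set())           # widget not rendered this run
    return "my_key" in state.values


print(survives("A"), survives("B"))  # True False
```

The only case where the options disagree is a key that a widget created and user code later wrote to: Option A keeps it, Option B cleans it up.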


Why?

I see two fundamental reasons why someone might have set a key on a widget.

  1. To access and manipulate the value
  2. To avoid a duplicate widget ID

I hypothesize that it is far more likely in case 1 that a user has some desire to keep that key around, and far less likely in case 2. Furthermore, having some manual manipulation of session state within the code is proof that case 1 applies. (Sure, reading the value from session state could also be a trigger, but I feel that would be a little more confusing describing the paradigm.)

By not jumping all the way to protecting all manually assigned keys, I think it would be a little less jarring of a change, hopefully avoiding some really messy scenarios with loops. I feel like the oddest detail to explain about the current cleanup is that a variable unassigned to a widget on one page will get deleted if you pass through a page that assigns it to a widget. In that case you have something that is persistent across many pages until you hit that one where you used it with a widget.

For my part, I can work with any kind of default, but I would very much like to be able to set the default in configuration and override it as needed. There are times I want things kept tidy and times where I want to keep something around. I would definitely want the ‘always remove’ and ‘always protect’ options for keys, but this middle ground would also be a nice option to set, though not strictly necessary if we can manually override at the widget level.

I would be very much opposed to adding any more silent conditional behaviors. The API should be simple, predictable and do exactly what it says, no more no less. Hidden side effects are how you get bugs which are hard to find and hard to fix. An object should behave the same no matter how it was initialized; the fact that session_state, at the moment, does not, is exactly the problem.

Edit: having also thought about it some more, I am starting to understand how one could expect that if they bind session_state to a widget, the state entry would vanish with the widget. So a simple boolean persist or keep or something like that could be a good compromise as long as it is explicit, documented and avoids behind-the-scenes shenanigans.

1 Like

Just to continue the stream of thought for the sake of discussion… (i.e. I intend this as an academic discussion rather than a good/bad declaration. :slight_smile: I can work with whatever default, especially if they create the option to switch the behavior in some way. But there are two things in question: 1. what should be the default and 2. what should be the available options.)

In my mind, there is some consistency to differentiating keys created/overwritten/handled via st.session_state.my_key vs key='my_key' within a widget. When you don’t specify the key, the widget has state and indeed some kind of “key”, enough to generate a DuplicateWidgetID error saying the “generated key” is the same. The error exists for the same kind of widget with the same label on the same page, but not for the same kind of widget with the same label on different pages. So, even when I do not touch the key parameter to register a key to st.session_state, there is still something that exists implicitly.

Example: This widget retains state even though the key is removed from session state every time before it is rendered. The widget is stateful so long as it is rendered.

import streamlit as st

st.session_state

if 'elephant' in st.session_state:
    st.session_state.clear()

st.session_state

st.number_input('Memory', 0, step=1, key='elephant')

st.session_state

What I mean to say is that the st.session_state we use is just a convenience for working with the real state that lies deeper. It’s a pass-through. So in one sense, it is just a variable like any other and should not be inextricably tied to any widget because it isn’t even that widget’s real state. However, that is exactly how most people are thinking about it. That’s why there is some consistency with the above mentioned middle ground, in my brain at least.

Admittedly, my brain can be a bit different-y. Just a thought. Again, I can personally work with whatever default, but I do like having options.

Edit: And yes, the end result is that a parameter is needed to specify behavior. Since use of key is serving two, potentially distinct purposes (duplicates vs values), we need something to identify the desired behavior, regardless of whatever default is chosen.

1 Like

Thanks all for the great discussion on this so far!! And all the good ideas and examples.

I would be very much opposed to adding any more silent conditional behaviors. The API should be simple, predictable and do exactly what it says, no more no less. Hidden side effects are how you get bugs which are hard to find and hard to fix. - @ennui

I tend to agree with this point pretty strongly.

Configuration
Re: the configuration option to toggle the behavior - yes, I think the maintenance and explainability burden of this would be a big barrier for a feature that is so ingrained into the library. So I don’t think we will do this option.

Widget/Page State?
One other idea that came up in Discord was moving towards a separate widget_state (or maybe page_state) where values were naturally persisted and also scoped by page in multi-page apps - maybe addressable by page_name → widget_key or similar. This might be more natural than trying to combine widget state with developer-defined session state in a single object (as we do today), and allow for a non-breaking change.

We might still need to consider some way to “link” widgets across pages or set “global” widget values or some similar thing for multi-page apps. Not exactly sure the best way to do that. It could also be done like today where you need to use a callback or explicit session_state value.

Also note - we are considering adding an app_state in the future for state that needs to live across sessions (it was originally called st.database in the roadmap but I think this name is better). So it would have some consistency with a few different state objects at different scopes that work similarly.

Curious if folks here have thoughts or reactions to any of these ideas!

I agree here. Just like you guys split st.cache I think it would be a good idea to split the idea of some value to define a widget value and a value we are keeping track of for reruns or to carry over to other pages.

I’m unclear how this could be implemented to avoid a breaking change, though. If widgets are given a distinct way to access and store their values, wouldn’t that break things since the current design puts that information directly into st.session_state by widget key upon widget creation? I think there is more value in a breaking change if it was a fundamental rework vs just changing the “silent” behavior. If you change the syntax people use to access things, it’s more clear that there was in fact a change.

Stuff in session state falls into these categories for me:

  1. Things you want to keep for a whole session
    a. Things I want to access on multiple pages
    b. Things I associate to a specific page
  2. Things I want to keep only while I remain on a page
  3. Widgets

1a and 1b don’t need to be separated in terms of logic flow, but it would be convenient to have access to groups of keys, usually by page but abstract groups would be nice too. If you are only accessing information in session state manually, this is easily done with a dictionary in session state, but that doesn’t play well with widgets since the keyword argument doesn’t let you point to a key within a dict stored in session state.

If widgets are left with their current default cleanup behavior, but their keys are put into st.widget_state instead of st.session_state, I think that would be nice structurally but would also definitely break a lot of apps. However, by pulling the step of writing information to “the real session state” outside of the widget’s direct control, it does create the opportunity for much cleaner logic. You wouldn’t need to specialize session state for the whole app vs pages as users could abstractly make any grouping of keys via dictionaries. Users could easily wipe these dictionaries or leave them as is with good naming practices. If they want to carry widget information in session state to persist it longer than its own rendering, it would be manually copied as output or from st.widget_state into the correct key or key_group[key] in session state.
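
The manual copy into key_group[key] could look something like the sketch below. save_to_group and clear_group are hypothetical helper names, and session is a plain dict standing in for st.session_state; st.widget_state itself is only an idea from this thread, so the sketch just takes the widget's value as an argument.

```python
from typing import Any, MutableMapping


def save_to_group(session: MutableMapping[str, Any], group: str, key: str, value: Any) -> None:
    """Store a widget's output under session[group][key]."""
    session.setdefault(group, {})[key] = value


def clear_group(session: MutableMapping[str, Any], group: str) -> None:
    """Wipe every value stored for one group (for example, one page)."""
    session.pop(group, None)


session: dict = {}  # stand-in for st.session_state
save_to_group(session, "page_a", "name", "Ada")    # e.g. the return value of a text input
save_to_group(session, "page_b", "name", "Grace")  # same key name, no collision
clear_group(session, "page_a")
print(session)  # {'page_b': {'name': 'Grace'}}
```

Because the grouping is just nested dictionaries, the same two helpers would cover per-page state and any other abstract grouping.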

Pros

  • Session state wouldn’t be bloated from widgets dumping information directly into it.
  • A widget would never silently change session state.
  • The widgets would no longer conflict with dictionaries used to group values in session state (whether by page or some other logical grouping); if you want a widget value saved in session state, you can control at what level you store it.
  • You gain the ability to implement “page state” flexibly, without actually reworking the basis of session state.

Cons

  • It creates a little extra work to send the stored information from session state back to a widget
