Weird session-state behaviour when storing a list of dictionaries

Hi everyone, I recently started using Streamlit and came across a behaviour that I don't really understand. Basically, I'm trying to create JSON files (so, Python dicts) with as many keys/values as the user wants for each dict; the dicts are then stored in a list.
I first use a list stored in the session state to hold the different key/value pairs and let the user add more of them. Then I turn that list (and another value that's just input once globally, so no session state there) into an actual dict, which I then store in another session state list. Here's the code (stripped down for convenience, but the issue is still there):

import streamlit as st

if 'memoire' not in st.session_state:
    st.session_state['memoire'] = []
if 'var_list' not in st.session_state:
    st.session_state['var_list'] = []

variable_name = st.text_input('Variable name', value='variable_test')
variable_type = st.selectbox('Variable type', ['string', 'bool', 'float'])
variable_required = st.checkbox('Required ?')

add_to_list = st.button('Add variable')
if add_to_list:
    st.session_state.var_list.append({variable_name: {"type": variable_type, "required": variable_required}})

names = []
for x in st.session_state.var_list:
    for key, value in x.items():
        names.append(key)

op1_name = st.selectbox('pick a variable', names)

schema = {}
for x in st.session_state.var_list:
    schema.update(x)
if names:  # names is empty until the first variable has been added
    schema[names[0]]['custom_rules'] = {}
    schema[names[0]]['custom_rules']['operations'] = [{'variable': op1_name}]

display_current = st.checkbox('display current schema')
if display_current:
    st.json(schema)

save_to_state = st.button("save schema to state")
if save_to_state:
    st.session_state['memoire'].append(schema)
st.json(st.session_state.memoire)


I would expect that, after saving one complete dict (what I call a schema in my code), it would just stay there and that newer schemas would simply be appended to the list. That's not the case: if you run the app, add 2 variables, say "a" and "b", then save to state, you can see that when you change the input in the "pick a variable" field, the change also happens in "memoire" (displayed at the bottom of the app):


Let's set the 'pick a variable' selectbox to 'b' and:

Furthermore, if you try to save several schemas, you'll notice that they are always identical (and that changing 'pick a variable' edits all of them at once).
It seems to me that 'memoire' is stored as a list of references to the variable 'schema', instead of a list of the values 'schema' held when it was saved, but that's just a guess. Is this behaviour intended in Streamlit, and does anyone know of a way to get around it?
Thanks in advance!


Okay, so while investigating the issue I came across another unexpected behaviour which, I guess, causes my issue. In the following code:

import streamlit as st

if 'var_list' not in st.session_state:
    st.session_state.var_list = []

variable_name = st.text_input('Variable name', value='variable_test')
variable_type = st.selectbox('Variable type', ['string', 'bool', 'float'])
variable_required = st.checkbox('Required ?')

add_to_list = st.button('Add variable')
if add_to_list:
    st.session_state.var_list.append({variable_name: {"type": variable_type, "required": variable_required}})

names = []
schema = {}
for x in st.session_state.var_list:
    for key, value in x.items():
        names.append(key)
    schema.update(x)
st.write(schema)
op1_name = st.selectbox('pick a variable', names)
if names:  # names is empty until the first variable has been added
    schema[names[0]]['custom_rules'] = {}
    schema[names[0]]['custom_rules']['operations'] = [{'variable': op1_name}]
"variable list"
st.write(st.session_state.var_list)
"schema"
st.write(schema)

If you launch the app and then add a variable, you'll notice that the single element of st.session_state.var_list has a "custom_rules" key, as well as "operations" and "variable" ones. That's of course not supposed to happen, since the only time something is added to var_list is when add_to_list is clicked, in which case something of the shape {variable_name: {"type": variable_type, "required": variable_required}} is added. I messed around with some print statements and found out that the extra keys are added to var_list when they're added to schema, i.e. in

schema[names[0]]["custom_rules"] = {} 
schema[names[0]]["custom_rules"]["operations"] = [{"variable": op1_name}]

Why does this happen? The only link between schema and var_list is that schema is created by taking the elements of var_list and using the update function, which, as far as I'm aware, should not "link" the two variables together.
If anyone understands what's happening, I'd really appreciate some explanations.
Thanks

I usually get weird 'linked' behaviour such as this when working with dictionaries. The way I deal with it is to create a deepcopy of the original list/dictionary, perform my operations on the copy and delete the copy at the end. You could try something like:

import copy

var_list_copy = copy.deepcopy(st.session_state.var_list)
names = []
schema = {}

for x in var_list_copy:
    for key, value in x.items():
        names.append(key)
    schema.update(x)

# your other code here, and at the end;

del var_list_copy
del names, schema # delete these too since it looks like it's only the 'var_list' that needs to be persistent.

This way, there will be no 'links' between var_list and any other operations you perform on its data. I admit it's a weird behaviour but using copy.deepcopy() works for me every time.
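
For what it's worth, the 'link' can be reproduced without Streamlit at all, which suggests it comes from ordinary Python reference semantics rather than from session state: dict.update() copies keys and references to the values, so the inner dicts end up shared. Here's a minimal sketch of that, assuming a plain list as a stand-in for st.session_state.var_list (the names clean_list, safe_schema and the plain memoire list are just for illustration):

```python
import copy

# Stand-in for st.session_state.var_list (plain list, so this runs without Streamlit).
var_list = [{"a": {"type": "string", "required": False}}]

# dict.update copies the key and a *reference* to the inner dict, not the inner dict itself.
schema = {}
for item in var_list:
    schema.update(item)

# The inner dict is the very same object in both containers.
shared = schema["a"] is var_list[0]["a"]

# So mutating it through schema also changes var_list.
schema["a"]["custom_rules"] = {"operations": [{"variable": "a"}]}
leaked = "custom_rules" in var_list[0]["a"]

# Building the schema from a deep copy leaves the original untouched.
clean_list = [{"a": {"type": "string", "required": False}}]
safe_schema = {}
for item in copy.deepcopy(clean_list):
    safe_schema.update(item)
safe_schema["a"]["custom_rules"] = {}
untouched = "custom_rules" not in clean_list[0]["a"]

# Likewise, appending a deep copy stores a snapshot rather than a live reference.
memoire = []
memoire.append(copy.deepcopy(safe_schema))
safe_schema["a"]["type"] = "bool"
snapshot_type = memoire[0]["a"]["type"]

print(shared, leaked, untouched, snapshot_type)
```

The same idea should carry over to the first example in the thread: appending copy.deepcopy(schema) to st.session_state['memoire'] instead of schema itself would store the values as they were when the button was clicked.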


This works perfectly, thanks !

Glad to help.