Updating starting dataframe in experimental_data_editor

Summary

I get unexpected behavior when choosing the starting dataframe for an editable dataframe. When starting from blank, I can edit the dataframe as intended and the edits “stick.” When loading data instead, I can edit the dataframe, but the edits don’t “stick” and the editor returns the original loaded data.

Steps to reproduce

Code snippet:



import streamlit as st
import pandas as pd
import numpy as np

# User-defined create mode
df_choice = st.radio(
    label='Choose your destiny...',
    options = ['Create New','Modify Existing']
)

# Update starting df by create modes
if df_choice == 'Create New':
    starting_df = pd.DataFrame({
        'A': pd.Series(dtype=str),
        'B': pd.Series(dtype=float),
        'C': pd.Series(dtype=float)
    }).reset_index(drop=True)

if df_choice == 'Modify Existing':
    starting_df = pd.DataFrame({
        'A': ['a','b','c','d'],
        'B': np.random.randint(4),
        'C': np.random.rand(4)
    })

edited_df = st.experimental_data_editor(
    data = starting_df,
    num_rows='dynamic',
    use_container_width=True
    )

st.write('df at end:',edited_df)


Expected behavior:

When choosing “Modify Existing”, one should be able to add data and return the new, edited dataframe.

Actual behavior:

Works as intended with Create New, but with Modify Existing any added data is discarded and the original loaded data is returned.

Debug info

  • Streamlit version: 1.20.0
  • Python version: 3.10.10
  • Using Conda
  • OS version: Windows 10
  • Browser version: Chrome 113.0.5672.64 (Official Build) (64-bit)

You are randomly creating the starting dataframe. As such, it will be a new, random dataframe every time the page loads with ‘Modify Existing’ selected. You need to randomly generate it once and store it in session state.

Try starting your script with:

import streamlit as st
import pandas as pd
import numpy as np

if 'df' not in st.session_state:
    st.session_state.df = pd.DataFrame({
        'A': ['a','b','c','d'],
        'B': np.random.randint(4),
        'C': np.random.rand(4)
    })

# User-defined create mode
df_choice = st.radio(
    label='Choose your destiny...',
    options = ['Create New','Modify Existing']
)

# Update starting df by create modes
if df_choice == 'Create New':
    starting_df = pd.DataFrame({
        'A': pd.Series(dtype=str),
        'B': pd.Series(dtype=float),
        'C': pd.Series(dtype=float)
    }).reset_index(drop=True)

if df_choice == 'Modify Existing':
    starting_df = st.session_state.df

Yes, the app now functions as intended, thanks. There is something I am misunderstanding about how the data editor reruns. 1) When choosing “Create New” why isn’t the dataframe overwritten each time? i.e. I can add/edit and it isn’t overwritten. 2) Is there a need to write the edited df back to session state in order to keep it for the next rerun? If I add st.session_state.df = edited_df to the end of the script it doesn’t work as intended.

Please help me understand the flow when experimental_data_editor is edited.

When you use ‘Create New’ you are reliably getting an empty dataframe with just the column labels. Rerunning the page doesn’t change what that empty dataframe looks like. The problem with the randomly generated one is the actual “random” part; it would have different values/data with each page load and thus create a new data editor with each page load.
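To see the difference outside of Streamlit, here is a minimal sketch that builds each starting dataframe twice and compares the results. The helper names `empty_df` and `random_df` are made up for illustration; the dataframes themselves are the ones from the original snippet.

```python
import numpy as np
import pandas as pd


def empty_df():
    """The 'Create New' starting point: column labels only, no rows."""
    return pd.DataFrame({
        'A': pd.Series(dtype=str),
        'B': pd.Series(dtype=float),
        'C': pd.Series(dtype=float),
    })


def random_df():
    """The 'Modify Existing' starting point: fresh random values on every call."""
    return pd.DataFrame({
        'A': ['a', 'b', 'c', 'd'],
        'B': np.random.randint(4),
        'C': np.random.rand(4),
    })


# Two "page loads" of each mode, compared value by value.
same_empty = empty_df().equals(empty_df())
same_random = random_df().equals(random_df())

print(same_empty)   # the empty frames match on every run
print(same_random)  # the random frames differ (barring an astronomically unlikely draw)
```

From the data editor's point of view, the first case is the same input on every rerun, while the second is a brand-new input each time.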

So long as you don’t change what you have passed to the data keyword parameter, the data editor will remain stateful and keep track of the edits. If you want to save those edits so you can navigate to another page, or retain the info while passing something different to data, then yes, you would need to push the edited dataframe to session state in some way.

Whenever you are changing some creation parameter of a widget (in this case the dataframe passed to data), you have to carefully work through the order of operations.

Consider:

import streamlit as st
import pandas as pd

if 'df' not in st.session_state:
    st.session_state.df = pd.DataFrame([0])

edited_df = st.experimental_data_editor(st.session_state.df)

st.session_state.df = edited_df

  1. The page initializes and we have a dataframe with a single cell having value 0.
  2. You change the value to 1.
  3. The page reloads with the dataframe in session state still having value 0.
  4. When the script gets to the data editor, the data input still has 0, but the edited output now has 1.
  5. The script ends by overwriting the dataframe in session state, so it now has 1 as well.
  6. Now you change the value to 2.
  7. The page reloads with the dataframe in session state having value 1.
  8. When the script gets to the data editor, it sees that there is now a new value for data. Hence the state of the previous data editor is discarded (bye-bye 2), and you get a new data editor starting fresh with a value of 1.
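
The steps above can be traced with a toy model in plain Python. This is only a sketch of the behavior described in this thread, not Streamlit's actual implementation: `ToyEditor` and `rerun` are hypothetical names, and a single integer stands in for the dataframe.

```python
class ToyEditor:
    """Toy stand-in for a data editor: edits survive only while the input is unchanged."""

    def __init__(self):
        self._input = None  # data the current widget instance was built from
        self._edit = None   # pending user edit, if any

    def user_edits(self, value):
        """Simulate the user typing a new value into the widget."""
        self._edit = value

    def render(self, data):
        """Return the edited output for this script run."""
        if data != self._input:
            # New input means a new widget: the old edit state is discarded.
            self._input = data
            self._edit = None
        return data if self._edit is None else self._edit


def rerun(state, editor):
    """One script run: render the editor, then write its output back to state."""
    edited = editor.render(state["df"])
    state["df"] = edited  # the problematic write-back at the end of the script
    return edited


state = {"df": 0}
editor = ToyEditor()

rerun(state, editor)         # step 1: input 0, output 0
editor.user_edits(1)         # step 2
print(rerun(state, editor))  # steps 3-5: prints 1, and state["df"] becomes 1
editor.user_edits(2)         # step 6
print(rerun(state, editor))  # steps 7-8: prints 1, the edit to 2 is lost
```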

Here is an example with an added step to get around this:
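
A minimal sketch of one such workaround, assuming the added step is an explicit save (this is not necessarily the exact example originally posted). The idea is to keep the editor's input and its output under separate session state keys and only overwrite the input when the user asks for it. The names `commit_edits` and `render`, the `edited_df` key, and the "Save edits" button are all assumptions made for illustration. `render` takes the `streamlit` module as an argument only so the logic can be exercised without a running app.

```python
import pandas as pd


def commit_edits(state):
    """The added step: make the saved edits the editor's new input."""
    state["df"] = state["edited_df"].copy()


def render(st):
    """Draw the editor and the save button; `st` is the imported streamlit module."""
    state = st.session_state
    if "df" not in state:
        state["df"] = pd.DataFrame([0])          # input passed to the editor
        state["edited_df"] = state["df"].copy()  # latest output of the editor

    # The input only changes inside commit_edits, so ordinary reruns keep
    # handing the editor the same data and its edit state is preserved.
    state["edited_df"] = st.experimental_data_editor(state["df"])

    # The callback runs before the next rerun starts, so the editor is rebuilt
    # once, starting from the committed data.
    st.button("Save edits", on_click=commit_edits, args=(state,))
```

In an app script this would be used as `import streamlit as st` followed by `render(st)`. Because the write-back happens in a button callback rather than at the end of every run, the input does not change underneath the editor while you are still editing.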

@mathcatsand Thank you so much for the patient explanation and workaround example! In my full app I intend to allow the user to change the starting data, so the example is perfect. I had not internalized that the editor is recreated each time the incoming data changes.
