Adding rows in st.data_editor from loaded dataframe

Summary

Hey guys,

I am using st.data_editor and saving the resulting dataframe as a JSON file.
When I load the dataframe with st.file_uploader and want to edit it with st.data_editor again, I can only edit as many rows as the loaded dataframe contained. I can add a new row and type into it, but the new row is not stored by the data_editor; it is only displayed in the GUI.
Is there a solution to this problem?

Thanks for your help!

import streamlit as st
import pandas as pd
import json

with st.expander("Load Data"):
    uploaded_file = st.file_uploader("Choose a file")
    if uploaded_file is not None:
        loaded_df = pd.read_json(uploaded_file)
        if 'loaded' not in st.session_state:
            st.session_state.loaded = True
            st.session_state.df = loaded_df

# initialize Dataframe
if 'df' not in st.session_state:
    st.session_state.df = pd.DataFrame({'name': ["Name 1", "Name 2"]})

# create editable Dataframe 
edited_df = st.data_editor(
    st.session_state.df,
    use_container_width=True,
    num_rows="dynamic",
    column_config={
        "name": "Name",
    },
    hide_index=True,
)
print(edited_df)
savedata = edited_df.to_dict()

if st.button(label="Save Data"):
    st.success("Data saved")
    with open("Save_dataframe.json", "w") as data:
        json.dump(savedata, data)

Expected behavior:

Add and edit new rows in the edited dataframe.

Actual behavior:

Added rows and edits are only stored in the variable edited_df up to the number of rows in the loaded dataframe. E.g. if you loaded a dataframe with 3 rows, only those 3 rows are editable. Although added rows are displayed in the GUI, you can’t address them in the code.

Debug info

  • Streamlit version: 1.24
  • Python version: 3.11
  • Using PyCharm
  • Browser version: Chrome and Edge

Thanks for reporting, @Betze!

I’m tagging @lukasmasuch, who should be able to comment on that potential issue.

Thanks,
Charly

@Charly_Wargnier data_editor is a nice widget. I started using it today,
but I observed the same issue that @Betze reported.

To reproduce, I used the sample code from the API docs:

import streamlit as st
import pandas as pd
st.info(st.__version__)
df = pd.DataFrame(
    [
        {"command": "st.selectbox", "rating": 4, "is_widget": True},
        {"command": "st.balloons", "rating": 5, "is_widget": False},
        {"command": "st.time_input", "rating": 3, "is_widget": True},
    ]
)
edited_df = st.data_editor(df, key="demo_df", num_rows="dynamic", hide_index=False)

favorite_command = edited_df.loc[edited_df["rating"].idxmax()]["command"]
st.markdown(f"Your favorite command is **{favorite_command}** 🎈")

st.subheader("Edited df")
st.write(edited_df)

st.subheader("Edited rows")
edited_rows = st.session_state["demo_df"].get("edited_rows")
st.write(edited_rows)

In the screenshot above, the row at index 0 was deleted and the rows at index 3 and 4 were added.

Currently:

  1. edited_rows only stores updates to rows of the original df, not rows added or deleted in the UI
  2. edited_df stores the rows shown in the UI (including newly inserted rows, and without any deleted rows)

One workaround is to do a set operation between the original df and edited_df to figure out the inserted and deleted rows, then merge the result with edited_rows.
However, it would be nice if edited_rows could store all the changes (insert, delete, update).
Thanks
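
The set-operation workaround described above could look roughly like the sketch below. It is only an illustration in plain pandas: the two frames are made up to mirror the example (row 0 deleted, rows 3 and 4 added, one rating changed), `diff_rows` is a hypothetical helper name, and it assumes the index labels of edited_df still identify the original rows.

```python
import pandas as pd


def diff_rows(original: pd.DataFrame, edited: pd.DataFrame):
    """Compare two frames by index label and report deleted, added and updated rows."""
    # Labels present only in the original were deleted in the editor.
    deleted = original.index.difference(edited.index)
    # Labels present only in the edited frame were added in the editor.
    added = edited.index.difference(original.index)
    # Labels present in both may have been updated; compare them cell by cell.
    common = original.index.intersection(edited.index)
    changed = (original.loc[common] != edited.loc[common]).any(axis=1)
    updated = common[changed.to_numpy()]
    return list(deleted), list(added), list(updated)


# Hypothetical data: what the original df and edited_df might look like.
original = pd.DataFrame(
    {"command": ["st.selectbox", "st.balloons", "st.time_input"], "rating": [4, 5, 3]}
)
edited = pd.DataFrame(
    {
        "command": ["st.balloons", "st.time_input", "st.button", "st.slider"],
        "rating": [5, 4, 1, 2],
    },
    index=[1, 2, 3, 4],
)

deleted, added, updated = diff_rows(original, edited)
print(deleted, added, updated)
```

Note that a cell holding NaN in both frames would be reported as updated, because NaN never compares equal to itself.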


Hello wgong,

I am not sure you have the same problem as Betze. You can also access added and deleted rows by changing the second-to-last line in your code to
edited_rows = st.session_state["demo_df"]
I assume it is intended that “edited_rows” does not contain added and deleted rows.

But the fact that Betze’s “edited_df” cannot be saved correctly seems like a bug to me.


Thank you @hEAkahEq,
it worked just as you described. I wish the documentation could be improved by adding something like:

the added/deleted/updated rows information is stored in st.session_state[<data_editor_key>]
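
As a rough sketch of how that stored information might be used, the helper below replays the three kinds of changes onto a copy of the original dataframe. It assumes the state has the shape a dict with the keys "edited_rows" (row position mapped to changed columns), "added_rows" (a list of row dicts) and "deleted_rows" (a list of row positions); `apply_editor_changes` and the sample `state` are made up for illustration and use no Streamlit calls.

```python
import pandas as pd


def apply_editor_changes(df: pd.DataFrame, state: dict) -> pd.DataFrame:
    """Replay edits, deletions and additions from an editor-state dict onto a copy of df."""
    result = df.copy()

    # Updates: {row position: {column name: new value}}
    for row_pos, changes in state.get("edited_rows", {}).items():
        for column, value in changes.items():
            result.iloc[int(row_pos), result.columns.get_loc(column)] = value

    # Deletions: positions refer to rows of the original frame.
    deleted = state.get("deleted_rows", [])
    if deleted:
        result = result.drop(result.index[deleted])

    # Additions: each entry is a dict of column name -> value.
    added = state.get("added_rows", [])
    if added:
        new_rows = pd.DataFrame(added, columns=df.columns)
        result = pd.concat([result, new_rows], ignore_index=True)

    return result.reset_index(drop=True)


df = pd.DataFrame(
    [
        {"command": "st.selectbox", "rating": 4, "is_widget": True},
        {"command": "st.balloons", "rating": 5, "is_widget": False},
        {"command": "st.time_input", "rating": 3, "is_widget": True},
    ]
)

# Hypothetical state, shaped like the assumption described above.
state = {
    "edited_rows": {1: {"rating": 2}},
    "added_rows": [{"command": "st.button", "rating": 1, "is_widget": True}],
    "deleted_rows": [0],
}

result = apply_editor_changes(df, state)
print(result)
```

In an app, the dict passed as `state` would presumably be st.session_state["demo_df"]; in most cases the dataframe returned by st.data_editor already reflects these changes, so a helper like this is mainly useful for inspecting or logging what changed.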

In case anyone has a similar problem:
a workaround is to reset the index of the dataframe before giving it to data_editor.

st.session_state.df.reset_index(drop=True, inplace=True)

data_editor expects a dataframe with a “RangeIndex”.
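
A minimal sketch of that workaround, assuming the problem comes from the index that the JSON round trip produces. `ensure_range_index` is a hypothetical helper name; the save/load cycle from the question is simulated in memory instead of writing Save_dataframe.json.

```python
import io
import json

import pandas as pd


def ensure_range_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return df as-is if it already has a RangeIndex, otherwise a copy indexed 0..n-1."""
    if isinstance(df.index, pd.RangeIndex):
        return df
    return df.reset_index(drop=True)


# Simulate saving with json.dump(df.to_dict(), ...) and loading with pd.read_json.
saved = json.dumps(pd.DataFrame({"name": ["Name 1", "Name 2", "Name 3"]}).to_dict())
loaded_df = pd.read_json(io.StringIO(saved))

# Inspect which index type the round trip produced before and after the fix.
print(type(loaded_df.index).__name__)
clean_df = ensure_range_index(loaded_df)
print(type(clean_df.index).__name__)
```

In the app from the question, this would mean storing ensure_range_index(loaded_df) in st.session_state.df before passing it to st.data_editor.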
