Newbie trying to understand data_editor

Hi everyone! Hope you're having a good night. Anyway,
I'm having trouble understanding Streamlit for a use case more complicated than just showing a plot or a dataframe.

Basically, the app receives invoice images that the user uploads manually. Each image goes into an LLM call to GPT-4 vision that returns a JSON object, so I end up with an array of JSON objects. When the image processing ends, a dataframe is shown, but I can't make it editable without the entire app re-rendering. I'm lost in this sea of session state versus cache. What am I doing wrong? Is Streamlit not meant for this use case, even for an app as simple as this? I just want to see my edits reflected without the whole app re-rendering and starting over from the unedited JSON.

I feel I'm almost there but can't find a solution yet. If someone could point out where I should change the code, that would be great.

This is a JSON example:

[
  {
    "date": "2024-02-22",
    "invoice_identifier": "",
    "spend_detail": "ELABORACION PROPIA",
    "payment_method": "Cash",
    "amount": 6780,
    "currency": "ARS",
    "file_name": "IMG_1173.jpg"
  },
  {
    "date": "2024-02-11",
    "invoice_identifier": "",
    "spend_detail": "Coca Cola Pet 1.5 L",
    "payment_method": "Credit",
    "amount": 2200,
    "currency": "ARS",
    "file_name": "IMG_1171.jpg"
  }
]

And here is some code if someone is willing to help!

import json

import pandas as pd
import streamlit as st
from PIL import Image


def load_dataframe(data):
    return pd.DataFrame(data)


def init_uploaded_images_state():
    if 'uploaded_images' not in st.session_state:
        st.session_state.uploaded_images = []


def render_fixed_fund_form():
    init_uploaded_images_state()
    uploaded_files = st.file_uploader(
        "Upload your receipts",
        type=['jpg', 'jpeg'],
        accept_multiple_files=True,
        label_visibility='visible',
    )

    # Display thumbnails of uploaded images
    if uploaded_files:
        st.session_state.uploaded_images = uploaded_files
        cols = st.columns(len(uploaded_files))
        for col, uploaded_file in zip(cols, uploaded_files):
            # Adjust width as needed
            col.image(uploaded_file, caption=uploaded_file.name)

    if st.button("🚀 Process Uploaded Images 🚀"):
        if st.session_state.uploaded_images:
            process_images(st.session_state.uploaded_images)
        else:
            st.warning("Please upload at least one image before processing.")

def display_dataframe(df):
    edited_df = st.data_editor(df, key="my_key", num_rows="dynamic", hide_index=True)
    # Optionally, save the edited DataFrame back to session state if necessary
    st.session_state['processed_data'] = edited_df

    st.divider()
    st.write("Here's the value in Session State:")
    if "my_key" in st.session_state:
        st.write(st.session_state["my_key"])

def process_images(uploaded_images):
    # Only process if there's no processed data already
    if 'processed_data' not in st.session_state:
        with st.spinner("Processing images with AI, please wait... this can take a moment.. or two."):
            json_array = []
            for uploaded_file in uploaded_images:
                pil_image = Image.open(uploaded_file)
                img_base64 = convert_image_to_base64(pil_image)
                response_from_llm = get_json_from_llm(img_base64)
                response_dict = json.loads(response_from_llm)
                response_dict['file_name'] = uploaded_file.name
                json_array.append(response_dict)

            df = pd.DataFrame(json_array)
            st.session_state['processed_data'] = df  # Save processed DataFrame in session state

            st.subheader("JSON:")
            st.json(json_array)
        st.success("Processing complete! 🌟")
    else:
        df = st.session_state['processed_data']  # Retrieve the DataFrame from session state

    # Now, use df for further operations
    display_dataframe(df)

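You need to read and understand the main concept of Streamlit: the whole script reruns from top to bottom on every widget interaction. Editing a cell in the data_editor is such an interaction, and on that rerun st.button returns False, so process_images (and the editor it draws) is skipped.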
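
One way to apply this to the code in the question is to let the button click only store the result in st.session_state and to draw the editor outside the button branch, so it survives the rerun that every edit triggers. What follows is a minimal sketch under those assumptions, not a definitive fix. The names extract_stub, process_once and the edited_data key are hypothetical, and extract_stub merely stands in for the real GPT-4 vision call. The runtime.exists() guard is only there so the helper can be imported and tested outside `streamlit run`.

```python
from typing import Callable, Iterable, MutableMapping

try:
    import pandas as pd
    import streamlit as st
    from streamlit import runtime
    UNDER_STREAMLIT = runtime.exists()  # True only under `streamlit run`
except ImportError:
    pd = st = None
    UNDER_STREAMLIT = False


def extract_stub(file_name: str) -> dict:
    """Hypothetical stand-in for the GPT-4 vision call in the question."""
    return {"file_name": file_name, "amount": 0, "currency": "ARS"}


def process_once(
    state: MutableMapping,
    names: Iterable[str],
    extract: Callable[[str], dict],
) -> list:
    """Run the slow extraction only if no result is stored in `state` yet."""
    if "processed_data" not in state:
        state["processed_data"] = [extract(name) for name in names]
    return state["processed_data"]


def main() -> None:
    uploaded = st.file_uploader(
        "Upload your receipts", type=["jpg", "jpeg"], accept_multiple_files=True
    )

    # st.button is True only during the rerun caused by its own click,
    # so the click just stores the result; nothing is drawn in this branch.
    if st.button("Process uploaded images") and uploaded:
        process_once(st.session_state, [f.name for f in uploaded], extract_stub)

    # Drawn on every rerun once data exists, so editing a cell (which also
    # triggers a rerun) no longer makes the table disappear.
    if "processed_data" in st.session_state:
        df = pd.DataFrame(st.session_state["processed_data"])
        edited = st.data_editor(
            df, key="invoice_editor", num_rows="dynamic", hide_index=True
        )
        # Keep edits under a separate key so the editor's input never changes.
        st.session_state["edited_data"] = edited


if UNDER_STREAMLIT:
    main()
```

Note that the edited frame is written to a different key than the one feeding the editor. Writing it back to processed_data, as display_dataframe does in the question, changes the editor's input on every rerun, which is a likely source of lost or repeated edits.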

Things look brighter for the future: partial (isolated) code reruns have been implemented and are being tested, as the roadmap shows.
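
In later Streamlit releases, partial reruns became available as the st.fragment decorator (first exposed as st.experimental_fragment). The sketch below assumes a release that ships st.fragment; on older versions the attribute does not exist. The rows are taken from the JSON example in the question, while totals_by_currency and the fragment_editor key are hypothetical names used only for illustration.

```python
from collections import defaultdict

try:
    import pandas as pd
    import streamlit as st
    from streamlit import runtime
    UNDER_STREAMLIT = runtime.exists()  # True only under `streamlit run`
except ImportError:
    pd = st = None
    UNDER_STREAMLIT = False

# Two rows from the JSON example in the question.
SAMPLE_ROWS = [
    {"date": "2024-02-22", "spend_detail": "ELABORACION PROPIA",
     "amount": 6780, "currency": "ARS", "file_name": "IMG_1173.jpg"},
    {"date": "2024-02-11", "spend_detail": "Coca Cola Pet 1.5 L",
     "amount": 2200, "currency": "ARS", "file_name": "IMG_1171.jpg"},
]


def totals_by_currency(rows) -> dict:
    """Sum the `amount` field per `currency` (hypothetical helper)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["currency"]] += row["amount"]
    return dict(totals)


def main() -> None:
    # Code outside the fragment runs on full-script reruns only.
    st.title("Invoices")

    @st.fragment
    def invoice_editor() -> None:
        # Interacting with widgets in here reruns just this function.
        edited = st.data_editor(pd.DataFrame(SAMPLE_ROWS), key="fragment_editor")
        st.write(totals_by_currency(edited.to_dict("records")))

    invoice_editor()


if UNDER_STREAMLIT:
    main()
```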

If you put the data_editor inside a form, the user can interact with it without triggering a rerun until the form is submitted.
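
A minimal sketch of the form approach, assuming the processed rows are already available (here, two rows from the JSON example in the question). Widgets inside st.form do not trigger a rerun until st.form_submit_button is pressed. The names changed_rows, saved_rows, invoice_form and form_editor are hypothetical and chosen only for this illustration.

```python
try:
    import pandas as pd
    import streamlit as st
    from streamlit import runtime
    UNDER_STREAMLIT = runtime.exists()  # True only under `streamlit run`
except ImportError:
    pd = st = None
    UNDER_STREAMLIT = False

# Two rows from the JSON example in the question.
SAMPLE_ROWS = [
    {"date": "2024-02-22", "payment_method": "Cash",
     "amount": 6780, "currency": "ARS", "file_name": "IMG_1173.jpg"},
    {"date": "2024-02-11", "payment_method": "Credit",
     "amount": 2200, "currency": "ARS", "file_name": "IMG_1171.jpg"},
]


def changed_rows(original: list, edited: list) -> list:
    """Return edited rows that differ from the original at the same position."""
    return [new for old, new in zip(original, edited) if old != new]


def main() -> None:
    with st.form("invoice_form"):
        # Edits made here are batched; no rerun happens while typing.
        edited = st.data_editor(
            pd.DataFrame(SAMPLE_ROWS), key="form_editor", hide_index=True
        )
        submitted = st.form_submit_button("Save changes")

    if submitted:
        # Runs once per submit, with every pending edit applied.
        st.session_state["saved_rows"] = edited.to_dict("records")
        st.write("Rows you changed:")
        st.write(changed_rows(SAMPLE_ROWS, st.session_state["saved_rows"]))


if UNDER_STREAMLIT:
    main()
```

The trade-off is that nothing outside the form sees the edits until the user presses the submit button, which may or may not suit an app where totals should update live.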


Thank you ferdy! Good to know that this is in development!