Understanding Streamlit data flow and how to submit a form sequentially

Below is a simple reproducible example that illustrates the problem in its simplest form. The problem description is long, so feel free to jump straight to the code and the expected behaviour.

The main concept

There are 3 dataframes stored in a list, and a form on the sidebar shows the supplier_name and po_number from the current dataframe. When the user clicks the Next button, the values in the supplier_name and po_number text_input widgets should be saved (in this example, they are simply printed at the top of the sidebar).

Problem

This app works well when the user doesn’t change anything inside the text_input, but if the user changes something, it breaks. For example, when I change the po_number to somethingrandom, the saved information is not somethingrandom but P123 from the first dataframe.

What’s more, if a field in the next dataframe has the same value as in the current dataframe, the edited text stays in the text_input on the next page. For example, the first and second dataframes both have supplier name S1; if I change the supplier name to S10 and click Next, the supplier_name still shows S10 for the second dataframe, when it should show S1. If the next dataframe has a different supplier name, however, the text_input is updated correctly.

Justification

If you are struggling to understand why I want to do this: the original use case is a sidebar input area that shows information extracted from each PDF. When the user confirms the information is all correct, they click Next to review the next PDF. If something is wrong, they can change the value inside the text_input and then click Next; the changed value should be recorded, and the form should then show the information extracted from the next PDF. I did this quite simply in R Shiny, but I can’t figure out how the data flow works in Streamlit. Please help.

Reproducible Example

import streamlit as st
import pandas as pd

# 3 dataframes that are stored in a list
data1 = {
    "supplier_name": ["S1"],
    "po_number": ["P123"],
}
data2 = {
    "supplier_name": ["S1"],
    "po_number": ["P124"],
}
data3 = {
    "supplier_name": ["S2"],
    "po_number": ["P125"],
}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)

list1 = [df1, df2, df3]

# initiate a page session state, every time next button is clicked
# it will go to the next dataframe in the list
if 'page' not in st.session_state:
    st.session_state.page = 0

def next_page():
    st.sidebar.write(f"Submitted! supplier_name: {supplier_name} po_number: {po_number}")
    st.session_state.page += 1

supplier_name_value = list1[st.session_state.page]["supplier_name"][0]
po_number_value = list1[st.session_state.page]["po_number"][0]

# main area
list1[st.session_state.page]

# sidebar form

with st.sidebar.form("form"):
   supplier_name = st.text_input(label="Supplier Name", value=supplier_name_value)
   po_number = st.text_input(label="PO Number", value=po_number_value)
   next_button = st.form_submit_button("Next", on_click=next_page)

Expected behaviour

The dataframe’s values are loaded into the sidebar input area. The user can change the inputs if they wish, then click Next, and the values in the input areas are saved. The app then moves to the next dataframe, the text inputs are refreshed with that dataframe’s values, and the process repeats.

The main things that your example is missing are:

  1. Keeping track of the data from each page in st.session_state so that it doesn’t get cleared on every interaction
  2. Actually updating that data when the form is submitted

Here is a simplified version of your app, using dictionaries instead of dataframes:

import streamlit as st

# 3 records (plain dicts instead of dataframes) that are stored in a list
data1 = {
    "supplier_name": "S1",
    "po_number": "P123",
}
data2 = {
    "supplier_name": "S1",
    "po_number": "P124",
}
data3 = {
    "supplier_name": "S2",
    "po_number": "P125",
}

if "list1" not in st.session_state:
    st.session_state["list1"] = [data1, data2, data3]

# initialise the page index in session state; every time the Next button
# is clicked, it moves to the next item in the list
if "page" not in st.session_state:
    st.session_state.page = 0

current_item = st.session_state["list1"][st.session_state.page]

# main area
current_item

# sidebar form

with st.sidebar.form("form", clear_on_submit=True):
    supplier_name = st.text_input(
        label="Supplier Name", value=current_item["supplier_name"]
    )
    po_number = st.text_input(label="PO Number", value=current_item["po_number"])
    if st.form_submit_button("Next"):
        item = st.session_state["list1"][st.session_state.page]
        item["supplier_name"] = supplier_name
        item["po_number"] = po_number
        # This is just to make sure you don't go to a page that doesn't exist
        st.session_state.page = (st.session_state.page + 1) % len(
            st.session_state["list1"]
        )
        # Newer Streamlit releases provide st.rerun() as the replacement for this call
        st.experimental_rerun()

st.write("All data:", st.session_state["list1"])
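
One way to picture why the explicit rerun matters in this version: by the time the `if st.form_submit_button("Next"):` branch runs, the main area and the form have already been drawn from the old page. The sketch below is only a toy model in plain Python, not Streamlit code; `simulate`, `run_script` and `PAGES` are made-up names, and the nested dict stands in for `st.session_state`.

```python
# Toy model only: plain Python standing in for one Streamlit script run.
# `simulate`, `run_script` and `PAGES` are made-up names for illustration.
PAGES = ["P123", "P124", "P125"]


def simulate(rerun_after_submit):
    """Simulate clicking Next once; return what ends up drawn on screen."""
    state = {"page": 0}  # stand-in for st.session_state

    def run_script(submitted):
        drawn = PAGES[state["page"]]  # main area and form are drawn first
        if submitted:  # the `if st.form_submit_button("Next"):` branch
            state["page"] += 1  # state changes after the drawing is done
            if rerun_after_submit:
                # stand-in for an explicit rerun: start over from the top
                return run_script(submitted=False)
        return drawn

    return run_script(submitted=True)


print(simulate(rerun_after_submit=False))  # P123: screen is one click behind
print(simulate(rerun_after_submit=True))   # P124: redrawn from the new page
```

In this model the script does run from top to bottom after the click, but the page index only changes at the very end of that run, so without a second pass the screen still shows the previous page.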

After looking further and testing the code, I found the following points.

  1. clear_on_submit=True is the key to refreshing the form.
  2. st.experimental_rerun() is crucial, but I don’t quite understand why it is needed to force the app to rerun from top to bottom. According to the documentation, Streamlit apps have a unique data flow: any time something must be updated on the screen, Streamlit reruns your entire Python script from top to bottom. So why didn’t it rerun from top to bottom, given that the variables being displayed have changed?
  3. Updating the data when the form is submitted is not actually necessary (I didn’t explain this clearly, my bad), because I will save the data to an Excel spreadsheet rather than update the original table. It is interesting, though, that the values do get saved when using if st.form_submit_button, but when I save the data in the function below and assign it to on_click, the new value simply isn’t saved, even though the function saves the value before moving to the next page with st.session_state.page += 1.
def next_page():
    st.session_state.dict1["supplier"] = supplier_name
    st.session_state.dict1["po_number"] = po_number
    st.write(f"{st.session_state.dict1['supplier']} {st.session_state.dict1['po_number']}")
    st.session_state.page += 1

I think the reason your next_page function isn’t working well is that the only reliable way to have access to the latest version of widget values inside a callback is by getting them from session state. You can do this by adding a key to each widget, and then using that key to get them from session state. If you do this, then you can use a callback, and it eliminates the need for experimental_rerun.

def next_page():
    item = st.session_state["list1"][st.session_state.page]
    item["supplier_name"] = st.session_state["supplier_name"]
    item["po_number"] = st.session_state["po_number"]
    st.session_state.page = (st.session_state.page + 1) % len(st.session_state["list1"])


with st.sidebar.form("form", clear_on_submit=True):
    supplier_name = st.text_input(
        label="Supplier Name", value=current_item["supplier_name"], key="supplier_name"
    )
    po_number = st.text_input(
        label="PO Number", value=current_item["po_number"], key="po_number"
    )
    st.form_submit_button("Next", on_click=next_page)

st.write("All data:", st.session_state["list1"])
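
To make the ordering concrete, here is a toy model in plain Python, assuming the order described above: the widget's keyed value is updated in session state, then the callback runs, then the script reruns. It is not Streamlit's real internals, and every name in it (`script_vars`, `click_next`, the two `save_from_*` callbacks) is hypothetical.

```python
# Toy model only: plain Python, not Streamlit's real internals.
# All names here are hypothetical stand-ins.
session_state = {"po_number": "P123"}  # stand-in for st.session_state
script_vars = {}  # variables left behind by the previous script run


def run_script():
    # like: po_number = st.text_input(..., key="po_number")
    script_vars["po_number"] = session_state["po_number"]


def click_next(typed_value, callback):
    session_state["po_number"] = typed_value  # 1. widget state is updated
    saved = callback()                        # 2. the callback runs
    run_script()                              # 3. the script reruns
    return saved


def save_from_variable():
    return script_vars["po_number"]  # still holds the previous run's value


def save_from_session_state():
    return session_state["po_number"]  # holds what the user just typed


run_script()  # initial page load
print(click_next("somethingrandom", save_from_variable))  # P123
print(click_next("edited", save_from_session_state))      # edited
```

The first call reproduces the symptom from the question (the old P123 is saved instead of the edited text), while the second shows why reading the keyed value from session state picks up the edit.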

I see, thank you so much, I have learned a lot today. So the key on the widget basically creates a session state entry for its value, which can be read later. I thought the key was the same thing as the variable name.

I’ve got one last question related to the title. How does st.experimental_rerun() affect @st.cache? Does @st.cache still work as intended, or does the cache get refreshed as well?

Hi @subaruspirit,

Good question – st.experimental_rerun doesn’t affect the cache.

By the way, if you have a recent version of Streamlit, you should now use st.cache_data or st.cache_resource rather than st.cache (see Caching - Streamlit Docs).
