Update Cache value after Pandas DF manipulation

Hi all,
I'm relatively new to this awesome tool and would like to understand whether it is possible to update the cached value with a new value.

Background:
I am developing a tool that loads data from the database and displays it in Ag-Grid, where the user can edit it. When the grid changes, the edited data should overwrite the cached data, so that whenever the page is refreshed, "address_frame" holds the most recent data from the grid rather than the raw data from the database. Finally, the user has the option to save the data to the database once they are done editing.

Where I need support!
I would like to understand how to update the cache once I have the updated dataframe.
Right now, even after I assign the updated dataframe to the variable address_frame, it still contains the cached data that was loaded from the database once the page is refreshed.

Code snippet:

@st.experimental_memo(show_spinner=True, suppress_st_warning=True)
def load_Raw_Address():
    with st.spinner('Loading Cleansed Address Data From Database. This may take a while...'):
        query_raw = 'SELECT * FROM [dbo].[DQ_Raw_Address]'
        address_frame_raw = pd.read_sql(query_raw, con)
    return address_frame_raw

# Assign raw data to address_frame
address_frame = load_Raw_Address()

# Show on AgGrid; the user can manipulate the data on the grid
grid_return = AgGrid(address_frame, gridoptions, editable=True, allow_unsafe_jscode=True, theme='balham', width="100%", height="800px")

# Manipulated data is ultimately assigned to address_frame again.
# NOTE: THIS NEEDS TO REPLACE PREVIOUS CACHED VALUE WITH NEW ONE
address_frame = grid_return['data']
            

Actual behavior:

Currently, if I manipulate the data in the grid and refresh the page, the cache still holds the raw data from the DB, and that is what gets loaded.

This sounds like a good use of st.session_state: you could check whether "address_frame" exists in the session state, and if it doesn't, load the data from the cached function.

Something like this:

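import pandas as pd
import streamlit as st
from st_aggrid import AgGrid

# `con` (database connection) and `gridoptions` are assumed to be defined elsewhere in your script.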


@st.experimental_memo(show_spinner=True, suppress_st_warning=True)
def load_Raw_Address():
    with st.spinner(
        "Loading Cleansed Address Data From Database. This may take a while..."
    ):
        query_raw = "SELECT * FROM [dbo].[DQ_Raw_Address]"
        address_frame_raw = pd.read_sql(query_raw, con)
    return address_frame_raw


# Assign raw data to address_frame
if "address_frame" not in st.session_state:
    address_frame = load_Raw_Address()
    st.session_state["address_frame"] = address_frame

address_frame = st.session_state["address_frame"] 

# Show on AgGrid, user can manipulate the data on the grid
grid_return = AgGrid(
    address_frame,
    gridoptions,
    editable=True,
    allow_unsafe_jscode=True,
    theme="balham",
    width="100%",
    height="800px",
)

# Store the edited grid data in session state so that it replaces the
# previously stored value and survives reruns.
st.session_state["address_frame"] = grid_return["data"]
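In case it helps, here is a rough illustration of why reassigning address_frame in your original script had no effect on the cache. It uses functools.lru_cache purely as a stand-in for st.experimental_memo, and load_rows and its data are hypothetical. The two differ in details (I believe the Streamlit cache hands back a copy of the stored value), but the point about rebinding a variable is the same:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_rows():
    # Hypothetical stand-in for the database query.
    return ("raw row 1", "raw row 2")


rows = load_rows()         # first call runs the function and caches the result
rows = ("edited row 1",)   # rebinding the name does not touch the cache

# A later call (think: the next rerun or page refresh) is served from the
# cache, so it still returns the raw data.
reloaded = load_rows()
```

The cache only knows about what the function returned, so the edited data has to live somewhere else between reruns, which is what st.session_state is for.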

Hi blackary,

Thank you for the quick reply. The solution was very helpful in understanding the whole concept.
I implemented your code but now face an issue that I need help with.

This is what I observe when the script runs for the first time (pretty good so far).

As soon as I change a cell in AgGrid (replace it with a new value), the code reruns and the following output can be seen.

Observe that after changing the value, the session state is updated with the new value in the grid below, but not in the grid above (using the same code that you provided). If I change the value again, the code reruns and the following output can be seen.

Observe that the value changed in the previous iteration only shows up in this iteration.

What is the best possible way to tackle this scenario?
I hope I made myself clear.

Many Thanks :slight_smile:

Is there an onChange() method available for AgGrid that can update the value in the session state before rerunning?
I believe this is the missing piece to this issue.

I'm not sure if there's an onChange method with AgGrid, but something like this might work:

if not grid_return["data"].equals(st.session_state["address_frame"]):
    st.session_state["address_frame"] = grid_return["data"]
    st.experimental_rerun()
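If you end up needing this in more than one place, the same check can be wrapped in a small helper. This is only a sketch: store_if_changed is a made-up name, and it assumes the state object behaves like a dict (as st.session_state does) and that the stored value has an .equals() method (as a pandas DataFrame does). It also handles the case where the key has not been set yet:

```python
from typing import Any, MutableMapping


def store_if_changed(state: MutableMapping[str, Any], key: str, new_value: Any) -> bool:
    """Write new_value to state[key] only if it differs from what is stored.

    Returns True when the state was updated, False when nothing changed.
    """
    if key in state and state[key].equals(new_value):
        return False
    state[key] = new_value
    return True


# Assumed usage inside the Streamlit script (not executed here):
# if store_if_changed(st.session_state, "address_frame", grid_return["data"]):
#     st.experimental_rerun()
```

Only rerunning when something actually changed is what keeps this from looping forever. One thing to watch for: DataFrame.equals also compares column dtypes, so if the grid hands back columns with different dtypes than the ones you stored, the check would report a change on every run. I have not verified whether that happens in your setup.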
