Cache a dataframe ; @st.cache

vishweshchaubal · February 7, 2022, 4:49am

I’m new to streamlit and have tried my best to look for a solution before posting here.

What I want to do is simple: I load a pandas dataframe from a .csv file and cache it with the @st.cache decorator. I then want to predict a class with a predefined classification model (RandomForest, XGBoost); essentially, a prediction column is added to the original dataframe and the result is stored in a new variable. However, I am having issues caching this new dataframe.

import pandas as pd
import numpy as np
from xgboost import XGBClassifier
import streamlit as st

def main():
    st.set_page_config(layout="wide")
    st.title('Classification Problem on Home Equity dataset')

    # Load prediction data
    @st.cache
    def load_predict():
        data = pd.read_csv("hmeq_Predict_2.csv")  # Currently on my local machine
        return data
    df_predict = load_predict()

    # Predict on data
    @st.cache
    def predictor_func():
        y_pred_nd = pd.Series(model.predict(df_predict), name='BAD')
        Predicted_X = pd.concat([df_predict, y_pred_nd], axis=1)
        # This is the dataframe that I want to cache
        return Predicted_X

    # Run XGBoost classification. I have loaded X_train and y_train also, not shown in this example
    if classifier == "XGBoost":
        if st.sidebar.button("Run Classification", key="Classification"):
            model = XGBClassifier()
            model.fit(X_train, y_train)
            # I want this function to return the cached dataframe.
            Predicted_X = predictor_func()
            # This correctly displays the dataframe, meaning that predictor_func() ran correctly
            st.write(Predicted_X)

    # However, I want to display the dataframe, Predicted_X, only when I click this button
    if st.sidebar.button("Run Prediction on new Data", key="Prediction"):
        st.subheader('Check last column for prediction.')
        st.write(Predicted_X)

if __name__ == '__main__':
    main()

This is the error I get:

Am I missing a key concept here?
Also, is there a way to cache a model from sklearn?
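
One likely explanation, offered as a sketch rather than a confirmed diagnosis since the error message itself is not shown above: Streamlit reruns the whole script on every widget interaction, and st.button returns True only during the rerun triggered by its own click. When the second button is clicked, the first branch is skipped, so Predicted_X is never assigned in that run. The sketch below keeps the result in st.session_state, which persists across reruns within a browser session. The remember helper, the app function, and the tiny placeholder dataframe and predictions are hypothetical stand-ins, not the original data or model.

```python
from typing import Any, Callable, MutableMapping


def remember(state: MutableMapping, key: str, compute: Callable[[], Any]) -> Any:
    """Return state[key], computing and storing it on first use only."""
    if key not in state:
        state[key] = compute()
    return state[key]


def app() -> None:
    # Imported lazily so the helper above can be exercised without Streamlit.
    import pandas as pd
    import streamlit as st

    # Hypothetical stand-in for the real prediction data.
    df_predict = pd.DataFrame({"LOAN": [1100, 1300, 1500]})

    def predict_frame() -> pd.DataFrame:
        # Placeholder values; in the real app this would be model.predict(df_predict).
        y_pred = pd.Series([0, 1, 0], name="BAD")
        return pd.concat([df_predict, y_pred], axis=1)

    if st.sidebar.button("Run Classification", key="Classification"):
        # Stored in session_state, so it survives the rerun caused by the next click.
        remember(st.session_state, "Predicted_X", predict_frame)

    if st.sidebar.button("Run Prediction on new Data", key="Prediction"):
        if "Predicted_X" in st.session_state:
            st.subheader("Check last column for prediction.")
            st.write(st.session_state["Predicted_X"])
        else:
            st.warning("Run the classification first.")


# When launching with `streamlit run`, call app() at module level:
# app()
```

The same pattern could hold the fitted model under another key. Recent Streamlit versions also provide st.cache_resource for objects such as models, though which caching decorators are available depends on the installed version.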

system · February 7, 2023, 4:50am

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
