I’m new to Streamlit and have tried my best to look for a solution before posting here.
What I want to do is simple: I load a .csv file into a pandas DataFrame and cache it with the @st.cache decorator. I then want to predict a class with a predefined classification model (RandomForest, XGBoost); essentially, the predictions are added as a new column to the original DataFrame and the result is stored in a new variable. However, I am having issues caching this new DataFrame.
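Stripped of Streamlit, the step I mean can be sketched roughly as follows. This is only a minimal sketch, not my actual app: `StubModel` and `add_predictions` are hypothetical names, the toy column names and values are made up, and the stub always predicts 0 in place of a fitted RandomForest or XGBoost classifier.

```python
import pandas as pd


class StubModel:
    """Hypothetical stand-in for a fitted classifier; always predicts class 0."""

    def predict(self, features: pd.DataFrame) -> list:
        return [0] * len(features)


def add_predictions(model, features: pd.DataFrame, name: str = "BAD") -> pd.DataFrame:
    """Return a new DataFrame: `features` plus one column of predictions."""
    # Reuse the input index so the new column lines up row by row.
    predictions = pd.Series(model.predict(features), index=features.index, name=name)
    return pd.concat([features, predictions], axis=1)


# Toy data; the column names and values are illustrative only.
df_predict = pd.DataFrame({"LOAN": [1100, 1300], "VALUE": [39025.0, 68400.0]})
predicted = add_predictions(StubModel(), df_predict)
```

Passing the model and the data in as arguments, rather than reading them from globals, keeps the function's inputs explicit; the original `df_predict` is left unchanged.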
import pandas as pd
import numpy as np
from xgboost import XGBClassifier
import streamlit as st

def main():
    st.set_page_config(layout="wide")
    st.title('Classification Problem on Home Equity dataset')

if __name__ == '__main__':
    main()

# Load prediction data
@st.cache
def load_predict():
    data = pd.read_csv("hmeq_Predict_2.csv")  # Currently on my local machine
    return data

df_predict = load_predict()

# Predict on data
@st.cache
def predictor_func():
    y_pred_nd = pd.Series(model.predict(df_predict), name='BAD')
    Predicted_X = pd.concat([df_predict, y_pred_nd], axis=1)  # This is the DataFrame that I want to cache
    return Predicted_X

# Run XGBoost classification. I have loaded X_train and y_train also, not shown in this example
if classifier == "XGBoost":
    if st.sidebar.button("Run Classification", key="Classification"):
        model = XGBClassifier()
        model.fit(X_train, y_train)
        # I want this function to return the cached DataFrame.
        Predicted_X = predictor_func()
        # This command will correctly display the DataFrame, meaning that predictor_func() ran correctly
        st.write(Predicted_X)

    # However, I want to display the DataFrame, Predicted_X, only when I click this button
    if st.sidebar.button("Run Prediction on new Data", key="Prediction"):
        st.subheader('Check last column for prediction. ')
        st.write(Predicted_X)
This is the error I get:
Am I missing a key concept here?
Also, is there a way to cache a model from sklearn?