How can I fix a groupby.transform problem in Streamlit?

Summary

The group-mean fill is applied to train but not to test and val. It works fine in a Jupyter notebook, but in Streamlit no values are filled in for test and val. Why does this happen?

Steps to reproduce

Code snippet:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3], 'name': ['A','A', 'B','B','B','B', 'C','C','C']})

df2 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3], 'name': ['A','A', 'B','B','B','B', 'C','C','C']})


[ df1 ]
  name  value
0    A      1
1    A    NaN
2    B    NaN
3    B      2
4    B      3
5    B      1
6    C      3
7    C    NaN
8    C      3

[ df2 ]
  name  value
0    A      1
1    A    NaN
2    B    NaN
3    B      2
4    B      3
5    B      1
6    C      3
7    C    NaN
8    C      3

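# groupby_columns is not defined in the original snippet; ['name'] is assumed here
groupby_columns = ['name']
numeric_only_columns = df1.select_dtypes(exclude=['object', 'datetime']).columns.to_list()

for i in numeric_only_columns:
    # Fill NaNs in both frames with the per-group means computed from df1
    df1[i] = df1[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))
    df2[i] = df2[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))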



In the notebook, the line df2[i] = df2[i].fillna(df1.groupby(groupby_columns)[i].transform('mean')) works: the NaN values in df2 are filled.
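One possible explanation for why this works in the notebook: transform('mean') returns a Series that carries df1's index, and fillna with a Series matches on index labels, not on group membership. Since df1 and df2 both use index 0 to 8, every NaN finds a partner. The sketch below shows what may be happening in the app, assuming the Streamlit test frame has index labels that do not occur in train (for example after a random split). That assumption is a guess and is not confirmed by the post; the small frames here are made up for illustration.

```python
import numpy as np
import pandas as pd

train = pd.DataFrame({'name': ['A', 'A', 'B', 'B'], 'value': [1.0, 3.0, 2.0, 4.0]})
# Hypothetical test frame whose index labels (10, 11) do not occur in train.
test = pd.DataFrame({'name': ['A', 'B'], 'value': [np.nan, np.nan]}, index=[10, 11])

# transform('mean') returns a Series indexed like train (labels 0..3).
train_means = train.groupby('name')['value'].transform('mean')

# fillna with a Series aligns on index labels, so labels 10 and 11 find no match.
filled = test['value'].fillna(train_means)
print(filled.isna().sum())  # number of NaNs that were left unfilled
```

If the indexes of test and val do happen to overlap with train, values would be filled, but from whichever train row shares the label rather than from the matching group.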


My Streamlit code:

fill_na_columns = st.selectbox("select method?", ('mean', 'min', 'max', 'median'))
groupby_columns = st.multiselect("Groupby columns select", train.columns.to_list(), default=train.columns.to_list()[0])

if fill_na_columns == 'mean':
    st.dataframe(train.groupby(groupby_columns).mean(numeric_only=True))
    numeric_only_columns = train.select_dtypes(exclude=['object', 'datetime']).columns.to_list()

    for i in numeric_only_columns:
        train[i] = train[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
        test[i] = test[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
        val[i] = val[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))


Expected behavior:

The NaN values in test and val should be filled with the group means computed from train.
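One way to get this behavior without depending on matching index labels is to compute the group means on train once and then look them up by group key for each row of the other frame. The following is a minimal sketch, not a confirmed fix for the app: the helper name fill_from_train_group_means and the example frames are hypothetical, and it assumes the value columns do not overlap with the groupby columns.

```python
import numpy as np
import pandas as pd

def fill_from_train_group_means(train, other, groupby_columns, value_columns):
    """Fill NaNs in `other` with per-group means computed on `train`."""
    # One row per group, with the group keys as ordinary columns.
    means = train.groupby(groupby_columns)[value_columns].mean().reset_index()
    # A left merge keeps the rows of `other` in their original order.
    merged = other[groupby_columns].merge(means, on=groupby_columns, how='left')
    # merge resets the index, so restore the labels of `other` before aligning.
    merged.index = other.index
    result = other.copy()
    for col in value_columns:
        result[col] = result[col].fillna(merged[col])
    return result

train = pd.DataFrame({'name': ['A', 'A', 'B', 'B'], 'value': [1.0, 3.0, 2.0, 4.0]})
test = pd.DataFrame({'name': ['A', 'B', 'C'], 'value': [np.nan, 5.0, np.nan]},
                    index=[10, 11, 12])

filled_test = fill_from_train_group_means(train, test, ['name'], ['value'])
print(filled_test)
```

Groups that appear in test or val but not in train (group C in this example) have no train mean, so their NaN values stay as NaN and would need a separate fallback.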

Debug info

  • Streamlit version: 1.23.1
  • Python version: 3.9.12
  • Using Conda
  • OS version: Windows

Requirements file


Links

  • Link to your GitHub repo:
  • Link to your deployed app:

Additional information

All three DataFrames (train, test, val) have the same columns.

Hi @bigdatanigel1513

Could you elaborate on the error that you’re facing? The post only mentions that the fill is “applied to train but not test and val”.