How can I fix a Streamlit groupby.transform error?

Summary

The fill is applied to train but not to test and val. It works fine in a Jupyter notebook, but in Streamlit no values are filled in for test and val. Why does this happen?

Steps to reproduce

Code snippet:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3], 'name': ['A','A', 'B','B','B','B', 'C','C','C']})

df2 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3], 'name': ['A','A', 'B','B','B','B', 'C','C','C']})


[ df1 ]
  name  value
0    A      1
1    A    NaN
2    B    NaN
3    B      2
4    B      3
5    B      1
6    C      3
7    C    NaN
8    C      3

[ df2 ]
  name  value
0    A      1
1    A    NaN
2    B    NaN
3    B      2
4    B      3
5    B      1
6    C      3
7    C    NaN
8    C      3

numeric_only_columns = df1.select_dtypes(exclude=['object', 'datetime']).columns.to_list()
groupby_columns = ['name']  # assumed here; the original snippet does not define it

for i in numeric_only_columns:
    # Fill NaNs with the per-group mean computed on df1
    df1[i] = df1[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))
    df2[i] = df2[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))



In the Jupyter notebook, the df2 line works as expected:

df2[i] = df2[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))
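A possible explanation, offered as an assumption rather than a confirmed diagnosis: transform('mean') returns a Series indexed like df1, and fillna with a Series matches values by index label, not by group. In the notebook example df1 and df2 share the index 0..8, so every NaN finds a value. The sketch below reproduces that and then shifts the index of a copy (df2_shifted is a hypothetical stand-in for a frame whose row labels differ from train's) to show the fill silently doing nothing.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3],
                    'name': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C']})
df2 = df1.copy()

# transform returns one group mean per row, indexed like df1 (labels 0..8).
train_means = df1.groupby('name')['value'].transform('mean')

# Same index as df1: each NaN is filled from the value at its own label.
filled_same_index = df2['value'].fillna(train_means)

# Hypothetical frame with different row labels: no label matches train_means.
df2_shifted = df2.copy()
df2_shifted.index = df2.index + 100
filled_shifted = df2_shifted['value'].fillna(train_means)

print(filled_same_index.isna().sum())  # 0
print(filled_shifted.isna().sum())     # 3
```

If test and val in the Streamlit app carry row labels that do not occur in train (for example after a random split), the same thing would happen there. Comparing train.index with test.index is a quick way to check.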


My Streamlit code:

fill_na_columns = st.selectbox("select method?", ('mean', 'min', 'max', 'median'))
groupby_columns = st.multiselect("Groupby columns select", train.columns.to_list(), default=train.columns.to_list()[0])

if fill_na_columns == 'mean':
    st.dataframe(train.groupby(groupby_columns).mean(numeric_only=True))
    numeric_only_columns = train.select_dtypes(exclude=['object', 'datetime']).columns.to_list()

    for i in numeric_only_columns:
        train[i] = train[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
        test[i] = test[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
        val[i] = val[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))


Expected behavior:

The NaN values in test and val should be filled with the group means computed from train.
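One way to get this behavior regardless of row labels is to compute one mean per group on train and look it up by group key. This is a minimal sketch, not a confirmed fix for the app: fill_with_train_group_means and the _group_mean column are hypothetical names, and the small train and test frames are made-up illustration data.

```python
import numpy as np
import pandas as pd

def fill_with_train_group_means(train, other, groupby_columns, column):
    """Fill NaNs in other[column] with the per-group mean of train[column]."""
    # One mean per group, computed on train only.
    group_means = (train.groupby(groupby_columns)[column]
                        .mean()
                        .rename('_group_mean')
                        .reset_index())
    # Left-merge on the group keys so each row of `other` gets its group's mean.
    merged = other[groupby_columns].merge(group_means, on=groupby_columns, how='left')
    # merge returns a fresh 0..n-1 index; restore other's labels before fillna.
    merged.index = other.index
    return other[column].fillna(merged['_group_mean'])

# Illustration data: test has different row labels and a group unseen in train.
train = pd.DataFrame({'name': ['A', 'A', 'B', 'B'],
                      'value': [1.0, np.nan, 2.0, 4.0]})
test = pd.DataFrame({'name': ['B', 'A', 'C'],
                     'value': [np.nan, np.nan, np.nan]},
                    index=[10, 11, 12])

result = fill_with_train_group_means(train, test, ['name'], 'value')
print(result)
```

Rows whose group does not appear in train (group C here) stay NaN, since train has no mean to offer for them. In the loop above, the call would look like test[i] = fill_with_train_group_means(train, test, groupby_columns, i).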

Debug info

  • Streamlit version: 1.23.1
  • Python version: 3.9.12
  • Using Conda
  • OS version: Windows

Additional information

All three DataFrames (train, test, val) have the same columns.

Hi @bigdatanigel1513

Could you elaborate on the error that you’re facing? The post mentions that the fill is “applied to train but not to test and val”.
