Summary
Group-wise mean imputation is applied to train but not to test and val. The same code works fine in a Jupyter notebook, but in Streamlit no values are filled in for test and val. Why does this happen?
Steps to reproduce
Code snippet:
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3], 'name': ['A','A', 'B','B','B','B', 'C','C','C']})
df2 = pd.DataFrame({'value': [1, np.nan, np.nan, 2, 3, 1, 3, np.nan, 3], 'name': ['A','A', 'B','B','B','B', 'C','C','C']})
[ df1 ]
  name value
0    A     1
1    A   NaN
2    B   NaN
3    B     2
4    B     3
5    B     1
6    C     3
7    C   NaN
8    C     3
[ df2 ]
  name value
0    A     1
1    A   NaN
2    B   NaN
3    B     2
4    B     3
5    B     1
6    C     3
7    C   NaN
8    C     3
groupby_columns = ['name']  # assumed; not defined in the original notebook snippet
numeric_only_columns = df1.select_dtypes(exclude = ['object', 'datetime']).columns.to_list()
for i in numeric_only_columns:
    # Fill NaNs with the per-group mean computed on df1
    df1[i] = df1[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))
    df2[i] = df2[i].fillna(df1.groupby(groupby_columns)[i].transform('mean'))
# In Jupyter this works: df2 is filled with the group means from df1.
My Streamlit code:
fill_na_columns = st.selectbox("select method?",('mean','min','max','median'))
groupby_columns = st.multiselect("Groupby columns select", train.columns.to_list(), default=train.columns.to_list()[0])
if fill_na_columns == 'mean':
    st.dataframe(train.groupby(groupby_columns).mean(numeric_only=True))
    numeric_only_columns = train.select_dtypes(exclude = ['object', 'datetime']).columns.to_list()
    for i in numeric_only_columns:
        train[i] = train[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
        test[i] = test[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
        val[i] = val[i].fillna(train.groupby(groupby_columns)[i].transform('mean'))
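
One possible explanation, offered as a guess rather than a confirmed diagnosis: `fillna` with a Series aligns on index labels, and `transform('mean')` returns a Series indexed like train. In the notebook example df1 and df2 share the index 0..8, so the fill lines up. If test and val carry different index labels (for example after a train/test split), nothing matches. The small frames below are made up to illustrate this and are not the data from the app:

```python
import numpy as np
import pandas as pd

# Hypothetical data: train and test share columns but not index labels.
train = pd.DataFrame(
    {"name": ["A", "A", "B", "B"], "value": [1.0, 3.0, 2.0, np.nan]},
    index=[0, 1, 2, 3],
)
test = pd.DataFrame(
    {"name": ["A", "B"], "value": [np.nan, np.nan]},
    index=[10, 11],
)

# transform('mean') returns a Series indexed like train (labels 0..3).
train_means = train.groupby("name")["value"].transform("mean")

# fillna with a Series aligns on index labels; labels 10 and 11 do not
# exist in train_means, so no value is filled.
filled_by_index = test["value"].fillna(train_means)
print(filled_by_index.isna().sum())
```

If this prints a non-zero count of remaining NaNs for data shaped like the app's, index misalignment is a likely cause; comparing `train.index` with `test.index` in the Streamlit app would confirm or rule it out.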
Expected behavior:
Missing values in test and val should be filled with the group means computed from train.
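
A minimal sketch of one way to get this behavior, assuming the goal is to look up each row's group mean by its group key rather than by row index. The helper name `fill_from_train_means` and the sample frames are hypothetical and not part of the original app:

```python
import numpy as np
import pandas as pd

def fill_from_train_means(train, other, groupby_columns, value_columns):
    """Fill NaNs in `other` with per-group means computed on `train`."""
    # One row per group, one column per numeric feature.
    means = train.groupby(groupby_columns)[value_columns].mean().reset_index()
    # Left-merge on the group keys so each row of `other` gets its group's
    # mean. The merge result has a fresh RangeIndex, so restore other's index.
    looked_up = other[groupby_columns].merge(means, on=groupby_columns, how="left")
    looked_up.index = other.index
    filled = other.copy()
    for col in value_columns:
        filled[col] = filled[col].fillna(looked_up[col])
    return filled

# Hypothetical data with a non-overlapping index in test.
train = pd.DataFrame({"name": ["A", "A", "B", "B"], "value": [1.0, 3.0, 4.0, np.nan]})
test = pd.DataFrame({"name": ["B", "A", "C"], "value": [np.nan, np.nan, 5.0]},
                    index=[10, 11, 12])

test_filled = fill_from_train_means(train, test, ["name"], ["value"])
print(test_filled)
```

Groups that appear in `other` but not in `train` have no mean to look up, so their missing values stay NaN. The same helper could be called once for test and once for val.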
Debug info
- Streamlit version: 1.23.1
- Python version: 3.9.12
- Using Conda
- OS version: Windows
Links
- Link to your GitHub repo:
- Link to your deployed app:
Additional information
All DataFrames (train, test, val) have the same columns.