Why are my dataframe transformations not getting stacked?

If you’re creating a debugging post, please include the following info:

  1. Running the app locally
  2. I intend to upload a CSV file and apply several data-cleaning transformations, one after another, to the dataframe that gets loaded and rendered.
  3. My Create dataframe logic
  4. I mostly used “Drop” so here is my “Drop” Logic:
  5. Save CSV to “intermediate” folder logic

ISSUE: The rendered dataframe keeps getting reset to its default (freshly uploaded) state. I don't see how that is even possible: the “Drop” logic in clean2.py sends me back a dataframe read from the “intermediate” folder, which is what I render, and that CSV is changed irreversibly by every transformation. I am very confused. I tried using session state and caching, and neither helped.

What exactly am I doing wrong here?

Streamlit Code:
import streamlit as st
import pandas as pd
import clean2
from clean2 import handle_nonnumeric_missing_vals_drop
from clean2 import handle_nonnumeric_missing_vals_fill

st.set_page_config(page_title="Hackathon", layout="centered", initial_sidebar_state="collapsed")

st.header("Clean your data with out intuitive UI")

if "uploaded_file" not in st.session_state:
    st.session_state.uploaded_file=None
if "col_names" not in st.session_state:
    st.session_state.col_names = None
if "handle_non_num_btn" not in st.session_state:
    st.session_state.handle_non_num_btn = None
if "dataframe" not in st.session_state:
    df=None
if "form_inputCsv" not in st.session_state:
    st.session_state.form_inputCsv=None
if "form_cleanNonNumeric" not in st.session_state:
    st.session_state.form_cleanNonNumeric = False

def handle_submission():
    """
    Handle the submission logic when the button is clicked.
    """
    col_names = st.session_state.col_names
    action = st.session_state.handle_non_num_btn
    df=st.session_state.dataframe

    if action == "Drop":
        df, message, status = clean2.handle_nonnumeric_missing_vals_drop(df, col_names)
    else:
        df, message, status = clean2.handle_nonnumeric_missing_vals_fill(df, col_names)

    if status == "succ":
        st.success(message)
        st.session_state.df=df
        st.dataframe(df)
    else:
        st.error(message)

# UPLOAD FILE

st.session_state.uploaded_file = st.file_uploader("Choose a CSV file", type="csv")
if st.session_state.uploaded_file is not None:
    # CREATE DATAFRAME
    df,message,stat=clean2.createDataFrame(st.session_state.uploaded_file.name)
    st.text("Uploaded data: ")
    if stat=="succ":
        st.success(message)
        st.dataframe(df.head(20), width=800, height=600)
        st.session_state.dataframe=df
    else:
        st.error(message)

# Streamlit form for handleNonNumericValues

with st.form(key="handle_nonNum",clear_on_submit=True):
    st.subheader("Handle Missing Non-Numeric Values by Dropping or Filling")

    # Dropdown for fill or drop
    action=st.selectbox("Chose action",["Drop","Fill"])

    # Text field for column names
    col_names=st.text_input("Enter column names to check for null and handle")

    # Update session state

    
    
    # Submit btn
    if st.form_submit_button(label="Submit"):
        st.session_state.col_names=col_names
        st.session_state.handle_non_num_btn=action
        handle_submission()
        st.session_state.form_cleanNonNumeric = True

clean2.py contains the createDataFrame, saveDataframeToCSV and “Drop” logic; screenshots of the relevant parts of that file are attached.
Please help me out here.

Hi,
Please copy and paste the code as text; a screenshot is a real pain to debug!

What I see in “handle_nonnumeric_missing_vals_drop” is that you have df = pd.read_csv('intermediate/clean.csv'), so the function returns whatever was read back from that file. Instead of saving and immediately reloading the dataframe, you should return the modified dataframe directly.
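A minimal sketch of that suggestion, assuming the drop step only needs to remove rows that have missing values in the named columns. The real body of “handle_nonnumeric_missing_vals_drop” was only posted as a screenshot, so the function below (drop_missing, a hypothetical name), its save_path parameter, and the "fail" status are assumptions; only the (df, message, status) return shape and the "succ" status come from the posted script.

```python
import pandas as pd


def drop_missing(df, col_names, save_path=None):
    """Drop rows with nulls in the given comma-separated columns.

    Hypothetical stand-in for clean2.handle_nonnumeric_missing_vals_drop.
    Returns (dataframe, message, status), like the calls in the posted script.
    """
    columns = [c.strip() for c in col_names.split(",") if c.strip()]
    if not columns:
        return df, "No column names given", "fail"

    unknown = [c for c in columns if c not in df.columns]
    if unknown:
        # Hand back the input unchanged so the caller's state is not clobbered.
        return df, f"Unknown columns: {unknown}", "fail"

    cleaned = df.dropna(subset=columns).reset_index(drop=True)

    if save_path is not None:
        # Saving is optional bookkeeping; the result is NOT re-read from disk.
        cleaned.to_csv(save_path, index=False)

    dropped = len(df) - len(cleaned)
    return cleaned, f"Dropped {dropped} rows", "succ"


raw = pd.DataFrame({"city": ["Oslo", None, "Rome"], "tag": ["a", "b", None]})
step1, msg1, status1 = drop_missing(raw, "city")
step2, msg2, status2 = drop_missing(step1, "tag")  # works on step1, so changes stack
```

Because each call receives the previous result and returns a new dataframe, the transformations stack without depending on what happens to be in the intermediate CSV.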

What is “clean2” in “clean2.handle_nonnumeric_missing_vals_fill” ?

To debug, inspect the dataframe at each step, before and after saving and before and after each function call, using st.write("Current dataframe:", df). That should show you where the problem occurs 🙂
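One way to follow that advice without launching the app is to simulate Streamlit's reruns with a plain dict standing in for st.session_state. The sketch below rests on two things the posted script suggests but that are not confirmed in this thread: the upload block runs again on every rerun and overwrites the stored dataframe, and handle_submission reads st.session_state.dataframe but writes its result to st.session_state.df. The helper names (load_once, apply_drop, rerun) and the loaded_file_id key are hypothetical.

```python
import io

import pandas as pd

CSV_TEXT = "city,tag\nOslo,a\n,b\nRome,\n"


def read_upload():
    """Stand-in for reading the uploaded CSV."""
    return pd.read_csv(io.StringIO(CSV_TEXT))


def load_once(state, file_id, read_fn):
    """Read the upload only when a different file arrives, not on every rerun."""
    if "loaded_file_id" not in state or state["loaded_file_id"] != file_id:
        state["dataframe"] = read_fn()
        state["loaded_file_id"] = file_id
    return state["dataframe"]


def apply_drop(state, columns):
    """Transform the stored dataframe and write it back under the SAME key."""
    state["dataframe"] = state["dataframe"].dropna(subset=columns)
    return state["dataframe"]


def rerun(state, columns=None):
    """One simulated script run: upload block first, then an optional form submit."""
    df = load_once(state, ("data.csv", len(CSV_TEXT)), read_upload)
    if columns:
        df = apply_drop(state, columns)
    # Trace every run, like st.write("Current dataframe:", df) in the app.
    print(f"rows after this run: {len(df)}")
    return df


state = {}                     # stands in for st.session_state
rerun(state)                   # initial upload
rerun(state, ["city"])         # first transformation
final = rerun(state, ["tag"])  # second transformation stacks on the first
```

In the app, state would be st.session_state, the file id could be built from the uploaded file's name and size, and read_fn could call pd.read_csv on the uploaded file object. If the guard in load_once is removed, every run reloads the original data, which matches the reset described in the question.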
