Issue with caching data uploaded via the file uploader

Hi Guys! (sorry me again, I’m on a roll with questions today! :stuck_out_tongue:)

I’m trying to upload tabular data from the file uploader.

Despite using the cache decorator @st.cache the way Andfanilo described it here, each time I move anything (sliders, multiselect, etc.; see video below), the data seems to be constantly re-uploading, which makes the app completely unusable with any CSV larger than 3 MB.

Here’s the Python code I’m using:

import pandas as pd
import streamlit as st

@st.cache(allow_output_mutation=True)
def load_data(file):
    # Read only the first 50 rows of the uploaded CSV
    df = pd.read_csv(file, encoding='utf-8', nrows=50)
    return df

uploaded_file = st.file_uploader("Choose a CSV file", type="csv", key='file_uploader')

I’m not sure if that’s expected - I’m pretty sure I’m doing something suboptimal here! :slight_smile:

Thanks in advance!

Charly

Hi @Charly_Wargnier,
I am experiencing the same issue. Did you manage to resolve this issue at your end?
Thanks in advance!

Hey @mzeidhassan!

Try this:

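import pandas as pd
import streamlit as st

st.markdown('### **1️⃣ Upload CSV file 👇 **')

@st.cache(allow_output_mutation=True)
def load_data(file):
    df = pd.read_csv(file, encoding='utf-8', nrows=50)
    # Column names specific to my file; adjust them to your own CSV
    df.columns = ['url', 'redir']
    return df

uploaded_file = st.file_uploader("", type="csv", key='file_uploader')

# file_uploader returns None until a file has been uploaded
if uploaded_file is not None:
    df = load_data(uploaded_file)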

Let me know if that works for you! :slight_smile:

Charly

Thanks @Charly_Wargnier for your prompt reply. I will give it a try. Thanks a million!

Hello!
I am having the exact same issue with my app - it doesn’t seem to cache the uploaded file when I do st.write() - it reloads the table on every single change, which isn’t feasible.
I tried the code snippet from here as well, but no luck.
Any idea?

Here is my code snippet:

import pandas as pd
import streamlit as st

uploaded_file = st.sidebar.file_uploader(
    label="Upload your CSV or Excel file. (200MB max)",
    type=['csv', 'xlsx'])

@st.cache
def load_data(filename):
    """Function for loading data"""
    # Note: pd.read_csv only handles CSV; an .xlsx upload would need pd.read_excel
    data = pd.read_csv(filename)

    numeric_df = data.select_dtypes(['float','int'])
    numeric_cols = numeric_df.columns

    text_df = data.select_dtypes(['object'])
    text_cols = text_df.columns

    return data, numeric_cols, text_cols

# file_uploader returns None until a file has been uploaded
if uploaded_file is not None:
    df, numeric_columns, non_numeric_columns = load_data(uploaded_file)
    st.write(df)

This triggers a reload every time I change something. If I remove st.write(df), it does seem to cache the data…so not sure about this one.
Any help is appreciated!

I had the same problem: caching my dataframes wasn’t working with the file_uploader.

It worked when I used @st.experimental_memo instead of @st.cache
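
For anyone landing here later, here is a rough sketch of the general idea behind making the cache hit across reruns: key the cache on the file's raw bytes rather than on the file object itself. This is plain Python for illustration only, not how Streamlit implements its caching. `FakeUpload`, `parse_csv` and `parse_calls` are made-up names, and I'm assuming the uploaded file exposes a BytesIO-style `getvalue()` method.

```python
import csv
import functools
import io

# Counts how often the expensive parse really runs (illustration only).
parse_calls = 0


@functools.lru_cache(maxsize=8)
def parse_csv(raw: bytes, nrows: int = 50):
    """Parse CSV bytes into a tuple of rows; the cache key is the bytes themselves."""
    global parse_calls
    parse_calls += 1
    reader = csv.reader(io.StringIO(raw.decode("utf-8")))
    rows = []
    for i, row in enumerate(reader):
        if i > nrows:  # keep the header plus at most nrows data rows
            break
        rows.append(tuple(row))
    return tuple(rows)


class FakeUpload:
    """Hypothetical stand-in for an uploaded file: a fresh object on every rerun."""

    def __init__(self, data: bytes):
        self._data = data

    def getvalue(self) -> bytes:
        return self._data


data = b"url,redir\nhttps://a.example,https://b.example\n"

# Simulate three script reruns, each one handing over a different file object.
for _ in range(3):
    upload = FakeUpload(data)
    table = parse_csv(upload.getvalue())
```

Even though each simulated rerun creates a new `FakeUpload`, the parse only runs once because the bytes are identical. In a real app the same pattern would be to pass `uploaded_file.getvalue()` into the cached function, though I haven't checked whether that is needed with every Streamlit version.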
