File Upload Limitation?

asehmi · November 20, 2021, 12:00pm

The file is uploaded into the app, not saved on the server per se. It exists in memory as a file-like object (a BytesIO subclass), which you can then use to create and save a file on disk. See the API docs for st.file_uploader.
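
To make the "buffer first, file second" idea concrete, here is a minimal sketch that writes an in-memory buffer to disk. The helper name `save_buffer` and the sample bytes are made up for illustration; a plain `io.BytesIO` stands in for the object that `st.file_uploader` returns, on the assumption that it exposes `getvalue()` like any BytesIO.

```python
import io
import os
import tempfile


def save_buffer(buffer, folder, filename):
    """Write the bytes held in an in-memory buffer to folder/filename."""
    # Create the target folder if needed (hypothetical helper, not Streamlit API)
    os.makedirs(folder, exist_ok=True)
    # basename() drops any directory components from the client-supplied name
    target = os.path.join(folder, os.path.basename(filename))
    with open(target, "wb") as f:
        f.write(buffer.getvalue())
    return target


# Stand-in for an uploaded file; in an app this would come from st.file_uploader
fake_upload = io.BytesIO(b"a,b\n1,2\n")
saved_path = save_buffer(fake_upload, tempfile.mkdtemp(), "example.csv")
```

Writing the raw bytes keeps the file exactly as it was uploaded, whereas the app below round-trips it through a DataFrame before saving.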

app.py

import os
import time
from random import randint
import streamlit as st
from data import csv_to_df, excel_to_df

# Important: This folder must exist, so create it if it is missing
SAVE_PATH = os.path.join(os.getcwd(), 'uploads')
os.makedirs(SAVE_PATH, exist_ok=True)

state = st.session_state

if 'FILE_UPLOADER_KEY' not in state:
    state.FILE_UPLOADER_KEY = str(randint(1000,9999))

st.markdown('## \U0001F4C2 Upload data files')
st.write('Upload one or more Excel or CSV data files. Duplicate files will be ignored.')
excel_files = st.file_uploader('', type=['xlsx', 'csv'], accept_multiple_files=True, key=state.FILE_UPLOADER_KEY)
save = st.checkbox(f'Save files in {SAVE_PATH}?')
if len(excel_files) > 0 and st.button('\U00002716 Clear all'):
    state.FILE_UPLOADER_KEY = str(randint(1000,9999))
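    # Changing the key above forces a fresh, empty uploader on the next run.
    # Newer Streamlit releases provide st.rerun() for this call.
    st.experimental_rerun()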

# This will remove duplicate files
excel_files_dict = {}
for excel_file in excel_files:
    excel_files_dict[excel_file.name] = excel_file

message = st.empty()

for _, excel_file in excel_files_dict.items():
    message.info(f'Loading {excel_file.name}...')
    # The reported MIME type for CSV files varies by browser and platform
    if excel_file.type in ['text/csv', 'application/vnd.ms-excel', 'application/octet-stream']:
        df = csv_to_df(excel_file)
        if save:
            message.info(f'Saving: {os.path.join(SAVE_PATH, excel_file.name)}')
            # index=False avoids writing the DataFrame index as an extra column
            df.to_csv(os.path.join(SAVE_PATH, excel_file.name), index=False)
    else: # 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
        df = excel_to_df(excel_file)
        if save:
            message.info(f'Saving: {os.path.join(SAVE_PATH, excel_file.name)}')
            # index=False avoids writing the DataFrame index as an extra column
            df.to_excel(os.path.join(SAVE_PATH, excel_file.name), index=False)

    st.subheader(excel_file.name)
    st.dataframe(df)
    if save:
        message.info('Your files have been saved.')
    else:
        message.info('Upload complete.')
    time.sleep(2)
    message.write('')
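
Because the MIME type a browser reports for the same file can differ between platforms, routing on the file extension is one possible alternative to the type check in the loop above. The sketch below is only an illustration; `detect_kind` and the two extension sets are hypothetical names, not part of Streamlit or pandas.

```python
import os

# Extensions accepted by the uploader in app.py (assumed from its type=[...] list)
CSV_EXTENSIONS = {".csv"}
EXCEL_EXTENSIONS = {".xlsx"}


def detect_kind(filename):
    """Return 'csv' or 'excel' based on the file extension, ignoring case."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in CSV_EXTENSIONS:
        return "csv"
    if ext in EXCEL_EXTENSIONS:
        return "excel"
    raise ValueError(f"Unsupported file type: {filename}")


kinds = [detect_kind(name) for name in ("sales.csv", "Report.XLSX")]
```

In the app, the result could decide whether to call csv_to_df or excel_to_df for each uploaded file, using excel_file.name as the input.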

data.py

import streamlit as st
import pandas as pd

# Newer Streamlit releases replace st.experimental_memo with st.cache_data
@st.experimental_memo(persist='disk')
def csv_to_df(excel_file):
    df = pd.read_csv(excel_file)
    return df

@st.experimental_memo(persist='disk')
def excel_to_df(excel_file):
    # https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html
    # New in Pandas version 1.3.0.
    #   The engine xlrd now only supports old-style .xls files. When engine=None, the following logic will be used to determine the engine:
    #   If path_or_buffer is an OpenDocument format (.odf, .ods, .odt), then odf will be used.
    #   Otherwise if path_or_buffer is an xls format, xlrd will be used.
    #   Otherwise if path_or_buffer is in xlsb format, pyxlsb will be used.
    #   Otherwise openpyxl will be used.
    #
    # To select the engine explicitly, pass its name as a string:
    # df = pd.read_excel(excel_file, engine='openpyxl')
    #
    # Therefore there is no need to provide "engine" when passing a "path_or_buffer"
    df = pd.read_excel(excel_file)
    return df

This solution has been adapted from code I wrote here.