How to read a folder full of excel files

I am familiar with Pandas but would like to automate my work using Streamlit. I have played around with it but can’t seem to find out how to read a folder full of many excel files at once like the way I do it in pandas. I would really appreciate if you anyone can guide me in this.

Below is my code in pandas.

import sys
import os
import functools
import pandas as pd
from functools import reduce


path = "./City_Occupancy/KBS/KBS Journey Reports_November"
files = [file for file in os.listdir(path) if not file.startswith('.')] # Ignore hidden files

all_months_data = pd.DataFrame()

for file in files:
    current_data = pd.read_excel(path+"/"+file, index_col=None,skiprows=7,parse_dates = True)
    current_data = current_data.dropna(how='all')
    current_data['filename'] = os.path.basename(file)
    all_months_data = pd.concat([all_months_data, current_data])

Hey @Cyubahiro_Patrick,

Welcome to the Streamlit community! :tada: :tada: :tada: :tada:

So this code is your original working code you used to import a whole directory of excel files. But when you add some streamlit calls into this it no longer works?

Can you post the code that you most recently tried and we can try working on a solution that might work together?

Thanks,
Marisa

# File Selection Drop Down
import pandas as pd
import streamlit as st
import os
from typing import Dict


@st.cache(allow_output_mutation=True)
def get_static_store() -> Dict:
    """This dictionary is initialized once and can be used to store the files uploaded"""
    return {}


# This is the code I use in pandas that reads all files in a folder
def file_selector(folder_path):


    files = [file for file in os.listdir(folder_path) if not file.startswith('.')] # Ignore hidden files

    all_months_data = pd.DataFrame()

    for file in files:
        current_data = pd.read_excel(folder_path+"/"+file, index_col=None,skiprows=8,parse_dates = True)
        current_data = current_data.dropna(how='all')
        current_data.drop(current_data.tail(1).index,inplace=True)
        

        all_months_data = pd.concat([all_months_data, current_data])
    
    return all_months_data
    
def main():

    folderPath = st.text_input('Enter Abys bus monthly data folder path:')
    #I want to pass a folder path with many excel files that are read
    files = file_selector(folderPath)
    df = files[['Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 28', 'Unnamed: 29']]
    df.rename({'Unnamed: 5':'PlateNumber','Unnamed: 6':'Line Code','Unnamed: 7':'Start Date ','Unnamed: 28':'Total Pieces ','Unnamed: 29':'Total RWF'}, axis=1, inplace=True)
    #df['month_year'] = pd.to_datetime(df['Start Date ']).dt.to_period('M')
    st.write(df)
    
    
    folderPath = st.text_input('Enter the occupany data :')
    files = file_selector(folderPath)
    df = files[['Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 28', 'Unnamed: 29']]
    df.rename({'Unnamed: 5':'PlateNumber','Unnamed: 6':'Line Code','Unnamed: 7':'Start Date ','Unnamed: 28':'Total Pieces ','Unnamed: 29':'Total RWF'}, axis=1, inplace=True)
    #df['month_year'] = pd.to_datetime(df['Start Date ']).dt.to_period('M')
    st.write(df)

main()

The code to read in the folder actually work and appends the files when I pass the folder path but I get some errors before passing the folder path because it’s empty by default. I get FileNotFoundError: [WinError 3] The system cannot find the path specified: ‘’
Traceback:
How do I stop this by default?

https://drive.google.com/drive/folders/1AuYG3HeorZg5Om-s8o1ihgMdf4dvET-U?usp=sharing

My code has been flagged but above it the folder to the work files and screenshots that I used for the streamlit code.

In case they don’t unflag my code, I will share it again.

Thank you

Hey @Cyubahiro_Patrick,

Thanks for linking your google drive with some test excel files, they were very helpful for me to test the code you sent.

So I have this working locally:
Firstly, in your def main() you need to add in a couple logic statements so that if the user has not put in anything in your input fields, the app does not try to run with those empty. Those are both errors that you have screenshotted in your drive:

I have fixed this like so:

def main():

    folderPath = st.text_input('Enter Abys bus monthly data folder path:')
    #I want to pass a folder path with many excel files that are read
    if len(folderPath) != 0:
        files = file_selector(folderPath)
        df0 = files[['Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 28', 'Unnamed: 29']]
        df0.rename({'Unnamed: 5':'PlateNumber','Unnamed: 6':'Line Code','Unnamed: 7':'Start Date ','Unnamed: 28':'Total Pieces ','Unnamed: 29':'Total RWF'}, axis=1, inplace=True)
        #df['month_year'] = pd.to_datetime(df['Start Date ']).dt.to_period('M')
        st.write(df0)


    occPath = st.text_input('Enter the occupany data :')
    if len(occPath) != 0:
        files = file_selector(occPath)
        df1 = files[['Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7', 'Unnamed: 28', 'Unnamed: 29']]
        df1.rename({'Unnamed: 5':'PlateNumber','Unnamed: 6':'Line Code','Unnamed: 7':'Start Date ','Unnamed: 28':'Total Pieces ','Unnamed: 29':'Total RWF'}, axis=1, inplace=True)
        #df['month_year'] = pd.to_datetime(df['Start Date ']).dt.to_period('M')
        st.write(df1)

I was also more specific about the 2 paths your taking in and your data frames, giving them unique names. Because Streamlit runs your script from top to bottom each time you interact with a widget on the page, if you are overwriting your dataframes and paths you could run into trouble.

Once this was working fine, I actually ran into another problem, because I didn’t have xlrd installed. So i was able to install that following info from this link:
pip install xlrd OR conda install -c anaconda xlrd

And then it runs and the data loads. Now I am not able to enter any info into the second text_input because I am not sure what data your looking for here. But here is the app working up to this spot on my local computer:

Hope this helps!
Marisa
P.S. Sorry about the flag, the system can be too overzealous when people copy paste code in!

You’re the absolute best Marisa. For that reason I am now determined to learn streamlit all credit to you and thanks for the quick turnaround when resolving all my issues. Thank you

2 Likes

No problem @Cyubahiro_Patrick! :smiling_face_with_three_hearts:

Happy Streamlit-ing!!! :grin:
Marisa

@Marisa_Smith, I have now finalized my app and hosted it on Heroku but seems like for the work files I worked with, I have to upload them. The file upload doesn’t work with a local file path. Do you know of any way I can work a local file path like
“C:\Users\pbuchana\City_Occupancy\KBS\KBS Jouney Reports_December”

https://streamlit-excel-app.herokuapp.com/ (You can use this link to test)

Or I have to upload all the files on Heroku as I have seen on other forums.

Hey @Cyubahiro_Patrick,

I’m not sure about this one as I haven’t hosted any apps on Heroku (I usually use Streamlit Sharing)

BUTTTT @Charly_Wargnier does do Heroku deployments a lot and I think he might be able to lend a hand on this!

@Charly_Wargnier any thoughts?

Marisa

Hi @cyubahiro_patrick :wave:

Looks like you’ll need to replace your local path with your Heroku/github path.

Hope that makes sense?

Charly

1 Like

Wow!!! Just checked out the Streamlit Sharing, it looks dope. I will look into it. Thanks

1 Like

Thank you @Charly_Wargnier and nice to meet you. I will there was a way to upload straight from the local path.

1 Like

Petty issue here!! Is it just me or does the slider just not work on the table display in streamlit.
The red part works but slider in Yellow doesn’t move, how do you do it?