How can I load and feed data from multiple excel/csv sheets into the mutliselect function?

Rushmore · March 5, 2021, 8:23am

Context

In my web-app, I want my users to select different data categories and upon selecting this data category, it will load the data from its respective csv into the mutliselect function. The below demonstrates this:

So upon selection of social and Economy, load the respective data/columns from their separate excel sheets. Though I am lost on how to do this exactly.

I initially tried a for loop but it ended up creating many different multiselect columns. But this is not what I wanted. I just want the capability to load data from ‘economy’ if I click economy and if I select both ‘social’ and ‘economy’ that both datasets load. They are stored in different excel sheets.

My code as of now:

if choose_theme_category == 'Country specific':
        st.write("Choose data unique to a country for this theme.")
        Country_choice = st.beta_columns(3)
        countries = ['','United Kingdom', 'Ireland', 'Germany'] 
        Country = Country_choice[0].selectbox("Country", countries)          

        if Country == 'United Kingdom':
            # Category of data
            Category = st.beta_columns(4)
            Data_category = Category[0].checkbox("Social") 
            Data_category1 = Category[1].checkbox("Economy") 
            Data_category2 = Category[2].checkbox("Environment") 
            Data_category3 = Category[3].checkbox("Health") 
            Data_category4 = Category[0].checkbox("Policy") 
            
           # categories = [Data_category,Data_category1,Data_category2,Data_category3,Data_category4]
        
        data_mix_buttons = st.beta_columns([3,1,1])
        # confirmation_data = data_mix_buttons[0].button("Show Data")
        Hide_data = data_mix_buttons[2].checkbox("Hide Data")
        # if hide data is not pressed, then show data
        if not Hide_data:
            # for every selection chosen, add data from their respective excel sheets
            if Data_category:
                Select_data = st.multiselect("Choose columns", options=COVID_19_cols,key=1)
                Social_data = COVID_19_data[Select_data]
                if not Select_data:
                     st.error("Please select at least one column.")
                else:
                    st.write(Social_data)

Is this possible?

asehmi · March 5, 2021, 2:19pm

I would combine all the Excel file data up-front (cacheing it) and then dynamically generate the filter values from the distinct data column values. This has the advantage that the filter values being displayed are then specific to the selections being made (i.e. cross-filtering). I do something like this in the st app I released here. It’s also advisable due to the execution model of st apps, to put settings like this in the sidebar, to help preserve filter selections between reruns of the app script.

Rushmore · March 6, 2021, 2:47am

Thanks for your response. Is this advisable even if the data does not have the same index? For me the index has a range of dates but the range of dates are different. Is this still feasible?

asehmi · March 6, 2021, 8:12am

Yes, if the original data tables (excel files) have variable time series periods as columns then unpivot those columns into two columns, e.g. Year, Value. Then with all tables in the same shape you can concatenate them all together. Also add an extra column to identify the source and any other columns that would help you slice by series and prevent unwanted aggregations across unrelated series. This is known as building a columnar data table representation. The trick is using the pandas melt function.

If your data is already unpivoted (sounds like it is), then simply add the additional columns you need to help with the sifting. If the years column are the index then you need to reindex so the years exist as a separate column and the tables append will work as expected.

Rushmore · March 10, 2021, 4:21am

Thanks, will give it a try today

Rushmore · March 10, 2021, 7:59am

I am a bit confused. This is what my dataframe looks like based on what you described. Please let me know if I am mistaken:

When I used the melt function, I first started by orienting the data around the country column as this will be the first column users will select to filter the data. But this created a dataframe with country and two additional columns. But not quite sure how it works? It made duplicates of the columns which I did not need and took away the contents in those columns. Not entirely sure how this works.

This is my original data:

I will keep playing around with it though I may not have fully understood what you indicated. Following from the data, could you perhaps elaborate?

Thank you

asehmi · March 10, 2021, 10:53am

You want to take that one step further and reshape it so it has columns: Country, Category, Series, Data, Date, Value.

This code should work:

def reshape_data(data: pd.DataFrame):

    # UNPIVOT DATA

    # !!change this regex so it finds your date columns!!
    date_cols_regex = "^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.]((19|20)\\d\\d)$"

    value_vars = list(data.filter(regex=(data_cols_regex)).columns)
    # unpivot the dataframe on all years 
    data_unpivoted = pd.melt(data, id_vars=['Country', 'Category', 'Series', 'Data'], \
         value_vars=value_vars, var_name='Date', value_name='Value') \
         .astype({'Year':'datetime','Value':'float64'})
    data_unpivoted = data_unpivoted.sort_values(['Country', 'Category', 'Series', 'Data','Date'])
    data_unpivoted.loc[:,'Value'].fillna(method='bfill', inplace = True)

    st.dataframe(data_unpivoted.head(100))

Your Streamlit selectors can then be auto-populated based on filtering any of these columns [‘Country’, ‘Category’, ‘Series’, ‘Data’,‘Date’].

For example, in my app, I’m dynamically computing how to populate selection lists of available Indicators, and Measurements, based on a selected Country.

Rushmore · March 13, 2021, 4:38pm

Thanks for this, will give it a try later today. Sorry about the late response, have been going back and forth between work and this. Appreciate all the help!

Rushmore · March 14, 2021, 8:22am

Thanks for this, with some minor adjustments I was able to get it to work. I was able to create a new dataframe based on the new table using pivot to show results. I appreciate your help!

Topic		Replies	Views
Filtering data with pandas Using Streamlit pandas	5	3835	January 8, 2023
Iterate over the mulstiselect widget and display dataframe for each item Using Streamlit	2	1249	April 27, 2023
Streamlit Dataframe Filtering Using Streamlit	5	3134	February 2, 2023
Select an item from multiselect on the sidebar Using Streamlit multiselect	6	18601	November 17, 2021
Multiselect and Download button reload the page Using Streamlit session-state , multiselect , button	2	524	August 3, 2024

How can I load and feed data from multiple excel/csv sheets into the mutliselect function?

Related topics

Hello there 👋🏻

Cookie settings

Strictly necessary cookies

Performance cookies

Functional cookies

Targeting cookies