Dynamic multiselect options tied to presence in a list/dataframe

Hi everyone. I had some fun today making streamlit’s multiselect options behave together in a co-dependent manner. This post here isn’t an issue but rather an observation of how to get something extra out of the system (with scope for change in the future? Or at least a guide on something other people may find interesting).

I’ve set up a system where a user selects an option from one multiselect box and, based upon their selection, the options for a second multiselect will be changed according to values in a pre-existing data structure.

Let’s say we have a dataframe called df:

    | 'A' | 'B' | 'C' |
1     4     6     1
2     6     4     2
3     6     9     3

And I set up some columns:
col1, col2, col3 = st.beta_columns(3)

Right now, if I want to use a dependent chain of multiselects whose options depended on what I chose for a previous multiselect, here’s what I can do:

with col1:
    option1 = st.multiselect("Option 1", list(set(df.iloc[:, 0])))
with col2:
    option2 = st.multiselect("Option 2", list(set(df.loc[(df.iloc[:, 0].isin(option1))][df.columns[1]])))
with col3:
    option3 = st.multiselect("Option 3", list(set(df.loc[(df.iloc[:, 0].isin(option1)) & (df.iloc[:, 1].isin(option2))][df.columns[2]])))
user_choice = list(product(*[option1, option2, option3]))

If the user chooses 6 for option 1, option2 will present options 4 and 9, whereas if the user chooses 4 for option 1, option2 will present only 6. So far so good. Now image you have six chained dependent options, each requiring more and more lines of code to whittle down the choices of the dataframe (because option3 would need to do an iloc of both option 1 and 2, and so on). Gets a bit heavy with 6 in total, but it gets the job done.

Note: invalid combinations may be present in user_choice, so they need to be parsed manually.

I’m wondering if there would be a possible cleaner solution to this issue? Incorporating linked keys so that multiselect widgets could operate in a hierarchy is a potential option (and potential nightmare). Or maybe this is more an issue of cleaner coding in python to not abuse the limitations of streamlit?

Seems to be a bit of interest in this topic elsewhere

1 Like

Hey @Cells,

First, Welcome to the Streamlit Community! :tada: :tada: :tada: :partying_face: :tada: :tada: :tada:

This solution is the same as I would have implemented for this, in fact I made a similar solution for a user who was looking to filter the whole dataframe (slightly different but linking here just in case it’s helpful!)

It does begin to get long if your looking to filter many options, but I haven’t yet come up with a more clever/cleaner solution!

Happy Streamlit-ing!
Marisa

1 Like

Hi Marisa, thanks for the welcome!

I like the simplicity of the other solution too; seems like this is the way to go for chaining together options between selectboxes or multiselects.

Looking forward to using streamlit more in the future

2 Likes

How to achieve the very same functionality in the reversed way?
Let’s assume two multiselects A & B, B is below A.
How to update the options for multiselect A based on selections made in multiselect B.
@Marisa_Smith the code you shared, the chaining works only in a downward fashion…it doesn’t work in an upward direction. Or maybe I am missing something.

Thanks.

Hi @ganesh_singh,

First, welcome to the Streamlit community! :tada: :star2: :partying_face: :tada: :tada: :partying_face: :star2:

You are right, this chaining only works in a downward fashion. There may be some clever way to use pandas and Streamlit to create this kind of functionality, but I suspect that it requires session state to do it properly.

There is a bit of a hack for session state on this github link: A possible design for doing per-session persistent state in Streamlit · GitHub

However, this is a much-anticipated update that our engineers are working on now to implement! The exact date for the launch of this new feature we aren’t sure yet. But I believe it’s scheduled to come out in Q2!!

Happy Streamlit-ing!
Marisa

Hi @Marisa_Smith, I am exactly trying to implement it with SessionState but with no luck. Just wondering if you have any sample code for this workaround of two way chaining multiselect? Thanks.

Great work on Streamlit by the way! Coming from a huge fan.

Hey @jasonchanhku,

First welcome to the Streamlit community!!! :partying_face: :partying_face: :nerd_face: :tada: :tada:

I personally don’t have any sample code for using session state with this particular example. But what sort of time frame are you needing to complete this by? :thinking:

If this is something that’s time-sensitive and needs a solution right away I would make a separate new post that links your GitHub or code and outlines what you have tried! I’m sure with the full brainpower :brain: of the community behind you a solution can be discovered!

Happy Streamlit-ing!
Marisa

pssst… can you keep a secret?
secret

Secret

The Streamlit team has been working on a version of session state that doesn’t require this hack, it will be built right into Streamlit’s core! :speak_no_evil:

AND its due to be released in early Q3, which is sooooooo exciting! :star_struck: So depending on what sort of project timeline your looking at you may be able to use this update!

Hey @Marisa_Smith thanks for the response, really appreciate it! I was using it for a stock outage forecast use case at work. I have already clarified with the users that the filters are single way at the moment and they are fine with it. Hence, I resorted in single way filters. Glad to hear a potential solution is coming later on! It is very nice of you to offer to help but please do not feel pressured or obligated! Below is the sample code I attempted by with no avail. I basically tried updating the list in sesh by using OrderedDict(). In short, I am trying to dynamically update location and work_order without the user having to worry about filtering order. Thanks again for the response and I look forward to future updates! Keep up the great work!


import SessionState
import streamlit as st
import pandas as pd
import numpy as np
from utils.download import download_button
from collections import OrderedDict
from streamlit.script_request_queue import RerunData
from streamlit.script_runner import RerunException



@st.cache
def load_data(cols=['item', 'location', 'work_order']):
    
    df = pd.read_csv('../data/stock_data.csv', encoding= 'unicode_escape')
    
    df.columns = ['item', 'func_location', 'fl', 'work_order']

    df['location'] = df['func_location'].str.split('-').str[2]
    
    df =df[~df['location'].isna()]
    
    return df[cols]

def filter_data(df, locations, work_orders):
    
    if len(locations) == 0:
        locations = list(df['location'].unique())
    
    if len(work_orders) == 0:
        work_orders = list(df['work_order'].unique())
        
    return df.query("location==@locations").query("work_order==@work_orders")
    
    
def aggregate_data(my_df):
    
    my_df = my_df.groupby(['location', 'work_order', 'item']).size().to_frame('count').reset_index()
    my_df['percentage'] = round(100 * my_df['count'] / my_df.groupby('location')['count'].transform('sum'), 2)
    
    return my_df


def app():
    
    sesh = SessionState.get(rerun=False, prepro_steps = OrderedDict([('loc_options', []), ('wo_options', [])]))

    data = load_data()
    
    st.title('Stock Outage Filtering')

    st.subheader('Filter Locations')
    print(sesh.prepro_steps)
    if len(sesh.prepro_steps['wo_options']) ==0:
    
        selected_loc = st.multiselect(
            'Select locations: ',
            data['location'].unique(),
            help="This is for locations extracted")
        
        sesh.prepro_steps['loc_options'] = selected_loc
        
    else:
        
        my_list = sesh.prepro_steps['wo_options']
        selected_loc = st.multiselect(
            'Select locations: ',
          data.query(f"work_order in @my_list")['location'].unique(),
            help="This is for locations extracted")
        
        sesh.prepro_steps['loc_options'] = selected_loc
    
    #if st.checkbox('Show raw data'):
    st.subheader('Filter Work order')
    
    if len(sesh.prepro_steps['loc_options']) == 0:
        
        last_wo = sesh.prepro_steps['wo_options']
            
        selected_wo = st.multiselect(
        'Select work order: ',
        data['work_order'].unique(),
        help="This is for work order extracted")
        sesh.prepro_steps['wo_options'] = selected_wo
        
        if last_wo != sesh.prepro_steps['wo_options']:
            raise RerunException(RerunData(widget_states=None))
        
        
    else:
        my_list = sesh.prepro_steps['loc_options']
        selected_wo = st.multiselect(
        'Select work order: ',
        data.query(f"location in @my_list")['work_order'].unique(),
        help="This is for work order extracted")
        sesh.prepro_steps['wo_options'] = selected_wo
    
    st.subheader('Filter Items')
    
    st.subheader('Filtered Dataset')
    
    filtered = filter_data(data, selected_loc, selected_wo)
    
    aggregated_df = aggregate_data(filtered)
    
    st.write(aggregated_df)
    
    tmp_download_link = download_button(aggregated_df, "filtered_data.csv", "Download CSV", pickle_it=False)
    st.markdown(tmp_download_link, unsafe_allow_html=True)



1 Like