Trying to use st.multiselect() but not getting desired results

Hi everyone,
I am trying to build a data app that uses a multiselect to choose which data to visualize. I want to show a grouped bar chart with three parameters for the clients selected in the multiselect. However, no matter which client I select, the graph follows the order of the original data, i.e. even if I select the 7th client in the multiselect, I still get the graph for the first row in the data frame.
Here’s the code:

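import pandas as pd
import plotly.graph_objects as go
import streamlit as st

DATA_URL = 'C:\\Users\\DELL\\Documents\\soj_data.xlsx'

#@st.cache(persist =True)
def load_data():
    # read the Excel sheet and return it as a dataframe
    data = pd.read_excel(DATA_URL)
    return data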


data = load_data()

st.markdown('### Client Selection, Offers and Joinings')
clients= data['Client']
clients1=clients.to_list()
options=st.multiselect('Client List',clients1)
st.write(data)

selections=data['selections'] 
offers=data['offers']
joinings=data['joinings']

fig1 = go.Figure()
fig1.add_trace(go.Bar(
    x=options,
    y=selections,
    name='Selections',
    marker_color='indianred'
))
fig1.add_trace(go.Bar(
   x=options,
   y=offers,
   name='Offers',
    marker_color='lightsalmon'
))
fig1.add_trace(go.Bar(
   x=options,
   y=joinings,
   name='joinings',
    marker_color='indianred'
))

# Here we modify the tickangle of the xaxis, resulting in rotated labels.
fig1.update_layout(barmode='group', xaxis_tickangle=-45)
st.plotly_chart(fig1)



Can someone please help me out with this?
Thanks

Hello @ira_bajpai,

From what I can see, you’re not slicing the original data based on the options chosen in the multiselect. Using data I have, I made some corrections to your code so that it filters the original data as desired.

data = load_data()

st.markdown("### Client Selection, Offers and Joinings")
events = data["event"]
# in the line below, the 'unique()' needs to be specified 
#     to prevent duplicates being shown in the multiselect
events_list = events.copy().unique().tolist()[-10:]  # showing the last 10 events only
options = st.multiselect("Events List", events_list)

st.write(options)

if not options:
    filtered_data = data.copy()
else:
    filtered_data = data.loc[data["event"].isin(options)]

st.dataframe(filtered_data, width=1000, height=500)

# from this point, only the filtered data is used 
comments = filtered_data["comments"]
duration = filtered_data["duration"]
views = filtered_data["views"]

[Screenshot: full data]

[Screenshot: filtered data]

I believe modifying your code as above will get you the desired results.
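
For reference, here is a minimal sketch of the same filtering applied to the column names from the question (Client, selections, offers, joinings). The sample rows, the filter_clients helper and the hard-coded options list are made up for illustration; in the app, options would come from st.multiselect.

```python
import pandas as pd

# Made-up sample rows; only the column names come from the question.
data = pd.DataFrame({
    "Client": ["A", "B", "C", "D"],
    "selections": [10, 20, 30, 40],
    "offers": [8, 15, 25, 33],
    "joinings": [5, 12, 20, 30],
})


def filter_clients(frame, options):
    """Return only the rows whose Client is in options (all rows if none chosen)."""
    if not options:
        return frame.copy()
    return frame.loc[frame["Client"].isin(options)]


# Stand-in for the list returned by st.multiselect.
options = ["D", "B"]
filtered = filter_clients(data, options)

# x and y now come from the same rows, so they line up one-to-one.
x = filtered["Client"].tolist()
selections = filtered["selections"].tolist()
offers = filtered["offers"].tolist()
joinings = filtered["joinings"].tolist()

print(x, selections, offers, joinings)
```

In the plotting code, these lists would take the place of x=options and the unfiltered y columns in each go.Bar call. Note that the rows come back in the order of the data frame, not in the order the clients were picked.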

Cheers.

Hi @Outsiders17711 ,

Thank you, it is working fine now. I am still not able to understand exactly what’s happening here. Can you please explain?

Hello @ira_bajpai, I’m glad I was able to help. I will do my best to explain:

Your aim is to filter the original data based on specific clients in the data['Client'] column. Then you plot the selected clients against the ['selections'], ['offers'] and ['joinings'] columns.

There were three mistakes in your code:

  1. You didn’t filter the original data based on the selected options. The lines below might as well not be in your code, as they had no effect on the data being plotted.
clients= data['Client']
clients1=clients.to_list()
options=st.multiselect('Client List',clients1)
  2. In the subsequent lines, you still referred to the original data containing all the clients. Instead, you should refer to a smaller dataframe that contains only the clients chosen in the multiselect. So again, the multiselect has no effect on the values being plotted.
selections=data['selections'] 
offers=data['offers']
joinings=data['joinings']
  3. Finally, when plotting, you passed in x=options, which is not compatible with y=selections, y=offers and y=joinings. Those y values cover every client in the original data, so there are more of them than there are clients in options, and the chosen clients are not matched with their own rows.

In summary, if you intend to work with a smaller subset of a dataframe, you need to first create a new dataframe containing that subset and perform any subsequent operations on the new, smaller dataframe.

You did not create a smaller, subset dataframe and you kept referring to the full data which means the multiselect has no effect on your program. I hope that helps.
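
To make the mismatch concrete, here is a small sketch with made-up values. It assumes the chart matches x and y values by position, which is consistent with the behaviour described in the question; zip is used here only to imitate that positional pairing.

```python
# Made-up values standing in for the full, unfiltered columns.
clients = ["A", "B", "C", "D", "E", "F", "G"]
selections = [10, 20, 30, 40, 50, 60, 70]

# Suppose the 7th client is picked in the multiselect.
options = ["G"]

# Pairing by position, as zip does, matches "G" with the first row's value.
paired_by_position = list(zip(options, selections))

# Looking the value up by client name gives the row that was actually chosen.
by_name = dict(zip(clients, selections))
paired_by_name = [(name, by_name[name]) for name in options]

print(paired_by_position)  # [('G', 10)]
print(paired_by_name)      # [('G', 70)]
```

Filtering the dataframe first has the same effect as the lookup by name: the labels and the values are taken from the same rows.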

Cheers.

@Outsiders17711 This is very detailed. Thank you so much for explaining so well.
Cheers :slight_smile: