My app has sliders in the left sidebar that allow the user to narrow down the range of various columns, then defines a filt based on a long string of these sliders &'d together.
My problem arises when I show a multiselect box listing the Journal Names that meet that filter. If the user selects a Journal Name that comes after a row removed by the filter, I get a KeyError.
In this example, the whole dataframe has 7 rows, of which indexes 3, 4, and 6 were filtered out. The multiselect box then shows the four remaining titles as options to choose. If I pick one of the first three, everything is fine. If I pick the fourth option (“Theoretical Computer Science”, corresponding to index 5), I get this error.
I have investigated the .loc operator in pandas and was able to make a working example in a Jupyter Notebook, so I think the problem traces back to the multiselect widget.
selected_titles = st.multiselect('Journal Name:', pd.Series(df.loc[filt, 'title']), help='Displayed in order provided by the underlying datafile')
Try resetting the index of the dataframe so in your case it’d go from 0 - 3.
But what happens when the user changes the filter, or adds more requirements? I’d have to reset the index again. And I wouldn’t want to modify the actual, clean dataframe so I’d have to make a copy of the df and process that. Then overwrite the copy again when the filters change.
I guess I don’t understand why I need to worry about the index at all. Doesn’t multiselect return a list, which I can then use to loop over and do other things in the next part of my code?
what’s the statement applying selected_titles?
if st.button('Commit change!'):
    for title in selected_titles:
        title_filter = (df['title'] == title)
        df.loc[title_filter, 'subscribed'] = radiovalue
If the user wants to change the subscribed status of the Journal Title (and thus the color coding in the charts), they select the Journal Name from the multiselect, choose an option from a radio button, and hit the Commit change button. If they want to change the status of multiple titles at a time, the loop lets them do that.
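As a side note, the per-title loop can probably be collapsed into a single assignment with isin. This is only a sketch: the small dataframe is made up, and selected_titles and radiovalue are hard-coded stand-ins for what the multiselect and radio button would return.

```python
import pandas as pd

# Made-up data standing in for the real journal dataframe.
df = pd.DataFrame({
    "title": ["Journal A", "Journal B", "Journal C"],
    "subscribed": [True, True, True],
})

selected_titles = ["Journal A", "Journal C"]  # stand-in for the multiselect result
radiovalue = False                            # stand-in for the radio button value

# One boolean mask covering every selected title replaces the per-title loop.
title_filter = df["title"].isin(selected_titles)
df.loc[title_filter, "subscribed"] = radiovalue

print(df)
```

The match is on the title values, not on the index, so it behaves the same whether or not the dataframe was filtered beforehand.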
If you’re using VS Code for development, I’d recommend running the ptvsd remote debugger to step through the code to see how the filters are working. Or do it the hard way and print out values.
Debugging in VS Code
See this article for details: How to use Streamlit with VS Code
Essentially follow these steps:
- pip install ptvsd
- Add the following snippet in your <your-app_name>.py file.
import ptvsd
ptvsd.enable_attach(address=('localhost', 5678))
ptvsd.wait_for_attach() # Only include this line if you always want to manually attach the debugger
- Then start your Streamlit app
streamlit run <your-app_name>.py
- From the Debug sidebar menu, configure Remote Attach: Attach to a remote ptvsd debug server and update your launch.json file with the details below.
{
"name": "Python: Remote Attach",
"type": "python",
"request": "attach",
"port": 5678,
"host": "localhost",
"justMyCode": true,
"redirectOutput": true,
"pathMappings": [
{
"localRoot": "${workspaceFolder}",
"remoteRoot": "."
}
]
}
- Make sure you manually insert the redirectOutput setting.
- By default you will be debugging your own code only. If you want to debug into streamlit code, then change the justMyCode setting from true to false.
- Finally, attach the debugger by clicking the debugger play button.
Thanks. I’m using Spyder but I did explore how to use their debugger on your suggestion.
I still think this is an issue with how the multiselect understands the user’s selection. It seems the tool passes the index and then converts it to a name, but does not pass the actual name itself.
If the dataframe was filtered, the indexes don’t line up anymore, and it causes problems like this.
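Here is a rough pandas-only sketch of what I think is going on. The titles and the cost column are made up to mirror the 7-row example, and the idea that the widget looks up the chosen option by its position is my assumption, not something confirmed from the Streamlit source.

```python
import pandas as pd

# Made-up data: 7 rows, of which rows 3, 4 and 6 fail the filter.
df = pd.DataFrame({
    "title": ["Journal A", "Journal B", "Journal C", "Journal D",
              "Journal E", "Theoretical Computer Science", "Journal G"],
    "cost": [10, 20, 30, 900, 950, 40, 990],
})
filt = df["cost"] < 100

# Filtering keeps the original index labels, so there is a gap after 2.
options = df.loc[filt, "title"]
print(list(options.index))

# Assumption: the fourth option is addressed by its position, 3.
position = 3
print(options.iloc[position])  # positional lookup finds the fourth title

try:
    options.loc[position]      # label lookup: label 3 was filtered out
except KeyError as err:
    print("KeyError:", err)
```

The first three options happen to have matching positions and labels (0, 1, 2), which would explain why only the fourth one fails.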
I think I was able to fix it by defining a new dataframe with only the valid titles after the filter, then offering that new dataframe as the choices in the multiselect, but also calling reset_index(drop=True) on it.
filtered_titles_df = df.loc[filt]['title'] #make a new df with only the valid titles
selected_titles = st.multiselect('Journal Name:', pd.Series(filtered_titles_df.reset_index(drop=True)))
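For anyone worried about modifying the clean dataframe: as far as I can tell, reset_index returns a new object, so the original stays untouched, and since Streamlit reruns the script on every widget interaction the choices get rebuilt whenever the filter changes. A minimal sketch, with a made-up dataframe and a hard-coded filt standing in for the slider filter; the tolist() variant at the end is an alternative I have not verified against the widget.

```python
import pandas as pd

# Made-up data; filt is a hard-coded stand-in for the slider-based filter.
df = pd.DataFrame({"title": ["A", "B", "C", "D", "E", "F", "G"]})
filt = pd.Series([True, True, True, False, False, True, False])

# reset_index(drop=True) returns a new Series numbered 0..n-1.
choices = df.loc[filt, "title"].reset_index(drop=True)
print(list(choices.index))

# The original dataframe keeps all of its rows and its own index.
print(len(df), list(df.index))

# Unverified alternative: a plain list has no index labels to misalign.
choices_list = df.loc[filt, "title"].tolist()
print(choices_list)
```

Either choices or choices_list would then be passed as the options argument of st.multiselect.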