I’ve built an app with the following functionality:
1. CSV Upload: Users upload a CSV with vehicle data.
2. Field Mapping: Users map two CSV columns to fields: VIN (Vehicle Identification Number) and Address.
3. VIN Validation: The app checks the validity of the VINs.
4. Vehicle Classification: Users classify unique vehicle types in a new “Category” column and add notes in a “Notes” column. These updates apply back to the fleet dataset.
5. Address Categorization: Users categorize unique addresses, with changes reflecting in the dataset.
6. Final Dataset Display: The updated dataset is displayed, combining original CSV data with new or modified columns. Users can go back to steps 4 or 5 to make further updates.
7. Summary Generation: A summary is created, including a table of vehicles per address, highlighting missing addresses for user action.
8. XLSX Download: Users can download the final dataset and summary as an Excel file with two tabs.
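The post does not say how the VIN validation step checks validity. As one possible sketch, assuming the VINs follow the North American convention of a check digit in position 9 (VINs from other markets may not use it), the check could look like the following. The name `is_valid_vin` is hypothetical.

```python
# Transliteration values for VIN letters; I, O and Q are not allowed in a VIN.
LETTER_VALUES = dict(zip("ABCDEFGH", range(1, 9)))
LETTER_VALUES.update(zip("JKLMN", range(1, 6)))
LETTER_VALUES.update({"P": 7, "R": 9})
LETTER_VALUES.update(zip("STUVWXYZ", range(2, 10)))

# Position weights; position 9 (the check digit itself) has weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def is_valid_vin(vin: str) -> bool:
    """Return True if `vin` has 17 allowed characters and a matching check digit."""
    vin = vin.strip().upper()
    if len(vin) != 17:
        return False

    total = 0
    for char, weight in zip(vin, WEIGHTS):
        if char in "0123456789":
            value = int(char)
        elif char in LETTER_VALUES:
            value = LETTER_VALUES[char]
        else:
            return False  # I, O, Q or a non-alphanumeric character
        total += value * weight

    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected
```

Keeping the check as a pure function, with no Streamlit calls inside, means it can be unit tested without running the app.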
Currently, all these features are on a single page, which makes managing states and actions challenging. Initially, I created multiple DataFrames (df1, df_with_vehicle_changes, etc.) and used conditional logic (e.g., if "df_with_vehicle_changes") to determine the next steps. This approach has proven difficult to maintain, especially as new features are added.
Seeking Best Practices
I’m exploring two potential solutions:
Solution 1: Single DataFrame Approach
Use a single df_final stored in st.session_state to capture all updates, reducing the number of intermediate DataFrames.
Replace condition checks (if df) with state checks (if 'vin_validation_action' in st.session_state).
Modularize business logic into separate functions to simplify the main app file.
While this approach has cleaned up some code, I still face challenges with state management and debugging.
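A minimal sketch of what Solution 1 could look like, assuming the workflow is a fixed sequence of steps. The step names, the `done_steps` key and the helper functions are hypothetical; `state` is any dict-like object, so `st.session_state` can be passed in the app and a plain `dict` in tests.

```python
from typing import Any, Callable, MutableMapping, Optional

State = MutableMapping[str, Any]

# Hypothetical step names, in the order the app walks through them.
STEPS = (
    "upload",
    "field_mapping",
    "vin_validation",
    "vehicle_classification",
    "address_categorization",
)


def init_state(state: State) -> None:
    """Create the keys the app relies on, once per session."""
    if "df_final" not in state:
        state["df_final"] = None
    if "done_steps" not in state:
        state["done_steps"] = []


def complete_step(state: State, step: str, transform: Callable[[Any], Any]) -> None:
    """Apply one step's transformation to the single dataset and record the step."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    state["df_final"] = transform(state["df_final"])
    if step not in state["done_steps"]:
        state["done_steps"] = [*state["done_steps"], step]


def current_step(state: State) -> Optional[str]:
    """Return the first step not yet completed, or None when all are done."""
    for step in STEPS:
        if step not in state["done_steps"]:
            return step
    return None
```

With this shape, the main script only asks `current_step(st.session_state)` to decide what to render, and each `transform` stays a plain function that takes a dataset and returns the updated one, which keeps the business logic debuggable outside Streamlit.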
Solution 2: Multi-Page Approach
Split functionality across multiple pages, each handling a specific task.
Retain a single df_final DataFrame and use session_state to track user actions and unlock new pages.
This might simplify debugging and code maintenance, but I’m unsure if it’s the best path forward. It is certainly worse for the user experience than having everything on one page.
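One way to sketch the "unlock new pages" idea from Solution 2 is a small table that says which session-state key each page needs. The page titles, file paths and the `upload_verified` key are hypothetical; `vin_validation_action` is the key mentioned above. The function takes any mapping, so `st.session_state` works in the app and a plain `dict` works in tests.

```python
from typing import Any, Mapping, Optional

# (title, script path, session-state key required to unlock the page)
PAGES: list[tuple[str, str, Optional[str]]] = [
    ("Upload", "pages/upload.py", None),
    ("VIN validation", "pages/vin_validation.py", "upload_verified"),
    ("Vehicle classification", "pages/vehicles.py", "vin_validation_action"),
    ("Address categorization", "pages/addresses.py", "vin_validation_action"),
    ("Summary and download", "pages/summary.py", "vin_validation_action"),
]


def unlocked_pages(state: Mapping[str, Any]) -> list[tuple[str, str]]:
    """Return (title, path) for every page whose required key is set."""
    return [
        (title, path)
        for title, path, required_key in PAGES
        if required_key is None or required_key in state
    ]


# In the entry-point script this could be wired up roughly as follows,
# assuming a Streamlit version that provides st.navigation and st.Page:
#
#   import streamlit as st
#   pages = [st.Page(path, title=title)
#            for title, path in unlocked_pages(st.session_state)]
#   st.navigation(pages).run()
```

Because `df_final` lives in `st.session_state`, it stays available to every page within the same session, so each page script can read it, apply its own change and write it back.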
Request for Input
What would you recommend as best practices for this use case to maintain clean, easily debuggable, and bug-free code? Any thoughts or suggestions would be greatly appreciated!
Hi Lucas,
Interesting! I’m dealing with a similar problem at work. My advice:
A single simple data upload page + verification: just make sure the input file is OK. No functionalities are shown until data is uploaded and verified.
If all is OK, dynamically add pages, each with a specific piece of functionality, for the user. This is easier to maintain, update and grow.
I hate working on large files and scrolling up and down trying to find the precise function. Things work out better if you modularize the components.
In addition to what @sebastiandres has mentioned, remember to use st.fragment on individual, independent actions to prevent the entire script from rerunning.
I do have some validation on the input file that I didn’t mention. Not a big deal so far.
It makes sense that separate, isolated pages would be easier to maintain.
Question for you: how do you use session_state both for tracking multiple states and for holding the most up-to-date version of the df? Do you keep them in memory, or do you read and write the user’s changes to a SQL DB?
It also seems you tackle very similar problems to mine and solve them with Streamlit.
Would you be up for a chat to show each other how we tackle these issues, and maybe learn something new? If you are, I can DM you to figure something out.
I feel like my problem is a bit of the opposite: when something changes in the widget the user is working on (say st.data_editor), I need that change to propagate to alllllllll the other widgets.
So far I haven’t needed to pass dataframes to memory, as the way I’m splitting the functionalities allows for “clean cuts”.
I think that whether to keep them in memory or pass them to a file/database depends more on the computing cost. If it takes < 1 minute, keeping them in memory is good enough IMHO.
And sure, send me a DM. Would love to chat and brainstorm.