Large, Complex Streamlit Apps performance

Hi,

Can Streamlit be used to write medium-large, complex apps?

Encouraged by the results of building several small Streamlit apps, I have taken a big step forward:

I now have a Streamlit program that currently stands at about 5.5K lines (and I am only about halfway through). Some of the data entry screens contain 30+ widgets of different types.

Every time I update a field, the screen refresh takes anywhere from 6 to 11 seconds (sometimes more). Consequently, the entire data entry process is slowed down considerably.

I can’t use st.form, as many widgets on a screen have different conditional logic attached to them. I have tried performance enhancements (e.g. building lists from SQL outside the data entry function), without much improvement.

I am unable to reproduce the code here due to confidentiality issues and code size.

Any suggestion will help. Is streamlit really the right choice for large, complex apps?

Thanks in advance

I think this is the answer to why you are having issues. If there are 30 different widgets that can be changed, and each one kicks off a re-run, you’d have to use a pretty hefty combination of caching and state to make this process run in a performant manner.

Thanks @randyzwitch , for the prompt response. Will try your suggestion.

Cheers

Hey @Shawn_Pereira,

I’m a product manager at Streamlit and would love to see that massive app / hear more about what you’re doing. We’re currently working on some performance improvements, so curious to see if we’re tackling the right problems. Would you be up for a 20-30 min call sometime this or next week?

Cheers, Johannes

Sure, @jrieke , we can have a zoom meeting any day next week. I can take you through my Streamlit projects then.

Cheers

Shawn, India

Fantastic! I’ll write you a DM to set it up.

@Shawn_Pereira, what helped me to build large apps is this benchmarking approach here: Benchmarking a streamlit app

I leave this code permanently in my app and wrap it in a feature-toggle if.
That way you can quickly run small benchmarks by just setting a bool to true and get a pretty good feeling for where your code is slow.
If you focus on the right parts of your code, you can run even large amounts of code quite quickly.
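For anyone who wants to try the toggle idea without following the link, here is a minimal sketch of what it could look like. It is not the code from the linked post: the names BENCHMARK_ENABLED, timed and load_options are hypothetical, and it uses only the standard library so it works outside Streamlit too.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical feature toggle: flip to False to skip all timing work.
BENCHMARK_ENABLED = True

# Accumulated wall-clock seconds per labelled section.
timings = defaultdict(float)


@contextmanager
def timed(label):
    """Record the wall-clock time of the enclosed block when the toggle is on."""
    if not BENCHMARK_ENABLED:
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def load_options():
    # Placeholder for a slow step such as a SQL lookup (assumption for the demo).
    time.sleep(0.05)
    return ["a", "b", "c"]


with timed("load_options"):
    options = load_options()

# Report the slowest sections first.
report = sorted(timings.items(), key=lambda item: item[1], reverse=True)
for label, seconds in report:
    print(f"{label}: {seconds:.3f}s")
```

In a real app you would wrap the suspicious parts of a screen (queries, dataframe transforms, widget-building loops) in their own timed blocks and display the report in whatever way suits you.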

I also keep my data as optimized dataframes in parquet files, which can be read very quickly. This can even be quicker than caching the dataframe.

Another point is that you need caching but you should not overdo it and not use it for every function, especially for large dataframes.

What has worked for me is a kind of function chaining, where each function calls the previous one and also gets some settings from user input, usually as a dict. That way you can cache some of the functions and rerun only the calculations that need to be rerun based on the new user input.
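A rough sketch of that chaining pattern, under some assumptions: functools.lru_cache stands in for Streamlit's own caching decorators (for example st.cache_data) so the snippet runs anywhere, and load_rows, filter_rows, summarize and freeze are hypothetical names invented for the illustration. Because lru_cache needs hashable arguments, the settings dict is converted to a sorted tuple first.

```python
from functools import lru_cache

# Counts how often each step actually executes (for demonstration only).
call_counts = {"load": 0, "filter": 0, "summarize": 0}


def freeze(settings):
    """Turn a flat settings dict into a hashable, order-independent cache key."""
    return tuple(sorted(settings.items()))


@lru_cache(maxsize=None)
def load_rows(source):
    call_counts["load"] += 1
    # Placeholder for an expensive read (database, file, API).
    return tuple(range(10))


@lru_cache(maxsize=None)
def filter_rows(source, filter_key):
    call_counts["filter"] += 1
    settings = dict(filter_key)
    # Calls the previous step in the chain; that call is cached too.
    return tuple(r for r in load_rows(source) if r >= settings["minimum"])


def summarize(source, filter_settings, summary_settings):
    # Cheap last step, left uncached on purpose.
    call_counts["summarize"] += 1
    rows = filter_rows(source, freeze(filter_settings))
    return sum(rows) * summary_settings["scale"]


# Only the summary setting changes between the two calls, so the
# load and filter steps are served from cache the second time.
first = summarize("demo", {"minimum": 5}, {"scale": 1})
second = summarize("demo", {"minimum": 5}, {"scale": 2})
print(first, second, call_counts)
```

The point is that changing one widget only invalidates the steps downstream of the setting it feeds, while everything upstream is reused.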

I also made some performance improvements on the aggrid component to better suit my application. Things like this can also be necessary if you find your bottleneck not to be in your own code.

Ah totally forgot: I also wrote a component (streamlit-profiler) a few weeks ago, which does the same benchmarking trick @thunderbug1 just posted (but it’s a tiny bit easier to use and looks a bit more Streamlit-y :wink: ). Let me know if that’s useful!

@jrieke, Neat component!
For practical use, it would be great if there were arguments to enable/disable the profiler and to activate the detailed output, which was handy in a couple of situations :slight_smile:

Thanks @thunderbug1 and @jrieke for sharing and for your help.

Cheers