I was curious whether there are blogs, docs, etc. on how well Streamlit scales in general. I am mostly looking for a ballpark number of simultaneous users it can generally handle well (assuming no incredibly CPU-intensive tasks, e.g. dashboards that use a couple of dataframes with perhaps 20-50 rows, with no CPU-heavy transforms being done on them).
I mainly ask because articles like "Why we are excited about JupyterLab 3.0 dynamic extensions! | Quansight" claim that Streamlit is not scalable, but I am wondering to what extent that holds here.
For more specifics: our company is thinking of using Streamlit for internal tooling, with perhaps 100-200 users in total and peaks of around 30-50 users on the app at a time (these numbers are pretty rough, but they give some idea of the scale we are thinking of).
I don't know if anyone has experience hitting load issues, and roughly where they started seeing them?
Where are you planning to host the app? A big part of scaling is how you set up the deployment.
Streamlit uses the Tornado framework. It would be worth reading this discussion about some of the features that made Tornado more interesting than Flask for this project. But the main point is that:
Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.
From: Tornado Web Server — Tornado 6.2 documentation
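To make the non-blocking point concrete, here is a minimal sketch using only the standard library's asyncio rather than Tornado itself. The names `handle_client` and `serve_many` are hypothetical, and `asyncio.sleep` is only a stand-in for a connection that waits on the network. It shows one thread keeping many waiting "connections" open at once, so the total time stays close to a single wait rather than the sum of all waits.

```python
import asyncio
import time


async def handle_client(client_id: int, io_delay: float) -> int:
    # Stand-in for a long-lived connection that mostly waits on network I/O.
    await asyncio.sleep(io_delay)
    return client_id


async def serve_many(n_clients: int, io_delay: float = 0.1) -> float:
    """Run n_clients simulated connections on one thread; return elapsed seconds."""
    start = time.perf_counter()
    # gather() schedules every coroutine on the same event loop, so the
    # waits overlap instead of running one after another.
    results = await asyncio.gather(
        *(handle_client(i, io_delay) for i in range(n_clients))
    )
    assert len(results) == n_clients
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = asyncio.run(serve_many(1000))
    print(f"1000 simulated connections finished in {elapsed:.2f}s")
```

This only models idle waiting. Any CPU-heavy work inside a handler would still block the loop, which is why the points below matter.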
So I would say that the Tornado framework scales very well and is excellent for the requirements and needs of Streamlit, but there are a lot of other points that can affect the performance and scalability of a Streamlit app:
- Deployment: If you use only a single VM or local server, the application will never scale the way it can under Kubernetes or other container orchestration tools.
- IO operations: If the application needs a lot of IO operations, these can be the bottleneck, depending on whether a cache is used, the network resources available, the database configuration and limits... All these things can decrease the performance of the application and are not related to the Streamlit server.
- Program complexity: If the application is computationally expensive, has many operations and/or has a lot of widgets, this could be the bottleneck, because each client request needs a lot of CPU resources/time to complete. See: Large, Complex Streamlit Apps performance
So yes, as the last post said, the most important thing for scaling a Streamlit app is how you deploy your code (assuming that IO is not a limitation).
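On the IO point, a rough sketch of why a cache helps when many sessions ask for the same data. This uses `functools.lru_cache` from the standard library so it runs without any dependencies. In a real Streamlit app the built-in caching decorators (for example `st.cache_data`) would play this role. `load_table`, `simulate_sessions` and the "sales" table name are hypothetical, and the `time.sleep` call only stands in for a slow query.

```python
import time
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the slow path actually runs


@lru_cache(maxsize=32)
def load_table(name: str) -> tuple:
    # Hypothetical slow I/O call (database query, HTTP request, file read).
    CALLS["count"] += 1
    time.sleep(0.05)
    return tuple((name, i) for i in range(50))


def simulate_sessions(n_sessions: int) -> int:
    """Have n_sessions 'users' request the same table; return slow-path calls so far."""
    for _ in range(n_sessions):
        load_table("sales")
    return CALLS["count"]


if __name__ == "__main__":
    slow_calls = simulate_sessions(40)
    print(f"40 sessions triggered {slow_calls} slow load(s)")
```

With the cache in place only the first session pays for the slow load. Without it, every session would hit the backend.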
@Amanda_Kelly / @fhtsibuya As of now we are deploying to a single local server (so very simple). It is a multipage Streamlit dashboard, with each page being fairly low on CPU/IO. I am thinking of trying to run some load tests via Locust (will report back if I do). I was just wondering if there are general rules of thumb and expectations from people's experiences?
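For a quick first check before setting up Locust, something like the sketch below might be enough. It is a standard-library stand-in, not Locust, and the helper names (`timed_get`, `run_load_test`, `summarize`) are made up for illustration. One caveat: it only issues plain HTTP GETs, so it measures the initial page load. Streamlit talks to the browser over a WebSocket after that, so this does not exercise script reruns.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def timed_get(url: str, timeout: float = 10.0) -> float:
    """Fetch url once and return the latency in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def run_load_test(url: str, users: int = 50, requests_per_user: int = 5) -> List[float]:
    """Simulate `users` concurrent clients, each issuing several GET requests."""

    def one_user(_: int) -> List[float]:
        return [timed_get(url) for _ in range(requests_per_user)]

    # One worker thread per simulated user so the requests overlap.
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    return [latency for user in per_user for latency in user]


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Return request count, median and approximate 95th percentile latency."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "requests": len(ordered),
        "median_s": statistics.median(ordered),
        "p95_s": ordered[p95_index],
    }
```

Assuming the app is running locally on Streamlit's default port, `summarize(run_load_test("http://localhost:8501", users=50))` would give a rough picture at the 30-50 user peak mentioned above.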
You can have a few hundred connections retrieving database results without issues. The one callout is that if you restart a server, every single open tab reconnects, causing a delay for users. It is best to have multiple access ports for a better experience in those cases.