Hello, Streamlit newbie here. I’m building a web app to label data, which will run on an AWS instance. In short, it consists of Streamlit + MongoDB (local database). In first approximation, how many concurrent users could the app have? In your opinion, are there limitations that I should take into account? Thanks heaps!
Hi @albusdemens -
Streamlit is built upon Tornado, which provides this passage in their documentation:
Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.
So ignoring the fact that Streamlit could be using Tornado wrong (not saying that it’s not possible, let’s just assume we aren’t or that it’s a fixable issue), the next question is whether every other library you want to use is also written for performance. Of course, that’s a big assumption!
The biggest place where you will run into issues is thread-safety. matplotlib is a notorious performance trap in Streamlit, because the user needs to add some code for thread-safety. You can extend this line of thinking even further…is the Python mongodb library thread-safe, and so on.
So how many concurrent users a Streamlit app can support can be viewed as a function of:
- the machine you are running on, and the machine MongoDB is running on
- the Python libraries you use
- what you are doing, and how performant that underlying Streamlit function is
- data size
- and so on
We do have large corporate users, and within Streamlit Cloud we do have enterprise customers where we scale their resources up considerably higher than the ‘community’ free tier, so it can be done. It just becomes more of a systems integration issue than purely a Streamlit one.
Best,
Randy
That is a very accurate statement.
So do you mean the ‘community’ free tier doesn’t scale for thousands of users using Tornado when deploying it on AWS?
Hi @Danielse,
The “community free tier” is Streamlit Cloud – so if you’re deploying Streamlit with AWS, you’re using the open-source product, not the free tier of Streamlit Cloud. We can’t really speak to the scalability of your Streamlit app if it’s deployed outside of Streamlit Cloud, since that depends on your setup/the resources allocated to the app. That said, it’s always a good idea to keep thread safety in mind when developing your app.
Hope this helps!
Best,
Caroline
A relative Streamlit newbie here. I have an app hosted on the Community Cloud being tested by < 5 users at present. Not encountering any issues yet, but in your experience does Streamlit scale to 100s or 1000s of concurrent users? At what point should an alternative solution be considered - either migrating away from the Community Cloud, or moving to another platform? When the app is fully deployed in production, we expect a maximum of about 1,000 users, with perhaps up to 500 concurrent at any time.
I also would like to know that.
Would my app crash if it had 1000 concurrent users on Streamlit community cloud? I have no idea.
The good news is that in Streamlit app gallery there are some apps with more than 50k views, which might indicate they could have handled a lot of users simultaneously.
On the one hand, Streamlit Community Cloud allocates a fixed amount of resources to each deployed app (no auto-scaling), so you need to test your app’s memory usage and performance under a large number of concurrent sessions.
On the other, and as mentioned by other people on this thread, scalability depends heavily on your app code and dependencies. Example considerations:
- Are your app dependencies and code thread-safe?
- How much memory is your app using (baseline memory usage + additional memory for each concurrent session)?
- Are you performing CPU-bound tasks in your app?
- Is your app connecting to other resources (for example, if you are connecting to a database) which add their own scalability considerations?
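To make the memory consideration above concrete, here is a rough back-of-the-envelope sketch. It assumes memory grows linearly with the number of sessions, which is a simplification. The function names and all numbers are hypothetical placeholders, not Streamlit Community Cloud's actual limits; you would measure the baseline and per-session figures for your own app.

```python
def max_concurrent_sessions(memory_limit_mb, baseline_mb, per_session_mb, headroom=0.2):
    """Estimate how many sessions fit in a fixed memory budget.

    Assumes total memory = baseline + per_session * sessions (a simplification).
    `headroom` is the fraction of the limit kept free as a safety margin.
    """
    if per_session_mb <= 0:
        raise ValueError("per_session_mb must be positive")
    usable_mb = memory_limit_mb * (1 - headroom) - baseline_mb
    if usable_mb <= 0:
        return 0
    return int(usable_mb // per_session_mb)


def memory_needed_mb(sessions, baseline_mb, per_session_mb):
    """Estimate the memory needed for a target number of concurrent sessions."""
    return baseline_mb + sessions * per_session_mb


# Hypothetical measurements: 300 MB baseline, 5 MB per extra session.
print(max_concurrent_sessions(memory_limit_mb=1024, baseline_mb=300, per_session_mb=5))
print(memory_needed_mb(sessions=500, baseline_mb=300, per_session_mb=5))
```

With these made-up figures, a 1024 MB budget would fit roughly 100 sessions, while 500 concurrent sessions would need about 2.8 GB. An estimate like this only covers memory; CPU-bound work and database connections need their own checks.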
Hi Streamlit newbie here. The thread safety issue is interesting to me. I’m using st.session_state to persist information from one page to the other in my app… would each concurrent user have their own or would they all share, effectively making my idea impossible?
I know the documentation says for each user session, but others seem concerned about this topic anyway… do things get tricky?
@Amit_Kohli, from my understanding, each st.session_state is not shared across users. Each concurrent user has their own.
But Streamlit has in their roadmap a plan to create a shared noSQL database for each Streamlit app:
What code should I add to make matplotlib thread-safe?
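One common approach, sketched here under some assumptions rather than as an official recipe, is to avoid pyplot's global state by building a `Figure` object directly, and to serialize all plotting behind a single module-level lock. The helper name `render_histogram` and the lock name `_plot_lock` are made up for illustration. In a Streamlit app you would call `st.pyplot(fig)` inside the `with` block instead of saving to a buffer.

```python
import io
import threading

from matplotlib.figure import Figure  # object-oriented API, no pyplot global state

# One lock shared by every thread (every user session) in the process.
_plot_lock = threading.RLock()


def render_histogram(values, title):
    """Draw a histogram and return it as PNG bytes, one thread at a time."""
    with _plot_lock:
        fig = Figure(figsize=(4, 3))
        ax = fig.subplots()
        ax.hist(values, bins=10)
        ax.set_title(title)
        # In a Streamlit app, st.pyplot(fig) would go here, still inside the lock.
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        return buf.getvalue()
```

The lock means only one session renders a plot at a time, so heavy plotting becomes a bottleneck under many concurrent users. That is the trade-off for safety.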