I am making an app where users can upload data and I would like to be explicit in my warnings about the data processing flow.
So there are frontend (user browser running JS) and backend (server running Python).
From my understanding the streamlit backend is hosted on Google servers
When a user uploads a file, then:
It is sent to Streamlit's backend servers (hosted on Google) running Python
Python reads the file and sends it back to the front end as a BytesIO object (it is not kept on the server at all)
The data is then stored in RAM on the user's machine (potentially cached so that it's not lost on a browser reload)
Similarly, when I’m later fitting a model on the data
Data is again sent to the server
Python fits the model and gets a result
The result is sent to the front end
Is that a correct understanding?
Or is the data actually stored in RAM on the server side? It would make sense, because then there's no need to send it back and forth between the server and the user, but wouldn't that require tons of RAM on Streamlit's servers?
Or maybe Python is somehow translated to JS and actually just runs in the user's browser.
All of the data is held and processed on the server hosting Streamlit; what gets sent to the front end are the rendered visuals for display, whatever those may be. So when a user uploads a file, it goes to the server machine (running Python/Streamlit) and sits in that server's memory until/unless you do something to write it somewhere. The data only goes back to the front end if you do something to display it or explicitly provide a download option to the user.
The user's browser isn't doing any of the data processing; only the server executes the Python computations you specify. Streamlit Cloud has a 1 GB resource limit, so if you have a large model you won't be able to use it there and will need to find another hosting solution.