Feature idea: a Pyodide-based, JupyterLite-like Streamlit

So it’s a bit “out there”, but JupyterLab notebooks can currently be completely “hosted” on a static web page using Pyodide kernels compiled to WebAssembly. This is Python running directly within the user’s browser. No application server needed.

JupyterLite also has a really easy way to bundle up notebooks and Python environments within a nice reproducible archive.

I can see this model being absolutely awesome for Streamlit. Imagine sharing a local version of a Streamlit app just by sending someone an archive that opens in the web browser, with all of the Python running within the user’s browser and no Python installation needed. Self-hosting Streamlit would be much easier: one could host a Streamlit app with GitHub Pages.

Anyway, here’s my proposal: support a Pyodide backend for Streamlit, allowing Streamlit apps to run completely within the browser :slight_smile:

This idea also crossed my mind :slight_smile: and I recall that multiple Creators have toyed with it. @whitphx @asehmi, feel free to pop into the conversation with your experiences.

It will probably run into the same issues as “Install with pyodide” (dask/dask issue #7764):

  • Installing Tornado with micropip is still experimental.
  • From memory, Streamlit runs the Python code from ReportContext in a thread, and I don’t know yet how WASM handles that, since its threading model is supposedly still at the proposal stage.
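To make the threading concern concrete, here is a minimal sketch of one possible workaround: try to run the script in a worker thread and fall back to running it inline when threads are unavailable. It assumes that a runtime without thread support raises RuntimeError when a thread is started, which is my understanding of how Pyodide behaves but is not confirmed in this thread. The helper name run_script is hypothetical and is not part of Streamlit.

```python
import threading
from typing import Callable


def run_script(target: Callable[[], None]) -> str:
    """Run `target` in a worker thread if possible, otherwise inline.

    Returns "thread" or "inline" so the caller can see which path was taken.
    Hypothetical helper for illustration only; not a Streamlit API.
    """
    try:
        worker = threading.Thread(target=target)
        worker.start()
    except RuntimeError:
        # Assumption: runtimes without thread support fail at start().
        # Fall back to running the script on the current thread.
        target()
        return "inline"
    worker.join()
    return "thread"
```

On a regular CPython install this takes the "thread" path. Whether the inline path is enough for Streamlit's rerun model is a separate question this sketch does not answer.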

I also wanted to do something smaller: load ipywidgets into Streamlit through Pyodide (see “ipywidgets support (or any object with _repr_html_)”, pyodide/pyodide issue #1598), but it is still a little bit hard. I think Yuichiro managed to do a client-side PoC of streamlit-webrtc.

But yeah we are definitely very interested in the idea!

@andfanilo I haven’t done anything in this space as yet :slight_smile:

Hi, thanks for mentioning me :slight_smile:

Although I have not made any progress on code yet, I will roughly dump my thoughts here:

  • Execution flow overview of Streamlit at initialization (code reading memo)
    • Launch the server process
    • The server serves these paths:
    • When a user accesses the server, index.html and its associated assets are loaded and the React SPA boots up.
    • The SPA establishes a WebSocket connection
    • This WebSocket connection accesses the /stream path defined above. From this point, the frontend and the server communicate via this WebSocket connection and the frontend is rendered dynamically.
    • When the WebSocket connection starts, the session object (a ReportSession object) is created at _BrowserWebSocketHandler#open()
      • ReportSession then instantiates its child objects, and the call chain would be like ReportSession → ScriptRunner → ReportThread
  • What we have to do to run Streamlit on a frontend Pyodide runtime
    • Run the server-side code in a Web Worker with Pyodide
      • This code will be called from the frontend SPA. So the booting process would be kind of “reversed” - the frontend is loaded first, then the server side is loaded from the frontend.
    • Replace the WebSocket connection with messaging to and from the Web Worker
      • It should not be very difficult, as both are async and so mostly interchangeable, I think.
    • Other paths like health check would be the same.
  • Implementation details
    • To replace the WebSocket connection,
    • For such a rewrite, I think the Streamlit repository has to be forked
      • In the case of Jupyter, the server and frontend are decoupled and the connection between them is highly abstracted. This is the reason why JupyterLite can be developed independently from Jupyter and can import Jupyter core into its Pyodide runtime.
      • On the other hand, Streamlit is tightly coupled with Tornado and the communication layer is not abstracted.
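To illustrate what “abstracting the communication layer” could look like, here is a minimal sketch, not Streamlit’s actual code. The names Transport, QueueTransport, make_pair and serve_one are all hypothetical. The session logic depends only on an async send/receive interface, so a WebSocket-backed or Web Worker-backed implementation could be swapped in. An in-memory asyncio queue pair stands in for the worker’s message channel so the sketch runs anywhere.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Tuple


class Transport(ABC):
    """Hypothetical interface: all a session needs from its connection."""

    @abstractmethod
    async def send(self, message: bytes) -> None: ...

    @abstractmethod
    async def receive(self) -> bytes: ...


class QueueTransport(Transport):
    """In-memory stand-in for a Web Worker message channel."""

    def __init__(self, outbox: asyncio.Queue, inbox: asyncio.Queue) -> None:
        self._outbox = outbox
        self._inbox = inbox

    async def send(self, message: bytes) -> None:
        await self._outbox.put(message)

    async def receive(self) -> bytes:
        return await self._inbox.get()


def make_pair() -> Tuple[QueueTransport, QueueTransport]:
    """Create two connected endpoints (frontend side, backend side)."""
    front_to_back: asyncio.Queue = asyncio.Queue()
    back_to_front: asyncio.Queue = asyncio.Queue()
    frontend = QueueTransport(front_to_back, back_to_front)
    backend = QueueTransport(back_to_front, front_to_back)
    return frontend, backend


async def serve_one(transport: Transport) -> None:
    """Toy session: answer one request without knowing the wire format."""
    request = await transport.receive()
    await transport.send(b"rendered:" + request)


async def demo() -> bytes:
    frontend, backend = make_pair()
    server = asyncio.create_task(serve_one(backend))
    await frontend.send(b"rerun")
    reply = await frontend.receive()
    await server
    return reply
```

Calling asyncio.run(demo()) sends one message from the “frontend” endpoint and gets one reply back. With an interface like this, the Tornado WebSocket handler would become just one Transport implementation among others, which is the decoupling that Jupyter already has.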

Surely not forked. Instead, the preparatory refactoring needed to achieve that decoupling should be done within the main repository first. I think it’s quite important to keep this ecosystem/community tightly knit and not split it into forks.

Even if that decoupling work takes ~6 months’ worth of pull requests, the investment is worth it for the long-term maintainability of something like this.


Another benefit I should add for the Streamlit company is that the costs for running the Streamlit cloud hosting could be significantly reduced when Python is running client side instead of on Streamlit servers. Also, in cases where the data being processed is sensitive, having that data never leave the client’s machine is quite beneficial.

@SimonBiggs
That’s right.
I didn’t mean forking the project.

I only wanted to make a technical point: some fixes are needed in the core, so this development must be tightly coupled with core development, which is different from JupyterLite. The “fork” here simply means “creating a branch in the main repo”, unlike JupyterLite, which is a separate repo independent from the core.

As a separate discussion, I’m not sure whether it can be merged into the mainline: the core design and development are not community-driven but are done by the company’s core development team.
Of course the branch could be merged into the main branch, but maybe not. That must be discussed with the core dev team.
It’s a project management/design decision and out of scope for what I wanted to write about in the post.


The advantages you wrote are true, and I think such information is important to discuss with the core dev team.
If I add something,