@Marc
Many thanks for the detailed analysis.
First of all, a couple of words about our background: we do ML, we know a bit of Python and math, and I used to code a bit in PHP + AJAX years ago (using pre-built JS modules), but anything JS and/or callback related is a major pain (doable, but veeeery slow). This is probably an archetypal description of an ML team.
The key pain for us as a team is that we do not really need a dedicated web programmer (and if we buy work from a freelancer, I will have to support it), but our market puts glossy marketing above proper models and datasets, so if we go public with our demos, they have to be perfect. Whenever I tried building web apps from scratch, they looked horrible (for some reason people also find the stock Jupyter look not very appealing).
So maybe you will find the comments below helpful.
- A hello world example is much, much easier to create with Streamlit for somebody new to Python. It’s just one package to install and a simple .py file.
- Streamlit comes with a working layout/ style out of the box. Voila does not. This is really important for development speed. For this reason alone I prefer Streamlit. It speeds up my development time. But also means it’s so easy to onboard inexperienced pythonistas/ data scientists.
- Streamlit starts up faster than Voila. Because Voila needs to start a Kernel. I’m in doubt whether Voila can scale to many users because of this “one kernel per user”
- Deployment of Streamlit is simpler because it does not have a lot of dependencies. As far as I understand nobody has really described how to deploy voila apps yet. And I have a feeling it includes setting up a jupyter hub server which seems complex to me.
All of these are game-changers / deal breakers.
Obviously it is naive to try to bridge a “valley of death” gap with a dedicated web team from day one, but “one kernel per user” is really “expensive”.
Prior to learning about Voila / Streamlit, I built a crude annotation app and deployed it for 30+ people using JupyterHub. It was not a rough experience, but all in all the deployment took 2-3 days. And this app eats at least 100-200 MB of RAM per user, which is ludicrous.
In the case of a web demo, this probably means it will become a bottleneck during more popular phases / PR moments. I am not talking about real high load here, but not being able to serve a web app (not the STT model itself, just a page) to, say, 50 people at the same time during peaks is a recipe for disaster.
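To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes every concurrent visitor gets a dedicated kernel and that each kernel holds 100-200 MB, which is only what I observed with my annotation app, not a measured Voila figure. The function name is made up for illustration.

```python
def peak_memory_gb(concurrent_users: int, mb_per_kernel: float) -> float:
    """Total RAM in GB if every concurrent user gets a dedicated kernel."""
    return concurrent_users * mb_per_kernel / 1024


# Assumption: 100-200 MB per kernel, 50 people on the page at the same time.
for mb in (100, 200):
    print(f"50 users x {mb} MB = {peak_memory_gb(50, mb):.1f} GB")
```

So roughly 5-10 GB of RAM just to keep the page open for 50 people, before the model itself is served.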
- Both Streamlit and Voila can provide a develop and test in your editor and not notebook experience. For Streamlit it’s out of the box. For Voila you can set it up like that by having a simple notebook where you import and run your app.py file.
- With Voila you can serve a folder of apps. With Streamlit you cannot serve a folder of apps directly. You can serve one app.py file and then you have to select the app from a st.selectbox or st.radiobutton that you provide.
Both things are good enough here
- Voila has a huge unlocked potential wrt. widgets and layout. Everything is there. And it’s fast and performant. But nobody has put it together yet. So it takes a lot of time to put together. And the api is cumbersome because it really developed from many different packages that have been added “on top” of existing functionality over a long period of time. And you have to really spend time on Google in order to put things together.
- Voila is a part of the Jupyter ecosystem with many users and contributors. That could be a huge advantage.
This point is very interesting.
But I would argue that in this case the Streamlit approach would probably win, because such apps are built for end users.
Notebooks are so popular because of their flexibility and the lack of bells and whistles when you do not need them. Obviously, inheriting from a large ecosystem is very tempting, but I guess we are not the only ones with limited resources.
In a similar fashion, I tried JupyterLab. It was good, but not good enough, and it added some complexity to the really simple notebook approach. I still cannot switch because it lacks a collapsible-headings plugin.
- With Voila you can develop really advanced stuff using callbacks and jslinks. And it’s really fast. Maybe because you only run small pieces of code for each “call”.
For teams like us … callbacks and jslinks are black boxes =)
And then there’s also Bokeh, Panel, Dash, Flask and Django to consider depending on your use case.
I suppose that Bokeh, Panel, and Dash are more like dedicated BI applications (of course, you can use Bokeh just as a cool plotting library). You really need them in an enterprise setting (but people tend to buy Tableau anyway, lol) when you have a billion charts.
In a fast moving ML team setting, whenever I was asked to build a dashboard … I ended up using SQL + Excel (I know what you are thinking) just to avoid the setup cost.
But with Voila or Streamlit, I can just write a plain .py data extraction script (or just store the data alongside and update it daily) and build a nice chart as I usually do.
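A minimal sketch of what I mean by such an extraction script, using only the standard library. Everything specific here is an assumption for illustration: the `events` table, its columns, the made-up rows and the output file name are hypothetical, and an in-memory SQLite database stands in for whatever the real data source is.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_daily_counts(conn: sqlite3.Connection, out_path: Path) -> int:
    """Aggregate rows per day and write them to a CSV the app can read."""
    rows = conn.execute(
        "SELECT day, COUNT(*) AS n FROM events GROUP BY day ORDER BY day"
    ).fetchall()
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["day", "n"])  # header row
        writer.writerows(rows)
    return len(rows)


# Demo data: a hypothetical table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2020-01-01", "a"), ("2020-01-01", "b"), ("2020-01-02", "a")],
)
out_file = Path(tempfile.gettempdir()) / "daily_counts.csv"
n_days = export_daily_counts(conn, out_file)
```

Run daily (cron or similar), this leaves a small CSV next to the app, and the app script only has to load it and draw the chart.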
Flask and Django
To be honest, I would argue that an ML team would need Django only when building some middleware.
All in all, looking forward to cues from the Streamlit team on where to start with WebRTC.
I hope hacking one plugin in will be easier than customizing Voila.