Using streamlit for an STT / TTS model demo?

Hi!

We are building an STT/TTS dataset and STT/TTS models.
So far we have open-sourced around 5k hours of annotation, and we are planning to release a further 15k hours.

At some point we would really like to produce some high-quality public demos showcasing our STT and TTS models and some cool features that require interactivity. All of this can be done in a web framework, but none of us are web developers, so as usual I will have to suffer (or my demos will be of low quality, as usually happens).

Any such demo would require:

  • Showing audio clips
  • Recording short audio clips / uploading files (I saw that you will soon publish an upload widget)
  • Sending such clips to a local STT service running on a separate machine to be recognized
  • Text inputs (already covered)
  • Showing images / tables (already covered)

I just learned about Streamlit, read some docs, looked through the examples. It is awesome! Actually it would work well for dashboards as well.

I found some boilerplate for such a demo, but it would require fiddling with web servers and probably JS, which of course would slow down development.

So, without further ado, I read this issue (number of links for a new user is limited) because there is a default wav HTML5 player that is very easy to use. I also looked at the docs, and have not found any dedicated widgets to work with audio!

(i)

At first glance, it seems that the following set of widgets for working with audio would be beneficial:

  • an audio_player widget to play an audio clip (maybe just a quick and dirty html-in-markdown widget and that’s it?)
  • an audio_recorder widget

Thanks for any pointers and advice.

  • I understand that the first MVP could be done with just HTML + a file_picker widget, but maybe someone has better ideas?
  • Adding an external requests call to a local API to get the STT model output also would not be an issue (for security / latency reasons I would not like to call my model from within the streamlit app directly)?
  • Maybe I should open a feature request?

(ii)

Also we use this awesome HTML player hack in our notebook (we just embed an HTML player in the HTML code of the table).
Maybe it is possible to do this in streamlit as well somehow?

Alex


Hi @snakers41 and welcome to the forum :wave:,

Your STT/TTS dataset and models look awesome and should run well with Streamlit :slightly_smiling_face:!

  • an audio_player widget to play an audio clip (maybe just a quick and dirty html-in-markdown widget and that’s it?)

This is already supported in Streamlit. See st.audio(data). To use it, just pass any URL or byte array as the data argument. If you put this widget in a loop, you’ll get a list of audio players.
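
A minimal sketch of the loop idea, assuming Streamlit is installed. The sine-tone generator and the clip names are placeholders made up for illustration; in a real demo the bytes would come from your own files or your TTS model.

```python
import io
import math
import struct
import wave


def make_tone(freq_hz: float, seconds: float = 0.5, rate: int = 16000) -> bytes:
    """Return a mono 16-bit WAV file (as bytes) containing a sine tone."""
    n_frames = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(frames)
    return buf.getvalue()


# Placeholder clips (hypothetical names, not part of any dataset).
CLIPS = {"tone_220hz": make_tone(220.0), "tone_440hz": make_tone(440.0)}


def render() -> None:
    """Draw one audio player per clip."""
    import streamlit as st  # imported lazily so the helpers above work without it

    for name, wav_bytes in CLIPS.items():
        st.write(name)
        st.audio(wav_bytes, format="audio/wav")
```

To try it, save this as app.py, add a final line calling render(), and launch it with `streamlit run app.py`.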

  • an audio_recorder widget
    (…)
    I understand that the first MVP could be done with just HTML + a file_picker widget, but maybe someone has better ideas?

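We don’t have an audio recorder today, but as you mention, a file uploader would do the trick for now. We’re actually implementing st.file_uploader() at this very moment.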

  • Adding an external requests call to a local API to get the STT model output also would not be an issue (for security / latency reasons I would not like to call my model from within the streamlit app directly)?

That should work!

Streamlit by itself shouldn’t be less secure than, say, Flask + React. If anything, we try to err on the side of less customizability in order to provide APIs that help you not shoot yourself in the foot.
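
For the external call, here is a rough sketch using only the standard library (the requests package would work just as well). The endpoint URL and the JSON response shape are assumptions for illustration, since the thread does not describe the STT service's API.

```python
import json
import urllib.request

# Hypothetical address of the separate STT machine; replace with your own.
STT_URL = "http://stt.internal:8080/recognize"


def build_stt_request(wav_bytes: bytes, url: str = STT_URL) -> urllib.request.Request:
    """Wrap raw WAV bytes in a POST request for the (assumed) STT service."""
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def transcribe(wav_bytes: bytes, url: str = STT_URL, timeout: float = 30.0) -> str:
    """Send the clip and return the transcript.

    Assumes the service answers with JSON like {"text": "..."}; adjust this
    to whatever your service actually returns.
    """
    request = build_stt_request(wav_bytes, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["text"]
```

Inside the app, the bytes of an uploaded clip would be passed to transcribe() and the result shown with st.write, so the model itself never runs in the Streamlit process.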

  • Maybe I should open a feature request?

I’ve opened one for the audio recorder here: https://github.com/streamlit/streamlit/issues/643

But if you have any more requests, feel free to create them on Github.


Hope this helps and encompasses the different functionalities you need. If not, feel free to reach back out, we’d love to keep the dialogue going.
Cheers

Dani Caminos

Hi Dani, many thanks for your replies!
They are really helpful.

(i)

st.audio(data)

Cool, I totally missed it, because it was located in the charts section.

(ii)

I’ve opened one for the audio recorder here: https://github.com/streamlit/streamlit/issues/643

I will reply within the new ticket - https://github.com/streamlit/streamlit/issues/643, I have some ideas on how to do this really easily =)

(iii)

As you mention, a file uploader would do the trick for now. We’re actually implementing st.file_uploader() at this very moment,

Well, actually, while researching this topic I found an alternative that has an audio recorder right now:

  • Use Voila instead of Streamlit
  • Just use ipywebrtc to record audio
  • Then you can hack anything - you just need a web developer to customize the look, because the stock notebook look may be poorly received by the general public
  • … profit

This snippet can actually be polished and made to work with Voila:

from IPython.display import Audio
from ipywebrtc import CameraStream, AudioRecorder

# actually I found this hack in some js code
# just pass mime type =)
camera = CameraStream(constraints={'audio': True,
                                   'video': False},
                      mimeType='audio/wav')
recorder = AudioRecorder(stream=camera)

# turn the recorder on
# still a bit rusty on whether I need to show it 
# in a separate cell later to make it work
recorder.recording = True
# say something
# turn the recorder off
recorder.recording = False
recorder.save('test.wav')
# enjoy your wav (a typical user will be happy with compressed sound ofc)

(iv)
Also, what are your thoughts on Voila vs Streamlit?

They seem to fill a similar niche.

I am not a web programmer, but on the surface Streamlit seems more polished / more secure, while Voila seems to have many more capabilities - because if I can fit it in HTML5 + notebook => then I can fit it into Voila (probably).

(v)

Also, I wonder if you are planning to support multi-page applications (I guess by hacking this you could achieve the look of a multi-page React app), i.e. something like this:
[image]

Best,
Alex

@snakers41:

Thanks for all the thoughts and suggestions!

(ii) + (iii) Thanks for bringing ipywebrtc to our attention. That’s a great potential solution!

(iv) Voila is fantastic. I suggest you play with both and see which flow you prefer!

(v) A number of people have created multi-page applications in Streamlit (see @Marc’s awesome-streamlit for an example). Is that what you had in mind? If not, please feel free to submit a feature request.

More generally, we are designing some exciting features which will integrate Streamlit and React more closely, and we’re excited to release those in the coming months.

Thank you for using Streamlit! :heart:


@Adrien_Treuille

(ii) + (iii) Thanks for bringing ipywebrtc to our attention. That’s a great potential solution!

Happy to help.

In case this has not been implemented by the time I actually start building my app (~start of December maybe), could you point me to some class / commit / doc line, so that if I add ipywebrtc to Streamlit I can do it effectively and then maybe just submit a PR? Necessity is a great motivator, you know.

(v) A number of people have created multi-page applications in Streamlit (see @Marc’s awesome-streamlit for an example). Is that what you had in mind? If not, please feel free to submit a feature request.

I am speechless.
Wow this is just awesome! A great starter!
Tried to find similar beautiful stuff using Voila, but did not find it yet.
This looks like a proper interactive web application with all the perks of Streamlit.
What is also remarkable - I was able to immediately find what is related to what (!).
Obviously you can build your own styles with Voila, but this awesome app is just perfect …
Too good to be true, the best thing I saw since notebooks (+ extensions) and pytorch!

(iv) Voila is fantastic. I suggest you play with both and see which flow you prefer!

To be honest, I leaned towards Voila at first (you can just inherit the whole notebook ecosystem).
But now that you have shared this example, I believe the better path would be to add a recorder widget by hacking Streamlit!


Hi @snakers41

I’m the developer of awesome-streamlit.org. So thanks for your feedback above :+1:

I’m experimenting with both Streamlit and Voila because I can just see the potential of lowering the friction for creating (analytics) apps in Python.

There are a few comparison apps in the Gallery at awesome-streamlit.org. And I’m also trying to do a more general comparison here https://github.com/MarcSkovMadsen/awesome-analytics-apps.

For me the differences are really about the detail. Currently the main points are

  • A hello world example is much, much easier to create with Streamlit for somebody new to Python. It’s just one package to install and a simple .py file. So it’s much easier to teach inexperienced pythonistas/ data scientists how to create simple apps. There is a huge untapped potential there.
  • Streamlit comes with a working layout/ style out of the box. Voila does not. This is really important for development speed. For this reason alone I prefer Streamlit. As soon as advanced HTML and CSS knowledge is required, then from my experience I end up using a lot of time there. And I should not.
  • Streamlit starts up faster than Voila. Because Voila needs to start a new Kernel per user. I’m in doubt whether Voila can scale to many users because of this “one kernel per user”
  • Both Streamlit and Voila can provide a develop-and-test-in-your-editor (rather than notebook) experience. For Streamlit it’s out of the box. For Voila you can set it up like that by having a simple notebook where you import and run your app.py file.
  • Voila has a huge unlocked potential wrt. widgets and layouts. Everything is there. And it’s fast and performant. But nobody has put it together yet. So it takes a lot of time to put together. And the API is cumbersome because it really developed from many different packages that have been added “on top” of existing functionality over a long period of time. And you have to really spend time on Google in order to put things together.
  • With Voila you can develop really advanced stuff using callbacks and jslinks. And it’s really fast. Maybe because you only run small pieces of code for each “call”.
  • Deployment of Streamlit is simpler because it does not have a lot of dependencies. As far as I understand nobody has really described how to deploy voila apps yet. And I have a feeling it includes setting up a jupyter hub server which seems complex to me.
  • With Voila you can serve a folder of apps. With Streamlit you cannot serve a folder of apps directly. You can serve one app.py file and then you have to select the app from a st.selectbox or st.radio that you provide.
  • Voila is a part of the Jupyter ecosystem with many users and contributors. That could be a huge advantage.
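
The "one app.py plus a selector" approach from the list above could look roughly like this. It is only a sketch: the page functions and the PAGES registry are hypothetical names, and the st module is passed in as a parameter purely so the dispatch logic can be exercised without Streamlit installed.

```python
from typing import Callable, Dict


def stt_demo(st) -> None:
    """Placeholder page; real STT widgets would go here."""
    st.title("STT demo")
    st.write("Speech-to-text widgets would go here.")


def tts_demo(st) -> None:
    """Placeholder page; real TTS widgets would go here."""
    st.title("TTS demo")
    st.write("Text-to-speech widgets would go here.")


# Hypothetical registry: label shown in the selector -> function drawing that page.
PAGES: Dict[str, Callable] = {"STT demo": stt_demo, "TTS demo": tts_demo}


def main(st) -> None:
    """Let the user pick a page in the sidebar, then draw only that page."""
    choice = st.sidebar.selectbox("Choose an app", list(PAGES))
    PAGES[choice](st)
```

In a real app.py you would finish with `import streamlit as st` followed by `main(st)`, and start it with `streamlit run app.py`.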

If Streamlit could provide something similar to ipywidgets, including Layout and extensions like qgrid, and with the ability to set up callbacks, it could be a game changer for me.

If Voila could simplify the startup time, api, layout and deployment story then it would be a game changer for me.

And then there’s also Bokeh, Panel, Dash, Flask and Django to consider depending on your use case.

But Streamlit is by far the easiest to get started with and the fastest to get to production with. It’s a huge advantage.

If you (yes you :slight_smile:) believe I’m wrong, then maybe it’s because I’m wrong. Please challenge or enlighten me. Thanks.


@Marc
Many thanks for the detailed analysis.

First of all a couple of words about our background - we do ML, we know a bit of python and math, I used to code a bit using php + ajax years ago (using pre-built js modules), but anything js and / or callback related is a major pain (doable, but veeeery slow). Probably, this is an archetypical description of an ML team.

The key pain for us as a team is that we do not really need a dedicated web programmer (and if we buy work from a freelancer, I will have to support it), but our market puts glossy marketing above proper models and datasets, so if we go public with our demos, they have to be perfect. Whenever I tried building web apps from scratch they looked horrible (for some reason people also find the stock Jupyter look not very appealing).

So maybe you will find the below comments helpful

  • A hello world example is much, much easier to create with Streamlit for somebody new to Python. It’s just one package to install and a simple .py file.
  • Streamlit comes with a working layout/ style out of the box. Voila does not. This is really important for development speed. For this reason alone I prefer Streamlit. It speeds up my development time. But also means it’s so easy to onboard inexperienced pythonistas/ data scientists.
  • Streamlit starts up faster than Voila. Because Voila needs to start a Kernel. I’m in doubt whether Voila can scale to many users because of this “one kernel per user”
  • Deployment of Streamlit is simpler because it does not have a lot of dependencies. As far as I understand nobody has really described how to deploy voila apps yet. And I have a feeling it includes setting up a jupyter hub server which seems complex to me.

All of these are game-changers / deal breakers.
Obviously it is naive to try to bridge a “valley of death” gap with a dedicated web team from day one, but “one kernel per user” is really “expensive”.

Prior to learning about Voila / Streamlit I built a crude annotation app and deployed it for 30+ people using JupyterHub. It was not a rough experience, but all in all the deployment took 2-3 days. And this app eats at least 100-200 MB of RAM per user, which is ludicrous.

In case of a web demo - probably it will mean that during more popular phases / PR moments - it will become a bottleneck. I am not talking about real high-load here, but not being able to serve a web app (not the STT model itself, but just a page) during peaks to say 50 people at the same time is a recipe for disaster.
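
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each concurrent visitor of a one-kernel-per-user page costs about as much RAM as one user of my annotation app (100-200 MB); the real footprint of a Voila kernel may well differ.

```python
def total_ram_gb(users: int, mb_per_user: float) -> float:
    """RAM needed if every concurrent user gets a separate process (1 GB = 1000 MB)."""
    return users * mb_per_user / 1000


# 50 simultaneous visitors, using the assumed 100-200 MB per user from above.
low = total_ram_gb(50, 100)
high = total_ram_gb(50, 200)
print(f"50 users: {low:.0f}-{high:.0f} GB of RAM just to serve the page")
```

So under that assumption a peak of 50 people would already need something like 5-10 GB before the STT model itself is counted.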

  • Both Streamlit and Voila can provide a develop-and-test-in-your-editor (rather than notebook) experience. For Streamlit it’s out of the box. For Voila you can set it up like that by having a simple notebook where you import and run your app.py file.
  • With Voila you can serve a folder of apps. With Streamlit you cannot serve a folder of apps directly. You can serve one app.py file and then you have to select the app from a st.selectbox or st.radio that you provide.

Both things are good enough here

  • Voila has a huge unlocked potential wrt. widgets and layout. Everything is there. And it’s fast and performant. But nobody has put it together yet. So it takes a lot of time to put together. And the API is cumbersome because it really developed from many different packages that have been added “on top” of existing functionality over a long period of time. And you have to really spend time on Google in order to put things together.
  • Voila is a part of the Jupyter ecosystem with many users and contributors. That could be a huge advantage.

This point is very interesting.
But I would argue that in this case the streamlit approach probably would win, because such apps are built for end-users.

Notebooks are so popular because of their flexibility and the lack of bells and whistles when you do not need them. Obviously, inheriting from a large ecosystem is very tempting, but I guess we are not the only ones with limited resources.

In a similar fashion - I tried JupyterLab. It was good, but not good enough, and it added some complexity to the really simple notebook approach. I still cannot switch because it lacks a collapsible-headings plugin.

  • With Voila you can develop really advanced stuff using callbacks and jslinks. And it’s really fast. Maybe because you only run small pieces of code for each “call”.

For teams like us … callbacks and jslinks are black boxes =)

And then there’s also Bokeh, Panel, Dash, Flask and Django to consider depending on your use case.

I suppose that Bokeh, Panel, Dash are more like dedicated BI applications (ofc you can use bokeh just as a cool plotting library). You really need them in enterprise (but people tend to buy Tableau anyway, lol) when you have a billion charts.
In a fast moving ML team setting, whenever I was asked to build a dashboard … I ended up using SQL + Excel (I know what you are thinking) just to avoid the setup cost.
But with Voila or Streamlit, I can just write a plain .py data extraction script (or store the data alongside and update it daily) and build a nice chart as I usually do.

Flask and Django

To be honest I would argue that an ML team would need django only when building some middleware.

All in all, looking forward to cues from the streamlit team on where to start with webrtc.
I hope hacking one plugin in will be easier than customizing Voila.


Hi @snakers41. Thanks for providing really good insights and perspectives. :+1:

Really fascinating conversation guys. It’s shaping our thinking about the future.

@Marc: We are actively thinking about layout primitives and are excited to share a proposal with the community. There are a number of other big initiatives, e.g. a plugin architecture, and so we also need to think about sequencing among these features. Your perspective is always helpful as we think about these issues! :heart:

@snakers41: We definitely encourage forking Streamlit and playing with new features, so we’re excited to hear what you learn from ipywebrtc. We haven’t had time to think about what a process would look like to integrate a contribution of this magnitude. Therefore, please allow me to share some preliminary thoughts now, and please know that I’m open to other perspectives here…

  1. The ability to record sound in Streamlit requires careful API consideration because Streamlit has a rather unique event model and we want to make sure all the ideas flow well together! :ocean: Prior to our accepting any PRs into mainline Streamlit, we would want to make sure we’d had a full discussion about API options.
  2. Streamlit and Jupyter have fundamentally distinct architectures. (We considered unifying them at one point, but realized that that would prevent us from doing a lot of the cool features we planned.) As such, it’s not clear that the ipywebrtc path is going to be optimal for Streamlit, or even possible at all. Just a heads-up.

That said, we absolutely encourage forking Streamlit and seeing what you can do! If nothing else this (very ambitious) experiment could teach us a lot about where Streamlit needs to go over the next year.

Please let us know how we can help!

:heart: Adrien



Links I could not post

@Adrien_Treuille

We haven’t had time to think about what a process would look like to integrate a contribution of this magnitude

Oh, I see now why you say so. I should have read this before suggesting ipywebrtc . Sorry for wasting your time here.

As far as I can see from their code, ipywebrtc is mostly based on heavily modified Jupyter widgets. Your model is very different. So joining them would basically require going up the stack to Jupyter itself, which I am probably not qualified to do =(

Also I see that you use tornado (which is another black box for me) and probably I am jumping over my head again … but tornado has this (I cannot post links for some reason):

Maybe this is just another dumb idea, but just consider this:

  • If you load a webrtc recorder in JS, like here for example (I cannot post links for some reason)
  • JS does its job
  • Your app deals only with an audio file that js saves to a local folder
  • Timestamp + some uid can be used to distinguish between files for several users
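
The file-naming part of this idea could be sketched as below. This covers only the Python side and assumes the JS recorder can be told which file name to write; the folder name and helper functions are made up for illustration.

```python
import time
import uuid
from pathlib import Path
from typing import List, Optional

# Hypothetical folder where the JS recorder would drop its files.
UPLOAD_DIR = Path("recordings")


def new_session_id() -> str:
    """Random id generated once per visitor, used to tell users apart."""
    return uuid.uuid4().hex


def recording_path(session_id: str, now_ns: Optional[int] = None) -> Path:
    """Build '<timestamp in ns>_<session id>.wav' inside UPLOAD_DIR."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    return UPLOAD_DIR / f"{now_ns}_{session_id}.wav"


def recordings_for(session_id: str, folder: Path = UPLOAD_DIR) -> List[Path]:
    """List one user's recordings, oldest first (names start with the timestamp)."""
    return sorted(folder.glob(f"*_{session_id}.wav"))
```

The app would then only ever read files matching its own session id, so clips from several simultaneous users do not get mixed up.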

I am not sure if what I am suggesting makes sense though. Someone did this with tornado (I cannot post links for some reason) + web rtc, but I cannot tell if it is legit.


I also noticed that in the charts section of the docs you provide some wrappers around JS libraries.
Maybe that would also work for WebRTC?
