New Component: streamlit-webrtc, a new way to deal with real-time media streams

I have a question:
Is it possible to change the audio input device from microphone to audio out/speakers?
Such that something can be played and worked with, like you can do with the voice from the mic?

WEBRTC_CLIENT_SETTINGS is not passed to webrtc_streamer.

The iceServers information provided via the client_settings argument of webrtc_streamer() is necessary to establish the connection over the Internet.
To learn about this, see Python WebRTC basics with aiortc - DEV Community, or google keywords like “ICE” or “STUN” together with “WebRTC”.
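For example, a minimal configuration sketch (the STUN URL below is Google's public server, the key name "example" is arbitrary, and the webrtc_streamer call is left as a comment since it needs a running Streamlit session):

```python
# ICE server configuration in the shape expected by ClientSettings'
# rtc_configuration field (sketch; Google's public STUN server).
WEBRTC_CLIENT_SETTINGS = {
    "rtc_configuration": {
        "iceServers": [{"urls": ["stun:stun.l.google.com:19302"]}]
    },
    "media_stream_constraints": {"video": True, "audio": True},
}

# In the app, pass it explicitly so it is not silently ignored:
# from streamlit_webrtc import ClientSettings, webrtc_streamer
# webrtc_streamer(
#     key="example",
#     client_settings=ClientSettings(**WEBRTC_CLIENT_SETTINGS),
# )

print(WEBRTC_CLIENT_SETTINGS["rtc_configuration"]["iceServers"][0]["urls"][0])
# → stun:stun.l.google.com:19302
```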


Is it possible to change the audio input device from microphone to audio out/speakers?

You can select the input audio device via the “SELECT DEVICE” button, but I can’t understand what changing the input device to speakers means.
At least I think it’s not streamlit-webrtc's duty.

Such that something can be played and worked with, like you can do with the voice from the mic?

You can play audio files via a MediaPlayer object and apply processing to its stream. See app.py for the usage.
However, I can’t understand how it relates to the question above.
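A rough sketch of the MediaPlayer approach (the file path is a placeholder, and the webrtc_streamer call is shown as a comment because it needs a browser session; MediaPlayer comes from aiortc, which streamlit-webrtc builds on):

```python
def create_player():
    # MediaPlayer wraps a media file (or capture device) as a WebRTC source.
    # Imported lazily here so the sketch stays self-contained.
    from aiortc.contrib.media import MediaPlayer
    return MediaPlayer("./sample.mp3")  # placeholder path

# In the app, the factory is handed to webrtc_streamer, e.g.:
# webrtc_streamer(
#     key="media",
#     mode=WebRtcMode.RECVONLY,
#     player_factory=create_player,
# )
```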


@whitphx Thanks. It fixed the problem. However, when I deploy on Google Cloud, the problem still exists.

Hi community,

A new version, v0.24, has been released.
With this version, more flexible stream connections become possible, such as forking and mixing.

A new sample code app_multi.py shows such forking and mixing functionality:

A single input is forked to multiple outputs with different filters:

Multiple inputs with different filters are mixed to a single output:

In addition, app_videochat.py is also an interesting example. Forking and mixing media streams are necessary parts for building a video chat system, in combination with streamlit-server-state, which enables communication across sessions.

Please also see A video chat app with realtime snapchat-like filters! for video chat apps!


Hi, thanks for the fast answer.
I have seen and used the MediaPlayer object. I was wondering whether this could be made independent of such a MediaPlayer function by grabbing the audio stream that is going to the speakers, for example from YouTube, and working with that, like you can do with a digital soundcard.

I have a new question. I’m using the audio processor factory to send audio frames to a backend and create subtitles for videos. This is working, but the audio, and sometimes the video, gets laggy because of a huge amount of “Thread ‘async_media_processor_3’: missing ReportContext” errors. Because I am not starting the async_media_processor myself, I cannot give it the ReportContext. Do you know how to handle this problem?

Thanks for your work!

When starting your app_videochat.py, I encountered the following problem:

ImportError: cannot import name ‘WebRtcStreamerContext’ from ‘streamlit_webrtc’
(/usr/local/lib/python3.8/dist-packages/streamlit_webrtc/__init__.py)

streamlit 0.85.1
streamlit-server-state 0.2.0
streamlit-webrtc 0.24.0

I had to run pip install --upgrade to get to version 0.25.1; now it works. Just in case someone stumbles on this because they think 0.24 is the newest version :slight_smile:


Hi,

You had 2 kinds of problems below,

  1. Thread 'async_media_processor_3': missing ReportContext
  2. ImportError: cannot import name ‘WebRtcStreamerContext’ from ‘streamlit_webrtc’

and the second one has been resolved by updating streamlit-webrtc, right?

For 1., I have not seen such a problem, and it’s interesting.
Is the message Thread 'async_media_processor_3': missing ReportContext the only line you see?
Or are there any other messages, like a stack trace? If so, please copy and paste all the lines of the errors.
I would appreciate it if you could create a new GitHub issue for further investigation and discussion in that case.

For 2. your solution is right. That is a bug introduced in 0.24.0 and fixed in 0.24.1.

Hey, I created an issue.
Maybe you know a different way to get the frames out of the webrtc_streamer function?

If I do operations (calculations, get/post, writing to file), for example in your “OpenCVVideoProcessor”/“AudioProcessor” classes, it leads to lag and the errors from the issue.

If I try to just pass frames from “recv” into a global object (a list), the problem is that the Python ID of the global object used inside the VideoProcessor is not the same as outside of the webrtc_streamer function, so I can’t do my operations without lag on the “outside” using an outside function that gets its own thread and just uses the global list of frames.

Thank you for creating the issue.

Do you want to deal with the frames outside recv()?
Though I’m not sure if it’s related to the issue, New Component: streamlit-webrtc, a new way to deal with real-time media streams - #23 by whitphx may help.
As you said, recv() runs in a different thread, and global objects cannot be accessed from inside recv().
Instead, you can pass values through the processor instance’s attributes, as in the example above, since they can be accessed both from recv() and the main thread.
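The pattern looks roughly like this. Note this is a pure-Python stand-in for illustration: in a real app the class would subclass VideoProcessorBase/AudioProcessorBase, and recv() would be called by streamlit-webrtc's worker thread with av frames rather than integers.

```python
import threading

class FrameCollector:
    """Stand-in processor: recv() runs on a worker thread, so results
    are handed to the main thread via instance attributes under a lock."""

    def __init__(self):
        self.frames_lock = threading.Lock()
        self.frames = []

    def recv(self, frame):
        # Called from the processing thread for each incoming frame.
        with self.frames_lock:
            self.frames.append(frame)
        return frame

collector = FrameCollector()

# Simulate the worker thread feeding five frames.
worker = threading.Thread(
    target=lambda: [collector.recv(i) for i in range(5)]
)
worker.start()
worker.join()

# The main thread reads the shared attribute under the same lock.
with collector.frames_lock:
    print(len(collector.frames))  # → 5
```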


Hi developers, amazing contribution and work on streamlit-webrtc, I really appreciate your efforts. I wanted to know if it is possible to process a frame every few seconds without stopping the real-time media stream? Thank you

hello @whitphx
I made an STT web app using Streamlit and Azure Cognitive Services and deployed it on Heroku. But the problem with the web app is that it is unable to record voice, as there’s no microphone on the server. How does streamlit-webrtc help me resolve this?

Can you share a code snippet showing how to connect the mic?

@Silvester_Stephens
It is exactly what streamlit-webrtc was created for.
Did you see the sample apps, for example, real-time object detection? https://share.streamlit.io/whitphx/streamlit-webrtc-example/main/app.py
Or is it different from what you want?

@mlnewbie987

how could I save audio to wavfile? · Issue #357 · whitphx/streamlit-webrtc · GitHub might help.

Also, streamlit-stt-app/app_deepspeech.py at main · whitphx/streamlit-stt-app · GitHub is an STT app I have created with streamlit-webrtc. It is different from yours in that it does not use an external API and deals with all audio chunks in memory without writing them out to files, though it still might be helpful.
This STT app contains 2 types of implementation.

  1. Using audio_receiver property of the context object: streamlit-stt-app/app_deepspeech.py at 53aa726415a4add1b411829d8bd1b4eba007336b · whitphx/streamlit-stt-app · GitHub
    In this way, you can see the audio frames (chunks) are obtained from ctx.audio_receiver object here. This is similar to the one linked above in that both use the ctx.audio_receiver object.
  2. Using AudioProcessor class: streamlit-stt-app/app_deepspeech.py at 53aa726415a4add1b411829d8bd1b4eba007336b · whitphx/streamlit-stt-app · GitHub
    In this way, like above, you can get audio frames in _recv_queued(). See here. In this app, the obtained frames are stored in AudioProcessor.frames property and consumed here afterwards.
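To illustrate the first (audio_receiver) style without a live connection, here is a stdlib-only stand-in: a queue plays the role of ctx.audio_receiver, and get_frames() drains whatever chunks have arrived, mimicking how the real receiver hands out batches of frames.

```python
import queue

class FakeAudioReceiver:
    """Stdlib stand-in for ctx.audio_receiver (illustrative only)."""

    def __init__(self):
        self._frames = queue.Queue()

    def put(self, frame):
        # In reality the WebRTC worker thread enqueues incoming frames.
        self._frames.put(frame)

    def get_frames(self, timeout=None):
        # Block for at least one frame, then drain the rest.
        frames = [self._frames.get(timeout=timeout)]
        while True:
            try:
                frames.append(self._frames.get_nowait())
            except queue.Empty:
                return frames

receiver = FakeAudioReceiver()
for i in range(4):
    receiver.put(f"chunk-{i}")

# The main script polls in a loop; here, one iteration.
chunks = receiver.get_frames(timeout=1)
print(len(chunks))  # → 4
```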

Hi I’m trying to make an app that can record video through a webcam and then save that video as a file to the computer.

Do you know how I could do that by using streamlit-webrtc?

@FarzeenF

Hi, please check app_record.py in the repo for recording video/audio.
Though it can record media streams into files on a server, Streamlit does not yet provide an official method to download them. For that, search the forum or other web pages for workarounds, such as How to download file in streamlit.
st.download_button is supported from v0.88.0: Version 0.88.0

Hey whitphx and everyone! I wrote a blog on Streamlit and described how to use webrtc snapshot functionality to do object detection with a TF lite model. If you’re curious, check it out. Also, here’s the app.

Again, thanks whitphx for all the hard work! :star_struck:


@soft-nougat Hi,
What a great work! Thank you :smiley:


Hey, I want to thank you too, whitphx! You built a really good tool!
Also, the new server_state is a nice extension!! :astonished:

I still have questions that are actually “documentation” questions, but as far as I know you are the only source of truth :slight_smile:

As you know, I am playing with the audio stream and therefore always trying to grab it from Streamlit.

When using “mode=WebRtcMode.SENDONLY”, a way to get it is “webrtc_ctx.audio_receiver.get_frames”.
When using “mode=WebRtcMode.RECVONLY”, something like “audio_processor_factory=AudioProcessor” works via “AudioProcessorBase”.

Now I was looking at your example_videochat, where I would like to get the audio of the opponent, which you receive via SFU over “mode=WebRtcMode.RECVONLY” with “source_audio_track=ctx.input_audio_track”. What is the best way to get the audio frames?

I tried to use “audio_processor_factory=AudioProcessor” again and “ctx.audio_processor.frames_lock” like you did in another example, but I get “AudioReceiver is not set”, even though, if I check, there are frames passing through the “AudioProcessor”…

Thanks again :smiley:


@Bonian_Riebe
Hi,

Your understanding is all correct.
(If I can add one thing: you can use audio_processor_factory with RECVONLY mode only when source_audio_track is set.)

Then, for your main question, the answer would be:

  • If you want to record the audio, use the recorder.
  • If you want to analyze audio frames, access them inside AudioProcessor.recv(). There are two ways to do it.
    1. Set the AudioProcessor on the input webrtc_streamer. In this case, the input webrtc_streamer's mode must be set to SENDRECV. SENDONLY does not support processors now.
    2. Use “process_track” inserted between the input and output webrtc_streamers, like the MCU with filters example, and set the AudioProcessor on the process_track.
      • If you want to set mode=RECVONLY on the input, this is the way to go.

FYI: In addition, you cannot use audio_receiver in this case where streamlit-server-state is used.
To use the receiver, you have to set up a loop and check the existence of ctx.audio_receiver in each epoch, like the audio receiver example (I think the reason you got “AudioReceiver is not set” is that you skipped this check).
However, streamlit-server-state cannot work when there is a running loop, because it requests sessions to rerun when the state is updated, but that is blocked by the running loop.
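An iteration-by-iteration sketch of that existence check (pure Python; Ctx here is a hypothetical stand-in for the object returned by webrtc_streamer, whose audio_receiver stays unset until the connection is up):

```python
import queue

class Ctx:
    """Hypothetical stand-in: audio_receiver is None until connected."""
    def __init__(self):
        self.audio_receiver = None

ctx = Ctx()
seen = []

for epoch in range(3):
    if epoch == 1:
        # Simulate the connection coming up mid-loop.
        ctx.audio_receiver = queue.Queue()
        ctx.audio_receiver.put("chunk-0")

    if ctx.audio_receiver is None:
        continue  # skipping this check is what triggers "AudioReceiver is not set"

    try:
        seen.append(ctx.audio_receiver.get_nowait())
    except queue.Empty:
        pass

print(seen)  # → ['chunk-0']
```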

I haven’t actually tested this answer, so please let me know if something is wrong.
Additionally please tell me if the current API cannot cover your needs.

Hi @whitphx,

Thank you so much for creating this module; I really appreciate it, as I am working to bring this to students in my class. Anyway, I ran into a small problem and would like to ask for help from you and fellow community members. I have used your previous code to take a snapshot of a frame. I am actually trying to capture a series of frames, but unfortunately, when I append the images to a list and then output an image from that list, the image is in a much lower resolution. I would like to keep it in a higher resolution but do not know how. I appreciate your help. Thank you

Here is a snippet of my code (imports added for completeness):

import threading
from typing import Union

import av
import numpy as np
import streamlit as st
from streamlit_webrtc import VideoProcessorBase, webrtc_streamer

img_counter = 5  # number of frames to capture (set elsewhere in the full app)

def main():
    class VideoTransformer(VideoProcessorBase):
        frame_lock: threading.Lock  # recv() runs in another thread, so a lock object is used here for thread-safety.
        in_image: Union[np.ndarray, None]

        def __init__(self) -> None:
            self.frame_lock = threading.Lock()
            self.in_image = None
            self.img_list = []

        def recv(self, frame: av.VideoFrame) -> av.VideoFrame:
            in_image = frame.to_ndarray(format="bgr24")

            global img_counter

            with self.frame_lock:
                self.in_image = in_image
                if img_counter > 0:
                    print("capturing image")
                    self.img_list.append(in_image)
                    img_counter -= 1
                if img_counter == 0:
                    print("here")
                    # photo_capture = False
                    # analysis_complete = True

            return av.VideoFrame.from_ndarray(in_image, format="bgr24")

    ctx = webrtc_streamer(key="snapshot", video_processor_factory=VideoTransformer)

    if ctx.video_processor:
        if st.button("Snapshot"):
            with ctx.video_processor.frame_lock:
                in_image = ctx.video_processor.in_image
                img_list = ctx.video_processor.img_list

            if in_image is not None:  # put in column form, 5 images in a row
                st.write("Input image:")
                st.image(in_image, channels="BGR")
                st.image(img_list[0], channels="BGR")  # lower resolution than in_image above
            else:
                st.warning("No frames available yet.")

if __name__ == "__main__":
    main()