New Component: streamlit-webrtc, a new way to deal with real-time media streams

@Om_Surushe Hi, I think it is not possible with the current version; the issue linked below will cover it. Please be patient.

Thank you for your reply. I figured out a way to pass my source video frames using a lock, but the same approach did not work with audio:

import threading

import av
from streamlit_webrtc import WebRtcMode, webrtc_streamer

# lock guarding the state shared with the callback threads
lock = threading.Lock()
video = {
    'video_frame': None,
    'video_count': -1,
    'audio_frame': None,
    'audio_count': -1,
}


def video_frame_callback(frame: av.VideoFrame) -> av.VideoFrame:
    # with lock:
    #     frame_list = video['video_frame']
    #     count = video['video_count']
    
    # if count == -1:
    #     container = av.open('question_0.avi')
    #     frame_list = list(container.decode(video=0))
    #     count = 0
    #     with lock:
    #         video['video_frame'] = frame_list
    #         video['video_count'] = count
    # else:
    #     count += 1
    #     with lock:
    #         video['video_count'] = count

    # if count >= len(frame_list):
    #     count = 0
    #     with lock:
    #         video['video_count'] = count
    # print("video ",count)
    # return frame_list[count]
    return frame

def audio_frame_callback(frame: av.AudioFrame) -> av.AudioFrame:
    # with lock:
    #     frame_list = video['audio_frame']
    #     count = video['audio_count']
    
    # if count == -1:
    #     container = av.open('question_0.wav')
    #     frame_list = list(container.decode(audio=0))
    #     count = 0
    #     with lock:
    #         video['audio_frame'] = frame_list
    #         video['audio_count'] = count
    # else:
    #     count += 1
    #     with lock:
    #         video['audio_count'] = count

    # if count >= len(frame_list):
    #     count = 0
    #     with lock:
    #         video['audio_count'] = count
    # print("audio ",count)
    # print(type(frame_list[count]))
    # return frame_list[count]
    return frame

ctx = webrtc_streamer(
    key="omg",
    video_frame_callback=video_frame_callback,
    audio_frame_callback=audio_frame_callback,
    media_stream_constraints={
        "video": True,
        "audio": True,
    },
    rtc_configuration={
        "iceServers": [{"urls": ["stun:stun.l.google.com:19302"]}]
    },
    mode=WebRtcMode.SENDRECV,
)

Here, question_0.avi is my source file.


@Om_Surushe
I see.
If it is OK to upload video and audio from the client and then ignore them, the SENDRECV mode can be used just as you did.

In that case, the callback must return a frame object whose properties are the same as the input frame's, because its original purpose is to transform the input into the output. For audio, for example, those properties include the number of channels and the sampling rate.
I guess this is why your code didn't work.

As I am not an audio expert, I don’t know the best practices for manipulating such props, but I used pydub for it in an audio example linked below, FYI.
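
For illustration, here is a rough sketch (not the pydub example referenced above) of returning a replacement audio frame whose properties match the input. The packed "s16" format is an assumption about what the browser sends, so check frame.format.name in practice:

import av
import numpy as np

def audio_frame_callback(frame: av.AudioFrame) -> av.AudioFrame:
    # Replace the incoming audio with silence while keeping the frame
    # properties consistent with the negotiated track.
    channels = len(frame.layout.channels)
    # Packed 16-bit "s16" is assumed here; inspect frame.format.name first.
    silence = np.zeros((1, frame.samples * channels), dtype=np.int16)
    new_frame = av.AudioFrame.from_ndarray(silence, format="s16", layout=frame.layout.name)
    new_frame.sample_rate = frame.sample_rate
    new_frame.pts = frame.pts
    new_frame.time_base = frame.time_base
    return new_frame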

Hello @whitphx,
Is this supported now?
I checked the link below, but I am not sure about it.
Thank you.

No, it's rather tracked in this issue, as written in New Component: streamlit-webrtc, a new way to deal with real-time media streams - #128 by whitphx, and it's not available yet.


It seems like GitHub - whitphx/streamlit-stt-app: Real time web based Speech-to-Text app with Streamlit (https://whitphx-streamlit-stt-app-app-deepspeech-m6tt1k.streamlit.app/) is not working on Streamlit Cloud. It was working until yesterday, but something went wrong and now it does not work.

@Vishnu_Teja Thank you for the report.
It may be due to streamlit-webrtc is not working and it is not due to component · Issue #6330 · streamlit/streamlit · GitHub, which is under investigation and not yet fixed. Please track that issue.

WebRTC apps hosted on the Community Cloud had been broken as reported in Inconsistent issue with streamlit-webrtc in streamlit app · Issue #1213 · whitphx/streamlit-webrtc · GitHub, but they are working again after a fix.
Please let me know if anything is still broken.

@Vishnu_Teja The STT app should also work now. Please check it :slight_smile:

Hey all, I'm creating a web app that recognizes emotion from real-time video using the DeepFace library. I am able to get the webcam activated and run real-time analysis on my local computer. This works perfectly while the camera is running, but when I hit the Stop button, I receive an error about setting the detected dominant emotion on st.session_state["user_emotion"]. I am able to set the "user_emotion" session-state variable to the detected emotion until I hit the Stop button. My error is given below:

Traceback (most recent call last):
  File "/Users/v.esau.hutcherson/.local/share/virtualenvs/StreamLit-ohTsyygW/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/Users/v.esau.hutcherson/StreamLit/pages/listings.py", line 146, in <module>
    if st.session_state["user_emotion"] == "neutral" or "suprised" or "happy":
  File "/Users/v.esau.hutcherson/.local/share/virtualenvs/StreamLit-ohTsyygW/lib/python3.10/site-packages/streamlit/runtime/state/session_state_proxy.py", line 90, in __getitem__
    return get_session_state()[key]
  File "/Users/v.esau.hutcherson/.local/share/virtualenvs/StreamLit-ohTsyygW/lib/python3.10/site-packages/streamlit/runtime/state/safe_session_state.py", line 111, in __getitem__
    raise KeyError(key)
KeyError: 'user_emotion'

My code for the webrtc_streamer and for the deep face integration is implemented like this:

import threading

import cv2
import streamlit as st
from deepface import DeepFace
from streamlit_webrtc import webrtc_streamer

# lock guarding the image shared between the callback thread and the script
lock = threading.Lock()
img_container = {"img": None}

face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

def video_frame_callback(frame):
    img = frame.to_ndarray(format="bgr24")
    with lock:
        img_container["img"] = img
    return frame

frame_rate = 1
ctx = webrtc_streamer(
    key="example",
    video_frame_callback=video_frame_callback,
    media_stream_constraints={
        "video": {"frameRate": {"ideal": frame_rate}},
    },
    video_html_attrs={
        "style": {"width": "50%", "margin": "0 auto", "border": "5px purple solid"},
        "controls": False,
        "autoPlay": True,
    },
)

if "emotion" not in st.session_state:
    st.session_state["emotion"] = ""

while ctx.state.playing:
    with lock:
        img = img_container["img"]
    if img is None:
        continue
    emotion_data = DeepFace.analyze(img_path=img, actions=['emotion'], enforce_detection=False)
    if emotion_data != []:
        st.session_state["emotion"] = emotion_data[0]["dominant_emotion"]

The data that I receive from the DeepFace.analyze method is given like this:

[
  {
    "emotion": {
      "angry": 0.0645486346911639,
      "disgust": 0.0000023556083306175424,
      "fear": 0.0018471573639544658,
      "happy": 95.05292773246765,
      "sad": 0.23144783917814493,
      "surprise": 0.10018055327236652,
      "neutral": 4.549040272831917
    },
    "dominant_emotion": "happy",
    "region": {
      "x": 206,
      "y": 103,
      "w": 241,
      "h": 241
    }
  }
]

I assumed I should always be able to access the analyzed dominant emotion via emotion_data[0]["dominant_emotion"] and assign it to the st.session_state["user_emotion"] variable. However, when the camera runs for around 30 seconds or more, I receive an error saying the st.session_state variable "user_emotion" does not exist. Does anyone know of a fix?
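
A hedged observation, not a confirmed fix: the snippet above initializes st.session_state["emotion"], while the traceback reads st.session_state["user_emotion"], so after the Stop button triggers a rerun that key may never have been set. A defensive sketch (the default value is hypothetical):

# Sketch only: default the key before reading it.
if "user_emotion" not in st.session_state:
    st.session_state["user_emotion"] = "neutral"  # hypothetical default

# Note: `x == "neutral" or "suprised" or "happy"` is always truthy in Python;
# a membership test is probably what the line in the traceback intends.
if st.session_state["user_emotion"] in ("neutral", "surprised", "happy"):
    ...  # hypothetical page logic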

Hello @whitphx ,

I have a small question: how can I use webrtc_streamer to access the image frame and also the frame number at the same time, as I need to pass both into a function? Also, I need to stop the stream at frame_number == 23.

Thank you ,

Hi,

So I have tried this, however I get this error message: AttributeError: 'list' object has no attribute 'render'

The entire code:

import numpy as np
import cv2
import av
from ultralytics import YOLO
from streamlit_webrtc import webrtc_streamer

model = YOLO('yolov8n-seg.pt')


def video_frame_callback(frame):
    image = frame.to_ndarray(format="bgr24")

    results = model(image)
    output_img = np.squeeze(results.render())
    #output_img = np.squeeze(results.render()[0])

    return av.VideoFrame.from_ndarray(output_img, format="bgr24")


webrtc_streamer(key="example",
                video_frame_callback=video_frame_callback,
                media_stream_constraints={"video": True, "audio": False})

Kindly assist.

Hi @whitphx, thank you for the great work you have done!
I noticed there are video_frame_callback and audio_frame_callback in the Callbacks. Is there a way to deal with both video and audio in a single callback? My intention is to process the input audio, and transform it into a streaming video, if there is no such callbacks, is there any work-around to deal with that?

Thanks very much !!!

@Weimeng_Luo I commented in Is there a way to process both video and audio in one callback? · Issue #1329 · whitphx/streamlit-webrtc · GitHub which is the same topic. Thanks

@AfroLogicInsect Looks like the error message tells the exact reason…? I'm not familiar with the ultralytics package. You should check the type of results.
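
For what it's worth, a hedged sketch of one possible fix: in the ultralytics package, model(image) returns a list of Results objects, which have no .render() method (that belongs to the older yolov5 hub API); Results.plot() draws the predictions and returns an annotated BGR ndarray:

def video_frame_callback(frame):
    image = frame.to_ndarray(format="bgr24")
    results = model(image)          # a list of ultralytics Results objects
    output_img = results[0].plot()  # .plot() returns the annotated BGR image
    return av.VideoFrame.from_ndarray(output_img, format="bgr24")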

@stanny370599 Hi, there is no built-in frame counter. You should implement it yourself by incrementing a counter variable in a callback.
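
A minimal sketch of that suggestion (all names are illustrative): count frames inside the callback and share the counter with the main script via a lock; the script can then poll the counter, e.g. to react once it reaches 23.

import threading

import av
from streamlit_webrtc import webrtc_streamer

lock = threading.Lock()
state = {"frame_number": 0}

def video_frame_callback(frame: av.VideoFrame) -> av.VideoFrame:
    with lock:
        state["frame_number"] += 1
        n = state["frame_number"]
    # my_function(frame, n)  # hypothetical function taking the frame + number
    return frame

ctx = webrtc_streamer(key="frame-counter", video_frame_callback=video_frame_callback)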

@Esau_Hutcherson I can’t find what’s wrong. What’s the code consuming st.session_state["user_emotion"]?

Hi, thanks for a great framework. I'm just getting started. I need some support, please: how can I launch the application without pressing the "Start" button?

@leggion Hi,
setting the desired_playing_state argument to True can do it.

↓This sample helps.
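
A minimal sketch of that option (the key name is arbitrary):

from streamlit_webrtc import webrtc_streamer

# desired_playing_state=True asks the component to start streaming
# immediately, without waiting for the user to press "Start".
webrtc_streamer(
    key="autostart",
    desired_playing_state=True,
)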


Is it possible to somehow set the resolution of the frame inside recv(frame)?
Maybe there is a way to set enableCpuOveruseDetection = false?

I am trying to take a snapshot of the frame, and the resolution is low quality on mobile.

Thanks
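
For reference, a hedged sketch of one common approach: rather than changing the resolution inside the callback, request a higher capture resolution from the browser via media_stream_constraints (standard getUserMedia constraints; the values are illustrative and the browser may not honor them):

from streamlit_webrtc import webrtc_streamer

webrtc_streamer(
    key="high-res",  # hypothetical key
    media_stream_constraints={
        "video": {
            "width": {"ideal": 1920},   # requested, not guaranteed
            "height": {"ideal": 1080},
        },
        "audio": False,
    },
)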