Is it possible to stream video and overlay bounding boxes?

I have a CV application that spits out bounding boxes. Rather than creating a copy of every video just to overlay bounding boxes, it would be awesome if the boxes could be overlaid on the video during playback.

Is this possible to overlay content like bounding boxes on videos in streamlit?


Hi @ClaytonSmith

st.video doesn’t have the ability to automatically overlay bounding boxes right now, so you would have to, as you say, first create another video (even if in-memory) that has the bounding boxes and then pass it to st.video.

Another solution, depending on your exact use case, would be to calculate bounding boxes for each frame of the video and pass the frames one by one to st.image instead.

Something like this:
(Note: this code is untested!)

import cv2
import streamlit as st

vid_obj = cv2.VideoCapture(path)

frame = st.empty()

while True:
  success, image = vid_obj.read()
  if not success:  # stop when the stream ends (image would be None)
    break
  bboxed_image = calculate_bounding_box(image)
  frame.image(bboxed_image, channels="BGR")  # OpenCV images are BGR!

I have an app where I’m doing this ^. However, what I’m noticing is that there seems to be a lot of overhead in swapping out the images frame by frame. The rendering of the frames in the app can’t keep up with my object detector and the CPU usage goes through the roof (on a pretty beefy dev server). Is there any way to improve the efficiency of the image widget? Is there some other way to do this more efficiently?

Hi @zjorgensenbits, thanks for your question. This seems a really interesting problem. Can you share a mini-repro so that we can start repro-ing and do some profiling? I have filed a github issue and I would appreciate if you add any extra information there: https://github.com/streamlit/streamlit/issues/720

Best,
Matteo


@monchier - Sorry for the delay – here’s a quick snippet you can use to reproduce the issue. I’m not seeing the super high CPU usage with this like I am in my other script, but you’ll see that Streamlit can’t keep up with the video stream at all. I believe the actual video streams at 30fps but Streamlit is only able to render the frames at a few FPS.

import streamlit as st
import cv2

image_placeholder = st.empty()
if st.button('Start'):
    video = cv2.VideoCapture('https://videos3.earthcam.com/fecnetwork/9974.flv/chunklist_w372707020.m3u8?__fvd__')
    while True:
        success, image = video.read()
        if not success:  # guard against passing None to st.image
            break
        image_placeholder.image(image, channels="BGR")

Yeah, @zjorgensenbits, it seems Streamlit is not keeping up. I checked the CPU, but it stays around 50%. It may be that the latency of sending the images over the WebSocket is what is causing the lag. That said, the better solution may be to add overlay functionality to st.video. What API change would you need? It is not clear to me how to define a bounding box or how it could be passed to something like st.video. We can open a feature request and add this to our engineering pipeline.

Best,
Matteo

My recommendation is twofold:

  1. Render bounding boxes on the web. I think it’s easier and more efficient to render bounding boxes in an HTML5 canvas than to render a new video stream in real time.

  2. Define a schema for bounding boxes and require users to conform to it. I’d use dataframes with the columns: frame number, color, box label, top, left, bottom, right, confidence.


Hi guys,

There’s an improvement going into the next release (hopefully) that will help a lot with this. In a side-by-side comparison running the example given by @zjorgensenbits using the current Streamlit release versus the development branch, the development branch’s display rate seemed to be at least 50% better in terms of fps than the current release.

It’s not perfect but it’s definitely improved.

We’ll keep working on this kind of stuff!


Hi @zjorgensenbits, @ClaytonSmith,

Could you give things a try with the latest version of Streamlit?

I’m going to close issue 720; feel free to re-open it if the performance could use further improvement.

import streamlit as st
import time
import cv2

if st.button('Start'):
    video = cv2.VideoCapture('https://www.mediacollege.com/video-gallery/testclips/20051210-w50s.flv')
    video.set(cv2.CAP_PROP_FPS, 25)

    image_placeholder = st.empty()

    while True:
        success, image = video.read()
        if not success:
            break
        image_placeholder.image(image, channels="BGR")
        time.sleep(0.01)