Video input to a deep learning model

I am trying to deploy a Streamlit app that serves my trained deep learning models.
Two models are deployed: the output of the first model is processed and fed into the second model, which produces the final output.
I display the first model's input and output side by side using st.beta_columns, and do the same for the second model's input and output in another row.
One frame of the video is processed and all the outputs are displayed as needed. However, as soon as the next frame is fed to the models, the page scrolls down and another set of two rows is created with the new frame in it. What I want is for the columns created for the first frame to be updated in place, without the app appending another set of two rows.
Currently the video is fed in frame by frame and new rows are created for every frame, which results in a very long page where I have to scroll to the bottom to see the latest frame.
I will attach an image for reference to show how it looks.


Hi @muazshahid, welcome to the Streamlit community!

If you want to re-use the same widget space over and over, and avoid the repetition you are seeing, the st.empty() pattern is what you need:

https://docs.streamlit.io/en/stable/advanced_concepts.html#animate-elements
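
In a nutshell, you create the placeholder once and then keep writing into it; each write replaces what was there before. A minimal sketch of the idea (illustrative, along the lines of the animation example on that page):

import time
import streamlit as st

slot = st.empty()  # reserves a single spot on the page

for i in range(100):
    slot.write(f"Frame {i}")  # each call replaces the slot's previous content
    time.sleep(0.1)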

Best,
Randy

I tried that, and I am still facing the same scrolling issue.
When I don't use st.beta_columns and display all the outputs in a single column, the images are automatically replaced by the new frames and outputs, but this does not work once st.beta_columns is involved.
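
For reference, the single-column version that does update in place looks roughly like this (an illustrative sketch, not my exact code; run_models is a stand-in for the two-network pipeline):

import cv2
import streamlit as st

frame_slot = st.empty()   # one persistent slot for the input frame
output_slot = st.empty()  # one persistent slot for the model output

while cv2.waitKey(1) < 0:
    hasFrame, frame = cap1.read()
    if not hasFrame:
        break
    frame_slot.image(frame, channels="BGR")               # replaces the previous frame
    output_slot.image(run_models(frame), channels="BGR")  # replaces the previous output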

Can you post your code so we can take a look at it?

I cannot share my entire code here, but the Streamlit-related part is shown below. The input is a video; I capture its frames and pass them to the network, and the network returns the frames with predicted objects, which I display using the columns.
The columns do not update when a new frame arrives and new predictions are made; instead, new rows are added below them.

# Excerpt: cap1, net_1, net_2, inpWidth, inpHeight, classes_1, classes_2,
# getOutputsNames and postprocess are defined elsewhere in the app.
import cv2
import matplotlib.pyplot as plt
import streamlit as st

while cv2.waitKey(1) < 0:
    hasFrame1, frame1 = cap1.read()
    if not hasFrame1:
        print("Done")
        cv2.waitKey(3000)
        break

    st.set_option('deprecation.showPyplotGlobalUse', False)

    # First row of columns: original frame and first-stage localization.
    FRAME_WINDOW1, FRAME_WINDOW2 = st.beta_columns(2)
    FRAME_WINDOW1.subheader("Original Image")
    st.text("")
    plt.figure(figsize=(15, 15))
    plt.imshow(frame1)
    FRAME_WINDOW1.pyplot()
    frame1 = cv2.cvtColor(frame1, 1)

    # Create a 4D blob from the frame and set it as the first network's input.
    blob = cv2.dnn.blobFromImage(frame1, 1 / 255, (inpWidth, inpHeight), [0, 0, 0], 1, crop=False)
    net_1.setInput(blob)

    # Run the forward pass to get the output of the output layers.
    outs_1 = net_1.forward(getOutputsNames(net_1))

    # Remove the bounding boxes with low confidence.
    cropped1 = postprocess(frame1, outs_1, classes_1)

    # Efficiency information: getPerfProfile returns the overall inference
    # time (t) and the per-layer timings (in layersTimes).
    t, _ = net_1.getPerfProfile()
    label = 'Inference time: %.2f ms' % (t * 1000.0 / cv2.getTickFrequency())

    # Second row of columns: cropped region and digit localization.
    FRAME_WINDOW3, FRAME_WINDOW4 = st.beta_columns(2)
    st.text("")
    FRAME_WINDOW2.subheader("Display Localization")
    st.text("")
    plt.figure(figsize=(15, 15))
    plt.imshow(frame1)
    FRAME_WINDOW2.pyplot()
    st.text("")

    FRAME_WINDOW3.subheader("Cropped Display")
    st.text("")
    plt.figure(figsize=(15, 15))
    plt.imshow(cropped1)
    FRAME_WINDOW3.pyplot()

    # Feed the cropped output of the first network into the second network.
    blob = cv2.dnn.blobFromImage(cropped1, 1 / 255, (inpWidth, inpHeight), [0, 0, 0], 1, crop=False)
    net_2.setInput(blob)

    # Run the forward pass to get the output of the output layers.
    outs_2 = net_2.forward(getOutputsNames(net_2))

    # Remove the bounding boxes with low confidence.
    cropped2 = postprocess(cropped1, outs_2, classes_2)

    # Efficiency information for the second network.
    t, _ = net_2.getPerfProfile()
    label = 'Inference time: %.2f ms' % (t * 1000.0 / cv2.getTickFrequency())

    st.text("")
    FRAME_WINDOW4.subheader("Digit Localization")
    st.text("")
    plt.figure(figsize=(15, 15))
    plt.imshow(cropped1)
    FRAME_WINDOW4.pyplot()
    st.write("Finished")
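
The repetition comes from calling st.beta_columns(2) inside the while loop: every iteration creates a fresh pair of rows. One way to restructure the loop is to create the columns and one st.empty() placeholder per column before the loop, and only draw into those placeholders inside it. A minimal sketch under that assumption (cap1, the nets, and postprocess as in the code above):

import cv2
import streamlit as st

# Create the two rows of columns once, outside the frame loop.
col1, col2 = st.beta_columns(2)
col3, col4 = st.beta_columns(2)

col1.subheader("Original Image")
col2.subheader("Display Localization")
col3.subheader("Cropped Display")
col4.subheader("Digit Localization")

# One persistent slot per column, re-used on every frame.
slot1, slot2 = col1.empty(), col2.empty()
slot3, slot4 = col3.empty(), col4.empty()

while cv2.waitKey(1) < 0:
    hasFrame1, frame1 = cap1.read()
    if not hasFrame1:
        break
    # ... run net_1 and net_2 exactly as above to get cropped1 ...
    slot1.image(frame1, channels="BGR")    # replaces the last frame instead of appending
    slot2.image(frame1, channels="BGR")    # frame1 now carries the drawn boxes
    slot3.image(cropped1, channels="BGR")
    slot4.image(cropped1, channels="BGR")

Drawing with slot.image() also avoids building a new Matplotlib figure on every frame, but slot.pyplot(fig) would update in place the same way if the figures are needed.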