Web microphone stream to LLM

nikmere · June 18, 2024, 8:09am

Hello,
I'm trying to achieve web microphone streaming.
I have a Streamlit app on a Linux server that doesn't have a mic, so I'm trying to use the web (browser) microphone, without success…
Is there any way to achieve my goal?

Here is the code:

import streamlit as st
from streamlit_webrtc import webrtc_streamer, WebRtcMode
import azure.cognitiveservices.speech as speechsdk
import numpy as np
import av
import threading


# speech_key and service_region must be defined before this point (not shown in the post)
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
audio_stream_format = speechsdk.audio.AudioStreamFormat(samples_per_second=16000, bits_per_sample=16, channels=1)
audio_input = speechsdk.audio.PushAudioInputStream(audio_stream_format)
audio_config = speechsdk.audio.AudioConfig(stream=audio_input)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# WebRTC Client Settings (updated)
rtc_configuration = {"iceServers": [{"urls": ["stun:stun.l.google.com:19302"]}]}
media_stream_constraints = {"audio": True, "video": False}

# Initialize session state
if 'recognized_text' not in st.session_state:
    st.session_state['recognized_text'] = ""

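# Browser audio usually arrives at a different rate/layout (often 48 kHz stereo),
# so resample to the 16 kHz mono int16 format declared for the Azure stream above.
resampler = av.AudioResampler(format="s16", layout="mono", rate=16000)

def audio_callback(frame: av.AudioFrame) -> av.AudioFrame:
    # Recent PyAV versions return a list of frames from resample()
    for resampled in resampler.resample(frame):
        audio_input.write(resampled.to_ndarray().tobytes())
    return frame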

def recognize_continuously():
    while True:
        result = recognizer.recognize_once()
        st.session_state['recognized_text'] = "Recognized: {}".format(result.text)

st.title("Streamlit Web Microphone to Azure Speech SDK")

webrtc_ctx = webrtc_streamer(
    key="speech-recognition",
    mode=WebRtcMode.SENDRECV,
    rtc_configuration=rtc_configuration,
    media_stream_constraints=media_stream_constraints,
    audio_frame_callback=audio_callback,  # pass the function directly; a processor factory must return an object with recv()
)

if st.button("Start Recognition"):
    recognition_thread = threading.Thread(target=recognize_continuously)
    recognition_thread.start()

# Display the recognized text from session state
st.write(st.session_state['recognized_text'])
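
A note on the threading part of the code above: recognize_continuously runs in a plain background thread, and Streamlit ties st.session_state to the script's own thread, so a value written from the worker may never reach the page. One common workaround is to hand results over through a queue.Queue and let the script read from it. Below is a minimal, standard-library-only sketch of that hand-off. The names recognition_worker and drain are hypothetical, and the "recognition" is faked (it only reports the chunk size), so this shows the pattern rather than a working Azure integration.

```python
import queue
import threading


def recognition_worker(chunks, results):
    """Stand-in for the recognizer thread: consume audio chunks, emit text.

    The fake "recognition" only reports the chunk size; a real worker
    would call the speech SDK here instead (assumption, not shown).
    """
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: no more audio
            break
        results.put(f"recognized {len(chunk)} bytes")


def drain(results):
    """Collect everything the worker has produced so far, without blocking."""
    items = []
    while True:
        try:
            items.append(results.get_nowait())
        except queue.Empty:
            return items


chunks = queue.Queue()
results = queue.Queue()
worker = threading.Thread(
    target=recognition_worker, args=(chunks, results), daemon=True
)
worker.start()

# Feed two dummy audio chunks, then signal the end of the stream.
for chunk in (b"\x00" * 320, b"\x00" * 640):
    chunks.put(chunk)
chunks.put(None)
worker.join()

recognized = drain(results)
print(recognized)
```

In a Streamlit app, the queues would also have to survive script reruns (for example by creating them once inside a function decorated with st.cache_resource), and drain would be called from the script body, which can then safely update st.session_state.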

dataprofessor · June 18, 2024, 8:10pm

Hi @nikmere

Perhaps you can look into a client-side implementation that captures audio from the browser's microphone input. Here's a thread on this using HTML5
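
Whichever way the audio is captured in the browser, its format may not match what the recognizer expects. The code in the original post declares a 16 kHz, 16-bit, mono stream, while browser audio is often delivered as 48 kHz stereo (an assumption about the typical case; check the actual frames). The sketch below illustrates the conversion using only the standard library. The function name stereo48k_to_mono16k is hypothetical, and the averaging is a crude stand-in for a real resampler.

```python
import array
import struct
import sys


def stereo48k_to_mono16k(pcm: bytes) -> bytes:
    """Convert interleaved 16-bit stereo PCM at 48 kHz to 16-bit mono at 16 kHz.

    Assumes little-endian input whose length is a multiple of 2 bytes.
    Crude method: average left/right, then average each group of 3 mono
    samples (48000 / 16000 = 3). A real resampler filters more carefully.
    """
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # input is assumed little-endian
    # Downmix: one mono sample per (left, right) pair.
    mono = [
        (samples[i] + samples[i + 1]) // 2
        for i in range(0, len(samples) - 1, 2)
    ]
    # Decimate by 3, averaging each group as a rough low-pass filter.
    out = array.array(
        "h", (sum(mono[i:i + 3]) // 3 for i in range(0, len(mono) - 2, 3))
    )
    if sys.byteorder == "big":
        out.byteswap()  # emit little-endian PCM
    return out.tobytes()


# Six stereo frames (left, right) in, two mono samples out.
frames = [100, 300, 200, 400, 300, 500, -100, -300, -200, -400, -300, -500]
converted = stereo48k_to_mono16k(struct.pack("<12h", *frames))
print(struct.unpack("<2h", converted))
```

With this layout, a 20 ms chunk of 48 kHz stereo audio (3840 bytes) becomes 640 bytes of 16 kHz mono audio. For production use, a proper resampler is likely the better choice.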

system · December 15, 2024, 8:11pm

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
