Streamlit not able to access client mic while implementing STT

After deploying a Streamlit app with a Speech-to-Text implementation on a server, the app does not use the client’s device for the microphone.

What are you using to access the user’s microphone? To access a user’s device (camera, microphone, files, etc), you need a Streamlit component that can pass that information through their browser to your Python backend.

There are custom components available for this, but fortunately, st.audio_input is coming out in the next version of Streamlit. You can try it out now with streamlit-nightly.

@mathcatsand thanks for replying. I’m using speech_recognition
with "with sr.Microphone() as source" to listen to the user’s speech.

I recommend trying out st.audio_input. Unless a library for audio input is specifically written to work with Streamlit, deployed apps won’t work. When you run such an app locally, it can work because the Streamlit server and the client browser happen to be on the same machine. When an app is deployed, the Python server is remote from the user. The Python library will be trying to access the peripherals of the Streamlit server (somewhere “in the cloud”) and not the end user. A Streamlit-compatible library would have to access a user’s microphone through their browser and pass the input back to Streamlit through a custom component.

Okay, so to try st.audio_input I just need to install streamlit-nightly, and then I’ll be able to access this method?

Also, should I use it as: with st.audio_input() as source: ?

I’m pretty new to this setup and really appreciate your help.

Try streamlit-audiorec · PyPI

The docs for st.audio_input will be available when the next release comes out, but it works by returning UploadedFile after someone has pushed the button to record. So you have:

import streamlit as st

sound_clip = st.audio_input("Record speech")
if sound_clip:
    # process the file-like sound clip, for example by playing it back
    st.audio(sound_clip)

The library that @SiddhantSadangi mentioned is one of those custom components I mentioned earlier and is another option.

Hey @SiddhantSadangi, thanks for responding. I tried this library, but it is failing for me. Do you have an example or article link that uses speech-recognition? That would be helpful.

Hi @mathcatsand, I tried this, but I’m getting an error with audio_data.

Can you share what code you tried and what error message (with stack trace) you got?

Sure, sharing the code sample:

def speech_to_text(st):
    audio_value = st.experimental_audio_input("Record a voice message")
    if audio_value:
        print("audio is coming")
        st.audio(audio_value)
    else:
        print("audio-data is not coming")

In the logs, the else branch is being printed.

Also, I checked the sample from the code reference: docs/python/api-examples-source/widget.audio_input.py at main · streamlit/docs · GitHub

It’s working for me in this app: Deepgram API Playground · Streamlit (deepgram-playground.streamlit.app)

I would expect the else text to print at least once (when the component loads, before the user interacts with it). What happens when you click the microphone icon, talk, then click the stop icon? Are you getting any messages in the browser about permission to access the microphone? Also what browser and device are you viewing the app on?

FYI, that example is live-hosted here: https://doc-audio-input.streamlit.app/

@mathcatsand that issue is resolved now. I was invoking that function on a Streamlit button click, and that seems to have been the issue; if I invoke the function directly, st.audio works.

Can you help me with passing this audio_value to speech_recognition? How should I pass it?
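
The behavior described above is consistent with how st.button works: it returns True only for the script run triggered by the click, so a widget created inside an "if st.button(...)" block disappears on the next rerun (for example, the rerun triggered when a recording finishes). A minimal sketch of one possible workaround, assuming a flag stored in st.session_state. The key name show_recorder and the function name recorder_section are hypothetical, and st is passed in as an argument, as in the code shared earlier in this thread:

```python
def recorder_section(st):
    """Show the recorder after the button is clicked, and keep it shown on reruns."""
    # Hypothetical flag name. session_state persists across reruns;
    # a button's return value does not.
    if "show_recorder" not in st.session_state:
        st.session_state["show_recorder"] = False

    if st.button("Start voice input"):
        st.session_state["show_recorder"] = True

    if not st.session_state["show_recorder"]:
        return None

    # experimental_audio_input is the widget name used in this thread's code.
    # It returns None until a recording has been completed.
    return st.experimental_audio_input("Record a voice message")
```

Calling recorder_section(st) at the top level of the script on every run, rather than from inside a button branch, keeps the recorder on the page while the user records.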

From their docs, it looks like you can pass the output of st.audio_input to sr.AudioFile.

Yes, I thought the same, but that is not working. I’m getting the error below:

2024-09-30 11:59:07.397 Uncaught app exception
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 241, in __enter__
    self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/wave.py", line 649, in open
    return Wave_read(f)
           ^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/wave.py", line 286, in __init__
    self.initfp(f)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/wave.py", line 253, in initfp
    raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 246, in __enter__
    self.audio_reader = aifc.open(self.filename_or_fileobject, "rb")
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 954, in open
    return Aifc_read(f)
           ^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 364, in __init__
    self.initfp(f)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 322, in initfp
    raise Error('file does not start with FORM id')
aifc.Error: file does not start with FORM id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 272, in __enter__
    self.audio_reader = aifc.open(aiff_file, "rb")
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 954, in open
    return Aifc_read(f)
           ^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 364, in __init__
    self.initfp(f)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 320, in initfp
    chunk = Chunk(file)
            ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/chunk.py", line 67, in __init__
    raise EOFError
EOFError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
             ^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 579, in code_to_exec
    exec(code, module.__dict__)
  File "/Users/admin/Downloads/projects/ai-prototype/text-sql/main-agent.py", line 222, in <module>
    main()
  File "/Users/admin/Downloads/projects/ai-prototype/text-sql/main-agent.py", line 116, in main
    user_input = speech_to_text(st)
                 ^^^^^^^^^^^^^^^^^^
  File "/Users/admin/Downloads/projects/ai-prototype/text-sql/text_speech.py", line 26, in speech_to_text
    with sr.AudioFile(audio_value) as source:
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 274, in __enter__
    raise ValueError("Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format")
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format

I also tried converting it to bytes:

audio_file = BytesIO(audio_value)

This also didn’t work.
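
A note on the BytesIO(audio_value) attempt: io.BytesIO expects raw bytes, not another file-like object, so wrapping the widget's return value directly raises a TypeError. Because the UploadedFile returned by the widget is file-like, a fresh copy positioned at the start can be built from its getvalue() output instead. A minimal sketch, where fresh_copy is a hypothetical helper name and a plain io.BytesIO stands in for the recorded clip:

```python
import io


def fresh_copy(uploaded):
    """Return a new BytesIO holding all of `uploaded`'s bytes, positioned at the start."""
    # getvalue() returns the whole buffer regardless of the current read position.
    return io.BytesIO(uploaded.getvalue())


# Stand-in for a recorded clip that has already been read to the end.
clip = io.BytesIO(b"RIFF....WAVE")
clip.read()

copy = fresh_copy(clip)
print(copy.read(4))  # b'RIFF'
```

This only rules out a read-position or wrapping problem; it does not change the audio format of the bytes.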

Hmmm… :thinking:

As long as you have the output logic-gated to not pass None, it should work. They even say:

…should be a file-like object such as io.BytesIO or similar.

Can you try:

import streamlit as st
import io

audio_file = st.experimental_audio_input("Say something")
if audio_file:
    st.write(isinstance(audio_file, io.BytesIO))
    # sr.AudioFile(audio_file) 

If you’re correctly getting True after recording a sound bite, I’d suggest checking with the SpeechRecognition library directly to see if there’s some further clarification. By a plain reading, io.BytesIO is an acceptable file type. :sweat_smile:

I tried this:

st.write(isinstance(audio, BytesIO))

It comes back as True.
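
A True result here only confirms the object's type. It says nothing about the audio format of the bytes inside, and the first error in the traceback ("file does not start with RIFF id") is about the content. One way to narrow this down is to look at the first bytes of the recording before handing it to sr.AudioFile. The sketch below is a diagnostic under stated assumptions, not a confirmed fix: sniff_audio_format is a hypothetical helper, and it recognizes only the three container signatures named in the error message.

```python
import io


def sniff_audio_format(fileobj):
    """Guess the container from the first 12 bytes, then restore the read position."""
    position = fileobj.tell()
    fileobj.seek(0)
    header = fileobj.read(12)
    fileobj.seek(position)

    if len(header) < 12:
        return "empty or truncated"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    if header[:4] == b"fLaC":
        return "flac"
    return "unknown"
```

In the app, st.write(sniff_audio_format(audio_file)) would show the result. If it reports "wav" and sr.AudioFile still fails, the file may be a non-PCM WAV, or the read position may not be at the start, in which case calling audio_file.seek(0) first is worth trying. If it reports "unknown", the recording is probably in a format that would need converting before SpeechRecognition can read it.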