After deploying a Streamlit app with a speech-to-text implementation on a server, the app does not use the client's device for the microphone.
What are you using to access the user’s microphone? To access a user’s device (camera, microphone, files, etc), you need a Streamlit component that can pass that information through their browser to your Python backend.
There are custom components available for this, but fortunately, st.audio_input is coming out in the next version of Streamlit. You can try it out now with streamlit-nightly.
@mathcatsand, thanks for replying. I'm using speech_recognition with the "with sr.Microphone() as source" option to listen to the user's speech.
I recommend trying out st.audio_input. Unless a library for audio input is specifically written to work with Streamlit, deployed apps won’t work. When you run such an app locally, it can work because the Streamlit server and the client browser happen to be on the same machine. When an app is deployed, the Python server is remote from the user. The Python library will be trying to access the peripherals of the Streamlit server (somewhere “in the cloud”) and not those of the end user. A Streamlit-compatible library would have to access a user’s microphone through their browser and pass the input back to Streamlit through a custom component.
Okay, so to try st.audio_input, I just need to install streamlit-nightly and I will be able to access this method?
Also, should I use it as "with st.audio_input() as source:"?
I’m pretty new to this setup and really appreciate your help.
The docs for st.audio_input will be available when the next release comes out, but it works by returning an UploadedFile after someone has pushed the button to record. So you have:
import streamlit as st

sound_clip = st.audio_input("Record speech")
if sound_clip:
    # process the file-like sound clip
    pass
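As a minimal sketch of what "process the file-like sound clip" could look like, the snippet below reads basic properties from a WAV recording using only the standard library. It assumes the clip is WAV data in a seekable file-like object; `describe_wav` and `make_test_clip` are hypothetical helper names, and the synthetic clip merely stands in for a real recording.

```python
import io
import wave


def describe_wav(file_obj):
    """Return (channels, sample_rate_hz, duration_s) for a WAV file-like object."""
    file_obj.seek(0)  # read from the start, regardless of earlier reads
    with wave.open(file_obj, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    file_obj.seek(0)  # leave the buffer rewound for the next consumer
    return channels, rate, duration


def make_test_clip(seconds=1, rate=8000):
    """Build an in-memory mono 16-bit WAV of silence (stand-in for a recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    buf.seek(0)
    return buf
```

In an app, something like `describe_wav(sound_clip)` could go inside the `if sound_clip:` branch, assuming the recording really is WAV data.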
The library that @SiddhantSadangi mentioned is one of those custom components I mentioned earlier and is another option.
Hey @SiddhantSadangi, thanks for responding. I tried this library, but it is failing for me. Do you have an example or a link to an article that uses it with speech-recognition? That would be helpful.
Hi @mathcatsand, I tried this, but I’m getting an error about audio_data.
Can you share what code you tried and what error message (with stack trace) you got?
Sure, sharing the code sample:
def speech_to_text(st):
    audio_value = st.experimental_audio_input("Record a voice message")
    if audio_value:
        print("audio is coming")
        st.audio(audio_value)
    else:
        print("audio-data is not coming")
In the logs, the else branch is the one being printed.
Also, I checked the sample from the code reference: docs/python/api-examples-source/widget.audio_input.py at main · streamlit/docs · GitHub
It’s working for me in this app: Deepgram API Playground · Streamlit (deepgram-playground.streamlit.app)
I would expect the else text to print at least once (when the component loads, before the user interacts with it). What happens when you click the microphone icon, talk, then click the stop icon? Are you getting any messages in the browser about permission to access the microphone? Also what browser and device are you viewing the app on?
FYI, that example is live-hosted here: https://doc-audio-input.streamlit.app/
@mathcatsand, that issue is resolved now. I was invoking that function on a Streamlit button click, and that seems to have been the problem: if I invoke the function directly, st.audio works.
Can you help me pass this audio_value to SpeechRecognition? How should I pass it?
From their docs, it looks like you can pass the output of st.audio_input to sr.AudioFile.
Yes, I thought the same, but that is not working. I'm getting the error below:
2024-09-30 11:59:07.397 Uncaught app exception
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 241, in __enter__
self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/wave.py", line 649, in open
return Wave_read(f)
^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/wave.py", line 286, in __init__
self.initfp(f)
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/wave.py", line 253, in initfp
raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 246, in __enter__
self.audio_reader = aifc.open(self.filename_or_fileobject, "rb")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 954, in open
return Aifc_read(f)
^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 364, in __init__
self.initfp(f)
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 322, in initfp
raise Error('file does not start with FORM id')
aifc.Error: file does not start with FORM id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 272, in __enter__
self.audio_reader = aifc.open(aiff_file, "rb")
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 954, in open
return Aifc_read(f)
^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 364, in __init__
self.initfp(f)
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/aifc.py", line 320, in initfp
chunk = Chunk(file)
^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/chunk.py", line 67, in __init__
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
result = func()
^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 579, in code_to_exec
exec(code, module.__dict__)
File "/Users/admin/Downloads/projects/ai-prototype/text-sql/main-agent.py", line 222, in <module>
main()
File "/Users/admin/Downloads/projects/ai-prototype/text-sql/main-agent.py", line 116, in main
user_input = speech_to_text(st)
^^^^^^^^^^^^^^^^^^
File "/Users/admin/Downloads/projects/ai-prototype/text-sql/text_speech.py", line 26, in speech_to_text
with sr.AudioFile(audio_value) as source:
File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/speech_recognition/__init__.py", line 274, in __enter__
raise ValueError("Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format")
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format
I also tried wrapping it in BytesIO:
audio_file = BytesIO(audio_value)
That didn’t work either.
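One thing that may be worth checking here, offered as a hedged sketch rather than a confirmed diagnosis: the `wave` error in the traceback means the first bytes read from the object were not `RIFF`, which can happen if the data is not WAV or if the read position is no longer at the start of the buffer. Note also that `BytesIO(audio_value)` passes the object itself rather than its bytes. The helpers below, `fresh_buffer` and `sniff_format` (both hypothetical names), copy the bytes into a new buffer positioned at 0 and report which container the header indicates. They assume the recording exposes `getvalue()`, as `io.BytesIO` does.

```python
import io


def fresh_buffer(file_obj):
    """Copy all bytes into a new BytesIO positioned at offset 0."""
    # getvalue() returns the full contents regardless of the read position.
    return io.BytesIO(file_obj.getvalue())


def sniff_format(data):
    """Guess the audio container from the leading magic bytes."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"FORM" and data[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    if data[:4] == b"fLaC":
        return "flac"
    return "unknown"
```

If `sniff_format(audio_value.getvalue())` reports `wav`, the buffer returned by `fresh_buffer(audio_value)` could then be tried with `sr.AudioFile`. If it reports `unknown`, the data would likely need converting to a supported format first.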
Hmmm…
As long as you have the output logic-gated to not pass None, it should work. They even say:
…should be a file-like object such as io.BytesIO or similar.
Can you try:
import io

import streamlit as st

audio_file = st.experimental_audio_input("Say something")
if audio_file:
    st.write(isinstance(audio_file, io.BytesIO))
    # sr.AudioFile(audio_file)
If you’re correctly getting True after recording a sound bite, I’d suggest checking with the SpeechRecognition library directly to see if there’s some further clarification. By a plain reading, io.BytesIO is an acceptable file type.
I tried this:
st.write(isinstance(audio, BytesIO))
It returns True.