I am creating a Streamlit app that my non-technical colleagues can use to upload an audio file locally, which I will subsequently transcribe using OpenAI's Whisper.
I am having problems using st.file_uploader. After loading the Whisper model, I've tried using:
model.transcribe(audio=uploaded_file),
model.transcribe(audio=uploaded_file.read()),
model.transcribe(audio=uploaded_file.getvalue()),
model.transcribe(audio=uploaded_file.getvalue().decode("utf-8"))
but I keep getting TypeError and UnicodeDecodeError errors.
I looked up the documentation on st.file_uploader but there is little information available. I can't find much on the UploadedFile object either, so I don't know whether there are other methods. Or is it because Whisper doesn't work with an uploaded file?
Would appreciate some help if it is indeed possible to upload a file for Whisper to use.
Due to legal and ethics constraints, we cannot upload our audio files online. Can I confirm that st.file_uploader holds the file only in the RAM of the user's computer?
The issue is that model.transcribe is expecting either a file name string, or a numpy array, or a Tensor, and the UploadedFile is none of these. (See whisper/transcribe.py at main · openai/whisper · GitHub for more details.)
The easiest way to solve this is to save the uploaded file to a temporary file with a known path.
from tempfile import NamedTemporaryFile

import streamlit as st
import whisper

audio = st.file_uploader("Upload an audio file", type=["mp3"])
if audio is not None:
    # Write the upload to a temporary file so Whisper gets a real path.
    with NamedTemporaryFile(suffix=".mp3") as temp:
        temp.write(audio.getvalue())
        temp.seek(0)
        model = whisper.load_model("base")
        result = model.transcribe(temp.name)
        st.write(result["text"])
This seems to work well. I can't speak to the legal constraints, but if you are running this app on your local machine, then it won't go anywhere else when you upload it. If you are running your app on a remote server, this method will certainly (at least temporarily) put the file on the server's disk.
Thank you. I didn't realise I can write to a temp file with Streamlit. I successfully wrote audio.getvalue() to a temp file and managed to display it using st.audio(temp.read()).
I still got a "permission denied" error nonetheless, but I'm pretty confident this is caused by ffmpeg rather than temp files or Whisper. I'll mark your response as the solution, as my question is fundamentally about using Streamlit.
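A possible contributing factor worth ruling out, assuming the app runs on Windows: Python's documentation notes that a file created by NamedTemporaryFile cannot be opened a second time by name while it is still open on Windows, and Whisper hands the path to ffmpeg, which has to open it. A minimal sketch of a workaround is to close the temp file before passing its path on, then delete it afterwards. The helper name run_with_temp_path is hypothetical and not part of Streamlit or Whisper.

```python
import os
from tempfile import NamedTemporaryFile


def run_with_temp_path(data: bytes, suffix: str, use_path):
    """Write data to a closed temp file, call use_path(path), then clean up."""
    # delete=False keeps the file on disk after it is closed, so another
    # program (for example ffmpeg) can open it by name.
    with NamedTemporaryFile(suffix=suffix, delete=False) as temp:
        temp.write(data)
        path = temp.name
    try:
        return use_path(path)
    finally:
        # Remove the file ourselves, since delete=False disabled auto-cleanup.
        os.remove(path)


# Hypothetical usage in the app, with audio and model as in the answer above:
# result = run_with_temp_path(audio.getvalue(), ".mp3", model.transcribe)
```

The file still touches the disk briefly, so the same privacy caveat about remote servers applies.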
Is this solution deprecated? I'm trying the same code and getting a "file not found" error.
I've seen some incredible apps made with Streamlit and Whisper but can't make it work.
Thanks!
Despite the message, the "file not found" error is in fact an ffmpeg problem. It is very misleading. I got the same "file not found" error previously when running locally because I forgot to install ffmpeg. Try adding "ffmpeg" to packages.txt. I'm not sure if this method still works though.
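A quick way to check for this before calling Whisper is to look for the ffmpeg executable on PATH. This is a minimal sketch using only the standard library; the function name ffmpeg_available is hypothetical.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    # In a Streamlit app this message could be shown with st.error instead.
    print("ffmpeg was not found on PATH; Whisper needs it to decode audio files.")
```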
That seems like a very good guess. The only thing Streamlit-related about the core of this code is audio.getvalue(), which hasn't changed (see st.file_uploader - Streamlit Docs), so unless you're either missing some dependency, or the Whisper API has changed, this should still work.
Thank you very much. I was having this issue and the error message "file not found" is in fact very misleading. This solved my problem. Thank you.
Hi, this solution is somehow not working for me.
Below is my code:
import streamlit as st
from tempfile import NamedTemporaryFile
import openai

audio = st.file_uploader("Upload an audio file", type=["mp3"])
if audio is not None:
    with NamedTemporaryFile() as temp:
        temp.write(audio.getvalue())
        temp.seek(0)
        result = openai.Audio.transcribe("whisper-1", temp.name,verbose = True)
        st.write(result["text"])
I'm getting the error below:
AttributeError: 'str' object has no attribute 'name'
Traceback:
File "C:\Users\ayrus\Desktop\streamlit\myenv\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.__dict__)
File "C:\Users\ayrus\Desktop\streamlit\myenv\pages\SpeechAI.py", line 22, in <module>
result = openai.Audio.transcribe("whisper-1", temp.name,verbose = True)
File "C:\Users\ayrus\Desktop\streamlit\myenv\lib\site-packages\openai\api_resources\audio.py", line 57, in transcribe
filename=file.name,
In this case, the transcribe method is looking for a file object, not a filename (see OpenAI Platform).
When a file-like object is needed, you can often just use the variable that is returned by st.file_uploader. That works fine in this case:
import streamlit as st
import openai

audio = st.file_uploader("Upload an audio file", type=["mp3"])
if audio is not None:
    # UploadedFile is file-like, so it can be passed directly.
    result = openai.Audio.transcribe("whisper-1", audio, verbose=True)
    st.write(result["text"])
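The traceback above shows the client reading file.name, which is why a plain string path fails there: a str has no name attribute. As a rough illustration of what "file-like" means in this case, the sketch below wraps raw bytes in an in-memory buffer and gives it a name. The helper as_named_file is hypothetical, and whether any particular client version accepts such a buffer is an assumption to verify.

```python
import io


def as_named_file(data: bytes, name: str) -> io.BytesIO:
    """Wrap raw bytes in an in-memory file object that carries a .name."""
    buffer = io.BytesIO(data)
    # Some clients read .name to infer the upload's filename and format.
    buffer.name = name
    return buffer


# A string path has no .name, which matches the AttributeError in the traceback.
print(hasattr("clip.mp3", "name"))
print(as_named_file(b"abc", "clip.mp3").name)
```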