File uploading and reading using st.file_uploader

I am creating a Streamlit app that my non-technical colleagues can use to upload an audio file locally, which I will subsequently transcribe using OpenAI's Whisper.

I am having problems using st.file_uploader. After loading the Whisper model, I've tried using:
model.transcribe(audio=uploaded_file),
model.transcribe(audio=uploaded_file.read()),
model.transcribe(audio=uploaded_file.getvalue()),
model.transcribe(audio=uploaded_file.getvalue().decode("utf-8"))
but I keep getting TypeError and UnicodeDecodeError errors.

I looked up the documentation on st.file_uploader but there is little information available. I can't find much on the UploadedFile object either, so I don't know whether it has other methods. Or is it because Whisper doesn't work if a file is uploaded?

Would appreciate some help if it is indeed possible to upload a file for Whisper to use.

Due to legal and ethics constraints, we cannot upload our audio files online. Can I confirm that st.file_uploader only holds the file in the RAM of the user's computer?

The issue is that model.transcribe is expecting either a file name string, or a numpy array, or a Tensor, and the UploadedFile is none of these. (See whisper/transcribe.py at main · openai/whisper · GitHub for more details.)

The easiest way to solve this is to save the uploaded file to a temporary file with a known path.

from tempfile import NamedTemporaryFile

import streamlit as st
import whisper

audio = st.file_uploader("Upload an audio file", type=["mp3"])

if audio is not None:
    with NamedTemporaryFile(suffix=".mp3") as temp:
        temp.write(audio.getvalue())
        temp.seek(0)
        model = whisper.load_model("base")
        result = model.transcribe(temp.name)
        st.write(result["text"])

This seems to work well. I can't speak to the legal constraints, but if you are running this app on your local machine, then the file won't go anywhere else when you upload it. If you are running your app on a remote server, this method will certainly (at least temporarily) put the file on the server's disk.
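One caveat with the temp-file approach: Python's documentation notes that on Windows a NamedTemporaryFile generally cannot be opened a second time by name while it is still open, which could surface as a "permission denied" error when another program tries to read it. Below is a minimal sketch of a variant that closes the file before handing its path on. The helper names save_upload_to_temp_path and remove_temp_file are hypothetical, not part of Streamlit or Whisper.

```python
import os
from tempfile import NamedTemporaryFile


def save_upload_to_temp_path(data: bytes, suffix: str = ".mp3") -> str:
    """Write raw bytes to a temporary file, close it, and return its path.

    Hypothetical helper for illustration only.
    """
    # delete=False keeps the file on disk after the handle is closed, so a
    # separate process can open it by name (including on Windows).
    with NamedTemporaryFile(suffix=suffix, delete=False) as temp:
        temp.write(data)
    return temp.name


def remove_temp_file(path: str) -> None:
    """Delete the temporary file once it is no longer needed."""
    if os.path.exists(path):
        os.remove(path)
```

In the app above, this would be called as save_upload_to_temp_path(audio.getvalue()), the returned path passed to model.transcribe, and remove_temp_file called afterwards (ideally in a try/finally) so the audio does not linger on disk.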


Thank you. I didn't realise I could write to a temp file with Streamlit. I successfully wrote audio.getvalue() to a temp file and managed to play it back using st.audio(temp.read()).

I still got a "permission denied" error nonetheless, but I'm pretty confident this is caused by ffmpeg rather than temp files or Whisper. I'll mark your response as the solution, as my question is fundamentally about using Streamlit.


Is this solution deprecated? I'm trying the same code and getting a "file not found" error.
I've seen some incredible apps made with Streamlit and Whisper but can't make it work.
Thanks!

Despite the message, the "file not found" error is in fact an ffmpeg problem. It is very misleading. I got the same "file not found" error previously when running locally, because I had forgotten to install ffmpeg. Try adding "ffmpeg" to packages.txt. Not sure if this method still works though.
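A quick way to rule this out is to check whether an ffmpeg executable is visible to the Python process before calling Whisper. This is only a sketch using the standard library; the function name ffmpeg_on_path is made up for illustration.

```python
import shutil


def ffmpeg_on_path() -> bool:
    """Return True if an executable named ffmpeg can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_on_path():
    # Without ffmpeg, transcribing from a file path is likely to fail with
    # a misleading "file not found" style error.
    print("ffmpeg was not found on PATH; install it before transcribing.")
```

In a Streamlit app the print call could be swapped for st.error followed by st.stop, so the user sees the real cause instead of the misleading message.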


That seems like a very good guess. The only Streamlit-related thing at the core of this code is audio.getvalue(), which hasn't changed (see st.file_uploader - Streamlit Docs), so unless you're either missing some dependency, or the Whisper API has changed, this should still work.


Thank you very much. I was having this issue and the error message ā€œfile not foundā€ is in fact very misleading. This solved my problem. Thank you.

Hi, this solution is somehow not working for me.

Below is my code:

import streamlit as st
from tempfile import NamedTemporaryFile
import openai

audio = st.file_uploader("Upload an audio file", type=["mp3"])

if audio is not None:
    with NamedTemporaryFile() as temp:
        temp.write(audio.getvalue())
        temp.seek(0)
        result = openai.Audio.transcribe("whisper-1", temp.name,verbose = True)
        st.write(result["text"])

I'm getting the error below:

AttributeError: 'str' object has no attribute 'name'
Traceback:
File "C:\Users\ayrus\Desktop\streamlit\myenv\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "C:\Users\ayrus\Desktop\streamlit\myenv\pages\SpeechAI.py", line 22, in <module>
    result = openai.Audio.transcribe("whisper-1", temp.name,verbose = True)
File "C:\Users\ayrus\Desktop\streamlit\myenv\lib\site-packages\openai\api_resources\audio.py", line 57, in transcribe
    filename=file.name,

In this case, the transcribe method is looking for a file object, not a filename (see OpenAI Platform).

When a file-like object is needed, you can often just use the variable that is returned by st.file_uploader. That works fine in this case:

import streamlit as st
import openai

audio = st.file_uploader("Upload an audio file", type=["mp3"])

if audio is not None:
    result = openai.Audio.transcribe("whisper-1", audio, verbose=True)
    st.write(result["text"])
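
The traceback above shows the library reading file.name, which is why a plain string fails and the uploaded file object works. If you only have raw bytes at hand (for example after processing the audio), a minimal sketch of building a file-like object with a name looks like this. The helper as_named_file is hypothetical, and whether a given API accepts such an object is an assumption you would need to check.

```python
import io


def as_named_file(data: bytes, filename: str) -> io.BytesIO:
    """Wrap raw bytes in an in-memory file object that has a .name attribute.

    Hypothetical helper for illustration only.
    """
    buffer = io.BytesIO(data)
    # Libraries that inspect file.name (e.g. to infer the audio format from
    # the extension) can read the attribute from the buffer.
    buffer.name = filename
    return buffer
```

For example, as_named_file(audio.getvalue(), audio.name) would produce an in-memory copy that carries the original file name.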