File uploading and reading using st.file_uploader

darylkdps · October 13, 2022, 5:06am

I am creating a Streamlit app that my non-technical colleagues can use to upload a local audio file, which I will then transcribe using OpenAI’s Whisper.

I am having problems using st.file_uploader. After loading the Whisper model, I’ve tried using:
model.transcribe(audio=uploaded_file),
model.transcribe(audio=uploaded_file.read()),
model.transcribe(audio=uploaded_file.getvalue()),
model.transcribe(audio=uploaded_file.getvalue().decode("utf-8"))
but I keep getting TypeError and UnicodeDecodeError errors.

I looked up the documentation on st.file_uploader but there is little information available. I can’t find much on the UploadedFile object either, so I don’t know if it has other methods. Or is it that Whisper doesn’t work with uploaded files?

Would appreciate some help if it is indeed possible to upload a file for Whisper to use.

Due to legal and ethical constraints, we cannot upload our audio files online. Can I confirm that st.file_uploader holds the file only in the RAM of the user’s computer?

blackary · October 13, 2022, 2:13pm

The issue is that model.transcribe is expecting either a file name string, or a numpy array, or a Tensor, and the UploadedFile is none of these. (see whisper/transcribe.py at main · openai/whisper · GitHub for more details)

The easiest way to solve this is to save the uploaded file to a temporary file with a known path.

from tempfile import NamedTemporaryFile

import streamlit as st
import whisper

audio = st.file_uploader("Upload an audio file", type=["mp3"])

if audio is not None:
    # The suffix is appended as-is, so include the leading dot
    with NamedTemporaryFile(suffix=".mp3") as temp:
        temp.write(audio.getvalue())
        temp.seek(0)
        model = whisper.load_model("base")
        result = model.transcribe(temp.name)
        st.write(result["text"])

This seems to work well. I can’t speak to the legal constraints, but if you are running this app on your local machine, then it won’t go anywhere else when you upload it. If you are running your app on a remote server, this method will certainly (at least temporarily) put the file on the server’s disk.

darylkdps · October 14, 2022, 5:22am

Thank you. I didn’t realise I could write to a temp file with Streamlit. I successfully wrote audio.getvalue() to a temp file and managed to play it back using st.audio(temp.read()).

I still got a “permission denied” error, but I’m pretty confident this is caused by ffmpeg rather than temp files or Whisper. I’ll mark your response as the solution, since my question is fundamentally about using Streamlit.
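One other possible cause worth ruling out, assuming the app runs on Windows: Python's tempfile documentation notes that a file created by NamedTemporaryFile cannot be opened a second time by name while it is still open on that platform. Another process such as ffmpeg may therefore fail to read it. A minimal sketch of a workaround is to write the bytes, close the file, and delete it manually afterwards. The helper name save_bytes_to_temp and the stand-in bytes are hypothetical, not part of Streamlit or Whisper.

```python
import os
from tempfile import NamedTemporaryFile


def save_bytes_to_temp(data: bytes, suffix: str = ".mp3") -> str:
    """Write data to a temporary file, close it, and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so another process can open it by name.
    with NamedTemporaryFile(suffix=suffix, delete=False) as temp:
        temp.write(data)
    return temp.name


# Stand-in bytes; in the app this would be audio.getvalue().
path = save_bytes_to_temp(b"example audio bytes")
try:
    # In the app, this is where model.transcribe(path) would go.
    with open(path, "rb") as f:
        contents = f.read()
finally:
    # Clean up manually, since delete=False disables automatic removal.
    os.remove(path)
```

The trade-off is that cleanup becomes the caller's job, which is why the removal sits in a finally block.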

Franco_Maciel · April 14, 2023, 8:10am

Is this solution deprecated? I’m trying the same code and getting a “file not found” error.
I’ve seen some incredible apps made with Streamlit and Whisper, but I can’t make it work.
Thanks!

darylkdps · April 14, 2023, 9:11am

Despite the message, the “file not found” error is in fact an ffmpeg problem, which is very misleading. I previously got the same “file not found” error when running locally because I had forgotten to install ffmpeg. Try adding “ffmpeg” to packages.txt. I’m not sure whether this method still works, though.
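A quick way to test this theory is to check whether an ffmpeg executable is on PATH before transcribing. The sketch below uses only the standard library. The function name ffmpeg_available is hypothetical, and the explanation assumes Whisper decodes audio by launching ffmpeg as a separate process, in which case a missing executable would surface as “file not found”.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    # In a Streamlit app, st.error(...) could be used here instead of print.
    print("ffmpeg was not found on PATH; install it before transcribing.")
```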

blackary · April 14, 2023, 3:24pm

That seems like a very good guess. The only Streamlit-related part of the core of this code is audio.getvalue(), which hasn’t changed (see st.file_uploader - Streamlit Docs). So unless you’re missing a dependency, or the Whisper API has changed, this should still work.

alonsosilvaallende · August 1, 2023, 6:50am

Thank you very much. I was having this issue and the error message “file not found” is in fact very misleading. This solved my problem. Thank you.

star_xplorer · August 9, 2023, 8:46am

Hi, this solution is somehow not working for me.

Below is my code:

import streamlit as st
from tempfile import NamedTemporaryFile
import openai

audio = st.file_uploader("Upload an audio file", type=["mp3"])

if audio is not None:
    with NamedTemporaryFile() as temp:
        temp.write(audio.getvalue())
        temp.seek(0)
        result = openai.Audio.transcribe("whisper-1", temp.name,verbose = True)
        st.write(result["text"])

I am getting the error below:

AttributeError: 'str' object has no attribute 'name'
Traceback:
File "C:\Users\ayrus\Desktop\streamlit\myenv\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "C:\Users\ayrus\Desktop\streamlit\myenv\pages\SpeechAI.py", line 22, in <module>
    result = openai.Audio.transcribe("whisper-1", temp.name,verbose = True)
File "C:\Users\ayrus\Desktop\streamlit\myenv\lib\site-packages\openai\api_resources\audio.py", line 57, in transcribe
    filename=file.name,

blackary · August 9, 2023, 1:40pm

In this case, the transcribe method is looking for a file object, not a filename (see OpenAI Platform).

When a file-like object is needed, you can often just use the variable that is returned by st.file_uploader. That works fine in this case:

import streamlit as st
import openai

audio = st.file_uploader("Upload an audio file", type=["mp3"])

if audio is not None:
    result = openai.Audio.transcribe("whisper-1", audio, verbose=True)
    st.write(result["text"])
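To illustrate the difference, here is a small sketch that needs neither Streamlit nor the OpenAI library. The traceback above shows the library reading file.name, which a plain string does not have. Streamlit documents UploadedFile as a subclass of BytesIO that carries a name, so the io.BytesIO object below is only a stand-in for it, and the bytes and filename are made up.

```python
import io

data = b"example audio bytes"  # stand-in for the uploaded file's contents

# A path string has no .name attribute, which is what triggers
# AttributeError: 'str' object has no attribute 'name'.
path = "clip.mp3"
string_has_name = hasattr(path, "name")

# A file-like object exposes both the bytes and a name.
file_like = io.BytesIO(data)
file_like.name = "clip.mp3"  # set by hand here; UploadedFile provides its own
file_has_name = hasattr(file_like, "name")

print(string_has_name, file_has_name)
```

The same trick of setting a name on a BytesIO can help when the bytes come from somewhere other than st.file_uploader.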

system · August 8, 2024, 1:40pm

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
