I have tried to deploy a web app (just a proof of concept) that takes an audio file and transcribes it. It works fine on localhost, but in the cloud it hits a runtime error when transcribing. I also rebuilt requirements.txt from scratch with pipreqs, but that didn't change anything. This is the error:
Error: Traceback (most recent call last):
  File "/home/adminuser/venv/bin/whisper", line 5, in <module>
    from whisper.transcribe import cli
  File "/home/adminuser/venv/lib/python3.10/site-packages/whisper/__init__.py", line 13, in <module>
    from .model import ModelDimensions, Whisper
  File "/home/adminuser/venv/lib/python3.10/site-packages/whisper/model.py", line 13, in <module>
    from .transcribe import transcribe as transcribe_function
  File "/home/adminuser/venv/lib/python3.10/site-packages/whisper/transcribe.py", line 20, in <module>
    from .timing import add_word_timestamps
  File "/home/adminuser/venv/lib/python3.10/site-packages/whisper/timing.py", line 7, in <module>
    import numba
  File "/home/adminuser/venv/lib/python3.10/site-packages/numba/__init__.py", line 55, in <module>
    _ensure_critical_deps()
  File "/home/adminuser/venv/lib/python3.10/site-packages/numba/__init__.py", line 42, in _ensure_critical_deps
    raise ImportError("Numba needs NumPy 1.24 or less")
ImportError: Numba needs NumPy 1.24 or less
I have tried pinning the NumPy and Numba versions mentioned above explicitly, but no luck.
Does anybody have a hint?
Thanks
By the way: the error does not show in the manage-app sidebar but directly in the st.write field where the transcript is supposed to be:
Thanks so much. I tried it but it didn’t help. This is what my requirements.txt says:
beautifulsoup4==4.11.2
docx==0.2.4
langchain==0.0.271
nltk==3.8.1
openai==0.27.6
pandas==2.0.2
pdfplumber==0.9.0
python_docx==0.8.11
rake_nltk==1.0.6
Requests==2.31.0
spacy==3.5.2
streamlit==1.25.0
numpy==1.24.0
numba==0.54.0
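One way to see what actually got installed in the cloud environment, as opposed to what requirements.txt asks for, is to print the versions from inside the app. The following is only a sketch: the helper names `installed_versions` and `exceeds_numpy_limit` are made up for illustration, and the 1.24 ceiling is taken from the error message above, not from Numba's documentation.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Return {package name: installed version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def exceeds_numpy_limit(numpy_version, limit=(1, 24)):
    """Return True if the major.minor part of numpy_version is above the limit.

    The default limit of 1.24 is an assumption read off the ImportError text.
    """
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) > limit


if __name__ == "__main__":
    versions = installed_versions(["numpy", "numba", "openai-whisper"])
    print(versions)
    if versions["numpy"] and exceeds_numpy_limit(versions["numpy"]):
        print("NumPy is newer than 1.24, which matches the ImportError in the traceback.")
```

Inside the Streamlit app, the dictionary could be shown with `st.write(installed_versions([...]))` instead of `print`, so the result is visible in the deployed page.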
Sure. It is not much so far… I'm just starting with the basics:
Many thanks in advance
# Define case 5
import os
import re
import subprocess

import streamlit as st


def whisper():
    with st.form('audio form'):
        uploaded_file = st.file_uploader("Upload your audio file here...")
        # Submit button
        submitted = st.form_submit_button("Transcribe audio")
    # Only continue once the form was submitted and a file was actually uploaded
    if submitted and uploaded_file is not None:
        # Save the uploaded file to disk
        audio_file_path = os.path.join(os.getcwd(), "temp_audio_file")
        with open(audio_file_path, 'wb') as f:
            f.write(uploaded_file.getvalue())
        try:
            # Start the process
            process = subprocess.Popen(["whisper", audio_file_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
            # Get both stdout and stderr outputs at once
            stdout_data, stderr_data = process.communicate()
            # Process stdout_data to remove timestamps and display on Streamlit
            for line in stdout_data.splitlines():
                clean_line = re.sub(r'\[\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}\.\d{3}\] ', '', line)
                st.write(clean_line)
            # Check for errors
            if process.returncode != 0:
                st.write(f"Error: {stderr_data}")
            else:
                st.write("success")
        finally:
            # Delete the temporary file even if something above fails
            os.remove(audio_file_path)
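The timestamp stripping in the function above can be tried out on its own, without running whisper at all. This is a minimal sketch, assuming the CLI prints lines of the form `[00:00.000 --> 00:04.000] text`, with an hour field added for longer recordings; the name `strip_timestamps` and the sample text are made up for illustration.

```python
import re

# Matches a leading "[MM:SS.mmm --> MM:SS.mmm]" marker, with an optional hour
# field on either side (assumed format, see the note above).
TIMESTAMP = re.compile(
    r"^\[(?:\d+:)?\d{2}:\d{2}\.\d{3} --> (?:\d+:)?\d{2}:\d{2}\.\d{3}\]\s*"
)


def strip_timestamps(stdout_text):
    """Remove the leading timestamp marker from every line and drop empty lines."""
    lines = []
    for line in stdout_text.splitlines():
        cleaned = TIMESTAMP.sub("", line).strip()
        if cleaned:
            lines.append(cleaned)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        "[00:00.000 --> 00:04.000]  Hello there.\n"
        "[01:00:04.000 --> 01:00:07.500] Second line.\n"
    )
    print(strip_timestamps(sample))
```

Compiling the pattern once and anchoring it at the start of the line avoids removing bracketed text that happens to appear in the middle of a transcript line.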
import openai
import streamlit as st

# API2, GPT_model and generate_text() are assumed to be defined elsewhere in the app


def whisper():
    transcript = ""

    # Define a function to preprocess and truncate the text
    def preprocess_and_truncate(text, max_length=7000):
        processed_text = text[:max_length]  # Truncate to the specified max_length
        return processed_text

    if 'transcript' not in st.session_state:
        st.session_state.transcript = ""
    if 'summary' not in st.session_state:
        st.session_state.summary = ""
    with st.form('audio form'):
        openai.api_key = API2
        uploaded_file = st.file_uploader("Upload your audio file here (wav, mp3, mp4, m4a, mpeg, mpga): ")
        #system_message = "Act as business consultant specialized in Know-your-customer analysis and topics around German export control."
        # Submit button
        submitted = st.form_submit_button("Transcribe audio")
    # Handled outside the form, because st.download_button cannot be used inside st.form
    if submitted:
        if uploaded_file:
            # Start the transcription process using the uploaded file
            transcription = openai.Audio.transcribe("whisper-1", uploaded_file)
            transcript = transcription['text']
            # Preprocess and truncate the transcript
            processed_transcript = preprocess_and_truncate(transcript, max_length=7000)  # Adjust the max_length as needed
            formatted_transcript = transcript.replace("\n", "<br>").replace(".", ".<br>").replace("?", "?<br>").replace("!", "!<br>")
            st.markdown("### Transcript:")
            st.write(formatted_transcript, unsafe_allow_html=True)
            #st.text_area("Transcript:", transcript, height=200)
            prompt = f"summarize this in English language in a concise way in up to 10 full sentences using bullet points. Here is the context: {processed_transcript}"
            summary = generate_text(prompt, "you are a helpful assistant", GPT_model, 0.5, 700)
            st.markdown("### Summary:")
            st.write(summary)
            # Combine the transcript and summary for download
            combined_text = f"Transcript:\n{processed_transcript}\n\nSummary:\n{summary}"
            # Add a single download button for both transcript and summary
            st.download_button('Download Transcript and Summary', combined_text, file_name='transcript_summary.txt')
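Since `preprocess_and_truncate` cuts at a fixed character count, the text handed to the summary prompt can end in the middle of a sentence. A possible variant, shown only as a sketch, backs up to the last sentence-ending punctuation mark inside the limit; the name `truncate_at_sentence` is hypothetical and not part of the app above.

```python
def truncate_at_sentence(text, max_length=7000):
    """Truncate text to at most max_length characters, ending on a full sentence if possible."""
    if len(text) <= max_length:
        return text
    cut = text[:max_length]
    # Index of the last ".", "?" or "!" inside the cut, or -1 if there is none
    last_end = max(cut.rfind("."), cut.rfind("?"), cut.rfind("!"))
    if last_end == -1:
        # No sentence boundary found, so fall back to the plain character cut
        return cut
    return cut[: last_end + 1]


if __name__ == "__main__":
    print(truncate_at_sentence("One. Two. Three.", max_length=12))
```

The fallback keeps the behaviour of the original helper for text without any punctuation, so swapping it in should not change results for short transcripts.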