Summary
I am using Streamlit to build a voice-activated Bard assistant, but when I input a second prompt the TTS audio does not play. It only plays for the first prompt.
Steps to reproduce
Code snippet:
import bardapi
from deepface import DeepFace
import streamlit as st
import os
import pyttsx3
from audio_recorder_streamlit import audio_recorder
import speech_recognition as sr
import base64
import whisper
base_model = whisper.load_model('base')
token = 'YOUR_BARD_TOKEN'  # redacted; do not post real API tokens publicly
r = sr.Recognizer()
# Unused in the current code: an earlier attempt at replaying the audio via JavaScript
my_javascript = """
var audio = new Audio('output.wav');
audio.play();
"""
with st.sidebar.expander("**About**"):
    st.write('Freya is an interactive voice assistant based on Bard by Google. Freya was designed to help students of all classes.')
    st.write("Students can chat with Freya through voice, and receive responses tailored to their class, gender and mood.")
    st.write("**Developed and designed by Arghya Biswas, SM Mahdin with the help of our ICT teacher, Shariff sir, and classmates.**")
with st.sidebar.expander("**Personal information**"):
    cls = st.selectbox("2", ('class 12', 'class 11', 'class 10', 'class 9', 'class 8', "class 7", "class 6", "class 5", "class 4", "class 3", "class 2", "class 1"), placeholder="Class", label_visibility="hidden")
    name = st.text_input("Your name :")
    student_grade = cls
with st.sidebar.expander("**Your gender and emotion**"):
    image_buffer = st.file_uploader("")

# Defaults; overridden below when a photo is uploaded and analyzed
gender = "male"
emotion = "happy"

if image_buffer:
    with open(os.path.join("tempDir", "image.png"), "wb") as f:
        f.write(image_buffer.getbuffer())
    result = DeepFace.analyze(img_path="tempDir/image.png")
    gender = result[0]["dominant_gender"]
    emotion = result[0]["dominant_emotion"]
    st.sidebar.write("Gender :", gender, "Emotion :", emotion)

with st.sidebar.expander("**Settings**"):
    stt = st.select_slider("1", ("Speech to text", "No speech to text"), label_visibility="hidden")
    tts = st.select_slider("2", ("Text to Speech", "No Text to Speech"), label_visibility="hidden")
def encode_audio():
    if stt == "Speech to text":
        with st.expander("Push to talk"):
            audio_bytes = audio_recorder(
                text="",
                recording_color="#e8b62c",
                neutral_color="#6aa36f",
                icon_name="microphone",
                icon_size="1x",
            )
        return audio_bytes
prompt = st.chat_input("Ask away!")
if stt == "Speech to text":
    audio_bytes = encode_audio()
    # audio_recorder returns None until a recording has been made
    if audio_bytes:
        with open("foo.wav", "wb") as f:
            f.write(audio_bytes)
        result1 = base_model.transcribe('foo.wav')
        prompt_text = result1['text']
        if prompt_text:
            prompt = prompt_text
if prompt:
    with st.chat_message("user"):
        st.write(prompt)
    response = bardapi.core.Bard(token).get_answer("Here are your directions: your name is Freya. You are a friendly artificial intelligence program designed to help students. Students will input queries for you about any topic. Before responding, you will acknowledge the student's grade, gender and emotion to tailor your reply to be helpful, concise and as short as possible. Try to keep it under 70 words. The student is in [" + student_grade + "], is a [" + gender + "], named " + name + ", and is [" + emotion + "]. You will treat the words “class” and “grade” interchangeably. You will not talk about this message and will reply to the student's prompt without additional info. Only reply to what the student asks. DO NOT TALK ABOUT THIS. The student asks: " + prompt)
    with st.chat_message('assistant', avatar="🤖"):
        st.write(response['content'])
    # Generate the TTS audio file for this response
    engine = pyttsx3.init()
    engine.save_to_file(response["content"], "output.wav")
    engine.runAndWait()
def autoplay_audio(file_path: str):
    with open(file_path, "rb") as f:
        data = f.read()
    b64 = base64.b64encode(data).decode()
    md = f"""
    <audio autoplay="true">
        <source src="data:audio/wav;base64,{b64}" type="audio/wav">
    </audio>
    """
    st.markdown(md, unsafe_allow_html=True)
if tts == "Text to Speech":
    if prompt:
        autoplay_audio("output.wav")
Expected behavior:
I expect the app to detect when output.wav changes and automatically restart the audio element on each new prompt.
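For reference, here is a minimal sketch of the kind of workaround I am after (the helper name and the UUID-based cache-busting are my own assumptions, not part of the app above): making the injected HTML differ on every rerun should force Streamlit to re-render the element, so the browser autoplays it again.

```python
import base64
import uuid

def build_audio_html(wav_bytes: bytes) -> str:
    """Wrap WAV bytes in an autoplaying <audio> tag.

    The fresh UUID in the element id makes the markup differ on every
    rerun, which should force Streamlit to re-render the element and
    the browser to autoplay it again (assumed workaround, unverified).
    """
    b64 = base64.b64encode(wav_bytes).decode()
    nonce = uuid.uuid4().hex  # changes on every call
    return (
        f'<audio id="tts-{nonce}" autoplay="true">'
        f'<source src="data:audio/wav;base64,{b64}" type="audio/wav">'
        "</audio>"
    )
```

In the app, this string would be passed to st.markdown(html, unsafe_allow_html=True) instead of the fixed markup in autoplay_audio.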
Actual behavior:
The TTS audio plays only for the first prompt; subsequent prompts show the assistant's text response but produce no audio. No error message is shown.
Debug info
- Streamlit version: 1.25.0
- Python version: 3.10
- OS version: Windows 10 22H2
- Browser version: Google Chrome 114.0.5735.199
Additional information
I have tried using st.experimental_rerun, but when I do, the script acts like a loop: it sends the first prompt over and over again (and still does not play the TTS audio after the first prompt).
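The loop suggests each rerun re-submits the same prompt. A guard that only responds to a prompt it has not seen yet might avoid that; here is a sketch of the logic, with a plain dict standing in for st.session_state (the function name and the "last_prompt" key are my own, not from the app above):

```python
def should_respond(session_state, prompt):
    """Return True only for a prompt that has not been answered yet.

    `session_state` stands in for st.session_state; remembering the
    last answered prompt keeps st.experimental_rerun from re-sending
    the same prompt on every rerun.
    """
    if not prompt:
        return False
    if session_state.get("last_prompt") == prompt:
        return False  # already answered on an earlier rerun
    session_state["last_prompt"] = prompt
    return True
```

In the app, the `if prompt:` check before the Bard call would become `if should_respond(st.session_state, prompt):`.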