How to play an audio file automatically (generated using text-to-speech) in Streamlit?

I’m working on an application that will have an automatic text-to-speech feature. I have a piece of text that I’ve converted to speech using the Google text-to-speech library. I know that st.audio() can display a player for the resulting audio file, but I want the audio to start playing automatically as soon as it is generated, without the user having to click anything.

Any ideas on how to implement this?
Thanks!

Hi @ayanatherate,

You can do this with a method similar to the one you may have come across in this issue: base64-encode the generated file and embed it in an HTML audio tag that has the autoplay attribute set.

import base64

import streamlit as st


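def autoplay_audio(file_path: str):
    """Embed an audio file in the page and ask the browser to play it at once."""
    with open(file_path, "rb") as f:
        data = f.read()
    # Inline the audio as a base64 data URI so no separate file has to be served.
    b64 = base64.b64encode(data).decode()
    # Note: browsers may block autoplay until the user has interacted with the page.
    md = f"""
        <audio controls autoplay="true">
        <source src="data:audio/mp3;base64,{b64}" type="audio/mp3">
        </audio>
        """
    st.markdown(md, unsafe_allow_html=True)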


st.write("# Auto-playing Audio!")

autoplay_audio("local_audio.mp3")
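
If the generated files are not always MP3, one option is to separate building the tag from the Streamlit call and pick the MIME type from the file name. The following is only a minimal sketch of that idea: guess_audio_mime and build_autoplay_tag are hypothetical helper names invented for illustration, and falling back to audio/mpeg is an assumption, not something from the original answer.

```python
import base64
import mimetypes


def guess_audio_mime(file_name: str, default: str = "audio/mpeg") -> str:
    """Guess an audio MIME type from the file name (hypothetical helper)."""
    mime, _ = mimetypes.guess_type(file_name)
    # Fall back when the extension is unknown or is not an audio type.
    if mime is None or not mime.startswith("audio/"):
        return default
    return mime


def build_autoplay_tag(audio_bytes: bytes, mime: str) -> str:
    """Return an <audio> tag that embeds the bytes as a base64 data URI."""
    b64 = base64.b64encode(audio_bytes).decode("ascii")
    return (
        "<audio controls autoplay>"
        f'<source src="data:{mime};base64,{b64}" type="{mime}">'
        "</audio>"
    )
```

In a Streamlit script the returned string could then be passed to st.markdown(..., unsafe_allow_html=True), exactly as in autoplay_audio above. Keeping the tag builder free of Streamlit calls also makes it easy to check on its own.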

I’m a bit late but thanks a lot for the solution. It worked for me perfectly!

One problem I’m facing is that whenever the text to be converted to speech changes, the speech output doesn’t change automatically. I have to refresh the page somehow to play the audio file again. Is there a way to resolve that? That would be really helpful!

Thanks!

Hi @ayanatherate, could you share a code snippet that shows this issue? Are you using st.experimental_memo to cache the autoplay_audio function, perhaps?

I think what you probably need is a callback, which is called when something changes.

This is a callback demo I wrote using LangChain. When you get a streamed answer from ChatGPT, the callback is called once for each token. When the answer accumulates to one sentence, I send it to Azure’s text-to-speech service to generate speech and play it.
I haven’t tried Google text-to-speech, but I think you can set on_change on the st.text_input for the user input and run the conversion in that callback.
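
To make the "accumulate tokens until a sentence is complete" step more concrete, here is a rough sketch in plain Python. It is not the code from the demo: SentenceBuffer is a hypothetical class made up for illustration, and splitting on ".", "!" and "?" is a deliberately naive assumption (it would also split inside numbers such as "3.14").

```python
from typing import List

# Naive assumption: these characters always end a sentence.
SENTENCE_ENDINGS = (".", "!", "?")


class SentenceBuffer:
    """Collects streamed tokens and releases text one full sentence at a time."""

    def __init__(self) -> None:
        self._pending = ""

    def add_token(self, token: str) -> List[str]:
        """Append a token and return any sentences it completed."""
        self._pending += token
        sentences = []
        start = 0
        for i, ch in enumerate(self._pending):
            if ch in SENTENCE_ENDINGS:
                sentence = self._pending[start:i + 1].strip()
                if sentence:
                    sentences.append(sentence)
                start = i + 1
        # Keep whatever follows the last sentence ending for the next call.
        self._pending = self._pending[start:]
        return sentences

    def flush(self) -> str:
        """Return the unfinished remainder once the stream has ended."""
        rest = self._pending.strip()
        self._pending = ""
        return rest


buffer = SentenceBuffer()
spoken = []
for token in ["Hello", " there", ".", " How", " are", " you", "?", " Fine"]:
    for sentence in buffer.add_token(token):
        # In a real app this is where the text-to-speech call would go.
        spoken.append(sentence)
```

In a per-token callback you would call add_token for each token and send every returned sentence to the text-to-speech service, then call flush once the stream ends to pick up any trailing text.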

Another trick is to embed the base64-encoded audio in an HTML audio tag and render it with st.markdown. This may be what you need to autoplay the audio:

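import base64
import streamlit as st
# audio_stream is assumed to hold the raw WAV bytes returned by the text-to-speech call
audio_base64 = base64.b64encode(audio_stream).decode('utf-8')
audio_tag = f'<audio autoplay="true" src="data:audio/wav;base64,{audio_base64}"></audio>'
st.markdown(audio_tag, unsafe_allow_html=True)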
