Issue Processing Audio Recordings with streamlit_mic_recorder and librosa

Josue_Becerra · December 19, 2024, 5:08pm

Hello community!

I’m facing an issue when using the streamlit_mic_recorder function together with librosa to transcribe audio recordings in a Streamlit-based project. The basic flow is as follows:

I record audio from a microphone using streamlit_mic_recorder.
After the recording is completed, I try to process the audio with librosa to convert it into an audio array and then transcribe it using a speech recognition model (Whisper).

Here is the basic code for recording and processing:

import torch
from transformers import pipeline
import librosa
from io import BytesIO
import numpy as np
from pydub import AudioSegment
from SystemResources.GestorIADashBoard.ModuloBimochat.Recursos.utils import load_config
config = load_config()

def convert_bytes_to_array(audio_bytes):
    audio_bytes = BytesIO(audio_bytes)
    audio, sample_rate = librosa.load(audio_bytes)
    print(sample_rate)
    return audio

def transcribe_audio(audio_bytes):
    device = "cpu"
    pipe = pipeline(
        task="automatic-speech-recognition",
        model=config["whisper_model"],
        chunk_length_s=30,
        device=device,
    )   

    audio_array = convert_bytes_to_array(audio_bytes)

    print(f"Audio array size: {audio_array.shape}")
    print(f"Model vocabulary size: {pipe.model.config.vocab_size}")
    print(f"Suppress tokens before sanitization: {pipe.model.config.suppress_tokens}")

    pipe.model.config.suppress_tokens = [
        token for token in pipe.model.config.suppress_tokens if token < pipe.model.config.vocab_size
    ]
    print(f"Suppress tokens after sanitization: {pipe.model.config.suppress_tokens}")

    prediction = pipe(audio_array, batch_size=1)["text"]
    print(prediction)

    return prediction

The error I get is as follows:

Error Analysis:

The error comes from librosa, specifically from the librosa.load function. The message says the audio file could not be opened because its format was not recognized. librosa.load can read from a file-like object such as BytesIO, so the wrapper itself is probably not the problem. More likely, the bytes returned by the recorder are in a container that librosa's backend cannot decode from memory. Browser-based recorders often produce WebM rather than WAV, although I have not confirmed which format streamlit_mic_recorder returns in my setup.
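To confirm what the recorder actually returned, one option is to look at the first few bytes of the recording before handing it to librosa. The following is a minimal sketch, not part of my project: the function name sniff_audio_format is hypothetical, and the signature checks cover only a few common containers (for example, MP3 data without an ID3 tag will come back as "unknown").

```python
def sniff_audio_format(audio_bytes):
    """Guess the container of a recording from its leading magic bytes.

    Hypothetical helper for debugging; deliberately not exhaustive.
    """
    head = bytes(audio_bytes[:12])
    # WAV: a RIFF container whose form type is WAVE.
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    # WebM and Matroska both start with the EBML header ID.
    if head[:4] == b"\x1a\x45\xdf\xa3":
        return "webm/matroska"
    if head[:4] == b"OggS":
        return "ogg"
    if head[:4] == b"fLaC":
        return "flac"
    # MP3 files that begin with an ID3v2 tag.
    if head[:3] == b"ID3":
        return "mp3"
    return "unknown"
```

Calling print(sniff_audio_format(audio_bytes)) at the top of transcribe_audio would show whether the bytes are already WAV or need converting first.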

Possible Solutions:

Check the audio format:
The audio format recorded by streamlit_mic_recorder may not be one that librosa can read directly. You could try converting the recorded audio to a supported format such as WAV before passing it to librosa. Here is an example of how to do this using pydub:

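from io import BytesIO
from pydub import AudioSegment

def convert_bytes_to_wav(audio_bytes):
    # pydub relies on ffmpeg (or libav) to decode non-WAV input.
    audio = AudioSegment.from_file(BytesIO(audio_bytes))
    # Re-encode into an in-memory WAV file.
    wav_io = BytesIO()
    audio.export(wav_io, format="wav")
    # Rewind so the next reader starts at the beginning.
    wav_io.seek(0)
    return wav_io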

Then, you can modify the convert_bytes_to_array function to use this conversion:

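def convert_bytes_to_array(audio_bytes):
    # Convert the recording to WAV first so librosa's backend can decode it.
    audio_wav = convert_bytes_to_wav(audio_bytes)
    # Whisper expects 16 kHz input; librosa.load resamples to 22,050 Hz unless sr is set.
    audio, sample_rate = librosa.load(audio_wav, sr=16000)
    print(sample_rate)
    return audio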
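One more thing to watch after the conversion: when the pipeline receives a bare array, it assumes the samples are already at the model's sampling rate (16 kHz for Whisper), while librosa.load resamples to 22,050 Hz unless sr is given. An alternative to fixing the rate at load time is to pass the rate along with the samples in the dict form that the transformers speech recognition pipeline accepts. This is a minimal sketch; to_pipeline_input is a hypothetical helper, and convert_bytes_to_array would need to return sample_rate as well for it to be usable. As far as I know, resampling inside the pipeline also requires torchaudio to be installed.

```python
def to_pipeline_input(audio_array, sample_rate):
    """Bundle samples with their sampling rate for the ASR pipeline.

    Hypothetical helper: returns the {"raw": ..., "sampling_rate": ...}
    dict form so the pipeline can resample instead of guessing the rate.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be a positive number of Hz")
    return {"raw": audio_array, "sampling_rate": int(sample_rate)}
```

The call would then look like pipe(to_pipeline_input(audio_array, sample_rate), batch_size=1)["text"].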

This error occurs due to a format incompatibility when loading the audio file into librosa. By converting the audio to a valid format such as WAV or PCM, it should be possible to load and process the file correctly. If the issue persists, check the dependencies and the state of the audio data before passing it to librosa.
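For the dependency check, a quick way to see what is available in the environment is sketched below. It assumes the relevant pieces are the ffmpeg executable (which pydub uses to decode non-WAV input) and the pydub, librosa, and soundfile packages; check_audio_dependencies is a hypothetical name, and the list may need adjusting for your setup.

```python
import importlib.util
import shutil

def check_audio_dependencies():
    """Report which audio tools and packages can be found.

    Hypothetical helper: True means the item was located, not that
    it works correctly.
    """
    report = {"ffmpeg": shutil.which("ffmpeg") is not None}
    for package in ("pydub", "librosa", "soundfile"):
        # find_spec returns None when a top-level package is not installed.
        report[package] = importlib.util.find_spec(package) is not None
    return report
```

Printing the result once at startup makes it easy to spot a missing ffmpeg, which is a common reason for pydub failing on a deployed app.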

I hope this solution is helpful! Has anyone else encountered similar issues when using streamlit_mic_recorder and librosa?
