Video transcription app

I have created a video-to-text transcription app on my local machine. The transcript should be downloaded to the system when we click the download button, but after uploading a video the entire transcript is not downloaded; some of the text is truncated. Can anybody help me with this? The code is below:

import streamlit as st
from pydub import AudioSegment,silence
import ffmpeg
import os
import speech_recognition as sr
AudioSegment.converter = “C:\ffmpeg\bin\ffmpeg.exe”
AudioSegment.ffmpeg = “C:\ffmpeg\bin\ffmpeg.exe”
AudioSegment.ffprobe =“C:\ffmpeg\bin\ffmpeg.exe”
import os
import tempfile
import shutil
import io
import tempfile
from flask import send_file
recognizer = sr.Recognizer()
final_result = “”
def video_to_transcript(video_file):
# Step 1: Convert video to audio
audio_file = “temp_audio.wav”
audio = AudioSegment.from_file(video_file, format=“mp4”)
audio.export(audio_file, format=“wav”)

# Step 2: Load the audio file using pydub
audio = AudioSegment.from_wav(audio_file)


# Step 3: Perform speech recognition using Google Speech Recognition
recognizer = sr.Recognizer()
with sr.AudioFile(audio_file) as source:
    audio_data = recognizer.record(source)

try:
    transcript = recognizer.recognize_google(audio_data, language="en-US")
except sr.UnknownValueError:
    transcript = "Speech Recognition could not understand the audio."
except sr.RequestError as e:
    transcript = f"Could not request results from Google Speech Recognition service; {e}"

# Remove the temporary audio file
os.remove(audio_file)

return transcript
#return final_result

def main():
st.title(“Video to Transcript Converter”)
st.write(“Upload a video file and convert it to a transcript.”)

video_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mkv"])

if video_file is not None:
    # Convert video to transcript
    transcript = video_to_transcript(video_file)
    print(transcript)

    # Display the transcript to the user
    
    with st.expander("View Transcript"):
        button = st.download_button( label="Download Transcript",
                                    data = transcript,
                                    file_name="transcript.txt",)
        if button:
                envir_var = os.environ
                user_loc = envir_var.get('USERPROFILE')
                loc = user_loc+"\Downloads\\transcript.txt"
                with open(loc,'w') as video_file:
                    video_file.write(transcript)

if name == “main”:
main()

Hello, welcome to the forum!

First thing, would you mind re-posting your code within a code block (```) so that it’s easier to read and test?

My best guess from what I can see is that there is an issue with the way you are trying to both use the download button and ALSO overwrite the downloaded file – what happens if you get rid of the last if button: block entirely? Simply using the download_button with the transcript passed as the data should be sufficient.
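To make that suggestion concrete, here is a minimal sketch of a download-only version. The helper names `transcript_payload` and `render_download` are made up for illustration, and encoding to UTF-8 bytes is an assumption on my part (it avoids depending on the platform's default text encoding), not something the original code does.

```python
def transcript_payload(transcript: str) -> bytes:
    """Encode the transcript explicitly as UTF-8 so the download does not
    depend on the platform's default text encoding."""
    return transcript.encode("utf-8")


def render_download(transcript: str) -> None:
    """Render only the download button; no manual file writing is needed."""
    # Imported here so the helper above can be used without Streamlit installed.
    import streamlit as st

    st.download_button(
        label="Download Transcript",
        data=transcript_payload(transcript),
        file_name="transcript.txt",
        mime="text/plain",
    )
```

Inside the app, `render_download(transcript)` would replace both the `download_button` call and the `if button:` block, so the browser handles saving the file.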

Yes, sure. I will post the code again in a code block. Even if I get rid of the if button: block, the code behaves the same way.

import streamlit as st
from pydub import AudioSegment,silence
import ffmpeg
import os
import speech_recognition as sr
AudioSegment.converter = "C:\\ffmpeg\\bin\\ffmpeg.exe"
AudioSegment.ffmpeg = "C:\\ffmpeg\\bin\\ffmpeg.exe"
AudioSegment.ffprobe ="C:\\ffmpeg\\bin\\ffmpeg.exe"
import os
import tempfile
import shutil
import io
import tempfile
recognizer = sr.Recognizer()
final_result = ""
def video_to_transcript(video_file):
    # Step 1: Convert video to audio
    audio_file = "temp_audio.wav"
    audio = AudioSegment.from_file(video_file, format="mp4")
    audio.export(audio_file, format="wav")

    # Step 2: Load the audio file using pydub
    audio = AudioSegment.from_wav(audio_file)
  

    # Step 3: Perform speech recognition using Google Speech Recognition
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio_data = recognizer.record(source)

    try:
        transcript = recognizer.recognize_google(audio_data, language="en-US")
    except sr.UnknownValueError:
        transcript = "Speech Recognition could not understand the audio."
    except sr.RequestError as e:
        transcript = f"Could not request results from Google Speech Recognition service; {e}"

    # Remove the temporary audio file
    os.remove(audio_file)

    return transcript
    #return final_result



def main():
    st.title("Video to Transcript Converter")
    st.write("Upload a video file and convert it to a transcript.")

    video_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mkv"])

    if video_file is not None:
        # Convert video to transcript
        transcript = video_to_transcript(video_file)
        print(transcript)

        # Display the transcript to the user
        
        with st.expander("View Transcript"):
            button = st.download_button( label="Download Transcript",
                                        data = transcript,
                                        file_name="transcript.txt",)
            if button:
                    envir_var = os.environ
                    user_loc = envir_var.get('USERPROFILE')
                    loc = user_loc+"\Downloads\\transcript.txt"
                    with open(loc,'w') as video_file:
                        video_file.write(transcript)
    
       

          
if __name__ == "__main__":
    main()


Please test this code with any English video available on YouTube and suggest solutions. The complete transcript of the video doesn’t get downloaded when we click the download button.

I simplified the code a bit, and it worked fine for me:

import streamlit as st
from pydub import AudioSegment
import speech_recognition as sr
import os

recognizer = sr.Recognizer()
final_result = ""


@st.cache_data
def video_to_transcript(video_file) -> str:
    # Step 1: Convert video to audio
    audio_file = "temp_audio.wav"
    audio = AudioSegment.from_file(video_file, format="mp4")
    audio.export(audio_file, format="wav")

    # Step 2: Load the audio file using pydub
    audio = AudioSegment.from_wav(audio_file)

    # Step 3: Perform speech recognition using Google Speech Recognition
    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio_data = recognizer.record(source)

    try:
        transcript = recognizer.recognize_google(audio_data, language="en-US")
    except sr.UnknownValueError:
        transcript = "Speech Recognition could not understand the audio."
    except sr.RequestError as e:
        transcript = (
            f"Could not request results from Google Speech Recognition service; {e}"
        )

    # Remove the temporary audio file
    os.remove(audio_file)

    return transcript
    # return final_result


def main():
    st.title("Video to Transcript Converter")
    st.write("Upload a video file and convert it to a transcript.")

    video_file = st.file_uploader("Upload a video file", type=["mp4", "avi", "mkv"])

    if video_file is not None:
        # Convert video to transcript
        transcript = video_to_transcript(video_file)

        # Display the transcript to the user
        with st.expander("View Transcript"):
            st.download_button(
                label="Download Transcript",
                data=transcript,
                file_name="transcript.txt",
            )
            st.write(transcript)


if __name__ == "__main__":
    main()

I did find that I had to use relatively short videos, or else I got an error from the Google Speech Recognition service. I tried the “Schrödinger's Cat” video on YouTube, downloaded the 360p version, and the transcript was visible and downloaded fine.
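If longer videos are needed, one common workaround (not something I tested in this thread) is to split the audio into shorter pieces and transcribe each piece separately. The sketch below only computes the chunk boundaries; `chunk_bounds` is a hypothetical helper, and the 30-second default is an arbitrary assumption rather than a documented service limit.

```python
from typing import Iterator, Tuple


def chunk_bounds(total_ms: int, chunk_ms: int = 30_000) -> Iterator[Tuple[int, int]]:
    """Yield (start_ms, end_ms) pairs that cover [0, total_ms) without gaps."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    for start in range(0, total_ms, chunk_ms):
        yield start, min(start + chunk_ms, total_ms)


# Usage with pydub (assumed installed): len(audio) is the duration in
# milliseconds, and audio[start:end] slices by milliseconds.
#   for start, end in chunk_bounds(len(audio)):
#       piece = audio[start:end]
#       piece.export("chunk.wav", format="wav")  # then transcribe this file
print(list(chunk_bounds(70_000)))
# → [(0, 30000), (30000, 60000), (60000, 70000)]
```

Fixed-length cuts can split a word in half, so the joined transcript may lose a word at each boundary; cutting on silence would be more careful but is beyond this sketch.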

Thank you for the prompt reply. The code is working, but I think the problem is happening only with my version. On my laptop the complete transcript is still not getting downloaded.

It generated only one line as the transcript. Even after upgrading Streamlit, the issue remains the same. Is this related to the laptop model or something?

I’m sorry, I don’t really have any idea. I would look in the log for any error messages, and then look around on the libraries you are using (pydub and speech_recognition especially) to see if there are any references to these issues.
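One thing that may make the logs more useful: the code above returns the error message as if it were the transcript, so a failed request can look like a one-line transcript. A rough sketch of keeping the two apart and logging the failure follows; `transcribe_safely` and the logger name are hypothetical, and `recognize` stands in for whatever call does the recognition (for example a wrapper around `recognizer.recognize_google`).

```python
import logging
from typing import Callable, Optional, Tuple

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("transcriber")  # hypothetical logger name


def transcribe_safely(
    recognize: Callable[[object], str], audio_data: object
) -> Tuple[Optional[str], Optional[str]]:
    """Return (transcript, error); exactly one of the two is None."""
    try:
        text = recognize(audio_data)
    except Exception as exc:  # in the app: sr.UnknownValueError, sr.RequestError
        # logger.exception records the full traceback in the log output.
        logger.exception("Transcription failed")
        return None, f"{type(exc).__name__}: {exc}"
    logger.info("Transcript length: %d characters", len(text))
    return text, None
```

The app could then show the error with `st.error` and only offer the download button when a real transcript came back.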

You could also try deploying your app on Streamlit Community Cloud (see the Streamlit docs) and see how it works there.

I would guess (though I am not sure) that you would need a packages.txt file with ffmpeg listed in it, so that it gets installed on Community Cloud.

Yes, thanks for the help. I tried deploying it on the web with a packages.txt file, but the same issue is happening. I will figure something out, since the code itself is working fine. There’s some problem in the libraries or the version.

Hi, I am running the same code on my local machine but getting this error

Are you including those hard-coded paths to ffmpeg?

AudioSegment.converter = "C:\\ffmpeg\\bin\\ffmpeg.exe"

If so, that would definitely not work when deployed. Have you tried removing them?
Do you have a link to your repo and the app?
What do the logs in the app say when it is deployed?

@sudipta_paul Are you using the hard-coded paths to ffmpeg.exe? That might be causing an issue.

Yes, the hard-coded ffmpeg.exe paths were causing the issue. Now the app is working fine. Thanks for the help, @blackary. @sudipta_paul, please try commenting out the ffmpeg.exe lines and running the code. It works fine.
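For anyone who still wants an explicit check instead of hard-coded paths, here is a small sketch that looks ffmpeg up on the PATH, which as far as I know is also what pydub does by default when no converter is set. `find_ffmpeg` is a hypothetical helper name. The second half shows why a Windows path written with single backslashes in a normal string literal would not be the path it appears to be.

```python
import shutil
from typing import Optional


def find_ffmpeg(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of an ffmpeg executable found on PATH, or None."""
    return shutil.which(name)


print(find_ffmpeg() or "ffmpeg not found on PATH")

# In an ordinary string literal, "\f" is a form feed and "\b" is a backspace,
# so this is NOT the path it looks like:
broken = "C:\ffmpeg\bin\ffmpeg.exe"
# A raw string (or doubled backslashes) keeps the backslashes intact:
raw = r"C:\ffmpeg\bin\ffmpeg.exe"
print(len(broken), len(raw))
```

If `find_ffmpeg()` returns None, installing ffmpeg and adding it to the PATH is usually simpler than pointing pydub at a fixed location, and it keeps the same code working when deployed.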
