Efficient Management of Subprocesses in Streamlit

Hello everyone!

I have a question that I hope I can explain clearly. My program has two main processes. Within each one, depending on the option the user chooses, one of two subprocesses can run. The problem is that when a subprocess is chosen, the entire main process starts again from the beginning before the desired subprocess runs.

From the user’s perspective: they upload an audio file and receive a transcription. They are then offered two options, paraphrasing or summarizing. When they choose either one, the transcription is repeated, even though it is already stored in a variable and should not need to run again.

As far as I can tell, the code itself is correct; I have checked it and found no mistakes there. I have been working with Streamlit for about a month, so I’m writing here in case I’m missing something about how Streamlit works, and in the hope that there is a way to run the subprocesses separately without repeating the steps that came before them.


                        if length_in_minutes > MAX_AUDIO_LENGTH_MINUTES:
                            if not st.session_state.get("transcription_done", False):
                                # Audio file segmentation
                                chunks_audio = split_audio(BytesIO(uploaded_file.getvalue()))
                                ........
                                if st.session_state["current_py"] >= cost_for_all_chunks:
                                    final_transcription = ""

                                    # Saving and Transcribing Each Segment
                                    for i, chunk in enumerate(chunks_audio):
                                        for attempt in range(max_retries):
                                            try:
                                                # Temporary Segment Saving
                                                temp_filename = f'{au.temp_dir.name}/audio_part_{i}.mp3'
                                                chunk.export(temp_filename, format="mp3")
                                                # Transcribing the Segment
                                                au.transcribe_audio_from_file(temp_filename)
                                                # Concatenating to the Final Transcription
                                                final_transcription += au.transcription

                                                # Break out of retry loop if successful
                                                break
                                            except Exception as e:
                                                error_message = str(e)
                                                if "currently overloaded" in error_message and attempt < max_retries - 1:
                                                    # Wait before trying again
                                                    time.sleep(retry_delay)
                                                else:
                                                    # After max retries, handle the error
                                                    st.error(f"A apărut o eroare: {e}")
                                                    break

                                    # Saving the Final Transcription in session_state
                                    st.session_state[unique_transcription_key] = final_transcription

                                    # Displaying the Final Transcription
                                    st.text_area("Transcriere", value=st.session_state[unique_transcription_key], height=200)

                                    download_link = create_download_link(st.session_state[unique_transcription_key],
                                                                         f"transcriere_{timestamp}_{random_string}.txt")
                                    st.markdown(download_link, unsafe_allow_html=True)

                                    transcription = st.session_state[unique_transcription_key]
                                    max_chunk_size = 2500  # Maximum number of characters the API can handle
                                    chunks_text = [transcription[i:i + max_chunk_size] for i in
                                              range(0, len(transcription), max_chunk_size)]
                                    .................

                                    st.warning('''
                                                ℹ️ Reformularea textului presupune reluarea intregii transcrieri, ceea ce implica utilizarea de resurse si, ca urmare, costurile pot fi similare sau chiar mai mari decat cele ale transcrierii initiale.

                                                ⌛️ Daca optati pentru reformulare sau rezumare, trebuie sa luati in considerare faptul ca acest proces poate dura mai mult, in functie de lungimea transcrierii. Asigurati-va ca aveti suficient timp pentru a finaliza acest proces. 
                                            ''')
                                    progress_bar = st.progress(0)
                                    # Initializing Variables to Indicate Continuing with Operations
                                    continue_rephrase = True
                                    # Checking if the 'Paraphrase' ('Reformulează') button is pressed
                                    if st.button('Reformulează', key="rephrase_button3"):
                                        .................

                                        if st.session_state["current_py"] >= cost_for_all_chunks_text:
                                            ....................
                                            final_response = ""
                                            for index, chunk in enumerate(chunks_text):
                                                for attempt in range(max_retries):
                                                    try:
                                                        # Make the request; the surrounding loop retries on failure
                                                        rephrased_chunk = au.rephrase_chatbot_response(chunk)
                                                        final_response += rephrased_chunk + " "
                                                        progress = (index + 1) / total_chunks
                                                        progress_bar.progress(progress)
                                                        # If successful, we break the retry loop
                                                        break
                                                    except Exception as e:
                                                        error_message = str(e)
                                                        if "currently overloaded" in error_message and attempt < max_retries - 1:
                                                            # Waiting before attempting again
                                                            time.sleep(retry_delay)
                                                        else:
                                                            # After maximum retries, handling the error
                                                            st.error(f"A apărut o eroare: {e}")
                                                            break
                                            st.write(f"***EthicalPy Bot (Reformulare):*** {final_response}")
                                        else:
                                            st.write("Afiseaza mesaj de eroare")
                                            st.error(
                                                f"Nu ai suficienti PY pentru a reformula. Ai nevoie de minimum {cost_for_all_chunks_text} PY.")
                                            continue_rephrase = False


                                    # Here is the logic for the 'Summarize' button

I apologize if some of the words in the code are unfamiliar; the platform is intended for users in Romania. If anything is unclear, please feel free to ask, and I can provide clarification. Thank you!

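I don’t quite understand your code, so I can’t tell you why it doesn’t work as you expect. Keep in mind that Streamlit reruns the whole script from top to bottom on every widget interaction, such as a button click. You can certainly store intermediate results in session_state and avoid having to compute them again on each rerun.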

I made up the UI based on your description and wrote dummy functions transcribe(), paraphrase() and summarize() that do nothing useful, but they take a while to return so that you can tell whether they are being executed or not.

import time

import streamlit as st


def transcribe(file):
    time.sleep(5)
    return f"Transcription of {file.name}"


def paraphrase(transcription):
    time.sleep(5)
    return f"Paraphrasis of \"{transcription}\""


def summarize(transcription):
    time.sleep(5)
    return f"Summary of \"{transcription}\""


def on_file_uploaded():
    st.session_state.transcription = None
    st.session_state.paraphrasis = None
    st.session_state.summary = None


st.file_uploader(
    label="Upload file",
    key="uploaded",
    on_change=on_file_uploaded,
)

if st.session_state.uploaded is None:
    st.stop()

if st.session_state.get("transcription") is None:
    with st.spinner("Transcribing..."):
        st.session_state.transcription = transcribe(file=st.session_state.uploaded)
st.text(st.session_state.transcription)

if st.button("Paraphrase"):
    with st.spinner("Paraphrasing..."):
        st.session_state.paraphrasis = paraphrase(st.session_state.transcription)
if st.session_state.get("paraphrasis") is not None:
    st.text(st.session_state.paraphrasis)

if st.button("Summarize"):
    with st.spinner("Summarizing..."):
        st.session_state.summary = summarize(st.session_state.transcription)
if st.session_state.get("summary") is not None:
    st.text(st.session_state.summary)
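
If you end up repeating that "compute only when nothing is stored yet" check for several steps, a small helper might tidy it up. This is only a sketch: get_or_compute and fake_transcribe are names I made up, not part of Streamlit, and a plain dict stands in for st.session_state so the snippet runs on its own. I am assuming session_state can be used like a dict here (.get and item assignment), as the code above already does.

```python
from typing import Any, Callable, MutableMapping


def get_or_compute(
    state: MutableMapping[str, Any],
    key: str,
    compute: Callable[[], Any],
) -> Any:
    """Return state[key], calling compute() only if nothing is stored yet."""
    if state.get(key) is None:
        state[key] = compute()
    return state[key]


# Stand-in for an expensive step; the counter shows how often it really runs.
calls = {"transcribe": 0}


def fake_transcribe() -> str:
    calls["transcribe"] += 1
    return "Transcription of demo.mp3"


state: dict = {}  # plain dict standing in for st.session_state

first = get_or_compute(state, "transcription", fake_transcribe)
# Simulated rerun (e.g. after a button click): the stored value is reused.
second = get_or_compute(state, "transcription", fake_transcribe)

print(first, "| transcribe ran", calls["transcribe"], "time(s)")
```

In the app above, the transcription step could then presumably be written as get_or_compute(st.session_state, "transcription", lambda: transcribe(st.session_state.uploaded)), though I have not run that exact line. Resetting the key to None in on_file_uploaded() would still force a fresh transcription for a new file.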