Multiprocessing leads to frequent app crashes

I am a Streamlit newbie. I am using multiprocessing in my app. Without it the app is very robust, but when multiprocessing is used it often reports errors. I started specifying the number of processes to allocate (e.g. Pool(processes=4)) – this helps a bit, but the app is still unstable – sometimes it finishes execution and sometimes it returns errors.

Are there any clear guidelines as to how many processes can be used, and how to properly allocate resources to avoid app crashes when multiprocessing is used?

Thank you!

Do you have some sample code that we can play around with? Something that reproduces your issue. Just minimal code.

I am running this function twice in a row. It launches functions which invoke responses from OpenAI. Sometimes everything is fine, but mostly the second run returns an empty list. I am well under the OpenAI token limit.

And my question is: what is the maximum number of workers that is guaranteed to work and not breach some kind of resource limit on Streamlit?

import concurrent.futures

def execute_with_futures(function, inputs):
    all_responses = {}  # Collect responses keyed by input index

    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:

        # Map each submitted future back to the index of its input
        futures = {
            executor.submit(function, list(entity)): i
            for i, entity in enumerate(inputs)
        }

        for future in concurrent.futures.as_completed(futures):
            index = futures[future]
            try:
                all_responses[index] = future.result()
            except Exception as e:
                all_responses[index] = f"Error in OpenAI API call: {e}"

    # Convert the dictionary to a list, ensuring the input order is preserved
    ordered_responses = [all_responses[i] for i in range(len(inputs))]

    return ordered_responses
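
For context, I call it roughly like this (call_openai here is a made-up stand-in for my actual OpenAI wrapper, and the prompts are dummies):

def call_openai(chunks):
    # placeholder for the real wrapper around the OpenAI API
    return " ".join(chunks)

prompts = [["summarize", "doc A"], ["summarize", "doc B"]]
first_run = execute_with_futures(call_openai, prompts)
second_run = execute_with_futures(call_openai, prompts)  # this one usually comes back empty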

You mentioned crashes. Is there any error message?

@ein_io Without seeing a minimal example of your issue, all I can say is that to get multiprocessing to work I had to, for one, ensure my script has a __name__ == '__main__' block. Try following the template here to see if that fixes it.
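
Roughly, the shape is something like this minimal sketch (fetch_data is just a placeholder worker):

import multiprocessing

import streamlit as st

def fetch_data(x):
    # placeholder worker; swap in your real per-task function
    return x * x

if __name__ == "__main__":
    # Everything that spawns processes lives under this guard, so child
    # processes can import this file without re-launching the pool.
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(fetch_data, range(4))
    st.write(results)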


Andrew – thank you so, so much – it seems to have helped indeed (although by design the __name__ == '__main__' block is meant to allow importing functions from the script without executing the would-be main code)!!!

No problem, glad it seems to be helping! Agreed on the general point of __name__ == '__main__', but FYI this seems to be a multiprocessing (or related) library restriction per this documentation (see “Safe importing of main module”).
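
For anyone else landing here, the restriction that section describes boils down to something like this (square is just an illustrative worker):

import multiprocessing

def square(x):
    return x * x

if __name__ == "__main__":
    # With the "spawn" start method, every child process re-imports this
    # module. The guard keeps that import side-effect free; without it,
    # the child would try to start its own Pool and raise a RuntimeError.
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(square, range(5)))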

Greetings,
I am having an issue a little similar to this one. The community help on other multiprocessing topics helped me understand a lot about sharing data across processes. Now I have another problem.
I am trying to use session_state to update a st.bar_chart,
but I keep getting this error: AttributeError: st.session_state has no attribute "data". Did you forget to initialize it? And the app will spin indefinitely until I shut it down.
I can attest that if I comment out the st.bar_chart line, everything under the __main__ block works just fine in seconds.
My ultimate intent is to use the multiprocessing step to frequently pull the data and pass it to the chart.
Not sure what I am missing here.

Any help is greatly appreciated.


import multiprocessing

import pandas as pd
import streamlit as st

# data_path (the path to my CSV) is defined earlier in my script

def open_file(file_path, out_name):
    # Runs in the child process: load the CSV into the shared namespace
    out_name.df = pd.read_csv(file_path)


if 'data' not in st.session_state:
    st.session_state['data'] = pd.DataFrame()  # empty data frame

st.write('Bar Chart')
st.bar_chart(st.session_state.data, x='x_label', y='y_label')


if __name__ == "__main__":

    manager = multiprocessing.Manager()
    out_data = manager.Namespace()

    p1 = multiprocessing.Process(target=open_file, args=(data_path, out_data))
    p1.start()
    p1.join()
    st.session_state.data = out_data.df
    st.write('Multiprocessing New Data:', out_data.df)
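
For clarity, the overall flow I am aiming for is roughly this (just a sketch of the intent, not working code; refresh_data is a made-up placeholder for the multiprocessing step):

import time

import pandas as pd
import streamlit as st

def refresh_data() -> pd.DataFrame:
    # placeholder: in reality this would be the multiprocessing step
    # that pulls fresh data (e.g. reading the CSV in a child process)
    return pd.DataFrame({'x_label': ['a', 'b'], 'y_label': [1, 2]})

if 'data' not in st.session_state:
    st.session_state['data'] = pd.DataFrame()

st.session_state.data = refresh_data()
st.bar_chart(st.session_state.data, x='x_label', y='y_label')

time.sleep(5)  # wait, then rerun the script to pull data again
st.rerun()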

  
    
