Multiprocessing issue in Streamlit

Hello,
I’m trying to use Streamlit on Windows to develop a private data science project locally; however, I run into issues when using multiprocessing.

As the diagram shows:

I have 3 users using the system concurrently, and each user may spawn up to 4 heavy processes (each process takes 2 minutes of execution time).

As the diagram shows, some of the processes do finish execution successfully while others do not.

Looking at the server logs, the following error is produced when it fails to launch a process X for user Y (which process and which user is really random):

Can't pickle <function do_heavy_work at 0x000002C0B7342040>: it's not the same object as __main__.do_heavy_work

Unfortunately, the project is private, so I’m unable to share the code. However, I was hoping for a direction for debugging this issue, if you will.

For multiprocessing, I use ProcessPoolExecutor from concurrent.futures.
I’m using Streamlit version 1.26.0 and Python 3.9.10.
Inside the do_heavy_work() function I use the session_state dictionary to store results for every user session, and I don’t use caching.

  • This issue doesn’t arise when a single user is running the system, no matter how many processes that user spawns.
  • This issue also doesn’t arise when there is a 20-second gap between each user starting work in the application.
  • The issue only arises when all users start their do_heavy_work jobs at the same time.

Any debugging direction would be appreciated, or if the whole concept isn’t supported in Streamlit, I would be happy to know.
Thanks.

Hi,

I did something similar, and the problem could be that with each call (rerun) in Streamlit, new variables are created. For me it works if I use no Streamlit in the spawned processes and just a shared dict, which I keep alive via the singleton caching.
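
To illustrate the first point: the error text comes from pickle, which pickles a function by its module and name and then checks that this name still resolves to the very same object. The sketch below reproduces the message without Streamlit. It rests on the assumption that every rerun rebuilds the script's module, so a function kept from an earlier run no longer matches the one registered under that name. `fake_app` and `run_script` are made-up names standing in for `__main__` and a rerun.

```python
import pickle
import sys
import types

# Stand-in for the app script. In the real case the module would be __main__,
# which is assumed here to be rebuilt on every rerun.
SCRIPT = "def do_heavy_work(x):\n    return x * 2\n"


def run_script() -> types.ModuleType:
    """Simulate one script run: a fresh module object replaces the old one."""
    module = types.ModuleType("fake_app")
    exec(SCRIPT, module.__dict__)
    sys.modules["fake_app"] = module
    return module


# User A's run defines the function and keeps a reference to it.
func_a = run_script().do_heavy_work
pickle.dumps(func_a)  # works: fake_app.do_heavy_work is still func_a

# User B's run replaces the module before user A's job is submitted.
run_script()

try:
    pickle.dumps(func_a)
    pickling_failed = False
except pickle.PicklingError as exc:
    pickling_failed = True
    print(exc)  # same kind of "not the same object" complaint as in the logs
```

If this is what happens in the app, it would also fit the timing: the failure needs one user's rerun to land between another user's function definition and the moment the job is pickled. Moving the worker function into a separate module that is imported, rather than defined in the script itself, should avoid the mismatch.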

Queues could also work for you.
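
A rough sketch of how these ideas could fit together, under assumptions: the worker contains no Streamlit code, it reports through a queue, and the parent moves the results into a shared dict. `ResultStore`, `drain` and `for_session` are names invented for this example, and the squaring is only a placeholder for the real work. To keep it runnable anywhere, the demonstration at the bottom uses an in-process `queue.Queue` instead of real worker processes.

```python
import queue
import threading

# Hypothetical layout: this would live in its own module (say heavy_work.py),
# with no Streamlit import, so the function is imported instead of redefined.


def do_heavy_work(session_id: str, payload: int, results) -> None:
    """Worker: plain arguments in, result reported through the queue."""
    value = payload * payload  # placeholder for the real 2-minute computation
    results.put((session_id, value))


class ResultStore:
    """Shared dict of results per session, guarded by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_session: dict = {}

    def drain(self, results) -> None:
        """Move everything currently waiting on the queue into the dict."""
        while True:
            try:
                session_id, value = results.get_nowait()
            except queue.Empty:
                return
            with self._lock:
                self._by_session.setdefault(session_id, []).append(value)

    def for_session(self, session_id: str) -> list:
        """Return a copy of the results collected for one session."""
        with self._lock:
            return list(self._by_session.get(session_id, []))


# In-process demonstration; queue.Queue stands in for a cross-process queue.
results = queue.Queue()
store = ResultStore()
do_heavy_work("user-a", 3, results)
do_heavy_work("user-b", 4, results)
do_heavy_work("user-a", 5, results)
store.drain(results)
print(store.for_session("user-a"))  # [9, 25]
```

In the app itself, the store would presumably be created once inside a function decorated with `st.cache_resource` (the current name for the singleton caching), and each rerun would call `drain` and then read only its own session's results. With ProcessPoolExecutor the queue would need to be something like `multiprocessing.Manager().Queue()`, since a plain `multiprocessing.Queue` cannot be passed as an argument to `submit`.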

Hope this helps! Kind regards