Multiprocessing issue in Streamlit

Hello,
I'm trying to use Streamlit on Windows to develop a private data science project locally, but I'm running into issues when using multiprocessing.

As the diagram shows:

I have 3 users using the system concurrently, and each user may spawn up to 4 heavy processes (each process takes 2 minutes to execute).

As the diagram shows, some of the processes finish successfully while others do not.

Looking at the server logs, the following error is produced when it fails to launch a process X for user Y (it's really random):

Can't pickle <function do_heavy_work at 0x000002C0B7342040>: it's not the same object as __main__.do_heavy_work
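
The wording of this error suggests a possible direction: pickle stores a function by reference (module name plus function name) and refuses to pickle it if that name now points to a different object. The sketch below reproduces the same kind of error without Streamlit or multiprocessing, on the assumption that a script rerun triggered by another session redefines `do_heavy_work` while an older copy is still being submitted. The module name `fake_script_main` and the function body are made up for illustration.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script module that gets re-executed on rerun.
MODULE_NAME = "fake_script_main"
SCRIPT_SOURCE = "def do_heavy_work(x):\n    return x * 2\n"

script_module = types.ModuleType(MODULE_NAME)
sys.modules[MODULE_NAME] = script_module

# First "run" of the script: the function pickles fine, because looking up
# fake_script_main.do_heavy_work returns this very object.
exec(SCRIPT_SOURCE, script_module.__dict__)
first_run_function = script_module.do_heavy_work
pickle.dumps(first_run_function)

# Second "run" (for example, another session causing a rerun): the name
# do_heavy_work is rebound to a brand-new function object.
exec(SCRIPT_SOURCE, script_module.__dict__)

# Pickling the function object from the first run now fails, since the name
# no longer resolves to the same object.
rerun_error = None
try:
    pickle.dumps(first_run_function)
except pickle.PicklingError as exc:
    rerun_error = str(exc)

print(rerun_error)
```

If this is what happens in the app, defining `do_heavy_work` in a separate module that the script imports (rather than in the script itself) should keep the function object stable across reruns, though that is an assumption to verify against the real project.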

Unfortunately, the project is private, so I'm unable to share the code. However, I was hoping for a direction for debugging this issue.

For multiprocessing, I use ProcessPoolExecutor from concurrent.futures.
I'm using Streamlit version 1.26.0 and Python 3.9.10.
Inside the do_heavy_work() function I use the session_state dictionary to store results for every user session, and I don't use caching.

  • This issue does not arise when a single user is running the system, no matter how many processes are spawned.
  • This issue also does not arise when there is a 20-second gap between each user starting work in the application.
  • The issue only arises when all users start the do_heavy_work jobs at the same time.

Any debugging direction would be appreciated, or if the whole concept isn't supported in Streamlit, I would be happy to know.
Thanks.


Hi,

I did something similar, and the problem could be that new variables are created with each call to Streamlit. For me it works if I use no Streamlit in the spawned processes and just a shared dict, which I save via the singleton caching.

Queues could also work for you.
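
A minimal sketch of the first suggestion, assuming the worker can be written without any Streamlit imports: the worker takes plain arguments and returns plain picklable data, and only the parent process writes into the shared dict. The helper `run_jobs_for_session`, the session id "user-1", and the placeholder computation are hypothetical. In a real app the dict would be the one kept via the singleton caching, and these functions would presumably live in their own module imported by the Streamlit script.

```python
from concurrent.futures import ProcessPoolExecutor


def do_heavy_work(job_id):
    """Placeholder for the real job: plain inputs in, plain picklable data out."""
    # No Streamlit import and no session_state access inside the worker.
    return {"job_id": job_id, "result": job_id * job_id}


def run_jobs_for_session(executor, results_by_session, session_id, job_ids):
    """Submit jobs to the executor and record their results in the parent."""
    futures = [executor.submit(do_heavy_work, job_id) for job_id in job_ids]
    # The parent process, not the worker, writes into the shared dict.
    results_by_session[session_id] = [future.result() for future in futures]
    return results_by_session[session_id]


if __name__ == "__main__":
    # Stand-in for the shared dict; one entry per user session.
    results_by_session = {}
    with ProcessPoolExecutor(max_workers=2) as executor:
        run_jobs_for_session(executor, results_by_session, "user-1", [1, 2, 3])
    print(results_by_session)
```

Keeping the worker free of session_state means nothing tied to a particular script run has to cross the process boundary. Only the job arguments and the returned results are pickled.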

Hope this helps! Kind regards
