Does Streamlit run on the main thread?

I am integrating Scrapy with Streamlit. Scrapy uses signals, which only work on the main thread. Running st.text(threading.current_thread().name) in my Streamlit code shows that the script runs on ScriptRunner.scriptThread, so signals do not work on this thread. The error I am getting is:
ValueError: signal only works in main thread.

How I am calling Scrapy:
I defined a function in amazon.py, the file where my class is defined. The function is as follows:

import sys
import threading

from scrapy import cmdline

def run_prog(baseUrl, s_date, e_date, min_l, max_l):
    # Forget any previously imported Twisted reactor module so it is re-imported on the next run.
    if "twisted.internet.reactor" in sys.modules:
        del sys.modules["twisted.internet.reactor"]

    print(threading.current_thread().name)
    cmdline.execute(f'scrapy crawl amazonreviews -a parameters={{"baseUrl":"{baseUrl}","start_date":"{s_date}","end_date":"{e_date}","min_l":"{min_l}","max_l":"{max_l}"}} -O amzn.json'.split())

I am using cmdline.execute to run Scrapy. I import this function into my Streamlit code and call it like this:

config = json.loads(st.session_state['json_obj'])
s_date = config['start_datetime']
e_date = config['end_datetime']
min_l = config['min_comment_len']
max_l = config['max_comment_len']
links = config['url_input']
for link in links:
    run_prog(link, s_date, e_date, min_l, max_l)

The issue is that when I run amazon.py standalone as a Python script, it works, but when I call the function from Streamlit, it fails with the signal error, because signals only work on the main thread and Streamlit runs my script on a different thread. I looked on Stack Overflow, where some answers say that using CrawlerRunner solves it, but I think this is more an issue of signals not working on other threads.

Is there any way to run Streamlit on the main thread? Or is there any way to make the current thread the main thread?

Specs: Python-3.7.9, Streamlit-1.10.0, Scrapy-2.6.1

In your Streamlit code:

import os
os.system('python run.py')  # run.py is the file that starts the crawler project

And the content of run.py:

# -*- coding: utf-8 -*-
from scrapy import cmdline
cmdline.execute('scrapy crawl test'.split())  # test is the crawler name

Hi @hemil.mehta, welcome to Streamlit!

The Streamlit web server runs on the main thread, but whenever your Streamlit app is run, it’s always done on a different thread. There’s no way around this - it’s central to Streamlit’s execution model.
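To make the cause of the error concrete, here is a minimal sketch that reproduces it with neither Streamlit nor Scrapy involved. The helper name try_install_handler and the thread name "worker" are made up for illustration. The worker thread stands in for the thread Streamlit runs your script on, and the exact wording of the ValueError message varies between Python versions.

```python
import signal
import threading


def try_install_handler(results):
    """Try to register a SIGINT handler and record what happened."""
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        results.append("ok")
    except ValueError as exc:
        # CPython only allows signal.signal() on the main thread.
        results.append(f"ValueError: {exc}")


main_results, worker_results = [], []

# Called directly: this runs on the main thread when executed as a plain script.
try_install_handler(main_results)

# Called from another thread, which is the situation a Streamlit script is in.
worker = threading.Thread(
    target=try_install_handler, args=(worker_results,), name="worker"
)
worker.start()
worker.join()

print("direct call:", main_results)
print("worker thread:", worker_results)
```

Any library that registers signal handlers at startup will hit the same ValueError when called from the script thread.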

But! I think @hello521 has the right idea: if you have Python code that must run on the main thread, you can execute it in a separate process. Then you’ll need to do some work to gather the results of your scrapy process in your Streamlit app.
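As a rough sketch of what "gathering the results" could look like: start the crawler with subprocess.run, wait for it to finish, then read the JSON file it wrote. The function name run_crawl_in_subprocess is made up for illustration, and the sketch assumes the crawler writes its items to a JSON file (for example amzn.json, via the -O option used in the question).

```python
import json
import subprocess
import sys
from pathlib import Path


def run_crawl_in_subprocess(cmd, output_path, timeout=None):
    """Run cmd in a child process and return the JSON it wrote to output_path.

    The child process has its own main thread, so code that installs signal
    handlers can do so there even though the caller is not on the main thread.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if completed.returncode != 0:
        raise RuntimeError(
            f"crawler exited with code {completed.returncode}:\n{completed.stderr}"
        )
    return json.loads(Path(output_path).read_text(encoding="utf-8"))


# Hypothetical usage inside the Streamlit script, assuming run.py starts the
# crawl and writes amzn.json:
#
#     items = run_crawl_in_subprocess([sys.executable, "run.py"], "amzn.json")
#     st.write(items)
```

Passing the command as a list avoids shell quoting problems, and sys.executable runs the child with the same Python interpreter as the Streamlit app.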
