Facing ValueError: signal only works in main thread

Summary

Hey,
I am getting the error in the title. I know there are already threads about this problem, but to the best of my knowledge it has never been resolved.

I am supposed to update code that used to work but no longer does. The program crawls the webpage at a given URL, does some computations and displays the crawled text using streamlit. The app starts perfectly fine and I can enter the URL in the respective field, but when I start the computation, I get this error message:

Error Message:

File "/home/folder/check_env/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "/home/folder/app.py", line 130, in <module>
    article, title = extract_text_from_url(URL)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/folder/app.py", line 77, in extract_text_from_url
    tra_article = trafilatura.extract(downloaded)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/folder/check_env/lib/python3.11/site-packages/trafilatura/core.py", line 1054, in extract
    signal(SIGALRM, timeout_handler)
File "/home/anaconda3/lib/python3.11/signal.py", line 56, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The peculiar thing is that the error occurs when executing the line article, title = extract_text_from_url(URL), which works fine when I run it without streamlit.
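
The difference between the two runs can be reproduced without trafilatura or streamlit. Below is a minimal sketch using only the standard library, under the assumption that streamlit executes app.py in a thread other than the main one. The names try_register and outcome are illustrative and not part of either library. It registers a signal handler from a worker thread, which Python rejects with the same kind of ValueError:

```python
import signal
import threading


def try_register(outcome):
    """Attempt to register a SIGINT handler and record the result."""
    try:
        # signal.signal() is only permitted in the main thread.
        signal.signal(signal.SIGINT, signal.default_int_handler)
        outcome.append("handler registered")
    except ValueError as exc:
        outcome.append(f"ValueError: {exc}")


outcome = []
# Call from a worker thread, standing in for the thread that runs the script.
worker = threading.Thread(target=try_register, args=(outcome,))
worker.start()
worker.join()
print(outcome[0])
```

Calling try_register directly from a plain python script would run it in the main thread, which is probably why the same line works without streamlit.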

This is the code to the function:

def extract_text_from_url(URL):
    resp = requests.get(URL)
    if resp.ok:
        article = ""
        downloaded = trafilatura.fetch_url(URL)
        title = trafilatura.extract_metadata(downloaded).title
        tra_article = trafilatura.extract(downloaded)
        para_list = tra_article.split('\n')
        for para in para_list:
            if para.count('.') > 1:
                article += para
                article += '\n'
        #parser = PlaintextParser.from_string(doc, Tokenizer(LANGUAGE))
        #parser = TextBlob(text)
        return article, title
    else:
        st.sidebar.error('Invalid URL!! Please enter a valid URL.')

Debug info

  • Streamlit version: 1.25.0
  • Python version: 3.11.3
  • Using Conda
  • executing my “app.py” with the command streamlit run app.py
  • executing the code via the PyCharm IDE on an Ubuntu machine; I also tried starting it from the terminal with the command mentioned above, and it leads to the same error.

I am still new to this, so if you need any further information, please let me know :) Thanks for the help!

Hey @pauls33,

Can you share a runnable code snippet so that we can try to reproduce the error you’re seeing?

Hey @Caroline,
Thanks for your reply :) This is the snippet of code I am trying to execute:

import streamlit as st
import validators
import requests
import trafilatura

st.set_page_config(page_title="Demo")
URL = st.sidebar.text_input('put in url', "")
URL = URL.strip()
    
button_clicked = st.sidebar.button("🔍")

isValid = bool(validators.url(URL))

def extract_text_from_url(URL):
    resp = requests.get(URL)
    if resp.ok:
        article = ""
        downloaded = trafilatura.fetch_url(URL)
        title = trafilatura.extract_metadata(downloaded).title
        tra_article = trafilatura.extract(downloaded)
        return tra_article, title
    
    else:
        st.sidebar.error('This is not an URL. Put in valid URL.')
    
# MAIN-PAGE display
if button_clicked:
    if isValid:

        # TEXT EXTRACTION                
        article, title = extract_text_from_url(URL)      
            
    else:
        st.sidebar.error('This is not an URL. Put in valid URL.')

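Hey, I figured that there might be a conflict between trafilatura and streamlit: according to the traceback, trafilatura.extract registers a SIGALRM handler, which Python only allows in the main thread, and that call fails when the script is run by streamlit. This might be cheating, but I finally fixed it by switching the library that extracts the text for a given URL from trafilatura to newspaper3k. This is my executable code: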

import streamlit as st
import validators
import requests
from newspaper import Article  # replaces trafilatura

st.set_page_config(page_title="Demo")
URL = st.sidebar.text_input('put in url', "")
URL = URL.strip()
    
button_clicked = st.sidebar.button("🔍")

isValid = bool(validators.url(URL))

def fetch_url(url):
    article = Article(url)
    article.download()
    article.parse()
    return article

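def extract_text_from_url(URL):
    # Check that the page is reachable before handing the URL to newspaper.
    resp = requests.get(URL)
    if resp.ok:
        fetched_article = fetch_url(URL)
        title = fetched_article.title
        article = fetched_article.text
        return article, title
    
    else:
        st.sidebar.error('This is not an URL. Put in valid URL.')
        # Return a pair so the caller's tuple unpacking does not fail.
        return None, None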
    
# MAIN-PAGE display
if button_clicked:
    if isValid:

        # TEXT EXTRACTION                
        article, title = extract_text_from_url(URL)      
            
    else:
        st.sidebar.error('This is not an URL. Put in valid URL.')
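
If switching libraries is not an option, another route might be to run the signal-using call in a separate process, where it executes in that process's own main thread. The following is only a sketch of that idea with the standard library: run_in_child_process and CHILD_CODE are hypothetical names, and the child code registers a handler itself instead of calling trafilatura, so treat it as an untested starting point for the real app.

```python
import subprocess
import sys
import threading

# Code for the child interpreter: it runs in that process's main thread,
# so registering a signal handler is allowed there.
CHILD_CODE = (
    "import signal\n"
    "signal.signal(signal.SIGINT, signal.default_int_handler)\n"
    "print('handler registered')\n"
)


def run_in_child_process(code, timeout=30):
    """Run Python source in a fresh interpreter and return its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child fails
    )
    return completed.stdout.strip()


result = []
# Start the child from a worker thread, standing in for the streamlit script.
worker = threading.Thread(
    target=lambda: result.append(run_in_child_process(CHILD_CODE))
)
worker.start()
worker.join()
print(result[0])
```

In the app, the child code would contain the extraction call and pass the text back through stdout. The cost is one interpreter start-up per extraction.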

I am also getting an error like this:
Traceback (most recent call last):
File "C:\Python311\Lib\site-packages\ipykernel\kernelapp.py", line 699, in initialize
    self.init_signal()
File "C:\Python311\Lib\site-packages\ipykernel\kernelapp.py", line 543, in init_signal
    signal.signal(signal.SIGINT, signal.SIG_IGN)
File "C:\Python311\Lib\signal.py", line 56, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: signal only works in main thread of the main interpreter

Tell me the solution.

Hello,
I am facing this error. Please help me understand it.
You can now view your Streamlit app in your browser.

Local URL: http://localhost:8501
Network URL: http://172.16.1.24:8501

NOTE: When using the ipython kernel entry point, Ctrl-C will not work.

To exit, you will have to explicitly quit this process, by either sending
"quit" from a client, or using Ctrl-\ in UNIX-like environments.

To read more about this, see https://github.com/ipython/ipython/issues/2049

To connect another client to this kernel, use:
    --existing kernel-28184.json
[IPKernelApp] ERROR | Unable to initialize signal:
Traceback (most recent call last):
File "C:\Users\pooja.d\AppData\Local\anaconda3\Lib\site-packages\ipykernel\kernelapp.py", line 701, in initialize
    self.init_signal()
File "C:\Users\pooja.d\AppData\Local\anaconda3\Lib\site-packages\ipykernel\kernelapp.py", line 545, in init_signal
    signal.signal(signal.SIGINT, signal.SIG_IGN)
File "C:\Users\pooja.d\AppData\Local\anaconda3\Lib\signal.py", line 56, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: signal only works in main thread of the main interpreter
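
For anyone debugging a traceback like the ones above, a quick first check is whether the code that ends up calling signal.signal is running in the main thread at all. A minimal sketch with the standard library follows. in_main_thread is a hypothetical helper, not part of streamlit or ipykernel, and the worker thread only stands in for whatever thread runs the script.

```python
import threading


def in_main_thread():
    """Return True if the caller is running in the main thread."""
    return threading.current_thread() is threading.main_thread()


seen = {}
# Evaluate the check from a worker thread to show the failing situation.
worker = threading.Thread(target=lambda: seen.update(worker=in_main_thread()))
worker.start()
worker.join()
print(seen["worker"])
```

Printing in_main_thread() near the top of the script that is being run should show whether the restriction from the ValueError applies there.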