Summary
Hey,
I am getting the error described above, and I know that there are already threads concerned with that problem, but to the best of my knowledge it has never been resolved.
I have been asked to update code that used to work but no longer does. The program crawls a web page from a URL the user supplies, does some computations, and displays the crawled text using streamlit. The app starts fine and I can enter the URL in its field, but when I start the computation I get this error message:
Error Message:
  File "/home/folder/check_env/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "/home/folder/app.py", line 130, in <module>
    article, title = extract_text_from_url(URL)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/folder/app.py", line 77, in extract_text_from_url
    tra_article = trafilatura.extract(downloaded)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/folder/check_env/lib/python3.11/site-packages/trafilatura/core.py", line 1054, in extract
    signal(SIGALRM, timeout_handler)
  File "/home/anaconda3/lib/python3.11/signal.py", line 56, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The peculiar thing is that the error occurs when executing the line article, title = extract_text_from_url(URL), which works fine when I run it without streamlit.
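The difference between the two runs can be reproduced without streamlit or trafilatura. The sketch below is only an illustration under one assumption: that streamlit executes app.py in a worker thread rather than in the main thread. Python's signal.signal() may only be called from the main thread, and the traceback above ends inside exactly that call. The helper name install_handler is made up for this example, and SIGALRM is only available on Unix-like systems such as the Ubuntu machine used here.

```python
import signal
import threading


def install_handler():
    """Try to register a no-op SIGALRM handler.

    Returns None on success, or the error text if Python refuses.
    """
    try:
        signal.signal(signal.SIGALRM, lambda signum, frame: None)
        return None
    except ValueError as exc:
        return str(exc)


# Called directly, as when running the script with plain `python app.py`.
main_result = install_handler()

# Called from a worker thread (assumed to mirror how streamlit runs the script).
worker_result = []
worker = threading.Thread(target=lambda: worker_result.append(install_handler()))
worker.start()
worker.join()

print("direct call  :", main_result)
print("worker thread:", worker_result[0])
```

If the assumption holds, it would explain why the same function call succeeds from a plain Python run and fails only when started through streamlit run.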
This is the code to the function:
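    import requests
    import streamlit as st
    import trafilatura

    def extract_text_from_url(URL):
        resp = requests.get(URL)
        if resp.ok:
            article = ""
            downloaded = trafilatura.fetch_url(URL)
            title = trafilatura.extract_metadata(downloaded).title
            # This is the call that fails when the app runs under streamlit
            tra_article = trafilatura.extract(downloaded)
            para_list = tra_article.split('\n')
            # Keep only paragraphs that contain more than one period
            for para in para_list:
                if para.count('.') > 1:
                    article += para
                    article += '\n'
            #parser = PlaintextParser.from_string(doc, Tokenizer(LANGUAGE))
            #parser = TextBlob(text)
            return article, title
        else:
            # English: "Invalid URL!! Please enter a valid URL."
            st.sidebar.error('Ungültige URL!! Bitte geben Sie eine gültige URL ein.')
            # Return a pair so the caller's tuple unpacking does not fail
            return None, None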
Debug info
- Streamlit version: 1.25.0
- Python version: 3.11.3
- Using Conda
- executing my “app.py” with the command
streamlit run app.py
- executing the code via the PyCharm IDE on an Ubuntu machine; I also tried starting it from the terminal with the command mentioned above, and it leads to the same error.
I am still new to this, so if you need any further information, please let me know :) Thanks for the help!