spaCy dependency parsing app - runs locally but not when deployed

emmac · June 13, 2023, 9:16pm

Dear Streamlit team and community,

I’ve been trying to deploy a very simple app that demonstrates dependency parsing with spaCy models, which works locally with torch==2.0.1.

Since my attempts to deploy with the same version of torch failed, and I gathered that PyTorch is too large to be deployed, I’ve included a CPU version of torch in requirements.txt.
The models and dependencies install without errors, including torch==2.0.1+cpu, and the only issue I can see in the logs is:
Streamlit server consistently failed status checks
I don’t get any other hints as to what the problem could be, apart from a warning:

warning: missing-index-doctype
× The package index page being used does not have a proper HTML doctype declaration.
╰─> Problematic URL: https://download.pytorch.org/whl/torch_stable.html
note: This is an issue with the page at the URL mentioned above.
hint: You might need to reach out to the owner of that package index, to get this fixed. See https://github.com/pypa/pip/issues/10825 for context.

Is this warning preventing my app from running? Could you suggest what the issue might be?

This is the content of requirements.txt:

pip==23.1.2
spacy==3.4.0
pydantic==1.9.2
spacy-transformers==1.1.2
-f https://download.pytorch.org/whl/torch_stable.html
torch==2.0.1+cpu
urllib3==1.26.15
streamlit==1.23.1
https://github.com/explosion/spacy-models/releases/download/de_core_news_md-3.4.0/de_core_news_md-3.4.0.tar.gz
https://github.com/explosion/spacy-models/releases/download/de_core_news_lg-3.4.0/de_core_news_lg-3.4.0.tar.gz
https://github.com/explosion/spacy-models/releases/download/de_dep_news_trf-3.4.0/de_dep_news_trf-3.4.0.tar.gz
typing_extensions<4.6.0

And the app itself:

import time
import spacy
from spacy import displacy
import streamlit as st

# Wrap the model loading with streamlit caching
@st.cache_resource()
def load_model(model_name):
    return spacy.load(model_name)

# Loading the models
nlp_md = load_model('de_core_news_md')
nlp_lg = load_model('de_core_news_lg')
nlp_trf = load_model('de_dep_news_trf')

pipelines = {"de_core_news_md": nlp_md, "de_core_news_lg": nlp_lg, "de_dep_news_trf": nlp_trf}

# List of sentences to process
sentences = [
    "Ich heisse Pippi Langstrumpf.", "Ich zeichne gern, aber ich spiele nicht gern Computer.",
    "Ich mag Schokolade, aber Spaghetti und Banane mag ich nicht.", "Ist dieser Platz noch frei?",
    "Darf ich mal durch?", "Wie spät ist es?", "Ich habe mich verlaufen.", "Können Sie mir bitte sagen, wie ich zum Bahnhof komme?",
    "Wie viel kostet ein Ticket bis nach Hamburg?", "Können Sie mir bitte helfen?",
    "Ich habe mein Portemonnaie verloren.", "Das habe ich akustisch nicht verstanden.",
    "Wann hast du morgen Zeit?", "Können wir das auf morgen verschieben?", "Ich bin im Stress.",
    "Ich bin gestresst.", "Ich habe keine Zeit.", "Das wird schon klappen!", "Störe ich gerade?",
    "Bitte warten Sie einen Moment.", "Einen Moment bitte.", "Was hast du heute vor?", "Ich melde mich.",
    "Es ist ganz schön kalt hier.",
]

# Adding a title and some explanations
st.title('spaCy parser comparison (German)')
st.markdown("""
Streamlit dashboard to compare the parser of three spaCy pipelines for German:
```de_core_news_md```, ```de_core_news_lg``` and ```de_dep_news_trf```.
Select a sentence from the dropdown menu or input your own sentence in the sidebar.
Dependency tree and processing time will be displayed.
""")
st.markdown("---")

# Adding a selectbox for the sentences to the sidebar
selected_sentence = st.sidebar.selectbox('Select a pre-defined sentence', sentences)

# Adding a text input for the sentences to the sidebar
user_sentence = st.sidebar.text_input('Or type your own sentence')

# Choosing which sentence to analyze
sentence_to_analyze = user_sentence if user_sentence.strip() != "" else selected_sentence


for name, nlp in pipelines.items():
    start_time = time.time()
    doc = nlp(sentence_to_analyze)
    end_time = time.time()
    elapsed_time = end_time - start_time
    svg = displacy.render(doc, style='dep')

    st.markdown(f"**Pipeline**: {name}")
    st.markdown(f"**Elapsed time**: {elapsed_time:.2f} seconds")
    st.markdown(svg, unsafe_allow_html=True)
    st.markdown("---")  # Adds a separator for readability
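
As a side note on the timing in the loop above: a small helper could keep the measurement in one place. This is only a sketch, not part of the deployed app. The name `timed` is made up for illustration, and it uses `time.perf_counter()` instead of `time.time()` because that clock is intended for measuring short intervals.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic clock meant for measuring intervals
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # str.split is only a placeholder for a call such as nlp(sentence_to_analyze).
    tokens, seconds = timed(str.split, "Wie spät ist es?")
    print(tokens, f"{seconds:.6f} seconds")
```

In the app, the call would presumably look like `doc, elapsed_time = timed(nlp, sentence_to_analyze)`, with the rest of the loop unchanged.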

This is the app URL:
https://spacy-dependency-tree-german.streamlit.app/

The GitHub repo is here:
https://github.com/emma-carballal/streamlit_parsing_app/tree/main

Thank you in advance for your help!

Franky1 · June 14, 2023, 9:30am

Try this:

requirements.txt

spacy==3.4.0
https://github.com/explosion/spacy-models/releases/download/de_core_news_md-3.4.0/de_core_news_md-3.4.0.tar.gz
https://github.com/explosion/spacy-models/releases/download/de_core_news_lg-3.4.0/de_core_news_lg-3.4.0.tar.gz
https://github.com/explosion/spacy-models/releases/download/de_dep_news_trf-3.4.0/de_dep_news_trf-3.4.0.tar.gz
streamlit==1.23.1

emmac · June 14, 2023, 3:57pm

Thank you, @Franky1, for getting back to me on my question.
You’re probably right that “less is more” in requirements.txt.

I started by including only the packages you suggest.
When deploying, I got the same error I had been getting locally, caused by an incompatibility of pydantic with typing_extensions>=4.6.0, so typing_extensions<4.6.0 needs to be included in requirements.txt, as explained in https://zenodo.org/record/7970450.

I deployed again, but I am still getting: ❗️ Streamlit server consistently failed status checks

Is the problem that it is installing a GPU version of torch?

Collecting torch>=1.6.0
  Downloading torch-2.0.1-cp39-cp39-manylinux1_x86_64.whl (619.9 MB)
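
One way to check which build actually ends up installed is to look at the version strings torch reports. The sketch below assumes the usual convention that CPU-only wheels carry a `+cpu` suffix in `torch.__version__` and report `torch.version.cuda` as `None`. The helper name `describe_torch_build` is hypothetical, and ROCm builds are not handled.

```python
from typing import Optional


def describe_torch_build(version: str, cuda_version: Optional[str]) -> str:
    """Classify a torch build from torch.__version__ and torch.version.cuda.

    Assumption: CPU-only wheels end in "+cpu" and have no CUDA version.
    """
    if cuda_version is None or version.endswith("+cpu"):
        return "cpu"
    return f"cuda {cuda_version}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        build = describe_torch_build(torch.__version__, torch.version.cuda)
        print(torch.__version__, "->", build)
```

Running this in the deployed environment (for example, printing the result with `st.write`) would show whether the large wheel from the log is a CUDA build. It does not by itself explain the failed status checks.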

I don’t see any problems in the package installations. Are there any clues in the logs that could point me to a solution?
https://spacy-dependency-tree-german.streamlit.app/

Thanks again!

emmac · June 16, 2023, 10:29am

Hi again

Could anyone in the community who has deployed an app on Streamlit Community Cloud that uses PyTorch let me take a look at their GitHub repo?
I would be so grateful.

Thank you!

system · December 13, 2023, 10:30am

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
