Hi everyone,
I’m deploying a Streamlit app to Streamlit Cloud that uses spaCy models (xx_ent_wiki_sm, nl_core_news_sm, fr_core_news_sm) and Sentence-Transformers (LaBSE) via PyTorch. The spaCy models are already listed in my requirements.txt as direct .whl links.
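For context, the relevant part of my requirements.txt looks roughly like this (the 3.8.0 version numbers are illustrative; the URLs follow the pattern of the explosion/spacy-models GitHub releases):

```text
spacy
sentence-transformers
# Model version should match the installed spaCy version; 3.8.0 is illustrative
https://github.com/explosion/spacy-models/releases/download/xx_ent_wiki_sm-3.8.0/xx_ent_wiki_sm-3.8.0-py3-none-any.whl
https://github.com/explosion/spacy-models/releases/download/nl_core_news_sm-3.8.0/nl_core_news_sm-3.8.0-py3-none-any.whl
https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.8.0/fr_core_news_sm-3.8.0-py3-none-any.whl
```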
Locally everything works fine, but on Streamlit Cloud the app fails during startup with the following error:
```text
2025-04-28 17:59:27,024 — INFO — Use pytorch device_name: cpu
2025-04-28 17:59:27,024 — INFO — Load pretrained SentenceTransformer: sentence-transformers/LaBSE
Traceback (most recent call last):
  File "/mount/src/semartagger/pipeline.py", line 35, in <module>
    nlp_ner = spacy.load("xx_ent_wiki_sm")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/adminuser/venv/lib/python3.12/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
           ^^^^^^^^^^^^^^^^
  File "/home/adminuser/venv/lib/python3.12/site-packages/spacy/util.py", line 472, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'xx_ent_wiki_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
```
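For completeness, the loading code in pipeline.py is nothing exotic (simplified here, variable names shortened):

```python
import spacy
from sentence_transformers import SentenceTransformer

# Multilingual NER model plus the Dutch and French pipelines
nlp_ner = spacy.load("xx_ent_wiki_sm")
nlp_nl = spacy.load("nl_core_news_sm")
nlp_fr = spacy.load("fr_core_news_sm")

# LaBSE embeddings, fetched from the Hugging Face Hub on first load
embedder = SentenceTransformer("sentence-transformers/LaBSE")
```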
Since E050 means the model isn't even installed as a Python package, it seems like Streamlit Cloud is either not installing the model wheels from requirements.txt or not persisting runtime downloads between deployments.
I’m also concerned that loading the LaBSE model via Hugging Face / PyTorch might hit similar issues in the future if runtime downloads aren’t reliable.
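One thing I’m already considering is caching the loaded models with st.cache_resource, so each model is loaded at most once per running container rather than on every rerun. A minimal sketch (the function name is mine):

```python
import streamlit as st
import spacy
from sentence_transformers import SentenceTransformer

@st.cache_resource
def load_models():
    # st.cache_resource shares the returned objects across reruns and sessions
    nlp_ner = spacy.load("xx_ent_wiki_sm")
    embedder = SentenceTransformer("sentence-transformers/LaBSE")
    return nlp_ner, embedder

nlp_ner, embedder = load_models()
```

As I understand it, though, this cache only lives in memory inside a running container, so it wouldn’t survive a redeploy or fix a model that was never installed in the first place.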
I’m still quite new to Streamlit and deployment, so apologies if this is a silly or basic question.
Is there a clean way to preload both spaCy and Hugging Face models on Streamlit Cloud? Should I commit the model files to the GitHub repo to guarantee availability, or is a runtime fallback like the sketch below acceptable?
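Here’s the fallback I have in mind, using spacy.cli.download (the programmatic equivalent of python -m spacy download) when the package isn’t installed; the helper name is just for illustration:

```python
import spacy
from spacy.cli import download
from spacy.language import Language

def load_spacy_model(name: str) -> Language:
    """Load a spaCy model, downloading it at runtime if it isn't installed."""
    try:
        return spacy.load(name)
    except OSError:
        # E050: model not installed in this environment; fetch it and retry
        download(name)
        return spacy.load(name)

nlp_ner = load_spacy_model("xx_ent_wiki_sm")
```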
Any best practices for handling large models during deployment would be really appreciated.
Thanks so much!
For reference, this is the GitHub repo.