Package resources result in FileNotFoundError under Streamlit.io

Greetings again.

The application relies on a resource file that’s included in the package. When trying to access it from the app hosted by Streamlit here we get this error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/adminuser/venv/lib/python3.12/site-packages/ssscoring/resources/drop-zones-loc-elev.csv'

The requirements.txt file uses the publicly available package ssscoring==2.0.0 from PyPI. After downloading and unzipping the wheel, the resource is present:

Archive:  ssscoring-2.0.0-py3-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
      888  01-22-2025 22:27   ssscoring/__init__.py
     6909  01-22-2025 22:29   ssscoring/app.py
    25775  01-22-2025 22:07   ssscoring/calc.py
     4441  12-30-2024 11:37   ssscoring/cli.py
     2859  01-22-2025 22:07   ssscoring/constants.py
     2070  01-19-2025 01:38   ssscoring/datatypes.py
      707  08-28-2023 04:10   ssscoring/errors.py
     8506  01-22-2025 22:07   ssscoring/flysight.py
     8826  01-22-2025 22:07   ssscoring/notebook.py
    15193  01-22-2025 22:07   ssscoring/resources/drop-zones-loc-elev.csv
     1529  01-22-2025 22:30   ssscoring-2.0.0.dist-info/LICENSE.txt
     8564  01-22-2025 22:30   ssscoring-2.0.0.dist-info/METADATA
       91  01-22-2025 22:30   ssscoring-2.0.0.dist-info/WHEEL
       58  01-22-2025 22:30   ssscoring-2.0.0.dist-info/entry_points.txt
       10  01-22-2025 22:30   ssscoring-2.0.0.dist-info/top_level.txt
     1285  01-22-2025 22:30   ssscoring-2.0.0.dist-info/RECORD
---------                     -------
    87711                     16 files
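
The same check can be scripted rather than read off the unzip listing. This is a minimal sketch, assuming the wheel has already been downloaded to a local path; the helper name wheel_contains is made up for illustration and is not part of ssscoring:

```python
import zipfile


def wheel_contains(wheel_path: str, member: str) -> bool:
    """Return True if the wheel (a plain zip archive) holds the given member path."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Member names inside a wheel always use forward slashes.
        return member in wheel.namelist()
```

Calling wheel_contains("ssscoring-2.0.0-py3-none-any.whl", "ssscoring/resources/drop-zones-loc-elev.csv") should return True for the archive listed above. Note that this only says what is in the wheel, not what ends up installed.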

The code that processes the resource works when running Streamlit on a local workstation (macOS, Linux), in a virtual environment. It also works fine when Dockerized using the latest Python Docker image. The FileNotFoundError only happens in the Streamlit deployment.

The code that uses the resource:

from importlib.resources import files # code is correct, this snippet missed the 's' earlier
from io import StringIO
RESOURCES = 'ssscoring.resources'
RESOURCE_NAME='drop-zones-loc-elev.csv'
.
.
@st.cache_data
def _initDropZonesFromResource(resourceName: str) -> pd.DataFrame:
    buffer = StringIO(files(RESOURCES).joinpath(resourceName).read_bytes().decode(FLYSIGHT_FILE_ENCODING))
    dropZones = pd.read_csv(buffer, sep=',')
    return dropZones
.
.
dropZones = _initDropZonesFromResource(RESOURCE_NAME)

requirements.txt:

bokeh==2.4.3
click
haversine
importlib-resources
jupyter_bokeh==3.0.4
numpy<1.26.4
psutil
ssscoring==2.0.0
streamlit

ssscoring==2.0.0 is fetched from PyPI. The app reported the same error when the package was pulled from the GitHub repository/branch.

Please advise on how to handle app resources in the Streamlit environment, since the Python best-practice method results in that FileNotFoundError.

Thanks in advance and have a wonderful evening,

pr3d

Your code works for me in streamlit cloud, after fixing the obvious issues (removing the extra dots, importing files instead of file, importing streamlit and pandas, defining FLYSIGHT_FILE_ENCODING).

Hi Goyo! Thanks for your reply. Questions:

  • What extra dots?
  • streamlit as st and pandas as pd are imported wherever needed
  • FLYSIGHT_FILE_ENCODING is defined in ssscoring.constants and imported where used
  • from importlib_resources import files is at the top of the app module and used in the _initDropZonesFromResource() function

Which files/modules did you test or modify to test? ssscoring==2.0.1 works fine if installed in a new, un-initialized Python virtual environment if pulled from PyPI. I can start streamlit fine in a macOS desktop + new virtualenv, or in a Python Docker container that pulls the latest ssscoring from PyPI.

The ssscoring.app module is in that package. Please advise where you made your changes because I can’t see any of the issues you described. The code I tested is in master or PyPI, version 2.0.1 - it’s the same that’s deployed to https:// ssscore.streamlit.app - it may be at 2.0.2 by the time you read this reply. It was in a working branch (0023-GUI-prototypes) and is now in the master branch.

I suspect you might’ve used the 1.9 version in master for your test, not 2.0 or the working branch used to test Streamlit deployments. I implemented a Python object with the DZs as a list of dictionaries (to convert to DataFrame upon loading) as a workaround until I could figure out how to deploy and load a package resource, per the help request. That’s what’s running now on Streamlit, defined in the ssscoring.dzdir module.
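
A minimal sketch of that kind of fallback, assuming nothing about the real contents of ssscoring.dzdir: FALLBACK_DROP_ZONES and load_drop_zones are illustrative names, the single row is placeholder data, and the csv module stands in for pandas to keep the example dependency-free.

```python
import csv
from importlib.resources import files
from io import StringIO

# Hypothetical stand-in for the embedded object; placeholder data only.
FALLBACK_DROP_ZONES = [
    {"dropZone": "Example DZ", "lat": "0.0", "lon": "0.0", "elevation": "0.0"},
]


def load_drop_zones(package: str, resource_name: str, encoding: str = "utf-8") -> list:
    """Read the CSV resource if it is installed, else return the embedded rows."""
    try:
        text = files(package).joinpath(resource_name).read_text(encoding=encoding)
    except (FileNotFoundError, ModuleNotFoundError):
        # Resource file or resource package missing from the environment.
        return list(FALLBACK_DROP_ZONES)
    return list(csv.DictReader(StringIO(text)))
```

With this shape the app keeps running on the embedded rows when the CSV is absent, and picks up the packaged CSV as soon as it is actually installed.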

Even if you had looked at the 1.9 version I’m still curious about how you made it work. Could you please paste the changes here or in a Gist? I ask because none of the things you described is missing in 2.0.x or testing branches, as far as I can tell but the app can’t load the DZs directory as a resource - much appreciated!

Thanks in advance and cheers!

E

I just deployed to streamlit cloud the code and requirements that you posted above.

It looks like you are doing something different, but I am a bit confused about what it is.

I’m confused too because the code from where I took the snippets to show the problem area includes all the imports that you mentioned in your post. Did you use the code off GitHub or only the snippet I posted in the original question?

Edit: I saw that the ‘s’ in ‘files’ is missing in the snippet, but not in the actual code. It got lost while I copied off the source to the post here. Now I understand why you mentioned it was missing.

Streamlit forum doesn’t let me post the full link to GitHub for some reason. You can see the actual code at github_com/pr3d4t0r/SSScoring - the latest in master is 2.0.4, it shows the code in the first post of this thread and the workaround on lines 58-68 of ssscoring/app.py - I didn’t remove the code that caused the exception, only added the new function _initDropZonesFromObject(). Those are defined in ssscoring/dzdir.py

Thanks and cheers!

Do you have a deployed app that shows the issue? I visited https://ssscore.streamlit.app and didn’t see any errors, but it is deployed from a different repository.

I think you can post links as preformatted text, like https://ssscore.streamlit.app.

Hi again!

https://ssscore.streamlit.app/ now has the resource code enabled. I tested it in the Mac and Docker images, it works fine there but not in the deployed app.

You can see the actual code here:

https://github.com/pr3d4t0r/SSScoring/blob/0a79f2e8b70c5a7bdd368d1913941fdd1e2dd38d/ssscoring/app.py#L72

Thanks for looking at this,

E

Hi again - you can see the expected behavior if you compare it with the code in the actual branch and package:

# source ~/Python-virtualenv/bin/activate in effect, of course:
cd /tmp && \
    git clone git@github.com:pr3d4t0r/SSScoring.git && \
    pushd SSScoring && \
        git checkout "XXXXX-streamlit_app-n-resources-test" && \
        pip install -e . && \
        streamlit run ssscoring/app.py && \
    popd

I’ll leave that Git branch active until I hear back from you – thanks and cheers!

This gives an ImportError while trying to import ssscoring.resources, which can be fixed by adding an empty __init__.py.

Then you get the FileNotFoundError, because the CSV file is not copied during installation. I was able to reproduce the issue on my Linux box using Python 3.13 (but that should not make a difference in this regard).

python -m venv sss
source ./sss/bin/activate
pip install ssscoring@git+https://github.com/goyodiaz/SSScoring@ec8003a
python
Python 3.13.1 (main, Dec  4 2024, 18:05:56) [GCC 14.2.1 20240910] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from importlib.resources import files
>>> path = files("ssscoring.resources").joinpath("drop-zones-loc-elev.csv")
>>> path
PosixPath('/home/goyo/dev/SSS/sss/lib/python3.13/site-packages/ssscoring/resources/drop-zones-loc-elev.csv')
>>> path.exists()
False
>>> path.parent
PosixPath('/home/goyo/dev/SSS/sss/lib/python3.13/site-packages/ssscoring/resources')
>>> for child in path.parent.iterdir():
...     print(child)
...     
/home/goyo/dev/SSS/sss/lib/python3.13/site-packages/ssscoring/resources/__init__.py
/home/goyo/dev/SSS/sss/lib/python3.13/site-packages/ssscoring/resources/__pycache__
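
The interactive check above can be wrapped in a small helper to run in any environment, including a deployed app. This is only a sketch; list_package_resources is a made-up name, and it assumes the package can be imported in the first place:

```python
from importlib.resources import files


def list_package_resources(package: str) -> list:
    """Return the sorted names of the entries actually present in an installed package."""
    return sorted(entry.name for entry in files(package).iterdir())
```

If the data file was copied during installation, list_package_resources("ssscoring.resources") would include drop-zones-loc-elev.csv; in the session above it would not.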

Hola - thanks.

I tried that yesterday, with and without __init__.py in there. The documentation for importlib_resources makes it a point that __init__.py isn’t necessary for how they handle the resource extraction.

Were you able to get it to work, with or without the package initialization file, under Streamlit, using importlib_resources or similar?

Mac and Docker (Linux x86 and ARM64 containers) in which I tested this are Python 3.13.1:

(Python-3_13_1) |0 :) ciurana@nena SSScoring %> uname -a && python -V
Darwin nena.local 23.6.0 Darwin Kernel Version 23.6.0: Fri Nov 15 15:13:15 PST 2024; root:xnu-10063.141.1.702.7~1/RELEASE_ARM64_T6000 x86_64
Python 3.13.1

I have several 3.12.* Python releases lying around the file system. I tested with them after reading your post because they match what’s running in Streamlit. Here’s the result from them:

(Python-3_12_6) |0 :) ciurana@nena SSScoring %> cat test.py && python test.py
from importlib_resources import files
from io import StringIO
import platform
import pandas as pd
buffer = StringIO(files('ssscoring.resources').joinpath('drop-zones-loc-elev.csv').read_bytes().decode('utf-8'))
dropZones = pd.read_csv(buffer, sep=',')
assert isinstance(dropZones, pd.DataFrame)
print(platform.uname())
print(platform.python_version())
# output
uname_result(system='Darwin', node='nena.local', release='23.6.0', version='Darwin Kernel Version 23.6.0: Fri Nov 15 15:13:15 PST 2024; root:xnu-10063.141.1.702.7~1/RELEASE_ARM64_T6000', machine='x86_64')
3.12.6

Linux:

uname -a && python test.py 
Linux a7da2dfe40b5 6.10.14-linuxkit #1 SMP Fri Nov 29 17:22:03 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
uname_result(system='Linux', node='a7da2dfe40b5', release='6.10.14-linuxkit', version='#1 SMP Fri Nov 29 17:22:03 UTC 2024', machine='aarch64')
3.12.6

No errors, dropZones has a reference to the valid dataframe, OS and platform are visible. Please let me know if you were able to get this to run.

EDIT: I reverted the app on Streamlit over to 2.0.4 that uses the workaround. I’ll check on deploying a secondary, “staging and testing” app off the same repository if we need it for diagnostics.

Cheers and thanks!

No, I couldn’t make it work because the csv file is not copied upon installation. That should be dealt with in pyproject.toml, I guess. You are putting it in package-data, but for whatever reason it is not working.

Coolio, thanks. This looks now like a legitimate Streamlit bug. I’ll check how to file it.

I appreciate your help – have an awesome weekend!

pr3d4t0r

It doesn’t look like a streamlit bug. It is not specific to streamlit cloud, if that is what you mean. I installed ssscoring (from ec8003a) in my own linux computer and the csv file wasn’t there. I wrote down the detailed steps I followed above.

That’s weird because I’m installing the same and it works just fine, and the file is present in the ./ssscoring/resources directory. You can see it on GitHub.

I’ve published this with the workaround, but I do believe there’s a bug in how Streamlit treats resources. It seems to pull the files from GitHub but doesn’t look at pyproject.toml and that’s what causes the issue. That’s also what makes it think that the resources directory needs to have an __init__.py file - it doesn’t according to the combination of pyproject.toml and importlib_resources documentation.

If you pull the package from PyPI and open the wheel you’ll see that the resource file is there, and code that refers to it as a resource will find it fine. I will decide in the next few days how to describe this and open a bug report.

Thanks again for your help and cheers!

E

Can I see on GitHub what happens when you install ssscoring@ec8003a? How do I do that?

The two bugs I have seen so far can be replicated in my computer too.

When I asked for code that shows the issue you didn’t give me a pypi package but a github branch. If it can be reproduced with a pypi package, I can test that too. Just tell me what package I should try.

You need the resource file in the environment after the package has been installed. Having it in the wheel can help, but it is neither necessary nor sufficient.
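
One way to tell those two situations apart is to compare what the installer recorded with what is on disk. The following is a rough sketch, assuming a regular (non-editable) install that wrote a RECORD file; resource_install_status is a hypothetical helper name:

```python
import csv
from importlib.metadata import PackageNotFoundError, distribution
from pathlib import Path
from typing import Optional, Tuple


def resource_install_status(dist_name: str, relative_path: str) -> Optional[Tuple[bool, bool]]:
    """Return (listed in RECORD, present on disk), or None if not installed."""
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return None
    # RECORD is a CSV file whose first column is the installed file path.
    record = dist.read_text("RECORD") or ""
    recorded = any(
        row and row[0] == relative_path for row in csv.reader(record.splitlines())
    )
    on_disk = Path(dist.locate_file(relative_path)).exists()
    return recorded, on_disk
```

For example, resource_install_status("ssscoring", "ssscoring/resources/drop-zones-loc-elev.csv") returning (False, False) would suggest the build never packaged the file, which points at the packaging configuration rather than at the hosting platform.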

Hi Goyo!

Thanks - I don’t know what you’re installing, and I think the instructions to get something newer got lost in all the messages. Here’s a super easy, three-lines in the shell, test to validate:

(Python-3_13_1) |0 :) ciurana@nena SSScoringLit %> pip install ssscoring==2.0.15 > /dev/null && pip list | grep ssscoring
ssscoring                                         2.0.15
(Python-3_13_1) |0 :) ciurana@nena SSScoringLit %> echo "from importlib_resources import files; from io import StringIO; print(StringIO(files('ssscoring.resources').joinpath('drop-zones-loc-elev.csv').read_bytes().decode('utf-8')).readline())" | python 
,dropZone,lat,lon,elevation,location,country

(Python-3_13_1) |0 :) ciurana@nena SSScoringLit %> pip uninstall -y ssscoring                                                                                                                                   
Found existing installation: ssscoring 2.0.15
Uninstalling ssscoring-2.0.15:
  Successfully uninstalled ssscoring-2.0.15

I scraped this right off the terminal. Python 3.13.1 virtual environment, clean. It does the same thing under ARM, Apple Silicon, and x86 (Linux, macOS, Linux).

Thanks and cheers!

E

Ok, but that package also works in streamlit cloud, for me at least. So the issue went away?

No, it doesn’t work when hosted in Streamlit.app. The package works, but the resource can’t be read as a resource. I made a synthetic object with the resource and that’s what it loads, not the CSV. An extra step if we need to update the DZ directory because it goes like this:

DB -> CSV (raw) -> CSV (munged) -> converter -> dzdir.py

This is what we’d love to have:

DB -> CSV (raw) -> CSV (munged)

The munged CSV is the resource. I’d prefer to not make that synthetic object because it’s an annoying manual step that shouldn’t be necessary.

You can look at what the code does now here:

https://github.com/pr3d4t0r/SSScoring/blob/master/ssscoring/dzdir.py

I wrote a silly CSV-to-Python-dictionary converter to get around importlib_resources not working under Streamlit.app
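
For illustration only, a converter of that kind could be as small as the sketch below. It is not the converter used in the repository; csv_to_module_source and the DROP_ZONES variable name are assumptions made for the example:

```python
import csv
import pprint
from io import StringIO


def csv_to_module_source(csv_text: str, variable_name: str = "DROP_ZONES") -> str:
    """Turn CSV text into Python source that assigns a list of row dictionaries."""
    rows = list(csv.DictReader(StringIO(csv_text)))
    # pprint.pformat emits a valid Python literal for lists of str-to-str dicts.
    return f"{variable_name} = {pprint.pformat(rows)}\n"
```

Writing the returned string to a .py file gives a module that can be imported like any other, at the cost of rerunning the conversion every time the CSV changes.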

Cheers!

It does seem to work for me. What do you do to reproduce the issue in streamlit cloud with ssscoring==2.0.15?

Hi!

I just set this up under 2.0.16 – it works now! I guess something was not kosher in the previous versions. I have the code, side-by-side, between local and Streamlit.io and the resource is resolved without issues if the package comes as a wheel. It still can’t figure out the resource if it’s installed from GitHub source instead of PyPI/wheel.

I 100% appreciate your help, thanks for working with me on this. Have an awesome weekend.