ModuleNotFoundError: No module named 'sklearn' and 'matplotlib'

I keep getting this error over and over. I have read the discussions and the documentation, but nothing changes.
This is the link to my files: [GitHub - elijahnkuah/insurance].
Below is the error:

ModuleNotFoundError: No module named 'sklearn'

Traceback:

File "/usr/local/lib/python3.7/site-packages/streamlit/script_runner.py", line 332, in _run_script
    exec(code, module.__dict__)
File "/app/insurance/insurance_streamlit.py", line 3, in <module>
    from sklearn.metrics import roc_auc_score, log_loss

Usage: pip [options]


ERROR: Invalid requirement: pip install --upgrade pip

pip: error: no such option: --upgrade


WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.

You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

[manager] Processed dependencies!


  You can now view your Streamlit app in your browser.


  Network URL: http://10.12.81.54:8501

  External URL: http://34.82.55.159:8501


ERROR: Invalid requirement: 'pip install pipreqs' (from line 1 of requirements.txt)

WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.

You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

  Stopping...

[manager] Error checking Streamlit healthz: Get "http://localhost:8501/healthz": dial tcp 127.0.0.1:8501: connect: connection refused



  A new version of Streamlit is available.


  See what's new at https://discuss.streamlit.io/c/announcements


  Enter the following command to upgrade:

  $ pip install streamlit --upgrade



  You can now view your Streamlit app in your browser.


  Network URL: http://10.12.81.54:8501

  External URL: http://34.82.55.159:8501


Requirement already satisfied: altair==4.1.0 in /usr/local/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (4.1.0)

ERROR: Could not find a version that satisfies the requirement anaconda-client==1.7.2

ERROR: No matching distribution found for anaconda-client==1.7.2

WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.

You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

  Stopping...

[manager] Error checking Streamlit healthz: Get "http://localhost:8501/healthz": dial tcp 127.0.0.1:8501: connect: connection refused



  You can now view your Streamlit app in your browser.


  Network URL: http://10.12.81.54:8501

  External URL: [2021-02-03 16:34:10.160225] http://34.82.55.159:8501


ERROR: Invalid requirement: 'pip install pipreqs' (from line 1 of requirements.txt)

WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.

You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

  Stopping...


  You can now view your Streamlit app in your browser.


  Network URL: [2021-02-03 16:35:53.621185] http://10.12.81.54:8501

  External URL: http://34.82.55.159:8501


Requirement already satisfied: altair==4.1.0 in /usr/local/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (4.1.0)

ERROR: Could not find a version that satisfies the requirement anaconda-client==1.7.2

ERROR: No matching distribution found for anaconda-client==1.7.2

WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.

You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

  Stopping...


  You can now view your Streamlit app in your browser.


  Network URL: http://10.12.81.54:8501

  External URL: http://34.82.55.159:8501

Hey @Elijah_Nkuah,

First, Welcome to the Streamlit community!!! :partying_face: :partying_face: :partying_face: :tada:

Now let’s get down to business! :face_with_monocle:

Looking at your app, you import 13 distinct packages, either whole or just specific functions from them.

The packages you import in your insurance_streamlit.py file are listed here:

sklearn
streamlit
pandas
numpy
matplotlib
seaborn
xgboost
catboost
lightgbm
plotly
altair
base64
csv

BUT when it comes to your requirements.txt file, you have 253 specific dependencies that are being pip installed on your Streamlit Sharing app. When I say specific here, I mean they each have a version specifier (the == #.#.# part). This makes me think that you generated your pip requirements file automatically, but that you weren’t in a clean environment :question:

I think this is the main problem you’re experiencing: your requirements.txt file should ideally list only the packages you intend to use for this project. You can fix this by either:

  1. Creating a new environment on your computer, installing just the 13 packages you use in your app, and then re-generating your requirements file
  2. Creating your requirements.txt by hand by copying in the list above (for now, let’s forget the version specifier)
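
A side note on option 2: the names that go into requirements.txt are PyPI package names, which are not always the same as the import names. sklearn is installed as scikit-learn, and base64 and csv ship with Python, so they should not be listed at all. Here is a minimal sketch of that mapping, assuming Python 3.10+ for sys.stdlib_module_names (with a small hard-coded fallback otherwise). The names IMPORTED, PYPI_NAMES, and requirements_lines are only for illustration:

```python
import sys

# Import names used in insurance_streamlit.py (the list above).
IMPORTED = [
    "sklearn", "streamlit", "pandas", "numpy", "matplotlib", "seaborn",
    "xgboost", "catboost", "lightgbm", "plotly", "altair", "base64", "csv",
]

# Import name -> PyPI package name, only where the two differ.
PYPI_NAMES = {"sklearn": "scikit-learn"}

# Standard-library modules come with Python and must not be pip installed.
# sys.stdlib_module_names exists on Python 3.10+; the fallback only covers
# the two standard-library modules in this particular list.
STDLIB = getattr(sys, "stdlib_module_names", frozenset({"base64", "csv"}))


def requirements_lines(imported):
    """Return sorted PyPI package names for the given import names."""
    return sorted(PYPI_NAMES.get(name, name) for name in imported if name not in STDLIB)


if __name__ == "__main__":
    # Paste the printed lines into requirements.txt in the repo root.
    print("\n".join(requirements_lines(IMPORTED)))
```

Note that each line of requirements.txt is a package name, not a command, which is why a line such as "pip install pipreqs" is rejected as an invalid requirement in the log above.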
Side Note: Torch Info

Also, I noticed you’re installing Torch (400 MB in size). Our Sharing platform only allows 800 MB of space for each person’s app, so Torch alone takes up a huge portion of the total space. Luckily, you are not using Torch, so I would highly recommend removing at least this requirement! :nerd_face:

Let me know if this works!
Marisa

Thank you very much. It worked.
I removed the unneeded libraries from requirements.txt.

Please kindly explain this in detail for me. I know my files are nowhere near that size.

Also, I noticed you’re installing Torch (400 MB in size), our Sharing platform only allows 800 MB of space for each person’s app, which makes this a huge portion of the total space.

Hey :wave:

So Streamlit Sharing only allows 800 MB of space for each person’s deployed app.

When you were installing all of those dependencies from your requirements.txt file that you didn’t need, they were actually taking up all of the available space!

Your files themselves aren’t that big, but by the time the platform had finished installing all 250 of your packages, there was no room for anything else!

That’s why, after you changed your requirements.txt to list just the packages you import in your script, it fixed the issue! :tada:

Happy Streamlit-ing!
Marisa

OH I SEE
Thanks for the clarity

Hi Marisa,

I have the same problem: “ModuleNotFoundError: No module named ‘sklearn’”. I originally used pipreqs to create the requirements.txt. After reading your comments, I tried creating it manually (without the version specifiers). However, it still doesn’t work for me. Could you please let me know how I could fix the issue? My GitHub repo for this project is this. Thank you in advance!

@sarazong
Are we talking about streamlit sharing or your local machine?

If local computer:

  • Check your locally installed packages with pip freeze
  • Are you using a virtualenv ?
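
For the local case, the two checks above can also be done from inside Python, which shows what the interpreter that actually runs the app can see. This is only a rough sketch using the standard library; the helper names in_virtualenv and is_importable are made up for illustration:

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def is_importable(module_name: str) -> bool:
    """True when a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run this with the same interpreter you use for `streamlit run`.
    print("interpreter:", sys.executable)
    print("virtualenv:", in_virtualenv())
    for name in ("sklearn", "matplotlib"):
        print(name, "importable:", is_importable(name))
```

If a module shows up in pip freeze but is reported as not importable here, pip and the script are most likely using different environments.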

If streamlit sharing:

  • As far as I know, the requirements.txt must be in the root folder of the GitHub repository.

Hey @sarazong,

@Franky1 is right! I have taken a look at your GitHub repo and (assuming you’re deploying on Sharing) your requirements.txt file is in a subdirectory of your repo and not the root directory. Move your requirements from Metis_project5/streamlit/requirements.txt to Metis_project5/requirements.txt and you should be good to go!

Happy Streamlit-ing!
Marisa

Thank you @Franky1

I moved the requirements file and it worked. Thank you :)! However, I ran into another problem sharing my dashboard. It worked perfectly fine locally, so I am not sure why the following error occurs:
FileNotFoundError: [Errno 2] No such file or directory: 'data/data_for_streamlit.pkl'

Of course, because you renamed the folder to datasets/

There is a data folder inside my streamlit folder containing the data for the dashboard. The datasets folder outside of the streamlit folder contains data for other parts of the project. I renamed the outside folder since I thought having two folders with the same name in the project might be the cause of the problem–but it didn’t help.

Yes, that could be the problem. For the Streamlit Sharing runtime, the root folder of the GitHub repo is probably also the start folder for the Python interpreter.
I would clean up the GitHub repo:

  • put the content of streamlit/ into the root folder.
  • put everything else (PDFs, notebooks etc.) into subfolders or, even better, out of this repo altogether.

Or make a new GitHub repo just for your Streamlit application.

@Marisa_Smith
Is it possible to use something like .dockerignore to exclude certain files in the GitHub repository from the deployment to Streamlit Sharing?

This isn’t the issue.
As @Franky1 mentions, the error indicates that your relative reference to the file isn’t valid from the directory where the Streamlit app was started. To fix this, you can use pathlib to inspect your environment programmatically, or you can make the relative reference more explicit.
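
As a rough sketch of the pathlib approach (not necessarily how your app is laid out): build the path from the location of the script file instead of relying on the current working directory. The helper names resolve_next_to and describe_environment are hypothetical, and the data/ folder sitting next to the script is an assumption based on the description above:

```python
from pathlib import Path


def resolve_next_to(script_file: str, *parts: str) -> Path:
    """Build an absolute path relative to the folder that contains script_file."""
    return Path(script_file).resolve().parent.joinpath(*parts)


def describe_environment() -> dict:
    """Report where Python was started from, to debug relative-path errors."""
    return {"cwd": str(Path.cwd())}
```

In the app itself you would call something like resolve_next_to(__file__, "data", "data_for_streamlit.pkl") and pass the result to whatever loads the pickle. Printing describe_environment() in the deployed app shows which folder a plain relative path such as data/... is being resolved against.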

Best,
Randy

@Franky1 Thank you for your help! I think creating a separate repo for my dashboard would be the last thing to try if I cannot figure out another way.

@randyzwitch Thank you for the tips, will look into this!

Hello! I’m having the same problem here. My code works perfectly on my local machine, but when I try to deploy it on Streamlit Sharing, the following error pops up:
ModuleNotFoundError: No module named ‘twint’. The link for my repo is this. My requirements.txt file is already placed in the top directory.

I am quite sure that something goes wrong during the deployment of your requirements.txt.
There should be corresponding error messages in the log!?

Edit:
I think the latest available twint version in PyPI is version 2.1.20