No module named 'sklearn' even though it is listed in requirements.txt

Hi! I am new to the community and trying to deploy my machine learning project.
Like many others, I get the following error when loading a pickled model with pickle:
ModuleNotFoundError: No module named 'sklearn'
The main file of the project is “application.py”.
Can anyone help me?

Link to public deployed app:
https://swd-salary-prediction.streamlit.app/

Here is my GitHub repository:

Streamlit and Python Versions:
Streamlit: 1.3.2
Python 3.12.0

Error message (repeated many times):

2024-01-21 01:55:06.567 503 GET /script-health-check (10.12.2.5) 145.40ms
2024-01-21 01:55:11.405 Uncaught app exception
Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "/mount/src/swd-salary-prediction/application.py", line 2, in <module>
    from predict_page import show_predict_page
  File "/mount/src/swd-salary-prediction/predict_page.py", line 10, in <module>
    data = load_model()
  File "/mount/src/swd-salary-prediction/predict_page.py", line 7, in load_model
    data = pickle.load(file)
ModuleNotFoundError: No module named 'sklearn'
2024-01-21 01:55:11.515 503 GET /script-health-check (10.12.2.5) 112.63ms
2024-01-21 01:55:16.377 Uncaught app exception
Traceback (most recent call last):
  File "/home/adminuser/venv/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "/mount/src/swd-salary-prediction/application.py", line 2, in <module>
    from predict_page import show_predict_page
  File "/mount/src/swd-salary-prediction/predict_page.py", line 10, in <module>
    data = load_model()
  File "/mount/src/swd-salary-prediction/predict_page.py", line 7, in load_model
    data = pickle.load(file)
ModuleNotFoundError: No module named 'sklearn'
2024-01-21 01:55:16.493 503 GET /script-health-check (10.12.2.5) 119.12ms

Hi @aaronwu001

It seems that you’re loading in a serialized/pickled file of an ML model built from the Jupyter notebook (.ipynb). The only import statement for Scikit-learn was import sklearn in the application.py file.

However, in the Jupyter notebook, several classes and functions from scikit-learn were used:

from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

Can you make sure the predict and explore .py files also include the import statements above that the serialized model relies on, or that those files still use?

Hope this helps!
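
To see which packages a pickled file actually depends on, one option is to record every module that pickle looks up while loading it. The sketch below is only an illustration: `RecordingUnpickler` and `modules_needed` are hypothetical names, and a standard-library `Fraction` stands in for the saved model so the example runs without scikit-learn. Note that pickle imports these modules by itself during `pickle.load`, so each of them has to be installed in the deployed environment regardless of what the script imports.

```python
import io
import pickle
from fractions import Fraction


class RecordingUnpickler(pickle.Unpickler):
    """Unpickler that records every top-level module the pickle refers to."""

    def __init__(self, file):
        super().__init__(file)
        self.modules = set()

    def find_class(self, module, name):
        # pickle calls find_class for each class or function stored by reference.
        self.modules.add(module.split(".")[0])
        return super().find_class(module, name)


def modules_needed(pickled_bytes):
    """Load the pickle and return the set of top-level modules it required."""
    unpickler = RecordingUnpickler(io.BytesIO(pickled_bytes))
    unpickler.load()
    return unpickler.modules


# Stand-in for the saved model: a standard-library object (an assumption for this demo).
payload = pickle.dumps({"model": Fraction(3, 4)})
needed = modules_needed(payload)
print(sorted(needed))
```

Run against the real model file in the environment where it was trained, the same approach should list `sklearn` among the modules, which can then be checked against requirements.txt.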

Hi @dataprofessor Thank you for the help!

I copied the import statements you listed from the Jupyter notebook into the files ‘predict_page.py’ and ‘explore_page.py’.

Is this what you meant?

Unfortunately, I still get exactly the same error.
Do you mind taking a look one more time?

  • remove pickle from the requirements file
  • do a reboot of the app

That worked. Thank you so much, especially for the reminder to reboot the app!

Hi,

Thanks again for the help.

However, although the app can now be seen online, an error occurs when I click “calculate” to run the machine learning models:

AttributeError: ‘DecisionTreeRegressor’ object has no attribute ‘monotonic_cst’

This does not happen on localhost. Could anyone check the code, or does anyone have an idea what might be causing the error?

Thank you!
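
An error like this typically points to a version mismatch: the `monotonic_cst` parameter of the tree estimators was added in scikit-learn 1.4, so a model pickled with an older release and loaded under 1.4 or later lacks that attribute. If that is the cause here (an assumption, since the thread does not confirm it), pinning the deployed scikit-learn to the version used for training should help. A minimal sketch for producing the pin line on the training machine, with `pin_line` as a hypothetical helper name:

```python
from importlib import metadata


def pin_line(distribution):
    """Return a 'name==version' requirement line, or None if it is not installed."""
    try:
        return f"{distribution}=={metadata.version(distribution)}"
    except metadata.PackageNotFoundError:
        return None


# Run this in the environment where the model was trained and pickled.
print(pin_line("scikit-learn"))
```

The printed line can replace the unpinned scikit-learn entry in requirements.txt, followed by another reboot of the app. Re-training and re-pickling the model under the newer version is the alternative.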