I successfully ran the regression example from the PyCaret quick start with Streamlit locally, using Python 3.8 on Windows 10.
Here is my requirements.txt:
alembic==1.9.4
altair==4.2.2
asttokens==2.2.1
attrs==22.2.0
backcall==0.2.0
backports.zoneinfo==0.2.1
blinker==1.5
blis==0.7.9
Boruta==0.3
cachetools==5.3.0
catalogue==1.0.2
certifi==2022.12.7
charset-normalizer==3.0.1
click==8.1.3
cloudpickle==2.2.1
colorama==0.4.6
colorlover==0.3.0
comm==0.1.2
contourpy==1.0.7
cufflinks==0.17.3
cycler==0.11.0
cymem==2.0.7
Cython==0.29.14
databricks-cli==0.17.4
debugpy==1.6.6
decorator==5.1.1
docker==6.0.1
entrypoints==0.4
executing==1.2.0
Flask==2.2.3
fonttools==4.38.0
funcy==1.18
gensim==3.8.3
gitdb==4.0.10
GitPython==3.1.31
greenlet==2.0.2
htmlmin==0.1.12
idna==3.4
ImageHash==4.3.1
imbalanced-learn==0.7.0
importlib-metadata==5.2.0
importlib-resources==5.12.0
ipykernel==6.21.2
ipython==8.10.0
ipywidgets==8.0.4
itsdangerous==2.1.2
jedi==0.18.2
Jinja2==3.1.2
joblib==1.2.0
jsonschema==4.17.3
jupyter-client==8.0.3
jupyter-core==5.2.0
jupyterlab-widgets==3.0.5
kiwisolver==1.4.4
kmodes==0.12.2
lightgbm==3.3.5
llvmlite==0.37.0
Mako==1.2.4
Markdown==3.4.1
markdown-it-py==2.1.0
MarkupSafe==2.1.2
matplotlib==3.7.0
matplotlib-inline==0.1.6
mdurl==0.1.2
mlflow==2.1.1
mlxtend==0.21.0
multimethod==1.9.1
murmurhash==1.0.9
nest-asyncio==1.5.6
networkx==3.0
nltk==3.8.1
numba==0.54.1
numexpr==2.8.4
numpy==1.20.3
oauthlib==3.2.2
packaging==22.0
pandas==1.5.3
pandas-profiling==3.6.6
parso==0.8.3
patsy==0.5.3
phik==0.12.3
pickleshare==0.7.5
Pillow==9.4.0
pkgutil-resolve-name==1.3.10
plac==1.1.3
platformdirs==3.0.0
plotly==5.13.0
preshed==3.0.8
prompt-toolkit==3.0.36
protobuf==3.20.3
psutil==5.9.4
pure-eval==0.2.2
pyarrow==10.0.1
pycaret==2.3.10
pydantic==1.10.5
pydeck==0.8.0
Pygments==2.14.0
PyJWT==2.6.0
pyLDAvis==3.4.0
Pympler==1.0.1
pynndescent==0.5.8
pyod==1.0.7
pyparsing==3.0.9
pyrsistent==0.19.3
python-dateutil==2.8.2
pytz==2022.7.1
pytz-deprecation-shim==0.1.0.post0
PyWavelets==1.4.1
pywin32==305
PyYAML==5.4.1
pyzmq==25.0.0
querystring-parser==1.2.4
regex==2022.10.31
requests==2.28.2
rich==13.3.1
scikit-learn==0.23.2
scikit-plot==0.3.7
scipy==1.5.4
seaborn==0.12.2
semver==2.13.0
shap==0.41.0
six==1.16.0
slicer==0.0.7
smart-open==6.3.0
smmap==5.0.0
spacy==2.3.9
SQLAlchemy==1.4.46
sqlparse==0.4.3
srsly==1.0.6
stack-data==0.6.2
statsmodels==0.13.5
streamlit==1.18.1
tabulate==0.9.0
tangled-up-in-unicode==0.2.0
tenacity==8.2.1
textblob==0.17.1
thinc==7.4.6
threadpoolctl==3.1.0
toml==0.10.2
toolz==0.12.0
tornado==6.2
tqdm==4.64.1
traitlets==5.9.0
typeguard==2.13.3
typing-extensions==4.5.0
tzdata==2022.7
tzlocal==4.2
umap-learn==0.5.3
urllib3==1.26.14
validators==0.20.0
visions==0.7.5
waitress==2.1.2
wasabi==0.10.1
watchdog==2.2.1
wcwidth==0.2.6
websocket-client==1.5.1
Werkzeug==2.2.3
widgetsnbextension==4.0.5
wordcloud==1.8.2.2
ydata-profiling==4.0.0
yellowbrick==1.5
zipp==3.14.0
Those packages come from pycaret==2.3.10 and streamlit==1.18.1.
Important: whenever a package is required by both pycaret and streamlit, follow the pycaret version. For example, numpy is common to both; I use numpy==1.20.3 because that is the version pycaret requires.
Install the packages in requirements.txt with:
pip install -r requirements.txt
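After installing, a quick sanity check helps confirm that pip resolved to the pinned versions and did not upgrade a shared dependency. This is a stdlib-only sketch; the package names and versions are copied from the list above:

```python
# Verify a few of the pinned versions from requirements.txt.
# Standard library only (importlib.metadata is available on Python 3.8+).
from importlib.metadata import version, PackageNotFoundError

pins = {'numpy': '1.20.3', 'scikit-learn': '0.23.2', 'streamlit': '1.18.1'}
for name, expected in pins.items():
    try:
        installed = version(name)
        status = 'OK' if installed == expected else f'MISMATCH ({installed} installed)'
    except PackageNotFoundError:
        status = 'NOT INSTALLED'
    print(f'{name}=={expected}: {status}')
```

If any line reports MISMATCH, reinstall from requirements.txt before running the app.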
code
main.py
import streamlit as st
from pycaret.datasets import get_data
from pycaret.regression import *

if __name__ == '__main__':
    data = get_data('insurance')
    st.write('#### Data set')
    st.dataframe(data, height=200)

    is_do_regression = st.button('Do regression')
    if is_do_regression:
        # Set silent=True so setup() does not pause to ask for
        # confirmation of the inferred data types.
        s = setup(data, target='charges', silent=True)
        best = compare_models()  # get the best model by cross-validated score
        save_model(best, 'my_best_pipeline')

        st.write('#### Metrics')
        st.dataframe(pull(), height=200)  # pull() returns the comparison table

        # evaluate_model(best) analyzes the performance of a trained model
        # on the test set, but it is only available in a notebook.
        # evaluate_model(best)

        # Plots
        plot_model(best, plot='residuals', display_format='streamlit')
        plot_model(best, plot='feature', display_format='streamlit')
        plot_model(best, plot='error', display_format='streamlit')

        # Predict labels on the holdout set.
        pred_holdout = predict_model(best)
        st.write('#### Predictions from holdout set')
        st.dataframe(pred_holdout, height=200)

        # To predict labels on arbitrary data, pass it explicitly:
        # predictions = predict_model(best, data=data)

        # Test the saved model by reloading it and predicting on the data set.
        loaded_model = load_model('my_best_pipeline')
        predictions = predict_model(loaded_model, data=data)
        st.write('#### Predictions from data set')
        st.dataframe(predictions, height=200)
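The saved pipeline can also be reused outside the Streamlit app. Below is a hypothetical sketch: the column names (age, sex, bmi, children, smoker, region) are assumed to match the 'insurance' dataset, and 'my_best_pipeline' is the file saved by the app above.

```python
# Build a couple of new rows to score with the saved pipeline.
# Column names are assumed to match the 'insurance' dataset.
import pandas as pd

new_rows = pd.DataFrame([
    {'age': 30, 'sex': 'male', 'bmi': 27.5,
     'children': 1, 'smoker': 'no', 'region': 'southeast'},
    {'age': 45, 'sex': 'female', 'bmi': 22.1,
     'children': 2, 'smoker': 'yes', 'region': 'northwest'},
])
print(new_rows.shape)  # (2, 6)

# With pycaret installed, score the rows like this:
# from pycaret.regression import load_model, predict_model
# model = load_model('my_best_pipeline')
# preds = predict_model(model, data=new_rows)  # adds a 'Label' column in 2.x
```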
PyCaret natively supports Streamlit plots through plot_model's display_format='streamlit' argument.
command
streamlit run main.py
Output
The output appears after pressing the button.
PyCaret is a nice package: the plotting is convenient, and the metrics comparison across different models is a useful feature too.
I have not yet tried running it from the cloud.
References
https://pycaret.readthedocs.io/en/latest/index.html
https://www.datacamp.com/tutorial/guide-for-automating-ml-workflows-using-pycaret