House price prediction Streamlit app (with LGBM pipeline backend) returns an error on localhost

Hello everyone, I trained an LGBM model (pipeline-based) for house price prediction. When I use the saved model in Google Colab to predict the price of a new house, it works fine, but when I use it in Streamlit, it returns an error. The error seems to be related to creating the DataFrame of the new house's features. I have searched for and tried several solutions, but each one returns an error, while the same code works fine in Colab.

I suspect the problem is in the “make_column_transformer” part of the “PipelineModel.ipynb” file. (Because I was unsure whether the DataFrame created in “mainApp.py” was at fault, I tried several ways of building it, all of which can be seen in “mainApp.py”, but each one raised an error.)

The files are available on my Github:

PipelineModel.ipynb -> trains and saves the pipeline, which is based on LightGBM and make_column_transformer. The transformer applies OneHotEncoder to the Address column.

finalized_pipe_model.joblib -> the final model, used in the backend of the web app

newHousePrediction.ipynb -> uses the final model to predict the price of a completely new house

mainApp.py -> uses the final model as the backend of the Streamlit web app

When I run the mainApp.py file, the following error is returned:

AttributeError: 'str' object has no attribute 'transform'
Traceback:
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\streamlit\scriptrunner\script_runner.py", line 557, in _run_script
    exec(code, module.__dict__)
File "mainApp.py", line 43, in <module>
    predictionX = model.predict(X)
                  ^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\pipeline.py", line 602, in predict
    Xt = transform.transform(Xt)
         ^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\utils\_set_output.py", line 295, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\compose\_column_transformer.py", line 1014, in transform
    Xs = self._call_func_on_transformers(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\compose\_column_transformer.py", line 823, in _call_func_on_transformers
    return Parallel(n_jobs=self.n_jobs)(jobs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\utils\parallel.py", line 67, in __call__
    return super().__call__(iterable_with_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\joblib\parallel.py", line 1863, in __call__
    return output if self.return_generator else list(output)
                                                ^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\joblib\parallel.py", line 1792, in _get_sequential_output
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\utils\parallel.py", line 129, in __call__
    return self.function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\b_sad\anaconda3\envs\test\Lib\site-packages\sklearn\pipeline.py", line 1283, in _transform_one
    res = transformer.transform(X, **params.transform)
          ^^^^^^^^^^^^^^^^^^^^^

Hi @Beheshte_Sadeghi_Sab. You need to rename the req.txt file to requirements.txt. After changing the filename, just reboot the app. Hope it works.

Happy Streamlit-ing :balloon:

I tried your mainApp.py and it worked.

I used these lib versions.

scikit-learn==1.2.2
lightgbm==4.3.0
streamlit==1.31.1
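
Since the same saved model behaves differently in Colab and locally, one thing worth ruling out is a library version mismatch between the two environments. That this is the cause here is an assumption, not something the traceback proves. Below is a minimal sketch that compares the installed versions against the ones listed above; the `compare_versions` helper and the `EXPECTED` dict are names made up for this example.

```python
from importlib import metadata

# Versions reported as working in this thread (adjust to match your training environment).
EXPECTED = {"scikit-learn": "1.2.2", "lightgbm": "4.3.0", "streamlit": "1.31.1"}


def compare_versions(expected):
    """Return {package: (expected, installed)} for each mismatched or missing package."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    print(f"{name}: expected {wanted}, found {installed}")
```

Running this once in Colab and once in the local environment shows whether the two differ. If they do, installing the same versions locally as were used to train and save the pipeline is a reasonable next step.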

Thanks for your guidance, but it didn’t work.

Could you please tell me your versions of pandas and seaborn? After I applied the changes you suggested, a pandas-related error came up that had not appeared before.

pandas is installed automatically as a dependency of streamlit==1.31.1. The version I have is pandas==1.5.2.

I don’t use seaborn. I am only testing the model.

Prediction code.

import streamlit as st
import pandas as pd
import joblib

# Load the fitted pipeline (preprocessing + LGBM model).
model = joblib.load('finalized_pipe_model.joblib')

# Columns = ['Area', 'Room', 'Parking', 'Warehouse', 'Elevator', 'Address']
# The training data is read only to populate the Address dropdown.
df = pd.read_csv('housePrice.csv')

# Collect the features of the new house from the user.
Address = st.selectbox("Address", df.Address.unique())
Area = st.number_input("Area", 30, 20000000000)
Room = st.number_input("Number of rooms", 0, 5)
Parking = st.radio("Parking", ["Yes", "No"]) == "Yes"
Elevator = st.radio("Elevator", ["Yes", "No"]) == "Yes"
Warehouse = st.radio("Warehouse", ["Yes", "No"]) == "Yes"

if st.button('Predict'):
    # Build a one-row DataFrame with the same column names used in training.
    output = {
        'Area': [Area],
        'Room': [Room],
        'Parking': [Parking],
        'Warehouse': [Warehouse],
        'Elevator': [Elevator],
        'Address': [Address]
    }
    X = pd.DataFrame(output)
    predictionX = model.predict(X)
    st.success(f'The price of this house is {predictionX}!')

Sample output
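
To rule out the input DataFrame as the source of the error, it can help to check the feature names before calling `model.predict`. The sketch below is only an illustration: `build_row` and `EXPECTED_COLUMNS` are hypothetical names, the column list is taken from the comment in the prediction code, and the address value is a placeholder. The returned dict can be passed straight to `pd.DataFrame`.

```python
# Column names taken from the comment in the prediction code above.
EXPECTED_COLUMNS = ["Area", "Room", "Parking", "Warehouse", "Elevator", "Address"]


def build_row(values):
    """Return a one-row, column-ordered dict; raise if a column is missing or unexpected."""
    missing = [c for c in EXPECTED_COLUMNS if c not in values]
    extra = [c for c in values if c not in EXPECTED_COLUMNS]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    # Wrap each value in a list so the dict describes exactly one row.
    return {c: [values[c]] for c in EXPECTED_COLUMNS}


row = build_row({
    "Address": "Some Address",  # placeholder, not a value from the dataset
    "Area": 80,
    "Room": 2,
    "Parking": True,
    "Warehouse": True,
    "Elevator": False,
})
print(row)
```

If this check passes and `model.predict` still fails, the DataFrame is probably not the culprit, and the environment is the next place to look.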