🚗 Used Car Picker Streamlit App!

Over the last week, I had the chance to work on my first regression problem.

I worked with a used-car dataset and wanted to predict the profit/loss from reselling one of those cars after a given amount of usage.

I built a Streamlit app to interact with the data and results (which also helped me uncover and fix a major issue with my data processing pipeline… details in the comments). This is also my first ML-powered app.

You can find the app here:

Coming to the issue I faced…

My approach was to train the model on the scaled original dataset. For predictions, I added the years of usage to the age column and the distance driven to the miles column, and then predicted on this modified dataset. The output is the resale value, and the difference between the original price and the resale value is the profit/loss.
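Here is a rough sketch of that what-if step in plain Python. The field names (age, miles, price), the helper names, and especially toy_model are made up for illustration; toy_model is only a stand-in for the trained regressor, not the model from the app. Profit/loss is taken here as resale value minus original price, so a negative number means a loss.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Car:
    age: float    # years since manufacture (hypothetical field name)
    miles: float  # odometer reading (hypothetical field name)
    price: float  # original purchase price (hypothetical field name)


def after_usage(car, years, distance):
    """Return a copy of the car with the planned usage added on."""
    return replace(car, age=car.age + years, miles=car.miles + distance)


def profit_or_loss(car, years, distance, predict_price):
    """Predicted resale value minus original price (negative means a loss)."""
    used = after_usage(car, years, distance)
    return predict_price(used) - car.price


def toy_model(car):
    # Stand-in for a trained regressor: simple linear depreciation.
    return 30_000 - 1_500 * car.age - car.miles / 20


car = Car(age=3, miles=40_000, price=24_000)
print(profit_or_loss(car, years=2, distance=20_000, predict_price=toy_model))
```

In a real pipeline, predict_price would wrap whatever preprocessing and model were fitted on the training data.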

However, I noticed that changing these values didn’t affect the model output at all. After a considerable time checking my code for any obvious variable/dataframe misuse, I figured out it was because of the scaling. Since the modified dataset is also scaled before prediction, adding a constant to a column shifts that column’s mean by the same constant, so the constant cancels out completely and the scaled values are unchanged.
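The cancellation is easy to reproduce with a few numbers. This is a minimal sketch that assumes z-score standardization (subtract the mean, divide by the standard deviation) and assumes the scaling statistics were recomputed on the modified column; the post doesn't say which scaler was used, and the ages are made up. For contrast, it also shows that the shift survives if the statistics from the original column are reused.

```python
import math
import statistics


def standardize(values):
    """Z-score a column using the mean and std of the same values passed in."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def standardize_with(values, mean, std):
    """Z-score a column using statistics computed elsewhere."""
    return [(v - mean) / std for v in values]


ages = [1.0, 3.0, 4.0, 8.0]        # made-up ages in years
older = [a + 2.0 for a in ages]    # every car used for 2 more years

# Re-fitting on the shifted column: the mean moves by 2 as well, so it cancels.
refit_same = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(standardize(ages), standardize(older))
)

# Reusing the original statistics: each scaled value moves by 2 / std.
train_mean = statistics.fmean(ages)
train_std = statistics.pstdev(ages)
reused = standardize_with(older, train_mean, train_std)
shift_kept = all(
    math.isclose(r - o, 2.0 / train_std)
    for r, o in zip(reused, standardize(ages))
)

print(refit_same, shift_kept)
```

If the scaled columns are identical before and after the change, the model sees the same inputs and has no way to produce a different prediction.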

The only fix I could think of was to remove the scaler and use a model that doesn’t need scaled inputs. I ended up using RandomForestRegressor.

Is there anything I could have done differently?