Use sklearn’s Extra Trees to predict used car prices

Crystal X
5 min read · Sep 3, 2024

I really do enjoy working on Kaggle competitions because they help me improve my machine learning skills. The competition that I am working on in this blog post is Season 4, Episode 9, and the task is to predict the prices of used cars.

To complete the competition, I tried a few models, including sklearn’s linear regression, TensorFlow’s linear regression, and sklearn’s extra trees. I finally settled on sklearn’s extra trees with feature selection.

I have written the program in Python, using Kaggle’s Jupyter Notebook, and saved it to my Kaggle account.
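To make the final choice more concrete, here is a minimal sketch of extra trees combined with feature selection. It is not the code from my notebook: the synthetic data from `make_regression`, the `SelectFromModel` selector with a median threshold, the hyperparameters, and the RMSE check are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the competition data (assumption: the real
# notebook would load and preprocess the Kaggle CSV files instead).
X, y = make_regression(
    n_samples=500, n_features=20, n_informative=5, noise=10.0, random_state=0
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: an extra trees model ranks the features, and SelectFromModel keeps
# those whose importance is at least the median importance.
# Step 2: a second extra trees model is fitted on the selected features only.
model = make_pipeline(
    SelectFromModel(
        ExtraTreesRegressor(n_estimators=100, random_state=0),
        threshold="median",
    ),
    ExtraTreesRegressor(n_estimators=200, random_state=0),
)
model.fit(X_train, y_train)

pred = model.predict(X_valid)
rmse = np.sqrt(mean_squared_error(y_valid, pred))
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())

print(f"Features kept: {n_selected} of {X.shape[1]}")
print(f"Validation RMSE: {rmse:.2f}")
```

Wrapping both steps in one pipeline means the selector is fitted only on the training split, so the validation score is not influenced by features chosen with knowledge of the validation rows.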

When I created the Jupyter Notebook, the first thing that I did was to import the libraries that I would need to run it, namely:

  1. NumPy to create numpy arrays and perform numerical computations,
  2. Pandas to create series and dataframes, and to process data,
  3. Os to walk through the file system and retrieve the paths of all the files in the competition,
  4. SciPy to perform scientific calculations,
  5. Sklearn to provide machine learning functionality,
  6. Matplotlib to visualise the data, and
  7. Seaborn to create statistical visualisations of the data.
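The list above can be sketched as a first notebook cell. The import aliases are the conventional ones, and `list_competition_files` is a hypothetical helper name; the `/kaggle/input` default is an assumption about where a Kaggle notebook mounts the competition data.

```python
import os

import numpy as np
import pandas as pd
import scipy
import sklearn
import matplotlib.pyplot as plt
import seaborn as sns


def list_competition_files(root="/kaggle/input"):
    """Return the path of every file found under `root` (hypothetical helper)."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirname, filename))
    return sorted(paths)


# On a machine without the assumed directory, os.walk yields nothing,
# so this loop simply prints no paths.
for path in list_competition_files():
    print(path)
```

Printing the paths first is a quick way to confirm the exact file names before passing them to `pd.read_csv`.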
