Is sklearn’s Random Forest Classifier better than one written from scratch?

Crystal X
3 min read · Aug 24, 2022

In a previous post I compared a made-from-scratch decision tree with sklearn’s off-the-shelf version and found that the made-from-scratch version outperformed sklearn’s on the iris dataset. That post on decision trees can be found here: https://medium.com/mlearning-ai/is-a-made-from-scratch-decision-tree-better-than-sklearns-off-the-shelf-version-4ebf3499835c

The random forest model is simply a collection of decision trees whose predictions are combined, so it helps to understand how a decision tree works before tackling this algorithm. In this post, therefore, I intend to test which model works best: the made-from-scratch version or sklearn’s off-the-shelf version.
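To make the "collection of decision trees" idea concrete, here is a minimal sketch, not the code used for the comparison in this post. It assumes each tree is trained on a bootstrap sample (rows drawn with replacement) and that the forest predicts by majority vote. The names `ToyRandomForest`, `bootstrap_sample`, `majority_vote` and `MajorityLabelTree` are hypothetical, and `MajorityLabelTree` is only a stand-in for a real decision tree so that the example runs on its own. Forests such as sklearn’s also consider a random subset of features at each split, which this sketch leaves out.

```python
import random
from collections import Counter


def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement, keeping rows and labels aligned."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


class MajorityLabelTree:
    """Hypothetical stand-in for a real decision tree: it makes no splits
    and always predicts the most common label it saw during fit."""

    def fit(self, X, y):
        self.label_ = majority_vote(y)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ToyRandomForest:
    """Assumed structure: n_trees trees, each fit on its own bootstrap sample."""

    def __init__(self, make_tree=MajorityLabelTree, n_trees=10, seed=0):
        self.make_tree = make_tree      # callable that returns a fresh tree
        self.n_trees = n_trees
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.trees = []

    def fit(self, X, y):
        self.trees = []
        for _ in range(self.n_trees):
            X_boot, y_boot = bootstrap_sample(X, y, self.rng)
            self.trees.append(self.make_tree().fit(X_boot, y_boot))
        return self

    def predict(self, X):
        # One list of predictions per tree ...
        per_tree = [tree.predict(X) for tree in self.trees]
        # ... regrouped sample by sample, then put to a vote.
        return [majority_vote(votes) for votes in zip(*per_tree)]


# Tiny made-up data set, for illustration only.
X = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9], [5.0, 3.4]]
y = ["setosa", "setosa", "versicolor", "setosa"]
forest = ToyRandomForest(n_trees=5).fit(X, y)
print(forest.predict(X))
```

Any object with `fit` and `predict` methods could be passed as `make_tree` in place of the stand-in, which is where a made-from-scratch decision tree would slot in.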

Both the decision tree and the random forest classifiers are built from several functions.

The functions that make up the decision tree classifier are:

  1. Fit
  2. Predict
  3. Grow tree
  4. Best criteria
  5. Information gain
  6. Split
  7. Traverse tree
  8. Most common label
  9. Accuracy
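A few of the helpers in the list above can be sketched in plain Python. This is an illustration under stated assumptions rather than the implementation from the earlier post: it assumes information gain is measured with entropy and that a split sends rows to the left or right by comparing one numeric feature with a threshold. The function names and signatures are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def split(X, feature, threshold):
    """Return the row indices that go left (<= threshold) and right (> threshold)."""
    left = [i for i, row in enumerate(X) if row[feature] <= threshold]
    right = [i for i, row in enumerate(X) if row[feature] > threshold]
    return left, right


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    if not left or not right:
        return 0.0  # a split that leaves one side empty gains nothing
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


def most_common_label(labels):
    """Label used for a leaf: the most frequent class among its samples."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up example: one feature, two classes, split at 2.0.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
left_idx, right_idx = split(X, feature=0, threshold=2.0)
gain = information_gain(y, [y[i] for i in left_idx], [y[i] for i in right_idx])
print(gain)
```

In a tree built from these pieces, the best-criteria step would try candidate features and thresholds and keep the pair with the highest gain, while grow tree would apply that choice recursively until a stopping rule turns a node into a leaf.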


Written by Crystal X

I have over five decades experience in the world of work, being in fast food, the military, business, non-profits, and the healthcare sector.
