Research paper review: Why do tree-based models still outperform deep learning on tabular data?

Tracyrenee
4 min read · Jun 13, 2024

In a recent post I performed a code review on a Kaggle playground competition using sklearn’s ExtraTreesClassifier. That post can be found here: https://medium.com/@tracyrenee61/sometimes-sklearn-outperforms-tensorflow-when-making-predictions-on-tabular-data-7fa997f662dc

It is worth noting, however, that before I used the ExtraTreesClassifier to make predictions, I had tried a neural network, which only achieved 35% accuracy. I wondered why the ExtraTreesClassifier outperformed the neural network, so I decided to investigate.
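For readers who want to try this kind of comparison themselves, here is a minimal sketch. It is not the code from my earlier post: it assumes a synthetic dataset from sklearn's make_classification rather than the Kaggle data, and it uses sklearn's MLPClassifier as a stand-in for a TensorFlow neural network. The function name compare_models and all of the hyperparameters are illustrative choices only, so the scores it prints say nothing about the 35% figure above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(random_state=0):
    """Fit a tree ensemble and a small neural network on the same split."""
    # Assumption: synthetic tabular data stands in for the competition data.
    X, y = make_classification(
        n_samples=2000,
        n_features=20,
        n_informative=8,
        n_classes=3,
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    models = {
        # Tree ensembles need no feature scaling.
        "extra_trees": ExtraTreesClassifier(
            n_estimators=200, random_state=random_state
        ),
        # Neural networks are sensitive to feature scale, so standardise first.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(64, 32),
                max_iter=500,
                random_state=random_state,
            ),
        ),
    }

    # Test-set accuracy for each model.
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


scores = compare_models()
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Which model comes out ahead will depend on the data and the settings, so it is worth running the comparison on your own dataset before drawing any conclusions.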

When I Googled my question, I came across a research paper entitled “Why do tree-based models still outperform deep learning on tabular data?” The paper attempts to answer my question, so I decided to read it to see if I could gain any extra insight into the dynamics of machine learning and deep learning.

Having read the paper, I have reviewed the topics it covers below:

Deep learning’s superiority on tabular data is not clear. The results show that tree-based models remain state of the art on medium-sized data, even without accounting for their superior speed. The authors of the research have…
