Use seaborn to create a barplot of feature importances in sklearn’s Random Forest model

Crystal X
3 min read · Jun 13, 2023

Although I have been studying TensorFlow and use it for most of my machine learning work, there are cases where sklearn outperforms the deep learning library. For instance, I recently worked on a census dataset and found that sklearn’s Random Forest Classifier outperformed TensorFlow at predicting salaries. The dataset in question can be found here: https://www.kaggle.com/competitions/minim-al-census-income/leaderboard

In that particular competition, I began by using TensorFlow’s sequential model to make predictions, and I also tried sklearn’s Random Forest Classifier. To my surprise, Random Forest made significantly better predictions than TensorFlow. The lesson learned, therefore, is to try several different models on a dataset and select the one that gives the highest accuracy.

One thing that sklearn can do that I have not observed in TensorFlow is report the feature importances of a fitted model. Feature importance in machine learning refers to a technique for estimating the relevance or contribution of each feature in a predictive model. It helps us understand which features have the most significant impact on the model’s predictions.
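As a minimal sketch of the idea, the example below fits a Random Forest and plots its `feature_importances_` attribute (sklearn's impurity-based importances) as a seaborn barplot. It is not the code from the competition: the data is a synthetic stand-in generated with `make_classification`, and the names `feature_0` … `feature_7` are hypothetical placeholders for the census columns.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the census data (assumption: the real dataset
# would be loaded from the Kaggle CSV files instead).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
# Hypothetical column names, used only to label the bars.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Fit the forest; feature_importances_ is only available after fit().
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Pair each importance with its feature name and sort, largest first.
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)

# Horizontal barplot: one bar per feature, longest bar at the top.
fig, ax = plt.subplots(figsize=(8, 4))
sns.barplot(x=importances.values, y=list(importances.index), ax=ax)
ax.set_xlabel("Importance")
ax.set_ylabel("Feature")
ax.set_title("Random Forest feature importances")
fig.tight_layout()
```

In a notebook the figure is displayed automatically; in a script, call `plt.show()` or `fig.savefig(...)` afterwards. The importances sum to 1, so each bar can be read as that feature's share of the total.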
