
Use Python to calculate class weights in machine learning

Crystal X
3 min read · Jul 3, 2024


There is no guarantee that a dataset will have a balanced target. Some datasets have highly imbalanced targets, and that imbalance must be accounted for if the model is to predict the minority class accurately.

A histogram of a highly imbalanced target that I was recently working on showed significantly more zeros than ones. With a target like this, a model tends to favour the majority class and will have difficulty making accurate predictions for the minority class.
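Before plotting anything, the degree of imbalance can be checked by simply counting the labels. The snippet below is a minimal sketch that uses a made-up target (the `target` list and its 95/5 split are assumptions for illustration, not the dataset described above):

```python
from collections import Counter

# Hypothetical binary target: 95 zeros and 5 ones (illustrative only)
target = [0] * 95 + [1] * 5

# Count how many times each class label occurs
counts = Counter(target)
total = len(target)

# Share of the dataset taken up by each class
shares = {label: count / total for label, count in counts.items()}

for label in sorted(counts):
    print(f"class {label}: {counts[label]} rows ({shares[label]:.0%})")
```

A very lopsided split like this one is a sign that class weights are worth considering.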

When working with an imbalanced target, it is important to compensate for the imbalance so that the model can make accurate predictions. Many estimators in Python’s sklearn library accept a class_weight parameter that can be set to ‘balanced’. When a model from outside the sklearn library is used, it may be necessary to compute the class weights manually and pass them to the model.
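As an illustration of the built-in option, here is a minimal sketch using sklearn's LogisticRegression, one of the estimators that accepts class_weight. The synthetic data (`X`, `y`) is an assumption made up for this example:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, imbalanced data (illustrative only): 180 zeros and 20 ones
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.array([0] * 180 + [1] * 20)
X[y == 1] += 2.0  # shift the minority class so it is learnable

# class_weight="balanced" weights each class inversely to its frequency
model = LogisticRegression(class_weight="balanced")
model.fit(X, y)

predictions = model.predict(X)
```

With this setting, errors on the rare class count for more during training, so the model is less inclined to predict the majority class for everything.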

The class weights can be computed with sklearn’s compute_class_weight function. The function takes its own arguments (class_weight, classes and y), and it is important to supply them correctly, otherwise the function will not work.
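The original code for this step is not reproduced here, so the following is a minimal sketch of one way to call compute_class_weight, assuming a made-up target array `y` with 90 zeros and 10 ones. Note that in recent versions of sklearn, `classes` and `y` must be passed as keyword arguments:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced target (illustrative only): 90 zeros and 10 ones
y = np.array([0] * 90 + [1] * 10)

# The unique class labels present in the target
classes = np.unique(y)

# "balanced" computes n_samples / (n_classes * count_of_each_class)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)

# Map each class label to its weight, the form most models expect
class_weight = dict(zip(classes.tolist(), weights.tolist()))
print(class_weight)
```

The resulting dictionary can then be passed to a model that accepts per-class weights. The exact parameter name differs between libraries, so check the documentation of the model being used.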



