Make predictions on the probability of getting a loan

Crystal X
4 min read · Oct 6, 2024

Over the past couple of weeks I have been studying probability, so it was a pleasant surprise that this month’s Kaggle competition also concerned probabilities.

The problem statement for this month’s Kaggle competition was to predict the probability that someone would be given a loan. The dataset had both categorical and numerical columns, so the categorical values needed to be encoded as numbers before they could be passed to the model. The target had a very large class imbalance, so I used a tree-based ensemble model, ExtraTreesClassifier, in an attempt to address that issue.
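As a rough illustration of that approach, here is a minimal sketch on made-up data. The column names (`income`, `home_ownership`), the toy values, the choice of `OrdinalEncoder`, and the `class_weight="balanced"` setting are all my assumptions for the example, not necessarily what the competition notebook used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Toy data with hypothetical column names; this is NOT the competition dataset.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "home_ownership": rng.choice(["RENT", "OWN", "MORTGAGE"], n),
})

# Imbalanced binary target: 10% positives, 90% negatives.
y = np.zeros(n, dtype=int)
y[:20] = 1

# Encode the categorical column as integers; pass numeric columns through unchanged.
encode = ColumnTransformer(
    [("cat",
      OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
      ["home_ownership"])],
    remainder="passthrough",
)

# class_weight="balanced" is one possible way to account for class imbalance
# (an assumption here, not confirmed as the setting used in the notebook).
model = Pipeline([
    ("encode", encode),
    ("trees", ExtraTreesClassifier(n_estimators=100,
                                   class_weight="balanced",
                                   random_state=0)),
])
model.fit(X, y)

# Column 1 of predict_proba holds the predicted probability of the positive class.
proba = model.predict_proba(X)[:, 1]
```

The competition asks for a probability rather than a hard label, which is why the sketch ends with `predict_proba` instead of `predict`.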

I wrote the algorithm in Kaggle’s free Jupyter Notebook, which I saved to my account.

Once I had created the Jupyter Notebook, I imported the libraries needed to run it:

  1. NumPy to create arrays and carry out numerical computations,
  2. Pandas to create dataframes and process data,
  3. os to interact with the operating system,
  4. SciPy to perform statistical and scientific computations,
  5. scikit-learn (sklearn) to provide machine learning functionality,
  6. Matplotlib to visualise the data, and
  7. Seaborn to create statistical visualisations of the data.
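An import cell along these lines would cover the list above. The aliases are the common community conventions, and the `/kaggle/input` path is the directory where Kaggle notebooks usually mount competition data; both are assumptions here rather than a copy of the original notebook.

```python
import os

import numpy as np
import pandas as pd
import scipy
import sklearn
import matplotlib.pyplot as plt
import seaborn as sns

# Collect the paths of any input files. If the directory does not exist
# (for example, when running outside Kaggle), os.walk yields nothing and
# the list simply stays empty.
input_files = []
for dirname, _, filenames in os.walk("/kaggle/input"):
    for filename in filenames:
        input_files.append(os.path.join(dirname, filename))
```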

Crystal X

I have over five decades of experience in the world of work, including fast food, the military, business, non-profits, and the healthcare sector.