
Use statsmodels to predict insurance premiums

Crystal X
4 min read · Dec 24, 2024


I am a little late in entering Kaggle’s December 2024 playground competition because I was tied up studying Bayesian statistics. I did, however, manage to finish the ten-hour video I was working through, and then entered the competition last night.

I have decided to solve this linear regression problem using statsmodels, a Python library for statistical modelling.

I wrote the program in Kaggle’s free online Jupyter Notebook, and it is stored in my Kaggle account.

The first thing I did was import the libraries that the program needs:

  1. Numpy to make numerical calculations,
  2. Pandas to create dataframes and process the data,
  3. Os to interact with the computer’s operating system,
  4. Scipy to carry out scientific calculations on the data,
  5. Sklearn to provide machine learning functionality,
  6. Statsmodels to provide the regression model,
  7. Matplotlib to visualise the data, and
  8. Seaborn to create statistical visualisations of the data.


Written by Crystal X

I have over five decades experience in the world of work, being in fast food, the military, business, non-profits, and the healthcare sector.
