I am a little late entering Kaggle’s December 2024 playground competition because I was tied up studying Bayesian statistics. I did, however, manage to finish the ten-hour video I was working through, and I entered the competition last night.
I have decided to solve this linear regression problem using statsmodels, a Python library for statistical modelling.
I wrote the program in Kaggle’s free online Jupyter Notebook, and it is stored in my Kaggle account.
The first thing I did was import the libraries the program needs:
- NumPy to make numerical calculations,
- pandas to create dataframes and process the data,
- os to interact with the operating system, for example to locate the input files,
- SciPy to carry out scientific calculations on the data,
- scikit-learn (imported as sklearn) to provide machine learning functionality,
- statsmodels to provide the regression model, and
- Matplotlib and seaborn to visualise the data, with seaborn geared towards statistical plots.