How I won my fifth bronze medal by entering a Kaggle competition
This afternoon, after six fillings and a CT scan that cost me a substantial sum of money, I was quite chuffed to receive an email from Kaggle telling me that I had won my fifth bronze medal, this time in their October 2021 tabular competition.
I had written the program to make predictions on this dataset over the course of a Friday night and Saturday morning. Because the dataset was so large, I had to reduce its million rows to a quarter of that size just to keep the program from crashing the system.
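The downsampling itself only takes a line or two in pandas. The sketch below is a simplified version of that step rather than my actual notebook code; the file path, the 0.25 fraction and the random seed are illustrative.

```python
import pandas as pd

# Read the competition's training file (the path here is illustrative).
train = pd.read_csv("../input/tabular-playground-series-oct-2021/train.csv")

# Keep roughly a quarter of the rows so the notebook stays within memory;
# fixing random_state keeps the subsample reproducible between runs.
train = train.sample(frac=0.25, random_state=42).reset_index(drop=True)
```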
I also used SelectKBest, a function in Python’s machine-learning library sklearn, to reduce the number of features from 287 to 60.
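As a rough sketch of that step, SelectKBest scores each feature against the target and keeps the k highest-scoring ones. The column names and the f_classif scoring function below are assumptions for illustration, not details taken from my notebook.

```python
from sklearn.feature_selection import SelectKBest, f_classif

# Split features and target (the column names are assumed for this sketch).
X = train.drop(columns=["id", "target"])
y = train["target"]

# Keep the 60 features with the strongest univariate score against the target.
selector = SelectKBest(score_func=f_classif, k=60)
X_selected = selector.fit_transform(X, y)

# The retained column names, so the same selection can be applied to the test set.
selected_columns = X.columns[selector.get_support()]
```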
I then used a PyTorch classifier to make predictions on the reduced dataset. I scored slightly higher with PyTorch than I did with sklearn’s HistGradientBoostingClassifier. It is always a good idea to make predictions with several different models and then select the one that attains the best score.
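The sketch below, which continues from the snippets above, gives the general shape of that comparison: a small feed-forward PyTorch classifier trained on the selected features, with sklearn’s HistGradientBoostingClassifier fitted on the same split as a baseline. The layer sizes, learning rate, epoch count and validation split are illustrative choices, not the exact settings from my notebook.

```python
import torch
import torch.nn as nn
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Hold out a validation set so the two models can be compared on the same data.
X_train, X_valid, y_train, y_valid = train_test_split(
    X_selected, y.values, test_size=0.2, random_state=42
)

class TabularNet(nn.Module):
    """A small feed-forward binary classifier for tabular data."""

    def __init__(self, n_features):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

X_train_t = torch.tensor(X_train, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.float32)

model = TabularNet(X_train_t.shape[1])
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Full-batch training loop, kept deliberately short for this sketch.
for epoch in range(10):
    optimizer.zero_grad()
    loss = criterion(model(X_train_t), y_train_t)
    loss.backward()
    optimizer.step()

# Baseline: sklearn's histogram-based gradient boosting on the same split.
hgb = HistGradientBoostingClassifier(random_state=42)
hgb.fit(X_train, y_train)
print("HistGradientBoosting validation accuracy:", hgb.score(X_valid, y_valid))
```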
And finally, I used Kaggle’s graphics processing unit, or GPU, to speed up the training of the model.
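In PyTorch that amounts to moving the model and tensors onto the CUDA device when one is available. The snippet continues from the sketch above and simply falls back to the CPU when no GPU is attached.

```python
# Use the Kaggle GPU when it is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = model.to(device)
X_train_t = X_train_t.to(device)
y_train_t = y_train_t.to(device)
```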
I am very pleased to have won five bronze medals in Kaggle competitions. My next big challenge now is to actually win a competition!
The code for this post can be found in its entirety on my Kaggle account: https://www.kaggle.com/tracyporter/oct-21-tabular-pytorch-selectkbest/settings