I had not got around to entering Kaggle’s monthly competition because I have been busy studying other things, like statistics. I did, however, put my statistics studies aside to enter the Season 5, Episode 1 Playground competition before the cutoff date.
The objective of this competition is to predict the number of stickers sold in various stores in various countries over a period of time.
I wrote the script in Python in a Jupyter Notebook and saved it in my Kaggle account.
The first thing that I did after creating the Jupyter Notebook in Kaggle was to import the libraries that I would need to run it and make predictions on the number of stickers sold. The libraries that I imported are:
- Pandas to create dataframes and process data,
- Numpy to create numpy arrays and perform numerical computations,
- Os to interact with the operating system of the Kaggle environment, for example to find the input files,
- Sklearn to provide machine learning functionality,
- Pylab and scipy to draw Q-Q plots,
- Matplotlib to visualise the data, and
- Seaborn to create statistical visualisations of the data.
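An import cell along these lines would cover the libraries listed above. This is only a sketch, not my exact notebook cell: the `train_test_split` import is an assumed example of what might be pulled from sklearn, `/kaggle/input` is assumed to be the folder where Kaggle mounts the data, and the `num_sold` column holds made-up numbers rather than competition data. The Q-Q plot at the end is just a quick check that pylab and scipy work together.

```python
import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pylab
import seaborn as sns
from scipy import stats
# Assumed example: the specific sklearn modules depend on the model used.
from sklearn.model_selection import train_test_split

# List the input files attached to the notebook.
# "/kaggle/input" is an assumed location; os.walk yields nothing
# if the directory does not exist, so this is safe to run anywhere.
input_files = [
    os.path.join(dirname, filename)
    for dirname, _, filenames in os.walk("/kaggle/input")
    for filename in filenames
]
print(input_files)

# Synthetic stand-in data (NOT competition data), used only to
# check that the plotting libraries are working.
rng = np.random.default_rng(0)
df = pd.DataFrame({"num_sold": rng.normal(loc=100, scale=15, size=200)})

# Q-Q plot of the sample against a normal distribution, drawn with pylab.
# probplot also returns the fitted line and its correlation coefficient r.
(osm, osr), (slope, intercept, r) = stats.probplot(
    df["num_sold"], dist="norm", plot=pylab
)
```

In a notebook the figure appears when the cell finishes running; in a plain Python script a call to `pylab.show()` would be needed as well. Points that fall close to the straight line suggest the sample is roughly normally distributed.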