How I used seaborn to analyse Kaggle’s Titanic competition dataset
Whilst studying statistics, I was surprised to learn that seaborn, Python's data visualisation library, is widely used for statistical graphics. I also noted that seaborn provides a number of example datasets, the Titanic dataset being one of them.
Because the Titanic dataset is so famous, I decided to have a go at analysing it with seaborn. I had used seaborn in the past, but on this occasion I decided to use code that I had found in a tutorial on the subject. After I had analysed the dataset and made predictions on it on the Kaggle website, I was very surprised to be awarded a bronze medal by Kaggle for my efforts.
The code below illustrates all of the analytical tools that I used to analyse the Titanic dataset using the seaborn library:-
The screenshot below shows what the layout of the train dataframe looked like when I loaded and read it into the program:-
Once I had loaded the three datasets used in the program (train, test and submission), I began by analysing the train dataset, because it is labelled whereas the test dataset is not.
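The difference between the labelled and unlabelled datasets can be checked directly with pandas. The snippet below is only a minimal sketch, not the code from my notebook: the two small CSV strings are illustrative stand-ins for Kaggle's train.csv and test.csv, and the names TRAIN_CSV, TEST_CSV and label_columns are my own assumptions. It compares the columns of the two dataframes to find the one that only the train dataset carries.

```python
import io

import pandas as pd

# Illustrative stand-ins for Kaggle's train.csv and test.csv (assumption:
# only a few columns and rows are shown, not the full files).
TRAIN_CSV = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,22
2,1,1,female,38
3,1,3,female,26
"""
TEST_CSV = """PassengerId,Pclass,Sex,Age
892,3,male,34.5
893,3,female,47
"""

# With the real files this would be pd.read_csv on the downloaded file paths.
train = pd.read_csv(io.StringIO(TRAIN_CSV))
test = pd.read_csv(io.StringIO(TEST_CSV))

# Columns present in train but missing from test are the labels.
label_columns = [col for col in train.columns if col not in test.columns]

print(train.head())    # first rows of the train dataframe
print(label_columns)   # columns that only the train dataset has
```

With the real Kaggle files, the same comparison should single out the Survived column, which is the target that the predictions have to fill in for the test dataset.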