In my last post I carried out an exploratory data analysis (EDA) of the Galton dataset in Python. That post can be read here: https://medium.com/@tracyrenee61/perform-an-exploratory-data-analysis-eda-on-the-galton-dataset-fb5ba5f900ec
In this post I am going to perform an EDA in Excel, but plot some of the graphs using the Python add-in in Excel. I do this because the matplotlib and seaborn libraries make it possible to create some really superb graphs in Python, of a kind that Excel's built-in charts cannot yet produce.
Because I have carried out the EDA using both Excel and the Python add-in, I had to use the first sheet of the workbook for importing the libraries. Python in Excel runs the worksheets in order, starting with the first, so the imports have to sit on the first sheet to be available to the Python cells on the sheets that follow.
I have put the main libraries in cell A1, and the libraries pertaining to statsmodels, Python’s statistical library, in cell A2.
I obtained the csv file from a site on the internet and copied and pasted its contents onto the second sheet of the workbook. I then used the Text to Columns function, found in the Data tab of the ribbon, to split the pasted text so that each field in the csv file has its own column in the worksheet. The delimiter in this file is a space, so this needs to be selected when converting the text to columns.
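For readers who would rather see the splitting step as code, here is a minimal sketch of what Text to Columns does with a space delimiter, using only Python's standard library. The column names and values in `raw_text` are invented for illustration and are not taken from the Galton file, and the helper name `split_space_delimited` is my own.

```python
import csv
import io

# Invented sample rows for illustration only; these are NOT values from the Galton file.
raw_text = """family father mother height
1 70.0 65.0 68.5
2 72.5 64.0 70.0
"""


def split_space_delimited(text):
    """Split space-delimited text into rows of cell strings (hypothetical helper)."""
    # delimiter=" " mirrors choosing "Space" in the Text to Columns wizard;
    # skipinitialspace=True swallows extra spaces that follow a delimiter.
    reader = csv.reader(io.StringIO(text), delimiter=" ", skipinitialspace=True)
    # Skip blank lines, which csv.reader returns as empty lists.
    return [row for row in reader if row]


rows = split_space_delimited(raw_text)
header, records = rows[0], rows[1:]

# Show each record as a mapping from column name to cell text.
for record in records:
    print(dict(zip(header, record)))
```

Every cell comes back as text, just as it does immediately after Text to Columns, so numeric columns would still need converting before any calculation. If pandas is available, `pandas.read_csv` with `sep=" "` should give a similar result in one call.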