Use Python to calculate the residuals in the Leinhardt dataset
The Leinhardt dataset describes infant mortality in 105 countries worldwide.
In this blog post I fit a linear regression to the dataset and calculate the residuals that the regression produces.
Residuals are the differences between the observed values and the values predicted by the regression model. They represent the error or the deviation of the observed data from the model’s predictions.
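To make the definition above concrete, here is a minimal sketch in plain Python. It fits a straight line by ordinary least squares and then subtracts each predicted value from the observed one. The helper names `fit_line` and `compute_residuals` and the five toy data points are made up for illustration; they are not taken from the Leinhardt dataset or from the program described in this post.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def compute_residuals(x, y, intercept, slope):
    """Residual = observed value minus the value predicted by the line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical toy data, used only to illustrate the calculation
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
res = compute_residuals(x, y, intercept, slope)
print(res)
```

When the model includes an intercept, least-squares residuals sum to zero (up to floating-point rounding), which is a quick sanity check on the calculation.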
Residuals are important because:
- Residuals help in assessing how well the regression model fits the data. Smaller residuals indicate a better fit.
- Residuals are used to check the assumptions of linear regression, such as linearity, homoscedasticity, and normality of errors.
- Residual plots can reveal patterns that might indicate issues with the model, such as non-linearity or outliers.
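Two of the points above can be sketched numerically. The hypothetical helper below summarises the overall size of the residuals with the root mean squared error (smaller means a closer fit) and flags residuals that lie more than two standard deviations from their mean as possible outliers. The two-standard-deviation cutoff is a common rule of thumb that I am assuming here, not a fixed standard, and the residual values are invented for the example.

```python
import math


def residual_summary(residuals, threshold=2.0):
    """Return (rmse, flagged_indices) for a list of residuals."""
    n = len(residuals)
    # Root mean squared error: typical size of a residual
    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)
    mean_r = sum(residuals) / n
    # Sample standard deviation of the residuals
    sd = math.sqrt(sum((r - mean_r) ** 2 for r in residuals) / (n - 1))
    # Indices of residuals unusually far from the mean (assumed cutoff)
    flagged = [i for i, r in enumerate(residuals)
               if abs(r - mean_r) > threshold * sd]
    return rmse, flagged


# Hypothetical residuals; the value at index 5 is deliberately extreme
example = [0.3, -0.2, 0.1, -0.4, 0.2, 5.0, -0.1, 0.0]
rmse, flagged = residual_summary(example)
print(rmse, flagged)
```

A numeric summary like this complements a residual plot but does not replace it, since patterns such as curvature or a widening spread are easier to see in a plot.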
I wrote the program for this linear regression model in Python, using Google Colab. If you have not used Google Colab in a while, the platform now has Gemini built in, which assists in writing code and correcting errors.
The first thing that I did when I created the program was to import the libraries that I…