The Empirical Rule (sometimes called the three sigma rule) is a rule in statistics that states that, for normally distributed data, almost all of the data will fall within three standard deviations either side of the mean.
More specifically:-
About 68% of the data will fall within one standard deviation of the mean,
about 95% of the data will fall within two standard deviations of the mean, and
about 99.7% of the data will fall within three standard deviations of the mean.
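As a quick check on where these percentages come from, the short sketch below computes the fraction of a normal distribution that lies within k standard deviations of the mean, using the identity P(|Z| < k) = erf(k / sqrt(2)). The function name `fraction_within` is made up for this illustration and is not part of the notebook.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    # For a standard normal variable Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```

The familiar 68% and 95% figures are rounded versions of these values.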
Rather than calculate the Empirical Rule by hand, I decided to use Python to do the job for me. In order to easily calculate the values, it is first important to know how to calculate the mean and the standard deviation of the data. I have previously used Python to write scripts to determine the mean and standard deviation, and these formulas can be found in the attached Google Colab notebook:- https://colab.research.google.com/drive/18UF3voMAoVpBx1G4wLX7WqFsUwhk2sGd?usp=sharing
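For readers who do not want to open the notebook, here is a minimal sketch of how the mean and standard deviation might be computed in plain Python. It is not copied from the notebook: the function names `calc_mean` and `calc_std` are hypothetical, and the sketch assumes the population standard deviation (dividing by n rather than n - 1).

```python
import math

def calc_mean(numbers):
    """Arithmetic mean of a list of numbers."""
    return sum(numbers) / len(numbers)

def calc_std(numbers):
    """Population standard deviation (assumption: divide by n, not n - 1)."""
    mean = calc_mean(numbers)
    # Average of the squared distances from the mean.
    variance = sum((x - mean) ** 2 for x in numbers) / len(numbers)
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample values
print(calc_mean(data))  # 5.0
print(calc_std(data))   # 2.0
```

If a sample standard deviation is wanted instead, the divisor becomes `len(numbers) - 1`. The standard library's `statistics.pstdev` and `statistics.stdev` cover both cases.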
With the code for the mean and standard deviation known, it is a relatively easy matter to calculate which data fall within one, two and three standard deviations of the mean. The pseudocode for computing the three sigma rule can be found here:-
- Define function, calc_empirical, which takes a list of numbers as input.