Statistics interview question: What is the difference between a population and a sample?
Statistics is the branch of mathematics concerned with the collection, analysis, interpretation, and presentation of data. It is one of the foundational subjects of any data science curriculum or degree, so studying it is important for becoming a proficient data scientist.
One of the first things covered in statistics is how to distinguish between a population and a sample, a distinction that is fundamental to many aspects of data analysis and inference.
The population is the entire group of individuals or items that we are interested in, while the sample is a subset of the population selected for study. Ideally, a sample is a representative portion of the population from which we collect data in order to make inferences or draw conclusions about the entire population.
The size of the population is conventionally denoted by N, while the size of the sample is denoted by n.
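To make the distinction concrete, here is a minimal Python sketch. The population of 10,000 heights is made up purely for illustration, and the names `population`, `sample`, `N` and `n` are assumptions chosen to match the notation above. It draws a simple random sample and compares the sample mean with the population mean.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated heights (cm) of 10,000 people
population = [rng.gauss(170, 10) for _ in range(10_000)]
N = len(population)  # population size

# Simple random sample of 200 people, drawn without replacement
sample = rng.sample(population, 200)
n = len(sample)  # sample size

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"N = {N}, population mean = {population_mean:.2f}")
print(f"n = {n}, sample mean = {sample_mean:.2f}")
```

In practice the population mean is usually unknown, which is why the sample mean is used as an estimate of it. Here both can be computed only because the population is simulated.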
Samples are used when:
- The population is too large to collect data from every member.
- Data collected from the entire population would not be reliable.
- The population is hypothetical and unlimited in size. In this case, a test group is used.
A good sample should:
- Reflect the different variations present in the population and follow a well-defined selection criterion.
- Be unbiased with respect to the properties of the objects being selected.
- Consist of objects that are selected fairly and at random.
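The effect of fair, random selection can be sketched in Python. The income figures below are simulated, and the variable names are hypothetical. The sketch compares a "convenience" sample, which simply takes the first records of a sorted list, with a simple random sample of the same size.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000 simulated incomes, stored in ascending order
population = sorted(rng.uniform(20_000, 120_000) for _ in range(1_000))

n = 50
# Biased selection: the first n records are the n lowest incomes
convenience_sample = population[:n]
# Fair selection: every record has the same chance of being chosen
random_sample = rng.sample(population, n)

pop_mean = statistics.mean(population)
bias_convenience = statistics.mean(convenience_sample) - pop_mean
bias_random = statistics.mean(random_sample) - pop_mean

print(f"population mean:          {pop_mean:,.0f}")
print(f"convenience sample error: {bias_convenience:,.0f}")
print(f"random sample error:      {bias_random:,.0f}")
```

The convenience sample only sees the low end of the distribution, so its mean falls far below the population mean, while the random sample lands much closer to it.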