Use Python to create a randomised csv file and perform a hypothesis test on it

Crystal X
3 min read · Feb 10, 2025

I have been studying hypothesis tests fervently for a couple of months now. One problem that I have encountered in my studies is that it is difficult to find the appropriate datasets to use. One solution to this conundrum is to make one’s own datasets. It is a relatively easy matter to create a randomised dataset in Python using only its inbuilt modules.

The csv file that I create below contains randomised ages of male and female MBA students.

The csv file is created in Python using the following steps:

  1. Import Python’s inbuilt modules, namely csv, random, and statistics.
  2. Define a function, generate_ages, that accepts the number of entries, the minimum age, and the maximum age, and uses a for loop to return a list containing that many random ages.
  3. Define the variable, num_entries, and set it to the value of 100.
  4. Define the variables ages_men and ages_women, which are created by running the function, generate_ages.
  5. Use Python’s inbuilt csv module to create the csv file and write the ages of the men and women onto each row of the file.
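
The five steps above can be sketched roughly as follows. This is a minimal illustration and not necessarily the exact code behind the article: the age range of 22 to 45, the file name mba_ages.csv, the column headers, and the fixed random seed are all assumptions made for the example.

```python
import csv
import random
import statistics


def generate_ages(num_entries, min_age, max_age):
    """Return a list of num_entries random whole-number ages.

    Both min_age and max_age are included in the possible values.
    """
    ages = []
    for _ in range(num_entries):
        ages.append(random.randint(min_age, max_age))
    return ages


# Assumption: a fixed seed so the "random" file is the same on every run.
random.seed(42)

num_entries = 100

# Assumption: MBA students aged 22 to 45; the article does not state a range.
ages_men = generate_ages(num_entries, 22, 45)
ages_women = generate_ages(num_entries, 22, 45)

# Assumption: hypothetical file name and column headers.
with open("mba_ages.csv", "w", newline="") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(["age_men", "age_women"])
    for age_man, age_woman in zip(ages_men, ages_women):
        writer.writerow([age_man, age_woman])

# A quick sanity check of the two samples using the statistics module.
print("Mean age, men:  ", statistics.mean(ages_men))
print("Mean age, women:", statistics.mean(ages_women))
```

Each row of the resulting file holds one man's age and one woman's age, so the two columns can later be read back in as the two samples for a hypothesis test.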

Written by Crystal X

I have over five decades of experience in the world of work, including fast food, the military, business, non-profits, and the healthcare sector.