Use Python to create a randomised csv file and perform a hypothesis test on it
I have been studying hypothesis tests fervently for a couple of months now. One problem that I have encountered in my studies is that it is difficult to find the appropriate datasets to use. One solution to this conundrum is to make one’s own datasets. It is a relatively easy matter to create a randomised dataset in Python using only its inbuilt modules.
The csv file that I create below contains the randomised ages of male and female MBA students.
The csv file is created in Python using the following steps:
- Import Python’s inbuilt modules csv, random, and statistics.
- Define a function, generate_ages, that accepts the number of entries, the minimum age, and the maximum age, and uses a for loop to build and return a list containing that many random ages.
- Define the variable, num_entries, and set it to the value of 100.
- Define the variables ages_men and ages_women, which are created by running the function, generate_ages.
- Use Python’s inbuilt csv module to create the csv file and write the ages of the men and women onto each row of the file.
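The steps above can be sketched roughly as follows. This is a minimal illustration rather than the exact code behind the article: the age range of 22 to 40, the fixed random seed, the file name mba_ages.csv, the column headers, and the helper write_ages_csv are all assumptions made for the example.

```python
import csv
import random
import statistics


def generate_ages(num_entries, min_age, max_age):
    """Return a list of num_entries random integer ages between min_age and max_age inclusive."""
    ages = []
    for _ in range(num_entries):
        ages.append(random.randint(min_age, max_age))
    return ages


def write_ages_csv(path, ages_men, ages_women):
    """Write a header row, then one row per pair of ages (hypothetical helper)."""
    with open(path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["age_men", "age_women"])  # assumed column names
        for age_man, age_woman in zip(ages_men, ages_women):
            writer.writerow([age_man, age_woman])


random.seed(42)  # assumption: fixed seed so that reruns give the same file

num_entries = 100
ages_men = generate_ages(num_entries, 22, 40)    # assumed age range
ages_women = generate_ages(num_entries, 22, 40)  # assumed age range

write_ages_csv("mba_ages.csv", ages_men, ages_women)  # assumed file name

# Quick sanity check on the generated data using the statistics module.
print("Mean age (men):", statistics.mean(ages_men))
print("Mean age (women):", statistics.mean(ages_women))
```

Opening the file with newline="" follows the csv module's guidance and avoids blank lines between rows on Windows. Removing the random.seed call gives a different dataset on every run.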