Use Python to create your own csv file for hypothesis testing

Crystal X
3 min read · Feb 9, 2025

I have been studying hypothesis testing for several weeks now, and one thing that I have had difficulty with is finding suitable datasets to practice on.

I have therefore decided to use Python to create my own randomised datasets for hypothesis testing.

I have devised two ways to create csv files using Python.

The first method is to use Python and two of its very popular libraries, pandas and numpy. Pandas is used to create the dataframe and manipulate the data, and numpy is used to create the randomised data.

The process for creating the csv file using pandas and numpy is as follows:

  1. Import the pandas and numpy libraries.
  2. Set the random seed in numpy to 0 or any other desired number.
  3. Define the variables that will form the columns of the dataset, being the data, company a, company b, and the difference between company a and company b.
  4. Use pandas to create the dataframe out of the variables that have previously been defined.
  5. Convert the dataframe to a csv file.
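
The five steps above can be sketched roughly as follows. This is only a minimal illustration and not necessarily the exact code used for this post: the sample size, the normal distributions and their parameters, the column names, the file name `hypothesis_data.csv`, and the choice to make the first column a run of daily dates are all assumptions made for the example.

```python
# Step 1: import the pandas and numpy libraries.
import numpy as np
import pandas as pd

# Step 2: set the random seed so the same "random" data is produced on every run.
np.random.seed(0)

n_rows = 30  # assumed sample size

# Step 3: define the variables that will become the columns.
# The date range and the distribution parameters are illustrative assumptions.
dates = pd.date_range("2025-01-01", periods=n_rows, freq="D")
company_a = np.random.normal(loc=100, scale=10, size=n_rows).round(2)
company_b = np.random.normal(loc=102, scale=10, size=n_rows).round(2)
difference = (company_a - company_b).round(2)

# Step 4: build the dataframe from the variables defined above.
df = pd.DataFrame(
    {
        "date": dates,
        "company_a": company_a,
        "company_b": company_b,
        "difference": difference,
    }
)

# Step 5: convert the dataframe to a csv file (index=False omits the row numbers).
df.to_csv("hypothesis_data.csv", index=False)
```

Because the seed is fixed, rerunning the script should reproduce the same file, which is handy when checking a hypothesis test by hand. Changing the seed or the `loc` and `scale` values gives a fresh dataset to practise on.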


Written by Crystal X

I have over five decades experience in the world of work, being in fast food, the military, business, non-profits, and the healthcare sector.
