Interview Question: Extract records where the sepal length is greater than 5 and the sepal width is greater than 3 in the Iris dataset

Crystal X
2 min read · Mar 23, 2023

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

Pandas has many functions that create series and dataframes and manipulate the data they contain. One of its most useful capabilities is filtering, which is among the most common operations in data cleaning.

In the interview question, the interviewee is asked to extract records where the sepal length is greater than 5 and the sepal width is greater than 3.

The Iris dataset has 150 rows and 5 columns. Four of the columns are features and the final column is the target.
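As a quick check of that shape, here is a minimal sketch that loads the dataset into a dataframe. It assumes scikit-learn is installed and uses its bundled copy of Iris; the original post does not say which loader was used, so treat this as one possible way to get the data.

```python
# Assumption: scikit-learn's bundled Iris copy stands in for the post's data source.
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects; .frame holds the features plus the target.
iris = load_iris(as_frame=True)
df = iris.frame

print(df.shape)             # (150, 5)
print(df.columns.tolist())  # four feature columns followed by "target"
```

With this loader the feature columns are named like `sepal length (cm)`, so any filter has to use those exact labels.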

To filter the dataframe as the interview question asks, combine the two conditions with the `&` operator. The question is really two conditions in one, so each condition must be wrapped in its own parentheses, because `&` binds more tightly than comparison operators such as `>`.
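The filter described above can be sketched as follows. To keep the example self-contained, it uses a small hand-typed sample of Iris-style rows rather than the full dataset, and the column names (`sepal_length`, `sepal_width`, and so on) are hypothetical; adjust them to match however your copy of the data is labelled.

```python
import pandas as pd

# Illustrative sample only; the real dataset has 150 rows.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 5.0, 5.4, 7.0, 6.4, 5.5, 6.3],
    "sepal_width":  [3.5, 3.0, 3.6, 3.9, 3.2, 3.2, 2.3, 3.3],
    "petal_length": [1.4, 1.4, 1.4, 1.7, 4.7, 4.5, 4.0, 6.0],
    "petal_width":  [0.2, 0.2, 0.2, 0.4, 1.4, 1.5, 1.3, 2.5],
    "species": ["setosa", "setosa", "setosa", "setosa",
                "versicolor", "versicolor", "versicolor", "virginica"],
})

# Each condition sits in its own parentheses; & combines them element-wise.
mask = (df["sepal_length"] > 5) & (df["sepal_width"] > 3)

# Boolean indexing keeps only the rows where both conditions hold.
filtered = df[mask]

print(filtered.shape)  # (5, 5) for this sample
```

Dropping the parentheses would make Python evaluate `5 & df["sepal_width"]` first, which typically raises an error rather than producing the intended filter.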

The result of the filter is a new dataframe with 47 rows and 5 columns.

Written by Crystal X

I have over five decades experience in the world of work, being in fast food, the military, business, non-profits, and the healthcare sector.
