Interview question: How can I get the number of null values present in each column of a dataframe?
Checking for null values is one of the first skills that anyone training for a data science role needs to learn.
Pandas, a widely used Python library for data processing, provides two methods for checking a dataframe for null values: isnull() and isna(). The two are aliases of each other and return the same result.
isnull()
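The isnull() method returns a boolean dataframe of the same shape as the original, with True wherever a value is missing (for example None or NaN).
Chaining sum() onto the result counts the True values in each column, which gives the number of null values per column:-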
num_null = data.isnull().sum()
num_null
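To see what this produces, here is a minimal sketch on a small made-up dataframe. The column names and values are assumptions chosen purely for illustration; only isnull() and sum() come from the text above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing name and two missing ages.
data = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "age": [31.0, np.nan, np.nan, 45.0],
    "city": ["Oslo", "Rome", "Lima", "Kyiv"],
})

# isnull() marks each missing cell as True; sum() then counts the
# True values column by column.
num_null = data.isnull().sum()
print(num_null)
```

The result is a Series indexed by column name, so here it reports one null for name, two for age, and none for city.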
isna()
The isna() method checks for missing values in exactly the same way; isnull() is simply an alias for it.
The code for counting the missing values in each column using isna() is:-
num_na = data.isna().sum()
num_na
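If you want to convince yourself that the two methods agree, a quick check along these lines should do it. The sample dataframe is again a hypothetical one made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing both None and NaN.
data = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "age": [31.0, np.nan, np.nan, 45.0],
})

num_na = data.isna().sum()
num_null = data.isnull().sum()

# equals() compares two pandas objects element by element.
same_masks = data.isna().equals(data.isnull())
same_counts = num_na.equals(num_null)
print(same_masks, same_counts)
```

Since both calls give the same answer, which one to use is largely a matter of taste.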
Summary
Pandas makes it easy to check for null values in a dataset, and the two methods above can be combined with sum() in other ways as well. For example, the code to find the total number of null values in a dataframe is:-
tot_na = data.isna().sum().sum()
tot_na
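The double sum() works in two steps: the first sum() gives a count per column, and the second adds those counts together. A rough sketch on made-up data follows; the dataframe and the total_cells comparison are illustrative additions, not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with three missing cells in total.
data = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "age": [31.0, np.nan, np.nan, 45.0],
    "city": ["Oslo", "Rome", "Lima", "Kyiv"],
})

per_column = data.isna().sum()   # nulls in each column
tot_na = per_column.sum()        # nulls in the whole dataframe

# data.size is the total number of cells (rows * columns), null or not.
total_cells = data.size
print(tot_na, "of", total_cells, "cells are null")
```

Comparing the null count against data.size is one simple way to judge how much of the dataset is missing.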
It is also possible to find out precisely which cells have null values using the following code:-
cell_na = data.isna()
cell_na
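On a large dataframe the boolean mask is hard to read by eye, so it can help to turn it into something more compact. The sketch below shows one possible approach, assuming a small made-up dataframe: it filters the rows that contain at least one null and lists the (row label, column name) pairs of the null cells using NumPy's where().

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with the default integer index.
data = pd.DataFrame({
    "name": ["Ann", "Bob", None, "Dee"],
    "age": [31.0, np.nan, np.nan, 45.0],
    "city": ["Oslo", "Rome", "Lima", "Kyiv"],
})

cell_na = data.isna()

# Keep only the rows that have a null in at least one column.
rows_with_na = data[cell_na.any(axis=1)]

# np.where returns the positions of the True cells, scanning row by row.
row_pos, col_pos = np.where(cell_na.to_numpy())
null_cells = [(data.index[r], data.columns[c]) for r, c in zip(row_pos, col_pos)]

print(rows_with_na)
print(null_cells)
```

Here null_cells holds one tuple per missing value, which makes it easy to look up or report the exact cells that need attention.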
So there you have it: with pandas, a data scientist can use either of two methods to find null values, combine them with sum() to count the nulls in each column or in the whole dataframe, or use them on their own to see exactly which cells are null.