{"id":5560,"date":"2021-05-16T10:45:40","date_gmt":"2021-05-16T05:15:40","guid":{"rendered":"https:\/\/python-programs.com\/?p=5560"},"modified":"2021-11-22T18:40:51","modified_gmt":"2021-11-22T13:10:51","slug":"python-count-nan-and-missing-values-in-dataframe-using-pandas","status":"publish","type":"post","link":"https:\/\/python-programs.com\/python-count-nan-and-missing-values-in-dataframe-using-pandas\/","title":{"rendered":"Python: Count Nan and Missing Values in Dataframe Using Pandas"},"content":{"rendered":"

Methods to count NaN and missing values in dataframes using pandas<\/h3>\n

In this article, we will discuss null values in dataframes and count them per column, per row, and in total. Let's start with what NaN or missing values are.<\/p>\n

NaN or Missing values<\/h3>\n

NaN stands for Not a Number<\/code>. pandas uses it to represent missing values in a dataframe. Let's see this with an example.<\/p>\n

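import numpy as np\r\nimport pandas as pd\r\n# Marks holds two missing entries, written as np.nan\r\nstudents = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],\r\n            'Marks': [90, np.nan, 87, np.nan, 19]}\r\ndf = pd.DataFrame(students, columns=['students', 'Marks'])\r\nprint(df)<\/pre>\n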

Output<\/p>\n

   students  Marks\r\n0       Raj   90.0\r\n1     Rahul    NaN\r\n2    Mayank   87.0\r\n3      Ajay    NaN\r\n4      Amar   19.0<\/pre>\n

Here we see NaN entries in the Marks column; they represent the missing values.<\/p>\n

Reason to count Missing values or NaN values in Dataframe<\/h3>\n

One of the main reasons to count missing values is that they affect the accuracy of any prediction or result computed from the dataframe: the more values are missing, the more strongly the result is affected. Counting them tells us how to proceed. If the count is high, we can drop the affected rows or columns; otherwise we can leave them in the dataframe as they are.<\/p>\n
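
As a rough illustration of that decision, the sketch below measures the share of rows that contain at least one missing value and drops those rows only when the share exceeds a cut-off. The `THRESHOLD` value is a hypothetical choice made for this example, not a recommendation from pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
                   'Marks': [90, np.nan, 87, np.nan, 19]})

# Share of rows with at least one missing value (2 of 5 rows here)
missing_share = df.isnull().any(axis=1).mean()

# Hypothetical cut-off, chosen only for illustration
THRESHOLD = 0.3

# Drop incomplete rows when many are affected, otherwise keep df unchanged
cleaned = df.dropna() if missing_share > THRESHOLD else df
print(missing_share)
print(cleaned)
```

Whether dropping is appropriate depends on the data and the task, so treat the cut-off as something to tune.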

Method to count NaN or missing values<\/h3>\n

To count NaN or missing values, we first use the isnull() function. It returns a new dataframe of the same shape in which every NaN value is replaced with True and every non-NaN value with False, which lets us count the missing values. Let's see this with an example.<\/p>\n

import numpy as np\r\nimport pandas as pd\r\nstudents = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],\r\n            'Marks': [90, np.nan, 87, np.nan, 19]}\r\ndf = pd.DataFrame(students, columns=['students', 'Marks'])\r\nprint(df.isnull())<\/pre>\n

Output<\/p>\n

   students  Marks\r\n0     False  False\r\n1     False   True\r\n2     False  False\r\n3     False   True\r\n4     False  False<\/pre>\n

We get a dataframe of True and False values like this. From it we can easily count the NaN or missing values in the dataframe.<\/p>\n
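
Two details about isnull() are worth a quick check: it also flags None, and isna() is an alias that returns the same mask. The sketch below uses a small made-up dataframe (not the one from the examples above) to show both.

```python
import numpy as np
import pandas as pd

# Illustrative data: one np.nan and one None in the Marks column
df = pd.DataFrame({'students': ['Raj', 'Rahul', 'Mayank'],
                   'Marks': [90, np.nan, None]})

mask = df.isnull()              # boolean dataframe, same shape as df
same = mask.equals(df.isna())   # isna() gives an identical result

print(mask)
print(same)
```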

Count NaN or missing values in columns<\/h3>\n

With the help of the .isnull().sum() method chain, we can easily count the NaN or missing values in each column. isnull() converts NaN values to True and non-NaN values to False, and sum() then adds up the True values (each counts as 1) in the respective columns. Let's see this with an example.<\/p>\n

import numpy as np\r\nimport pandas as pd\r\nstudents = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],\r\n            'Marks': [90, np.nan, 87, np.nan, 19]}\r\ndf = pd.DataFrame(students, columns=['students', 'Marks'])\r\nprint(df.isnull().sum())<\/pre>\n

Output<\/p>\n

students    0\r\nMarks       2\r\ndtype: int64<\/pre>\n

As the dataframe itself shows, there are no NaN or missing values in the students column, but there are 2 in the Marks column.<\/p>\n
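
The per-column counts can be cross-checked with count(), which returns the number of non-missing values in each column. This is a minimal sketch using the same data as the example above; the variable names are chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
                   'Marks': [90, np.nan, 87, np.nan, 19]})

nan_per_column = df.isnull().sum()   # missing values per column
non_missing = df.count()             # non-missing values per column

# Rows minus non-missing values should equal the NaN count
via_count = len(df) - non_missing

print(nan_per_column)
print(via_count)
```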

Count NaN or missing values in Rows<\/h3>\n

For this, we can iterate over the rows with a for loop and call isnull().sum() on each row to count its NaN or missing values. Let's see this with an example.<\/p>\n

import numpy as np\r\nimport pandas as pd\r\nstudents = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],\r\n            'Marks': [90, np.nan, 87, np.nan, 19]}\r\ndf = pd.DataFrame(students, columns=['students', 'Marks'])\r\nfor i in range(len(df.index)):\r\n    print(\"Nan in row \", i, \" : \", df.iloc[i].isnull().sum())<\/pre>\n

Output<\/p>\n

Nan in row  0  :  0\r\nNan in row  1  :  1\r\nNan in row  2  :  0\r\nNan in row  3  :  1\r\nNan in row  4  :  0<\/pre>\n
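
The same per-row counts can be obtained without an explicit loop by summing the boolean mask across columns with axis=1. This is a sketch of that alternative on the same data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
                   'Marks': [90, np.nan, 87, np.nan, 19]})

# axis=1 sums across the columns, giving one count per row
nan_per_row = df.isnull().sum(axis=1)
print(nan_per_row)
```

The result is a Series indexed like the dataframe, so it can be used directly to filter rows, for example `df[nan_per_row > 0]`.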

Count total NaN or missing values in dataframe<\/h3>\n

In the above two examples, we saw how to count missing values or NaN per column and per row. Now let's see how to count the total number of missing values in the dataframe. For this, we simply use isnull().sum().sum(): the first sum() gives the count per column, and the second adds those counts together. Let's see this with the help of an example.<\/p>\n

import numpy as np\r\nimport pandas as pd\r\nstudents = {'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],\r\n            'Marks': [90, np.nan, 87, np.nan, 19]}\r\ndf = pd.DataFrame(students, columns=['students', 'Marks'])\r\nprint(\"Total NaN values: \", df.isnull().sum().sum())<\/pre>\n

Output<\/p>\n

Total NaN values:  2<\/pre>\n
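
For comparison, here is a sketch of two other ways to reach the same total: summing the boolean mask as a NumPy array, and subtracting the number of non-missing cells from the total number of cells. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'students': ['Raj', 'Rahul', 'Mayank', 'Ajay', 'Amar'],
                   'Marks': [90, np.nan, 87, np.nan, 19]})

total = int(df.isnull().sum().sum())             # method from the article
via_numpy = int(df.isnull().to_numpy().sum())    # sum over the whole mask
via_count = int(df.size - df.count().sum())      # all cells minus non-missing

print(total, via_numpy, via_count)
```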

So these are the methods to count NaN or missing values in dataframes.<\/p>\n

