{"id":4403,"date":"2023-10-25T11:42:25","date_gmt":"2023-10-25T06:12:25","guid":{"rendered":"https:\/\/python-programs.com\/?p=4403"},"modified":"2023-11-10T11:58:18","modified_gmt":"2023-11-10T06:28:18","slug":"pandas-get-unique-values-in-columns-of-a-dataframe-in-python","status":"publish","type":"post","link":"https:\/\/python-programs.com\/pandas-get-unique-values-in-columns-of-a-dataframe-in-python\/","title":{"rendered":"Pandas : Get unique values in columns of a Dataframe in Python"},"content":{"rendered":"

How to get unique values in the columns of a DataFrame in Python?<\/h2>\n

To find the unique values in a DataFrame, we can use:<\/p>\n

    \n
  1. Series.unique(<\/em><\/strong>)- <\/em><\/strong>Returns the unique values of the Series, typically as a NumPy array, in order of first appearance. NaN is included.<\/li>\n
  2. Series.nunique(<\/em><\/strong>dropna=True)- <\/em><\/strong>Returns the number of unique values, excluding NaN by default. DataFrame.nunique(axis=0, dropna=True) does the same for a whole DataFrame: with axis=0 (the default) it counts the unique values in each column, and with axis=1 it counts them in each row.<\/li>\n<\/ol>\n
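As a minimal sketch of the axis argument described above (the small DataFrame and its column names x and y are made up for illustration and are not part of the article's data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen only to make the axis behaviour visible
df = pd.DataFrame({'x': [1, 1, 2], 'y': [1, 3, np.nan]})

# axis=0 (the default): one count per column, NaN excluded
per_column = df.nunique()

# axis=1: one count per row, NaN excluded
per_row = df.nunique(axis=1)

print(per_column)
print(per_row)
```

Note that the axis argument exists only on DataFrame.nunique; a Series has a single axis, so Series.nunique takes just dropna.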

    To test these functions, let\u2019s use the following data:<\/p>\n

          Name   Age     City  Experience\r\na     jack  34.0   Sydney           5\r\nb     Riti  31.0    Delhi           7\r\nc     Aadi  16.0      NaN          11\r\nd    Mohit  31.0    Delhi           7\r\ne    Veena   NaN    Delhi           4\r\nf  Shaunak  35.0   Mumbai           5\r\ng    Shaun  35.0  Colombo          11<\/pre>\n

    Finding unique values in a single column:<\/h3>\n

    To get the unique values of a single column (here, Age), we call the unique()<\/code> method on that column.<\/p>\n

    CODE:-<\/p>\n

    # Program:\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Employee records: (Name, Age, City, Experience); np.nan marks a missing value\r\nemp = [('jack', 34, 'Sydney', 5),\r\n       ('Riti', 31, 'Delhi', 7),\r\n       ('Aadi', 16, np.nan, 11),\r\n       ('Mohit', 31, 'Delhi', 7),\r\n       ('Veena', np.nan, 'Delhi', 4),\r\n       ('Shaunak', 35, 'Mumbai', 5),\r\n       ('Shaun', 35, 'Colombo', 11)\r\n       ]\r\n# Create the DataFrame\r\nempObj = pd.DataFrame(emp, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])\r\n# empObj['Age'] returns the column 'Age' as a Series;\r\n# unique() returns its distinct values, NaN included\r\nuValues = empObj['Age'].unique()\r\nprint('The unique values in column \"Age\" are ')\r\nprint(uValues)<\/pre>\n
    Output :\r\nThe unique values in column \"Age\" are\r\n[34. 31. 16. nan 35.]<\/pre>\n

    Counting unique values in a single column:<\/h3>\n

    If we want the number of unique values rather than the values themselves, we can use the .nunique()<\/code> method.<\/p>\n

    CODE:-<\/p>\n

    # Program:\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Employee records: (Name, Age, City, Experience); np.nan marks a missing value\r\nemp = [('jack', 34, 'Sydney', 5),\r\n       ('Riti', 31, 'Delhi', 7),\r\n       ('Aadi', 16, np.nan, 11),\r\n       ('Mohit', 31, 'Delhi', 7),\r\n       ('Veena', np.nan, 'Delhi', 4),\r\n       ('Shaunak', 35, 'Mumbai', 5),\r\n       ('Shaun', 35, 'Colombo', 11)\r\n       ]\r\n# Create the DataFrame\r\nempObj = pd.DataFrame(emp, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])\r\n# Count the unique values in column 'Age' (NaN is excluded by default)\r\nuValues = empObj['Age'].nunique()\r\n# Double quotes outside so the single quotes around Age print literally\r\nprint(\"Number of unique values in 'Age' column :\")\r\nprint(uValues)<\/pre>\n
    Output :\r\nNumber of unique values in 'Age' column :\r\n4<\/pre>\n

    Including NaN while counting the unique values in a column:<\/h3>\n

    By default, .nunique()<\/code> does not count NaN. To include NaN in the count, pass dropna=False <\/em>as an argument.<\/p>\n

    CODE:-<\/p>\n

    # Program:\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Employee records: (Name, Age, City, Experience); np.nan marks a missing value\r\nemp = [('jack', 34, 'Sydney', 5),\r\n       ('Riti', 31, 'Delhi', 7),\r\n       ('Aadi', 16, np.nan, 11),\r\n       ('Mohit', 31, 'Delhi', 7),\r\n       ('Veena', np.nan, 'Delhi', 4),\r\n       ('Shaunak', 35, 'Mumbai', 5),\r\n       ('Shaun', 35, 'Colombo', 11)\r\n       ]\r\n# Create the DataFrame\r\nempObj = pd.DataFrame(emp, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])\r\n# Count the unique values in column 'Age', treating NaN as a value\r\nuValues = empObj['Age'].nunique(dropna=False)\r\n# Double quotes outside so the single quotes around Age print literally\r\nprint(\"Number of unique values in 'Age' column including NaN:\")\r\nprint(uValues)<\/pre>\n
    Output :\r\nNumber of unique values in 'Age' column including NaN:\r\n5<\/pre>\n

    Counting unique values in each column of the DataFrame:<\/h3>\n

    To count the number of unique values in each column, call nunique() on the DataFrame itself rather than on a single column.<\/p>\n

    CODE:-<\/p>\n

    # Program:\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Employee records: (Name, Age, City, Experience); np.nan marks a missing value\r\nemp = [('jack', 34, 'Sydney', 5),\r\n       ('Riti', 31, 'Delhi', 7),\r\n       ('Aadi', 16, np.nan, 11),\r\n       ('Mohit', 31, 'Delhi', 7),\r\n       ('Veena', np.nan, 'Delhi', 4),\r\n       ('Shaunak', 35, 'Mumbai', 5),\r\n       ('Shaun', 35, 'Colombo', 11)\r\n       ]\r\n# Create the DataFrame\r\nempObj = pd.DataFrame(emp, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])\r\n# Count the unique values in each column (returns a Series indexed by column name)\r\nuValues = empObj.nunique()\r\nprint('In each column the number of unique values are')\r\nprint(uValues)<\/pre>\n
    Output :\r\nIn each column the number of unique values are\r\nName\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 7\r\nAge\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 4\r\nCity\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 4\r\nExperience\u00a0\u00a0\u00a0 4\r\ndtype: int64<\/pre>\n

    To include NaN in these counts, pass dropna=False to the function.<\/p>\n
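A minimal sketch of that call, rebuilding the same sample data so the snippet runs on its own (the variable name counts_with_nan is chosen here for illustration):

```python
import numpy as np
import pandas as pd

# Same employee records as in the programs above
emp = [('jack', 34, 'Sydney', 5),
       ('Riti', 31, 'Delhi', 7),
       ('Aadi', 16, np.nan, 11),
       ('Mohit', 31, 'Delhi', 7),
       ('Veena', np.nan, 'Delhi', 4),
       ('Shaunak', 35, 'Mumbai', 5),
       ('Shaun', 35, 'Colombo', 11)]
empObj = pd.DataFrame(emp, columns=['Name', 'Age', 'City', 'Experience'],
                      index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# dropna=False makes NaN count as one more distinct value in each column
counts_with_nan = empObj.nunique(dropna=False)
print(counts_with_nan)
```

Only the Age and City columns contain NaN, so only their counts change (from 4 to 5); Name and Experience stay the same.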

    Get unique values in multiple columns:<\/h3>\n

    To get the unique values across multiple columns, we combine the contents of those columns into a single Series and call the .unique()<\/code> method on it.<\/p>\n

    CODE:-<\/p>\n

    # Program:\r\n\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\n# Employee records: (Name, Age, City, Experience); np.nan marks a missing value\r\nemp = [('jack', 34, 'Sydney', 5),\r\n       ('Riti', 31, 'Delhi', 7),\r\n       ('Aadi', 16, np.nan, 11),\r\n       ('Mohit', 31, 'Delhi', 7),\r\n       ('Veena', np.nan, 'Delhi', 4),\r\n       ('Shaunak', 35, 'Mumbai', 5),\r\n       ('Shaun', 35, 'Colombo', 11)\r\n       ]\r\n# Create the DataFrame\r\nempObj = pd.DataFrame(emp, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])\r\n# Obtain the unique values in multiple columns, i.e. Name & Age.\r\n# pd.concat stacks the two columns into one Series\r\n# (Series.append was removed in pandas 2.0)\r\nuValues = pd.concat([empObj['Name'], empObj['Age']]).unique()\r\nprint('The unique values in column \"Name\" & \"Age\" :')\r\nprint(uValues)<\/pre>\n
    Output :\r\nThe unique values in column \"Name\" & \"Age\" :\r\n['jack' 'Riti' 'Aadi' 'Mohit' 'Veena' 'Shaunak' 'Shaun' 34.0 31.0 16.0 nan\r\n35.0]<\/pre>\n
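One alternative, which is not the approach used in this article, is to flatten the selected columns into a single NumPy array and pass it to pd.unique. The sketch below assumes a trimmed copy of the sample data holding only the Name and Age columns; the names df and flat are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Trimmed copy of the sample data: only the two columns of interest
df = pd.DataFrame({
    'Name': ['jack', 'Riti', 'Aadi', 'Mohit', 'Veena', 'Shaunak', 'Shaun'],
    'Age': [34, 31, 16, 31, np.nan, 35, 35],
})

# to_numpy() gives a 2-D object array; ravel() flattens it row by row
flat = df[['Name', 'Age']].to_numpy().ravel()

# pd.unique keeps the order of first appearance in the flattened array
uValues = pd.unique(flat)
print(uValues)
```

Because the array is flattened row by row, names and ages come out interleaved rather than grouped by column as in the output above; the set of unique values is the same.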

    Want to become an expert in the Python programming language? The\u00a0Python Data Analysis using Pandas<\/a>\u00a0tutorial takes you from basic to advanced Python concepts.<\/p>\n

    Read more articles on Python Data Analysis Using Pandas \u2013 Select items from a DataFrame<\/strong><\/p>\n