{"id":26289,"date":"2022-01-03T09:18:41","date_gmt":"2022-01-03T03:48:41","guid":{"rendered":"https:\/\/python-programs.com\/?p=26289"},"modified":"2022-01-03T09:18:41","modified_gmt":"2022-01-03T03:48:41","slug":"in-python-how-do-you-get-unique-values-from-a-dataframe","status":"publish","type":"post","link":"https:\/\/python-programs.com\/in-python-how-do-you-get-unique-values-from-a-dataframe\/","title":{"rendered":"In Python, How do you get Unique Values from a Dataframe?"},"content":{"rendered":"
Pandas DataFrames are remarkably useful. DataFrames in Python make data manipulation very user-friendly.<\/p>\n
Pandas allows you to import large datasets and then manipulate them effectively. CSV data can be easily imported into a Pandas DataFrame.<\/p>\n
What are Python Dataframes?<\/strong><\/p>\n Dataframes are two-dimensional labeled data structures whose columns can hold different types. A dataset is often too large to examine in full, so we usually look at a summary of the Dataframe instead. DataFrame is the data structure that the Pandas module offers for working with large tabular datasets, such as large CSV or Excel files.<\/p>\n Because a data frame can store a huge volume of data, we frequently need to find the unique values in a dataset that contains redundant or repeated values.<\/p>\n This is where the pandas.unique()<\/strong> function comes in.<\/p>\n The pandas.unique() function returns the unique values of a one-dimensional dataset.<\/p>\n It uses a hash table-based technique to return the non-redundant values from a one-dimensional set of values, such as a Series or a single data frame column.<\/p>\n For Example:<\/p>\n Let dataset values = 5, 6, 7, 5, 2, 6<\/p>\n The output we get by applying the unique function = 5, 6, 7, 2<\/p>\n This way we can readily find the dataset’s unique values.<\/p>\n Syntax:<\/strong><\/p>\n This syntax comes in handy when dealing with one-dimensional data. It returns the unique values among the one-dimensional data values (Series data structure).<\/p>\n But what if the data has more than one dimension, such as rows and columns? In that case, unique() is applied to one column at a time, using the syntax below:<\/p>\n Syntax For Multidimensional data:<\/strong><\/p>\n This syntax allows us to extract the unique values from a specific column of a dataset.<\/p>\n The unique function works on columns of any type, but it is most informative for categorical data, where a small set of values repeats. 
Furthermore, the unique values are returned in the order in which they first appear in the dataset; they are not sorted.<\/p>\n Example<\/strong><\/p>\n Approach:<\/strong><\/p>\n Below is the implementation:<\/strong><\/p>\n Output:<\/strong><\/p>\n Import the dataset first as shown below:<\/p>\n Importing the Dataset:<\/strong><\/p>\n Import the dataset into a Pandas Dataframe.<\/p>\n Approach:<\/strong><\/p>\n Below is the implementation:<\/strong><\/p>\n This will save the dataset in the variable ‘cereal_dataset’ as a DataFrame.<\/p>\n pandas.DataFrame.nunique() function:<\/strong><\/p>\n The pandas.DataFrame.nunique()<\/strong> function returns the number of unique values present in each column of the dataframe.<\/p>\n Apply the nunique() function to the given dataset to get the count of unique values in each column of the dataframe.<\/p>\n Example:<\/strong><\/p>\n Output:<\/strong><\/p>\n The code below returns the unique values in the column ‘vitamins’.<\/p>\n Example<\/strong><\/p>\n Output:<\/strong><\/p>\n <\/p>\n","protected":false},"excerpt":{"rendered":" Pandas DataFrames are remarkably useful. DataFrames in Python make data manipulation very user-friendly. Pandas allows you to import large datasets and then manipulate them effectively. CSV data can be easily imported into a Pandas DataFrame. What are Python Dataframes? Dataframes are two-dimensional labeled data structures with columns of various types. DataFrames can be used for a …<\/p>\n
\nDataFrames can be used for a wide range of analyses.<\/p>\n
\nWe can get the first five rows of the dataset as well as a quick statistical summary of the data. Aside from that, we can gain information\u00a0about the types of columns in our dataset.<\/p>\npandas.unique() Function in Python<\/h4>\n
pandas.unique(data)<\/pre>\n
dataframe['column-name'].unique()<\/pre>\n
unique() function with Pandas Series<\/h5>\n
\n
# Import pandas module using the import keyword\r\nimport pandas\r\n# Give the list as static input and store it in a variable.\r\ngvn_lst = [5, 6, 7, 5, 2, 6]\r\n# The list has only one dimension, so convert it to a Series data structure\r\n# by passing it as an argument to the pandas.Series() function.\r\ngvn_series = pandas.Series(gvn_lst)\r\n# Pass the Series as an argument to the pandas.unique() function to\r\n# get all the unique values, in order of first appearance.\r\n# Store the result in another variable\r\nuniqval_arr = pandas.unique(gvn_series)\r\n# Print all unique elements from the given list\r\nprint(\"All unique elements from the given list = \")\r\nprint(uniqval_arr)\r\n<\/pre>\n
All unique elements from the given list = \r\n[5 6 7 2]<\/pre>\n
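The same result can also be obtained by calling the unique() method directly on the Series. The sketch below reuses the example values; the variable names are illustrative only. It also shows that the result follows first-appearance order, so sorting has to be requested explicitly.

```python
import pandas as pd

# Illustrative data: the same values as in the example list
values = pd.Series([5, 6, 7, 5, 2, 6])

# Series.unique() keeps the order of first appearance; it does not sort
unique_values = values.unique()
print(unique_values.tolist())

# Sort explicitly when an ordered result is needed
sorted_unique = sorted(unique_values.tolist())
print(sorted_unique)
```

Converting the result with tolist() is optional; it only turns the returned NumPy array into a plain Python list for printing.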
unique() function with Pandas DataFrame<\/h5>\n
\n
# Import pandas module as pd using the import keyword\r\nimport pandas as pd\r\n# Import dataset using read_csv() function by passing the dataset name as\r\n# an argument to it.\r\n# Store it in a variable.\r\ncereal_dataset = pd.read_csv('cereal.csv')\r\n<\/pre>\n
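To try the import step without a cereal.csv file on disk, a small CSV can be read from an in-memory string. This is only a sketch: the three rows and the names csv_text and sample are made up for illustration and are not taken from the real dataset.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file such as cereal.csv
csv_text = """name,mfr,vitamins
Cereal A,N,25
Cereal B,Q,0
Cereal C,K,25
"""

# read_csv() accepts a file path or a file-like object such as StringIO
sample = pd.read_csv(io.StringIO(csv_text))

# First rows, a quick statistical summary, and the column types
print(sample.head())
print(sample.describe())
print(sample.dtypes)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.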
cereal_dataset.nunique()<\/pre>\n
# Import pandas module as pd using the import keyword\r\nimport pandas as pd\r\n# Import the dataset using the read_csv() function by passing the dataset name as\r\n# an argument to it.\r\n# Store it in a variable.\r\ncereal_dataset = pd.read_csv('cereal.csv')\r\n# Apply the nunique() function to the dataset to get the number of unique\r\n# values present in each column of the dataframe.\r\ncereal_dataset.nunique()\r\n<\/pre>\n
name 77\r\nmfr 7\r\ntype 2\r\ncalories 11\r\nprotein 6\r\nfat 5\r\nsodium 27\r\nfiber 13\r\ncarbo 22\r\nsugars 17\r\npotass 36\r\nvitamins 3\r\nshelf 3\r\nweight 7\r\ncups 12\r\nrating 77\r\ndtype: int64<\/pre>\n
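The counting behaviour of nunique() can be checked on a small hand-made frame. The sketch below uses hypothetical values (not rows from the cereal dataset) and also illustrates the dropna parameter, which controls whether missing values count as a distinct value.

```python
import pandas as pd

# Hypothetical sample; the shelf column contains one missing value
sample = pd.DataFrame({
    "mfr": ["K", "G", "K", "N"],
    "vitamins": [25, 0, 25, 100],
    "shelf": [3, 3, 1, None],
})

# Number of distinct values per column; missing values are ignored by default
counts = sample.nunique()
print(counts)

# Count a missing value as its own distinct value
counts_with_nan = sample.nunique(dropna=False)
print(counts_with_nan)
```

The result is a Series indexed by column name, so a single count can be read with, for example, counts["vitamins"].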
cereal_dataset.vitamins.unique()<\/pre>\n
# Import pandas module as pd using the import keyword\r\nimport pandas as pd\r\n# Import the dataset using the read_csv() function by passing the dataset name as\r\n# an argument to it.\r\n# Store it in a variable.\r\ncereal_dataset = pd.read_csv('cereal.csv')\r\n# Apply the unique() function to the vitamins column in the given dataset to\r\n# get all the unique values in the column 'vitamins'.\r\ncereal_dataset.vitamins.unique()<\/pre>\n
array([ 25, 0, 100])<\/pre>\n
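Attribute access such as cereal_dataset.vitamins only works when the column name is a valid Python identifier, so bracket notation is the safer general form. The following sketch assumes a small hypothetical frame named cereal_sample and also collects the unique values of every column in a dictionary.

```python
import pandas as pd

# Hypothetical rows for illustration; not the real cereal dataset
cereal_sample = pd.DataFrame({
    "mfr": ["N", "Q", "K", "K", "G"],
    "vitamins": [25, 0, 25, 100, 25],
})

# Bracket notation works for any column name
vitamin_levels = cereal_sample["vitamins"].unique()
print(vitamin_levels)

# Unique values of every column, keyed by column name
unique_per_column = {
    column: cereal_sample[column].unique().tolist()
    for column in cereal_sample.columns
}
print(unique_per_column)
```

Because a DataFrame itself has no unique() method, looping over the columns like this is one straightforward way to inspect all of them at once.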