How to check if a value exists in a DataFrame using the in and not in operators and isin() in Python
In this article we are going to discuss different ways to check if a value is present in the dataframe.
We will be using the following dataset as an example:
      Name  Age      City  Marks
0     Jill   16     Tokyo    154
1   Rachel   38     Texas    170
2    Kirti   39  New York     88
3    Veena   40     Texas    190
4  Lucifer   35     Texas     59
5    Pablo   30  New York    123
6   Lionel   45  Colombia    189
Check if a single element exists in DataFrame using in & not in operators :
The DataFrame class has an attribute, DataFrame.values, that returns all the values as a NumPy array. We will use it with the in and not in operators to check whether a value is present in the dataframe.
Using in operator to check if an element exists in dataframe :
We will check whether the value 190 exists in the dataset using the in operator.
# Program :
import pandas as pd

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

if 190 in dfObj.values:
    print('Element exists in Dataframe')
Output : Element exists in Dataframe
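The check above works because the in operator on a NumPy array compares every element with the value and reports whether any comparison is true. The sketch below shows a roughly equivalent explicit form, and also shows that applying in directly to the DataFrame tests the column labels rather than the cell values, which is why .values is needed. The variable names (viaIn, viaCompare, inColumns, labelFound) are illustrative only.

```python
import pandas as pd

# Same example data as in the article
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Membership test on the NumPy array of cell values
viaIn = 190 in dfObj.values

# Roughly the same thing written out: elementwise comparison, then any()
viaCompare = bool((dfObj.values == 190).any())

# Applied to the DataFrame itself, `in` looks at column labels, not cells
inColumns = 190 in dfObj
labelFound = 'Marks' in dfObj

print(viaIn, viaCompare, inColumns, labelFound)
```

With this data the first two checks agree, while 190 in dfObj is False because 190 is not a column name.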
Using not in operator to check if an element doesn't exist in dataframe :
We will check that 'Leo' is not present in the dataframe using the not in operator.
# Program :
import pandas as pd

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

if 'Leo' not in dfObj.values:
    print('Element does not exist in Dataframe')
Output : Element does not exist in Dataframe
Checking if multiple elements exist in DataFrame or not using in operator :
To check for multiple elements, we can write a function that tests each value in turn.
# Program :
import pandas as pd

def checkForValues(_dfObj, listOfValues):
    # Check each value in the list against the dataframe values
    result = {}
    # Iterate through the elements
    for elem in listOfValues:
        # Record whether the element exists in the dataframe values
        result[elem] = elem in _dfObj.values
    # Return a dictionary mapping each value to a boolean for its existence
    return result

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Check for the existence of the values
Tresult = checkForValues(dfObj, [30, 'leo', 190])
print('Existence of the values inside the dataframe:')
print(Tresult)
Output :
Existence of the values inside the dataframe:
{30: True, 'leo': False, 190: True}
Rather than writing a whole function, we can achieve the same result more compactly with a dictionary comprehension.
# Program :
import pandas as pd

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

listOfValues = [30, 'leo', 190]

# Use a dictionary comprehension to check for the given values
result = {elem: elem in dfObj.values for elem in listOfValues}
print('Existence of the values inside the dataframe:')
print(result)
Output :
Existence of the values inside the dataframe:
{30: True, 'leo': False, 190: True}
Checking if elements exist in DataFrame using isin() function :
We can also check whether a value exists inside a dataframe using the isin() function.
Syntax : DataFrame.isin(values)
Where,
- values : The values to check for inside the dataframe. It can be an iterable (such as a list), a Series, a DataFrame or a dict.
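As a small illustration of the values parameter, the sketch below passes a dict instead of a list. With a dict, the keys are column names and each key's list is checked only against that column, so all other columns come back False. The names cityMatches and matchedRows are illustrative only.

```python
import pandas as pd

# Same example data as in the article
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Dict form: only the 'City' column is compared against the listed values
cityMatches = dfObj.isin({'City': ['Texas', 'Tokyo']})
print(cityMatches)

# Row labels where the City column matched one of the listed values
matchedRows = dfObj[cityMatches['City']].index.tolist()
print(matchedRows)
```

This form is useful when the same value could appear in more than one column but only one column is of interest.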
Checking if a single element exists in Dataframe using isin() :
# Program :
import pandas as pd

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Check for a single element using the isin() function
boolDf = dfObj.isin([38])
print(boolDf)
Output :
    Name    Age   City  Marks
0  False  False  False  False
1  False   True  False  False
2  False  False  False  False
3  False  False  False  False
4  False  False  False  False
5  False  False  False  False
6  False  False  False  False
Here the isin() method returned a boolean dataframe with the same shape as the original, where the cells that matched our value are True and all the rest are False.
We can chain any() onto this result: the first any() reduces the boolean dataframe to a Series with one boolean per column, and a second any() reduces that Series to a single boolean that tells us whether the element exists.
# Program :
import pandas as pd

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Check if the value is inside the dataframe using both isin() and any()
result = dfObj.isin(['Lionel']).any().any()
if result:
    print('The value exists inside the dataframe')
Output : The value exists inside the dataframe
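To make the chained calls easier to follow, here is a minimal sketch that stores each intermediate result separately. The names mask, perColumn and found are illustrative and not part of pandas.

```python
import pandas as pd

# Same example data as in the article
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Step 1: boolean dataframe, True where a cell equals 'Lionel'
mask = dfObj.isin(['Lionel'])

# Step 2: one boolean per column, True if that column has any match
perColumn = mask.any()

# Step 3: a single boolean, True if any column had a match
found = perColumn.any()

print(perColumn)
print(found)
```

Here only the Name column reports a match, so perColumn also tells us where the value was found before it is collapsed to a single result.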
Checking if any of the given values exists in the Dataframe :
# Program :
import pandas as pd

# Example data
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]

# Create the DataFrame object
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

# Check if any of the values is inside the dataframe using both isin() and any()
result = dfObj.isin([81, 'Lionel', 190]).any().any()
if result:
    print('At least one of the elements exists in Dataframe')
Output : At least one of the elements exists in Dataframe
This program prints the message only if at least one of the given values exists inside the dataframe.
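If we also need to know which of the values were found, or whether all of them are present, one possible approach is to run the isin() check once per value. This is only a sketch: the names perValue and allPresent are illustrative, and the loop scans the dataframe once for each value.

```python
import pandas as pd

# Same example data as in the article
employees = [
    ('Jill', 16, 'Tokyo', 154),
    ('Rachel', 38, 'Texas', 170),
    ('Kirti', 39, 'New York', 88),
    ('Veena', 40, 'Texas', 190),
    ('Lucifer', 35, 'Texas', 59),
    ('Pablo', 30, 'New York', 123),
    ('Lionel', 45, 'Colombia', 189)]
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])

listOfValues = [81, 'Lionel', 190]

# Map each value to True/False depending on whether it occurs anywhere
perValue = {value: bool(dfObj.isin([value]).any().any())
            for value in listOfValues}

# True only when every value in the list was found
allPresent = all(perValue.values())

print(perValue)
print(allPresent)
```

With this data, 81 is missing while 'Lionel' and 190 are present, so allPresent is False even though the any-based check above succeeds.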