How to check if a value exists in a DataFrame using the in and not in operators and isin() in Python
In this article we are going to discuss different ways to check if a value is present in the dataframe.
We will be using the following dataset as an example:
      Name  Age      City  Marks
0     Jill   16     Tokyo    154
1   Rachel   38     Texas    170
2    Kirti   39  New York     88
3    Veena   40     Texas    190
4  Lucifer   35     Texas     59
5    Pablo   30  New York    123
6   Lionel   45  Colombia    189
Check if a single element exists in DataFrame using in & not in operators :
The DataFrame class has an attribute, DataFrame.values, that returns all of the values as a NumPy array. We will use it with the in and not in operators to check whether a value is present in the dataframe.
Using in operator to check if an element exists in dataframe :
We will be checking if the value 190 exists in the dataset using in operator.
#Program :
import pandas as pd
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
if 190 in dfObj.values:
    print('Element exists in Dataframe')
Output : Element exists in Dataframe
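One pitfall is worth noting here: applying in directly to the DataFrame, without .values, tests the column labels rather than the cell values. The following is a minimal sketch of the difference, using a shortened version of the dataset; the variable names are chosen only for illustration.

```python
import pandas as pd

# Shortened version of the example dataset (illustrative only)
df = pd.DataFrame(
    [('Jill', 16, 'Tokyo', 154), ('Veena', 40, 'Texas', 190)],
    columns=['Name', 'Age', 'City', 'Marks'])

# `in` on the DataFrame itself looks at the column labels
label_check = 'Marks' in df       # 'Marks' is a column label
value_as_label = 190 in df        # 190 is a cell value, not a column label

# `in` on df.values looks at the cell values
value_check = 190 in df.values

print(label_check, value_as_label, value_check)
```

So a value lookup needs .values; without it, the check silently answers a different question.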
Using not in operator to check if an element doesn’t exist in dataframe :
We will be checking that ‘Leo’ is not present in the dataframe using the not in operator.
# Program :
import pandas as pd
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
if 'Leo' not in dfObj.values:
    print('Element does not exist in Dataframe')
Output : Element does not exist in Dataframe
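The checks above search every column at once. To restrict the search to a single column, the same operators can be applied to that column's values. The sketch below assumes a shortened dataset and illustrative variable names; it also shows that in applied to a Series without .values tests the index labels, not the data.

```python
import pandas as pd

# Shortened version of the example dataset (illustrative only)
df = pd.DataFrame(
    [('Jill', 16, 'Tokyo', 154), ('Rachel', 38, 'Texas', 170)],
    columns=['Name', 'Age', 'City', 'Marks'])

# Search only the 'Age' column
in_age = 16 in df['Age'].values             # 16 is an age
marks_in_age = 154 not in df['Age'].values  # 154 is in 'Marks', not in 'Age'

# Without .values, `in` on a Series tests the index labels (0 and 1 here)
index_check = 1 in df['Age']                # 1 is an index label, not an age

print(in_age, marks_in_age, index_check)
```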
Checking if multiple elements exist in DataFrame or not using in operator :
To check for multiple elements, we have to write a function.
# Program :
import pandas as pd
def checkForValues(_dfObj, listOfValues):
    # Check whether each value in the list exists in the dataframe
    result = {}
    # Iterate through the elements
    for elem in listOfValues:
        # Check if the element exists in the dataframe values
        if elem in _dfObj.values:
            result[elem] = True
        else:
            result[elem] = False
    # Return a dictionary mapping each value to a boolean indicating its existence
    return result
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
#Check for the existence of values
Tresult = checkForValues(dfObj, [30, 'leo', 190])
print('The values existence inside the dataframe are ')
print(Tresult)
Output :
The values existence inside the dataframe are
{30: True, 'leo': False, 190: True}
Rather than writing a whole function, we can achieve the same result more concisely with a dictionary comprehension.
# Program :
import pandas as pd
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
listOfValues = [30, 'leo', 190]
#Using a dictionary comprehension to check for the given values
result = {elem: elem in dfObj.values for elem in listOfValues}
print('The values existence inside the dataframe are ')
print(result)
Output :
The values existence inside the dataframe are
{30: True, 'leo': False, 190: True}
Checking if elements exist in DataFrame using isin() function :
We can also check if a value exists inside a dataframe or not using the isin( ) function.
Syntax : DataFrame.isin(values)
Where,
- values : The values to look for inside the dataframe. It can be an iterable (such as a list), a Series, a DataFrame or a dict.
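As a rough illustration of the values parameter, the sketch below contrasts a list with a dict. With a list, every column is compared against the same values; with a dict, each key names a column and its value lists what to look for in that column. The shortened dataset and the variable names are assumptions made for this example.

```python
import pandas as pd

# Shortened version of the example dataset (illustrative only)
df = pd.DataFrame(
    [('Rachel', 38, 'Texas', 170), ('Kirti', 39, 'New York', 88),
     ('Veena', 40, 'Texas', 190)],
    columns=['Name', 'Age', 'City', 'Marks'])

# List: every column is compared against the same values
mask_all = df.isin([38, 'Texas'])

# Dict: each key is a column label, each value the list to look for there;
# columns not named in the dict come back all False
mask_by_col = df.isin({'City': ['Texas'], 'Marks': [38]})

print(mask_all)
print(mask_by_col)
```

In the dict form, 38 is searched for only in 'Marks', so the 38 in the 'Age' column is not reported as a match.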
Checking if a single element exists in Dataframe using isin() :
# Program :
import pandas as pd
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
#Checking for a single element using the isin() function
boolDf = dfObj.isin([38])
print(boolDf)
Output :
    Name    Age   City  Marks
0  False  False  False  False
1  False   True  False  False
2  False  False  False  False
3  False  False  False  False
4  False  False  False  False
5  False  False  False  False
6  False  False  False  False
Here isin() returned a boolean dataframe of the same shape as the original, in which the cells that matched our value are True and all the others are False.
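The boolean dataframe can also be used directly, for example to count the matching cells or to select the rows that contain the value. This is a minimal sketch under the assumption of a shortened dataset; mask, match_count and matching_rows are illustrative names.

```python
import pandas as pd

# Shortened version of the example dataset (illustrative only)
df = pd.DataFrame(
    [('Jill', 16, 'Tokyo', 154), ('Rachel', 38, 'Texas', 170),
     ('Kirti', 39, 'New York', 88)],
    columns=['Name', 'Age', 'City', 'Marks'])

mask = df.isin([38])

# Total number of cells equal to 38
match_count = int(mask.to_numpy().sum())

# Rows that contain 38 in at least one column
matching_rows = df[mask.any(axis=1)]

print(match_count)
print(matching_rows)
```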
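To get a single answer, we can chain any() twice on the boolean dataframe returned by isin(). The first any() reduces each column to one boolean and returns a Series; the second any() reduces that Series to a single boolean that tells us whether the element exists anywhere in the dataframe.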
# Program :
import pandas as pd
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
# Check if the value is inside the dataframe using both the isin() and any() functions
result = dfObj.isin(['Lionel']).any().any()
if result:
    print('The value exists inside the dataframe')
Output : The value exists inside the dataframe
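To make the two any() calls easier to follow, the sketch below keeps each intermediate result in its own variable. The shortened dataset and the names mask, per_column and found are assumptions made for illustration.

```python
import pandas as pd

# Shortened version of the example dataset (illustrative only)
df = pd.DataFrame(
    [('Pablo', 30, 'New York', 123), ('Lionel', 45, 'Colombia', 189)],
    columns=['Name', 'Age', 'City', 'Marks'])

mask = df.isin(['Lionel'])   # boolean DataFrame, same shape as df
per_column = mask.any()      # Series: one boolean per column
found = per_column.any()     # single boolean for the whole DataFrame

print(per_column)
print(found)
```

Here only the 'Name' entry of per_column is True, which is enough for the final result to be True.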
Checking if any of the given values exists in the Dataframe :
# Program :
import pandas as pd
#Example data
employees = [
('Jill', 16, 'Tokyo', 154),
('Rachel', 38, 'Texas', 170),
('Kirti', 39, 'New York', 88),
('Veena', 40, 'Texas', 190),
('Lucifer', 35, 'Texas', 59),
('Pablo', 30, 'New York', 123),
('Lionel', 45, 'Colombia', 189)]
# DataFrame object was created
dfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Marks'])
# Check if any of the values is inside the dataframe using both the isin() and any() functions
result = dfObj.isin([81, 'Lionel', 190]).any().any()
if result:
    print('At least one of the elements exists in Dataframe')
Output : At least one of the elements exists in Dataframe
This program prints the message only if at least one of the given values exists in the dataframe.
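If we need to know which of the values exist, or whether all of them do, one possible approach is to run the same isin() and any() check once per value. This is a sketch rather than the only way to do it; the shortened dataset and the names exists, any_present and all_present are illustrative.

```python
import pandas as pd

# Shortened version of the example dataset (illustrative only)
df = pd.DataFrame(
    [('Veena', 40, 'Texas', 190), ('Lionel', 45, 'Colombia', 189)],
    columns=['Name', 'Age', 'City', 'Marks'])

values = [81, 'Lionel', 190]

# One existence flag per value (bool() converts the NumPy boolean)
exists = {v: bool(df.isin([v]).any().any()) for v in values}

any_present = any(exists.values())   # at least one value was found
all_present = all(exists.values())   # every value was found

print(exists)
print(any_present, all_present)
```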