How to Drop Columns in a DataFrame by Label Names or by Index Positions in Python?
In this article, we are going to demonstrate how to drop columns in a dataframe by their labels or index. So, let’s start exploring the topic in detail.
A pandas dataframe provides the drop() function, which can be used to drop columns.
Syntax – DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Where,
- labels : The label or the labels we want to delete are passed here.
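- index : The row labels to be deleted. It is an alternative to passing labels with axis=0.
- columns : The column labels to be deleted. It is an alternative to passing labels with axis=1.
- axis : If axis is 1 (or 'columns'), the labels are looked up among the columns. If it is 0 (or 'index', the default), they are looked up among the rows.
- inplace : If it is False (the default), drop() doesn’t modify the dataframe, it returns a new one. If it is set to True, it modifies the dataframe itself and returns None.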
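As a small illustration of the parameters above, the sketch below drops the same column in two equivalent ways, once through labels with axis and once through the columns keyword, and then drops a row through the index keyword. The toy dataframe and the variable names are made up for this example only.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]}, index=["x", "y"])

# Drop column 'B' by passing the label together with axis=1
by_axis = df.drop(labels="B", axis=1)

# Drop the same column through the columns keyword (no axis needed)
by_keyword = df.drop(columns="B")

# Drop row 'x' through the index keyword
without_row = df.drop(index="x")

print(by_axis.equals(by_keyword))  # True
print(list(without_row.index))     # ['y']
print(list(df.columns))            # ['A', 'B', 'C'], the original is untouched
```

The columns keyword tends to read more clearly than labels plus axis, because the call itself says what is being removed.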
We will be using the following dataset as an example :
   Regd     Name   Age      City  Exp
a    10     Jill  16.0     Tokyo   10
b    11   Rachel  38.0     Texas    5
c    12    Kirti  39.0  New York    7
d    13    Veena  40.0     Texas   21
e    14  Lucifer   NaN     Texas   30
f    15    Pablo  30.0  New York    7
g    16   Lionel  45.0  Colombia   11
Delete a Single column in DataFrame by Column Name :
We can delete a single column just by passing its name into the function. Let’s try deleting the ‘Age’ column from the dataframe.
The following program shows how to do this.
#Program :
import numpy as np
import pandas as pd

# Example data (np.nan marks the missing age; the np.NaN alias was removed in NumPy 2.0)
students = [(10, 'Jill', 16, 'Tokyo', 10),
            (11, 'Rachel', 38, 'Texas', 5),
            (12, 'Kirti', 39, 'New York', 7),
            (13, 'Veena', 40, 'Texas', 21),
            (14, 'Lucifer', np.nan, 'Texas', 30),
            (15, 'Pablo', 30, 'New York', 7),
            (16, 'Lionel', 45, 'Colombia', 11)]

#Creating a dataframe object
dfObj = pd.DataFrame(students,
                     columns=['Regd', 'Name', 'Age', 'City', 'Exp'],
                     index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

#Dropping the 'Age' column and storing the result in a new object
modDfObj = dfObj.drop('Age', axis='columns')
print(modDfObj)
Output :
   Regd     Name      City  Exp
a    10     Jill     Tokyo   10
b    11   Rachel     Texas    5
c    12    Kirti  New York    7
d    13    Veena     Texas   21
e    14  Lucifer     Texas   30
f    15    Pablo  New York    7
g    16   Lionel  Colombia   11
Drop Multiple Columns by Label Names in DataFrame :
To delete multiple columns by name we just have to pass all the names as a list into the function. Let’s try deleting the ‘Age’ and ‘Exp’ columns.
The following program shows how to do this.
#Program :
import numpy as np
import pandas as pd

# Example data (np.nan marks the missing age; the np.NaN alias was removed in NumPy 2.0)
students = [(10, 'Jill', 16, 'Tokyo', 10),
            (11, 'Rachel', 38, 'Texas', 5),
            (12, 'Kirti', 39, 'New York', 7),
            (13, 'Veena', 40, 'Texas', 21),
            (14, 'Lucifer', np.nan, 'Texas', 30),
            (15, 'Pablo', 30, 'New York', 7),
            (16, 'Lionel', 45, 'Colombia', 11)]

#Creating a dataframe object
dfObj = pd.DataFrame(students,
                     columns=['Regd', 'Name', 'Age', 'City', 'Exp'],
                     index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

#Dropping the 'Age' and 'Exp' columns and storing the result in a new object
modDfObj = dfObj.drop(['Age', 'Exp'], axis='columns')
print(modDfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia
Drop Columns by Index Position in DataFrame :
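In case we know the index positions of the columns we want to drop, we can use them as well. drop() works on labels, so we first look up the label at each position through dfObj.columns and then pass those labels into the function. Let’s try deleting the same two columns as above, but by their index positions.
The following program shows how to do this.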
#Program :
import numpy as np
import pandas as pd

# Example data (np.nan marks the missing age; the np.NaN alias was removed in NumPy 2.0)
students = [(10, 'Jill', 16, 'Tokyo', 10),
            (11, 'Rachel', 38, 'Texas', 5),
            (12, 'Kirti', 39, 'New York', 7),
            (13, 'Veena', 40, 'Texas', 21),
            (14, 'Lucifer', np.nan, 'Texas', 30),
            (15, 'Pablo', 30, 'New York', 7),
            (16, 'Lionel', 45, 'Colombia', 11)]

#Creating a dataframe object
dfObj = pd.DataFrame(students,
                     columns=['Regd', 'Name', 'Age', 'City', 'Exp'],
                     index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

#Looking up the labels at positions 2 and 4 and dropping those columns into a new object
modDfObj = dfObj.drop([dfObj.columns[2], dfObj.columns[4]], axis='columns')
print(modDfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia
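When several positions are involved, indexing dfObj.columns one position at a time becomes repetitive. A possible shortcut, sketched below on a cut-down version of the example data, is to index the column Index with a list of positions and pass the result through the columns keyword. The names positions, labels_to_drop and trimmed are chosen only for this illustration.

```python
import pandas as pd

# Cut-down version of the example data, two rows are enough here
df = pd.DataFrame([[10, "Jill", 16, "Tokyo", 10],
                   [11, "Rachel", 38, "Texas", 5]],
                  columns=["Regd", "Name", "Age", "City", "Exp"])

# Positions of the columns to remove (0-based)
positions = [2, 4]

# Indexing the column Index with a list returns the labels at those positions
labels_to_drop = df.columns[positions]

# drop() still receives labels, not positions
trimmed = df.drop(columns=labels_to_drop)

print(list(labels_to_drop))   # ['Age', 'Exp']
print(list(trimmed.columns))  # ['Regd', 'Name', 'City']
```

Because the drop is still done by label, a dataframe with duplicated column names would lose every column that shares one of the looked-up labels.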
Drop Columns in Place :
In case we don’t want a new dataframe object to hold the modified values, but want to store them in the same object, we can do it by passing inplace=True. Let’s use the previous example for this.
The following program shows how to do this.
#Program :
import numpy as np
import pandas as pd

# Example data (np.nan marks the missing age; the np.NaN alias was removed in NumPy 2.0)
students = [(10, 'Jill', 16, 'Tokyo', 10),
            (11, 'Rachel', 38, 'Texas', 5),
            (12, 'Kirti', 39, 'New York', 7),
            (13, 'Veena', 40, 'Texas', 21),
            (14, 'Lucifer', np.nan, 'Texas', 30),
            (15, 'Pablo', 30, 'New York', 7),
            (16, 'Lionel', 45, 'Colombia', 11)]

#Creating a dataframe object
dfObj = pd.DataFrame(students,
                     columns=['Regd', 'Name', 'Age', 'City', 'Exp'],
                     index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

#Dropping the columns at positions 2 and 4 from the same object
dfObj.drop([dfObj.columns[2], dfObj.columns[4]], axis='columns', inplace=True)
print(dfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia
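One detail worth keeping in mind with inplace=True is that drop() then returns None, so its result should not be assigned back to the variable. The short sketch below, on made-up data, shows this and the equivalent reassignment style without inplace.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
df = pd.DataFrame({"Name": ["Jill", "Rachel"], "Age": [16, 38], "Exp": [10, 5]})

# With inplace=True the dataframe itself changes and the call returns None
result = df.drop(columns=["Age"], inplace=True)
print(result)            # None
print(list(df.columns))  # ['Name', 'Exp']

# The same effect without inplace: reassign the returned dataframe
df = df.drop(columns=["Exp"])
print(list(df.columns))  # ['Name']
```

Writing dfObj = dfObj.drop(..., inplace=True) would therefore replace the dataframe with None, which is a common slip.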
Drop Column If Exists :
If we ask drop() to delete a column that does not exist, it raises a KeyError. To avoid this, we can check beforehand whether the column is present in dfObj.columns with an if-else condition, and call drop() only when it is.
#Program :
import numpy as np
import pandas as pd

# Example data (np.nan marks the missing age; the np.NaN alias was removed in NumPy 2.0)
students = [(10, 'Jill', 16, 'Tokyo', 10),
            (11, 'Rachel', 38, 'Texas', 5),
            (12, 'Kirti', 39, 'New York', 7),
            (13, 'Veena', 40, 'Texas', 21),
            (14, 'Lucifer', np.nan, 'Texas', 30),
            (15, 'Pablo', 30, 'New York', 7),
            (16, 'Lionel', 45, 'Colombia', 11)]

#Creating a dataframe object
dfObj = pd.DataFrame(students,
                     columns=['Regd', 'Name', 'Age', 'City', 'Exp'],
                     index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

#Checking for a non-existent column before dropping it
if 'Last Name' in dfObj.columns:
    #Reassigning, because drop() returns a new dataframe by default
    dfObj = dfObj.drop('Last Name', axis='columns')
    print(dfObj)
else:
    print("The column was not found")
Output : The column was not found
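The errors parameter from the syntax above offers another way to handle missing columns. The sketch below, on made-up data, first shows the default errors='raise' behaviour and then errors='ignore', which skips labels that are not present and drops only the ones that exist. The variable names are chosen only for this illustration.

```python
import pandas as pd

# Hypothetical toy data, used only for this illustration
df = pd.DataFrame({"Name": ["Jill", "Rachel"], "Age": [16, 38]})

# Default behaviour (errors='raise'): a missing label raises KeyError
raised = False
try:
    df.drop(columns="Last Name")
except KeyError:
    raised = True
print(raised)  # True

# errors='ignore' skips the missing label and still drops the existing one
safe = df.drop(columns=["Last Name", "Age"], errors="ignore")
print(list(safe.columns))  # ['Name']
```

The explicit if-else check is preferable when the program needs to report that the column was missing, while errors='ignore' is shorter when a missing column simply does not matter.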