Python Pandas : Drop columns in DataFrame by label Names or by Index Positions

How to drop columns in DataFrame by label Names or by Index Positions in Python ?

In this article, we are going to demonstrate how to drop columns in a dataframe by their labels or index. So, let’s start exploring the topic in detail.

A pandas DataFrame provides the drop() method, which can be used to remove columns (or rows) by label.

Syntax – DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Where,

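  • labels : A single label or a list of labels to drop. Whether they are looked up among the rows or the columns depends on axis.
  • index : The row labels to drop. Passing index=labels is equivalent to passing labels with axis=0.
  • columns : The column labels to drop. Passing columns=labels is equivalent to passing labels with axis=1.
  • axis : If axis is 1 (or 'columns'), the labels refer to columns. If it is 0 (or 'index'), which is the default, they refer to rows.
  • inplace : If it is False (the default), the dataframe is not modified and a new one is returned. If it is set to True, the dataframe itself is modified and None is returned.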
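
As a minimal sketch of how these parameters relate, the snippet below drops the same column in three equivalent ways. The small two-row frame and the variable names are made up for this illustration only.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({"Regd": [10, 11],
                   "Name": ["Jill", "Rachel"],
                   "Age": [16, 38]})

# Three equivalent ways to drop the 'Age' column
by_axis_name = df.drop("Age", axis="columns")
by_axis_number = df.drop(labels="Age", axis=1)
by_keyword = df.drop(columns="Age")

# All three results hold the same remaining columns
print(by_keyword.columns.tolist())
print(by_axis_name.equals(by_keyword) and by_axis_number.equals(by_keyword))

# The original frame is untouched because inplace defaults to False
print(df.columns.tolist())
```

The columns= keyword is often the most readable choice, since it makes the intent clear without a separate axis argument.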

We will be using the following dataset as example :

   Regd     Name   Age      City  Exp
a    10     Jill  16.0     Tokyo   10
b    11   Rachel  38.0     Texas    5
c    12    Kirti  39.0  New York    7
d    13    Veena  40.0     Texas   21
e    14  Lucifer   NaN     Texas   30
f    15    Pablo  30.0  New York    7
g    16   Lionel  45.0  Colombia   11

Delete a Single column in DataFrame by Column Name :

We can delete a single column by passing its name to drop() together with axis='columns'. Let’s try deleting the ‘Age’ column from the dataframe.

The following program shows how to do this.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe and storing it into a new object
modDfObj = dfObj.drop('Age' , axis='columns')
print(modDfObj)
Output :
   Regd     Name      City  Exp
a    10     Jill     Tokyo   10
b    11   Rachel     Texas    5
c    12    Kirti  New York    7
d    13    Veena     Texas   21
e    14  Lucifer     Texas   30
f    15    Pablo  New York    7
g    16   Lionel  Colombia   11

Drop Multiple Columns by Label Names in DataFrame :

To delete multiple columns by name, we pass all the names as a list to drop(). Let’s try deleting the ‘Age’ and ‘Exp’ columns.

The following program shows how to do this.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe without the columns and storing it into a new object
modDfObj = dfObj.drop(['Age' , 'Exp'] , axis='columns')
print(modDfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia

Drop Columns by Index Position in DataFrame :

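The drop() function works on labels, not positions. If we only know the index positions of the columns we want to drop, we can look up the corresponding labels through dfObj.columns and pass those to the function. Let’s try deleting the same two columns as above, this time using their index positions (2 and 4).

The following program shows how to do this.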

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Dropping the columns by looking up their labels from their index positions and storing the result in a new object
modDfObj = dfObj.drop([dfObj.columns[2] , dfObj.columns[4]] ,  axis='columns')
print(modDfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia
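
A slightly more compact variant, sketched here under the assumption that the positions are held in a list, indexes dfObj.columns with the whole list at once. The shortened frame and the names positions and labels_to_drop are illustrative only.

```python
import pandas as pd

# Hypothetical shortened version of the example data
df = pd.DataFrame({"Regd": [10, 11],
                   "Name": ["Jill", "Rachel"],
                   "Age": [16.0, 38.0],
                   "City": ["Tokyo", "Texas"],
                   "Exp": [10, 5]})

# Positions of the columns to drop (0-based)
positions = [2, 4]

# Indexing the column Index with a list returns the matching labels
labels_to_drop = df.columns[positions]

# drop() still receives labels, not positions
trimmed = df.drop(columns=labels_to_drop)
print(trimmed.columns.tolist())
```

Note that the lookup goes through labels, so if two columns share the same label, drop() removes every column carrying that label.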

Drop Columns in Place :

If we don’t want a new dataframe object to hold the result, but want the change applied to the same object, we can pass inplace=True. Let’s reuse the previous example for this.

The following program shows how to do this.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe without the columns and storing it into the same object 
dfObj.drop([dfObj.columns[2] , dfObj.columns[4]] ,  axis='columns',inplace = True)
print(dfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia
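
One thing to watch for with inplace=True is that drop() returns None, so its result should not be assigned back to the variable. The short sketch below, using a made-up two-row frame, shows the difference.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({"Regd": [10, 11],
                   "Name": ["Jill", "Rachel"],
                   "Age": [16, 38]})

# With inplace=True the frame is modified and the call returns None
result = df.drop(columns=["Age"], inplace=True)

print(result)               # the return value of the in-place call
print(df.columns.tolist())  # the frame itself has lost the column
```

Writing df = df.drop(..., inplace=True) would therefore replace the dataframe with None.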

Drop Column If Exists :

If the column we want to drop may not exist, it is safer to check for it beforehand. By default, drop() raises a KeyError when a label is not found (errors='raise'). We can avoid the exception by testing whether the label is present in dfObj.columns with an if-else condition before calling drop().

#program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Checking for a non-existent column
if 'Last Name' in dfObj.columns :
    #drop() returns a new dataframe, so the result is assigned back
    dfObj = dfObj.drop('Last Name' ,  axis='columns')
    print(dfObj)
else :
    print("The column was not found")
Output :
The column was not found
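
As an alternative to the explicit check, drop() also accepts errors='ignore', which skips labels that are not present instead of raising. The sketch below uses a made-up two-column frame, and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({"Regd": [10, 11], "Name": ["Jill", "Rachel"]})

# errors='ignore' silently skips the missing 'Last Name' label
cleaned = df.drop(columns=["Last Name"], errors="ignore")
print(cleaned.columns.tolist())

# Labels that do exist are still dropped when mixed with missing ones
partly = df.drop(columns=["Name", "Last Name"], errors="ignore")
print(partly.columns.tolist())

# With the default errors='raise', the same call raises KeyError
try:
    df.drop(columns=["Last Name"])
    raised = False
except KeyError:
    raised = True
print(raised)
```

The explicit membership check is still useful when the program needs to report that the column was missing, since errors='ignore' gives no indication either way.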
