How to select rows and columns by Name or Index in a DataFrame using loc and iloc in Python?
We will discuss several methods to select rows and columns in a dataframe. To select rows or columns we can use loc[ ], iloc[ ] or the [ ] operator.
To demonstrate the various methods we will be using the following dataset :
      Name  Score      City
0     Jill   16.0     Tokyo
1   Rachel   38.0     Texas
2    Kirti   39.0  New York
3    Veena   40.0     Texas
4  Lucifer    NaN     Texas
5    Pablo   30.0  New York
6   Lionel   45.0  Colombia
Method-1 : DataFrame.loc | Select Column & Rows by Name
We can use the loc[ ] indexer to select rows and columns by their labels.
Syntax :
dataFrame.loc[<ROWS RANGE> , <COLUMNS RANGE>]
We have to enter the range of rows or columns, and it will select the specified range.
If we don’t give a value and pass ‘:’ instead, it will select all the rows or columns.
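As a minimal sketch of the syntax above, the following shows a label range and a bare ':' with loc. The small frame df and its labels are made up for illustration and are not the article's dataset. Note that, unlike ordinary Python slices, a label range in loc includes both endpoints.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti'], 'Score': [16, 38, 39]},
    index=['a', 'b', 'c'])

# Label range for the rows, ':' for the columns:
# rows 'a' through 'b' (both included), all columns
first_two = df.loc['a':'b', :]

# ':' for the rows, label range for the columns:
# all rows, columns 'Name' through 'Score' (both included)
all_rows = df.loc[:, 'Name':'Score']

print(first_two)
print(all_rows)
```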
Select a Column by Name in DataFrame using loc[ ] :
As we need to select a single column only, we have to pass ‘:’ in place of the row range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'])
#Selecting the 'Score' column
columnD = dfObj.loc[:,'Score']
print(columnD)
Output :
0 16.0
1 38.0
2 39.0
3 40.0
4 NaN
5 30.0
6 45.0
Name: Score, dtype: float64
Select multiple Columns by Name in DataFrame using loc[ ] :
To select multiple columns, we pass the column names as a list in the column position of loc[ ].
So, let’s see the implementation of it.
#Program
import pandas as pd
import numpy as np
#data
students = [('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple columns i.e 'Name' and 'Score' column
columnD = dfObj.loc[:,['Name','Score']]
print(columnD)
Output :
Name Score
a Jill 16.0
b Rachel 38.0
c Kirti 39.0
d Veena 40.0
e Lucifer NaN
f Pablo 30.0
g Lionel 45.0
Select a single row by Index Label in DataFrame using loc[ ] :
Just like a column, we can also select a single row by passing its index label, with ‘:’ in place of the column range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting a single row i.e 'b' row
selectData = dfObj.loc['b',:]
print(selectData)
Output :
Name Rachel
Score 38.0
City Texas
Name: b, dtype: object
Select multiple rows by Index labels in DataFrame using loc[ ] :
To select multiple rows we pass their index labels as a list in the row position of loc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple rows i.e 'd' and 'g'
selectData = dfObj.loc[['d','g'],:]
print(selectData)
Output :
Name Score City
d Veena 40.0 Texas
g Lionel 45.0 Colombia
Select multiple rows & columns by Labels in DataFrame using loc[ ] :
To select multiple rows and columns we pass a list of row labels and a list of column names to loc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple rows and columns i.e 'd' and 'g' rows and 'Name' , 'City' column
selectData = dfObj.loc[['d','g'],['Name','City']]
print(selectData)
Output :
Name City
d Veena Texas
g Lionel Colombia
Method-2 : DataFrame.iloc | Select Column Indexes & Rows Index Positions
We can use the iloc[ ] indexer to select rows and columns by integer position. It is quite similar to loc[ ].
Syntax :
dataFrame.iloc[<ROWS INDEX RANGE> , <COLUMNS INDEX RANGE>]
iloc[ ] selects rows and columns in the dataframe by the index positions we pass to it. Just like with loc[ ], if ‘:’ is passed, all the rows/columns are selected.
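The difference from loc can be sketched as follows, assuming a small made-up frame df (not the article's dataset): iloc counts positions starting at 0 and ignores the index labels entirely, so position 1 and label 'b' refer to the same row here.

```python
import pandas as pd

# Hypothetical three-row frame with string index labels
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti'], 'City': ['Tokyo', 'Texas', 'New York']},
    index=['a', 'b', 'c'])

by_position = df.iloc[1, :]   # second row, selected by position
by_label = df.loc['b', :]     # the same row, selected by label

# Both selections return the same Series
print(by_position.equals(by_label))

# ':' in both places selects every row and every column
everything = df.iloc[:, :]
print(everything.shape)
```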
Select a single column by Index position :
We have to pass the index of the column with ‘:’ in place of the row index.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting a single column at the index 2
selectData = dfObj.iloc[:,2]
print(selectData)
Output :
a Tokyo
b Texas
c New York
d Texas
e Texas
f New York
g Colombia
Name: City, dtype: object
Select multiple columns by Indices in a list :
To select multiple columns by indices we just pass the indices as a list in the column position.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple columns at the index 0 & 2
selectData = dfObj.iloc[:,[0,2]]
print(selectData)
Output :
Name City
a Jill Tokyo
b Rachel Texas
c Kirti New York
d Veena Texas
e Lucifer Texas
f Pablo New York
g Lionel Colombia
Select multiple columns by Index range :
To select multiple columns by index range we pass the range as start:stop in the column position. The stop position is excluded, so 1:3 selects the columns at positions 1 and 2.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple columns from index 1 up to (but not including) index 3
selectData = dfObj.iloc[:,1:3]
print(selectData)
Output :
Score City
a 16.0 Tokyo
b 38.0 Texas
c 39.0 New York
d 40.0 Texas
e NaN Texas
f 30.0 New York
g 45.0 Colombia
Select single row by Index Position :
Just like columns we can pass the index and select the row.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting a single row with index 2
selectData = dfObj.iloc[2,:]
print(selectData)
Output :
Name Kirti
Score 39.0
City New York
Name: c, dtype: object
Select multiple rows by Index positions in a list :
To do this we pass a list of the row positions we want to select.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple rows by passing a list i.e. 2 & 5
selectData = dfObj.iloc[[2,5],:]
print(selectData)
Output :
Name Score City
c Kirti 39.0 New York
f Pablo 30.0 New York
Select multiple rows by Index range :
To select a range of rows we pass the range as start:stop in the row position. The stop position is excluded.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple rows by range i.e. positions 2, 3 and 4 (5 is excluded)
selectData = dfObj.iloc[2:5,:]
print(selectData)
Output :
Name Score City
c Kirti 39.0 New York
d Veena 40.0 Texas
e Lucifer NaN Texas
Select multiple rows & columns by Index positions :
To select multiple rows and columns at once, we pass a list of row positions and a list of column positions to iloc[ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Selecting multiple rows and columns
selectData = dfObj.iloc[[1,2],[1,2]]
print(selectData)
Output :
Score City
b 38.0 Texas
c 39.0 New York
Method-3 : Selecting Columns in DataFrame using [ ] operator
The [ ] operator selects columns according to the name provided to it. However, when a non-existent label is passed into it, it raises a KeyError.
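The KeyError behaviour can be illustrated with a short sketch. The frame df and the missing column name 'Country' are made up for this example; checking df.columns first is just one possible way to avoid the exception.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({'Name': ['Jill', 'Rachel'], 'City': ['Tokyo', 'Texas']})

try:
    df['Country']          # there is no 'Country' column
    found = True
except KeyError as err:
    found = False
    print('Column not found:', err)

# Checking membership first avoids the exception
if 'City' in df.columns:
    print(df['City'])
```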
Select a Column by Name :
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Select a single column name using [ ]
selectData = dfObj['Name']
print(selectData)
Output :
a Jill
b Rachel
c Kirti
d Veena
e Lucifer
f Pablo
g Lionel
Name: Name, dtype: object
Select multiple columns by Name :
To select multiple columns we just pass a list of their names into [ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np
#data
students = [
('Jill', 16, 'Tokyo',),
('Rachel', 38, 'Texas',),
('Kirti', 39, 'New York'),
('Veena', 40, 'Texas',),
('Lucifer', np.nan, 'Texas'),
('Pablo', 30, 'New York'),
('Lionel', 45, 'Colombia',)]
#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'], index=['a','b','c','d','e','f','g'])
#Select multiple columns using [ ]
selectData = dfObj[['Name','City']]
print(selectData)
Output :
Name City
a Jill Tokyo
b Rachel Texas
c Kirti New York
d Veena Texas
e Lucifer Texas
f Pablo New York
g Lionel Colombia