How to select rows and columns by Name or Index in a DataFrame using loc and iloc in Python?
We will discuss several methods to select rows and columns in a DataFrame. To select rows or columns we can use loc[ ], iloc[ ], or the [ ] operator.
To demonstrate the various methods we will be using the following dataset :
      Name  Score      City
0     Jill   16.0     Tokyo
1   Rachel   38.0     Texas
2    Kirti   39.0  New York
3    Veena   40.0     Texas
4  Lucifer    NaN     Texas
5    Pablo   30.0  New York
6   Lionel   45.0  Colombia
Method-1 : DataFrame.loc | Select Column & Rows by Name
We can use the loc[ ] indexer to select rows and columns by their labels.
Syntax :
dataFrame.loc[<ROWS RANGE> , <COLUMNS RANGE>]
We pass the row labels and the column labels to select. Each can be a single label, a list of labels, or a range of labels.
If we pass ':' in place of a value, all the rows or columns are selected.
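As a minimal sketch of the syntax above, the snippet below shows what ':' does in each place. The DataFrame `df` and its three rows are made up for this illustration and are not the dataset used in the rest of the article.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti'], 'Score': [16, 38, 39]},
    index=['a', 'b', 'c'])

# ':' in the row place keeps every row of the 'Score' column
one_column = df.loc[:, 'Score']

# ':' in the column place keeps every column of row 'b'
one_row = df.loc['b', :]

# ':' in both places returns the whole DataFrame
everything = df.loc[:, :]

print(one_column.tolist())
print(one_row.tolist())
print(everything.shape)
```

A single label returns a Series, while ':' in both places returns a DataFrame of the same shape as the original.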
Select a Column by Name in DataFrame using loc[ ] :
As we need to select a single column only, we pass ':' in place of the row range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'])

#Selecting the 'Score' column
columnD = dfObj.loc[:,'Score']
print(columnD)

Output :
0    16.0
1    38.0
2    39.0
3    40.0
4     NaN
5    30.0
6    45.0
Name: Score, dtype: float64
Select multiple Columns by Name in DataFrame using loc[ ] :
To select multiple columns, we pass the column names as a list in place of the column range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple columns i.e. the 'Name' and 'Score' columns
columnD = dfObj.loc[:,['Name','Score']]
print(columnD)

Output :
      Name  Score
a     Jill   16.0
b   Rachel   38.0
c    Kirti   39.0
d    Veena   40.0
e  Lucifer    NaN
f    Pablo   30.0
g   Lionel   45.0
Select a single row by Index Label in DataFrame using loc[ ] :
Just like a column, we can select a single row by passing its index label and passing ':' in place of the column range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single row i.e. the 'b' row
selectData = dfObj.loc['b',:]
print(selectData)

Output :
Name     Rachel
Score      38.0
City      Texas
Name: b, dtype: object
Select multiple rows by Index labels in DataFrame using loc[ ] :
To select multiple rows, we pass their index labels as a list in place of the row range.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows i.e. 'd' and 'g'
selectData = dfObj.loc[['d','g'],:]
print(selectData)

Output :
     Name  Score      City
d   Veena   40.0     Texas
g  Lionel   45.0  Colombia
Select multiple rows & columns by Labels in DataFrame using loc[ ] :
To select multiple rows and columns, we pass a list of row labels and a list of column labels.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows and columns i.e. rows 'd' and 'g', columns 'Name' and 'City'
selectData = dfObj.loc[['d','g'],['Name','City']]
print(selectData)

Output :
     Name      City
d   Veena     Texas
g  Lionel  Colombia
Method-2 : DataFrame.iloc | Select Column Indexes & Rows Index Positions
We can use the iloc[ ] indexer to select rows and columns. It is quite similar to loc[ ], but it works with integer positions instead of labels.
Syntax :
dataFrame.iloc[<ROWS INDEX RANGE> , <COLUMNS INDEX RANGE>]
iloc[ ] selects rows and columns in the DataFrame by the index positions we pass, counting from 0. Just like loc[ ], if ':' is passed, all the rows or columns are selected.
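To make the idea of positions concrete, here is a small sketch. The DataFrame `df` and its contents are hypothetical and chosen only for this illustration. It shows that iloc[ ] ignores the index labels entirely, and that negative positions count from the end, as they do with Python lists.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame(
    {'Name': ['Jill', 'Rachel', 'Kirti'], 'Score': [16, 38, 39]},
    index=['a', 'b', 'c'])

# Position 0 is the first row, whatever its label is
first_row = df.iloc[0, :]

# Negative positions count from the end: last row, column at position 1
last_score = df.iloc[-1, 1]

print(first_row['Name'])
print(last_score)
```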
Select a single column by Index position :
We pass the index position of the column, with ':' in place of the row index.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single column at the index 2
selectData = dfObj.iloc[:,2]
print(selectData)

Output :
a       Tokyo
b       Texas
c    New York
d       Texas
e       Texas
f    New York
g    Colombia
Name: City, dtype: object
Select multiple columns by Indices in a list :
To select multiple columns by indices, we pass the indices as a list in place of the column value.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple columns at the index 0 & 2
selectData = dfObj.iloc[:,[0,2]]
print(selectData)

Output :
      Name      City
a     Jill     Tokyo
b   Rachel     Texas
c    Kirti  New York
d    Veena     Texas
e  Lucifer     Texas
f    Pablo  New York
g   Lionel  Colombia
Select multiple columns by Index range :
To select multiple columns by index range, we pass the range as start:stop in place of the column value. The stop position is not included.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting the columns at index 1 and 2 (the stop index 3 is excluded)
selectData = dfObj.iloc[:,1:3]
print(selectData)

Output :
   Score      City
a   16.0     Tokyo
b   38.0     Texas
c   39.0  New York
d   40.0     Texas
e    NaN     Texas
f   30.0  New York
g   45.0  Colombia
Select single row by Index Position :
Just like columns, we can pass the index position of a row to select it.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting a single row at index 2
selectData = dfObj.iloc[2,:]
print(selectData)

Output :
Name        Kirti
Score        39.0
City     New York
Name: c, dtype: object
Select multiple rows by Index positions in a list :
To do this, we pass the index positions to select as a list.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows by passing a list i.e. 2 & 5
selectData = dfObj.iloc[[2,5],:]
print(selectData)

Output :
    Name  Score      City
c  Kirti   39.0  New York
f  Pablo   30.0  New York
Select multiple rows by Index range :
To select a range of rows, we pass the start and stop positions separated by ':'. The stop position is not included.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows by range i.e. 2 to 4 (the stop index 5 is excluded)
selectData = dfObj.iloc[2:5,:]
print(selectData)

Output :
      Name  Score      City
c    Kirti   39.0  New York
d    Veena   40.0     Texas
e  Lucifer    NaN     Texas
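Position ranges and label ranges treat their end point differently, which is easy to trip over. The sketch below contrasts the two. The single-column DataFrame `df` and its `Value` column are made up for this illustration; only the index labels 'a' to 'g' match the example above.

```python
import pandas as pd

# Hypothetical data: seven rows labelled 'a' to 'g'
df = pd.DataFrame({'Value': [10, 20, 30, 40, 50, 60, 70]},
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# iloc: positions 2, 3 and 4 (the stop position 5 is excluded)
by_position = df.iloc[2:5, :]

# loc: labels 'c', 'd' and 'e' (the stop label 'e' is included)
by_label = df.loc['c':'e', :]

print(by_position.index.tolist())
print(by_label.index.tolist())
```

Both selections return the rows 'c', 'd' and 'e', even though the two ranges are written differently.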
Select multiple rows & columns by Index positions :
To select multiple rows and columns at once, we pass a list of row positions and a list of column positions.
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Selecting multiple rows and columns
selectData = dfObj.iloc[[1,2],[1,2]]
print(selectData)

Output :
   Score      City
b   38.0     Texas
c   39.0  New York
Method-3 : Selecting Columns in DataFrame using [ ] operator
The [ ] operator selects columns according to the name provided to it. However, when a non-existent label is passed to it, it raises a KeyError.
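A short sketch of the KeyError behaviour follows. The DataFrame `df` and the missing column name 'Age' are hypothetical and used only for this illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({'Name': ['Jill', 'Rachel'], 'Score': [16, 38]})

try:
    df['Age']            # 'Age' is not a column in this DataFrame
    found = True
except KeyError:
    found = False

# Checking membership first avoids the exception altogether
has_age = 'Age' in df.columns

print(found, has_age)
```

Testing with `in df.columns` is one way to guard a lookup when the column name comes from user input or another data source.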
Select a Column by Name :
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Select a single column by name using [ ]
selectData = dfObj['Name']
print(selectData)

Output :
a       Jill
b     Rachel
c      Kirti
d      Veena
e    Lucifer
f      Pablo
g     Lionel
Name: Name, dtype: object
Select multiple columns by Name :
To select multiple columns we just pass a list of their names into [ ].
So, let’s see the implementation of it.
#Program :
import pandas as pd
import numpy as np

#data
students = [('Jill', 16, 'Tokyo'),
            ('Rachel', 38, 'Texas'),
            ('Kirti', 39, 'New York'),
            ('Veena', 40, 'Texas'),
            ('Lucifer', np.nan, 'Texas'),
            ('Pablo', 30, 'New York'),
            ('Lionel', 45, 'Colombia')]

#Creating the dataframe object
dfObj = pd.DataFrame(students, columns=['Name','Score','City'],
                     index=['a','b','c','d','e','f','g'])

#Select multiple columns using [ ]
selectData = dfObj[['Name','City']]
print(selectData)

Output :
      Name      City
a     Jill     Tokyo
b   Rachel     Texas
c    Kirti  New York
d    Veena     Texas
e  Lucifer     Texas
f    Pablo  New York
g   Lionel  Colombia