iloc[] Function in Python:
Python is a popular language for data analysis, largely because of its ecosystem of data-centric packages. Pandas is one of these packages, and it greatly simplifies data import and analysis.
Pandas provides a dedicated indexer for retrieving rows from a DataFrame by position. When the index labels of a DataFrame are something other than the default integer sequence 0, 1, 2, 3, ..., n, or when the user does not know the index labels, DataFrame.iloc[] can be used. Rows are selected by their integer position (counting from 0), which is independent of the index labels and is not displayed in the DataFrame.
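The difference between a position and a label can be sketched as follows. This is a minimal illustration, not part of the original example: the frame is built by hand from a few values of the cereal dataset used in this article, and the variable name `cereals` and the choice of cereal names as index labels are assumptions made for the demonstration.

```python
import pandas as pd

# Small illustrative frame; the values are a few cells from the cereal data.
# The index holds string labels instead of the default 0, 1, 2, ...
cereals = pd.DataFrame(
    {
        "mfr": ["N", "Q", "K"],
        "calories": [70, 120, 70],
    },
    index=["100% Bran", "100% Natural Bran", "All-Bran"],
)

# iloc ignores the labels and counts rows from 0.
first_row = cereals.iloc[0]

# loc needs the actual label to reach the same row.
same_row = cereals.loc["100% Bran"]

print(first_row.equals(same_row))
```

Both lookups reach the same row; only `iloc` works without knowing what the labels are.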
The iloc[] indexer also allows us to select a specific cell of a dataset, that is, a single value that belongs to a specific row and column of a DataFrame.
Given the integer position of a row and the integer position of a column, iloc[] returns the value stored at their intersection.
Keep in mind that iloc[] is purely position-based: it accepts an integer, a list of integers, or an integer slice, and it does not accept index or column labels.
Boolean masks are supported only as a plain boolean list or array whose length matches the axis; a boolean Series cannot be passed to iloc[] directly.
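As a rough sketch of single-cell selection, the snippet below reads one value by row and column position and then shows that a column label is rejected. The hand-built `cereals` frame and the `label_rejected` flag are hypothetical names used only for this illustration; the exact exception type raised for a label may differ between pandas versions, so several are caught.

```python
import pandas as pd

# Small illustrative frame with values taken from the cereal data.
cereals = pd.DataFrame(
    {
        "name": ["100% Bran", "100% Natural Bran", "All-Bran"],
        "mfr": ["N", "Q", "K"],
        "calories": [70, 120, 70],
    }
)

# Row position 1, column position 2: the "calories" value of the second row.
cell = cereals.iloc[1, 2]
print(cell)  # 120

# A column label is not a position, so iloc refuses it.
try:
    cereals.iloc[1, "calories"]
    label_rejected = False
except (ValueError, TypeError, IndexError):
    label_rejected = True
print(label_rejected)
```

When the column is known by name rather than by position, `loc` is the label-based counterpart.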
Syntax:
dataframe.iloc[]
For Example:
Let us look at the first 5 rows of the cereal dataset to understand dataframe.iloc[].
Apply the head() function to the dataset to get its first 5 rows.
    # Import the pandas module as pd
    import pandas as pd

    # Load the dataset with read_csv() by passing the file name as an argument,
    # and store the result in a variable.
    cereal_dataset = pd.read_csv('cereal.csv')

    # Apply head() to the dataset to get the first 5 rows.
    cereal_dataset.head()
Output:
| | name | mfr | type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | vitamins | shelf | weight | cups | rating |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 100% Bran | N | C | 70 | 4 | 1 | 130 | 10.0 | 5.0 | 6 | 280 | 25 | 3 | 1.0 | 0.33 | 68.402973 |
1 | 100% Natural Bran | Q | C | 120 | 3 | 5 | 15 | 2.0 | 8.0 | 8 | 135 | 0 | 3 | 1.0 | 1.00 | 33.983679 |
2 | All-Bran | K | C | 70 | 4 | 1 | 260 | 9.0 | 7.0 | 5 | 320 | 25 | 3 | 1.0 | 0.33 | 59.425505 |
3 | All-Bran with Extra Fiber | K | C | 50 | 4 | 0 | 140 | 14.0 | 8.0 | 0 | 330 | 25 | 3 | 1.0 | 0.50 | 93.704912 |
4 | Almond Delight | R | C | 110 | 2 | 2 | 200 | 1.0 | 14.0 | 8 | -1 | 25 | 3 | 1.0 | 0.75 | 34.384843 |
If you want to retrieve the row at integer position 2 (the third row), that is, one value from every column of the dataset, do as shown below:
    # Import the pandas module as pd
    import pandas as pd

    # Load the dataset with read_csv() by passing the file name as an argument,
    # and store the result in a variable.
    cereal_dataset = pd.read_csv('cereal.csv')

    # Use iloc[] to get the row at position 2 (one value per column)
    # and print it.
    print(cereal_dataset.iloc[2])
Output:
    name        All-Bran
    mfr                K
    type               C
    calories          70
    protein            4
    fat                1
    sodium           260
    fiber              9
    carbo              7
    sugars             5
    potass           320
    vitamins          25
    shelf              3
    weight             1
    cups            0.33
    rating       59.4255
    Name: 2, dtype: object
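The output above is a Series because a single integer was passed. A short sketch of the difference between passing an integer and passing a one-element list is given below; the hand-built `cereals` frame and the variable names are assumptions for illustration only.

```python
import pandas as pd

# Small illustrative frame with values taken from the cereal data.
cereals = pd.DataFrame(
    {
        "name": ["100% Bran", "100% Natural Bran", "All-Bran"],
        "mfr": ["N", "Q", "K"],
        "calories": [70, 120, 70],
    }
)

# A single integer returns the row as a Series (columns become the index).
row_as_series = cereals.iloc[2]

# A one-element list returns a DataFrame that still has one row.
row_as_frame = cereals.iloc[[2]]

print(type(row_as_series).__name__)
print(type(row_as_frame).__name__)
print(row_as_frame.shape)
```

The list form is useful when later code expects a DataFrame rather than a Series.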
If you want to get the rows at positions 2, 3 and 4, then do as below:
    # Import the pandas module as pd
    import pandas as pd

    # Load the dataset with read_csv() by passing the file name as an argument,
    # and store the result in a variable.
    cereal_dataset = pd.read_csv('cereal.csv')

    # Use iloc[] with a slice to get the rows at positions 2, 3 and 4.
    # The stop position of the slice (5) is excluded.
    cereal_dataset.iloc[2:5]
Output:
| | name | mfr | type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | vitamins | shelf | weight | cups | rating |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | All-Bran | K | C | 70 | 4 | 1 | 260 | 9.0 | 7.0 | 5 | 320 | 25 | 3 | 1.0 | 0.33 | 59.425505 |
3 | All-Bran with Extra Fiber | K | C | 50 | 4 | 0 | 140 | 14.0 | 8.0 | 0 | 330 | 25 | 3 | 1.0 | 0.50 | 93.704912 |
4 | Almond Delight | R | C | 110 | 2 | 2 | 200 | 1.0 | 14.0 | 8 | -1 | 25 | 3 | 1.0 | 0.75 | 34.384843 |
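Row slices follow ordinary Python slicing rules, so negative positions and steps also work. The sketch below shows a few variants on a hand-built frame; the `cereals` frame and the variable names are hypothetical and only serve the illustration.

```python
import pandas as pd

# Five illustrative rows with values taken from the cereal data.
cereals = pd.DataFrame(
    {
        "name": [
            "100% Bran",
            "100% Natural Bran",
            "All-Bran",
            "All-Bran with Extra Fiber",
            "Almond Delight",
        ],
        "calories": [70, 120, 70, 50, 110],
    }
)

middle = cereals.iloc[2:5]       # positions 2, 3, 4 (stop is excluded)
last_two = cereals.iloc[-2:]     # negative positions count from the end
every_other = cereals.iloc[::2]  # step of 2: positions 0, 2, 4
picked = cereals.iloc[[0, 3]]    # explicit list of positions

print(list(middle.index))
print(list(last_two.index))
print(list(every_other.index))
print(list(picked["calories"]))
```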
For columns:
If you want to get the columns at positions 2 and 3 (here, type and calories), then do as below:
Syntax:
dataframe.iloc[:, startcolumn : endcolumn]
Example:
    # Import the pandas module as pd
    import pandas as pd

    # Load the dataset with read_csv() by passing the file name as an argument,
    # and store the result in a variable.
    cereal_dataset = pd.read_csv('cereal.csv')

    # Use iloc[] with a column slice to get the columns at positions 2 and 3.
    # The stop position of the slice (4) is excluded; ":" keeps every row.
    cereal_dataset.iloc[:, 2:4]
Output:
       type  calories
    0     C        70
    1     C       120
    2     C        70
    3     C        50
    4     C       110
    ..  ...       ...
    72    C       110
    73    C       110
    74    C       100
    75    C       100
    76    C       110
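Row and column selections can be combined in one call. The following is a small sketch on a hand-built frame; the `cereals` frame and the names `block` and `corners` are assumptions made for illustration.

```python
import pandas as pd

# Small illustrative frame with values taken from the cereal data.
cereals = pd.DataFrame(
    {
        "name": ["100% Bran", "100% Natural Bran", "All-Bran"],
        "mfr": ["N", "Q", "K"],
        "type": ["C", "C", "C"],
        "calories": [70, 120, 70],
    }
)

# Rows at positions 0 and 1, columns at positions 2 and 3.
block = cereals.iloc[0:2, 2:4]

# Lists pick arbitrary positions: rows 0 and 2, columns 0 and 3.
corners = cereals.iloc[[0, 2], [0, 3]]

print(list(block.columns))
print(corners)
```

The first entry inside the brackets always selects rows and the second selects columns.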
Brief Recall:
In this article, we learned about the pandas iloc[] indexer and how it works.
- It can be used to retrieve records from datasets based on integer positions.
- By passing a list or a slice of positions to iloc[], multiple records can be fetched.
- iloc[] works with integer positions, not with index or column labels.