{"id":6480,"date":"2021-05-20T18:55:00","date_gmt":"2021-05-20T13:25:00","guid":{"rendered":"https:\/\/python-programs.com\/?p=6480"},"modified":"2021-11-22T18:45:26","modified_gmt":"2021-11-22T13:15:26","slug":"pandas-6-different-ways-to-iterate-over-rows-in-a-dataframe-update-while-iterating-row-by-row","status":"publish","type":"post","link":"https:\/\/python-programs.com\/pandas-6-different-ways-to-iterate-over-rows-in-a-dataframe-update-while-iterating-row-by-row\/","title":{"rendered":"Pandas: 6 Different ways to iterate over rows in a Dataframe & Update while iterating row by row"},"content":{"rendered":"
This tutorial walks through six different techniques for iterating over the rows of a DataFrame. It then explains how to update the contents of a DataFrame while iterating over it row by row.<\/p>\n
Let's first create a DataFrame to use in the examples:<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\nprint(empDfObj)<\/pre>\nOutput:<\/strong><\/p>\n
      Name  Age    City  Experience\r\na   Shikha   34  Mumbai           5\r\nb    Rekha   31   Delhi           7\r\nc  Shishir   16  Punjab          11\r\n<\/pre>\n<\/a>Iterate over rows of a dataframe using DataFrame.iterrows()<\/h2>\n
The DataFrame class provides a member function, iterrows(), which we will use to iterate over the rows of a DataFrame.<\/p>\n
DataFrame.iterrows()<\/h3>\n
DataFrame.iterrows() returns an iterator over all the rows of a DataFrame.<\/p>\n
For each row, it yields a tuple containing the index label and the row contents as a Series.<\/p>\n
Let's use it in an example:<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\nfor (index_label, row_series) in empDfObj.iterrows():\r\n print('Row Index label : ', index_label)\r\n print('Row Content as Series : ', row_series.values)<\/pre>\nOutput:<\/strong><\/p>\n
Row Index label : a\r\nRow Content as Series : ['Shikha' 34 'Mumbai' 5]\r\nRow Index label : b\r\nRow Content as Series : ['Rekha' 31 'Delhi' 7]\r\nRow Index label : c\r\nRow Content as Series : ['Shishir' 16 'Punjab' 11]\r\n<\/pre>\nNote:<\/strong><\/p>\n
\n
- iterrows() does not preserve data types across a row: each row is returned as a Series, so values from columns with different dtypes may be converted to a common dtype.<\/li>\n
- Modifying the row Series returned by iterrows() generally does not change the original DataFrame, because the Series is usually a copy of the row data.<\/li>\n<\/ul>\n
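The dtype note above can be illustrated with a minimal sketch. The two-column frame and the names df, count and ratio are assumptions made up for this illustration; they are not part of the employee example. Because the row Series has to hold an integer and a float together, the integer is expected to come back as a float.

```python
import pandas as pd

# Hypothetical frame for illustration: one integer column and one float column
df = pd.DataFrame({'count': [1, 2], 'ratio': [0.5, 1.5]})

# Take the first (index_label, row_series) pair produced by iterrows()
index_label, row_series = next(df.iterrows())

print(df['count'].dtype)    # integer dtype of the column inside the DataFrame
print(row_series.dtype)     # dtype of the row Series after conversion
print(row_series['count'])  # the integer value now appears as a float
```

If the original types matter, itertuples() is usually the safer choice, since it yields the values of each row without packing them into a single Series.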
<\/a>Iterate over rows of a dataframe using DataFrame.itertuples()<\/h2>\n
DataFrame.itertuples()<\/strong><\/p>\n
DataFrame.itertuples() yields a named tuple for each row containing all the column names and their value for that row.<\/p>\n
Let\u2019s use it,<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# Iterate over the Dataframe rows as named tuples\r\nfor namedTuple in empDfObj.itertuples():\r\n #Print row contents inside the named tuple\r\n print(namedTuple)<\/pre>\nOutput:<\/strong><\/p>\n
Pandas(Index='a', Name='Shikha', Age=34, City='Mumbai', Experience=5)\r\nPandas(Index='b', Name='Rekha', Age=31, City='Delhi', Experience=7)\r\nPandas(Index='c', Name='Shishir', Age=16, City='Punjab', Experience=11)\r\n<\/pre>\nAs the output shows, itertuples() returns a named tuple for every row. Individual values can be accessed by position, for example:<\/p>\n
For the first value, which is the index label,<\/p>\n
namedTuple[0]<\/code><\/p>\n
For the second value, which is the Name column,<\/p>\n
namedTuple[1]<\/code><\/p>\n
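As a small sketch of the access patterns described above, the loop below reads each value both by position and by attribute name. It reuses the employee data from this tutorial; the dictionary names_by_label is a helper introduced only for this illustration.

```python
import pandas as pd

employees = [('Shikha', 34, 'Mumbai', 5),
             ('Rekha', 31, 'Delhi', 7),
             ('Shishir', 16, 'Punjab', 11)]
empDfObj = pd.DataFrame(employees,
                        columns=['Name', 'Age', 'City', 'Experience'],
                        index=['a', 'b', 'c'])

# Helper for this illustration: maps each index label to the employee name
names_by_label = {}

for row in empDfObj.itertuples():
    # Position 0 holds the index label, position 1 the first column (Name)
    label_by_position = row[0]
    name_by_position = row[1]
    # The same values are also available as attributes of the named tuple
    assert label_by_position == row.Index
    assert name_by_position == row.Name
    names_by_label[row.Index] = row.Name
    print(row.Index, row.Name, row.City)
```

Attribute access tends to read more clearly than positional access, and it keeps working if the column order changes.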
<\/a>Named Tuples without index<\/h3>\n
If we pass the argument index=False, the named tuple does not include the index label.<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# Iterate over the Dataframe rows as named tuples without index\r\nfor namedTuple in empDfObj.itertuples(index=False):\r\n # Print row contents inside the named tuple\r\n print(namedTuple)<\/pre>\nOutput:<\/strong><\/p>\n
Pandas(Name='Shikha', Age=34, City='Mumbai', Experience=5)\r\nPandas(Name='Rekha', Age=31, City='Delhi', Experience=7)\r\nPandas(Name='Shishir', Age=16, City='Punjab', Experience=11)\r\n<\/pre>\n<\/a>Named Tuples with custom names<\/h3>\n
By default the named tuple is called Pandas. To use a different name, pass it through the name argument:<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# Give Custom Name to the tuple while Iterating over the Dataframe rows\r\nfor row in empDfObj.itertuples(name='Employee'):\r\n # Print row contents inside the named tuple\r\n print(row)<\/pre>\nOutput:<\/strong><\/p>\n
Employee(Index='a', Name='Shikha', Age=34, City='Mumbai', Experience=5)\r\nEmployee(Index='b', Name='Rekha', Age=31, City='Delhi', Experience=7)\r\nEmployee(Index='c', Name='Shishir', Age=16, City='Punjab', Experience=11)\r\n<\/pre>\n<\/a>Iterate over rows in dataframe as Dictionary<\/h2>\n
The same itertuples() method can also be used to iterate over the rows as dictionaries: convert each named tuple to a dictionary with _asdict() and then access the values by column label.<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# itertuples() yields a named tuple for each row\r\nfor row in empDfObj.itertuples(name='Employee'):\r\n # Convert named tuple to dictionary\r\n dictRow = row._asdict()\r\n # Print dictionary\r\n print(dictRow)\r\n # Access elements from dict i.e. row contents\r\n print(dictRow['Name'] , ' is from ' , dictRow['City'])<\/pre>\nOutput:<\/strong><\/p>\n
{'Index': 'a', 'Name': 'Shikha', 'Age': 34, 'City': 'Mumbai', 'Experience': 5}\r\nShikha is from Mumbai\r\n{'Index': 'b', 'Name': 'Rekha', 'Age': 31, 'City': 'Delhi', 'Experience': 7}\r\nRekha is from Delhi\r\n{'Index': 'c', 'Name': 'Shishir', 'Age': 16, 'City': 'Punjab', 'Experience': 11}\r\nShishir is from Punjab\r\n<\/pre>\n<\/a>Iterate over rows in dataframe using index position and iloc<\/h2>\n
We loop over the row positions, from 0 up to the number of rows minus one, and access each row by position using iloc[].<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# Loop through rows of dataframe by index i.e. from 0 to number of rows\r\nfor i in range(0, empDfObj.shape[0]):\r\n # get row contents as series using iloc[] and index position of row\r\n rowSeries = empDfObj.iloc[i]\r\n # print row contents\r\n print(rowSeries.values)<\/pre>\nOutput:<\/strong><\/p>\n
['Shikha' 34 'Mumbai' 5]\r\n['Rekha' 31 'Delhi' 7]\r\n['Shishir' 16 'Punjab' 11]\r\n<\/pre>\n<\/a>Iterate over rows in dataframe in reverse using index position and iloc<\/h2>\n
Here we loop from the position of the last row down to 0 and access each row by position using iloc[].<\/p>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# Loop through rows of dataframe by index in reverse i.e. from last row to row at 0th index.\r\nfor i in range(empDfObj.shape[0] - 1, -1, -1):\r\n # get row contents as series using iloc[] and index position of row\r\n rowSeries = empDfObj.iloc[i]\r\n # print row contents\r\n print(rowSeries.values)<\/pre>\nOutput:<\/strong><\/p>\n
['Shishir' 16 'Punjab' 11]\r\n['Rekha' 31 'Delhi' 7]\r\n['Shikha' 34 'Mumbai' 5]\r\n<\/pre>\n<\/a>Iterate over rows in dataframe using index labels and loc[]<\/h2>\n
import pandas as pd\r\nemployees = [('Shikha', 34, 'Mumbai', 5) ,\r\n ('Rekha', 31, 'Delhi' , 7) ,\r\n ('Shishir', 16, 'Punjab', 11)\r\n ]\r\n# Create a DataFrame object\r\nempDfObj = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c'])\r\n\r\n# loop through all the names in index label sequence of dataframe\r\nfor index in empDfObj.index:\r\n # For each index label, access the row contents as series\r\n rowSeries = empDfObj.loc[index]\r\n # print row contents\r\n print(rowSeries.values)<\/pre>\nOutput:<\/strong><\/p>\n
['Shikha' 34 'Mumbai' 5]\r\n['Rekha' 31 'Delhi' 7]\r\n['Shishir' 16 'Punjab' 11]\r\n<\/pre>\n<\/a>Update the contents of a dataframe while iterating row by row<\/h2>\n
The row Series returned by DataFrame.iterrows() is generally a copy of the row data, so modifying it typically has no effect on the original DataFrame. To update the DataFrame while iterating, use the index label yielded by iterrows() to write to the DataFrame itself through the at[] indexer.<\/p>\n
Let\u2019s see an example,<\/p>\n
Suppose we have the following DataFrame:<\/p>\n
import pandas as pd\r\n\r\n\r\n# List of Tuples\r\nsalaries = [(11, 5, 70000, 1000) ,\r\n (12, 7, 72200, 1100) ,\r\n (13, 11, 84999, 1000)\r\n ]\r\n# Create a DataFrame object\r\nsalaryDfObj = pd.DataFrame(salaries, columns=['ID', 'Experience' , 'Salary', 'Bonus'])\r\nprint(salaryDfObj)<\/pre>\nOutput:<\/strong><\/p>\n
ID Experience Salary Bonus\r\n0 11 5 70000 1000\r\n1 12 7 72200 1100\r\n2 13 11 84999 1000\r\n<\/pre>\nNow we will double each value in the \u2018Bonus\u2019 column while iterating over the DataFrame row by row.<\/p>\n
import pandas as pd\r\n\r\n\r\n# List of Tuples\r\nsalaries = [(11, 5, 70000, 1000) ,\r\n (12, 7, 72200, 1100) ,\r\n (13, 11, 84999, 1000)\r\n ]\r\n# Create a DataFrame object\r\nsalaryDfObj = pd.DataFrame(salaries, columns=['ID', 'Experience' , 'Salary', 'Bonus'])\r\n# Iterate over the dataframe row by row\r\nfor index_label, row_series in salaryDfObj.iterrows():\r\n # For each row, update the 'Bonus' value to its double\r\n salaryDfObj.at[index_label , 'Bonus'] = row_series['Bonus'] * 2\r\nprint(salaryDfObj)\r\n<\/pre>\nOutput:<\/strong><\/p>\n
ID Experience Salary Bonus\r\n0 11 5 70000 2000\r\n1 12 7 72200 2200\r\n2 13 11 84999 2000<\/pre>\n
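The same update can also be sketched by combining two techniques from this tutorial: looping over the index labels and reading and writing single cells through at[]. This is one possible variant under the same salary data, not the only way to do it. Every read and write goes directly to the DataFrame, so no row copy is involved.

```python
import pandas as pd

# Same salary data as in the example above
salaries = [(11, 5, 70000, 1000),
            (12, 7, 72200, 1100),
            (13, 11, 84999, 1000)]
salaryDfObj = pd.DataFrame(salaries,
                           columns=['ID', 'Experience', 'Salary', 'Bonus'])

# Loop over the index labels and update each 'Bonus' cell in place
for index_label in salaryDfObj.index:
    current_bonus = salaryDfObj.at[index_label, 'Bonus']
    salaryDfObj.at[index_label, 'Bonus'] = current_bonus * 2

print(salaryDfObj)
```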
Conclusion:<\/h3>\n
In this article, we covered several ways to iterate over the rows of a DataFrame and how to update its contents while iterating row by row.<\/p>\n","protected":false},"excerpt":{"rendered":"
This tutorial walks through six different techniques for iterating over the rows of a DataFrame. It then explains how to update the contents of a DataFrame while iterating over it row by row. Iterate over rows of a dataframe using DataFrame.iterrows() Iterate over rows of a dataframe using DataFrame.itertuples() Named Tuples …<\/p>\n