{"id":5220,"date":"2023-10-28T19:08:27","date_gmt":"2023-10-28T13:38:27","guid":{"rendered":"https:\/\/python-programs.com\/?p=5220"},"modified":"2023-11-10T12:03:49","modified_gmt":"2023-11-10T06:33:49","slug":"python-add-column-to-dataframe-in-pandas-based-on-other-column-or-list-or-default-value","status":"publish","type":"post","link":"https:\/\/python-programs.com\/python-add-column-to-dataframe-in-pandas-based-on-other-column-or-list-or-default-value\/","title":{"rendered":"Python: Add column to dataframe in Pandas ( based on other column or list or default value)"},"content":{"rendered":"
In this tutorial, we discuss different ways to add columns to a dataframe in pandas, including adding a new column to an existing DataFrame from a list, from a default value, or based on other columns.<\/p>\n
Pandas is a data analytics library created for Python to perform data manipulation and data analysis. The library is made up of data structures and operations for working with numerical tables and time series.<\/p>\n
Basically, there are three ways to add columns to a pandas dataframe: the [] operator, the assign() function, and the insert() function.<\/p>\n
We will discuss them one by one.<\/p>\n
First, let’s create a dataframe object:<\/p>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students,\r\n columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\nprint(df_obj )<\/pre>\nOutput:<\/strong><\/p>\n
Name Age City Country\r\na Rakesh 34 Agra India\r\nb Rekha 30 Pune India\r\nc Suhail 31 Mumbai India\r\nd Neelam 32 Bangalore India\r\ne Jay 16 Bengal India\r\nf Mahak 17 Varanasi India\r\n<\/pre>\nDo Check:<\/span><\/p>\n
\n
- Pandas: Delete last column of dataframe in python<\/a><\/li>\n
- Pandas: Loop or Iterate over all or certain columns of a dataframe<\/a><\/li>\n<\/ul>\n
<\/a>Add column to dataframe in pandas using [] operator<\/h3>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students,\r\n columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\n\r\n# Add column with Name Score\r\ndf_obj['Score'] = [10, 20, 45, 33, 22, 11]\r\nprint(df_obj )<\/pre>\nOutput:<\/strong><\/p>\n
Name Age City Country Score\r\na Rakesh 34 Agra India 10\r\nb Rekha 30 Pune India 20\r\nc Suhail 31 Mumbai India 45\r\nd Neelam 32 Bangalore India 33\r\ne Jay 16 Bengal India 22\r\nf Mahak 17 Varanasi India 11\r\n\r\n<\/pre>\nIn the above example, we added one extra column, ‘Score’, to the dataframe, using the values in the list. Because the dataframe had no column named ‘Score’, a new one was added; if a column with that name already exists, its values are replaced instead.<\/p>\n
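The add-or-replace behaviour of the [] operator described above can be checked with a small sketch. The three-row frame and the variable names here are hypothetical and chosen only for illustration; they are not the article's full dataset.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha', 'Suhail']},
                  index=['a', 'b', 'c'])

# 'Score' does not exist yet, so this adds a new column
df['Score'] = [10, 20, 45]
columns_after_add = list(df.columns)

# 'Score' exists now, so this replaces its values instead of adding a column
df['Score'] = [11, 21, 46]
columns_after_replace = list(df.columns)

# A list whose length differs from the number of rows is rejected
try:
    df['Score'] = [1, 2]
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(df)
print(length_error)
```

Note that the list must contain exactly one value per row; otherwise pandas raises a ValueError and the column is left as it was.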
<\/a>Add new column to DataFrame with same default value<\/h3>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students,\r\n columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\n\r\ndf_obj['Total'] = 100\r\nprint(df_obj)<\/pre>\nOutput:<\/strong><\/p>\n
Name Age City Country Total\r\na Rakesh 34 Agra India 100\r\nb Rekha 30 Pune India 100\r\nc Suhail 31 Mumbai India 100\r\nd Neelam 32 Bangalore India 100\r\ne Jay 16 Bengal India 100\r\nf Mahak 17 Varanasi India 100\r\n<\/pre>\nIn the above example, we added a new column \u2018Total\u2019 with the same value, 100, in every row.<\/p>\n
<\/a>Add column based on another column<\/h3>\n
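Let\u2019s add a new column \u2018Percentage\u2019 where the entry at each index is calculated from the values of other columns at that index. This example assumes the dataframe already has a \u2018Marks\u2019 column and a \u2018Total\u2019 column; the output shown comes from a different sample dataset in which \u2018Total\u2019 is 50:<\/p>\n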
df_obj['Percentage'] = (df_obj['Marks'] \/ df_obj['Total']) * 100\r\ndf_obj<\/pre>\nOutput:<\/strong><\/p>\n
Name Age City Country Marks Total Percentage\r\na jack 34 Sydeny Australia 10 50 20.0\r\nb Riti 30 Delhi India 20 50 40.0\r\nc Vikas 31 Mumbai India 45 50 90.0\r\nd Neelu 32 Bangalore India 33 50 66.0\r\ne John 16 New York US 22 50 44.0\r\nf Mike 17 las vegas US 11 50 22.0<\/pre>\n<\/a>Append column to dataFrame using assign() function<\/h3>\n
For this, we use the same dataframe that we created at the start.<\/p>\n
Syntax:<\/strong><\/p>\n
DataFrame.assign(**kwargs)<\/code><\/p>\n
Let\u2019s add columns in DataFrame using assign().<\/p>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students,\r\n columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\nmod_fd = df_obj.assign(Marks=[10, 20, 45, 33, 22, 11])\r\nprint(mod_fd)\r\n<\/pre>\nOutput:<\/strong><\/p>\n
<\/p>\n
assign() returns a new dataframe that contains the new column \u2018Marks\u2019; the original dataframe is not modified. The values provided in the list are used as the column values.<\/p>\n
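To make the "returns a new dataframe" point concrete, here is a minimal sketch on a hypothetical three-row frame. The names df, with_marks and with_two are illustrative only.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha', 'Suhail']},
                  index=['a', 'b', 'c'])

# assign() builds and returns a new dataframe with the extra column
with_marks = df.assign(Marks=[10, 20, 45])

# Several columns can be added in one call; a scalar is repeated for every row
with_two = df.assign(Marks=[10, 20, 45], Total=100)

print(list(df.columns))          # the original frame keeps only its own columns
print(list(with_marks.columns))
print(with_two)
```

To keep the new column, the result has to be stored, either in a new variable or by assigning it back to the original name.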
<\/a><\/a>Add column in DataFrame based on other column using lambda function<\/h3>\n
In this method, we use two existing columns, ‘Score’ and ‘Total’, to create a new column, ‘Percentage’.<\/p>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\ndf_obj['Score'] = [10, 20, 45, 33, 22, 11]\r\ndf_obj['Total'] = 100\r\ndf_obj = df_obj.assign(Percentage=lambda x: (x['Score'] \/ x['Total']) * 100)\r\nprint(df_obj)\r\n<\/pre>\nOutput:<\/strong><\/p>\n
<\/p>\n
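The lambda-based calculation above can be sketched on a smaller, hypothetical frame. The lambda receives the dataframe itself, so it can read any existing column; the second call assumes a pandas version in which later keyword arguments of assign() may refer to columns created earlier in the same call.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha', 'Suhail'],
                   'Score': [10, 20, 45]},
                  index=['a', 'b', 'c'])
df['Total'] = 100

# x is the dataframe that assign() is called on
result = df.assign(Percentage=lambda x: (x['Score'] / x['Total']) * 100)

# 'Total' and 'Percentage' created in a single assign() call
chained = df[['Name', 'Score']].assign(
    Total=100,
    Percentage=lambda x: (x['Score'] / x['Total']) * 100,
)

print(result)
print(chained)
```

Both calls leave df unchanged and return new dataframes with the computed column.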
<\/a>Add new column to Dataframe using insert()<\/h3>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\n# Insert column at index 2 (0-based), i.e. as the third column of the Dataframe\r\ndf_obj.insert(2, \"Marks\", [10, 20, 45, 33, 22, 11], allow_duplicates=True)\r\nprint(df_obj)\r\n<\/pre>\nOutput:<\/strong><\/p>\n
<\/p>\n
<\/p>\n
In the other examples, the new column was added at the end of the dataframe. To place a new column between the existing columns, as in the above example, use the insert() function.<\/p>\n
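A minimal sketch of how insert() positions a column, using a hypothetical three-row frame. The position argument is 0-based, the call modifies the dataframe in place, and by default an already existing column name is rejected.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha', 'Suhail'],
                   'Age': [34, 30, 31],
                   'City': ['Agra', 'Pune', 'Mumbai']},
                  index=['a', 'b', 'c'])

# Position 2 is 0-based, so 'Marks' lands between 'Age' and 'City'
df.insert(2, 'Marks', [10, 20, 45])
column_order = list(df.columns)

# Inserting the same column name again raises ValueError by default
try:
    df.insert(0, 'Marks', [1, 2, 3])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)

print(column_order)
print(duplicate_error)
```

Unlike assign(), insert() does not return a new dataframe, so its result should not be assigned back to a variable.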
<\/a>Add a column to Dataframe by dictionary<\/h3>\n
import pandas as pd\r\n# List of Tuples\r\nstudents = [('Rakesh', 34, 'Agra', 'India'),\r\n ('Rekha', 30, 'Pune', 'India'),\r\n ('Suhail', 31, 'Mumbai', 'India'),\r\n ('Neelam', 32, 'Bangalore', 'India'),\r\n ('Jay', 16, 'Bengal', 'India'),\r\n ('Mahak', 17, 'Varanasi', 'India')]\r\n# Create a DataFrame object\r\ndf_obj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country'],\r\n index=['a', 'b', 'c', 'd', 'e', 'f'])\r\nids = [11, 12, 13, 14, 15, 16]\r\n# Provide 'ID' as the column name and for values provide dictionary\r\ndf_obj['ID'] = dict(zip(ids, df_obj['Name']))\r\nprint(df_obj)\r\n<\/pre>\nOutput:<\/strong><\/p>\n
<\/p>\n
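The example above passes a dictionary directly to the [] operator. How pandas interprets a dictionary in that position (as a plain sequence of its keys, or as labelled values aligned to the dataframe index) may differ between pandas versions, so it is worth checking the result on your own installation. As a more explicit alternative, and not the article's method, the following sketch uses Series.map() to look up each name in a dictionary. The name_to_id dictionary and the three-row frame are hypothetical.

```python
import pandas as pd

# Small hypothetical frame, used only for this illustration
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha', 'Suhail']},
                  index=['a', 'b', 'c'])

# Hypothetical lookup table; 'Suhail' is left out on purpose
name_to_id = {'Rakesh': 11, 'Rekha': 12}

# map() looks up each value of 'Name' in the dictionary;
# names that are not present as keys become NaN
df['ID'] = df['Name'].map(name_to_id)

print(df)
```

Because the lookup is done per value, the order of the dictionary does not matter, and missing keys show up as NaN instead of shifting the other values.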
Want to become an expert in the Python programming language? The Python Data Analysis using Pandas<\/a> tutorial takes you from basic to advanced Python concepts.<\/p>\n
Read more Articles on Python Data Analysis Using Pandas \u2013 Add Contents to a Dataframe<\/strong><\/p>\n