{"id":8609,"date":"2021-06-12T10:55:26","date_gmt":"2021-06-12T05:25:26","guid":{"rendered":"https:\/\/python-programs.com\/?p=8609"},"modified":"2021-11-22T18:40:40","modified_gmt":"2021-11-22T13:10:40","slug":"pandas-merge-dataframes-on-specific-columns-or-on-index-in-python-part-2","status":"publish","type":"post","link":"https:\/\/python-programs.com\/pandas-merge-dataframes-on-specific-columns-or-on-index-in-python-part-2\/","title":{"rendered":"Pandas : Merge Dataframes on specific columns or on index in Python \u2013 Part 2"},"content":{"rendered":"
In this article, we will learn to merge dataframes on basis of given columns or index.<\/p>\n
Dataframe class of Python’s Pandas library provide a function i.e. merge() <\/em>which helps in merging of two DataFrames.<\/p>\n Arguments:-<\/strong><\/p>\n Let’s take a scenario where the columns names are same, but contents are different i.e. one column data is of int type and other column data is of string type. And if we apply merge() <\/em>on them without passing any argument, it wouldn’t merge here. Here, we can merge dataframes on a single column by passing on <\/em>argument in merge() function.<\/p>\n And as both dataframes have common column i.e. sponsered, <\/em>so after merging columns are named by default. It will splitted by taking a suffix\u00a0 i.e. Sponsered_x and Sponsered_y as left and right dataframe respectively.<\/p>\n In previous example, for common columns with dissimilar contents suffix x & y are added. We can also add our own custom suffix.<\/p>\n Now let’s take a scenario of changing name of JersyN<\/em> column of a dataframe and try to merge it with another dataframe.<\/p>\n Want to expert in the python programming language? Exploring\u00a0Python Data Analysis using Pandas<\/a>\u00a0tutorial changes your knowledge from basic to advance level in python concepts.<\/p>\n Read more Articles on Python Data Analysis Using Padas<\/strong><\/p>\n Merge Dataframes on specific columns or on index in Python In this article, we will learn to merge dataframes on basis of given columns or index. Merging Dataframe on a given column name as join key Merging Dataframe on a given column with suffix for similar column names Merging Dataframe different columns Dataframe.merge() : Dataframe …<\/p>\nSyntax: DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)<\/pre>\n
\n
how<\/code> value, columns will be considered accordingly for merging.<\/li>\n
<\/a>Merging Dataframe on a given column name as join key :<\/h3>\n
import pandas as sc\r\n# List of Tuples\r\nplayers = [(15,'Smith','Pune', 17,12000),\r\n (99,'Rana', 'Mumbai', 20,2000),\r\n (51,'Jaydev','Kolkata', 22,25640),\r\n (31,'Shikhar','Hyderabad', 28,85410),\r\n (12,'Sanju','Rajasthan', 21,63180),\r\n (35,'Raina','Gujarat', 18,62790)\r\n ]\r\n# Creation of DataFrame object\r\nplayDFObj = sc.DataFrame(players, columns=['JersyN','Name', 'Team', 'Age','Salary'], index=['I', 'II', 'III', 'IV', 'V', 'VI'])\r\nprint('DataFrame 1 : ')\r\nprint(playDFObj)\r\n# List of Tuples\r\nmoreInfo = [(15, 13, 180000, 'Nissin') ,\r\n (99, 2, 195200, 'Jio') ,\r\n (51, 7, 15499, 'Lays') ,\r\n (31, 17, 654000, 'AmbujaC') ,\r\n (12, 5, 201000, 'AsianP') ,\r\n (35, 14, 741000, 'Airtel')\r\n ]\r\n# Creation of DataFrame object\r\nmoreinfoObj = sc.DataFrame(moreInfo, columns=['JersyN', 'PLayingSince' , 'Salary', 'Sponsered'], index=['I', 'II', 'III', 'IV', 'V', 'VI'])\r\nprint('DataFrame 2 : ')\r\nprint(moreinfoObj)\r\n# Merge two Dataframes on basis of common column by default INNER JOIN\r\nmergedDataf = playDFObj.merge(moreinfoObj, on='JersyN')\r\nprint('After merging: ')\r\nprint(mergedDataf)<\/pre>\n
Output :\r\n\r\nDataFrame 1 : \r\n JersyN Name Team Age Salary\r\nI 15 Smith Pune 17 12000\r\nII 99 Rana Mumbai 20 2000\r\nIII 51 Jaydev Kolkata 22 25640\r\nIV 31 Shikhar Hyderabad 28 85410\r\nV 12 Sanju Rajasthan 21 63180\r\nVI 35 Raina Gujarat 18 62790\r\nDataFrame 2 : \r\n JersyN PLayingSince Salary Sponsered\r\nI 15 13 180000 Nissin\r\nII 99 2 195200 Jio\r\nIII 51 7 15499 Lays\r\nIV 31 17 654000 AmbujaC\r\nV 12 5 201000 AsianP\r\nVI 35 14 741000 Airtel\r\nAfter merging: \r\n JersyN Name Team Age Salary_x PLayingSince Salary_y Sponsered\r\n0 15 Smith Pune 17 12000 13 180000 Nissin\r\n1 99 Rana Mumbai 20 2000 2 195200 Jio\r\n2 51 Jaydev Kolkata 22 25640 7 15499 Lays\r\n3 31 Shikhar Hyderabad 28 85410 17 654000 AmbujaC\r\n4 12 Sanju Rajasthan 21 63180 5 201000 AsianP\r\n5 5 Raina Gujarat 18 62790 14 741000 Airtel<\/pre>\n
<\/a>Merging Dataframe on a given column with suffix for similar column names :<\/h3>\n
import pandas as sc\r\n# List of Tuples\r\nplayers = [(15,'Smith','Pune', 17,12000),\r\n (99,'Rana', 'Mumbai', 20,2000),\r\n (51,'Jaydev','Kolkata', 22,25640),\r\n (31,'Shikhar','Hyderabad', 28,85410),\r\n (12,'Sanju','Rajasthan', 21,63180),\r\n (35,'Raina','Gujarat', 18,62790)\r\n ]\r\n# Creation of DataFrame object\r\nplayDFObj = sc.DataFrame(players, columns=['JersyN','Name', 'Team', 'Age','Sponsered'], index=['I', 'II', 'III', 'IV', 'V', 'VI'])\r\nprint('DataFrame 1 : ')\r\nprint(playDFObj)\r\n# List of Tuples\r\nmoreInfo = [(15, 13, 180000, 'Nissin') ,\r\n (99, 2, 195200, 'Jio') ,\r\n (51, 7, 15499, 'Lays') ,\r\n (31, 17, 654000, 'AmbujaC') ,\r\n (12, 5, 201000, 'AsianP') ,\r\n (35, 14, 741000, 'Airtel')\r\n ]\r\n# Creation of DataFrame object\r\nmoreinfoObj = sc.DataFrame(moreInfo, columns=['JersyN', 'PLayingSince' , 'Salary', 'Sponsered'], index=['I', 'II', 'III', 'IV', 'V', 'VI'])\r\nprint('DataFrame 2 : ')\r\nprint(moreinfoObj)\r\n# Merge two Dataframes on basis of common column by default INNER JOIN\r\nmergedDataf = playDFObj.merge(moreinfoObj, on='JersyN',suffixes=('_Price', '_Companies'))\r\nprint('After merging: ')\r\nprint(mergedDataf)\r\n<\/pre>\n
Output :\r\nDataFrame 1 : \r\n JersyN Name Team Age Sponsered\r\nI 15 Smith Pune 17 12000\r\nII 99 Rana Mumbai 20 2000\r\nIII 51 Jaydev Kolkata 22 25640\r\nIV 31 Shikhar Hyderabad 28 85410\r\nV 12 Sanju Rajasthan 21 63180\r\nVI 35 Raina Gujarat 18 62790\r\nDataFrame 2 : \r\n JersyN PLayingSince Salary Sponsered\r\nI 15 13 180000 Nissin\r\nII 99 2 195200 Jio\r\nIII 51 7 15499 Lays\r\nIV 31 17 654000 AmbujaC\r\nV 12 5 201000 AsianP\r\nVI 35 14 741000 Airtel\r\nAfter merging: \r\nJersyN Name Team ... PLayingSince Salary Sponsered_Companies\r\n0 15 Smith Pune ... 13 180000 Nissin\r\n1 99 Rana Mumbai ... 2 195200 Jio\r\n2 51 Jaydev Kolkata ... 7 15499 Lays\r\n3 31 Shikhar Hyderabad ... 17 654000 AmbujaC\r\n4 12 Sanju Rajasthan ... 5 201000 AsianP\r\n5 35 Raina Gujarat ... 14 741000 Airtel\r\n\r\n<\/pre>\n
<\/a>Merging Dataframe different columns :<\/h3>\n
import pandas as sc\r\n# List of Tuples\r\nplayers = [(15,'Smith','Pune', 17,12000),\r\n (99,'Rana', 'Mumbai', 20,2000),\r\n (51,'Jaydev','Kolkata', 22,25640),\r\n (31,'Shikhar','Hyderabad', 28,85410),\r\n (12,'Sanju','Rajasthan', 21,63180),\r\n (35,'Raina','Gujarat', 18,62790)\r\n ]\r\n# Creation of DataFrame object\r\nplayDFObj = sc.DataFrame(players, columns=['JersyN','Name', 'Team', 'Age','Salary'], index=['I', 'II', 'III', 'IV', 'V', 'VI'])\r\nprint('DataFrame 1 : ')\r\nprint(playDFObj)\r\n# List of Tuples\r\nmoreInfo = [(15, 13, 180000, 'Nissin') ,\r\n (99, 2, 195200, 'Jio') ,\r\n (51, 7, 15499, 'Lays') ,\r\n (31, 17, 654000, 'AmbujaC') ,\r\n (12, 5, 201000, 'AsianP') ,\r\n (35, 14, 741000, 'Airtel')\r\n ]\r\n# Creation of DataFrame object\r\nmoreinfoObj = sc.DataFrame(moreInfo, columns=['JersyN', 'PLayingSince' , 'Salary', 'Sponsered'], index=['I', 'II', 'III', 'IV', 'V', 'VI'])\r\nprint('DataFrame 2 : ')\r\nprint(moreinfoObj)\r\n# Rename column JersyN to ShirtN\r\nmoreinfoObj.rename(columns={'JersyN': 'ShirtN'}, inplace=True)\r\n# Merge two Dataframes on basis of common column by default INNER JOIN\r\nmergedDataf = playDFObj.merge(moreinfoObj, left_on='JersyN', right_on='ShirtN')\r\nprint('After merging: ')\r\nprint(mergedDataf)\r\n<\/pre>\n
Output ;\r\nDataFrame 1 : \r\n JersyN Name Team Age Salary\r\nI 15 Smith Pune 17 12000\r\nII 99 Rana Mumbai 20 2000\r\nIII 51 Jaydev Kolkata 22 25640\r\nIV 31 Shikhar Hyderabad 28 85410\r\nV 12 Sanju Rajasthan 21 63180\r\nVI 35 Raina Gujarat 18 62790\r\nDataFrame 2 : \r\n JersyN PLayingSince Salary Sponsered\r\nI 15 13 180000 Nissin\r\nII 99 2 195200 Jio\r\nIII 51 7 15499 Lays\r\nIV 31 17 654000 AmbujaC\r\nV 12 5 201000 AsianP\r\nVI 35 14 741000 Airtel\r\nAfter merging: \r\nJersyN Name Team Age ... ShirtN PLayingSince Salary Sponsered_y\r\n0 15 Smith Pune 17 ... 15 13 180000 Nissin\r\n1 99 Rana Mumbai 20 ... 99 2 195200 Jio\r\n2 51 Jaydev Kolkata 22 ... 51 7 15499 Lays\r\n3 31 Shikhar Hyderabad 28 ... 31 17 654000 AmbujaC\r\n4 12 Sanju Rajasthan 21 ... 12 5 201000 AsianP\r\n5 35 Raina Gujarat 18 ... 35 14 741000 Airtel\r\n\r\n[6 rows x 9 columns]<\/pre>\n
\n