{"id":8588,"date":"2021-06-12T10:54:42","date_gmt":"2021-06-12T05:24:42","guid":{"rendered":"https:\/\/python-programs.com\/?p=8588"},"modified":"2021-11-22T18:40:40","modified_gmt":"2021-11-22T13:10:40","slug":"convert-pandas-dataframe-column-into-an-index-using-set_index-in-python","status":"publish","type":"post","link":"https:\/\/python-programs.com\/convert-pandas-dataframe-column-into-an-index-using-set_index-in-python\/","title":{"rendered":"Pandas : Convert Dataframe column into an index using set_index() in Python"},"content":{"rendered":"
In this article we will learn how to convert an existing column of a DataFrame into an index, covering several cases. We can do this using the set_index()<\/code> function of the pandas DataFrame class.<\/p>\n Arguments:<\/strong><\/p>\n Let’s convert the column Name<\/em> into the index of the dataframe by passing that column name to set_index(). The change is made only in the returned copy of the dataframe; the original dataframe is not modified.<\/p>\n In this case we keep ‘Name’ both as a column and as the index by passing the drop argument as False.<\/p>\n In the above cases the index ‘SL’ is replaced. To keep it, we have to pass the append argument as True.<\/p>\n To check that the index does not contain any duplicate values after converting a column to the index, we can use the verify_integrity argument.\u00a0<\/strong>We can also make the changes in the existing dataframe. There are two ways to do this:<\/p>\n
\n
DataFrame.set_index() :<\/h3>\n
Syntax: DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)<\/pre>\n
\n
\n
\n
\n
\n
\n
import pandas as pd\r\n# List of tuples\r\nplayers = [('Smith', 15, 'Pune', 170000),\r\n           ('Rana', 99, 'Mumbai', 118560),\r\n           ('Jaydev', 51, 'Kolkata', 258741),\r\n           ('Shikhar', 31, 'Hyderabad', 485169),\r\n           ('Sanju', 12, 'Rajasthan', 150000),\r\n           ('Raina', 35, 'Gujarat', 250000)\r\n           ]\r\n# Create the DataFrame object\r\nplayDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])\r\n# Rename the index of the dataframe as 'SL'\r\nplayDFObj.index.rename('SL', inplace=True)\r\nprint('Original Dataframe: ')\r\nprint(playDFObj)\r\n<\/pre>\n
Output :\r\nOriginal Dataframe: \r\n       Name  JerseyN       Team  Salary\r\nSL\r\n0     Smith       15       Pune  170000\r\n1      Rana       99     Mumbai  118560\r\n2    Jaydev       51    Kolkata  258741\r\n3   Shikhar       31  Hyderabad  485169\r\n4     Sanju       12  Rajasthan  150000\r\n5     Raina       35    Gujarat  250000<\/pre>\n
<\/a>Converting a column of Dataframe into an index of the Dataframe :<\/h3>\n
Passing ‘Name’ to set_index()<\/code> makes that column the index and deletes the old index ‘SL’.<\/p>\n
import pandas as pd\r\n# List of tuples\r\nplayers = [('Smith', 15, 'Pune', 170000),\r\n           ('Rana', 99, 'Mumbai', 118560),\r\n           ('Jaydev', 51, 'Kolkata', 258741),\r\n           ('Shikhar', 31, 'Hyderabad', 485169),\r\n           ('Sanju', 12, 'Rajasthan', 150000),\r\n           ('Raina', 35, 'Gujarat', 250000)\r\n           ]\r\n# Create the DataFrame object\r\nplayDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])\r\n# Rename the index of the dataframe as 'SL'\r\nplayDFObj.index.rename('SL', inplace=True)\r\nprint('Original Dataframe: ')\r\nprint(playDFObj)\r\n# Set column 'Name' as the index; set_index() returns a new dataframe\r\nmodifplayDF = playDFObj.set_index('Name')\r\nprint('Modified Dataframe of players:')\r\nprint(modifplayDF)\r\n<\/pre>\n
Output :\r\nOriginal Dataframe: \r\n       Name  JerseyN       Team  Salary\r\nSL\r\n0     Smith       15       Pune  170000\r\n1      Rana       99     Mumbai  118560\r\n2    Jaydev       51    Kolkata  258741\r\n3   Shikhar       31  Hyderabad  485169\r\n4     Sanju       12  Rajasthan  150000\r\n5     Raina       35    Gujarat  250000\r\n\r\nModified Dataframe of players:\r\n         JerseyN       Team  Salary\r\nName\r\nSmith         15       Pune  170000\r\nRana          99     Mumbai  118560\r\nJaydev        51    Kolkata  258741\r\nShikhar       31  Hyderabad  485169\r\nSanju         12  Rajasthan  150000\r\nRaina         35    Gujarat  250000<\/pre>\n
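As a small illustration of why a column is useful as an index, the sketch below uses a shortened version of the players data and looks up a row by its label with .loc. The variable names df and by_name are ours and are not part of the article's examples.

```python
import pandas as pd

# Shortened version of the players data used in this article
players = [('Smith', 15, 'Pune', 170000),
           ('Rana', 99, 'Mumbai', 118560),
           ('Jaydev', 51, 'Kolkata', 258741)]
df = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])

# set_index() returns a new dataframe by default
by_name = df.set_index('Name')

# The original dataframe still has 'Name' as a regular column
print(list(df.columns))

# In the new dataframe, rows can be selected by player name
print(by_name.loc['Rana', 'Team'])
```

Because set_index() returned a copy here, df and by_name can be used side by side.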
<\/a>Converting a column of Dataframe into index without deleting the column :<\/h3>\n
import pandas as pd\r\n# List of tuples\r\nplayers = [('Smith', 15, 'Pune', 170000),\r\n           ('Rana', 99, 'Mumbai', 118560),\r\n           ('Jaydev', 51, 'Kolkata', 258741),\r\n           ('Shikhar', 31, 'Hyderabad', 485169),\r\n           ('Sanju', 12, 'Rajasthan', 150000),\r\n           ('Raina', 35, 'Gujarat', 250000)\r\n           ]\r\n# Create the DataFrame object\r\nplayDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])\r\n# Rename the index of the dataframe as 'SL'\r\nplayDFObj.index.rename('SL', inplace=True)\r\n# Keep 'Name' as a column and also use it as the index\r\nmodifplayDF = playDFObj.set_index('Name', drop=False)\r\nprint('Modified Dataframe of players:')\r\nprint(modifplayDF)\r\n<\/pre>\n
Output :\r\nModified Dataframe of players:\r\nName\u00a0 JerseyN\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Team\u00a0 Salary\r\nName\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \r\nSmith\u00a0\u00a0\u00a0\u00a0\u00a0 Smith\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 15\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Pune\u00a0 170000\r\nRana\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Rana\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 99\u00a0\u00a0\u00a0\u00a0 Mumbai\u00a0 118560\r\nJaydev\u00a0\u00a0\u00a0 Jaydev\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 51\u00a0\u00a0\u00a0 Kolkata\u00a0 258741\r\nShikhar\u00a0 Shikhar\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 31\u00a0 Hyderabad\u00a0 485169\r\nSanju\u00a0\u00a0\u00a0\u00a0\u00a0 Sanju\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 12\u00a0 Rajasthan\u00a0 150000\r\nRaina\u00a0\u00a0\u00a0\u00a0\u00a0 Raina\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 35\u00a0\u00a0\u00a0 Gujarat\u00a0 250000<\/pre>\n
<\/a>Appending a Dataframe column to the index to make it a Multi-Index Dataframe :<\/h3>\n
import pandas as pd\r\n# List of tuples\r\nplayers = [('Smith', 15, 'Pune', 170000),\r\n           ('Rana', 99, 'Mumbai', 118560),\r\n           ('Jaydev', 51, 'Kolkata', 258741),\r\n           ('Shikhar', 31, 'Hyderabad', 485169),\r\n           ('Sanju', 12, 'Rajasthan', 150000),\r\n           ('Raina', 35, 'Gujarat', 250000)\r\n           ]\r\n# Create the DataFrame object\r\nplayDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])\r\nplayDFObj.index.rename('SL', inplace=True)\r\n# Make a multi-index dataframe by appending 'Name' to the existing index\r\nmodifplayDF = playDFObj.set_index('Name', append=True)\r\nprint('Modified Dataframe of players:')\r\nprint(modifplayDF)\r\n<\/pre>\n
Output :\r\nModified Dataframe of players:\r\nJerseyN\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Team\u00a0 Salary\r\nSL Name\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \r\n0\u00a0 Smith\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 15\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Pune\u00a0 170000\r\n1\u00a0 Rana\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 99\u00a0\u00a0\u00a0\u00a0 Mumbai\u00a0 118560\r\n2\u00a0 Jaydev\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 51\u00a0\u00a0\u00a0 Kolkata\u00a0 258741\r\n3\u00a0 Shikhar\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 31 \u00a0Hyderabad\u00a0 485169\r\n4\u00a0 Sanju\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 12\u00a0 Rajasthan\u00a0 150000\r\n5\u00a0 Raina\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 35\u00a0\u00a0\u00a0 Gujarat\u00a0 250000<\/pre>\n
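To see what the two index levels allow, here is a minimal sketch on a shortened version of the same data. It selects a row by the (SL, Name) pair and also by the Name level alone. The names df, multi, row and rana are illustrative only.

```python
import pandas as pd

players = [('Smith', 15, 'Pune', 170000),
           ('Rana', 99, 'Mumbai', 118560),
           ('Jaydev', 51, 'Kolkata', 258741)]
df = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])
df.index.rename('SL', inplace=True)

# Append 'Name' to the existing 'SL' index to get two index levels
multi = df.set_index('Name', append=True)
print(list(multi.index.names))

# Select one row by giving a value for each level as a tuple
row = multi.loc[(1, 'Rana')]
print(row['Team'])

# Select by the 'Name' level only; the result is indexed by 'SL'
rana = multi.xs('Rana', level='Name')
print(rana)
```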
<\/a>Checking for duplicates in the new index :<\/h3>\n
Passing verify_integrity<\/code> as True to
set_index()<\/code> raises a ValueError if the new index contains any duplicate value.<\/p>\n
import pandas as pd\r\n# List of tuples; 'Mumbai' appears twice in the Team column\r\nplayers = [('Smith', 15, 'Pune', 170000),\r\n           ('Rana', 99, 'Mumbai', 118560),\r\n           ('Jaydev', 51, 'Kolkata', 258741),\r\n           ('Shikhar', 31, 'Mumbai', 485169),\r\n           ('Sanju', 12, 'Rajasthan', 150000),\r\n           ('Raina', 35, 'Gujarat', 250000)\r\n           ]\r\n# Create the DataFrame object\r\nplayDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])\r\n# Rename the index of the dataframe as 'SL'\r\nplayDFObj.index.rename('SL', inplace=True)\r\nmodifplayDF = playDFObj.set_index('Team', verify_integrity=True)\r\nprint(modifplayDF)\r\n<\/pre>\n
Output :\r\nValueError: Index has duplicate keys<\/pre>\n
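If the error should not stop the program, it can be caught with a normal try/except. The sketch below is one possible way to handle it, on a shortened data set. It first lists the rows with a repeated Team value and then falls back to None when the index cannot be set. The names duplicates and indexed are ours.

```python
import pandas as pd

# 'Mumbai' appears twice in the Team column
players = [('Smith', 15, 'Pune', 170000),
           ('Rana', 99, 'Mumbai', 118560),
           ('Shikhar', 31, 'Mumbai', 485169)]
df = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])

# Rows whose 'Team' value occurs more than once
duplicates = df[df['Team'].duplicated(keep=False)]
print(duplicates)

try:
    indexed = df.set_index('Team', verify_integrity=True)
except ValueError as err:
    # Raised because the new index would contain duplicate keys
    indexed = None
    print('Could not set index:', err)
```

Checking df['Team'].is_unique before calling set_index() is another option when only a yes/no answer is needed.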
<\/a>Modifying existing Dataframe by converting into index :<\/h3>\n
\n
Passing inplace<\/code> as True modifies the dataframe in place instead of returning a copy.<\/li>\n<\/ol>\n
import pandas as pd\r\n# List of tuples\r\nplayers = [('Smith', 15, 'Pune', 170000),\r\n           ('Rana', 99, 'Mumbai', 118560),\r\n           ('Jaydev', 51, 'Kolkata', 258741),\r\n           ('Shikhar', 31, 'Hyderabad', 485169),\r\n           ('Sanju', 12, 'Rajasthan', 150000),\r\n           ('Raina', 35, 'Gujarat', 250000)\r\n           ]\r\n# Create the DataFrame object\r\nplayDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])\r\nplayDFObj.index.rename('SL', inplace=True)\r\n# Modify the existing dataframe in place\r\nplayDFObj.set_index('Name', inplace=True)\r\nprint('Contents of original dataframe :')\r\nprint(playDFObj)\r\n<\/pre>\n
Output :\r\nContents of original dataframe :\r\n         JerseyN       Team  Salary\r\nName\r\nSmith         15       Pune  170000\r\nRana          99     Mumbai  118560\r\nJaydev        51    Kolkata  258741\r\nShikhar       31  Hyderabad  485169\r\nSanju         12  Rajasthan  150000\r\nRaina         35    Gujarat  250000<\/pre>\n
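The existing dataframe variable can also be updated without inplace, by assigning the dataframe returned by set_index() back to the same variable. This is a minimal sketch, assuming that reassignment is the other of the two methods mentioned above. It uses a shortened version of the players data.

```python
import pandas as pd

players = [('Smith', 15, 'Pune', 170000),
           ('Rana', 99, 'Mumbai', 118560),
           ('Jaydev', 51, 'Kolkata', 258741)]
playDFObj = pd.DataFrame(players, columns=['Name', 'JerseyN', 'Team', 'Salary'])
playDFObj.index.rename('SL', inplace=True)

# Assign the returned dataframe back to the same variable
playDFObj = playDFObj.set_index('Name')

print(playDFObj.index.name)
print(list(playDFObj.columns))
```

Both approaches leave playDFObj indexed by 'Name'. Reassignment has the advantage that it can be chained with other method calls.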
\n