Python

Pandas: Dataframe.fillna()

Dataframe.fillna() in Dataframes using Python

In this article, we will discuss how to use the Dataframe.fillna() method with examples, such as how to replace NaN values in a whole dataframe or only in specific rows/columns.

Dataframe.fillna()

Dataframe.fillna() is used to fill NaN values in a dataframe with some other value. It is commonly used when a column has only a few NaN values, so instead of dropping the whole column we replace its NaN or missing values with some other value.

Syntax: DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)

Parameters

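1) value: The value used to fill the NaN entries. It can be a scalar, or a dict, Series or DataFrame that specifies a fill value per column or index label. The default is None.

2) method: The fill method to use when no value is passed: 'ffill' (or 'pad') propagates the last valid value forward, and 'bfill' (or 'backfill') uses the next valid value. The default is None. Recent pandas releases deprecate this parameter in favour of the ffill() and bfill() methods.

3) axis: The axis along which to fill. axis=0 fills down each column and axis=1 fills across each row; it matters mainly when a fill method is used.

4) inplace: A boolean. If True, the dataframe itself is modified instead of a new one being returned.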
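As a minimal sketch of the method-style fill described above: recent pandas versions expose forward and backward filling as the dedicated methods ffill() and bfill(), which is the form assumed here. The small dataframe and its column names are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only to illustrate the two fill directions
df = pd.DataFrame({'A': [1.0, np.nan, 3.0],
                   'B': [np.nan, 5.0, np.nan]})

# Forward fill: carry the last valid value down each column
forward = df.ffill()

# Backward fill: pull the next valid value up each column
backward = df.bfill()

print(forward)
print(backward)
```

A leading NaN has no earlier value to copy, so it survives a forward fill; likewise a trailing NaN survives a backward fill.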

Different methods to use Dataframe.fillna() method

  • Method 1: Replace all NaN values in Dataframe

In this method, we pass some value in the value parameter and all the NaN values are replaced with that value. Let's see this with the help of an example.

import pandas as pd
import numpy as np
# np.nan marks the missing values (the np.NaN alias was removed in NumPy 2.0)
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, np.nan),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, np.nan),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(  students, 
                    columns=['Name', 'Age','Marks'])
print("Original Dataframe\n")
print(df,'\n')
new_df=df.fillna(0)
print("New Dataframe\n")
print(new_df)

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   NaN   97.0
2   Aadi  22.0   81.0
3  Abhay   NaN    NaN
4  Ajjet  21.0   74.0
5   Amar   NaN    NaN
6   Aman   NaN   76.0 

New Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   0.0   97.0
2   Aadi  22.0   81.0
3  Abhay   0.0    0.0
4  Ajjet  21.0   74.0
5   Amar   0.0    0.0
6   Aman   0.0   76.0

Here we see that all NaN values are replaced with 0.

  • Method 2: Replace all NaN values in specific columns

In this method, we replace all NaN values with some other value, but only in specific columns rather than in the whole dataframe.

import pandas as pd
import numpy as np
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, np.nan),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, np.nan),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(  students, 
                    columns=['Name', 'Age','Marks'])
print("Original Dataframe\n")
print(df,'\n')
df['Age'].fillna(0,inplace=True)
print("New Dataframe\n")
print(df)

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   NaN   97.0
2   Aadi  22.0   81.0
3  Abhay   NaN    NaN
4  Ajjet  21.0   74.0
5   Amar   NaN    NaN
6   Aman   NaN   76.0 

New Dataframe

    Name   Age  Marks
0    Raj  24.0   95.0
1  Rahul   0.0   97.0
2   Aadi  22.0   81.0
3  Abhay   0.0    NaN
4  Ajjet  21.0   74.0
5   Amar   0.0    NaN
6   Aman   0.0   76.0

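Here we see that only the NaN values in the Age column are replaced with 0. We pass inplace=True because we want the changes to be made in the original dataframe. Note that recent pandas versions warn when fillna(inplace=True) is called on a selected column, because the selection may be a copy; assigning the result back with df['Age'] = df['Age'].fillna(0) is the safer form.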
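As a hedged alternative to the inplace call on a single column, the sketch below shows two forms that do not depend on modifying a column selection in place: assigning the filled column back, and passing a dict that maps column names to fill values. The three-row dataframe is a shortened, illustrative version of the one above.

```python
import numpy as np
import pandas as pd

# Shortened illustrative data
df = pd.DataFrame({'Name': ['Raj', 'Rahul', 'Abhay'],
                   'Age': [24, np.nan, np.nan],
                   'Marks': [95, 97, np.nan]})

# Option 1: assign the filled column back to the dataframe
by_assignment = df.copy()
by_assignment['Age'] = by_assignment['Age'].fillna(0)

# Option 2: pass a dict of {column name: fill value}; other columns are untouched
by_dict = df.fillna({'Age': 0})

print(by_dict)
```

Both forms leave the NaN in the Marks column alone, and the dict form extends naturally to several columns with different fill values.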

  • Method 3: Replace NaN values of one column with values of another column

Here we pass, in the value parameter, the column whose values we want copied. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, 76),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(  students, 
                    columns=['Name', 'Age','Marks'])
print("Original Dataframe\n")
print(df,'\n')
# Fill NaN values in Age with the Marks value from the same row
df['Age'] = df['Age'].fillna(value=df['Marks'])
print("New Dataframe\n")
print(df)

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   NaN     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76 

New Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul  97.0     97
2   Aadi  22.0     81
3  Abhay  87.0     87
4  Ajjet  21.0     74
5   Amar  76.0     76
6   Aman  76.0     76

Here we see that the NaN values of the Age column are replaced with the values from the same rows of the Marks column.

  • Method 4: Replace NaN values in specific rows

To replace NaN values in a row, we use .loc['index name'] to access that row of the dataframe and then call the fillna() function on it. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 95),
            ('Rahul', np.nan, 97),
            ('Aadi', 22, 81),
            ('Abhay', np.nan, 87),
            ('Ajjet', 21, 74),
            ('Amar', np.nan, 76),
            ('Aman', np.nan, 76)]
# Create a DataFrame object
df = pd.DataFrame(  students, 
                    columns=['Name', 'Age','Marks'])
print("Original Dataframe\n")
print(df,'\n')
df.loc[1]=df.loc[1].fillna(value=0)
print("New Dataframe\n")
print(df)

Output

Original Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   NaN     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76 

New Dataframe

    Name   Age  Marks
0    Raj  24.0     95
1  Rahul   0.0     97
2   Aadi  22.0     81
3  Abhay   NaN     87
4  Ajjet  21.0     74
5   Amar   NaN     76
6   Aman   NaN     76
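The same pattern can be extended to several rows at once by passing a list of index labels to .loc. This is only a sketch on a shortened, illustrative dataframe; the variable name rows is an assumption, not part of the original example.

```python
import numpy as np
import pandas as pd

# Shortened illustrative data with NaN in rows 1 and 3
df = pd.DataFrame({'Name': ['Raj', 'Rahul', 'Aadi', 'Abhay'],
                   'Age': [24, np.nan, 22, np.nan],
                   'Marks': [95, 97, 81, 87]})

# Fill NaN values only in the rows labelled 0 and 1
rows = [0, 1]
df.loc[rows] = df.loc[rows].fillna(0)

print(df)
```

Row 3 is not in the list, so its NaN is left as it is.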

So these are some of the ways to use Dataframe.fillna().

Get Rows And Columns Names In Dataframe Using Python

Methods to get rows and columns names in dataframe

In this article we will study different methods to get the row and column names of a dataframe.

Methods to get column name in dataframe

  • Method 1: By iterating over columns

In this method, we simply iterate over all the columns and print the name of each one. Remember that dataframe_name.columns gives an Index object holding the column labels. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
              ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print(df.columns,'\n')
print("columns are:")
for column in df.columns:
  print(column,end=" ")

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

Index(['Name', 'Age', 'City', 'Marks'], dtype='object') 

columns are:
Name Age City Marks 

Here we see that df.columns gives the column labels, and by iterating over it we can easily get the column names.

  • Method 2: Using columns.values

columns.values returns an array of column names. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
              ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("columns are:")
print(df.columns.values,'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

columns are:
['Name' 'Age' 'City' 'Marks'] 

  • Method 3: Using the tolist() method

Calling tolist() on columns.values returns the column names as a Python list. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
              ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("columns are:")
print(df.columns.values.tolist(),'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

columns are:
['Name', 'Age', 'City', 'Marks'] 

  • Method 4: Access a specific column name using an index

As we know, columns.values gives an array of column names, and we can access array elements using an index. This method uses that idea. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
              ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("columns at second index:")
print(df.columns.values[2],'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

columns at second index:
City 

So these are the methods to get column names.
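As a compact recap, the sketch below collects the column names of a small illustrative dataframe in three equivalent ways and picks one name by position. The data is shortened from the examples above, and the variable names are assumptions made for this sketch.

```python
import pandas as pd

# Shortened illustrative data
df = pd.DataFrame({'Name': ['Raj', 'Rahul'],
                   'Age': [24, 21],
                   'City': ['Mumbai', 'Delhi']})

names_from_values = df.columns.values.tolist()  # via the underlying array
names_from_index = df.columns.tolist()          # directly from the Index object
names_from_iteration = list(df)                 # iterating a dataframe yields its column labels

# Positional access also works on the Index object itself
third_column = df.columns[2]

print(names_from_index)
print(third_column)
```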

Methods to get row names in dataframe

  • Method 1: Using index.values

Just as columns.values gives an array of column names, index.values gives an array of row index labels. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
              ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("Rows are:")
print(df.index.values,'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

Rows are:
[0 1 2 3 4] 

  • Method 2: Get the row name at a specific index

As we know, index.values gives an array of row index labels, and we can access array elements using an index. This method uses that idea. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
('Rahul', 21, 'Delhi' , 97) , 
('Aadi', 22, 'Kolkata', 81) , 
('Abhay', 24,'Rajasthan' ,76) , 
('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("Row at index 2:")
print(df.index.values[2],'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

Row at index 2:
2 

  • Method 3: By iterating over indices

Just as dataframe_name.columns gives the column labels, dataframe_name.index gives the row index labels. Hence we can simply iterate over the index and print the row names. Let's see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
              ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("List of indexes:")
print(df.index,'\n')
print("Indexes or rows names are:")
for row in df.index:
  print(row,end=" ")

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

List of indexes:
RangeIndex(start=0, stop=5, step=1) 

Indexes or rows names are:
0 1 2 3 4 
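The examples above use the default integer index, where row names and positions coincide. The sketch below, on an illustrative dataframe with assumed string labels, shows that the same attributes return the labels themselves.

```python
import pandas as pd

# Illustrative dataframe with string row labels instead of the default 0..n-1
df = pd.DataFrame({'Age': [24, 21, 22]}, index=['a', 'b', 'c'])

row_names = df.index.tolist()      # all row labels as a Python list
second_row_name = df.index[1]      # label at position 1
position_of_c = df.index.get_loc('c')  # position of a given label

print(row_names)
print(second_row_name, position_of_c)
```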

So these are the methods to get row and column names in a dataframe using Python.

Pandas: Sum rows in Dataframe ( all or certain rows)

Sum rows in Dataframe ( all or certain rows) in Python

In this article we will discuss how we can sum the rows of a dataframe and add the totals as a new row to the same dataframe.

So, let’s start exploring the topic.

First, we will build a Dataframe,

import pandas as pd
import numpy as np
# The List of Tuples
salary_of_employees = [('Amit', 2000, 2050, 1099, 2134, 2111),
                    ('Rabi', 2122, 3022, 3456, 3111, 2109),
                    ('Abhi', np.nan, 2334, 2077, np.nan, 3122),
                    ('Naresh', 3050, 3050, 2010, 2122, 1111),
                    ('Suman', 2023, 2232, 3050, 2123, 1099),
                    ('Viroj', 2050, 2510, np.nan, 3012, 2122),
                    ('Nabin', 4000, 2000, 2050, np.nan, 2111)]
# By Creating a DataFrame object from list of tuples
test = pd.DataFrame(salary_of_employees,
                  columns=['Name',  'Jan', 'Feb', 'March', 'April', 'May'])
# To Set column Name as the index of dataframe
test.set_index('Name', inplace=True)
print(test)
Output :
           Jan   Feb   March   April   May
Name
Amit    2000.0  2050  1099.0  2134.0  2111
Rabi    2122.0  3022  3456.0  3111.0  2109
Abhi       NaN  2334  2077.0     NaN  3122
Naresh  3050.0  3050  2010.0  2122.0  1111
Suman   2023.0  2232  3050.0  2123.0  1099
Viroj   2050.0  2510     NaN  3012.0  2122
Nabin   4000.0  2000  2050.0     NaN  2111

This dataframe contains employee salaries from January to May. We have set the Name column as the index of the dataframe. Each row of this dataframe contains one employee's salary from January to May.

Get the sum of all rows in a Pandas Dataframe :

Let's say that in the above dataframe we want the total salary paid in each month. Basically, we want a Series containing the sum of all rows for each column, i.e. each item in the Series should hold the total of one column.

Let’s see how we can find that series,

import pandas as pd
import numpy as np
# The List of Tuples
salary_of_employees = [('Amit', 2000, 2050, 1099, 2134, 2111),
                    ('Rabi', 2122, 3022, 3456, 3111, 2109),
                    ('Abhi', np.nan, 2334, 2077, np.nan, 3122),
                    ('Naresh', 3050, 3050, 2010, 2122, 1111),
                    ('Suman', 2023, 2232, 3050, 2123, 1099),
                    ('Viroj', 2050, 2510, np.nan, 3012, 2122),
                    ('Nabin', 4000, 2000, 2050, np.nan, 2111)]
# By Creating a DataFrame object from list of tuples
test = pd.DataFrame(salary_of_employees,
                  columns=['Name',  'Jan', 'Feb', 'March', 'April', 'May'])
# To Set column Name as the index of dataframe
test.set_index('Name', inplace=True)


#By getting sum of all rows in the Dataframe as a Series
total = test.sum()
print('Total salary paid in each month:')
print(total)
Output :
Total salary paid in each month:
Jan      15245.0
Feb      17198.0
March    13742.0
April    12502.0
May      13785.0
dtype: float64

We called the sum() function on the dataframe without any parameter, so it used the default axis of 0 and summed column-wise, i.e. it added up all the values in each column and returned a Series containing those totals. NaN values are skipped by default. Each item in this Series holds the total salary paid in one month, with the name of the month as the index label for that entry.
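To make the role of the axis concrete, here is a small sketch on a two-row, two-month extract of the salary data. It compares the default column-wise sum with a row-wise sum (axis=1) and shows what happens when NaN values are not skipped; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Two employees and two months taken from the data above
test = pd.DataFrame({'Jan': [2000, np.nan],
                     'Feb': [2050, 2334]},
                    index=['Amit', 'Abhi'])

per_month = test.sum()             # axis=0 (default): one total per column
per_employee = test.sum(axis=1)    # axis=1: one total per row
strict = test.sum(skipna=False)    # a NaN in a column makes that column's total NaN

print(per_month)
print(per_employee)
print(strict)
```

With the default skipna=True, Abhi's missing January salary is simply ignored, which is why the totals in this article are not NaN.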

We can add this Series as a new row to the dataframe, i.e.

import pandas as pd
import numpy as np
# The List of Tuples
salary_of_employees = [('Amit', 2000, 2050, 1099, 2134, 2111),
                    ('Rabi', 2122, 3022, 3456, 3111, 2109),
                    ('Abhi', np.nan, 2334, 2077, np.nan, 3122),
                    ('Naresh', 3050, 3050, 2010, 2122, 1111),
                    ('Suman', 2023, 2232, 3050, 2123, 1099),
                    ('Viroj', 2050, 2510, np.nan, 3012, 2122),
                    ('Nabin', 4000, 2000, 2050, np.nan, 2111)]
# By Creating a DataFrame object from list of tuples
test = pd.DataFrame(salary_of_employees,
                  columns=['Name',  'Jan', 'Feb', 'March', 'April', 'May'])
# To Set column Name as the index of dataframe
test.set_index('Name', inplace=True)




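# Get the sum of all rows as a Series named 'Total'
total = test.sum()
total.name = 'Total'
# DataFrame.append() was removed in pandas 2.0, so turn the Series into a
# one-row dataframe (keeping the index name) and add it with pd.concat()
total_row = total.to_frame().T.rename_axis(test.index.name)
test = pd.concat([test, total_row])
print(test)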
Output :
            Jan      Feb    March    April      May
Name
Amit     2000.0   2050.0   1099.0   2134.0   2111.0
Rabi     2122.0   3022.0   3456.0   3111.0   2109.0
Abhi        NaN   2334.0   2077.0      NaN   3122.0
Naresh   3050.0   3050.0   2010.0   2122.0   1111.0
Suman    2023.0   2232.0   3050.0   2123.0   1099.0
Viroj    2050.0   2510.0      NaN   3012.0   2122.0
Nabin    4000.0   2000.0   2050.0      NaN   2111.0
Total   15245.0  17198.0  13742.0  12502.0  13785.0

A new row with the index label 'Total' has been added to the dataframe. Each entry in this row contains the total salary paid in that month.

How did it work?

The Series was converted into a one-row dataframe: the index labels of the Series became the column names, and its name, 'Total', became the row label. That one-row dataframe was then added to the end of the original dataframe, which gave us the new row.
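As a rough sketch of the conversion described above, the snippet below turns a named Series into a one-row dataframe using to_frame() and a transpose. The two totals are copied from the output above; the variable names are only illustrative.

```python
import pandas as pd

# Two of the monthly totals from the output above, as a named Series
total = pd.Series({'Jan': 15245.0, 'Feb': 17198.0}, name='Total')

# to_frame() gives a single column called 'Total';
# transposing turns it into a single row labelled 'Total'
total_row = total.to_frame().T

print(total_row)
```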

Get Sum of certain rows in Dataframe by row numbers :

In the previous example we summed all the rows of the dataframe, but what if we want the total of only a few rows? For example, with the above data we may want the totals of the top 3 rows, i.e. the total monthly salary for only the first 3 employees,

import pandas as pd
import numpy as np
# The List of Tuples
salary_of_employees = [('Amit', 2000, 2050, 1099, 2134, 2111),
                    ('Rabi', 2122, 3022, 3456, 3111, 2109),
                    ('Abhi', np.nan, 2334, 2077, np.nan, 3122),
                    ('Naresh', 3050, 3050, 2010, 2122, 1111),
                    ('Suman', 2023, 2232, 3050, 2123, 1099),
                    ('Viroj', 2050, 2510, np.nan, 3012, 2122),
                    ('Nabin', 4000, 2000, 2050, np.nan, 2111)]
# By Creating a DataFrame object from list of tuples
test = pd.DataFrame(salary_of_employees,
                  columns=['Name',  'Jan', 'Feb', 'March', 'April', 'May'])
# To Set column Name as the index of dataframe
test.set_index('Name', inplace=True)


#By getting sum of values of top 3 DataFrame rows,
sumtabOf = test.iloc[0:3].sum()
print(sumtabOf)
Output :
Jan      4122.0
Feb      7406.0
March    6632.0
April    5245.0
May      7342.0
dtype: float64

We selected the first 3 rows of the dataframe and called sum() on them. It returns a Series containing the total monthly salary paid to the selected employees only, i.e. for the first three rows of the original dataframe.

Get the sum of specific rows in Pandas Dataframe by index/row label :

As in the previous example, we can also select specific rows by index label and get the sum of the values in those selected rows only, i.e.

import pandas as pd
import numpy as np
# The List of Tuples
salary_of_employees = [('Amit', 2000, 2050, 1099, 2134, 2111),
                    ('Rabi', 2122, 3022, 3456, 3111, 2109),
                    ('Abhi', np.nan, 2334, 2077, np.nan, 3122),
                    ('Naresh', 3050, 3050, 2010, 2122, 1111),
                    ('Suman', 2023, 2232, 3050, 2123, 1099),
                    ('Viroj', 2050, 2510, np.nan, 3012, 2122),
                    ('Nabin', 4000, 2000, 2050, np.nan, 2111)]
# By Creating a DataFrame object from list of tuples
test = pd.DataFrame(salary_of_employees,
                  columns=['Name',  'Jan', 'Feb', 'March', 'April', 'May'])
# To Set column Name as the index of dataframe
test.set_index('Name', inplace=True)


# By getting sum of 3 DataFrame rows (selected by index labels)
sumtabOf = test.loc[['Amit', 'Naresh', 'Viroj']].sum()
print(sumtabOf)
Output :
Jan      7100.0
Feb      7610.0
March    3109.0
April    7268.0
May      5344.0
dtype: float64

We selected 3 rows of the dataframe by the index labels 'Amit', 'Naresh' and 'Viroj', and then summed the values of these selected employees only. It returns a Series with the total salary paid per month to those selected employees.

Conclusion:

So in the above examples we saw how to sum all rows or only certain rows of a dataframe.

Python Pandas : Replace or change Column & Row index names in DataFrame

Replacing or changing Column & Row index names in DataFrame

In this article we will discuss

  • How to change column names or
  • Row Index names in the DataFrame object.

First, create a DataFrame object of student records, i.e.

import pandas as pd
students_record = [ ('Amit', 27, 'Kolkata') ,
                    ('Mini', 24, 'Chennai' ) ,
                    ('Nira', 34, 'Mumbai') ]
# By creating a DataFrame object
do = pd.DataFrame(students_record, columns = ['Name' , 'Age', 'City'], index=['x', 'y', 'z']) 
print(do)
Output :
   Name  Age     City
x  Amit   27  Kolkata
y  Mini   24  Chennai
z  Nira   34   Mumbai

Change Column Names in DataFrame :

A DataFrame object has an attribute columns, which is an Index object containing the column labels of the dataframe. We can get the column names from this Index object, i.e.

import pandas as pd
students_record = [ ('Amit', 27, 'Kolkata') ,
                    ('Mini', 24, 'Chennai' ) ,
                    ('Nira', 34, 'Mumbai') ]
# By creating a DataFrame object
do = pd.DataFrame(students_record, columns = ['Name' , 'Age', 'City'], index=['x', 'y', 'z']) 


# By getting ndArray of all column names 
column_Name_Arr = do.columns.values
print(column_Name_Arr)
Output :
['Name' 'Age' 'City']

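Any modification to this ndarray (df.columns.values) changes the column names of the actual DataFrame, because the array shares its data with the Index object. Note that pandas documents Index objects as immutable, so this in-place approach relies on behaviour that is not guaranteed across versions; DataFrame.rename() is the supported way to rename labels. For example, let's change the column name at index 0, i.e.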

import pandas as pd
students_record = [ ('Amit', 27, 'Kolkata') ,
                    ('Mini', 24, 'Chennai' ) ,
                    ('Nira', 34, 'Mumbai') ]
# By creating a DataFrame object
do = pd.DataFrame(students_record, columns = ['Name' , 'Age', 'City'], index=['x', 'y', 'z']) 


# By getting ndArray of all column names 
column_Name_Arr = do.columns.values
# By Modifying a Column Name
column_Name_Arr[0] = 'Name_Vr'
print(column_Name_Arr)
Output :
['Name_Vr' 'Age' 'City']

Change Row Index in DataFrame

To get all the row index names from the DataFrame object, use the attribute index instead of columns, i.e. df.index.values

It returns an ndarray of all the row index labels of the dataframe. Any modification to this ndarray (df.index.values) modifies the actual DataFrame. For example, let's change the row index name at position 0, i.e. replace it with 'j'.

This change is reflected in the linked DataFrame object as well.

But if we convert the array to a list before making changes, the changes will not be visible in the original DataFrame object, because the list is a copy of the row index names of the DataFrame.

The complete program is given below.

import pandas as pd
students_record = [ ('Amit', 27, 'Kolkata') ,
                    ('Mini', 24, 'Chennai' ) ,
                    ('Nira', 34, 'Mumbai') ]
# By creating a DataFrame object
do = pd.DataFrame(students_record, columns = ['Name' , 'Age', 'City'], index=['x', 'y', 'z']) 


# For getting an array of all the row index names
index_Name_Arr = do.index.values
print(index_Name_Arr)


#For Modifying a Row Index Name
index_Name_Arr[0] = 'j'
print(index_Name_Arr)


# For getting a copy of all the row index names as a list
index_Names = list(do.index.values)
print(index_Names)
print(do)
Output :
['x' 'y' 'z']
['j' 'y' 'z']
['j', 'y', 'z']
  Name  Age     City
j  Amit   27  Kolkata
y  Mini   24  Chennai
z  Nira   34   Mumbai
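
As an alternative to editing columns.values and index.values in place, the sketch below uses DataFrame.rename(), which takes a mapping from old labels to new ones and returns a new dataframe. It reuses the student records from this article; the variable name renamed is an assumption for illustration.

```python
import pandas as pd

students_record = [('Amit', 27, 'Kolkata'),
                   ('Mini', 24, 'Chennai'),
                   ('Nira', 34, 'Mumbai')]
do = pd.DataFrame(students_record, columns=['Name', 'Age', 'City'], index=['x', 'y', 'z'])

# Map old labels to new labels; labels not mentioned are kept as they are
renamed = do.rename(columns={'Name': 'Name_Vr'}, index={'x': 'j'})

print(renamed)
```

Because rename() returns a new object, the original dataframe keeps its labels unless the result is assigned back to the same name.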

6 Ways to check if all values in Numpy Array are zero (in both 1D & 2D arrays) – Python

Check if all values in Numpy Array are zero (in both 1D & 2D arrays) in Python

In this article we will discuss different ways to check if all values in a numpy array are 0, in both 1D and 2D arrays.

So let’s start exploring the topic.

Method 1: Using numpy.all() to check if a 1D Numpy array contains only 0 :

In this method, arr == 0 compares each element of the array with zero and returns a bool array containing True or False. numpy.all() then returns True only if every value in that bool array is True.

# program :

import numpy as np

# 1D numpy array created from a list
arr = np.array([0, 0, 0, 0, 0, 0])

# Checking if all elements in array are zero
check_zero = np.all((arr == 0))
if check_zero:
    print('All the elements of array are zero')
else:
    print('All the elements of the array are not zero')
Output :
All the elements of array are zero

Method 2: Using numpy.any() to check if a 1D Numpy array contains only 0 :

We can use the numpy.any() function to check whether the array contains only zeros by looking for any non-zero value. numpy.any() treats its input as bool values, i.e. 0 as False and everything else as True. So if all the values in the array are zero it returns False, and by applying not to the result we can confirm whether the array contains only zeros.

# program :

import numpy as np

# 1D numpy array created from a list
arr = np.array([0, 0, 0, 0, 0, 0])

# Checking if all elements in array are zero
check_zero = not np.any(arr)
if check_zero:
    print('All the elements of array are zero')
else:
    print('All the elements of the array are not zero')
Output : 
All the elements of array are zero

Method 3: Using numpy.count_nonzero() to check if a 1D Numpy array contains only 0 :

numpy.count_nonzero() returns the count of non-zero values in the array. If the count is 0, all the elements in the array are zero; if it is greater than 0, at least one element is non-zero.

# program :

import numpy as np

# 1D numpy array created from a list
arr = np.array([0, 0, 0, 0, 0, 0])

# Checking if all elements in array are zero
# Count non zero items in array
count_non_zeros = np.count_nonzero(arr)
if count_non_zeros==0:
    print('All the elements of array are zero')
else:
    print('All the elements of the array are not zero')
Output :
All the elements of array are zero

Method 4: Using for loop to check if a 1D Numpy array contains only 0 :

By iterating over all the elements of the array, we can also check whether the array contains only zeros.

# program :

import numpy as np

# 1D numpy array created from a list
arr = np.array([0, 0, 0, 0, 0, 0])

# Checking if all elements in array are zero
def check_zero(arr):
    # iterating of the array
    # and checking if any element is not equal to zero
    for elem in arr:
        if elem != 0:
            return False
    return True
result = check_zero(arr)

if result:
    print('All the elements of array are zero')
else:
    print('All the elements of the array are not zero')
Output :
All the elements of array are zero

Method 5: Using List Comprehension to check if a 1D Numpy array contains only 0 :

Using a list comprehension we can also iterate over each element of the numpy array and build a list of the values that are non-zero. If that list contains any element, then not all the values of the numpy array are zero.

# program :

import numpy as np

# 1D numpy array created from a list
arr = np.array([0, 0, 0, 0, 0, 0])

# Iterate over each element of the array
# and count the non-zero items
result = len([elem for elem in arr if elem != 0])
# If the list of non-zero items is empty, the array contains only zeros.

if result==0:
    print('All the elements of array are zero')
else:
    print('All the elements of the array are not zero')
Output :
All the elements of array are zero

Method 6: Using min() and max() to check if a 1D Numpy array contains only 0 :

If the minimum and maximum values in the array are both zero, then we can confirm the array contains only zeros.

# program :

import numpy as np

# 1D numpy array created from a list
arr = np.array([0, 0, 0, 0, 0, 0])

if arr.min() == 0 and arr.max() == 0:
    print('All the elements of array are zero')
else:
    print('All the elements of the array are not zero')
Output :
All the elements of array are zero
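
All the examples above use an array of zeros, so the else branch never runs. As a quick sanity check, the sketch below wraps four of the checks in a hypothetical helper, zero_checks (a name introduced here, not a NumPy function), and runs it on an all-zero array and on an array with one non-zero value.

```python
import numpy as np

def zero_checks(arr):
    """Run four of the checks from this article and return their results by name."""
    return {
        'all': bool(np.all(arr == 0)),
        'any': not np.any(arr),
        'count_nonzero': bool(np.count_nonzero(arr) == 0),
        'min_max': bool(arr.min() == 0 and arr.max() == 0),
    }

all_zero = zero_checks(np.array([0, 0, 0, 0]))
one_non_zero = zero_checks(np.array([0, 0, 3, 0]))

print(all_zero)
print(one_non_zero)
```

Every check should agree: all True for the first array and all False for the second.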

Check if all elements in a 2D numpy array or matrix are zero :

Using the first technique, i.e. the numpy.all() function, we can check whether a 2D array contains only zeros.

# program :

import numpy as np

# 2D numpy array created 
arr_2d = np.array([[0, 0, 0],
                   [0, 0, 0],
                   [0, 0, 0]])

# Checking if all 2D numpy array contains only 0
result = np.all((arr_2d == 0))

if result:
    print('All the elements of the 2D array are zero')
else:
    print('Not all the elements of the 2D array are zero')
Output : 
All the elements of the 2D array are zero
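
For a 2D array it can also be useful to know which rows or columns are entirely zero. The sketch below passes the axis argument of numpy.all() for that purpose; the array contents and variable names are made up for illustration.

```python
import numpy as np

# Illustrative 2D array with a single non-zero value in the middle
arr_2d = np.array([[0, 0, 0],
                   [0, 5, 0],
                   [0, 0, 0]])

rows_all_zero = np.all(arr_2d == 0, axis=1)  # one result per row
cols_all_zero = np.all(arr_2d == 0, axis=0)  # one result per column
whole_array = np.all(arr_2d == 0)            # single result for the whole array

print(rows_all_zero)
print(cols_all_zero)
print(whole_array)
```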

Disarium Number in Python

Disarium Number:

A Disarium number is one in which the sum of each digit raised to the power of its respective position equals the original number.

Examples are 135, 89, etc.

Example1:

Input:

number =135

Output:

135 is disarium number

Explanation:

Here 1^1 + 3^2 + 5^3 = 135 so it is disarium Number

Example2:

Input:

number =79

Output:

79 is not disarium number

Explanation:

Here 7^1 + 9^2 = 88, which is not equal to 79, so it is not a Disarium number.
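
The arithmetic in the two worked examples can be checked with a short sketch. The helper name digit_power_sum is an assumption made for this illustration; it simply sums each digit raised to the power of its 1-based position from the left.

```python
def digit_power_sum(number):
    """Sum of each digit raised to the power of its position (counted from 1, left to right)."""
    return sum(int(digit) ** position
               for position, digit in enumerate(str(number), start=1))

for number in (135, 89, 79):
    total = digit_power_sum(number)
    verdict = "is" if total == number else "is not"
    print(number, "->", total, ":", verdict, "a Disarium number")
```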

Disarium Number in Python

Below are the ways to check Disarium number in python

Method #1: Using while loop

Algorithm:

  • Read the number and count its digits.
  • Make a copy of the number so you can verify the result later.
  • Create a result variable (initialized to 0) and a position counter (set to the number of digits).
  • Use a while loop to go through the number digit by digit, starting from the last digit.
  • On each iteration, add the digit raised to the power of the position counter to the result.
  • After each digit, decrement the position counter.
  • Compare the result to the original number.

Below is the implementation:

# given number
num = 135
# initialize result to zero (ans)
ans = 0
# calculating the digits
digits = len(str(num))
# copy the number in another variable(duplicate)
dup_number = num
while (dup_number != 0):

    # getting the last digit
    remainder = dup_number % 10

    # add the digit raised to the power of its position to the result
    ans = ans + remainder**digits
    digits = digits - 1
    dup_number = dup_number//10
# It is disarium number if it is equal to original number
if(num == ans):
    print(num, "is disarium number")
else:
    print(num, "is not disarium number")

Output:

135 is disarium number

Method #2: By converting the number to string and Traversing the string to extract the digits

Algorithm:

  • Initialize a variable, say ans, to 0.
  • Convert the given number to a string and store it in a new variable.
  • Start a counter count at 1 and increase it after each iteration.
  • Iterate through the string, convert each character to an integer, and add the digit raised to the power of count to ans.
  • If ans is equal to the given number, it is a Disarium number.

Below is the implementation:

# given number
num = 135
# initialize result to zero (ans)
ans = 0
# make a temp count to 1
count = 1
# converting given number to string
numString = str(num)
# Traverse through the string
for char in numString:
    # Convert the character to an integer and
    # add the digit raised to the power of its position to ans
    ans = ans+int(char)**count
    count = count+1
# It is disarium number if it is equal to original number
if(num == ans):
    print(num, "is disarium number")
else:
    print(num, "is not disarium number")

Output:

135 is disarium number
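
The string-based approach above can also be wrapped in a reusable function. The following is a minimal sketch, assuming a non-negative integer input; the function name is_disarium and the search range 1 to 200 are illustrative choices, not part of the original programs:

```python
def is_disarium(num):
    """Return True if num equals the sum of its digits raised to their positions."""
    # enumerate(..., start=1) pairs each digit with its 1-based position from the left
    total = sum(int(digit) ** position
                for position, digit in enumerate(str(num), start=1))
    return total == num


# Hypothetical usage: collect the Disarium numbers between 1 and 200
disarium_numbers = [n for n in range(1, 201) if is_disarium(n)]
print(disarium_numbers)
```

Because the check lives in a function, the same logic can be reused for a single number or for a whole range.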


Harshad Number in Python


Harshad Number:

A Harshad number is a number that is divisible by the sum of its digits.

like 5 , 18 , 156 etc.

Example 1:

Input:

number=18

Output:

18 is harshad number

Explanation:

Here sum_of_digits=9 i.e (1+8 ) and 18 is divisible by 9

Example 2:

Input:

number=19

Output:

19 is not harshad number

Explanation:

Here sum_of_digits=10 i.e (1+ 9 ) and 19  is not divisible by 10

Harshad Number in Python

Below are the ways to check harshad number in python


Method #1: Using while loop

Algorithm:

  • Read the input number.
  • Make a copy of the number so the original stays available for the final check.
  • Create a result variable (set to 0) to hold the sum of digits.
  • Use a while loop to go through the copy digit by digit.
  • On each iteration, add the last digit to the result and drop that digit from the copy.
  • Check whether the original number is divisible by the sum of digits.
  • If it divides evenly, it is a Harshad number; otherwise, it is not.

Below is the implementation:

# given number
num = 18
# initialize sum of digits to 0
sum_of_digits = 0
# copy the number in another variable(duplicate)
dup_number = num
# Traverse the digits of the number using a while loop
while dup_number > 0:
    sum_of_digits = sum_of_digits + dup_number % 10
    dup_number = dup_number // 10
# It is a harshad number if the given number is divisible by its sum of digits

if(num % sum_of_digits == 0):
    print(num, "is harshad number")
else:
    print(num, "is not harshad number")

Output:

18 is harshad number

Method #2: By converting the number to string and Traversing the string to extract the digits

Algorithm:

  • Using a new variable, we must convert the given number to a string.
  • Iterate through the string, convert each character to an integer, and add the result to the sum.
  • If a number divides perfectly, it is a Harshad Number; otherwise, it is not.

Below is the implementation:

# given number
num = 18

# Converting the given number to string
numString = str(num)

# initialize sum of digits to 0
sum_of_digits = 0

# Traverse through the string
for char in numString:
    # Converting the character of string to integer and adding to sum_of_digits
    sum_of_digits = sum_of_digits + int(char)


# It is a harshad number if the given number is divisible by its sum of digits

if(num % sum_of_digits == 0):
    print(num, "is harshad number")
else:
    print(num, "is not harshad number")

Output:

18 is harshad number

Method #3: Using list and map

Algorithm:

  • Convert the digits of given number to list using map function.
  • Calculate the sum of digits using sum() function.
  • If a number divides perfectly, it is a Harshad Number; otherwise, it is not.

Below is the implementation:

# given number
num = 18
# Converting the given number to string
numString = str(num)
# Convert the digits of given number to list using map function
numlist = list(map(int, numString))
# calculate sum of list
sum_of_digits = sum(numlist)

# It is a harshad number if the given number is divisible by its sum of digits
if(num % sum_of_digits == 0):
    print(num, "is harshad number")
else:
    print(num, "is not harshad number")

Output:

18 is harshad number
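
All three programs above would raise a ZeroDivisionError for num = 0, because the sum of digits is then 0. The following is a minimal sketch of a reusable check that guards against this; the function name is_harshad and the choice to treat zero and negative numbers as "not Harshad" are assumptions made for illustration:

```python
def is_harshad(num):
    """Return True if the positive integer num is divisible by its digit sum."""
    # Assumption: zero and negative numbers are reported as not Harshad,
    # which also avoids dividing by a digit sum of 0
    if num <= 0:
        return False
    digit_sum = sum(int(digit) for digit in str(num))
    return num % digit_sum == 0


# Hypothetical usage with the numbers mentioned in this article
for n in (18, 19, 156):
    print(n, "is harshad number" if is_harshad(n) else "is not harshad number")
```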


Find the Factorial of a Number

Python Program to Find the Factorial of a Number

Factorial of a number:

The product of all positive integers less than or equal to n is the factorial of a non-negative integer n, denoted by n! in mathematics:

n! = n * (n - 1) * (n - 2) * . . . * 3 * 2 * 1

For example, 4! = 4 * 3 * 2 * 1 = 24. By convention, 0! = 1.

Examples:

Input:

number = 7

Output:

Factorial of 7 = 5040

Given a number, the task is to find its factorial.

Finding Factorial of a Number

There are several ways to find the factorial of a number in Python. Some of them are:


Method #1: Using for loop (Without Recursion)

Approach:

  • Create a variable, say res, and initialize it to 1.
  • Using a for loop and range(), run a loop from 1 to n (inclusive).
  • On each iteration, multiply res by the loop iterator value.

Below is the implementation:

# given number
num = 7
# initializing a variable res with 1
res = 1
# Traverse from 1 to n
for i in range(1, num+1):
    res = res*i
# print the factorial
print("Factorial of", num, "=", res)

Output:

Factorial of 7 = 5040

Method #2: Using recursion

To find the factorial of a given number, we can use recursion. We define a function facto(num) that returns 1 if num is 0 or 1 (the base case); otherwise it returns num multiplied by facto(num - 1), so the calls repeat until the base case is reached.

Below is the implementation:

# function which returns the factorial of given number
def facto(num):
    if(num == 1 or num == 0):
        return 1
    else:
        return num*facto(num-1)


# given number
num = 7
# passing the given num to facto function which returns the factorial of the number
res = facto(num)
# print the factorial
print("Factorial of", num, "=", res)

Output:

Factorial of 7 = 5040
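
Note that the recursive facto() above never reaches its base case for a negative input, so Python eventually raises a RecursionError. Very large inputs can hit the recursion limit as well. The following is a minimal sketch of an iterative version that validates its input first; the function name factorial_checked and the error message are illustrative assumptions:

```python
def factorial_checked(num):
    """Return num! for a non-negative integer, raising ValueError otherwise."""
    # Reject inputs for which the factorial is not defined
    if not isinstance(num, int) or num < 0:
        raise ValueError("factorial is only defined for non-negative integers")
    result = 1
    # Multiplying from 2 upwards leaves result at 1 for num = 0 and num = 1
    for i in range(2, num + 1):
        result *= i
    return result


print("Factorial of 7 =", factorial_checked(7))
```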

Method #3: Using built-in Python functions

The math module in Python's standard library provides the function

math.factorial(number)

which returns the factorial of the given number.

Note: the math module must be imported before factorial() can be used.

Below is the implementation:

# import math module
import math
# given number
num = 7
# finding factorial using Built in python function
res = math.factorial(num)
# print the factorial
print("Factorial of", num, "=", res)

Output:

Factorial of 7 = 5040


Pandas: Create Series from dictionary in python

Creating Series from dictionary in python

In this article we will discuss different ways to convert a Python dictionary to a Pandas Series object.

Pandas provides a constructor for the Series class, i.e.

Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

Where,

  • data : An array-like object, iterable sequence, or dictionary whose items become the values of the Series.
  • index : An array-like object or iterable sequence whose items become the index labels of the Series.
  • dtype : The data type of the output Series.

Create a Pandas Series from dict in python :

By passing the dictionary to the Series class constructor, i.e. Series(), we get a new Series object in which the keys of the dictionary become the index labels and the values of the key-value pairs become the values of the Series.

So, let’s see the example.

# Program :

import pandas as pd
# Dictionary (named data to avoid shadowing the built-in dict type)
data = {
    'C': 56,
    "A": 23,
    'D': 43,
    'E': 78,
    'B': 11
}
# Convert the dictionary to a Pandas Series object.
# The dictionary keys become the index of the Series and
# the dictionary values become the values of the Series.
series_object = pd.Series(data)
print('Contents of Pandas Series: ')
print(series_object)
Output :
Contents of Pandas Series: 
C    56
A    23
D    43
E    78
B    11
dtype: int64

Here the index of the Series object contains the keys of the dictionary, and the values of the Series object are the values of the dictionary.

Create Pandas series object from a dictionary with index in a specific order :

In the above example we observed that the index labels of the Series object are in the same order as the keys of the dictionary. In this example we will see how to convert the dictionary into a Series object with the index in a different order.

So, let’s see the example.

# Program :

import pandas as pd
# Dictionary (named data to avoid shadowing the built-in dict type)
data = {
    'C': 6,
    "A": 3,
    'D': 4,
    'E': 8,
    'B': 1
}
# Create a Series from the dictionary, passing the index list separately.
# The dictionary keys become the index of the Series and
# the dictionary values become the values of the Series,
# but the entries appear in the order given by the index list.
series_object = pd.Series(data,
                       index=['E', 'D', 'C', 'B', 'A'])
print('Contents of Pandas Series: ')
print(series_object)
Output :
Contents of Pandas Series: 
E    8
D    4
C    6
B    1
A    3
dtype: int64

Create a Pandas Series object from specific key-value pairs in a dictionary :

In the above examples the Series object was created from all the items in the dictionary, because we passed the dictionary as the only argument to the Series constructor. Now we will see how to convert only specific key-value pairs of the dictionary to a Series object.

So, let’s see the example.

# Program :

import pandas as pd
# Dictionary (named data to avoid shadowing the built-in dict type)
data = {
    'C': 6,
    "A": 3,
    'D': 4,
    'E': 8,
    'B': 1
}
# Create a Series from the dictionary, passing the index list separately.
# The dictionary keys become the index of the Series and
# the dictionary values become the values of the Series,
# but only the key-value pairs whose keys are listed in index are kept.
series_object = pd.Series(data,
                       index=['E', 'D', 'C'])
print('Contents of Pandas Series: ')
print(series_object)
Output :
Contents of Pandas Series: 
E    8
D    4
C    6
dtype: int64
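
If the index list contains a label that is not a key of the dictionary, Pandas does not raise an error. Instead, that entry is filled with NaN, and an integer Series becomes float64 because NaN is a floating-point value. The following is a minimal sketch of this case; the dictionary name scores and the missing label 'Z' are made up for illustration:

```python
import math

import pandas as pd

# Hypothetical dictionary; note that it has no key 'Z'
scores = {'C': 6, 'A': 3, 'D': 4}

# 'Z' is requested in the index but missing from the dictionary
series_object = pd.Series(scores, index=['D', 'C', 'Z'])
print(series_object)

# The missing label holds NaN and the dtype is promoted to float64
print(math.isnan(series_object['Z']))
print(series_object.dtype)
```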

 


Pandas : Sort a DataFrame based on column names or row index labels using Dataframe.sort_index()

Sorting a DataFrame based on column names or row index labels using Dataframe.sort_index() in Python

In this article we will discuss how to sort the contents of a DataFrame based on column names or row index labels using Dataframe.sort_index().

Dataframe.sort_index():

In the Python Pandas library, the DataFrame class provides a member function sort_index() to sort a DataFrame based on the label names along an axis, i.e.

DataFrame.sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True)

Where,

  • axis : If axis is 0, the DataFrame is sorted based on the row index labels. If axis is 1, the DataFrame is sorted based on the column names. The default is 0.
  • ascending : If True, sort in ascending order; otherwise sort in descending order. The default is True.
  • inplace : If True, perform the sort in place on the DataFrame itself. The default is False.
  • na_position : Determines the position of NaNs after sorting, i.e. 'first' puts NaNs at the beginning and 'last' puts NaNs at the end. The default is 'last'.

It returns a sorted DataFrame object. If the inplace argument is False, it returns a sorted copy of the given DataFrame instead of modifying the original DataFrame. If the inplace argument is True, the current DataFrame itself is sorted and nothing is returned.
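
The difference between the two inplace settings can be checked directly. The following is a minimal sketch, assuming a small made-up DataFrame (the column name, values and index labels are only for illustration):

```python
import pandas as pd

# Hypothetical DataFrame with an unsorted row index
df = pd.DataFrame({'Marks': [31, 23, 16]}, index=['b', 'a', 'c'])

# Default (inplace=False): a sorted copy is returned, df keeps its order
sorted_copy = df.sort_index()
original_order = list(df.index)
print(original_order)
print(list(sorted_copy.index))

# inplace=True: df itself is sorted and the call returns None
result = df.sort_index(inplace=True)
print(result)
print(list(df.index))
```

Assigning the result of a call with inplace=True to a variable is a common mistake, because that variable will then hold None rather than a DataFrame.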

Let’s look at some examples, starting with the DataFrame used throughout this article:

# Program :

import pandas as pd
# List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# Create a DataFrame object
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
print(dfObj)
Output :
      Name  Marks          City
b     Rama     31        canada
a    Symon     23       Chennai
f    Arati     16    Maharastra
e  Bhabani     32       Kolkata
d     Modi     33  Uttarpradesh
c   Heeron     39     Hyderabad

Now let’s see how to sort this DataFrame based on its labels, i.e. column names or row index labels.

Sort rows of a Dataframe based on Row index labels :

To sort by row index labels, we can call sort_index() on the DataFrame object.

import pandas as pd
# The List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# To create DataFrame object 
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
# Sort the rows of the dataframe based on row index labels
modDFObj = dfObj.sort_index()
print('Contents of Dataframe sorted based on row index labels:')
print(modDFObj)
Output :
Contents of Dataframe sorted based on row index labels:
      Name  Marks          City
a    Symon     23       Chennai
b     Rama     31        canada
c   Heeron     39     Hyderabad
d     Modi     33  Uttarpradesh
e  Bhabani     32       Kolkata
f    Arati     16    Maharastra

As we can see in the output, the rows are now sorted based on the index labels. Instead of modifying the original DataFrame, sort_index() returned a sorted copy of it.

Sort rows of a Dataframe in Descending Order based on Row index labels :

To sort based on row index labels in descending order, we need to pass the argument ascending=False to the sort_index() function of the DataFrame object.

import pandas as pd
# The List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# To create DataFrame object 
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
# Sort the rows of the dataframe in descending order based on row index labels
conObj = dfObj.sort_index(ascending=False)
print('Contents of Dataframe sorted in descending order based on row index labels:')
print(conObj)
Output :
Contents of Dataframe sorted in descending order based on row index labels:
      Name  Marks          City
f    Arati     16    Maharastra
e  Bhabani     32       Kolkata
d     Modi     33  Uttarpradesh
c   Heeron     39     Hyderabad
b     Rama     31        canada
a    Symon     23       Chennai

As we can see in the output, the rows are sorted in descending order based on the index labels. Again, instead of modifying the original DataFrame, it returned a sorted copy.

Sort rows of a Dataframe based on Row index labels in Place :

To sort a DataFrame in place instead of getting a sorted copy, pass inplace=True to the sort_index() function of the DataFrame object. This sorts the DataFrame itself by its row index labels, i.e.

import pandas as pd
# The List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# To create DataFrame object 
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
# Sort the rows of the dataframe in place based on row index labels
dfObj.sort_index(inplace=True)
print('Contents of Dataframe sorted in place based on row index labels:')
print(dfObj)
Output :
Contents of Dataframe sorted in place based on row index labels:
      Name  Marks          City
a    Symon     23       Chennai
b     Rama     31        canada
c   Heeron     39     Hyderabad
d     Modi     33  Uttarpradesh
e  Bhabani     32       Kolkata
f    Arati     16    Maharastra

Sort Columns of a Dataframe based on Column Names :

To sort a DataFrame based on column names, we can call sort_index() on the DataFrame object with axis=1, i.e.

import pandas as pd
# The List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# To create DataFrame object 
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
# Sort the dataframe based on column names
conObj = dfObj.sort_index(axis=1)
print('Contents of Dataframe sorted based on column names:')
print(conObj)

Output :
Contents of Dataframe sorted based on column names:
           City  Marks     Name
b        canada     31     Rama
a       Chennai     23    Symon
f    Maharastra     16    Arati
e       Kolkata     32  Bhabani
d  Uttarpradesh     33     Modi
c     Hyderabad     39   Heeron

As we can see, instead of modifying the original DataFrame, it returned a copy of the DataFrame with its columns sorted by name.

Sort Columns of a Dataframe in Descending Order based on Column Names :

To sort a DataFrame based on column names in descending order, we can call sort_index() on the DataFrame object with axis=1 and ascending=False, i.e.

import pandas as pd
# The List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# To create DataFrame object 
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
# Sort the dataframe in descending order based on column names
conObj = dfObj.sort_index(ascending=False, axis=1)
print('Contents of Dataframe sorted in descending order based on column names:')
print(conObj)
Output :
Contents of Dataframe sorted in descending order based on column names:
      Name  Marks          City
b     Rama     31        canada
a    Symon     23       Chennai
f    Arati     16    Maharastra
e  Bhabani     32       Kolkata
d     Modi     33  Uttarpradesh
c   Heeron     39     Hyderabad

Instead of modifying the original DataFrame, it returned a copy of the DataFrame with its columns sorted by name in descending order.

Sort Columns of a Dataframe in Place based on Column Names :

To sort a DataFrame in place by column names instead of getting a sorted copy, pass inplace=True and axis=1 to the sort_index() function of the DataFrame object, i.e.

import pandas as pd
# The List of Tuples
students = [ ('Rama', 31, 'canada') ,
             ('Symon', 23, 'Chennai' ) ,
             ('Arati', 16, 'Maharastra') ,
             ('Bhabani', 32, 'Kolkata' ) ,
             ('Modi', 33, 'Uttarpradesh' ) ,
             ('Heeron', 39, 'Hyderabad' )
              ]
# To create DataFrame object 
dfObj = pd.DataFrame(students, columns=['Name', 'Marks', 'City'], index=['b', 'a', 'f', 'e', 'd', 'c'])
# Sort the dataframe in place based on column names
dfObj.sort_index(inplace=True, axis=1)
print('Contents of Dataframe sorted in place based on column names:')
print(dfObj)

Output :
Contents of Dataframe sorted in place based on column names:
           City  Marks     Name
b        canada     31     Rama
a       Chennai     23    Symon
f    Maharastra     16    Arati
e       Kolkata     32  Bhabani
d  Uttarpradesh     33     Modi
c     Hyderabad     39   Heeron
