Satyabrata Jena

Python: numpy.reshape() function Tutorial with examples

Understanding numpy.reshape() function Tutorial with examples

In this article we will see how we can use numpy.reshape() function to change the shape of a numpy array.

numpy.reshape() :

Syntax:- numpy.reshape(a, newshape, order='C')

where,

  • a : Array, list or list of lists that needs to be reshaped.
  • newshape : New shape, given as a tuple or an int. (Pass a tuple to get a 2D or 3D array and an integer to get a 1D array of that length.)
  • order : Order in which items from the given array will be read and placed. (‘C’: read items in a row-wise manner, ‘F’: read items in a column-wise manner, ‘A’: read items column-wise if the array is Fortran-contiguous in memory, and row-wise otherwise)

Converting shapes of Numpy arrays using numpy.reshape() :

Use numpy.reshape() to convert a 1D numpy array to a 2D Numpy array :

To convert a 1D numpy array to a 2D numpy array we pass the array and the new shape as a tuple, i.e. (3, 3), to the reshape() function.

import numpy as sc
# Produce a 1D Numpy array from a given list
numArr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 92])
print('Original Numpy array:')
print(numArr)
# Convert the 1D Numpy array to a 2D Numpy array
arr_twoD = sc.reshape(numArr, (3,3))
print('2D Numpy array:')
print(arr_twoD)
Output :
Original Numpy array:
[10 20 30 40 50 60 70 80 92]
2D Numpy array:
[[10 20 30]
 [40 50 60]
 [70 80 92]]

New shape must be compatible with the original shape :

The new shape must be compatible with the shape of the array passed, i.e. if rows are denoted by ‘R’, columns by ‘C’ and the total number of items by ‘N’, then the new shape must satisfy the relation R*C=N; otherwise a ValueError is raised.

import numpy as sc
# Produce a 1D Numpy array from a given list
numArr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 92])
print('Original Numpy array:')
print(numArr)
# convert the 1D Numpy array to a 2D Numpy array
arr_twoD = sc.reshape(numArr, (3,2))
print('2D Numpy array:')
print(arr_twoD)
Output :
ValueError: cannot reshape array of size 9 into shape (3,2)
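The compatibility rule can also be checked before calling reshape(). The following is a minimal sketch, assuming a hypothetical helper named can_reshape (not part of NumPy) that compares the number of items in the array with the number of items the new shape needs.

```python
import numpy as np

def can_reshape(arr, new_shape):
    """Hypothetical helper: True if arr holds exactly as many items as new_shape needs."""
    return arr.size == int(np.prod(new_shape))

num_arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 92])

ok_3x3 = can_reshape(num_arr, (3, 3))   # 3 * 3 == 9
ok_3x2 = can_reshape(num_arr, (3, 2))   # 3 * 2 != 9

# Alternatively, attempt the reshape and handle the error
try:
    np.reshape(num_arr, (3, 2))
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(ok_3x3, ok_3x2, reshape_failed)
```

Either approach works; catching ValueError is usually simpler when the new shape comes from user input.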

Using numpy.reshape() to convert a 1D numpy array to a 3D Numpy array :

We can convert a 1D numpy array into a 3D numpy array by passing the array and the shape of the 3D array as a tuple to the reshape() function.

import numpy as sc
# Produce a 1D Numpy array from a given list
numArr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 91, 95, 99])
print('Original Numpy array:')
print(numArr)
# Convert the 1D Numpy array to a 3D Numpy array
arr_threeD = sc.reshape(numArr, (3,2,2))
print('3D Numpy array:')
print(arr_threeD)
Output :
Original Numpy array:
[10 20 30 40 50 60 70 80 90 91 95 99]
3D Numpy array:
[[[10 20]
  [30 40]]
 [[50 60]
  [70 80]]
 [[90 91]
  [95 99]]]

Use numpy.reshape() to convert a 3D numpy array to a 2D Numpy array :

We can also convert a 3D numpy array to a 2D numpy array.

import numpy as sc
# Create a 3D numpy array
arr_threeD = sc.array([[[10, 20],
                   [30, 40],
                   [50, 60]],
                 [[70, 80],
                  [90, 91],
                  [95, 99]]])
print('3D Numpy array:')
print(arr_threeD)
# Converting 3D numpy array to numpy array of size 2x6
arr_twoD = sc.reshape(arr_threeD, (2,6))
print('2D Numpy Array:')
print(arr_twoD)
Output :
3D Numpy array:
[[[10 20]
  [30 40]
  [50 60]]
 [[70 80]
  [90 91]
  [95 99]]]
2D Numpy Array:
[[10 20 30 40 50 60]
 [70 80 90 91 95 99]]

Use numpy.reshape() to convert a 2D numpy array to a 1D Numpy array :

If we pass a numpy array and -1 to reshape(), an array of any shape is converted into a flat 1D array.

import numpy as sc
arr_twoD = sc.array([[10, 20, 30],
                [30, 40, 50],
                [60, 70, 82]])     
# Convert numpy array of any shape to 1D array
flatn_arr = sc.reshape(arr_twoD, -1)
print('1D Numpy array:')
print(flatn_arr) 
Output :
1D Numpy array:
[10 20 30 30 40 50 60 70 82]
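The value -1 can also stand in for a single dimension inside a shape tuple, in which case NumPy infers that dimension from the total number of items. A small sketch, using a made-up 12-item array for illustration:

```python
import numpy as np

num_arr = np.arange(12)  # 12 items: 0 .. 11

# NumPy works out the -1 dimension from the total size: 12 / 3 = 4 columns
three_rows = np.reshape(num_arr, (3, -1))

# Here the first dimension is inferred: 12 / (2 * 3) = 2 blocks
blocks = np.reshape(num_arr, (-1, 2, 3))

print(three_rows.shape)  # (3, 4)
print(blocks.shape)      # (2, 2, 3)
```

Only one dimension may be given as -1, and the remaining dimensions must still divide the total size evenly.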

numpy.reshape() returns a new view object if possible :

In some scenarios the reshape() function returns a view of the passed object. If we modify anything in the view object it will be reflected in the main object and vice-versa.

import numpy as sc
# create a 1D Numpy array
num_arr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 92])  
# Get a View object of any shape 
arr_twoD = sc.reshape(num_arr, (3,3))
print('Original array:')
print(arr_twoD)
# Modify the 5th element of the original array 
# Modification will also be visible in view object
num_arr[4] = 9
print('Modified 1D Numpy array:')
print(num_arr)
print('2D Numpy array:')
print(arr_twoD)
Output :
Original array:
[[10 20 30]
 [40 50 60]
 [70 80 92]]
Modified 1D Numpy array:
[10 20 30 40  9 60 70 80 92]
2D Numpy array:
[[10 20 30]
 [40  9 60]
 [70 80 92]]

How to check if reshape() returned a view object ?

In some scenarios the reshape() function may not return a view object. We can check whether the returned object is a view by looking at its base attribute.

If the base attribute is None then the object owns its data and is not a view; otherwise it is a view, i.e. the base attribute points to the original array object.

import numpy as sc
# create a 1D Numpy array
num_arr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
arr_twoD = sc.reshape(num_arr, (3,3))
if arr_twoD.base is not None:
    print('arr_twoD is a view of original array')
    print('base array : ', arr_twoD.base)
Output :
arr_twoD is a view of original array
base array :  [10 20 30 40 50 60 70 80 90]
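Another way to tell a view from a copy is to ask NumPy whether two arrays share memory. The sketch below, which uses a small made-up array, relies on np.shares_memory() and shows one case where reshape() has to copy: flattening a transposed array in row-wise order.

```python
import numpy as np

num_arr = np.arange(6).reshape(2, 3)

# Reshaping a contiguous array can reuse the same memory
flat_view = np.reshape(num_arr, 6)

# The transpose is not contiguous in row-major order, so flattening it
# in 'C' order has to copy the data
flat_copy = np.reshape(num_arr.T, 6)

print(np.shares_memory(num_arr, flat_view))  # True
print(np.shares_memory(num_arr, flat_copy))  # False

# Writing through the view changes the original; the copy is unaffected
flat_view[0] = 100
print(num_arr[0, 0])   # 100
print(flat_copy[0])    # 0
```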

numpy.reshape() & different type of order parameters :

We can also pass the order parameter, whose value can be ‘C’, ‘F’ or ‘A’. This parameter decides in which order the elements of the given array will be used. The default value of the order parameter is ‘C’.

Convert 1D to 2D array row wise with order ‘C’ :

By passing order parameter ‘C’ in reshape() function the given array will be read row wise.

import numpy as sc
# create a 1D Numpy array
num_arr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 92])
print('original array:')
print(num_arr)
# Convert 1D numpy array to 2D by reading array in row wise manner
arr_twoD = sc.reshape(num_arr, (3, 3), order = 'C')
print('2D array being read in row wise manner:')
print(arr_twoD)
Output :
original array:
[10 20 30 40 50 60 70 80 92]
2D array being read in row wise manner:
[[10 20 30]
 [40 50 60]
 [70 80 92]]

Convert 1D to 2D array column wise with order ‘F’ :

By passing order parameter ‘F’ in reshape() function the given array will be read column wise.

import numpy as sc
# create a 1D Numpy array
num_arr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 92])
print('original array:')
print(num_arr)
# Convert 1D numpy array to 2D by reading array in column wise manner
arr_twoD = sc.reshape(num_arr, (3, 3), order = 'F')
print('2D array being read in column wise manner:')
print(arr_twoD)
Output :
original array:
[10 20 30 40 50 60 70 80 92]
2D array being read in column wise manner:
[[10 40 70]
 [20 50 80]
 [30 60 92]]
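To make the difference between the two orders easier to see, here is a short side-by-side sketch on a made-up 6-item array. It also checks that, for a 1D input, filling column-wise gives the same result as filling the transposed shape row-wise and then transposing.

```python
import numpy as np

num_arr = np.arange(6)  # [0 1 2 3 4 5]

row_wise = np.reshape(num_arr, (2, 3), order='C')
col_wise = np.reshape(num_arr, (2, 3), order='F')

print(row_wise)
# [[0 1 2]
#  [3 4 5]]
print(col_wise)
# [[0 2 4]
#  [1 3 5]]

# Column-wise filling of (2, 3) equals row-wise filling of (3, 2), transposed
same = np.array_equal(col_wise, np.reshape(num_arr, (3, 2)).T)
print(same)  # True
```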

Convert 1D to 2D array by memory layout with parameter order “A” :

If we pass order as ‘A’ in reshape() function, then the items of the input array will be read based on its internal memory layout.

Here it reads elements according to the memory layout of the original array and does not consider the current view of the input array.

import numpy as sc
# create a 1D Numpy array
num_arr = sc.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
print('Original 1D array: ',num_arr)
# Create a 2D view object and get transpose view of it
arr_twoD = sc.reshape(num_arr, (3, 3)).T
print('2D transposed View:')
print(arr_twoD)
# Read elements in row wise from memory layout of original 1D array
flatn_arr = sc.reshape(arr_twoD, 9, order='A')
print('Flattened 1D array')
print(flatn_arr)
Output :
Original 1D array:  [10 20 30 40 50 60 70 80 90]
2D transposed View:
[[10 40 70]
 [20 50 80]
 [30 60 90]]
Flattened 1D array
[10 20 30 40 50 60 70 80 90]

Convert the shape of a list using numpy.reshape() :

  • In reshape() function we can pass a list or a list of lists instead of an array.
import numpy as sc
num_list = [10,20,30,40,50,60,70,80,90]
# To convert a list to 2D Numpy array
arr_twoD = sc.reshape(num_list, (3,3))
print('2D Numpy array:')
print(arr_twoD)
Output :
2D Numpy array:
[[10 20 30]
 [40 50 60]
 [70 80 90]]
  • We can also convert a 2D numpy array to a list of lists.
import numpy as sc
num_list = [10,20,30,40,50,60,70,80,90]
# Convert a given list to 2D Numpy array
arr_twoD = sc.reshape(num_list, (3,3))
print('2D Numpy array:')
print(arr_twoD)
# Convert the 2D Numpy array to a list of lists of plain Python ints
list_list = arr_twoD.tolist()
print('List of List: ')
print(list_list)
Output :
2D Numpy array:
[[10 20 30]
 [40 50 60]
 [70 80 90]]
List of List:
[[10, 20, 30], [40, 50, 60], [70, 80, 90]]

 

 

Pandas : Change data type of single or multiple columns of Dataframe in Python

Changing data type of single or multiple columns of Dataframe in Python

In this article we will see how we can change the data type of a single column or multiple columns of a Dataframe in Python.

Change Data Type of a Single Column :

We will use Series.astype() to change the data type of columns.

Syntax:- Series.astype(self, dtype, copy=True, errors='raise', **kwargs)

where Arguments:

  • dtype : The Python or NumPy type to which the whole Series object will be converted.
  • errors : The way of handling errors, which can be ‘raise’ or ‘ignore’; the default value is ‘raise’. (raise: raise an exception in case of invalid parsing, ignore: return the input unchanged in case of invalid parsing)
  • copy : bool (default value is True) (If True, return a copy; if False, a copy is avoided where possible, so changes to the returned object may propagate to the original)

Returns: A new Series object with the updated type.

import pandas as sc
# List of Tuples
students = [('Rohit', 34, 'Swimming', 155) ,
        ('Ritik', 25, 'Cricket' , 179) ,
        ('Salim', 26, 'Music', 187) ,
        ('Rani', 29,'Sleeping' , 154) ,
        ('Sonu', 17, 'Singing' , 184) ,
        ('Madhu', 20, 'Travelling', 165 ),
        ('Devi', 22, 'Art', 141)
        ]
# Create a DataFrame object with different data type of column
studObj = sc.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
print(studObj)
print(studObj.dtypes)
Output :
    Name  Age       Hobby  Height
0  Rohit   34    Swimming     155
1  Ritik   25     Cricket     179
2  Salim   26       Music     187
3   Rani   29    Sleeping     154
4   Sonu   17     Singing     184
5  Madhu   20  Travelling     165
6   Devi   22         Art     141
Name      object
Age        int64
Hobby     object
Height     int64
dtype: object

Change data type of a column from int64 to float64 :

We can change the data type of a single column. For example, let’s change the data type of the ‘Age’ column from int64 to float64. For this we pass 'float64' to astype() and assign the result back to the column, so that the change is reflected in the dataframe.

import pandas as sc
# List of Tuples
students = [('Rohit', 34, 'Swimming', 155) ,
        ('Ritik', 25, 'Cricket' , 179) ,
        ('Salim', 26, 'Music', 187) ,
        ('Rani', 29,'Sleeping' , 154) ,
        ('Sonu', 17, 'Singing' , 184) ,
        ('Madhu', 20, 'Travelling', 165 ),
        ('Devi', 22, 'Art', 141)
        ]
# Create a DataFrame object with different datatype of column
studObj = sc.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Change data type of column 'Age' to float64
studObj['Age'] = studObj['Age'].astype('float64')
print(studObj)
print(studObj.dtypes)
Output :
    Name   Age       Hobby  Height
0  Rohit  34.0    Swimming     155
1  Ritik  25.0     Cricket     179
2  Salim  26.0       Music     187
3   Rani  29.0    Sleeping     154
4   Sonu  17.0     Singing     184
5  Madhu  20.0  Travelling     165
6   Devi  22.0         Art     141
Name       object
Age       float64
Hobby      object
Height      int64
dtype: object

Change data type of a column from int64 to string :

Let’s try to change the data type of the ‘Height’ column to string, i.e. object type. As the default value of the copy argument of astype() is True, it returns a copy of the passed series with the changed data type, which is then assigned back to studObj['Height'].

import pandas as sc
# List of Tuples
students = [('Rohit', 34, 'Swimming', 155) ,
        ('Ritik', 25, 'Cricket' , 179) ,
        ('Salim', 26, 'Music', 187) ,
        ('Rani', 29,'Sleeping' , 154) ,
        ('Sonu', 17, 'Singing' , 184) ,
        ('Madhu', 20, 'Travelling', 165 ),
        ('Devi', 22, 'Art', 141)
        ]
studObj = sc.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Change data type of column 'Age' from int64 to float64
studObj['Age'] = studObj['Age'].astype('float64')
# Change data type of column 'Height' from int64 to object type
studObj['Height'] = studObj['Height'].astype('object')
print(studObj)
print(studObj.dtypes)
Output :
    Name   Age       Hobby Height
0  Rohit  34.0    Swimming    155
1  Ritik  25.0     Cricket    179
2  Salim  26.0       Music    187
3   Rani  29.0    Sleeping    154
4   Sonu  17.0     Singing    184
5  Madhu  20.0  Travelling    165
6   Devi  22.0         Art    141
Name       object
Age       float64
Hobby      object
Height     object
dtype: object
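Note that converting to 'object' changes the column's dtype but does not turn the individual values into text. If real strings are needed, passing str to astype() is one option. The sketch below uses a small hypothetical Series named heights (not the dataframe above) to compare the two.

```python
import pandas as pd

heights = pd.Series([155, 179, 187], name='Height')

as_object = heights.astype('object')  # generic object column, values stay numeric
as_text = heights.astype(str)         # values converted to text

first_obj = as_object.iloc[0]
first_txt = as_text.iloc[0]

print(type(first_obj), type(first_txt))
# String methods work once the values are real strings
print(as_text.str.len().tolist())
```

How the resulting string column's dtype is displayed may differ between pandas versions, so the example inspects the values rather than the dtype.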

Change Data Type of Multiple Columns in Dataframe :

To change the data type of multiple columns in a Dataframe we will use DataFrame.astype(), which can be applied to the whole dataframe or to selected columns.

Syntax:- DataFrame.astype(self, dtype, copy=True, errors='raise', **kwargs)

Arguments:

  • dtype : The Python or NumPy type to which the whole dataframe will be converted, or a dictionary of column names and data types, in which case each given column is converted to the corresponding type.
  • errors : The way of handling errors, which can be ‘raise’ or ‘ignore’; the default value is ‘raise’. (raise: raise an exception in case of invalid parsing, ignore: return the input unchanged in case of invalid parsing)
  • copy : bool (default value is True) (If True, return a copy; if False, a copy is avoided where possible, so changes to the returned object may propagate to the original)

Returns: A new DataFrame object with the updated types.

Change Data Type of two Columns at same time :

Let’s try to convert columns ‘Age’ & ‘Height’ of int64 data type to float64 & string respectively. We will pass a dictionary to DataFrame.astype() which contains column names as keys and new data types as values.

import pandas as sc
# List of Tuples
students = [('Rohit', 34, 'Swimming', 155) ,
        ('Ritik', 25, 'Cricket' , 179) ,
        ('Salim', 26, 'Music', 187) ,
        ('Rani', 29,'Sleeping' , 154) ,
        ('Sonu', 17, 'Singing' , 184) ,
        ('Madhu', 20, 'Travelling', 165 ),
        ('Devi', 22, 'Art', 141)
        ]
# Create a DataFrame object with different datatype of column
studObj = sc.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Convert the data type of column Age to float64 & column Height to object
studObj = studObj.astype({'Age': 'float64', 'Height': 'object'})
print(studObj)
print(studObj.dtypes)
Output :
    Name   Age       Hobby Height
0  Rohit  34.0    Swimming    155
1  Ritik  25.0     Cricket    179
2  Salim  26.0       Music    187
3   Rani  29.0    Sleeping    154
4   Sonu  17.0     Singing    184
5  Madhu  20.0  Travelling    165
6   Devi  22.0         Art    141
Name       object
Age       float64
Hobby      object
Height     object
dtype: object

Handle errors while converting Data Types of Columns :

When using astype() to convert one or more columns, the target type must be valid and the contents must be convertible to it; otherwise an error is raised.

import pandas as sc
# List of Tuples
students = [('Rohit', 34, 'Swimming', 155) ,
        ('Ritik', 25, 'Cricket' , 179) ,
        ('Salim', 26, 'Music', 187) ,
        ('Rani', 29,'Sleeping' , 154) ,
        ('Sonu', 17, 'Singing' , 184) ,
        ('Madhu', 20, 'Travelling', 165 ),
        ('Devi', 22, 'Art', 141)
        ]
# Create a DataFrame object with different datatype of column
studObj = sc.DataFrame(students, columns=['Name', 'Age', 'Hobby', 'Height'])
# Trying to change datatype of a column to an unknown datatype
try:
    studObj['Name'] = studObj['Name'].astype('xyz')
except TypeError as ex:
    print(ex)

Output :
data type "xyz" not understood
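A different kind of failure happens when the target type is valid but some values cannot be converted. The following sketch, using a made-up Series named raw, shows astype() raising ValueError in that case and pandas.to_numeric() with errors='coerce' as one possible alternative that replaces unparseable entries with NaN.

```python
import pandas as pd

raw = pd.Series(['34', '25', 'unknown'])

# astype() fails because 'unknown' cannot be parsed as an integer
try:
    raw.astype('int64')
    failed = False
except ValueError:
    failed = True

# to_numeric() with errors='coerce' turns unparseable entries into NaN
cleaned = pd.to_numeric(raw, errors='coerce')

print(failed)
print(cleaned.tolist())
```

Because NaN is a floating point value, the coerced result is a float column rather than an integer one.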


Pandas : Convert a DataFrame into a list of rows or columns in python | (list of lists)

Converting a DataFrame into a list of rows or columns in python | (list of lists)

In this article, we will discuss how we can convert a dataframe into a list, by converting each row or column into a list and creating a Python list of lists from them.

Let’s first create a dataframe,

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])
print(studentId)
Output :
     Name  Age       City  Score
0    Arun   23    Chennai  127.0
1   Priya   31      Delhi  174.5
2   Ritik   24     Mumbai  181.0
3   Kimun   37  Hyderabad  125.0
4  Sinvee   16      Delhi  175.5
5    Kunu   28     Mumbai  115.0
6    Lisa   31       Pune  191.0

Convert a Dataframe into a list of lists – Rows Wise :

From the dataframe created above, we want to fetch each row as a list and create a list of these lists.

Let’s see how we can do this

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Convert the dataframe to a list of rows (list of lists)
listOfRows = studentId.to_numpy().tolist()
print(listOfRows)
print(type(listOfRows))
Output :
[['Arun', 23, 'Chennai', 127.0], ['Priya', 31, 'Delhi', 174.5], ['Ritik', 24, 'Mumbai', 181.0], ['Kimun', 37, 'Hyderabad', 125.0], ['Sinvee', 16, 'Delhi', 175.5], ['Kunu', 28, 'Mumbai', 115.0], ['Lisa', 31, 'Pune', 191.0]]
<class 'list'>

It converted the dataframe into a list of lists, where each nested list contains one row of the dataframe. But how did that happen in a single line?

How did it work?

Let’s break the single line above into several steps to understand the concept behind it.

Step 1: Convert the Dataframe to a nested Numpy array using DataFrame.to_numpy() :

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])


# By getting rows of a dataframe as a nested numpy array
numpy_2d_array = studentId.to_numpy()
print(numpy_2d_array)
print(type(numpy_2d_array))
Output :
[['Arun' 23 'Chennai' 127.0]
 ['Priya' 31 'Delhi' 174.5]
 ['Ritik' 24 'Mumbai' 181.0]
 ['Kimun' 37 'Hyderabad' 125.0]
 ['Sinvee' 16 'Delhi' 175.5]
 ['Kunu' 28 'Mumbai' 115.0]
 ['Lisa' 31 'Pune' 191.0]]
<class 'numpy.ndarray'>

DataFrame.to_numpy() converts the dataframe into a Numpy array, so we have a 2D Numpy array here. We confirmed that by printing the type of the returned object.

Step 2: Convert 2D Numpy array into a list of lists :

Numpy provides a function tolist(), which converts a Numpy array into a list. Let’s call that function on the 2D Numpy array created above,

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])


# By getting rows of a dataframe as a nested numpy array
numpy_2d_array = studentId.to_numpy()

# By Converting 2D numpy array to the list of lists
listOfRows = numpy_2d_array.tolist()
print(listOfRows)
print(type(listOfRows))
Output :
[['Arun', 23, 'Chennai', 127.0], ['Priya', 31, 'Delhi', 174.5], ['Ritik', 24, 'Mumbai', 181.0], ['Kimun', 37, 'Hyderabad', 125.0], ['Sinvee', 16, 'Delhi', 175.5], ['Kunu', 28, 'Mumbai', 115.0], ['Lisa', 31, 'Pune', 191.0]]
<class 'list'>

It converted the 2D Numpy array into a list of lists.

So, this is how we changed the dataframe to a 2D Numpy array and then to a list of lists, where each nested list represents a dataframe row.
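If a list of tuples is acceptable instead of a list of lists, DataFrame.itertuples() offers another route that skips the intermediate Numpy array. A minimal sketch on a shortened two-row version of the student data:

```python
import pandas as pd

students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5)]
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Each row becomes a plain tuple; index=False drops the row label and
# name=None asks for regular tuples instead of namedtuples
rowsAsTuples = list(studentId.itertuples(index=False, name=None))
print(rowsAsTuples)
```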

Convert a Dataframe into a list of lists – Column Wise :

Now turn each column into a list and create a list of these lists,

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])


# Convert a dataframe to the list of columns i.e. list of lists
listOfColumns = studentId.transpose().to_numpy().tolist()
print(listOfColumns)
print(type(listOfColumns))
Output :
[['Arun', 'Priya', 'Ritik', 'Kimun', 'Sinvee', 'Kunu', 'Lisa'], [23, 31, 24, 37, 16, 28, 31], ['Chennai', 'Delhi', 'Mumbai', 'Hyderabad', 'Delhi', 'Mumbai', 'Pune'], [127.0, 174.5, 181.0, 125.0, 175.5, 115.0, 191.0]]

<class 'list'>

How did it work?

It works on the same concept we discussed above, just one more step here i.e.

Step 1: Transpose the dataframe to convert rows as columns and columns as rows :

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])



# Transposing the dataframe, rows are now columns and columns are now rows
transposedObj = studentId.transpose()
print(transposedObj)

Output :
             0      1       2          3       4       5      6
Name      Arun  Priya   Ritik      Kimun  Sinvee    Kunu   Lisa
Age         23     31      24         37      16      28     31
City   Chennai  Delhi  Mumbai  Hyderabad   Delhi  Mumbai   Pune
Score    127.0  174.5   181.0      125.0   175.5   115.0  191.0

transposedObj is the transpose of the original dataframe, i.e. rows in studentId are columns in transposedObj and columns in studentId are rows in transposedObj.

Step 2: Convert the Dataframe to a nested Numpy array using DataFrame.to_numpy() :

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])


# Transposing the dataframe, rows are now columns and columns are now rows
transposedObj = studentId.transpose()


# By getting rows of a dataframe as a nested numpy array
numpy_2d_array = transposedObj.to_numpy()
print(numpy_2d_array)
print(type(numpy_2d_array))
Output :
[['Arun' 'Priya' 'Ritik' 'Kimun' 'Sinvee' 'Kunu' 'Lisa']
 [23 31 24 37 16 28 31]
 ['Chennai' 'Delhi' 'Mumbai' 'Hyderabad' 'Delhi' 'Mumbai' 'Pune']
 [127.0 174.5 181.0 125.0 175.5 115.0 191.0]]
<class 'numpy.ndarray'>

Step 3: Convert 2D Numpy array into a list of lists. :

import pandas as pd
#The List of Tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)
            ]
# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])


# Transposing the dataframe, rows are now columns and columns are now rows
transposedObj = studentId.transpose()


# By getting rows of a dataframe as a nested numpy array
numpy_2d_array = transposedObj.to_numpy()

# Convert the 2D numpy array to a list of lists
listOfColumns = numpy_2d_array.tolist()
print(listOfColumns)
print(type(listOfColumns))
Output :
[['Arun', 'Priya', 'Ritik', 'Kimun', 'Sinvee', 'Kunu', 'Lisa'], [23, 31, 24, 37, 16, 28, 31], ['Chennai', 'Delhi', 'Mumbai', 'Hyderabad', 'Delhi', 'Mumbai', 'Pune'], [127.0, 174.5, 181.0, 125.0, 175.5, 115.0, 191.0]]
<class 'list'>
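The same column-wise result can also be built without transposing, by converting each column to a list on its own. This is a sketch of an alternative approach on a shortened two-row version of the student data, not the method used above.

```python
import pandas as pd

students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5)]
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Build one list per column, in the dataframe's column order
listOfColumns = [studentId[col].tolist() for col in studentId.columns]
print(listOfColumns)
```

Working column by column keeps each column's own data type during the conversion, whereas transposing a mixed-type dataframe first turns everything into generic objects.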


numpy.count_nonzero() – Python

Using numpy.count_nonzero() Function

In this article we will discuss how to count values based on conditions in 1D or 2D Numpy arrays using the numpy.count_nonzero() function in Python. So let’s explore the topic.

numpy.count_nonzero() :

The Numpy module in Python provides the function numpy.count_nonzero() to count the non-zero values in an array,

Syntax- numpy.count_nonzero(arr, axis=None, keepdims=False)

Where,

  • arr : The array-like object in which we want to count the non-zero values.
  • axis : The axis along which we want to count the values. If the value is 1 it counts non-zero values in each row, if the value is 0 it counts non-zero values in each column, and if the value is None it counts non-zero values over the flattened array.
  • keepdims : A bool value; if True, the axes that are counted are left in the result as dimensions with size one.

It returns an int or an array of ints containing the count of non-zero values in the numpy array; if axis is provided, it returns an array of counts along that axis.
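The effect of keepdims is easiest to see from the shape of the result. The sketch below assumes a NumPy version whose count_nonzero() accepts keepdims, as in the syntax above, and shows how the retained dimension lets the counts broadcast against the original array.

```python
import numpy as np

arr = np.array([[20, 30, 0],
                [50, 0, 0],
                [50, 0, 50]])

row_counts = np.count_nonzero(arr, axis=1)                    # shape (3,)
row_counts_2d = np.count_nonzero(arr, axis=1, keepdims=True)  # shape (3, 1)

print(row_counts.shape, row_counts_2d.shape)

# The (3, 1) result broadcasts across the columns of arr, so each row
# can be divided by its own count of non-zero values
scaled = arr / row_counts_2d
print(scaled)
```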

Counting non zero values in a Numpy Array :

Suppose we have a numpy array with some zeros and non zero values. Now we will count the non zero values using numpy.count_nonzero() function.

So let’s see the program to understand how it actually works.

# Program :

import numpy as np
# numpy array from list created
arr = np.array([2, 3, 0, 5, 0, 0, 5, 0, 5])
# Counting non zero elements in numpy array
count = np.count_nonzero(arr)
print('Total count of non-zero values in NumPy Array: ', count)
Output :
Total count of non-zero values in NumPy Array:  5

Counting True values in a numpy Array :

As we know, in Python True is equivalent to 1 and False is equivalent to 0, so we can use the numpy.count_nonzero() function to count the True values in a bool numpy array.

# Program :

import numpy as np
# Numpy Array of bool values created
arr = np.array([False, True, True, True, False, False, False, True, True])
# Counting True elements in numpy array
count = np.count_nonzero(arr)
print('Total count of True values in NumPy Array: ', count)
Output :
Total count of True values in NumPy Array:  5

Counting Values in Numpy Array that satisfy a condition :

Counting non-zero values was simple: in the previous example we passed the complete numpy array. Here we will pass a condition instead.

So let’s see the example to understand it clearly.

# Program :

import numpy as np
# A Numpy array of numbers is created
arr = np.array([2, 3, 1, 5, 4, 2, 5, 6, 5])
# Count the even elements in array
count = np.count_nonzero(arr % 2 == 0)
print('Total count of Even Numbers in Numpy Array: ', count)
Output :
Total count of Even Numbers in Numpy Array:  4

In the above example, applying the condition produces a bool array in which each element that satisfies the condition is True and each element that does not is False. count_nonzero() then counts the True values.
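Several conditions can be combined before counting. As a short sketch on the same array, the element-wise operators & (and) and | (or) join two bool arrays into one, which is then passed to count_nonzero().

```python
import numpy as np

arr = np.array([2, 3, 1, 5, 4, 2, 5, 6, 5])

# Combine conditions element-wise with & (and) or | (or);
# the parentheses are needed because & and | bind tighter than comparisons
odd_and_gt2 = np.count_nonzero((arr > 2) & (arr % 2 == 1))
even_or_gt4 = np.count_nonzero((arr % 2 == 0) | (arr > 4))

print(odd_and_gt2)  # 4
print(even_or_gt4)  # 7
```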

Counting Non-Zero Values in 2D Numpy Array :

By using the same numpy.count_nonzero() function we can count the non-zero values in a 2D array where the default axis value is None.

So let’s see the example to understand it clearly.

# Program :

import numpy as np
# 2D Numpy Array created 
arr_2d = np.array( [[20, 30, 0],
                    [50, 0, 0],
                    [50, 0, 50]])
# counting of non zero values in complete 2D array
count = np.count_nonzero(arr_2d)
print('Total count of non zero values in complete 2D array: ', count)
Output :
Total count of non zero values in complete 2D array:  5

Counting Non-Zero Values in each row of 2D Numpy Array :

To count the non-zero values in each row of 2D numpy array just pass value of axis as 1.

So let’s see the example to understand it clearly.

# Program :

import numpy as np
# Create 2D Numpy Array
arr = np.array( [[20, 30, 0],
                    [50, 0, 0],
                    [50, 0, 50]])
# Get count of non zero values in each row of 2D array
count = np.count_nonzero(arr, axis=1)
print('Total count of non zero values in each row of 2D array: ', count)
Output :
Total count of non zero values in each row of 2D array:  [2 1 2]

Counting Non-Zero Values in each column of 2D Numpy Array :

To count the non-zero values in each column of a 2D numpy array just pass value of axis as 0.

So let’s see the example to understand it clearly.

# Program :

import numpy as np
# 2D Numpy Array created
arr = np.array( [[20, 30, 0],
                    [50, 0, 0],
                    [50, 0, 50]])
# counting of non zero values in each column of 2D array
count = np.count_nonzero(arr, axis=0)
print('Total count of non zero values in each column of 2D array: ', count)
Output :
Total count of non zero values in each column of 2D array:  [3 1 1]

 

 

 

Get first key-value pair from a Python Dictionary

Finding first key-value pair from a Dictionary in Python

In this article, we will see different methods by which we can fetch the first key-value pair of a dictionary. We will also discuss how to get the first N pairs and any range of key-value pairs from a dictionary.

Getting first key-value pair of dictionary in python using iter() & next() :

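In Python, the iter() function creates an iterator object over the iterable sequence of key-value pairs from the dictionary, and by calling the next() function on it we get the first key-value pair. Since Python 3.7 dictionaries preserve insertion order, so the first pair is the one that was inserted first.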

# Program :

dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'Kohil'  : 18,
    'Murali' : 800,
    'Dhoni'  : 7,
    'AB'     : 17
}
# Get the first key-value pair in dictionary
first_pair = next(iter(dict_eg.items()))
print('The first Key Value Pair in the Dictionary is:')
print(first_pair)
print('First Key: ', first_pair[0])
print('First Value: ', first_pair[1])
Output :
The first Key Value Pair in the Dictionary is:
('Sachin', 10)
First Key:  Sachin
First Value:  10
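On an empty dictionary, next() raises StopIteration because there is no first pair. One way around this, sketched below with made-up dictionaries, is to pass a default value as the second argument to next().

```python
empty_dict = {}
scores = {'Sachin': 10, 'Gayle': 333}

# The second argument to next() is returned when the iterator is exhausted,
# which avoids a StopIteration error on an empty dictionary
first_of_empty = next(iter(empty_dict.items()), None)
first_of_scores = next(iter(scores.items()), None)

print(first_of_empty)   # None
print(first_of_scores)  # ('Sachin', 10)
```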

Get first key-value pair of dictionary using list :

In Python, the items() function of a dictionary returns an iterable sequence of all key-value pairs. By creating a list from all key-value pairs in the dictionary and selecting the first item we get the first key-value pair of the dictionary.

# Program :

dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'Kohil'  : 18,
    'Murali' : 800,
    'Dhoni'  : 7,
    'AB'     : 17
}
# Get the first key-value pair in dictionary
# (stored under a new name so the dictionary itself is not overwritten)
first_pair = list(dict_eg.items())[0]
print('First Key Value Pair of Dictionary:')
print(first_pair)
print('Key: ', first_pair[0])
print('Value: ', first_pair[1])
Output :
First Key Value Pair of Dictionary:
('Sachin', 10)
Key:  Sachin
Value:  10

Getting the first N key-value pair of dictionary in python using list & slicing :

Following a similar process, we create a list of key-value pairs from the dictionary. We can get the first ‘N’ items with list[:N], or any range of items with list[start:end].

# Program :

dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'AB'     : 17,
    'Murali' : 800,
    'Dhoni'  : 7,
    'Kohil'  : 18
}
n = 5
# Get first 5 pairs of key-value pairs
firstN_pairs = list(dict_eg.items())[:n]
print('The first 5 Key Value Pairs of Dictionary are:')
for key,value in firstN_pairs:
    print(key, '::', value)
Output :
The first 5 Key Value Pairs of Dictionary are:
Sachin :: 10
Gayle :: 333
AB :: 17
Murali :: 800
Dhoni :: 7

Getting the first N key-value pair of dictionary in python using itertools :

We can slice the first ‘N’ entries with itertools.islice(iterable, stop), passing it the sequence of key-value pairs returned by the items() function.

# Program :

import itertools
dict_eg = {
    'Sachin' : 10,
    "Gayle"  : 333,
    'AB'     : 17,
    'Murali' : 800,
    'Dhoni'  : 7,
    'Kohil'  : 18
}
n = 5
# Get first 5 pairs of key-value pairs
firstN_pairs = itertools.islice(dict_eg.items(), n)
print('The first 5 Key Value Pairs of Dictionary are:')
for key,value in firstN_pairs:
    print(key, '::', value)
Output :
The first 5 Key Value Pairs of Dictionary are:
Sachin :: 10
Gayle :: 333
AB :: 17
Murali :: 800
Dhoni :: 7
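Because islice() returns a lazy iterator rather than a list, it can be passed straight to dict() to build a smaller dictionary. The following is a small sketch with a shortened, illustrative dictionary; it also shows the three-argument form islice(iterable, start, stop).

```python
import itertools

dict_eg = {'Sachin': 10, 'Gayle': 333, 'AB': 17, 'Murali': 800}

# islice returns a lazy iterator; dict() consumes it to build a new dictionary
first_two = dict(itertools.islice(dict_eg.items(), 2))
print(first_two)

# With three arguments, islice(iterable, start, stop) skips the first 'start' pairs
middle = list(itertools.islice(dict_eg.items(), 1, 3))
print(middle)
```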

Python: Remove characters from string by regex and 4 other ways

Removing characters from string by regex and many other ways in Python.

In this article we will learn how to delete characters from a string in Python using regex and various other functions.

Removing characters from string by regex :

The sub() function of Python's re module returns a new string in which every match of a pattern is replaced by a replacement string. If the pattern is not found, the original string is returned unchanged.

Removing all occurrences of a character from string using regex :

Suppose we want to delete all occurrences of ‘a’ from a string. We can do this by passing the pattern to sub(), which replaces every ‘a’ with an empty string.

#Program :

import re
origin_string = "India is my country"

pattern = r'a'

# If found, all 'a' will be replaced by empty string
modif_string = re.sub(pattern, '', origin_string )
print(modif_string)
Output :
Indi is my country

Removing multiple characters from string using regex in python :

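Suppose we want to delete all occurrences of ‘a’ and ‘i’. We place both characters in a character class and replace every match with an empty string. The match is case-sensitive, so the capital ‘I’ in "India" is kept.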

#Program :

import re
origin_string = "India is my country"
pattern = r'[ai]'
# If found, all 'a' & 'i' will be replaced by empty string
modif_string = re.sub(pattern, '', origin_string )
print(modif_string) 
Output :
Ind s my country

Removing characters in list from the string in python :

Suppose we want to delete all occurrences of ‘a’ & ‘i’ from a string, and these characters are given in a list. We can join the list into a character class and use it as the pattern.

#Program :

import re
origin_string = "India is my country"
char_list = ['i','a']
pattern = '[' + ''.join(char_list) + ']'
# All characters matched by the pattern are removed
modif_string = re.sub(pattern, '', origin_string)
print(modif_string)
Output :
Ind s my country
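The pattern above is built by joining the characters as they are, which works for plain letters. If the list may contain characters that have a special meaning inside square brackets (such as ‘-’ or ‘^’), one possible safeguard is to escape each character with re.escape(). The input string and character list below are made up for illustration.

```python
import re

origin_string = "a.b-c^d"
# Hypothetical list containing characters that are special inside [...]
char_list = ['-', '^', '.']

# re.escape() backslash-escapes each character so it is matched literally
pattern = '[' + ''.join(re.escape(ch) for ch in char_list) + ']'
modif_string = re.sub(pattern, '', origin_string)
print(modif_string)
```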

Removing characters from string using translate() :

Python's str class provides the function translate(table), which replaces the characters in the string according to the mapping given in the translation table.

Removing all occurrence of a character from the string using translate() :

Suppose we want to delete all occurrences of ‘i’ in a string. For this we pass a translation table to the translate() function in which the Unicode code point of ‘i’ is mapped to None.

#Program :

origin_string = "India is my country"
# If found, remove all occurrences of 'i' from string
modif_string = origin_string.translate({ord('i'): None})
print(modif_string)
Output :
Inda s my country

Removing multiple characters from the string using translate() :

Let, we want to delete ‘i’, ‘y’ from the above string.

#Program :

org_string= "India is my country"
list_char = ['y', 'i']
# Remove all 'y' & 'i' from the string
mod_string = org_string.translate( {ord(elem): None for elem in list_char} )
print(mod_string)
Output :
Inda s m countr
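The translation table can also be built with str.maketrans(), whose third argument lists the characters to delete. This is an alternative sketch of the same removal, not a change to the program above.

```python
org_string = "India is my country"

# The third argument of str.maketrans lists characters to be mapped to None
table = str.maketrans('', '', 'yi')
mod_string = org_string.translate(table)
print(mod_string)
```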

Removing characters from string using replace()  :

Python's str class provides the replace() function, which returns a copy of the string with all occurrences of a substring replaced by a replacement. In the above string we will replace all ‘i’ with ‘a’; passing an empty string as the replacement removes the character instead.

#Program :

origin_string = "India is my country"
# Replacing all occurrences of 'i' with 'a'
modif_string = origin_string.replace('i', 'a')
print(modif_string)
Output :
Indaa as my country
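Since the program above substitutes one character for another, here is a short sketch of actual removal with replace(), using an empty string as the replacement. The optional third argument (a count) is also shown.

```python
origin_string = "India is my country"

# Replacing 'i' with an empty string removes every lowercase 'i'
removed_all = origin_string.replace('i', '')
print(removed_all)

# The optional third argument limits how many occurrences are replaced
removed_first = origin_string.replace('i', '', 1)
print(removed_first)
```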

Removing characters from string using join() and generator expression :

Suppose we have a list with some characters. To remove from the string all characters that are in the list, we iterate over each character in the string and join only those which are not in the list.

#Program :

origin_string= "India is my country"
list_char = ['i', 'a', 's']
# Removes all occurrences of 'i', 'a', 's' from the string
modif_string = ''.join((elem for elem in origin_string if elem not in list_char))
print(modif_string)
Output :
Ind  my country

Removing characters from string using join and filter() :

The filter() function filters the characters of a string based on the logic provided in a callback function. Here the callback is a lambda function that checks whether a character is absent from the list, so only such characters pass the filter. join() then combines the remaining characters to form a new string, i.e. all occurrences of the characters in the list are eliminated from the string.

#Program :

origin_string = "India is my country"
list_char = ['i', 'a', 's']
# To remove all characters in list, from the string
modif_string = ''.join(filter(lambda k: k not in list_char, origin_string))
print(modif_string)
Output :
Ind  my country

Pandas: Sort rows or columns in Dataframe based on values using Dataframe.sort_values()

How to sort rows or columns in Dataframe based on values using Dataframe.sort_values() in Python ?

In this article we will learn how to sort rows in ascending and descending order based on the values in columns, and how to sort columns based on the values in rows, using the member function DataFrame.sort_values().

DataFrame.sort_values() :

The Pandas library contains a DataFrame class, which provides the function sort_values() to sort the contents of a dataframe.

Syntax : DataFrame.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

where,

  • by:  Column names or index labels on which sorting is to be done.
  • axis:  If axis=0, the names in the by argument are taken as column names and the rows are sorted. If axis=1, the names in the by argument are taken as row index labels and the columns are sorted.
  • ascending:  If True, sort in ascending order; otherwise sort in descending order.
  • inplace: If True, sort the current dataframe in place; otherwise return a sorted copy instead of modifying the dataframe.
  • kind: The sorting algorithm to use; the default is 'quicksort'.
  • na_position: Either 'first' or 'last'; decides whether NaN values are placed at the beginning or at the end. The default is 'last'.
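The na_position argument is not used in the programs of this article, so here is a minimal sketch of its effect. The small dataframe with a missing ‘Runs’ value is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in 'Runs'
df = pd.DataFrame({'Name': ['Curran', 'Ishan', 'Vijay'],
                   'Runs': [325, np.nan, 106]})

# By default NaN rows go last; na_position='first' moves them to the top
nan_last = df.sort_values(by='Runs')
nan_first = df.sort_values(by='Runs', na_position='first')

print(nan_last)
print(nan_first)
```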

Sorting Dataframe rows based on a single column :

To sort rows in dataframe based on ‘Team’ column, we will pass ‘Team’ as argument.

#Program :

import pandas as sc

# List of Tuples as Players
players = [ ('Curran', 325, 'CSK') ,
             ('Ishan', 481, 'MI' ) ,
             ('Vijay', 106, 'SRH') ,
             ('Pant', 224, 'Delhi' ) ,
             ('Seifert', 65, 'KKR' ) ,
             ('Hooda', 440, 'PK' )
              ]

# To Create DataFrame object
datafObjs = sc.DataFrame(players, columns=['Name', 'Runs', 'Team'], index=['i', 'ii', 'iii', 'iv', 'v', 'vi'])

# Sorting the rows of dataframe by column 'Team'
datafObjs = datafObjs.sort_values(by ='Team' )

print("Now the sorted Dataframe based on column 'Team' is : ")
print(datafObjs)
Output :
 Now the sorted Dataframe based on column 'Team' is :
        Name  Runs   Team
i     Curran   325    CSK
iv      Pant   224  Delhi
v    Seifert    65    KKR
ii     Ishan   481     MI
vi     Hooda   440     PK
iii    Vijay   106    SRH

Sorting Dataframe rows based on multiple column :

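To sort all rows of the dataframe based on the ‘Runs’ and ‘Team’ columns, we pass both column names as a list. Rows are ordered by the first column, and the second column is used only to break ties.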

#Program :

import pandas as sc

# List of Tuples as Players
players = [ ('Curran', 325, 'CSK') ,
             ('Ishan', 481, 'MI' ) ,
             ('Vijay', 106, 'SRH') ,
             ('Pant', 224, 'Delhi' ) ,
             ('Seifert', 65, 'KKR' ) ,
             ('Hooda', 440, 'PK' )
              ]

# To Create DataFrame object
datafObjs = sc.DataFrame(players, columns=['Name', 'Runs', 'Team'], index=['i', 'ii', 'iii', 'iv', 'v', 'vi'])

# Sorting the rows of dataframe by columns 'Runs' & 'Team'
datafObjs = datafObjs.sort_values(by =['Runs', 'Team'])

print("Now the sorted Dataframe on the basis of columns 'Runs' & 'Team' : ")
print(datafObjs)
Output :
Now the sorted Dataframe on the basis of columns 'Runs' & 'Team' :
        Name  Runs   Team
v    Seifert    65    KKR
iii    Vijay   106    SRH
iv      Pant   224  Delhi
i     Curran   325    CSK
vi     Hooda   440     PK
ii     Ishan   481     MI
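In the data above every ‘Runs’ value is unique, so the second column never comes into play. The sketch below uses a hypothetical dataframe with repeated teams to show the tie-breaking, and passes a list to ascending so each column gets its own sort direction.

```python
import pandas as pd

# Hypothetical data with a tie in 'Team' so the second key matters
df = pd.DataFrame({'Name': ['Curran', 'Ishan', 'Vijay', 'Pant'],
                   'Runs': [325, 481, 106, 224],
                   'Team': ['CSK', 'MI', 'CSK', 'MI']})

# Sort by 'Team' ascending, then by 'Runs' descending within each team
sorted_df = df.sort_values(by=['Team', 'Runs'], ascending=[True, False])
print(sorted_df)
```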

Sorting Dataframe rows based on columns in Descending Order :

Here we will sort the dataframe in descending order by passing ascending=False.

#Program :

import pandas as sc

# List of Tuples as Players
players = [ ('Curran', 325, 'CSK') ,
             ('Ishan', 481, 'MI' ) ,
             ('Vijay', 106, 'SRH') ,
             ('Pant', 224, 'Delhi' ) ,
             ('Seifert', 65, 'KKR' ) ,
             ('Hooda', 440, 'PK' )
              ]

# To Create DataFrame object
datafObjs = sc.DataFrame(players, columns=['Name', 'Runs', 'Team'], index=['i', 'ii', 'iii', 'iv', 'v', 'vi'])

# Sorting rows of dataframe by column 'Name' in descending manner
datafObjs = datafObjs.sort_values(by ='Name' , ascending=False)

print("Now the sorted Dataframe on basis of column 'Name' in Descending manner : ")
print(datafObjs)
Output :
Now the sorted Dataframe on basis of column 'Name' in Descending manner :
        Name  Runs   Team
iii    Vijay   106    SRH
v    Seifert    65    KKR
iv      Pant   224  Delhi
ii     Ishan   481     MI
vi     Hooda   440     PK
i     Curran   325    CSK

Sorting Dataframe rows based on a column in Place :

Here we will sort the dataframe in place, based on a single column, by passing inplace=True.

#Program :

import pandas as sc

# List of Tuples as Players
players = [ ('Curran', 325, 'CSK') ,
             ('Ishan', 481, 'MI' ) ,
             ('Vijay', 106, 'SRH') ,
             ('Pant', 224, 'Delhi' ) ,
             ('Seifert', 65, 'KKR' ) ,
             ('Hooda', 440, 'PK' )
              ]

# To Create DataFrame object
datafObjs = sc.DataFrame(players, columns=['Name', 'Runs', 'Team'], index=['i', 'ii', 'iii', 'iv', 'v', 'vi'])

# Sorting the rows of the dataframe by column 'Team' in place
datafObjs.sort_values(by='Team' , inplace=True)

print("Now the sorted Dataframe based on a single column 'Team' inplace : ")
print(datafObjs)
Output :
Now the sorted Dataframe based on a single column 'Team' inplace :
        Name  Runs   Team
i     Curran   325    CSK
iv      Pant   224  Delhi
v    Seifert    65    KKR
ii     Ishan   481     MI
vi     Hooda   440     PK
iii    Vijay   106    SRH
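A common slip with inplace=True is to assign the result back to the variable. The call returns None in that case, as this small sketch with an illustrative two-row dataframe shows.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Curran', 'Ishan'], 'Team': ['MI', 'CSK']})

# With inplace=True the dataframe itself is reordered and the call returns None
result = df.sort_values(by='Team', inplace=True)

print(result)
print(df)
```

So with inplace=True the call should stand on its own line, without an assignment.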

Sorting columns of a Dataframe based on a single or multiple rows :

Let’s take an example where the columns of a dataframe are to be sorted on the basis of a single row or multiple rows. First we create the dataframe.

#program :

import pandas as sc

# List of Tuples as matrix
matrix = [(11, 2, 33),
          (4, 55, 6),
          (77, 8, 99),
          ]

# To create a DataFrame object like a Matrix
datafObjs = sc.DataFrame(matrix, index=list('abc'))

print("Original Dataframe: ")
print(datafObjs)
Output :
Original Dataframe:
    0   1   2
a  11   2  33
b   4  55   6
c  77   8  99

Sorting columns of a Dataframe based on a single row :

Now let’s sort the columns of the above dataframe on the basis of row ‘b’, by passing its label along with axis=1.

#program :

import pandas as sc

# List of Tuples as matrix
matrix = [(11, 2, 33),
          (4, 55, 6),
          (77, 8, 99),
          ]

# To create a DataFrame object like a Matrix
datafObjs = sc.DataFrame(matrix, index=list('abc'))
datafObjs = datafObjs.sort_values(by ='b', axis=1)

print("Now the sorted Dataframe on the basis of single row index label 'b' : ")
print(datafObjs)
Output :
Now the sorted Dataframe on the basis of single row index label 'b' :
    0   2   1
a  11  33   2
b   4   6  55
c  77  99   8

Sorting columns of a Dataframe in Descending Order based on a single row :

Here we will sort the columns of the dataframe in descending order based on a single row, by passing ascending=False along with axis=1.

#Program :

import pandas as sc

# List of Tuples as matrix
matrix = [(11, 2, 33),
          (4, 55, 6),
          (77, 8, 99),
          ]

# To create a DataFrame object like a Matrix
datafObjs = sc.DataFrame(matrix, index=list('abc'))

# Sorting the columns of a dataframe in descending order based on a single row with index label 'b'
datafObjs = datafObjs.sort_values(by='b', axis=1, ascending=False)

print("Now the sorted Dataframe on the basis of single row index label 'b' in descending order :")
print(datafObjs)
Output :
Now the sorted Dataframe on the basis of single row index label 'b' in descending order :

    1   2   0
a   2  33  11
b  55   6   4
c   8  99  77

Sorting columns of a Dataframe based on a multiple rows :

Here the columns of the dataframe are sorted on the basis of multiple rows, by passing the labels ‘b’ & ‘c’ as a list along with axis=1.

#program :

import pandas as sc

# List of Tuples as matrix
matrix = [(11, 2, 33),
          (4, 55, 6),
          (77, 8, 99),
          ]

# To create a DataFrame object like a Matrix
datafObjs = sc.DataFrame(matrix, index=list('abc'))

# Sorting columns of the dataframe based on a multiple row with index labels 'b' & 'c'
datafObjs = datafObjs.sort_values(by =['b' , 'c' ], axis=1)

print("Now the sorted Dataframe based on multiple rows index label 'b' & 'c' :")
print(datafObjs)
Output :
Now the sorted Dataframe based on multiple rows index label 'b' & 'c' :

    0   2   1
a  11  33   2
b   4   6  55
c  77  99   8

Python Numpy : Select elements or indices by conditions from Numpy Array

How to select elements or indices by conditions from Numpy Array in Python ?

In this article we will see how we can select and print elements from a given Numpy array provided with multiple conditions.

Selecting elements from a Numpy array based on Single or Multiple Conditions :

When we apply a comparison operator to a numpy array, it is applied to each element of the array. The result is a bool array of the same shape, holding True where the element satisfies the condition and False otherwise.

#Program :

import numpy as sc

# To create a numpy array from 2 to 25 (exclusive) with interval 3
num_arr = sc.arange(2, 25, 3)

# To compare with all elements in array
bool_arr = num_arr < 15
print(bool_arr)

Output :
[ True  True  True  True  True False False False]

If we pass the resulting bool Numpy array to the [] operator, it returns a new Numpy array containing only the elements whose corresponding value in the bool array is True.

#Program :

import numpy as sc

num_arr = sc.arange(2, 25, 3)
bool_arr = num_arr < 15
print(bool_arr)

# Those elements will be selected where it is True at corresponding value in bool array
new_arr = num_arr[bool_arr]
print(new_arr)
Output :
[ True  True  True  True  True False False False]
[ 2  5  8 11 14]

Select elements from Numpy Array which are divisible by 5 :

We can select and print those elements which are divisible by 5 from given Numpy array.

#Program :

import numpy as sc

# Numpy array with elements from 3 to 24
num_arr = sc.arange(3, 25, 1)

# To select those numbers which are divisible by 5
new_arr = num_arr[num_arr%5==0]

print(new_arr)
Output :
[ 5 10 15 20]

Select elements from Numpy Array which are greater than 10 and less than 18 :

We can select and print those elements which are greater than 10 and less than 18 from the given Numpy array. Each condition is wrapped in parentheses and the two are combined with the & operator.

#Program :

import numpy as sc

# Numpy array with elements from 3 to 24
num_arr = sc.arange(3, 25, 1)

# To select those numbers which are greater than 10 and smaller than 18
new_arr = num_arr[(num_arr > 10) & (num_arr < 18)]

print(new_arr)
Output :
[11 12 13 14 15 16 17]
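The programs above return the matching elements. To get their indices instead, one option is numpy.where(), sketched below together with the | (OR) and ~ (NOT) operators for combining conditions. The variable names are illustrative.

```python
import numpy as np

num_arr = np.arange(3, 25, 1)

# np.where with a single condition returns a tuple of index arrays, one per dimension
indices = np.where((num_arr > 10) & (num_arr < 18))[0]
print(indices)

# | is element-wise OR and ~ is element-wise NOT
outside = num_arr[(num_arr <= 10) | (num_arr >= 18)]
not_div5 = num_arr[~(num_arr % 5 == 0)]
print(outside)
print(not_div5)
```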

Python : Create boolean Numpy array with all True or all False or random boolean values

How to create boolean Numpy array with all True or all False or random boolean values in Python ?

This article is about creating a boolean Numpy array with all True, all False or random boolean values. Here we will discuss various ways of creating a boolean Numpy array. So let’s start the topic.

Approach-1 : Creating 1-D boolean Numpy array with random boolean values

Python’s Numpy module provides the random.choice( ) function, which generates a random sample from a given array. If the given array holds only True and False, the result is a boolean Numpy array with random values.

Syntax : numpy.random.choice(a, size=None, replace=True, p=None)

where,

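  • a: A 1-D array or list from which the random values are drawn
  • size: Shape of the array that is to be generated
  • replace: Whether the sample is drawn with or without replacement
  • p: The probabilities associated with each entry in a; if not given, all entries are equally likely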

To create a 1-D Numpy array with random True and False values, we first take a bool list and pass it to numpy.random.choice() along with a size, which generates an array of that size with random True and False values.

So, let’s see how it actually works.

#Program :

import numpy as sc

# bool list for random sampling with only 2 values i.e. True and False
sample_Barr = [True, False]

# To create a numpy array with random true and false with size 5
bool_Narr = sc.random.choice(sample_Barr, size=5)

print('Numpy Array with random values: ')
print(bool_Narr)
Output :
Numpy Array with random values:
[False False False  True False]
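The output above changes on every run. As a sketch of a repeatable, weighted variant, the example below uses NumPy's Generator interface (numpy.random.default_rng) with the p argument; the seed value 42 and the 0.8/0.2 weights are arbitrary choices for illustration.

```python
import numpy as np

# Seeded generator so the run is repeatable; the seed value 42 is arbitrary
rng = np.random.default_rng(42)

# Draw 1000 booleans where True is chosen with probability 0.8
bool_arr = rng.choice([True, False], size=1000, p=[0.8, 0.2])

print(bool_arr.dtype)
print(bool_arr.mean())
```

The mean of a boolean array is the fraction of True values, so it should come out close to 0.8.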

Approach-2 : Creating 2-D boolean Numpy array with random values :

To create a 2-D array, we pass the shape of the 2D array as a tuple in the size argument of the random.choice( ) function.

So, let’s see how it actually works.

#Program :

import numpy as sc

# bool list for random sampling with only 2 values i.e. True and False
sample_Barr = [True, False]

# To create a 2D numpy array with random true and false values
bool_Narr = sc.random.choice(sample_Barr, size=(3,3))

print('2D Numpy Array with random values: ')
print(bool_Narr)
Output :
2D Numpy Array with random values:
[[False False False]
 [False  True  True]
 [ True False  True]]

Approach-3 : Create a 1-D Bool array with all True values :

We can use the numpy.ones( ) function to form a boolean Numpy array with all True values. numpy.ones( ) creates a numpy array initialized with 1s; passing the dtype argument as bool converts every 1 to True.

So, let’s see how it actually works.

#Program :

import numpy as sc

# To create a 1D numpy array with all true values
bool_Narr = sc.ones(5, dtype=bool)

print('1D Numpy Array with all true values: ')
print(bool_Narr)

Output :
1D Numpy Array with all true values:
[ True  True  True  True  True]

Approach-4 : Create a 1-D Bool array with all False values :

We can use the numpy.zeros( ) function to form a boolean Numpy array with all False values. numpy.zeros( ) creates a numpy array initialized with 0s; passing the dtype argument as bool converts every 0 to False.

So, let’s see how it actually works.

# Program :

import numpy as sc

# To create a 1D numpy array with all false values
bool_Narr = sc.zeros(5, dtype=bool)

print('1D Numpy Array with all false values: ')
print(bool_Narr)
Output :
1D Numpy Array with all false values:
[False False False False False]

Approach-5 :  Creating 2-D Numpy array with all True values :

Here also numpy.ones( ) with the dtype argument passed as bool can be used, this time with a shape tuple, to generate a 2D numpy array with all True values.

So, let’s see how it actually works.

# Program :

import numpy as sc

# To create a 2D numpy array with all true values
bool_Narr = sc.ones((3,4), dtype=bool)

print('2D Numpy Array with all true values: ')
print(bool_Narr)
Output :
2D Numpy Array with all true values:
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]

Approach-6 :  Creating 2D Numpy array with all False values :

Here also numpy.zeros( ) with the dtype argument passed as bool can be used, this time with a shape tuple, to generate a 2D numpy array with all False values.

So, let’s see how it actually works.

#Program :

import numpy as sc

# To create a 2D numpy array with all false values
bool_Narr = sc.zeros((3,4), dtype=bool)

print('2D Numpy Array with all false values: ')
print(bool_Narr)
Output :
2D Numpy Array with all false values:
[[False False False False]
 [False False False False]
 [False False False False]]
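Another way to get the same all-True or all-False arrays is numpy.full(), which fills an array of a given shape with one value. This is an alternative sketch, not one of the approaches listed above.

```python
import numpy as np

# np.full fills an array of the given shape with a single value;
# the dtype is inferred from the fill value, so True gives a bool array
all_true = np.full((2, 3), True)
all_false = np.full(4, False)

print(all_true)
print(all_false)
```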

Converting a List to bool Numpy array :

Approach-7 : Convert a list of integers to boolean numpy array :

Here we will pass the dtype argument as bool to the numpy.array( ) function, so that every element of the list is converted to a true/false value, i.e. every 0 is converted to False and every other integer to True.

So, let’s see how it actually works.

#Program :

import numpy as sc

# Integers list
list_ele = [8,55,0,24,100,0,0,-1]

# To convert a list of integers to boolean array
bool_Narr = sc.array(list_ele, dtype=bool)

print('Boolean Numpy Array: ')
print(bool_Narr)
Output :
Boolean Numpy Array:
[ True  True False  True  True False False  True]
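If the integers are already in a Numpy array rather than a list, the same conversion can be sketched with astype(bool) or with a comparison against zero. The variable names are illustrative.

```python
import numpy as np

int_arr = np.array([8, 55, 0, 24, 100, 0, 0, -1])

# astype(bool) converts an existing integer array; zero becomes False
as_bool = int_arr.astype(bool)

# Comparing with zero produces the same mask
non_zero = int_arr != 0

print(as_bool)
print(np.array_equal(as_bool, non_zero))
```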

Approach-8 : Convert a heterogeneous list to boolean numpy array :

In Python, a list can contain elements of different data types, i.e. it can be heterogeneous. Numpy arrays, however, are homogeneous, i.e. all elements share the same data type. So when dtype is passed as bool, all 0s are converted to False and the other values, whatever their data type, to True.

So, let’s see how it actually works.

#Program :

import numpy as sc

# Heterogeneous list
list_ele = [8,55,0,24.56,100,0,-85.6,"Satya"]

# To convert a heterogeneous list to boolean array
bool_Narr = sc.array(list_ele, dtype=bool)

print('Boolean Numpy Array: ')
print(bool_Narr)
Output :
Boolean Numpy Array:
[ True  True False  True  True False  True  True]

Python Pandas : Drop columns in DataFrame by label Names or by Index Positions

How to drop columns in DataFrame by label Names or by Index Positions in Python ?

In this article, we are going to demonstrate how to drop columns in a dataframe by their labels or index. So, let’s start exploring the topic in detail.

The DataFrame class has a function drop() which can be used to drop columns.

Syntax : DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Where,

  • labels : The label or list of labels we want to delete.
  • axis : Whether the labels refer to rows (0 or 'index') or to columns (1 or 'columns').
  • index : Row labels to delete; an alternative to passing labels with axis=0.
  • columns : Column labels to delete; an alternative to passing labels with axis=1.
  • inplace : If it is False then it doesn’t modify the dataframe, it returns a new one. If it is set to True it modifies the dataframe.
  • errors : If 'raise' (the default), a KeyError is raised when a label is not found; if 'ignore', missing labels are skipped.
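The programs in this article pass labels together with axis='columns'. As a brief sketch of the alternative named in the parameter list, the columns keyword gives the same result; the two-row dataframe is made up for illustration.

```python
import pandas as pd

# Hypothetical two-row dataframe for illustration
df = pd.DataFrame({'Regd': [10, 11], 'Name': ['Jill', 'Rachel'], 'Age': [16, 38]})

# These two calls are equivalent ways to drop the 'Age' column
by_axis = df.drop('Age', axis='columns')
by_keyword = df.drop(columns='Age')

print(by_axis.equals(by_keyword))
print(list(by_keyword.columns))
```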

We will be using the following dataset as example :

   Regd     Name   Age      City  Exp
a    10     Jill  16.0     Tokyo   10
b    11   Rachel  38.0     Texas    5
c    12    Kirti  39.0  New York    7
d    13    Veena  40.0     Texas   21
e    14  Lucifer   NaN     Texas   30
f    15    Pablo  30.0  New York    7
g    16   Lionel  45.0  Colombia   11

Delete a Single column in DataFrame by Column Name :

We can delete a single column just by passing its name into the function. Let’s try deleting ‘Age’ column from the Dataframe.

Let’s see how to implement this in a program.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe and storing it into a new object
modDfObj = dfObj.drop('Age' , axis='columns')
print(modDfObj)
Output :
   Regd     Name      City  Exp
a    10     Jill     Tokyo   10
b    11   Rachel     Texas    5
c    12    Kirti  New York    7
d    13    Veena     Texas   21
e    14  Lucifer     Texas   30
f    15    Pablo  New York    7
g    16   Lionel  Colombia   11

Drop Multiple Columns by Label Names in DataFrame :

To delete multiple columns by name, we pass all the names as a list to the function. Let’s try deleting ‘Age’ and ‘Exp’.

Let’s see how to implement this in a program.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe without the columns and storing it into a new object
modDfObj = dfObj.drop(['Age' , 'Exp'] , axis='columns')
print(modDfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia

Drop Columns by Index Position in DataFrame :

The drop() function works with labels, not positions. If we only know the index positions of the columns we want to drop, we can look up their labels through dfObj.columns and pass those to the function. Let’s try deleting the same two columns as above, this time by their index positions.

Let’s see how to implement this in a program.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe without the columns and storing it into a new object by passing the labels found at the index positions of the columns
modDfObj = dfObj.drop([dfObj.columns[2] , dfObj.columns[4]] ,  axis='columns')
print(modDfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia

Drop Columns in Place :

In case we don’t want a new dataframe object to hold the modified values, but want to modify the same object, we can do it by passing inplace=True. Let’s use the previous example for this.

Let’s see how to implement this in a program.

#Program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Modifying the dataframe without the columns and storing it into the same object 
dfObj.drop([dfObj.columns[2] , dfObj.columns[4]] ,  axis='columns',inplace = True)
print(dfObj)
Output :
   Regd     Name      City
a    10     Jill     Tokyo
b    11   Rachel     Texas
c    12    Kirti  New York
d    13    Veena     Texas
e    14  Lucifer     Texas
f    15    Pablo  New York
g    16   Lionel  Colombia

Drop Column If Exists :

When a column may not exist, we can check for it beforehand to avoid errors later in the program. Called with a missing label, drop( ) raises a KeyError, so we first test with an if-else condition whether the column is present in dfObj.columns and drop it only if it is found.

#program :

import numpy as np
import pandas as pd

# Example data
students = [(10,'Jill',    16,     'Tokyo',    10),
            (11,'Rachel',  38,     'Texas',     5),
            (12,'Kirti',   39,     'New York',  7),
            (13,'Veena',   40,     'Texas',    21),
            (14,'Lucifer', np.nan, 'Texas',    30),
            (15,'Pablo',   30,     'New York',  7),
            (16,'Lionel',  45,     'Colombia', 11) ]
#Creating a dataframe object
dfObj = pd.DataFrame(students, columns=['Regd','Name','Age','City','Exp'], index=['a', 'b', 'c' , 'd' , 'e' , 'f', 'g']) 
#Checking that the column exists before dropping it
if 'Last Name' in dfObj.columns :
    # drop() returns a new dataframe, so the result is assigned back
    dfObj = dfObj.drop('Last Name' ,  axis='columns')
    print(dfObj)
else :
    print("The column was not found")
Output :
The column was not found
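Instead of checking first, the errors argument from the syntax above can be set to 'ignore' so that missing labels are skipped. A minimal sketch, using a made-up two-column dataframe:

```python
import pandas as pd

df = pd.DataFrame({'Regd': [10, 11], 'Name': ['Jill', 'Rachel']})

# errors='ignore' skips labels that do not exist instead of raising KeyError
result = df.drop(columns=['Last Name', 'Name'], errors='ignore')
print(list(result.columns))

# With the default errors='raise', a missing label raises KeyError
try:
    df.drop(columns='Last Name')
    raised = False
except KeyError:
    raised = True
print(raised)
```

This keeps the code shorter, at the cost of no longer reporting that the column was missing.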
