Shikha Mishra

Python Word Count (Filter out Punctuation, Dictionary Manipulation, and Sorting Lists)

In this tutorial, we will discuss counting words in Python: filtering out punctuation, manipulating dictionaries, and sorting lists. We will also look at several approaches for producing a list of word-count pairs.

How to count the number of words in a sentence, ignoring numbers, punctuation, and whitespace?

First, we take a paragraph, replace punctuation with spaces, and convert all words to lowercase. Then we count how many times each word occurs in that paragraph.

Text="Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The following pages are a useful first step to get on your way writing programs with Python!The community hosts conferences and meetups, collaborates on code, and much more. Python's documentation will help you along the way, and the mailing lists will keep you in touch.Python is developed under an OSI-approved open source license, making it freely usable and distributable, even for commercial use. Python's license is administered.Python is a general-purpose coding language—which means that, unlike HTML, CSS, and JavaScript, it can be used for other types of programming and software development besides web development. That includes back end development, software development, data science and writing system scripts among other things."
for char in '-.,\n':
    Text = Text.replace(char, ' ')
Text = Text.lower()
# split returns a list of words delimited by sequences of whitespace (including tabs, newlines, etc, like re's \s) 
word_list = Text.split()
print(word_list)

Output:

['python', 'can', 'be', 'easy', 'to', 'pick', 'up', 'whether', 
"you're", 'a', 'first', 'time', 'programmer', 'or', "you're",
 'experienced', 'with', 'other', 'languages', 'the', 'following', 
'pages', 'are', 'a', 'useful', 'first', 'step', 'to', 'get', 'on', 'your', 
'way', 'writing', 'programs', 'with', 'python!the', 'community',
 'hosts', 'conferences', 'and', 'meetups', 'collaborates', 'on', 'code', 
'and', 'much', 'more', "python's", 'documentation', 'will', 'help', 'you',
 'along', 'the', 'way', 'and', 'the', 'mailing', 'lists', 'will', 'keep', 'you', 'in',
 'touch', 'python', 'is', 'developed', 'under', 'an', 'osi', 'approved', 'open',
 'source', 'license', 'making', 'it', 'freely', 'usable', 'and', 'distributable', 
'even', 'for', 'commercial', 'use', "python's", 'license', 'is', 'administered', 
'python', 'is', 'a', 'general', 'purpose', 'coding', 'language—which', 'means', 
'that', 'unlike', 'html', 'css', 'and', 'javascript', 'it', 'can', 'be', 'used', 'for', 'other', 
'types', 'of', 'programming', 'and', 'software', 'development', 'besides', 'web', 
'development', 'that', 'includes', 'back', 'end', 'development', 'software', 
'development', 'data', 'science', 'and', 'writing', 'system', 'scripts', 'among', 'other', 'things']

The output above is the cleaned list of lowercase words, not yet a list of counts. Only '-', '.', ',' and newlines were replaced, so a token such as 'python!the' is still joined.
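The character loop above handles only four characters. As a minimal sketch of a stricter filter (the tokenize helper and the sample string are illustrative assumptions, not part of the original code), a regular expression can keep only runs of ASCII letters while preserving apostrophes inside a word. Note that this sketch also drops digits and accented letters.

```python
import re

def tokenize(text):
    """Lowercase the text and return words made of letters,
    keeping apostrophes that sit inside a word (e.g. "you're")."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

# Hypothetical sample containing the problem cases seen in the output above
sample = "Writing programs with Python!The language—which you're learning."
tokens = tokenize(sample)
print(tokens)
```

With this approach 'python' and 'the' come out as separate words, which changes the counts produced by the approaches that follow.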

Now we are going to discuss some approaches for counting these words.

Output a List of Word Count Pairs (Sorted from Highest to Lowest)

1. Collections Module:

The collections module approach is the easiest one, but to use it we need to know which class to import.

from collections import Counter

Counter(word_list).most_common()

With the collections module, we import Counter and then use it in our program.

from collections import Counter
Text="Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The following pages are a useful first step to get on your way writing programs with Python!The community hosts conferences and meetups, collaborates on code, and much more. Python's documentation will help you along the way, and the mailing lists will keep you in touch.Python is developed under an OSI-approved open source license, making it freely usable and distributable, even for commercial use. Python's license is administered.Python is a general-purpose coding language—which means that, unlike HTML, CSS, and JavaScript, it can be used for other types of programming and software development besides web development. That includes back end development, software development, data science and writing system scripts among other things."
word_list = Text.split()
count=Counter(word_list).most_common()
print(count)

Output:

[('and', 7), ('a', 3), ('other', 3), ('is', 3), ('can', 2), ('be', 2), ('to', 2), 
("you're", 2), ('first', 2), ('with', 2), ('on', 2), ('writing', 2), ("Python's", 2),
 ('will', 2), ('you', 2), ('the', 2), ('it', 2), ('for', 2), ('software', 2), ('development,', 2), 
('Python', 1), ('easy', 1), ('pick', 1), ('up', 1), ('whether', 1), ('time', 1), ('programmer', 1),
 ('or', 1), ('experienced', 1), ('languages.', 1), ('The', 1), ('following', 1), ('pages', 1), ('are', 1), 
('useful', 1), ('step', 1), ('get', 1), ('your', 1), ('way', 1), ('programs', 1), ('Python!The', 1), 
('community', 1), ('hosts', 1), ('conferences', 1), ('meetups,', 1), ('collaborates', 1), ('code,', 1), 
('much', 1), ('more.', 1), ('documentation', 1), ('help', 1), ('along', 1), ('way,', 1), ('mailing', 1),
 ('lists', 1), ('keep', 1), ('in', 1), ('touch.Python', 1), ('developed', 1), ('under', 1), ('an', 1),
 ('OSI-approved', 1), ('open', 1), ('source', 1), ('license,', 1), ('making', 1), ('freely', 1),
 ('usable', 1), ('distributable,', 1), ('even', 1), ('commercial', 1), ('use.', 1), ('license', 1), 
('administered.Python', 1), ('general-purpose', 1), ('coding', 1), ('language—which', 1), ('means', 1),
 ('that,', 1), ('unlike', 1), ('HTML,', 1), ('CSS,', 1), ('JavaScript,', 1), ('used', 1), ('types', 1), ('of', 1), 
('programming', 1), ('development', 1), ('besides', 1), ('web', 1), ('development.', 1), ('That', 1), 
('includes', 1), ('back', 1), ('end', 1), ('data', 1), ('science', 1), ('system', 1), ('scripts', 1), ('among', 1), ('things.', 1)]
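Because word_list in this example comes straight from split(), tokens such as 'development,' and 'development' are counted separately. The following is a small sketch, assuming a short made-up sentence rather than the paragraph above, that combines the cleaning step with Counter. Passing a number to most_common() limits the result to that many of the most frequent words.

```python
from collections import Counter

# Short illustrative text (an assumption, not the tutorial's paragraph)
text = "The cat and the hat. The cat!"

# Replace punctuation with spaces, then lowercase and split into words
for char in "-.,!\n":
    text = text.replace(char, " ")
words = text.lower().split()

# Keep only the two most frequent words
top_two = Counter(words).most_common(2)
print(top_two)
```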

2. Using For Loops:

This is the second approach. Here we use a for loop and the dictionary get() method.

# No imports are needed: this approach uses only a plain dictionary
Text="Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The following pages are a useful first step to get on your way writing programs with Python!The community hosts conferences and meetups, collaborates on code, and much more. Python's documentation will help you along the way, and the mailing lists will keep you in touch.Python is developed under an OSI-approved open source license, making it freely usable and distributable, even for commercial use. Python's license is administered.Python is a general-purpose coding language—which means that, unlike HTML, CSS, and JavaScript, it can be used for other types of programming and software development besides web development. That includes back end development, software development, data science and writing system scripts among other things."
word_list = Text.split()
# Initializing Dictionary
d = {}
# counting number of times each word comes up in list of words (in dictionary)
for word in word_list: 
    d[word] = d.get(word, 0) + 1
word_freq = []
for key, value in d.items():
    word_freq.append((value, key))
word_freq.sort(reverse=True) 
print(word_freq)

Output:

[(7, 'and'), (3, 'other'), (3, 'is'), (3, 'a'), (2, "you're"), (2, 'you'), (2, 'writing'),
 (2, 'with'), (2, 'will'), (2, 'to'), (2, 'the'), (2, 'software'), (2, 'on'), (2, 'it'), (2, 'for'), (
2, 'first'), (2, 'development,'), (2, 'can'), (2, 'be'), (2, "Python's"), (1, 'your'), (1, 'whether'),
 (1, 'web'), (1, 'way,'), (1, 'way'), (1, 'useful'), (1, 'used'), (1, 'use.'), (1, 'usable'), (1, 'up'), 
(1, 'unlike'), (1, 'under'), (1, 'types'), (1, 'touch.Python'), (1, 'time'), (1, 'things.'), (1, 'that,'), 
(1, 'system'), (1, 'step'), (1, 'source'), (1, 'scripts'), (1, 'science'), (1, 'programs'),
 (1, 'programming'), (1, 'programmer'), (1, 'pick'), (1, 'pages'), (1, 'or'), (1, 'open'), 
(1, 'of'), (1, 'much'), (1, 'more.'), (1, 'meetups,'), (1, 'means'), (1, 'making'), (1, 'mailing'),
 (1, 'lists'), (1, 'license,'), (1, 'license'), (1, 'language—which'), (1, 'languages.'), (1, 'keep'),
 (1, 'includes'), (1, 'in'), (1, 'hosts'), (1, 'help'), (1, 'get'), (1, 'general-purpose'), (1, 'freely'), 
(1, 'following'), (1, 'experienced'), (1, 'even'), (1, 'end'), (1, 'easy'), (1, 'documentation'),
 (1, 'distributable,'), (1, 'development.'), (1, 'development'), (1, 'developed'), (1, 'data'), 
(1, 'conferences'), (1, 'community'), (1, 'commercial'), (1, 'collaborates'), (1, 'coding'), 
(1, 'code,'), (1, 'besides'), (1, 'back'), (1, 'are'), (1, 'an'), (1, 'among'), (1, 'along'), (1, 'administered.Python'),
 (1, 'The'), (1, 'That'), (1, 'Python!The'), (1, 'Python'), (1, 'OSI-approved'), (1, 'JavaScript,'), (1, 'HTML,'), (1, 'CSS,')]

In the above approach, we used a for loop to count the words, then swapped each key and value into a (count, word) tuple so that the list can be sorted by count. Because we passed reverse=True, the list is sorted from highest to lowest.
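Sorting (count, word) tuples with reverse=True also reverses the alphabetical order among words that share a count. If ties should be alphabetical instead, one option is a sort key that negates the count. This is only a sketch with a made-up dictionary, not the tutorial's data.

```python
# Hypothetical word counts used only for illustration
counts = {"with": 2, "and": 7, "be": 2, "can": 2}

# Sort by count descending, then by word ascending for ties
pairs = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
print(pairs)
```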

3. Not using Dictionary Get Method:

In this approach, we do not use the dictionary get() method. Instead, we check whether the word is already a key before incrementing its count.

# No imports are needed: this approach uses only a plain dictionary
Text="Python can be easy to pick up whether you're a first time programmer or you're experienced with other languages. The following pages are a useful first step to get on your way writing programs with Python!The community hosts conferences and meetups, collaborates on code, and much more. Python's documentation will help you along the way, and the mailing lists will keep you in touch.Python is developed under an OSI-approved open source license, making it freely usable and distributable, even for commercial use. Python's license is administered.Python is a general-purpose coding language—which means that, unlike HTML, CSS, and JavaScript, it can be used for other types of programming and software development besides web development. That includes back end development, software development, data science and writing system scripts among other things."
word_list = Text.split()
# Initializing Dictionary
d = {}

# Count number of times each word comes up in list of words (in dictionary)
for word in word_list:
    if word not in d:
        d[word] = 0
    d[word] += 1
word_freq = []
for key, value in d.items():
    word_freq.append((value, key))
word_freq.sort(reverse=True)
print(word_freq)

Output:

[(7, 'and'), (3, 'other'), (3, 'is'), (3, 'a'), (2, "you're"), (2, 'you'), (2, 'writing'),
 (2, 'with'), (2, 'will'), (2, 'to'), (2, 'the'), (2, 'software'), (2, 'on'), (2, 'it'), (2, 'for'), (2, 'first'), 
(2, 'development,'), (2, 'can'), (2, 'be'), (2, "Python's"), (1, 'your'), (1, 'whether'), (1, 'web'),
 (1, 'way,'), (1, 'way'), (1, 'useful'), (1, 'used'), (1, 'use.'), (1, 'usable'), (1, 'up'), (1, 'unlike'),
 (1, 'under'), (1, 'types'), (1, 'touch.Python'), (1, 'time'), (1, 'things.'), (1, 'that,'), (1, 'system'), 
(1, 'step'), (1, 'source'), (1, 'scripts'), (1, 'science'), (1, 'programs'), (1, 'programming'), 
(1, 'programmer'), (1, 'pick'), (1, 'pages'), (1, 'or'), (1, 'open'), (1, 'of'), (1, 'much'),
 (1, 'more.'), (1, 'meetups,'), (1, 'means'), (1, 'making'), (1, 'mailing'), (1, 'lists'), (1, 'license,'), 
(1, 'license'), (1, 'language—which'), (1, 'languages.'), (1, 'keep'), (1, 'includes'), (1, 'in'), (1, 'hosts'),
 (1, 'help'), (1, 'get'), (1, 'general-purpose'), (1, 'freely'), (1, 'following'), (1, 'experienced'), 
(1, 'even'), (1, 'end'), (1, 'easy'), (1, 'documentation'), (1, 'distributable,'), (1, 'development.'),
 (1, 'development'), (1, 'developed'), (1, 'data'), (1, 'conferences'), (1, 'community'), (1, 'commercial'),
 (1, 'collaborates'), (1, 'coding'), (1, 'code,'), (1, 'besides'), (1, 'back'), (1, 'are'), (1, 'an'), (1, 'among'),
 (1, 'along'), (1, 'administered.Python'), (1, 'The'), (1, 'That'), (1, 'Python!The'), (1, 'Python'), 
(1, 'OSI-approved'), (1, 'JavaScript,'), (1, 'HTML,'), (1, 'CSS,')]
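The explicit "if word not in d" check can also be avoided with collections.defaultdict, which is a different technique from the plain dictionary used above. A minimal sketch, assuming a short made-up word list:

```python
from collections import defaultdict

# Illustrative word list (an assumption, not the tutorial's paragraph)
words = ["and", "the", "and", "python", "the", "and"]

counts = defaultdict(int)   # a missing key starts at int() == 0
for word in words:
    counts[word] += 1

# Build (count, word) tuples and sort from highest to lowest
word_freq = sorted(((count, word) for word, count in counts.items()), reverse=True)
print(word_freq)
```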

4. Using Sorted:

# initializing a dictionary
d = {}

# counting number of times each word comes up in list of words
for key in word_list:
    d[key] = d.get(key, 0) + 1

# sort the (word, count) pairs by count, from highest to lowest
word_freq = sorted(d.items(), key=lambda x: x[1], reverse=True)
print(word_freq)

Conclusion:

In this article, you have seen different approaches for counting the words in a sentence while ignoring punctuation and whitespace. Thank you!

Python: How to convert a timestamp string to a datetime object using datetime.strptime()

In this tutorial, we will learn how to convert a timestamp string to a datetime object using datetime.strptime(). You will also see how to create a datetime object from a string in Python with the examples below.

String to a DateTime object using datetime.strptime()

The strptime() method generates a datetime object from the given string.

The datetime module provides a datetime class that has a method to convert a string to a datetime object.

Syntax:

datetime.strptime(date_string, format)

In the above syntax, the method accepts a string containing a timestamp and a format string. It parses the string according to the format codes and returns a datetime object created from it.

To use it, first import the datetime class from the datetime module:

from datetime import datetime
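When the string does not match the format, strptime() raises a ValueError. The sketch below wraps the call in a small helper; the name try_parse and the sample strings are assumptions made for illustration.

```python
from datetime import datetime

def try_parse(date_string, fmt):
    """Return a datetime, or None when the string does not match the format."""
    try:
        return datetime.strptime(date_string, fmt)
    except ValueError:
        return None

matched = try_parse("2021-05-17", "%Y-%m-%d")
mismatched = try_parse("17/05/2021", "%Y-%m-%d")
print(matched)
print(mismatched)
```

Returning None is just one design choice; letting the exception propagate may be preferable when bad input should stop the program.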

Complete Format Code List

Format Code   Description                                               Example
%d            Day of the month as a zero-padded decimal number          01, 02, …, 31
%a            Weekday as an abbreviated name                            Sun, Mon, …, Sat
%A            Weekday as a full name                                    Sunday, Monday, …, Saturday
%m            Month as a zero-padded decimal number                     01, 02, …, 12
%b            Month as an abbreviated name                              Jan, Feb, …, Dec
%B            Month as a full name                                      January, February, …, December
%y            Year without century as a zero-padded decimal number      00, 01, …, 99
%Y            Year with century as a decimal number                     0001, …, 2018, …, 9999
%H            Hour (24-hour clock) as a zero-padded decimal number      00, 01, …, 23
%M            Minute as a zero-padded decimal number                    00, 01, …, 59
%S            Second as a zero-padded decimal number                    00, 01, …, 59
%f            Microsecond as a decimal number, zero-padded on the left  000000, 000001, …, 999999
%I            Hour (12-hour clock) as a zero-padded decimal number      01, 02, …, 12
%p            Locale’s equivalent of either AM or PM                    AM, PM
%j            Day of the year as a zero-padded decimal number           001, 002, …, 366
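The same format codes work in both directions: strptime() reads them and strftime() writes them. As a brief sketch with an assumed sample timestamp, the code below parses a 12-hour time and prints a few fields back out.

```python
from datetime import datetime

# Assumed sample timestamp in 12-hour notation
parsed = datetime.strptime("2021-05-17 08:12:22 PM", "%Y-%m-%d %I:%M:%S %p")

hour_24 = parsed.hour                        # 24-hour value derived from %I and %p
day_of_year = parsed.strftime("%j")          # zero-padded to three digits
short_form = parsed.strftime("%d/%m/%y %H:%M")

print(hour_24)
print(day_of_year)
print(short_form)
```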

How does strptime() work?

The strptime() class method takes two arguments:

  • string (to be converted to datetime)
  • format code

Based on the string and format code used, the method returns its equivalent datetime object.

Let’s see the following example to understand how it works:

[Figure: Python strptime() method example]

where,

%d – Represents the day of the month. Example: 01, 02, …, 31
%B – Month’s name in full. Example: January, February etc.
%Y – Year in four digits. Example: 2018, 2019 etc.
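The three codes listed above can be combined into one format string. The sample date string below is an assumption chosen to match those codes, not a value taken from the original figure.

```python
from datetime import datetime

date_string = "21 June, 2018"   # assumed sample input

# %d reads the day, %B the full month name, %Y the four-digit year
date_object = datetime.strptime(date_string, "%d %B, %Y")
print(date_object)
```

The literal comma and spaces in the format must appear in the string exactly as written.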

Examples of converting a Time String in the format codes using strptime() method

Have a look at a few examples of converting a timestamp string to a datetime object using datetime.strptime() in Python.

Example 1:

Let’s take an example,

from datetime import datetime
datetimeObj = datetime.strptime('2021-05-17T15::11::45.456777', '%Y-%m-%dT%H::%M::%S.%f')
print(datetimeObj)
print(type(datetimeObj))

Output:

2021-05-17 15:11:45.456777
<class 'datetime.datetime'>

So in the above example, you can see that we have converted a time string in the format “YYYY-MM-DDTHH::MM::SS.MICROS” to a DateTime object.

Let’s take another example,

Example 2:

from datetime import datetime
datetimeObj = datetime.strptime('17/May/2021 14:12:22', '%d/%b/%Y %H:%M:%S')
print(datetimeObj)
print(type(datetimeObj))

Output:

2021-05-17 14:12:22
<class 'datetime.datetime'>

This is another way to write a timestamp: here we converted a time string in the format “DD/Mon/YYYY HH:MM:SS”, with an abbreviated month name, to a datetime object.

Example 3:

If we want only the date from a string in the format “DD MMM YYYY”, we can do this:

from datetime import datetime
datetimeObj = datetime.strptime('17 May 2021', '%d %b %Y')
# Get the date object from datetime object
dateObj = datetimeObj.date()
print(dateObj)
print(type(dateObj))

Output:

2021-05-17
<class 'datetime.date'>

Example 4:

If we want only the time from a string in the format “HH:MM:SS AM/PM”, we can do this:

from datetime import datetime
datetimeObj = datetime.strptime('08:12:22 PM', '%I:%M:%S %p') 
# Get the time object from datetime object 
timeObj = datetimeObj.time()
print(timeObj) 
print(type(timeObj))

Output:

20:12:22
<class 'datetime.time'>

Example 5:

If the timestamp is embedded in a sentence of text, we can parse it like this:

from datetime import datetime
textStr = "On January the 17th of 2021 meet me at 8 PM"
datetimeObj = datetime.strptime(textStr, "On %B the %dth of %Y meet me at %I %p")
print(datetimeObj)

Output:

2021-01-17 20:00:00
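The format in Example 5 hard-codes the suffix "th", so it would not match "1st" or "22nd". One possible workaround, sketched here with a hypothetical helper named parse_with_ordinal, is to strip the ordinal suffix with a regular expression before parsing.

```python
import re
from datetime import datetime

def parse_with_ordinal(text, fmt):
    """Strip st/nd/rd/th after a day number, then parse with strptime."""
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text)
    return datetime.strptime(cleaned, fmt)

# Same sentence pattern as Example 5, without the literal "th"
fmt = "On %B the %d of %Y meet me at %I %p"
first = parse_with_ordinal("On January the 1st of 2021 meet me at 8 PM", fmt)
second = parse_with_ordinal("On March the 22nd of 2021 meet me at 9 AM", fmt)
print(first)
print(second)
```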

Conclusion:

In this tutorial, we have shown different examples of how to convert a timestamp string to a datetime object using datetime.strptime(). Thank you!

Pandas: Select first or last N rows in a Dataframe using head() & tail()

In this tutorial, we are going to discuss how to select the first or last N rows in a Dataframe using the head() and tail() functions.

Select first N Rows from a Dataframe using head() function

pandas.DataFrame.head()

In Python’s Pandas module, the Dataframe class provides the head() function to fetch the top rows from a dataframe.

Syntax:

DataFrame.head(self, n=5)

If we pass a value for n, head() returns the first n rows; otherwise it returns the first 5 rows by default.
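The default and the effect of n can be checked on a small frame. The sketch below uses a made-up single-column dataframe; it also shows that a negative n makes head() return every row except the last |n| rows.

```python
import pandas as pd

# Hypothetical ten-row dataframe used only for illustration
df = pd.DataFrame({"value": range(10)})

default_rows = df.head()        # n defaults to 5
all_but_last_two = df.head(-2)  # negative n: drop the last 2 rows

print(len(default_rows))
print(len(all_but_last_two))
```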

Let’s create a dataframe first,

import pandas as pd
# List of Tuples
empoyees = [('Ram', 34, 'Sunderpur', 5) ,
           ('Riti', 31, 'Delhi' , 7) ,
           ('Aman', 16, 'Thane', 9) ,
           ('Shishir', 41,'Delhi' , 12) ,
           ('Veeru', 33, 'Delhi' , 4) ,
           ('Shan',35,'Mumbai', 5 ),
           ('Shikha', 35, 'kolkata', 11)
            ]
# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])
print("Contents of the Dataframe : ")
print(empDfObj)

Output:

Contents of the Dataframe :
      Name  Age       City  Experience
a      Ram   34  Sunderpur           5
b     Riti   31      Delhi           7
c     Aman   16      Thane           9
d  Shishir   41      Delhi          12
e    Veeru   33      Delhi           4
f     Shan   35     Mumbai           5
g   Shikha   35    kolkata          11

So if we want to select the top 4 rows from the dataframe,

import pandas as pd
# List of Tuples
empoyees = [('Ram', 34, 'Sunderpur', 5) ,
           ('Riti', 31, 'Delhi' , 7) ,
           ('Aman', 16, 'Thane', 9) ,
           ('Shishir', 41,'Delhi' , 12) ,
           ('Veeru', 33, 'Delhi' , 4) ,
           ('Shan',35,'Mumbai', 5 ),
           ('Shikha', 35, 'kolkata', 11)
            ]
# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

dfObj1 = empDfObj.head(4)
print("First 4 rows of the Dataframe : ")
print(dfObj1)

Output:

First 4 rows of the Dataframe :
      Name  Age       City  Experience
a      Ram   34  Sunderpur           5
b     Riti   31      Delhi           7
c     Aman   16      Thane           9
d  Shishir   41      Delhi          12

In the above example, we passed n = 4, so head() returned the top 4 rows of the dataframe.

Select first N rows from the dataframe with specific columns

While selecting the first 3 rows, we can select specific columns too:

import pandas as pd
# List of Tuples
empoyees = [('Ram', 34, 'Sunderpur', 5) ,
           ('Riti', 31, 'Delhi' , 7) ,
           ('Aman', 16, 'Thane', 9) ,
           ('Shishir', 41,'Delhi' , 12) ,
           ('Veeru', 33, 'Delhi' , 4) ,
           ('Shan',35,'Mumbai', 5 ),
           ('Shikha', 35, 'kolkata', 11)
            ]
# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# Select the top 3 rows of the Dataframe for 2 columns only
dfObj1 = empDfObj[['Name', 'City']].head(3)
print("First 3 rows of the Dataframe for 2 columns : ")
print(dfObj1)

Output:

First 3 rows of the Dataframe for 2 columns :
   Name       City
a   Ram  Sunderpur
b  Riti      Delhi
c  Aman      Thane

Select last N Rows from a Dataframe using tail() function

In the Pandas module, the Dataframe class provides a tail() function to select bottom rows from a Dataframe.

Syntax:

DataFrame.tail(self, n=5)

It will return the last n rows from a dataframe. If n is not provided then the default value is 5. So for this, we are going to use the above dataframe as an example,

import pandas as pd
# List of Tuples
empoyees = [('Ram', 34, 'Sunderpur', 5) ,
           ('Riti', 31, 'Delhi' , 7) ,
           ('Aman', 16, 'Thane', 9) ,
           ('Shishir', 41,'Delhi' , 12) ,
           ('Veeru', 33, 'Delhi' , 4) ,
           ('Shan',35,'Mumbai', 5 ),
           ('Shikha', 35, 'kolkata', 11)
            ]
# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# Select the last 4 rows of the Dataframe
dfObj1 = empDfObj.tail(4)
print("Last 4 rows of the Dataframe : ")
print(dfObj1)

Output:

Last 4 rows of the Dataframe :
      Name  Age     City  Experience
d  Shishir   41    Delhi          12
e    Veeru   33    Delhi           4
f     Shan   35   Mumbai           5
g   Shikha   35  kolkata          11

In the above example, we passed n = 4, so the tail() function returned the last 4 rows.
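As with head(), the behaviour of n can be checked on a small frame. This sketch uses a made-up single-column dataframe; a negative n makes tail() return every row except the first |n| rows.

```python
import pandas as pd

# Hypothetical ten-row dataframe used only for illustration
df = pd.DataFrame({"value": range(10)})

last_five = df.tail()           # n defaults to 5
skip_first_three = df.tail(-3)  # negative n: drop the first 3 rows

print(last_five["value"].tolist())
print(skip_first_three["value"].tolist())
```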

Select bottom N rows from the dataframe with specific columns

In this, while selecting the last 4 rows, we can select specific columns too,

import pandas as pd
# List of Tuples
empoyees = [('Ram', 34, 'Sunderpur', 5) ,
           ('Riti', 31, 'Delhi' , 7) ,
           ('Aman', 16, 'Thane', 9) ,
           ('Shishir', 41,'Delhi' , 12) ,
           ('Veeru', 33, 'Delhi' , 4) ,
           ('Shan',35,'Mumbai', 5 ),
           ('Shikha', 35, 'kolkata', 11)
            ]
# Create a DataFrame object
empDfObj = pd.DataFrame(empoyees, columns=['Name', 'Age', 'City', 'Experience'], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# Select the bottom 4 rows of the Dataframe for 2 columns only
dfObj1 = empDfObj[['Name', 'City']].tail(4)
print("Last 4 rows of the Dataframe for 2 columns : ")
print(dfObj1)

Output:

Last 4 rows of the Dataframe for 2 columns :
      Name     City
d  Shishir    Delhi
e    Veeru    Delhi
f     Shan   Mumbai
g   Shikha  kolkata

Conclusion:

In this article, you have seen how to select the first or last N rows in a Dataframe using the head() and tail() functions. Thank you!

Pandas: Find maximum values & position in columns or rows of a Dataframe | How to find the max value of a pandas DataFrame column in Python?

In this article, we will discuss how to find the maximum value in the rows or columns of a Dataframe, along with its index position.

DataFrame.max()

Pandas provides a member function of the dataframe, max(), to find the maximum value.

Syntax:

DataFrame.max(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Dataframe.max() accepts these arguments:

axis: The axis along which the maximum is searched (0 for each column, 1 for each row).

skipna: Defaults to True, which means NaN values are skipped when searching for the maximum.
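The effect of these two arguments can be seen on a tiny frame. This is a sketch with made-up values, separate from the dataframe created below.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row dataframe with one missing value
df = pd.DataFrame({"x": [1.0, 4.0], "y": [np.nan, 2.0]})

col_max = df.max()                  # axis=0: one maximum per column, NaN skipped
row_max = df.max(axis=1)            # axis=1: one maximum per row
col_max_nan = df.max(skipna=False)  # NaN is not skipped

print(col_max)
print(row_max)
print(col_max_nan)
```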

Let’s create a dataframe,

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
print(dfObj)

Output:

    x     y     z
a  17  15.0  12.0
b  53   NaN  10.0
c  46  34.0  11.0
d  35  45.0   NaN
e  76  26.0  13.0

Get maximum values in every row & column of the Dataframe

Here, you will find two ways to get the maximum values in a dataframe.

Get maximum values of every column

In this, we will call the max() function to find the maximum value of every column in DataFrame.

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# Get a series containing maximum value of each column
maxValuesObj = dfObj.max()
print('Maximum value in each column : ')
print(maxValuesObj)

Output:

Maximum value in each column :
x 76.0
y 45.0
z 13.0

Get maximum values of every row

In this also we will call the max() function to find the maximum value of every row in DataFrame.

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# Get a series containing maximum value of each row
maxValuesObj = dfObj.max(axis=1)
print('Maximum value in each row : ')
print(maxValuesObj)

Output:

Maximum value in each row :
a   17.0
b   53.0
c   46.0
d   45.0
e   76.0

So in the above example, you can see that it returned a series with a row index label and maximum value of each row.

Get maximum values of every column without skipping NaN

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# Get a series containing maximum value of each column without skipping NaN
maxValuesObj = dfObj.max(skipna=False)
print('Maximum value in each column including NaN: ')
print(maxValuesObj)

Output:

Maximum value in each column including NaN:
x 76.0
y NaN
z NaN

In the above example, we passed skipna=False to the max() function, so NaN values were included while searching for the maximum.

If a column contains any NaN, the result for that column is NaN.

Get maximum values of a single column or selected columns

So for getting a single column maximum value we have to select that column and apply the max() function in it,

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# Get maximum value of a single column 'y'
maxValue = dfObj['y'].max()
print("Maximum value in column 'y': " , maxValue)

Here we selected column 'y' with dfObj['y'] and called max() on it to get the maximum value in that column.

Output:

Maximum value in column 'y': 45.0

We can also pass a list of column names instead of a single column:

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# Get maximum value of columns 'y' and 'z'
maxValue = dfObj[['y', 'z']].max()
print("Maximum value in column 'y' & 'z': ")
print(maxValue)

Output:

Maximum value in column 'y' & 'z':
y 45.0
z 13.0
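If a single overall maximum across the selected columns is needed rather than one value per column, max() can be applied a second time to the resulting series. A minimal sketch, assuming a small made-up frame:

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with missing values
df = pd.DataFrame({"y": [15, np.nan, 34], "z": [12, 10, np.nan]})

per_column = df[["y", "z"]].max()   # series: one maximum per column
overall_max = per_column.max()      # single largest value across both columns
print(overall_max)
```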

Get row index label or position of maximum values of every column

DataFrame.idxmax()

The examples above show how to get the maximum value of each row or column. If we also want to know the index label of the row or column where the maximum occurs, we can use DataFrame.idxmax().

Syntax:

DataFrame.idxmax(axis=0, skipna=True)

Get row index label of Maximum value in every column

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# get the index position of max values in every column
maxValueIndexObj = dfObj.idxmax()
print("Max values of columns are at row index position :")
print(maxValueIndexObj)

Output:

Max values of columns are at row index position :
x e
y d
z e
dtype: object

Here idxmax() returned, for each column, the row index label where the maximum value exists.
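The label returned by idxmax() can be passed to loc to read the maximum value back, which gives both the value and where it sits. A sketch with an assumed small frame:

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with row labels a, b, c
df = pd.DataFrame({"x": [17, 53, 76], "y": [45, np.nan, 15]}, index=list("abc"))

results = {}
for column in df.columns:
    label = df[column].idxmax()      # row label of the column maximum
    value = df.loc[label, column]    # the maximum value itself
    results[column] = (label, value)

print(results)
```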

Get Column names of Maximum value in every row

import pandas as pd
import numpy as np
# List of Tuples
matrix = [(17, 15, 12),
          (53, np.nan, 10),
          (46, 34, 11),
          (35, 45, np.nan),
          (76, 26, 13)
          ]
# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
# get the column name of max values in every row
maxValueIndexObj = dfObj.idxmax(axis=1)
print("Max values of row are at following columns :")
print(maxValueIndexObj)

Output:

Max values of row are at following columns :
a x
b x
c x
d y
e x
dtype: object

Here idxmax(axis=1) returned, for each row, the name of the column where the maximum value exists.
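idxmax() returns labels, not integer positions. When a zero-based position is wanted, one option is to look the label up in the index with get_loc(). This sketch assumes a small made-up frame with unique index labels.

```python
import pandas as pd

# Hypothetical single-column dataframe with unique row labels
df = pd.DataFrame({"x": [17, 53, 46, 35, 76]}, index=list("abcde"))

label = df["x"].idxmax()             # row label of the maximum
position = df.index.get_loc(label)   # zero-based position of that label
print(label, position)
```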

Conclusion:

In this article, we have seen how to find the maximum value in the rows or columns of a Dataframe, along with its index position. Thank you!

Python: Add column to dataframe in Pandas ( based on other column or list or default value)

In this tutorial, we are going to discuss different ways to add columns to a dataframe in pandas, including adding a new column to an existing DataFrame from a list or from a default value.

Pandas Add Column

Pandas is a data analytics library created for Python to implement data manipulation and data analysis. The library is made up of data structures and operations for dealing with numerical tables, analyzing data, and working with time series.

Basically, there are three ways to add columns to a pandas dataframe: using the [] operator, using the assign() function, and using insert().

We will discuss it all one by one.
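As a quick preview of the three ways just listed, the sketch below applies each one to a small made-up dataframe. The column names and values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical two-row dataframe
df = pd.DataFrame({"Name": ["Rakesh", "Rekha"], "Age": [34, 30]})

# 1. [] operator: adds the column in place, at the end
df["Score"] = [10, 20]

# 2. assign(): returns a new dataframe and leaves df unchanged
df_with_total = df.assign(Total=100)

# 3. insert(): adds the column in place at a chosen position
df.insert(1, "City", ["Agra", "Pune"])

print(df)
print(df_with_total)
```

The main practical difference is that assign() returns a copy, while the [] operator and insert() modify the existing dataframe.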

First, let’s create a dataframe object,

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])
print(df_obj )

Output:

     Name  Age       City  Country
a  Rakesh   34       Agra    India
b   Rekha   30       Pune    India
c  Suhail   31     Mumbai    India
d  Neelam   32  Bangalore    India
e     Jay   16     Bengal    India
f   Mahak   17   Varanasi    India

Add column to dataframe in pandas using [] operator

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])

# Add column with Name Score
df_obj['Score'] = [10, 20, 45, 33, 22, 11]
print(df_obj )

Output:

      Name     Age   City        Country   Score
a    Rakesh    34    Agra          India      10
b    Rekha     30    Pune          India      20
c    Suhail     31     Mumbai    India      45
d    Neelam  32    Bangalore  India      33
e    Jay         16     Bengal       India      22
f    Mahak    17    Varanasi     India       11

So in the above example, we have added one extra column 'Score' to our dataframe, using the values in the list. The column was added because the dataframe had no column named 'Score'. If a column with the same name already exists, the [] operator replaces all of its values instead.
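The add-or-replace behaviour described above can be checked with a small sketch. The three-row dataframe and the variable names here are made up for illustration only.

```python
import pandas as pd

# A small illustrative dataframe (not the one used in the article)
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha', 'Suhail'],
                   'Age': [34, 30, 31]},
                  index=['a', 'b', 'c'])

# 'Score' does not exist yet, so this adds a new column at the end
df['Score'] = [10, 20, 45]
columns_after_add = list(df.columns)

# 'Score' exists now, so the same syntax overwrites its values
df['Score'] = [11, 21, 46]
columns_after_overwrite = list(df.columns)

print(columns_after_add)
print(columns_after_overwrite)
print(df)
```

The column list is the same after both assignments; only the values of 'Score' change the second time.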

Add new column to DataFrame with same default value

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])

df_obj['Total'] = 100
print(df_obj)

Output:

         Name    Age    City     Country      Total
a       Rakesh   34    Agra       India          100
b       Rekha    30    Pune       India          100
c       Suhail     31   Mumbai  India          100
d       Neelam  32  Bangalore India        100
e       Jay          16  Bengal       India        100
f       Mahak     17 Varanasi     India        100

So in the above example, we have added a new column ‘Total’ with the same value of 100 in each index.

Add column based on another column

Let's add a new column 'Percentage' whose entry at each index is calculated from the values of other columns at that index. The example below assumes a dataframe that already contains 'Marks' and 'Total' columns, i.e.,

df_obj['Percentage'] = (df_obj['Marks'] / df_obj['Total']) * 100
df_obj

Output:

    Name  Age       City    Country  Marks  Total  Percentage
a   jack   34     Sydeny  Australia     10     50        20.0
b   Riti   30      Delhi      India     20     50        40.0
c  Vikas   31     Mumbai      India     45     50        90.0
d  Neelu   32  Bangalore      India     33     50        66.0
e   John   16   New York         US     22     50        44.0
f   Mike   17  las vegas         US     11     50        22.0
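The snippet above needs a dataframe that already has 'Marks' and 'Total' columns. Here is a minimal self-contained sketch of the same calculation. The three rows and the Total value of 50 are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative data; only three rows to keep the sketch short
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India')]
df_obj = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c'])

# Add the two columns the calculation depends on
df_obj['Marks'] = [10, 20, 45]
df_obj['Total'] = 50

# The division and multiplication are applied row by row
df_obj['Percentage'] = (df_obj['Marks'] / df_obj['Total']) * 100
print(df_obj)
```

Arithmetic between two columns is element-wise, so no explicit loop over the rows is needed.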

Append column to dataFrame using assign() function

For this, we are going to use the same dataframe that we created at the start.

Syntax:

DataFrame.assign(**kwargs)

Let’s add columns in DataFrame using assign().

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])
mod_fd = df_obj.assign(Marks=[10, 20, 45, 33, 22, 11])
print(mod_fd)

Output:

(Image: dataframe with the new 'Marks' column added using assign())

assign() returns a new dataframe that contains the new column 'Marks'; the original dataframe is not modified. The values provided in the list are used as the column values.
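Since assign() returns a new dataframe, it is worth checking what happens to the original. Here is a short sketch with made-up two-row data:

```python
import pandas as pd

# Illustrative two-row dataframe
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha'], 'Age': [34, 30]})

# assign() builds and returns a new dataframe with the extra column
new_df = df.assign(Marks=[10, 20])

print('Original columns:', list(df.columns))
print('New columns     :', list(new_df.columns))
```

To keep the new column, the result has to be stored, either in a new variable as here or back in the original name.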

Add column in DataFrame based on other column using lambda function

In this method, we use two existing columns, i.e., 'Score' and 'Total', to create a new column, i.e., 'Percentage'.

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])
df_obj['Score'] = [10, 20, 45, 33, 22, 11]
df_obj['Total'] = 100
df_obj = df_obj.assign(Percentage=lambda x: (x['Score'] / x['Total']) * 100)
print(df_obj)

Output:

(Image: dataframe with the new 'Percentage' column calculated from 'Score' and 'Total')

Add new column to Dataframe using insert()

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])
# Insert column at index position 2 (i.e., as the 3rd column) of Dataframe
df_obj.insert(2, "Marks", [10, 20, 45, 33, 22, 11], True)
print(df_obj)

Output:

(Image: dataframe with the 'Marks' column inserted at index position 2)

In the other examples, the new column was added at the end of the dataframe. To place a new column between the existing columns, as in the above example, we can use the insert() function.
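Two details of insert() are easy to miss: it modifies the dataframe in place, and by default it refuses to insert a column whose name already exists. Here is a small sketch with illustrative data:

```python
import pandas as pd

# Illustrative two-row dataframe
df = pd.DataFrame({'Name': ['Rakesh', 'Rekha'], 'City': ['Agra', 'Pune']})

# Positions are zero-based, so 1 places 'Age' between 'Name' and 'City'
df.insert(1, 'Age', [34, 30])
order = list(df.columns)
print(order)

# Inserting an existing column name raises ValueError
# unless allow_duplicates=True is passed
try:
    df.insert(0, 'Age', [1, 2])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print('Duplicate rejected:', duplicate_rejected)
```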

Add a column to Dataframe by dictionary

import pandas as pd
# List of Tuples
students = [('Rakesh', 34, 'Agra', 'India'),
            ('Rekha', 30, 'Pune', 'India'),
            ('Suhail', 31, 'Mumbai', 'India'),
            ('Neelam', 32, 'Bangalore', 'India'),
            ('Jay', 16, 'Bengal', 'India'),
            ('Mahak', 17, 'Varanasi', 'India')]
# Create a DataFrame object
df_obj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])
ids = [11, 12, 13, 14, 15, 16]
# Provide 'ID' as the column name and for values provide dictionary
df_obj['ID'] = dict(zip(ids, df_obj['Name']))
print(df_obj)

Output:

(Image: dataframe with the new 'ID' column added using a dictionary)

Read more Articles on Python Data Analysis Using Pandas – Add Contents to a Dataframe

Pandas- Loop or Iterate over all or certain columns of a dataframe

Pandas : Loop or Iterate over all or certain columns of a dataframe

In this article, we will discuss how to loop or iterate over all or certain columns of a DataFrame. You will also learn what a dataframe is and how to iterate over its columns, with explanations and example code.

About DataFrame

A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array, or a table with rows and columns.

First, we are going to create a dataframe that we will use throughout this article.

import pandas as pd

employees = [('Abhishek', 34, 'Sydney') ,
           ('Sumit', 31, 'Delhi') ,
           ('Sampad', 16, 'New York') ,
           ('Shikha', 32,'Delhi') ,
            ]

#load data into a DataFrame object:
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])

print(df)

Output:

    Name       Age    City
a  Abhishek  34    Sydney
b  Sumit       31    Delhi
c  Sampad   16     New York
d  Shikha     32     Delhi

Using DataFrame.items()

We are going to iterate over the columns of a dataframe using DataFrame.items().

The DataFrame class provides a member function items() that yields each column as a (column name, Series) pair. Older tutorials use iteritems() for this; that method was deprecated in pandas 1.5 and removed in pandas 2.0, so items() is used here.

import pandas as pd
employees = [('Abhishek', 34, 'Sydney') ,
             ('Sumit', 31, 'Delhi') ,
             ('Sampad', 16, 'New York') ,
             ('Shikha', 32,'Delhi') , ]
#load data into a DataFrame object:
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])
# Yields a tuple of column name and series for each column in the dataframe
for (columnName, columnData) in df.items():
   print('Column Name : ', columnName)
   print('Column Contents : ', columnData.values)

Output:

Column Name : Name
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']
Column Name : Age
Column Contents : [34 31 16 32]
Column Name : City
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']

In the above example, items() returns an iterator that can be used to iterate over all the columns. For each column, it yields a tuple containing the column name and the column contents as a Series.

Iterate over columns in dataframe using Column Names

import pandas as pd
employees = [('Abhishek', 34, 'Sydney') ,
             ('Sumit', 31, 'Delhi') ,
             ('Sampad', 16, 'New York') ,
             ('Shikha', 32,'Delhi') , ]
#load data into a DataFrame object:
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])
# Iterating over a dataframe yields its column names
for column in df:
   # Select column contents by column name using [] operator
   columnSeriesObj = df[column]
   print('Column Name : ', column)
   print('Column Contents : ', columnSeriesObj.values)

Output:

Column Name : Name
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']
Column Name : Age
Column Contents : [34 31 16 32]
Column Name : City
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']

In the above example, we can see that iterating directly over the dataframe gives the same sequence of column names as Dataframe.columns. We loop over these names and use each one to select and print the column contents.

Iterate Over columns in dataframe in reverse order

import pandas as pd
employees = [('Abhishek', 34, 'Sydney') ,
             ('Sumit', 31, 'Delhi') ,
             ('Sampad', 16, 'New York') ,
             ('Shikha', 32,'Delhi') , ]
#load data into a DataFrame object:
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])
# Iterate over the column names in reverse order
for column in reversed(df.columns):
   # Select column contents by column name using [] operator
   columnSeriesObj = df[column]
   print('Column Name : ', column)
   print('Column Contents : ', columnSeriesObj.values)

Output:

Column Name : City
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']
Column Name : Age
Column Contents : [34 31 16 32]
Column Name : Name
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']

We have used reversed(df.columns), which gives us the column names in reverse order, and then printed each column's name and contents.

Iterate Over columns in dataframe by index using iloc[]

import pandas as pd
employees = [('Abhishek', 34, 'Sydney') ,
             ('Sumit', 31, 'Delhi') ,
             ('Sampad', 16, 'New York') ,
             ('Shikha', 32,'Delhi') , ]
#load data into a DataFrame object:
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'], index=['a', 'b', 'c', 'd'])
# Iterate over the column index positions, from 0 to the number of columns - 1
for index in range(df.shape[1]):
   print('Column Number : ', index)
   # Select column by index position using iloc[]
   columnSeriesObj = df.iloc[: , index]
   print('Column Contents : ', columnSeriesObj.values)

Output:

Column Number : 0
Column Contents : ['Abhishek' 'Sumit' 'Sampad' 'Shikha']
Column Number : 1
Column Contents : [34 31 16 32]
Column Number : 2
Column Contents : ['Sydney' 'Delhi' 'New York' 'Delhi']

So in the above example, you can see that we have iterated over all columns of the dataframe, from the column at index 0 to the last column. We have selected the contents of each column using iloc[].
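The examples so far loop over every column. To loop over only certain columns, one option is to loop over a list of the wanted column names. Here is a minimal sketch; the choice of 'Name' and 'City' and the `selected` dictionary are just for illustration.

```python
import pandas as pd

employees = [('Abhishek', 34, 'Sydney'),
             ('Sumit', 31, 'Delhi'),
             ('Sampad', 16, 'New York'),
             ('Shikha', 32, 'Delhi')]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City'],
                  index=['a', 'b', 'c', 'd'])

# Loop over a chosen subset of column names only
selected = {}
for column_name in ['Name', 'City']:
    selected[column_name] = df[column_name].tolist()
    print('Column Name : ', column_name)
    print('Column Contents : ', df[column_name].values)

# The same idea with items(): select the subset first, then iterate
for column_name, column_data in df[['Age']].items():
    print('Column Name : ', column_name)
    print('Column Contents : ', column_data.values)
```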

Read more Articles on Python Data Analysis Using Pandas

Conclusion:

At last, I can say that the methods explained above will help you a lot in understanding how to loop or iterate over all or certain columns of a dataframe in Pandas. Thank you!

Sorting 2D Numpy Array by column or row in Python

Sorting 2D Numpy Array by column or row in Python | How to sort the NumPy array by column or row in Python?

In this tutorial, we are going to discuss how to sort a NumPy array by column or row in Python, with example code for each case.

How to Sort the NumPy Array by Column in Python?

In this section, you will be learning the concept of Sorting 2D Numpy Array by a column

Firstly, we have to import the numpy module, i.e.,

import numpy as np

After that, create a 2D Numpy array i.e.,

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Suppose we want to sort this 2D array by its 2nd column, so that the result looks like this,

[[13 10 33 19]
[21 17 13 14]
[21 22 23 20]]

To do that, first, we have to change the positioning of all rows in the 2D numpy array on the basis of sorted values of the 2nd column i.e. column at index 1.

Let’s see how to sort it,

Sorting 2D Numpy Array by column at index 1

In this, we will use arr2D[:,columnIndex].argsort(), which gives the array of row indices that sort this column. Indexing the 2D array with these indices rearranges its rows in that order.

import numpy as np

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)
columnIndex = 1
# Sort 2D numpy array by 2nd Column
sortedArr = arr2D[arr2D[:,columnIndex].argsort()]
print('Sorted 2D Numpy Array')
print(sortedArr)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Sorted 2D Numpy Array
[[13 10 33 19]
[21 17 13 14]
[21 22 23 20]]

So in the above example, you have seen that we rearranged all rows of the array according to the sorted values of the 2nd column, i.e., the column at index 1.
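To see why this works, it helps to look at the intermediate result of argsort() on its own. The sketch below uses the same array; the variable names `order`, `ascending` and `descending` are illustrative. It also shows that reversing the index array sorts the rows in descending order of the column.

```python
import numpy as np

arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])

# Column at index 1 is [22, 17, 10]; argsort returns the row indices
# that would put these values in ascending order
order = arr2D[:, 1].argsort()
print('Row order:', order)

# Indexing with the index array rearranges whole rows
ascending = arr2D[order]
# Reversing the index array gives descending order of the same column
descending = arr2D[order[::-1]]

print(ascending)
print(descending)
```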

Sorting 2D Numpy Array by column at index 0

Let’s see how it will work when we give index 0.

import numpy as np

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)
# Sort 2D numpy array by first column
sortedArr = arr2D[arr2D[:,0].argsort()]
print('Sorted 2D Numpy Array')
print(sortedArr)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Sorted 2D Numpy Array
[[13 10 33 19]
[21 22 23 20]
[21 17 13 14]]

Sorting 2D Numpy Array by the Last Column

import numpy as np

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)
# Sort 2D numpy array by last column
sortedArr = arr2D[arr2D[:, -1].argsort()]
print('Sorted 2D Numpy Array')
print(sortedArr)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Sorted 2D Numpy Array
[[21 17 13 14]
[13 10 33 19]
[21 22 23 20]]

How to Sort the NumPy array by Row in Python?

By using similar logic, we can also sort a 2D Numpy array by a single row, i.e., rearrange the columns of the 2D numpy array so that the given row is sorted.

Look at the below examples and learn how it works easily,

Let's assume we have a 2D Numpy array, i.e.

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Sorting 2D Numpy Array by row at index position 1

So we are going to use the above example to show how we sort an array by row.

import numpy as np

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)
# Sort 2D numpy array by 2nd row
sortedArr = arr2D [ :, arr2D[1].argsort()]
print('Sorted 2D Numpy Array')
print(sortedArr)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Sorted 2D Numpy Array
[[23 20 22 21]
[13 14 17 21]
[33 19 10 13]]

So you can see that the columns changed position. We selected the row at the given index position using the [] operator, used argsort() to get the indices that sort that row, and then rearranged the columns in that order so that the row is sorted.
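The same steps can be written out one at a time. This sketch uses the same array; `row_order` and `sorted_by_row` are illustrative names.

```python
import numpy as np

arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])

# Row at index 1 is [21, 17, 13, 14]; argsort gives the column
# indices that put this row in ascending order
row_order = arr2D[1].argsort()
print('Column order:', row_order)

# ':' keeps every row, the index array rearranges the columns
sorted_by_row = arr2D[:, row_order]
print(sorted_by_row)
```

Only the chosen row ends up sorted. The other rows are carried along column by column, so each column keeps its original values.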

Sorting 2D Numpy Array by the Last Row

import numpy as np

# Create a 2D Numpy array list of list
arr2D = np.array([[21, 22, 23, 20], [21, 17, 13, 14], [13, 10, 33, 19]])
print('2D Numpy Array')
print(arr2D)
# Sort 2D numpy array by last row
sortedArr = arr2D[:, arr2D[-1].argsort()]
print('Sorted 2D Numpy Array')
print(sortedArr)

Output:

2D Numpy Array
[[21 22 23 20]
[21 17 13 14]
[13 10 33 19]]

Sorted 2D Numpy Array
[[22 21 20 23]
[17 21 14 13]
[10 13 19 33]]

Conclusion:

So in this article, I have shown you different ways to sort a 2D Numpy Array by column or row in Python.

Happy learning guys!

Count occurrences of a value in NumPy array in Python

Count occurrences of a value in NumPy array in Python | numpy.char.count() in Python

Count Occurrences of a Value in Numpy Array in Python: In this article, we will see different methods to count the number of occurrences of a value in a NumPy array in Python.

Python numpy.char.count() Function

NumPy has no numpy.count() function. The function usually meant by that name is numpy.char.count(), which counts the non-overlapping occurrences of a substring, within the specified range, for each string in an array. The syntax of numpy.char.count() is as follows:

Syntax:

numpy.char.count(a, sub, start=0, end=None)

Parameters:

  • a: array-like of strings to be searched.
  • sub: substring to search for.
  • start, end: [int, optional] Range to search in.

Returns:

An integer array with the number of non-overlapping occurrences of the substring.

Example Code for numpy.char.count() Function in Python

# Python Program illustrating 
# numpy.char.count() method 
import numpy as np 
  
# List of strings to search in
arr = ['vdsdsttetteteAAAa', 'AAAAAAAaattttds', 'AAaaxxxxtt', 'AAaaXDSDdscz']
  
print ("arr : ", arr)
  
print ("Count of 'Aa'", np.char.count(arr, 'Aa'))
print ("Count of 'Aa'", np.char.count(arr, 'Aa', start = 8))

Output:

arr : ['vdsdsttetteteAAAa', 'AAAAAAAaattttds', 'AAaaxxxxtt', 'AAaaXDSDdscz']

Count of 'Aa' [1 1 1 1]
Count of 'Aa' [1 0 0 0]

How to count the occurrences of a value in a NumPy array in Python

Counting the occurrences of a value in a NumPy array means finding how many times the value appears in the array. Here are the various methods used to count the occurrences of a value in a Python numpy array.

Use count_nonzero()

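We use the count_nonzero() function to count occurrences of a value in a NumPy array. It returns the number of non-zero (or True) elements in the given array, so applying it to the result of a comparison gives the number of matches. If the axis argument is None (the default), it returns a single count for the whole array.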

Let’s take an example, count all occurrences of value ‘6’ in an array,

import numpy as np
arr = np.array([9, 6, 7, 5, 6, 4, 5, 6, 5, 4, 7, 8, 6, 6, 7])
print('Numpy Array:')
print(arr)
# Count occurrence of element '6' in numpy array
count = np.count_nonzero(arr == 6)
print('Total occurences of "6" in array: ', count)

Output:

Numpy Array:
[9 6 7 5 6 4 5 6 5 4 7 8 6 6 7]
Total occurences of "6" in array: 5

In the above example, you have seen that we applied a condition to the numpy array, i.e., arr==6. NumPy applies the condition to each element of the array and stores the results as bool values in a new array. count_nonzero() then counts the True values in that array.
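The intermediate boolean array can be printed to make the two steps visible. This sketch uses a shorter, made-up array so the mask is easy to read.

```python
import numpy as np

# Short illustrative array
arr = np.array([9, 6, 7, 5, 6])

# Step 1: the comparison produces one bool per element
mask = arr == 6
print(mask)

# Step 2: count_nonzero counts the True entries
count = np.count_nonzero(mask)
print('Total occurrences of 6:', count)
```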

Use sum()

In this, we are using the sum() function to count occurrences of a value in an array.

import numpy as np
arr = np.array([9, 6, 7, 5, 6, 4, 5, 6, 5, 4, 7, 8, 6, 6, 7])
print('Numpy Array:')
print(arr)
# Count occurrence of element '6' in numpy array
count = (arr == 6).sum()

print('Total occurences of "6" in array: ', count)

Output:

Numpy Array:
[9 6 7 5 6 4 5 6 5 4 7 8 6 6 7]
Total occurences of "6" in array: 5

In the above example, each True value in the boolean array counts as 1 and each False as 0, so summing the array gives the number of elements that are equal to the value.

Use bincount()

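We use the bincount() function to count occurrences of a value in an array. It works only on arrays of non-negative integers and returns an array in which the element at index i is the number of times i occurs.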

import numpy as np
arr = np.array([9, 6, 7, 5, 6, 4, 5, 6, 5, 4, 7, 8, 6, 6, 7])
count_arr = np.bincount(arr)
# Count occurrence of element '6' in numpy array
print('Total occurences of "6" in array: ', count_arr[6])
# Count occurrence of element '5' in numpy array
print('Total occurences of "5" in array: ', count_arr[5])

Output:

Total occurences of "6" in array: 5
Total occurences of "5" in array: 3
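bincount() counts every integer from 0 up to the largest value, including those that never occur. When the counts of only the values that are actually present are wanted, numpy.unique() with return_counts=True is another option. A minimal sketch on the same array, where the `frequency` dictionary is just an illustrative way to hold the result:

```python
import numpy as np

arr = np.array([9, 6, 7, 5, 6, 4, 5, 6, 5, 4, 7, 8, 6, 6, 7])

# bincount: position i holds the number of times i occurs
counts_by_index = np.bincount(arr)
print(counts_by_index)

# unique: only the values present, with their counts
values, counts = np.unique(arr, return_counts=True)
frequency = dict(zip(values.tolist(), counts.tolist()))
print(frequency)
```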

Convert numpy array to list and count occurrences of a value in an array

In this method, we first convert the array to a list and then apply the count() function on the list to get the count of occurrences of an element.

import numpy as np
arr = np.array([9, 6, 7, 5, 6, 4, 5, 6, 5, 4, 7, 8, 6, 6, 7])
# Count occurrence of element '6' in numpy array
count = arr.tolist().count(6)
print('Total occurences of "6" in array: ', count)

Output:

Total occurences of "6" in array: 5

Select elements from the array that matches the value and count them

We can select only those elements of the numpy array that are equal to a given value and then take the length of the resulting array. That length is the count of occurrences of the value in the original array. For instance,

import numpy as np
arr = np.array([9, 6, 7, 5, 6, 4, 5, 6, 5, 4, 7, 8, 6, 6, 7])
# Select only the elements equal to 6 and count them
count = arr[arr == 6].shape[0]
print('Total occurences of "6" in array: ', count)

Output:

Total occurences of "6" in array: 5

Count occurrences of a value in 2D NumPy Array

We can use the count_nonzero() function to count the occurrences of a value in a 2D array or matrix.

import numpy as np
# Create a 2D Numpy Array from list of lists
matrix = np.array( [[2, 3, 4],
                    [5, 3, 4],
                    [5, 6, 5],
                    [6, 7, 6],
                    [3, 6, 2]] )
# Count occurrence of element '6' in complete 2D Numpy Array
count = np.count_nonzero(matrix == 6)
print('Total occurrences of "6" in 2D array:')
print(count)

Output:

Total occurrences of "6" in 2D array:
4

Count occurrences of a value in each row of 2D NumPy Array

To count the occurrences of a value in each row of the 2D  array we will pass the axis value as 1 in the count_nonzero() function.

It will return an array containing the count of occurrences of a value in each row.

import numpy as np
# Create a 2D Numpy Array from list of lists
matrix = np.array( [[2, 3, 4],
                    [5, 3, 4],
                    [6, 6, 5],
                    [4, 7, 6],
                    [3, 6, 2]] )
# Count occurrence of element '6' in each row
count = np.count_nonzero(matrix == 6, axis=1)
print('Total occurrences  of "6" in each row of 2D array: ', count)

Output:

Total occurrences of "6" in each row of 2D array: [0 0 2 1 1]

Count occurrences of a value in each column of 2D NumPy Array

In order to count the occurrences of a value in each column of the 2D NumPy array, pass the axis value as 0 to the count_nonzero() function. It will return an array containing the count of occurrences of the value in each column. For instance,

import numpy as np
# Create a 2D Numpy Array from list of lists
matrix = np.array( [[2, 3, 4],
                    [5, 3, 4],
                    [5, 3, 5],
                    [4, 7, 8],
                    [3, 6, 2]] )
# Count occurrence of element '3' in each column
count = np.count_nonzero(matrix == 3, axis=0)
print('Total occurrences  of "3" in each column of 2D array: ', count)

Output:

Total occurrences of "3" in each column of 2D array: [1 3 0]
How to sort a Numpy Array in Python

How to sort a Numpy Array in Python? | numpy.sort() in Python | Python numpy.ndarray.sort() Function

In this article, we are going to show you how to sort a Numpy array in ascending order (smallest to largest value) and in descending order (largest to smallest value) using numpy.sort() and numpy.ndarray.sort().

Sorting Arrays

Arranging the elements of an array in an ordered sequence is called sorting. An ordered sequence can be any sequence like numeric or alphabetical, ascending or descending.

Python’s Numpy module provides two different methods to sort a numpy array.

numpy.ndarray.sort() method in Python

A member function of the ndarray class is as follows:

ndarray.sort(axis=-1, kind='quicksort', order=None)

It is called on a numpy array object (ndarray) and sorts that array in place.

Python numpy.sort() method

One more method is a global function in the numpy module i.e.

numpy.sort(array, axis=-1, kind='quicksort', order=None)

It accepts a numpy array as an argument and returns a sorted copy of the Numpy array.

Where,

  • a (array in the syntax above): Array to be sorted.
  • axis: The axis along which the array is to be sorted. The default, -1, sorts along the last axis. If None, the array is flattened before sorting.
  • kind: The sorting algorithm to use. Default is quicksort.
  • order: If the array contains fields, the order in which the fields are compared.

Sort a Numpy Array in Place

The NumPy ndarray object has a function called sort() that we will use for sorting.

import numpy as np
# Create a Numpy array from list of numbers
arr = np.array([9, 1, 3, 2, 17, 5,  2, 8, 14])
array = np.array([9, 1, 3, 2, 17, 5,  2, 8, 14])
# Sort the numpy array inplace
array.sort()
print('Original Array : ', arr)
print('Sorted Array : ', array)

Output:

Original Array : [ 9 1 3 2 17 5 2 8 14]
Sorted Array : [ 1 2 2 3 5 8 9 14 17]

So in the above example, you have seen that using array.sort() we have sorted our array in place.
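The difference between the two methods described above, a sorted copy versus sorting in place, can be seen side by side in a short sketch. The four-element array is made up for illustration.

```python
import numpy as np

arr = np.array([9, 1, 3, 2])

# numpy.sort() returns a sorted copy and leaves arr untouched
sorted_copy = np.sort(arr)
unchanged = arr.tolist()

# ndarray.sort() sorts arr itself and returns None
result = arr.sort()

print('Sorted copy      :', sorted_copy)
print('arr after np.sort:', unchanged)
print('Return of sort() :', result)
print('arr after sort() :', arr)
```

Writing arr = arr.sort() is therefore a mistake: it replaces the array with None.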

Sort a Numpy array in Descending Order

In the above example, we have seen that by default both numpy.sort() and ndarray.sort() sort the numpy array in ascending order. Neither function has an option for descending order, so to sort in descending order we sort in ascending order and then reverse the result.

import numpy as np
# Create a Numpy array from list of numbers
arr = np.array([9, 1, 3, 2, 17, 5,  2, 8, 14])
array = np.array([9, 1, 3, 2, 17, 5,  2, 8, 14])
# Get a sorted copy of numpy array (Descending Order)
array = np.sort(arr)[::-1]

print('Original Array : ', arr)
print('Sorted Array : ', array)

Output:

Original Array : [ 9 1 3 2 17 5 2 8 14]
Sorted Array : [17 14 9 8 5 3 2 2 1]

Sorting a numpy array with different kinds of sorting algorithms

The sort() function accepts a parameter 'kind' that specifies the sorting algorithm to be used. If it is not provided, the default value is quicksort. To sort a numpy array with another sorting algorithm, pass it as the 'kind' argument.

import numpy as np
# Create a Numpy array from list of numbers
arr = np.array([9, 1, 3, 2, 17, 5,  2, 8, 14])
array = np.array([9, 1, 3, 2, 17, 5,  2, 8, 14])
# Sort the numpy array using different algorithms

#Sort Using 'mergesort'
sortedArr1 = np.sort(arr, kind='mergesort')

# Sort Using 'heapsort'
sortedArr2 = np.sort(arr, kind='heapsort')

# Sort Using 'stable'
sortedArr3 = np.sort(arr, kind='stable')

# Print the original array and the sorted copies
print('Original Array : ', arr)

print('Sorted Array using mergesort: ', sortedArr1)

print('Sorted Array using heapsort : ', sortedArr2)

print('Sorted Array using stable : ', sortedArr3)

Output:

Original Array : [ 9 1 3 2 17 5 2 8 14]
Sorted Array using mergesort: [ 1 2 2 3 5 8 9 14 17]
Sorted Array using heapsort : [ 1 2 2 3 5 8 9 14 17]
Sorted Array using stable : [ 1 2 2 3 5 8 9 14 17]
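All three algorithms give the same sorted values, so the output above cannot show how they differ. One practical difference is stability: a stable sort keeps equal elements in their original relative order. This shows up when sorting indices with argsort(). A small sketch with a made-up array that contains ties:

```python
import numpy as np

# Illustrative array with repeated values
scores = np.array([2, 1, 2, 1])

# With kind='stable', tied values keep their original order:
# the 1 at index 1 comes before the 1 at index 3, and
# the 2 at index 0 comes before the 2 at index 2
stable_order = np.argsort(scores, kind='stable')
print(stable_order)
print(scores[stable_order])
```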

Sorting a 2D numpy array along with the axis

numpy.sort() and numpy.ndarray.sort() provide an argument axis to sort the elements along the given axis.
Let’s create a 2D Numpy Array.

import numpy as np
# Create a 2D Numpy array list of list

arr2D = np.array([[8, 7, 1, 2], [3, 2, 3, 1], [29, 32, 11, 9]])

print(arr2D)

Output:

[[ 8 7 1 2]
[ 3 2 3 1]
[29 32 11 9]]

Now sort contents of each column in 2D numpy Array,

import numpy as np
# Create a 2D Numpy array list of list

arr2D = np.array([[8, 7, 1, 2], [3, 2, 3, 1], [29, 32, 11, 9]])
arr2D.sort(axis=0)
print('Sorted Array : ')
print(arr2D)

Output:

Sorted Array :
[[ 3 2 1 1]
[ 8 7 3 2]
[29 32 11 9]]

So here is our sorted 2D numpy array.
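For comparison, here is a short sketch of the other two axis settings on the same array: axis=1 sorts the contents of each row, and axis=None flattens the array before sorting. It uses numpy.sort(), so the original array is left unchanged.

```python
import numpy as np

arr2D = np.array([[8, 7, 1, 2], [3, 2, 3, 1], [29, 32, 11, 9]])

# axis=1: each row is sorted independently
rows_sorted = np.sort(arr2D, axis=1)
print(rows_sorted)

# axis=None: the array is flattened, then sorted
flat_sorted = np.sort(arr2D, axis=None)
print(flat_sorted)
```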

Python Convert Matrix or 2D Numpy Array to a 1D Numpy Array

Python: Convert Matrix / 2D Numpy Array to a 1D Numpy Array | How to make a 2d Array into a 1d Array in Python?

This article is all about converting a 2D Numpy Array to a 1D Numpy Array. Changing a 2D NumPy array into a 1D array returns an array containing the same elements as the original, but with only one dimension. Want to learn how to convert a 2D array into a 1D array using Python? Then follow the methods explained below.

Convert 2D Numpy array / Matrix to a 1D Numpy array using flatten()

Numpy arrays provide a member function flatten() to convert an array of any shape to a flat 1D array.

Firstly, it is required to import the numpy module,

import numpy as np

Syntax:

ndarray.flatten(order='C')
ndarray.flatten(order='F')
ndarray.flatten(order='A')

order: The order in which items from the array will be read

order='C': It will read items from the array row-wise

order='F': It will read items from the array column-wise

order='A': It will read items column-wise if the array is Fortran-contiguous in memory, and row-wise otherwise

Suppose we have a 2D Numpy array or matrix,

[7 4 2]
[5 3 6]
[2 9 5]

Which we have to convert in a 1D array. Let’s use this to convert a 2D numpy array or matrix to a new flat 1D numpy array,

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])
# get a flatten 1D copy of 2D Numpy array
flat_array = arr.flatten()
print('1D Numpy Array:')
print(flat_array)

Output:

1D Numpy Array:
[7 4 2 5 3 6 2 9 5]

If we make any changes to our 1D array, it will not affect our original 2D array, because flatten() returns a copy.

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])
# get a flatten 1D copy of 2D Numpy array
flat_array = arr.flatten()
print('1D Numpy Array:')
print(flat_array)
# Modify the flat 1D array
flat_array[0] = 50
print('Modified Flat Array: ')
print(flat_array)
print('Original Input Array: ')
print(arr)

Output:

1D Numpy Array:
[7 4 2 5 3 6 2 9 5]

Modified Flat Array:
[50 4 2 5 3 6 2 9 5]

Original Input Array:
[[7 4 2]
[5 3 6]
[2 9 5]]
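The examples above use the default order. To see the effect of the order parameter, here is a short sketch that flattens the same array row-wise ('C') and column-wise ('F'):

```python
import numpy as np

arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

# Row-wise: first row, then second row, then third row
row_wise = arr.flatten(order='C')
# Column-wise: first column, then second column, then third column
column_wise = arr.flatten(order='F')

print('order=C:', row_wise)
print('order=F:', column_wise)
```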

Convert 2D Numpy array to 1D Numpy array using numpy.ravel()

Numpy has a built-in function numpy.ravel() that accepts an array as a parameter and returns a flattened 1D array. Where possible, the result is a view of the input array rather than a copy.

Syntax:

numpy.ravel(input_arr, order='C')

Let’s make use of this syntax to convert 2D array to 1D array,

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])
# Get a flattened view of 2D Numpy array
flat_array = np.ravel(arr)
print('Flattened 1D Numpy array:')
print(flat_array)

Output:

Flattened 1D Numpy array:
[7 4 2 5 3 6 2 9 5]

Because numpy.ravel() returns a view when possible, any changes we make to the 1D array will also affect the original 2D array.

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])
# Get a flattened view of 2D Numpy array
flat_array = np.ravel(arr)
print('Flattened 1D Numpy array:')
print(flat_array)
# Modify the 2nd element in the flat array
flat_array[1] = 12
# Changes will be reflected in both flat array and original 2D array
print('Modified Flattened 1D Numpy array:')
print(flat_array)
print('2D Numpy Array:')
print(arr)

Output:

Flattened 1D Numpy array:
[7 4 2 5 3 6 2 9 5]
Modified Flattened 1D Numpy array:
[ 7 12  2  5  3  6  2  9  5]
2D Numpy Array:
[[ 7 12  2]
 [ 5  3  6]
 [ 2  9  5]]
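numpy.ravel() only returns a view when the memory layout allows it. The sketch below, using a transposed array as an assumed example of a layout that cannot be flattened row-wise without copying, shows both cases.

```python
import numpy as np

arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

# Contiguous input: ravel() can return a view onto the same memory
flat_view = np.ravel(arr)
view_shares = np.shares_memory(arr, flat_view)
print(view_shares)   # True

# arr.T is a view with swapped strides; reading it row-wise
# cannot be expressed as a view, so ravel() falls back to a copy
flat_from_t = np.ravel(arr.T)
copy_shares = np.shares_memory(arr, flat_from_t)
print(copy_shares)   # False
```

If you need a guaranteed copy, flatten() is the safer choice; if you need to avoid copying where possible, ravel() is usually preferred.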

Convert a 2D Numpy array to a 1D array using numpy.reshape()

NumPy provides a built-in function reshape() that changes the shape of a numpy array.

It accepts three arguments:

  • a: The array to be reshaped
  • newshape: The new shape, given as a tuple or an integer
  • order: The order in which items from the input array will be read

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

# convert 2D array to a 1D array of size 9
flat_arr = np.reshape(arr, 9)
print('1D Numpy Array:')
print(flat_arr)

Output:

1D Numpy Array:
[7 4 2 5 3 6 2 9 5]

In the above example, we passed 9 as the argument because the 2D input array has a total of 9 elements (3x3).
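The size passed to reshape() must match the number of elements exactly. As a quick sketch of what happens otherwise (the variable error_message is just a name chosen for this example):

```python
import numpy as np

arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

error_message = None
try:
    # 8 does not match the 9 elements in arr, so NumPy raises ValueError
    np.reshape(arr, 8)
except ValueError as err:
    error_message = str(err)

print(error_message)

# arr.size always gives the correct total number of elements
flat_arr = np.reshape(arr, arr.size)
print(flat_arr)   # [7 4 2 5 3 6 2 9 5]
```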

numpy.reshape() and -1 size

Passing the exact size is impractical when the input array is large and multidimensional, or when we simply don’t know the total number of elements in the array. In such scenarios, we can pass the size as -1 and NumPy will work out the length from the array itself.

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

# convert 2D array to a 1D array without mentioning the actual size
flat_arr = np.reshape(arr, -1)
print('1D Numpy Array:')
print(flat_arr)

Output:

1D Numpy Array:
[7 4 2 5 3 6 2 9 5]
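The same -1 shorthand works for arrays with more than two dimensions, and it can also stand in for a single unknown dimension inside a tuple. A short sketch, using a 3D array built with np.arange() purely for illustration:

```python
import numpy as np

# A 3D array with 2 * 3 * 4 = 24 elements
cube = np.arange(24).reshape(2, 3, 4)

# -1 lets NumPy infer the length of the flat array
flat = np.reshape(cube, -1)
print(flat.shape)   # (24,)

# -1 can also replace one dimension of a larger shape
two_rows = np.reshape(cube, (2, -1))
print(two_rows.shape)   # (2, 12)
```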

numpy.reshape() returns a new view object if possible

Whenever possible, reshape() returns a view of the input array instead of a copy. The view and the original share the same data, so a modification made through either one is reflected in the other.

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])
flat_arr = np.reshape(arr,-1)
print('1D Numpy Array:')
print(flat_arr)
# Modify the element at the first row and first column of the original 2D array
arr[0][0] = 11
print('1D Numpy Array:')
print(flat_arr)
print('2D Numpy Array:')
print(arr)

Output:

1D Numpy Array:
[7 4 2 5 3 6 2 9 5]
1D Numpy Array:
[11  4  2  5  3  6  2  9  5]
2D Numpy Array:
[[11  4  2]
 [ 5  3  6]
 [ 2  9  5]]
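The "if possible" part matters: when the input is not laid out contiguously in memory, reshape() has to copy. The sketch below uses a column slice as an assumed example of such an input.

```python
import numpy as np

arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

# Contiguous input: reshape() returns a view
flat_view = np.reshape(arr, -1)
view_shares = np.shares_memory(arr, flat_view)
print(view_shares)   # True

# The first two columns skip an element in every row, so they
# cannot be described as one flat run of memory: reshape() copies
first_two_cols = arr[:, :2]
flat_cols = np.reshape(first_two_cols, -1)
copy_shares = np.shares_memory(arr, flat_cols)
print(copy_shares)   # False

# Changes to the copy do not reach the original array
flat_cols[0] = 99
print(arr[0, 0])   # 7
```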

Convert 2D Numpy array to 1D array but Column Wise

If we pass the order parameter of the reshape() function as “F”, it will read the 2D input array column-wise, as shown below:

import numpy as np
# Create a 2D numpy array from list of lists
arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])
# Read 2D array column by column and create 1D array from it
flat_arr = np.reshape(arr, -1, order='F')
print('1D Numpy Array:')
print(flat_arr)

Output:

1D Numpy Array:
[7 5 2 4 3 9 2 6 5]
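The order argument behaves the same way across all three approaches covered here, so the column-wise result can be obtained with any of them. A small sketch comparing the results (variable names are illustrative):

```python
import numpy as np

arr = np.array([[7, 4, 2],
                [5, 3, 6],
                [2, 9, 5]])

via_reshape = np.reshape(arr, -1, order='F')
via_flatten = arr.flatten(order='F')
via_ravel = np.ravel(arr, order='F')

# All three read the array column by column
same_result = (np.array_equal(via_reshape, via_flatten)
               and np.array_equal(via_flatten, via_ravel))
print(same_result)   # True
print(via_ravel)     # [7 5 2 4 3 9 2 6 5]
```

The practical difference between them is whether you get a copy (always, with flatten()) or a view where the memory layout allows it (ravel() and reshape()).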