Author name: Satyabrata Jena

R: Set working directory

How to set the working directory in RStudio ?

In this article we discuss how to set the working directory in R and how to verify that it has been set correctly. So, let’s start exploring the topic.

R is an interpreted programming language created by Ross Ihaka and Robert Gentleman. The language and its open source environment are used for statistical computing, graphics, data analytics and scientific research.

RStudio is an Integrated Development Environment (IDE) that provides free and open source tools for the R language.

When working with R we often need to read external files. Relative file paths are resolved against the working directory, so we either set the folder that contains the files as the working directory or refer to the files by their full path.

So, let’s see how we can set a directory as the working directory in RStudio.

Setting up a Working Directory in RStudio :

Steps to set the working directory in RStudio.

  1. Go to the Session menu.
  2. Select Set Working Directory.
  3. Then select Choose Directory.
  4. From there browse to the directory.

Set up Working Directory in R using setwd() function :

By using setwd() function we can set the working directory.

Syntax : setwd("D:/R_workingfile")

For example, if we have a directory /Users/BtechGeeks and want to set it as the working directory, we can do it like this:

# command to set working directory.
setwd("/Users/BtechGeeks")

If the path does not exist :

If we try to set a directory that does not exist as the working directory, R raises an error.

For example, suppose the directory /Users/Btech does not exist but we try to make it the working directory.

Then

# Directory does not exist
setwd("/Users/Btech")
# Raises an error because the directory does not exist
Error in setwd("/Users/Btech") : cannot change working directory

Verifying the current working directory is set in R :

We can use getwd() function to check the current working directory.

If it returns the expected directory, the working directory has been set correctly.

# It will give the current working directory
getwd()
[1] "/Users/BtechGeeks"


Python: Read CSV into a list of lists or tuples or dictionaries | Import csv to list

Read CSV into a list of lists or tuples or dictionaries | Import csv to list in Python.

In this article, we will demonstrate how we can import a CSV into a list, list of lists or a list of tuples in python. We will be using pandas module for importing CSV contents to the list without headers.

Example Dataset :

CSV File name – data.csv

Id,Name,Course,City,Session
21,Jill,DSA,Texas,Night
22,Rachel,DSA,Tokyo,Day
23,Kirti,ML,Paris,Day
32,Veena,DSA,New York,Night

Read a CSV into list of lists in python :

1. Importing csv to a list of lists using csv.reader :

csv.reader is a function from Python's built-in csv module that reads a CSV file row by row. Passing the reader object to list() returns a list of lists, one inner list per row.

Let’s see the implementation of it.

#Program :

from csv import reader

#Opening the csv file as a list of lists in read mode
with open('data.csv', 'r') as csvObj:
 #The object having the file is passed into the reader
 csv_reader = reader(csvObj)
 #The reader object is passed into the list( ) to generate a list of lists
 rowList = list(csv_reader)
 print(rowList)
Output :
[['Id', 'Name', 'Course', 'City', 'Session'], 
['21', 'Jill', 'DSA', 'Texas', 'Night'], 
['22', 'Rachel', 'DSA', 'Tokyo', 'Day'], 
['23', 'Kirti', 'ML', 'Paris', 'Day'], 
['32', 'Veena', 'DSA', 'New York', 'Night']]

2. Selecting specific value in csv by specific row and column number :

With Pandas we can read the CSV into a dataframe, which treats the first line as the header, and build a list of lists from the data rows only. Individual values can then be picked from that list by row and column position.

Let’s see the implementation of it.

#Program :

import pandas as pd

# Create a dataframe from the csv file
dfObj = pd.read_csv('data.csv', delimiter=',')
# Use list comprehension 
# for creating a list of lists from Dataframe rows
rowList = [list(row) for row in dfObj.values]
# Print the list of lists i.e. only rows without the header
print(rowList)
Output :
[[21, 'Jill', 'DSA', 'Texas', 'Night'], 
[22, 'Rachel', 'DSA', 'Tokyo', 'Day'], 
[23, 'Kirti', 'ML', 'Paris', 'Day'], 
[32, 'Veena', 'DSA', 'New York', 'Night']]
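To pick a single value by row and column number, we can index into the list of lists. A minimal sketch, assuming the same contents as data.csv (inlined through io.StringIO so the snippet runs on its own) and 0-based positions where row 0 is the header; the names csv_text, row_number and column_number are only illustrative:

```python
import csv
import io

# Same contents as data.csv, inlined so the example is self-contained
csv_text = """Id,Name,Course,City,Session
21,Jill,DSA,Texas,Night
22,Rachel,DSA,Tokyo,Day
23,Kirti,ML,Paris,Day
32,Veena,DSA,New York,Night
"""

# Read every row (header included) into a list of lists
rows = list(csv.reader(io.StringIO(csv_text)))

row_number = 3      # 0-based; row 0 is the header
column_number = 1   # 0-based; column 1 is 'Name'

# Select the value at the given row and column
value = rows[row_number][column_number]
print(value)  # Kirti
```

Note that csv.reader returns every field as a string, so numeric columns such as Id need an explicit int() conversion if numbers are required.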

3. Using Pandas to read csv into a list of lists with header :

To include the header row, we can first read the other rows like the previous example and then add the header to the list.

Let’s see the implementation of it.

#Program :

import pandas as pd

# Create a dataframe from the csv file
dfObj = pd.read_csv('data.csv', delimiter=',')
# Use list comprehension 
# for creating a list of lists from Dataframe rows
rowList = [list(row) for row in dfObj.values]
#Adding the header
rowList.insert(0, dfObj.columns.to_list())
# Print the list of lists with the header
print(rowList)
Output :
[['Id', 'Name', 'Course', 'City', 'Session'], 
[21, 'Jill', 'DSA', 'Texas', 'Night'], 
[22, 'Rachel', 'DSA', 'Tokyo', 'Day'], 
[23, 'Kirti', 'ML', 'Paris', 'Day'], 
[32, 'Veena', 'DSA', 'New York', 'Night']]

Reading csv into list of tuples using Python :

Let’s add the contents of CSV file as a list of tuples. Each tuple will be representing a row and each value in the tuple represents a column value. Just like the way we added the contents into a list of lists from CSV, we will read the CSV file and then pass it into list function to create a list of tuples. The only difference here is the map( ) function that accepts function and input list arguments.

Let’s see the implementation of it.

#Program :

from csv import reader
# open file in read mode
with open('data.csv', 'r') as readerObj:
    # here passing the file object to reader() to get the reader object
    csv_reader = reader(readerObj)
    #Read all CSV rows into tuples
    tuplesList = list(map(tuple, csv_reader))
    # display the list of tuples
    print(tuplesList)
Output :

[('Id', 'Name', 'Course', 'City', 'Session'), ('21', 'Jill', 'DSA', 'Texas', 'Night'), ('22', 'Rachel', 'DSA', 'Tokyo', 'Day'), ('23', 'Kirti', 'ML', 'Paris', 'Day'), ('32', 'Veena', 'DSA', 'New York', 'Night')]

Reading csv into list of tuples using pandas & list comprehension :

We can load the contents of a CSV file into a dataframe by using read_csv( ) . Then using list comprehension we can convert the 2D numpy array into a list of tuples.

Let’s see the implementation of it.

#Program :

import pandas as pd
# Create a dataframe object from the csv file
dfObj = pd.read_csv('data.csv', delimiter=',')
# Create a list of tuples for Dataframe rows using list comprehension
tuplesList = [tuple(row) for row in dfObj.values]
# Print the list of tuple
print(tuplesList)
Output :
[(21, 'Jill', 'DSA', 'Texas', 'Night'), (22, 'Rachel', 'DSA', 'Tokyo', 'Day'), (23, 'Kirti', 'ML', 'Paris', 'Day'), (32, 'Veena', 'DSA', 'New York', 'Night')]

Reading csv into list of dictionaries using python :

We can also read the contents of a CSV file into a list of dictionaries, where each dictionary represents one row and is keyed by the column names from the header. The file is opened in read mode and the file object is passed to DictReader( ); the resulting reader object is then passed to list( ). The output below comes from an older Python version: since Python 3.8 each row is a plain dict rather than an OrderedDict.

Let’s see the implementation of it.

#Program :

from csv import DictReader
# open file in read mode
with open('data.csv', 'r') as readerObj:
    # pass the reader file object to DictReader() to get the DictReader object
    dict_reader = DictReader(readerObj)
    # get a list of dictionaries from dict_reader
    dictList = list(dict_reader)
    # print the list of dict
    print(dictList)
Output :

[OrderedDict([('Id', '21'), ('Name', 'Jill'), ('Course', 'DSA'), ('City', 'Texas'), ('Session', 'Night')]), OrderedDict([('Id', '22'), ('Name', 'Rachel'), ('Course', 'DSA'), ('City', 'Tokyo'), ('Session', 'Day')]), OrderedDict([('Id', '23'), ('Name', 'Kirti'), ('Course', 'ML'), ('City', 'Paris'), ('Session', 'Day')]), OrderedDict([('Id', '32'), ('Name', 'Veena'), ('Course', 'DSA'), ('City', 'New York'), ('Session', 'Night')])]


Python: Open a file using “open with” statement and benefits explained with examples

Opening a file using ‘open with’ statement and benefits in Python.

In this article we will discuss how to open a file using the ‘open with’ statement, how to open multiple files in a single ‘open with’ statement, and finally its benefits. So, let’s start the topic.

The need for “open with” statement :

To understand the “open with” statement we first look at the usual way of opening a file in Python, using the built-in open( ) function.

File.txt-

New File Being Read.
DONE!!
#program :

# opened a file 
fileObj = open('file.txt')
# Reading the file content into a placeholder
data = fileObj.read()
# print file content
print(data)
#close the file
fileObj.close()
Output :
New File Being Read.
DONE!!

In case the file does not exist it will throw a FileNotFoundError .

How to open a file using “open with” statement in python :

#Program :

# opened a file using open-with
with open('file.txt', "r") as fileObj:
    # Reading the file content into a placeholder
    data = fileObj.read()
    # print file content
    print(data)
# Check if file is closed
if fileObj.closed == False:
    print('File is not closed')
else:
    print('File is already closed')
Output :
New File Being Read.
DONE!!
File is already closed

The with statement creates an execution block and, when control leaves that block, calls the file object's cleanup method, which closes the file automatically. We never called close() ourselves, yet the file is closed after the block. This releases the file handle even if we forget to close the file.
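What the with statement does for a file can be approximated with try/finally. The following is only a rough sketch of the idea, not the exact mechanism Python uses internally; it writes a temporary file first (an assumption made here so the snippet does not depend on file.txt being present):

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on file.txt
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as f:
    f.write("New File Being Read.\nDONE!!\n")

# Roughly what "with open(path) as fileObj:" does for us
fileObj = open(path, "r")
try:
    data = fileObj.read()
finally:
    # Runs whether or not the block raised, like leaving a with block
    fileObj.close()

print(data)
print(fileObj.closed)  # True
```

The with form is shorter and harder to get wrong, which is why it is generally preferred over the manual version.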

Benefits of calling open() using “with statement” :

  • Fewer chances of bug due to coding error

With the “with” statement we don’t have to close the opened file manually. The file is closed automatically when control leaves the block. This reduces the chance of bugs, shortens the code and releases the file handle promptly.

  • Excellent handling in case of exception

If we have used “open-with” statement to open a file, and an exception occurs inside the with block, the file will be closed and the control moves to the except block.

# Python :

# Before handling the exception file will be closed 
try:
    # using "with statement" with open() function
    with open('file.txt', "r") as fileObj:
        # reading the file content
        data = fileObj.read()
        # Division by zero error
        x = 1 / 0
        print(data)
except ZeroDivisionError:
    # handling the exception caused above
    print('Error occurred')
    if fileObj.closed == False:
        print('File is not closed')
    else:
        print('File is closed')
Output :
Error occurred
File is closed
  • Open multiple files in a single “with statement” :

We can use open with statement to open multiple files at the same time. Let’s try reading from one file and writing into another-

# Program :

# Read from file.txt and write in output.txt
with open('output.txt', 'w') as fileObj2, open('file.txt', 'r') as fileObj1:
    data = fileObj1.read()
    fileObj2.write(data)
    # Both the files are automatically close when the control moves out of the with block.

This will generate an “output.txt” file that has the same contents as “file.txt”.

Output : 
output.txt- 
New File Being Read. 
DONE!!

The files will automatically close when the control moves outside the with block.


Solved- TypeError: dict_keys object does not support indexing

Getting and resolving ‘TypeError: dict_keys object does not support indexing in Python’.

In this article we will discuss about

  • Reason of getting ‘TypeError: ‘dict_keys’ object does not support indexing’
  • Resolving the type error.

So let’s start exploring the topic.

To fetch the keys, values or key-value pairs from a dictionary in Python we use the methods keys(), values() and items(). They return view objects, which give a dynamic view of the dictionary entries.

The important point is that when the dictionary changes, these views reflect the changes, and we can also iterate over them. But view objects do not support indexing, so using an index on them raises a TypeError.
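A short sketch of this behaviour, using a smaller dictionary than the examples in this article (the variable names are illustrative): the view picks up a key added after the view was created, while indexing it fails.

```python
word_freq = {'Aa': 56, 'Bb': 23}

# keys() returns a view, not a copy
keys = word_freq.keys()
print(list(keys))   # ['Aa', 'Bb']

# The view reflects later changes to the dictionary
word_freq['Cc'] = 43
print(list(keys))   # ['Aa', 'Bb', 'Cc']

# Indexing the view raises TypeError
try:
    keys[0]
except TypeError as err:
    message = str(err)
    print(message)
```

The exact wording of the message depends on the Python version, so the sketch prints it rather than assuming it.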

Getting TypeError :

#Program :

# Dictionary created of string and int
word_freq = {
    'Aa' : 56,
    "Bb"    : 23,
    'Cc'  : 43,
    'Dd'  : 78,
    'Ee'   : 11
}

# Here, fetching a view object 
# by pointing to all keys of dictionary
keys = word_freq.keys()
print('dict_keys view object:')
print(keys)
print('Try to perform indexing:')

# Here, trying to perform indexing on the key's view object 
# Which will cause error
first_key = keys[0]
print('First Key: ', first_key)
Output :

dict_keys view object:
dict_keys(['Aa', 'Bb', 'Cc', 'Dd', 'Ee'])
Try to perform indexing:
Traceback (most recent call last):
File "temp.py", line 18, in <module>
first_key = keys[0]
TypeError: 'dict_keys' object does not support indexing

In the above example we got a TypeError because we tried to select the value at index 0 from the dict_keys object, which is a view object and does not support indexing. Newer Python versions report the same problem as 'dict_keys' object is not subscriptable.

Resolving TypeError :

The solution to TypeError: dict_keys object does not support indexing is simple. We convert the dict_keys view object into a list and then index into that list. In other words, we cast the dict_keys object to a list object and then select the element at any index position.

#Program :

# Dictionary created
word_freq = {
    'Aa' : 10,
    "Bb" : 20,
    'Cc' : 30,
    'Dd' : 40,
    'Ee' : 50
}
# Here, converting the view object of all keys
# of the dictionary into a list
keys = list(word_freq.keys())
print('List of Keys:')
print(keys)

# Selecting 1st element from keys list
first_key = keys[0]
print('First Key: ', first_key)
Output :
List of Keys:
['Aa', 'Bb', 'Cc', 'Dd', 'Ee']
First Key:  Aa

In this example we converted all the keys of the dictionary to a list and then selected the element at index position 0, which is the first key of the dictionary.
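If only one key is needed, building a full list is not strictly necessary. As an alternative sketch (not part of the approach above), next(iter(...)) returns the first key and itertools.islice can reach a later position; this relies on dictionaries keeping insertion order, which Python guarantees from version 3.7:

```python
from itertools import islice

word_freq = {'Aa': 10, 'Bb': 20, 'Cc': 30, 'Dd': 40, 'Ee': 50}

# First key without creating a list of all keys
first_key = next(iter(word_freq))
print('First Key: ', first_key)   # First Key:  Aa

# Key at index position 2: skip two keys, then take the next one
third_key = next(islice(word_freq, 2, None))
print('Third Key: ', third_key)   # Third Key:  Cc
```

For repeated lookups by position, converting to a list once is usually the simpler choice.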


Python Pandas : How to drop rows in DataFrame by index labels

How to drop rows in DataFrame by index labels in Python ?

In this article we are going to learn how to delete single or multiple rows from a Dataframe.

For this we are going to use the drop( ) function.

Syntax - DataFrame.drop( labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise' )

Here, the function accepts a single label or a list of labels in labels and deletes the rows or columns they refer to. The axis parameter switches between rows and columns: 0 means rows and 1 means columns (the default is 0).

By default drop( ) returns the result as a new Dataframe object and leaves the original unchanged. We have to pass inplace = True if we want the change applied to our dataframe object itself. To keep the examples short we will use inplace = True in all our programs.
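The difference between the default behaviour and inplace = True can be seen in a small sketch. It assumes pandas is installed and uses a reduced three-row frame rather than the dataset of this article; the names df, trimmed and same are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jill', 'Phoebe', 'Kirti']}, index=['a', 'b', 'c'])

# Without inplace=True, drop() returns a new DataFrame and leaves df alone
trimmed = df.drop('b')
print(list(df.index))       # ['a', 'b', 'c']
print(list(trimmed.index))  # ['a', 'c']

# errors='ignore' skips labels that are not present instead of raising KeyError
same = df.drop('z', errors='ignore')
print(list(same.index))     # ['a', 'b', 'c']
```

Assigning the result back, as in df = df.drop('b'), is an equivalent alternative to inplace = True.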

We are going to use the following dataset as example

      Name  Age  Location  Country
a     Jill   16     Tokyo    Japan
b   Phoebe   38  New York      USA
c    Kirti   39  New York      USA
d    Veena   40     Delhi    India
e     John   54    Mumbai    India
f  Michael   21     Tokyo    Japan

Deleting a single Row in DataFrame by Row Index Label :

To delete a single row by the label we can just pass the label into the function.

Here let’s try to delete the 'b' row.

#program :

import numpy as np
import pandas as pd

#Example data
students = [('Jill',   16,  'Tokyo',     'Japan'),
('Phoebe', 38,  'New York',  'USA'),
('Kirti',  39,  'New York',  'USA'),
('Veena',  40,  'Delhi',     'India'),
('John',   54,  'Mumbai',    'India'),
("Michael",21,  'Tokyo',     'Japan')]

#Creating an object of dataframe class
dfObj = pd.DataFrame(students, columns = ['Name' , 'Age', 'Location' , 'Country'], index=['a', 'b', 'c' , 'd' , 'e' , 'f'])
#Deleting 'b' row
dfObj.drop('b',inplace=True)
print(dfObj)

Output :

      Name  Age  Location  Country
a     Jill   16     Tokyo    Japan
c    Kirti   39  New York      USA
d    Veena   40     Delhi    India
e     John   54    Mumbai    India
f  Michael   21     Tokyo    Japan

Deleting Multiple Rows in DataFrame by Index Labels :

To delete multiple rows by their labels we can just pass the labels into the function inside a square bracket [ ].

Here let’s try to delete 'a' and 'b' row.

#program :

import numpy as np
import pandas as pd

#Example data
students = [('Jill',   16,  'Tokyo',     'Japan'),
('Phoebe', 38,  'New York',  'USA'),
('Kirti',  39,  'New York',  'USA'),
('Veena',  40,  'Delhi',     'India'),
('John',   54,  'Mumbai',    'India'),
("Michael",21,  'Tokyo',     'Japan')]

#Creating an object of dataframe class
dfObj = pd.DataFrame(students, columns = ['Name' , 'Age', 'Location' , 'Country'], index=['a', 'b', 'c' , 'd' , 'e' , 'f'])

#Deleting 'a' and 'b' row
dfObj.drop(['a','b'],inplace=True)
print(dfObj)
Output :
      Name  Age  Location  Country
c    Kirti   39  New York      USA
d    Veena   40     Delhi    India
e     John   54    Mumbai    India
f  Michael   21     Tokyo    Japan

Deleting Multiple Rows by Index Position in DataFrame :

Sometimes we know the index positions of the rows rather than their labels. The drop( ) function doesn’t take positions as parameters, so we look up the labels at those positions through dfObj.index and pass them to drop( ). Let’s try deleting the same rows again, this time by position.

#program :

import numpy as np
import pandas as pd

#Example data
students = [('Jill',   16,  'Tokyo',     'Japan'),
('Phoebe', 38,  'New York',  'USA'),
('Kirti',  39,  'New York',  'USA'),
('Veena',  40,  'Delhi',     'India'),
('John',   54,  'Mumbai',    'India'),
("Michael",21,  'Tokyo',     'Japan')]

#Creating an object of dataframe class
dfObj = pd.DataFrame(students, columns = ['Name' , 'Age', 'Location' , 'Country'], index=['a', 'b', 'c' , 'd' , 'e' , 'f'])

#Deleting 1st and 2nd row
dfObj.drop([dfObj.index[0] , dfObj.index[1]],inplace=True)
print(dfObj)
Output :
      Name  Age  Location  Country
c    Kirti   39  New York      USA
d    Veena   40     Delhi    India
e     John   54    Mumbai    India
f  Michael   21     Tokyo    Japan


Python Tuple : Append , Insert , Modify & delete elements in Tuple

How to append, insert, modify and delete elements in Tuple in Python ?

This article is about how we can append, insert, modify and delete elements in Tuple.

As we know, a tuple in Python stores an ordered, immutable sequence of objects. It is one of the data types that store multiple items in a single variable; the elements are placed inside parentheses () and separated by commas.

Syntax : sample_tuple = (element1, element2, element3, ...)

As a tuple is immutable, its values cannot be changed once it is created. If we still want to modify an existing tuple, we have to create a new tuple that contains the updated elements, built from the existing one. So let’s start exploring the topic to know how we can append, insert, modify and delete elements in Tuple.
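Immutability is easy to check: assigning to an index raises a TypeError, while building a new tuple works. A minimal sketch (the names sample_tuple and updated are illustrative):

```python
sample_tuple = (1, 3, 4)

# Direct item assignment is not allowed on a tuple
try:
    sample_tuple[0] = 9
except TypeError as err:
    message = str(err)
    print(message)  # 'tuple' object does not support item assignment

# Creating a new tuple from parts of the old one is allowed
updated = (9,) + sample_tuple[1:]
print(updated)       # (9, 3, 4)
print(sample_tuple)  # (1, 3, 4), the original is unchanged
```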

Append an element in Tuple at end :

To append an element to a tuple, we create a new tuple by concatenating the existing tuple and a one-element tuple holding the new element with the + operator, and assign the result back to the same name.

So , let’s see the implementation of it.

#Program :

# A tuple created
tuple_Obj = (1 , 3, 4, 2, 5 )

#printing old tuple
print("Old tuple is :")
print(tuple_Obj)

# Appending 9 at the end of tuple
tuple_Obj = tuple_Obj + (9 ,)

#printing new tuple
print("After appending new tuple is :")
print(tuple_Obj)
Output  :
Old tuple is :
(1, 3, 4, 2, 5)
After appending new tuple is :
(1, 3, 4, 2, 5, 9)

Insert an element at specific position in tuple :

If we want to insert a specific element at particular index, then we have to create a new tuple by slicing the existing tuple and copying elements of old tuple from it.

Suppose we have to insert at index n. Then we create two sliced copies of the existing tuple, one from 0 to n-1 and one from n to the end, like this:

# containing elements from 0 to n-1 
tuple_Obj [ : n] 
# containing elements from n to end 
tuple_Obj [n : ]

So, let’s see the implementation of it.

#Program :

# A tuple created
tuple_Obj = (1 , 3, 4, 2, 5 )

#printing old tuple
print("Old tuple is :")
print(tuple_Obj)

n = 2
# Insert 9 in tuple at index 2
tuple_Obj = tuple_Obj[ : n ] + (9 ,) + tuple_Obj[n : ]

#printing new tuple
print("After inserting new tuple is :")
print(tuple_Obj)
Output  :
Old tuple is :
(1, 3, 4, 2, 5)
After inserting new tuple is :
(1, 3, 9, 4, 2, 5)

Modify / Replace the element at specific index in tuple :

If we want to replace the element at index n in a tuple, we use the same slicing logic as in the above example. But here we slice the tuple from 0 to n-1 and from n+1 to the end: the new element takes the place of the element at index n, and after it we copy the remaining elements of the old tuple starting at index n+1.

So, let’s see the implementation of it.

#Program :

# A tuple created
tuple_Obj = (1 , 3, 4, 2, 5 )

#printing old tuple
print("Old tuple is :")
print(tuple_Obj)

n = 2
# Replace the element at index 2 with 'program'
tuple_Obj = tuple_Obj[ : n] + ('program' ,) + tuple_Obj[n + 1 : ]

#printing new tuple
print("After replacing new tuple is :")
print(tuple_Obj)
Output :
Old tuple is :
(1, 3, 4, 2, 5)
After replacing new tuple is :
(1, 3, 'program', 2, 5)

Delete an element at specific index in tuple :

If we want to delete the element at index n in a tuple, we use the same slicing logic as in the above example: we slice the tuple from 0 to n-1 and from n+1 to the end, like this:

# containing elements from 0 to n-1 
tuple_Obj [ : n] 
# containing elements from n+1 to end 
tuple_Obj [n+1 : ]

So, let’s see the implementation of it.

#Program :

# A tuple created
tuple_Obj = (1 ,3, 'program', 4, 2, 5 )

#printing old tuple
print("Old tuple is :")
print(tuple_Obj)

n = 2
# Deleting the element at index 2 
tuple_Obj = tuple_Obj[ : n ] + tuple_Obj[n+1 : ]

#printing new tuple
print("After deleting new tuple is :")
print(tuple_Obj)
Output : 
Old tuple is :
(1, 3, 'program', 4, 2, 5)
After deleting new tuple is :
(1, 3, 4, 2, 5)
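When several changes are needed at once, an alternative to repeated slicing is to convert the tuple to a list, modify the list, and convert it back. This is a sketch of that alternative, not the slicing technique used above; the names tuple_obj, items and new_tuple are illustrative:

```python
tuple_obj = (1, 3, 4, 2, 5)

# Work on a mutable copy of the tuple
items = list(tuple_obj)
items.append(9)              # append at the end
items.insert(2, 'program')   # insert at index 2
items[0] = 100               # replace the element at index 0
del items[1]                 # delete the element at index 1

# Convert back to a tuple; the original tuple is untouched
new_tuple = tuple(items)
print(new_tuple)   # (100, 'program', 4, 2, 5, 9)
print(tuple_obj)   # (1, 3, 4, 2, 5)
```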


Python : filter() function | Tutorial and Examples

filter() function in Python

This article is about using filter() with lambda.

filter() function :

The filter() function is used to select elements from a given iterable, such as a list, tuple or string, that satisfy a condition.

Syntax : filter(function, iterable)

where,

  • function refers to a function that accepts an argument and returns bool i.e True or False based on some condition.
  • iterable refers to the sequence which will be filtered.

How it works ?

The filter() function iterates over all elements of the sequence and calls the given callback function for each element.

  • If False is returned by the function then that element is skipped.
  • If True is returned by the function then that element is included in the result, which filter() returns as an iterator.
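The steps above can be sketched with a small numeric example (the helper is_even and the variable names are illustrative). It also shows that in Python 3 filter() returns a lazy iterator that can be consumed only once, and that a list comprehension gives the same result:

```python
numbers = [1, 2, 3, 4, 5, 6]

def is_even(n):
    # Callback: True keeps the element, False skips it
    return n % 2 == 0

evens = filter(is_even, numbers)
print(type(evens).__name__)   # filter

# Consuming the iterator produces the kept elements
result = list(evens)
print(result)                 # [2, 4, 6]

# The iterator is now exhausted
leftover = list(evens)
print(leftover)               # []

# Equivalent list comprehension
comp = [n for n in numbers if is_even(n)]
print(comp)                   # [2, 4, 6]
```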

Filter a list of strings in Python using filter() :

#Program :

# A list containing strings as elements
words = ['ant', 'bat', 'dog', 'eye', 'ink', 'job']

# function that returns True for words that begin with a vowel
def filter_vowels(word):
    vowels = ('a', 'e', 'i', 'o', 'u')
    return word[0].lower() in vowels

filtered_vowels = filter(filter_vowels, words)


# Print words that start with vowel
print('The filtered words are:')
for vowel in filtered_vowels:
    print(vowel)
Output :
The filtered words are:
ant
eye
ink

In the above example we created a separate function, filter_vowels, and passed it to the filter() function.

Using filter() with Lambda function :

Here we pass a lambda function to filter() instead of a separately defined function. The condition is to select words whose length is 6.

So, let’s see an example of it.

#Program :

# A list containing strings as elements
words = ['ant', 'basket', 'dog', 'table', 'ink', 'school']

#It will return strings whose length is 6
filtered_words = list(filter(lambda x : len(x) == 6 , words))


# Print words whose length is 6
print('The filtered words are:')
for word in filtered_words:
    print(word)
Output :
The filtered words are:
basket
school

Filter characters from a string in Python using filter() :

Now, let’s see how to filter out particular characters from a string.

#Program :

# A sample string
str_sample = "Hello, you are studying from Btech Geeks."

#filter() keeps only the characters that are not 'e' or 's'
#and join() builds a new string from the characters that remain
filteredChars = ''.join((filter(lambda x: x not in ['e', 's'], str_sample)))


# Print the new string 
print('Filtered Characters  : ', filteredChars)
Output : 
Filtered Characters  :  Hllo, you ar tudying from Btch Gk.

Filter an array in Python using filter() :

Suppose we have an array. Let’s see how we can filter elements out of it.

#Program :

# Two arrays 
sample_array1 = [1,3,4,5,21,33,45,66,77,88,99,5,3,32,55,66,77,22,3,4,5]
sample_array2 = [5,3,66]

#It will return a new array:
#elements of array1 that are also present in array2 are removed
filtered_Array = list(filter(lambda x : x not in sample_array2, sample_array1))
print('Filtered Array  : ', filtered_Array)
Output :

Filtered Array  :  [1, 4, 21, 33, 45, 77, 88, 99, 32, 55, 77, 22, 4]


Python : How to Compare Strings ? | Ignore case | regex | is vs == operator

How to Compare Strings ? | Ignore case | regex | is vs == operator in Python ?

In this article we will discuss various ways to compare strings in python.

Python provides various operators for comparing strings, i.e. equal to (==), not equal to (!=), less than (<), greater than (>), less than or equal to (<=) and greater than or equal to (>=). Each comparison returns a Boolean value, i.e. True or False.
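The ordering operators compare strings character by character using the Unicode code point of each character, which ord() exposes. A brief sketch of what that means in practice (variable names are illustrative):

```python
# Upper-case letters have smaller code points than lower-case letters
print(ord('A'), ord('a'))      # 65 97
upper_first = "ABC" < "abc"
print(upper_first)             # True

# Strings of digits are compared as text, not as numbers
text_compare = "45" > "100"    # '4' comes after '1'
number_compare = 45 > 100
print(text_compare)            # True
print(number_compare)          # False

# A string sorts before any longer string it is a prefix of
prefix_first = "abc" < "abcd"
print(prefix_first)            # True
```

When numeric order is wanted, convert the strings with int() or float() before comparing.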

Compare strings using == operator to check if they are equal using python :

Let’s see it with an example:

#program :

firststr = 'First'
secondstr = 'second'

if firststr == secondstr:
    print('Strings are same')
else:
    print('Strings are not same')
Output:
Strings are not same

As the contents of the two strings are not the same, the comparison returned False.

#Program :

firststr = 'done'
secondstr = 'done'

if firststr == secondstr:
    print('Strings are same')
else:
    print('Strings are not same')
Output:
Strings are same

Here, as the contents of both strings are the same, the comparison returned True.

Compare strings by ignoring case using python :

Let’s see it with an example:

#program :

firststr = 'PROGRAM'
secondstr = 'program'

if firststr.lower() == secondstr.lower():
    print('Strings are same')
else:
    print('Strings are not same')
Output :
Strings are same

As we can see, the two strings have the same content but differ in case, so they compare as equal after converting both to lower case.
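lower() is enough for plain English text. For text that may contain other alphabets, Python also provides str.casefold(), which is intended for caseless matching. A short sketch with the German letter ß, chosen here only as an illustration:

```python
first = 'Straße'
second = 'STRASSE'

# lower() leaves 'ß' unchanged, so the strings still differ
lower_equal = first.lower() == second.lower()
print(lower_equal)      # False

# casefold() maps 'ß' to 'ss', so the strings match
casefold_equal = first.casefold() == second.casefold()
print(casefold_equal)   # True
```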

Check if string are not equal using != operator using python :

Let’s see it with an example:

#Program :

firststr = 'python'
secondstr = 'program'

if firststr != secondstr:   
  print('Strings are not same')
else:   
  print('Strings are same')
Output: 
Strings are not same

Here, the two strings are not same so it returned True.

Let’s try some other operators.

Check if one string is less than or greater than the other string :

Let’s see it with an example:

#Program :

if "45" > "29":
    print('"45" is greater than "29"')
if "abc" > "abb":
    print('"abc" is greater than "abb"')
if "ABC" < "abc":
    print('"ABC" is less than "abc"')
if "32" >= "32":
    print('Both are equal')
if "62" <= "65":
    print('"62" is less than "65"')
Output :
"45" is greater than "29"
"abc" is greater than "abb"
"ABC" is less than "abc"
Both are equal
"62" is less than "65"

Comparing strings : is vs == operator :

Python has two comparison operators, == and is. At first sight they seem to be the same, but actually they are not.

== compares two variables by their value, whereas the is operator compares them by object identity: it returns True if the two variables refer to the same object and False if they refer to different objects.

The is operator is sometimes used to check whether strings are equal, but it only tells us whether both names point to the same object, so == is the right choice for comparing contents.

Compare contents using is operator :

Example-1

#Program :

a = 5
b = 5
if a is b:
 print("they are same")
print('id of "a"',id(a))
print('id of "b"',id(b))
Output :
they are same
id of "a" 140710783821728
id of "b" 140710783821728

Example 2:

#Program :

a = "python"
b = "program"
if a is b:
  print("they are same")
else:
  print("they are not same")

print('id of "a"',id(a))
print('id of "b"',id(b))
Output:
they are not same
id of "a" 2104787270768
id of "b" 2104783497712

Compare contents using == operator :

The == operator compares the value or equality of two objects.

#Program :

a = 5
b = 5

if a == b:
  print('Both are same')
Output:

Both are same
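The two operators can disagree for strings. In the sketch below, b is built at run time, so it has the same contents as a but is a separate object; the result of the is check is what CPython typically gives and should not be relied on in general:

```python
a = "python program"
# Built at run time, so this is a new string object with equal contents
b = " ".join(["python", "program"])

equal_value = a == b
same_object = a is b

print(equal_value)   # True
print(same_object)   # False in CPython
print('id of "a"', id(a))
print('id of "b"', id(b))
```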

Compare strings using regex in python :

A regular expression(re)  or regex is a special text string used for describing a search pattern. We can use that to compare strings.

#program :


import re
repattern = re.compile("88.*")
#list of numbers
List = ["88.3","88.8","87.1","88.0","28.2"]

#check if strings in list matches the regex pattern
for n in List:
 match = repattern.fullmatch(n)
 if match:
  print('string',n ,'matched')
 else:
  print('string',n ,'did not match')
Output :
string 88.3 matched
string 88.8 matched
string 87.1 did not match
string 88.0 matched
string 28.2 did not match
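In the pattern "88.*" the dot matches any character, so strings such as "88x" or "8800" also match. If the intent is to accept only numbers of the form 88.<digits>, the dot can be escaped. The stricter pattern below is an assumption about that intent, shown for comparison:

```python
import re

loose = re.compile("88.*")        # '.' matches any character
strict = re.compile(r"88\.\d+")   # literal dot followed by one or more digits

results = {}
for s in ["88.3", "88x", "8800", "88."]:
    # Record whether each pattern matches the whole string
    results[s] = (bool(loose.fullmatch(s)), bool(strict.fullmatch(s)))
    print(s, results[s])

# 88.3 (True, True)
# 88x (True, False)
# 8800 (True, False)
# 88. (True, False)
```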


Pandas: Get sum of column values in a Dataframe

How to get the sum of column values in a dataframe in Python ?

In this article, we will discuss how to find the sum of the values in a column of a dataframe. So, let’s start exploring the topic.

Select the column by name and get the sum of all values in that column :

To find the sum of the values of a single column we can select the column directly by name or through the loc[ ] indexer, and then call sum( ) on it.

Using sum() :

Here we select a column from the dataframe by its name and call sum( ) on it to get the sum of the values in that column.

Syntax- dataFrame_Object['column_name'].sum( )
#Program :

import numpy as np
import pandas as pd
# Example data
students = [('Jill',    16,     'Tokyo',  150),
('Rachel',    38,     'Texas',   177),
('Kirti',    39,     'New York',  97),
('Veena',   40,     'Texas',   np.nan),
('Lucifer',   np.nan, 'Texas',   130),
('Pablo', 30,     'New York',  155),
('Lionel',   45,     'Colombia', 121) ]
dfObj = pd.DataFrame(students, columns=['Name','Age','City','Score'])
#Sum of all values in the 'Score' column of the dataframe
totalSum = dfObj['Score'].sum()
print(totalSum)
Output :
830.0

Using loc[ ] :

Here we select the column by its name with loc[ ] and call sum( ) on it to get the sum of the values in that column.

Syntax- dataFrame_Object_name.loc[:, 'column_name'].sum( )

So, let’s see the implementation of it by taking an example.

#Program :

import numpy as np
import pandas as pd

# Example data (np.nan marks a missing value)
students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
dfObj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])
# Sum of all values in the 'Score' column of the dataframe using loc[ ]
totalSum = dfObj.loc[:, 'Score'].sum()
print(totalSum)
Output :
830.0
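
loc[ ] also accepts a list of column names, which gives the sum of several columns in one call. The sketch below reuses the example data; the name column_totals is chosen for illustration.

```python
import numpy as np
import pandas as pd

students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# A list of names selects several columns; sum() then returns one total
# per column as a Series indexed by column name.
column_totals = df.loc[:, ['Age', 'Score']].sum()
print(column_totals)
```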

Select the column by position and get the sum of all values in that column :

In case we don’t know the column name but we know its position, we can find the sum of all values in that column using iloc[ ] together with sum( ). In the program below the column is selected with a slice, so iloc[ ] returns a one-column dataframe, and calling sum( ) on it returns a series that holds the total under the column name.

So, let’s see the implementation of it by taking an example.

#Program :

import numpy as np
import pandas as pd

# Example data (np.nan marks a missing value)
students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
dfObj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])
column_number = 4
# Total sum of values in the 4th column i.e. 'Score' (positions start at 0)
totalSum = dfObj.iloc[:, column_number-1:column_number].sum()
print(totalSum)
Output :
Score    830.0
dtype: float64
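
The shape of the result depends on how the column position is given to iloc[ ]. A short sketch of the difference, reusing the example data (as_scalar and as_series are illustrative names):

```python
import numpy as np
import pandas as pd

students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# A single integer position selects a Series, so sum() gives one number.
as_scalar = df.iloc[:, 3].sum()

# A slice selects a one-column DataFrame, so sum() gives a Series.
as_series = df.iloc[:, 3:4].sum()

print(as_scalar)
print(as_series)
```

If only the number is needed, the single-position form avoids having to unpack the series afterwards.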

Find the sum of columns values for selected rows only in Dataframe :

If we need the sum of only some of the entries in a column, we can restrict the rows as well as the column in iloc[ ] and then call sum( ) on the selection.

So, let’s see the implementation of it by taking an example.

#Program :

import numpy as np
import pandas as pd

# Example data (np.nan marks a missing value)
students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
dfObj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])
column_number = 4
entries = 3
# Sum of the first three values from the 4th column
totalSum = dfObj.iloc[0:entries, column_number-1:column_number].sum()
print(totalSum)
Output :
Score    424.0
dtype: float64
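
The selected rows do not have to be the first ones. As a sketch, iloc[ ] also accepts a list of row positions or a negative slice; the positions below are picked arbitrarily for illustration.

```python
import numpy as np
import pandas as pd

students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Rows at positions 0, 2 and 4 of the 4th column (position 3).
picked_sum = df.iloc[[0, 2, 4], 3].sum()

# Last two rows of the same column.
last_two_sum = df.iloc[-2:, 3].sum()

print(picked_sum, last_two_sum)
```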

Find the sum of column values in a dataframe based on condition :

If we want the sum of only the values that satisfy a condition, for example the scores of students from a particular city such as New York, we can pass the condition as the row selector of loc[ ] and then call sum( ).

So, let’s see the implementation of it by taking an example.

#Program :

import numpy as np
import pandas as pd

# Example data (np.nan marks a missing value)
students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
dfObj = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])
# Sum of all the scores from New York city
totalSum = dfObj.loc[dfObj['City'] == 'New York', 'Score'].sum()
print(totalSum)
Output :
252.0
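
The row selector can also combine several conditions or test membership in a list of values. A minimal sketch on the same data, with conditions chosen only for illustration:

```python
import numpy as np
import pandas as pd

students = [('Jill', 16, 'Tokyo', 150),
            ('Rachel', 38, 'Texas', 177),
            ('Kirti', 39, 'New York', 97),
            ('Veena', 40, 'Texas', np.nan),
            ('Lucifer', np.nan, 'Texas', 130),
            ('Pablo', 30, 'New York', 155),
            ('Lionel', 45, 'Colombia', 121)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Students from Texas who are older than 30. A missing Age compares as
# False, so that row is left out of the selection.
texas_over_30 = (df['City'] == 'Texas') & (df['Age'] > 30)
texas_sum = df.loc[texas_over_30, 'Score'].sum()

# Students from either of two cities.
two_cities = df['City'].isin(['Tokyo', 'Colombia'])
two_cities_sum = df.loc[two_cities, 'Score'].sum()

print(texas_sum, two_cities_sum)
```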


Python Pandas : How to Drop rows in DataFrame by conditions on column values

How to Drop rows in DataFrame by conditions on column values in Python ?

In this article we will discuss how we can delete rows in a dataframe by certain conditions on column values.

DataFrame provides a member function drop( ) which removes the specified labels from the rows or columns of a dataframe.

DataFrame.drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
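
Before the full example, here is a small sketch of how the main parameters behave. The three-row frame and the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['riya', 'anjali', 'tia'],
                   'Age': [37, 28, 42]},
                  index=['a', 'b', 'c'])

# axis=0 is the default, so a bare label refers to a row.
without_row = df.drop('b')

# The columns keyword removes a column instead.
without_col = df.drop(columns='Age')

# errors='ignore' silently skips labels that do not exist.
unchanged = df.drop('z', errors='ignore')

# With inplace=False (the default) the original frame is left untouched.
print(without_row)
print(without_col)
print(df.shape)
```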

Let’s try with an example:

#Program :

import pandas as pd
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
   ('anjali',28,'agra','dog','lily'),
   ('tia',42,'jaipur','elephant','lotus'),
   ('kapil',51,'patna','cow','tulip'),
   ('raj',30,'banglore','lion','orchid')]

#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
print(df)
Output:
     Name  Age     Place    Animal  Flower
a    riya   37     delhi       cat    rose
b  anjali   28      agra       dog    lily
c     tia   42    jaipur  elephant   lotus
d   kapil   51     patna       cow   tulip
e     raj   30  banglore      lion  orchid

Delete rows based on a condition on a column :

Let’s try with an example by deleting a row:

deleteRow = df[df['Place'] == 'patna'].index
df.drop(deleteRow, inplace=True)
print(df)
Output:
     Name  Age     Place    Animal  Flower
a    riya   37     delhi       cat    rose
b  anjali   28      agra       dog    lily
c     tia   42    jaipur  elephant   lotus
e     raj   30  banglore      lion  orchid

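Here, the condition is

df['Place'] == 'patna'

Internally, it evaluates to a boolean series that is True for the rows satisfying the condition and False for the others:

a    False
b    False
c    False
d     True
e    False
Name: Place, dtype: bool

Indexing df with this series keeps only the True rows, and .index gives their labels, which are then passed to drop( ).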
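
Because the condition is just a boolean series, the same rows can also be removed by inverting it and keeping the rest, without calling drop( ). This is an alternative sketch, not the approach used above; the frame is a cut-down version of the example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['riya', 'anjali', 'tia', 'kapil', 'raj'],
                   'Place': ['delhi', 'agra', 'jaipur', 'patna', 'banglore']},
                  index=['a', 'b', 'c', 'd', 'e'])

mask = df['Place'] == 'patna'   # True only for row 'd'

# ~ inverts the mask, so indexing keeps every row where it was False.
kept = df[~mask]
print(kept)
```

This form returns a new dataframe and leaves df itself unchanged.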

Delete rows based on multiple conditions on a column :

Let’s try with multiple conditions on the dataframe that is left after the previous deletion:

deleteRow = df[(df['Age'] >= 30) & (df['Age'] <= 40)].index
df.drop(deleteRow, inplace=True)
print(df)
Output:
     Name  Age   Place    Animal  Flower
b  anjali   28    agra       dog    lily
c     tia   42  jaipur  elephant   lotus

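Here, we join the two conditions df['Age'] >= 30 and df['Age'] <= 40 by putting '&' between them. Each condition must be wrapped in parentheses, because & binds more tightly than the comparison operators.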
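
Conditions can be joined with '|' (or) in the same way, and a range check can be written with between( ), which includes both ends by default. The sketch below uses a cut-down copy of the original five-row example; the thresholds are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['riya', 'anjali', 'tia', 'kapil', 'raj'],
                   'Age': [37, 28, 42, 51, 30]},
                  index=['a', 'b', 'c', 'd', 'e'])

# Drop rows where Age is below 30 OR above 50.
to_drop = df[(df['Age'] < 30) | (df['Age'] > 50)].index
remaining = df.drop(to_drop)

# between(30, 40) is equivalent to (Age >= 30) & (Age <= 40).
in_range = df['Age'].between(30, 40)
outside = df[~in_range]

print(remaining)
print(outside)
```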
