Vikram Chiluka, Author at Python Programs

Python qrcode Module with Examples

We can generate our own QR codes using Python.

QR (Quick Response) code:

QR codes store a large amount of data and, when scanned, provide fast access to the information.

A QR code saves all of its data as a pattern of pixels in a square grid. In general, QR codes are used for the following purposes:

Sharing app download links
Storing account login details
Making payments

The three large squares in the corners of a conventional QR code are its major components. When a QR reader recognizes them, it knows where the code is located and how it is oriented, so it can read the information stored within the square.

qrcode Module with Examples in Python

Using Built-in Functions (Static Input)

Approach:

Import qrcode module using the import keyword.
Create an object to the qrcode using the QRCode() function and store it in a variable.
Add data to the above QRcode using the add_data() function by passing some random string as an argument.
Get or build the QRcode using the make() function.
Convert the QRcode into an image using the make_image() function.
Store it in another variable.
Save the above image with some random name using the save() function.
The Exit of the Program.

Below is the implementation:

# Import qrcode module using the import keyword.
import qrcode
# Create an object to the qrcode using the QRCode() function and store it in a
# variable.
our_QRcode = qrcode.QRCode()
# Add data to the above QRcode using the add_data() function by passing some
# random string as an argument.
our_QRcode.add_data('generating our own qr code')
# Get or build the QRcode using the make() function
our_QRcode.make()
# Convert the QRcode into an image using the make_image() function
# store it in another variable.
rslt_imge = our_QRcode.make_image()
# Save the above image with some random name using the save() function
rslt_imge.save('myqrcode.png')

Output:

Altering or Customizing the QR code

We can further alter the QR code’s look and structure by adding some properties to the qr object we created earlier with the QRCode function.

The following are some of the properties that are included in the object:

version: This defines the size of the QR code and takes an integer value from 1 to 40 (1 is the smallest).
box_size: This specifies how many pixels each box of the QR code occupies.

The code below also passes the fill_color and back_color parameters to the make_image() method, which set the color of the QR code and the color of its background, respectively.
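To see how version and box_size relate to the size of the saved image, here is a minimal sketch. The helper name qr_image_side is hypothetical and not part of the qrcode module. It relies on the QR standard's rule that a version v code has 17 + 4*v boxes per side, and it assumes a quiet-zone border of 4 boxes. Note that the library may pick a larger version than the one requested if the data does not fit.

```python
def qr_image_side(version, box_size, border=4):
    """Hypothetical helper: side length in pixels of a square QR image."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    # Version 1 is 21x21 boxes, version 40 is 177x177 boxes.
    boxes_per_side = 17 + 4 * version
    # The border is added on both sides, and every box is box_size pixels wide.
    return (boxes_per_side + 2 * border) * box_size


print(qr_image_side(version=1, box_size=15))   # 435
print(qr_image_side(version=40, box_size=15))  # 2775
```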

Approach:

Import qrcode module using the import keyword.
Pass the version and box size as the arguments to the QRCode() function and store it in a variable.
Add data to the above QRcode using the add_data() function by passing some random string as an argument.
Get or build the QRcode using the make() function.
Convert the QRcode into an image using the make_image() function by passing some random fill color and background colors as the arguments to it.
Store it in another variable.
Save the above image with some random name using the save() function.
The Exit of the Program.

Below is the implementation:

# Import qrcode module using the import keyword.
import qrcode
# Pass the version and box size as the arguments to the QRCode() function
# and store it in a variable.
our_QRcode = qrcode.QRCode(version=1, box_size=15)
# Add data to the above QRcode using the add_data() function by passing some
# random string as an argument.
our_QRcode.add_data('generating our own qr code')
# Get or build the QRcode using the make() function.
our_QRcode.make()
# Convert the QRcode into an image using the make_image() function by passing
# some random fill color and background colors as the arguments to it.
# store it in another variable.
rslt_imge = our_QRcode.make_image(fill_color="green", back_color="yellow")
# Save the above image with some random name using the save() function
rslt_imge.save('myqrcode_2.png')

Output:

Learn Python and Enjoy The Benefits of These Exciting Job Opportunities

Python / By Vikram Chiluka

There are numerous programming languages that programmers utilize, whether it be web development, app development, or any other type of software development. With the passage of time, the popularity of certain programming languages has skyrocketed. Python is one of these languages.

Python, in contrast to many other programming languages, is very straightforward to learn. As a result, it is approachable even for people who are new to programming, and this ease of use has made it one of the most popular programming languages. It may surprise you that Python is used in the backend programming of Google, Facebook, YouTube, and many other services.

In brief, Python can help you land some of the highest-paying jobs in the industry. If you master the language, which will take you about 5-6 months, you will be rewarded handsomely in the end.

1)Python Developer

When learning Python, this is one of the first jobs that spring to mind. You will be designing apps as a Python developer. This entails writing code. If you take this career path, you will largely use Python to solve problems for your organization and its clients. You will create or modify software that will make others’ jobs easier and more efficient.

Depending on your area of specialization, you may be referred to as a programmer, mobile application developer, web developer, or something else. Python will be used in all circumstances. A Python developer may also be in charge of various responsibilities related to the creation of software applications. For example, you could be in charge of product documentation and other implementation-related activities.

You will almost definitely begin as a member of a development team. As you gain more experience, you will be able to work more independently in the future.

Python developers’ salaries are often determined by their level of expertise. A junior Python developer, who is new to the programming field and has minimal experience creating code and only a few projects in his or her portfolio, can earn around $60,000. A senior Python developer who understands the

2)Data Analyst

What if you’re given a spreadsheet with a million entries and told to make a bar graph from it? It may take some time for you to complete this task. However, if you are familiar with Python, you can do the task in a matter of minutes. How?

Python is much more than basic addition and subtraction. The language comes with a plethora of libraries, such as NumPy, Pandas, SciPy, and Matplotlib, which can be used to perform large and difficult mathematical calculations and to plot the results.

Businesses today are searching for someone who can mine and analyze data in order to provide important insights. Professionals like these are gold mines for companies looking to build AI-based solutions and provide cutting-edge customer service.

3)Data Scientist

This is an interesting approach to data. While data analysts provide measurements and reports, data scientists deal with data in a broader and more in-depth manner. In this role, you will design predictive and categorization models, as well as forecast trends that may affect the company’s future development. You will inform the business of numerous possibilities and lay the groundwork for its strategies by applying your understanding of data analysis, statistics, and model creation. Data scientists are frequently in charge of A/B testing and analysing the outcomes of implemented solutions.

Data scientists are employed by the majority of large corporations. Because their services are in high demand, you can expect to earn a solid living. But be careful — in order to stay relevant in this fast-changing field, you must know a lot and continue to learn. You must develop a strong skill set. But it’s well worth it because it’s a very profitable and intriguing career that nearly assures job stability. There is still a scarcity of these specialists on the market.

For example, do you watch movies and TV shows on Netflix or Amazon? Have you noticed how these sites recommend specific titles that you might enjoy? The recommendations are based on algorithms that use data about you and your preferences that is collected on an ongoing basis.

There are other models that instruct Facebook on which advertisements to display to you. Someone must create and handle this. Data scientists work on these algorithms to determine what is ideal for the audience.

A strong portfolio is essential for a successful data scientist.

What can you expect to earn as a data scientist? The average is $113,000 per year. Top earners make up to $150,000 per year.

4) AI and ML Engineer

In 2020, the global AI market was valued at $62.35 billion. It is also predicted to expand at a CAGR of 40.2 percent between 2021 and 2028. This is clear evidence that not just one industry sector, but all of them, are investing significantly in AI.

The fitness sector offers all of its services to customers while they are at home. Medical apps use user data to create health programs for them. Banking and other service providers are employing AI-powered chatbots to improve customer service and other functions.

All of these factors indicate that AI is the next big thing, and you can be a part of it by learning Python from the ground up. TensorFlow, Scikit-Learn, and other libraries will come in handy while creating neural networks.

5)Data Engineer

Someone must first collect and process the data used by a data scientist for whatever he or she develops. This is when data engineers come in handy. Their job is to design a suitable structure for data collecting, storage, and processing. Massive volumes of data from all over the place must be collected and formatted so that it may be conveniently evaluated afterwards. Data engineers also design data pipelines and flows, and they ensure that large amounts of information are processed accurately and efficiently.

The work of data engineers benefits more than just data scientists. On a daily basis, data analysts and business analysts who gather insights may use data provided by data engineers to advise relevant actions.

Python is a great place to start if you want to become a data engineer. With tools for Big Data (e.g., Apache Spark), SQL, and scripting languages like Bash that enable process automation, your skill set will increase and expand over time.

These skills command a good salary. Data engineers can earn an annual salary of $103,000, according to Glassdoor. Advanced experts can earn more than $158,000 per year for their services.

6)As a Tutor

Teaching others is an excellent way to put your skills to use. Learning Python may be touted as a one-month project, but this is not the case, as you would know if you have tried to learn it. And, if you are skilled in a programming language, it can open up new doors for you.

There are numerous online platforms available to help you get the most out of your knowledge. You can use these venues to display your talents as a tutor. If this does not work, you can start your own YouTube channel, which is free and can earn you money. Instagram videos and reels are another excellent way to market yourself.

Beyond that, you can also get a position in a reputable college or university to put your abilities to use.

7) As a Freelancer

This is a considerably broader group. Because Python programming does not have to imply working for a single company, many Python developers prefer to make a career by taking remote jobs from a variety of clients.

There are numerous websites where you can find per-project opportunities. Normally, you must sign up and create a profile. UPWORK.COM, FREELANCER.COM, and INDEED.COM are some of the sites. Make sure you have a strong portfolio ready, because the prospective client doesn’t know who you are. The client will determine whether or not you are a good fit by reviewing your completed work.

Pay attention to Python learning since every line of code, every course finished, and every exercise or project you complete will make you a better programmer. As a freelancer, this is a significant part of how you’ll be able to take bigger tasks for more money.

Python Set symmetric_difference_update() Method with Examples

Python / By Vikram Chiluka

Set symmetric_difference_update() Method:

The symmetric_difference_update() method updates the original set in place by removing the items that are present in both sets and inserting the items that are present only in the other set.

The set of elements that are in either P or Q but not in their intersection is the symmetric difference of two sets P and Q.

For Example:

Let P and Q be two distinct sets. Applying symmetric_difference_update() to them works as follows:

Let P={4, 5, 6, 7}

Q={5, 6, 8, 9}

Here, {5, 6} are the elements common to both sets.

So, set P is updated to contain all the items from both sets except the common elements.

Set Q, on the other hand, remains unchanged.

Output : P={4, 7, 8, 9} and Q={5, 6, 8, 9}
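The P and Q example above can be checked with a minimal sketch like the one below. The variable names p, q, and r are chosen for illustration only, and sorted() is used just to make the printed order predictable. The sketch also shows that the ^= operator performs the same in-place update on sets.

```python
p = {4, 5, 6, 7}
q = {5, 6, 8, 9}

# The method modifies p in place and returns None.
result = p.symmetric_difference_update(q)
print(sorted(p))  # [4, 7, 8, 9]
print(sorted(q))  # [5, 6, 8, 9]
print(result)     # None

# The ^= operator performs the same in-place update.
r = {4, 5, 6, 7}
r ^= q
print(r == p)     # True
```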

Syntax:

set.symmetric_difference_update(set)

Parameters

set: This is Required. The set to look for matches in.

Return Value:

The symmetric_difference_update() method returns None (it returns nothing). Instead, it updates the set on which it is called.

Examples:

Example1:

Input:

Given first set = {10, 20, 50, 60, 30}
Given second set = {20, 30, 80, 90}

Output:

The given first set =  {10, 80, 50, 90, 60}
The given second set =  {80, 90, 20, 30}
The result after applying symmetric_difference_update() method =  None

Example2:

Input:

Given first set = {'a', 'b', 'c'}
Given second set = {'a', 'b', 'c', 'd', 'e'}

Output:

The given first set =  {'d', 'e'}
The given second set =  {'a', 'e', 'b', 'd', 'c'}
The result after applying symmetric_difference_update() method =  None

Set symmetric_difference_update() Method with Examples in Python

Using Built-in Functions (Static Input)
Using Built-in Functions (User Input)

Method #1: Using Built-in Functions (Static Input)

Approach:

Give the first set as static input and store it in a variable.
Give the second set as static input and store it in another variable.
Apply symmetric_difference_update() method to the given two sets and Store it in another variable.
Print the given first set.
Print the given second set.
Print the result after applying the symmetric_difference_update() method for the given two sets.
The Exit of the Program.

Below is the implementation:

# Give the first set as static input and store it in a variable.
fst_set = {10, 20, 50, 60, 30}
# Give the second set as static input and store it in another variable.
scnd_set = {20, 30, 80, 90}
# Apply symmetric_difference_update() method to the given two sets and
# Store it in another variable.
rslt = fst_set.symmetric_difference_update(scnd_set)
# Print the given first set
print("The given first set = ", fst_set)
# Print the given second set
print("The given second set = ", scnd_set)
# Print the result after applying symmetric_difference_update() method for the
# given two sets
print("The result after applying symmetric_difference_update() method = ", rslt)

Output:

The given first set =  {10, 80, 50, 90, 60}
The given second set =  {80, 90, 20, 30}
The result after applying symmetric_difference_update() method =  None

Method #2: Using Built-in Functions (User Input)

Approach:

Give the first set as user input using set(),map(),input(),and split() functions.
Store it in a variable.
Give the second set as user input using set(),map(),input(),and split() functions.
Store it in another variable.
Apply symmetric_difference_update() method to the given two sets and Store it in another variable.
Print the given first set.
Print the given second set.
Print the result after applying the symmetric_difference_update() method for the given two sets.
The Exit of the Program.

Below is the implementation:

# Give the first set as user input using set(),map(),input(),and split() functions.
# Store it in a variable.
fst_set = set(map(int, input(
   'Enter some random Set Elements separated by spaces = ').split()))
# Give the second set as user input using set(),map(),input(),and split() functions.
# Store it in another variable.
scnd_set = set(map(int, input(
   'Enter some random Set Elements separated by spaces = ').split()))

# Apply symmetric_difference_update() method to the given two sets and
# Store it in another variable.
rslt = fst_set.symmetric_difference_update(scnd_set)
# Print the given first set
print("The given first set = ", fst_set)
# Print the given second set
print("The given second set = ", scnd_set)
# Print the result after applying symmetric_difference_update() method for the
# given two sets
print("The result after applying symmetric_difference_update() method = ", rslt)

Output:

Enter some random Set Elements separated by spaces = 2 5 6 7 1
Enter some random Set Elements separated by spaces = 6 9 1 4 0
The given first set = {0, 2, 4, 5, 7, 9}
The given second set = {0, 1, 4, 6, 9}
The result after applying symmetric_difference_update() method = None

Python: Differences Between List and Array

Python / By Vikram Chiluka

List:

In Python, a list is a collection of items that can contain elements of multiple data types, such as numeric, character, and logical values. It is an ordered collection that allows for negative indexing. Using [], you can create a list with data values.
List contents can be easily merged and copied using Python’s built-in functions.

The core difference between a Python list and a Python array is that a list is built into the language, whereas an array requires the “array” module to be imported. With a few exceptions, lists in Python replace the array data structure.

Array:

An array is a vector that contains homogeneous items, that is, elements of the same data type. Elements are assigned contiguous memory addresses, which makes it easy to access, add, and delete elements. To declare arrays in Python, we must use the array module. If we try to store an item whose data type does not match the array’s type code, a TypeError is raised.

Differences: List vs Array

LIST	ARRAY
A list may contain elements of many data types.	An array contains only elements of the same data type.
No module needs to be imported to create a list.	A module must be imported explicitly to create an array.
Arithmetic operations cannot be applied to the whole list directly.	Arithmetic operations can be applied to the whole (NumPy) array directly.
Preferred for shorter sequences of data items.	Preferred for longer sequences of data.
Elements can be nested and may contain multiple types of elements.	All nested items must be of the same size.
Data can be modified (added, deleted) easily and with greater freedom.	Less flexible, because addition and deletion must be done element by element.
The complete list can be printed without any explicit looping.	To print or access the array’s components one by one, a loop must be created.
Lists consume more memory to facilitate the insertion of elements.	Arrays consume somewhat less memory.

1)Storing Data – List vs Array:

Data structures, as we all know, are used to effectively store data.
In this scenario, a list can contain heterogeneous data values. In other words, data objects of various data types can be handled in a Python List.

list:

# Give the list as static input and store it in a variable.
# List may contain heterogenous datatypes like int, float, strings etc.
gvn_lst = [8, 2.5, 1, "hello", 'Python-programs']
# Print the given list
print(gvn_lst)

Output:

[8, 2.5, 1, 'hello', 'Python-programs']

Arrays:

Arrays, on the other hand, store homogeneous elements, that is, elements of the same kind.

# Import array using the import keyword
import array
# Give the array as static input by passing it as an argument to
# the array() function and store it in a variable.
# Arrays contain homogeneous datatypes
gvn_arry = array.array('i', [15, 65, 25, 48])
# Print the given array
print(gvn_arry)

Output:

array('i', [15, 65, 25, 48])
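As a minimal sketch of what "homogeneous" means in practice, the snippet below tries to append a string to an integer array. The variable name int_arry is chosen for illustration. The exact wording of the error message can differ between Python versions, so only the exception type is relied on here.

```python
import array

# 'i' is the type code for signed integers.
int_arry = array.array('i', [15, 65, 25, 48])

try:
    # A string does not match the 'i' type code, so this fails.
    int_arry.append("hello")
except TypeError as err:
    print("TypeError:", err)

# The array is unchanged after the failed append.
print(int_arry)
```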

2)Declaration

Lists:

Python provides a built-in data structure called “List.” As a result, lists in Python can be created directly, without importing a module or declaring a type.

# Give the list as static input and store it in a variable.
gvn_lst = [8, 2.5, 1, "hello", 'Python-programs']

Arrays:

Arrays must be declared in Python. We can declare an array using the following methods:

Array Module:

import array
ArrayName = array.array('format-code', [items])

Numpy Module:

import numpy
ArrayName = numpy.array([items])

3)Mathematical Operations:

Arrays:

When it comes to conducting Mathematical operations, arrays have an advantage. The NumPy module provides us with an array structure to store and manipulate data values.

Example

# Import numpy module using the import keyword
import numpy
# Give the array as static input by passing it as an argument to
# the array() function and store it in a variable.
gvn_arry = numpy.array([15, 10, 5, 4])
# Multiply each element of the array by 5 and store it in another variable.
rslt = gvn_arry*5
# Print the above result
print(rslt)

Output:

[75 50 25 20]

Lists behave differently, as the example below shows.

In this case, we multiply the list by the constant value 5, but the elements of the given list are not multiplied. Lists do not support element-wise arithmetic: the * operator repeats the list five times instead, and the original list is left unchanged.

So, if we wish to multiply the elements of the list by 5, we must multiply each element of the list individually.

Lists:

# Give the list as static input and store it in a variable.
gvn_lst = [15, 10, 5, 4]
# Multiply the list by 5 (this repeats the list 5 times) and store it in another variable.
rslt = gvn_lst*5
# Print the given list (it is unchanged)
print(gvn_lst)

Output:

[15, 10, 5, 4]
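To multiply each element of a list individually, one common option is a list comprehension. The sketch below is only an illustration; the names repeated and scaled are not from the original example.

```python
gvn_lst = [15, 10, 5, 4]

# The * operator repeats the list; it does not multiply the elements.
repeated = gvn_lst * 2
print(repeated)  # [15, 10, 5, 4, 15, 10, 5, 4]

# A list comprehension multiplies every element individually.
scaled = [item * 5 for item in gvn_lst]
print(scaled)    # [75, 50, 25, 20]
```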

4)Changing the size of the data structure

Python Lists, as an inbuilt data structure, may be enlarged or resized quickly and easily.

NumPy arrays, on the other hand, have a fixed size once they are created and perform poorly when resized. To enlarge or shrink one, we have to create a new array and copy the data into it.
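The difference can be seen in a short sketch, assuming NumPy is installed. A list grows in place with append(), while numpy.append() returns a new array and leaves the original untouched. The variable names are illustrative.

```python
import numpy as np

gvn_lst = [15, 10, 5]
gvn_lst.append(4)  # the same list object grows in place
print(gvn_lst)     # [15, 10, 5, 4]

gvn_arry = np.array([15, 10, 5])
bigger_arry = np.append(gvn_arry, 4)  # a new array is created and returned

print(gvn_arry.shape)           # (3,)
print(bigger_arry.shape)        # (4,)
print(bigger_arry is gvn_arry)  # False
```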

Python Programs for Calculus Using SymPy Module

Python / By Vikram Chiluka

Calculus:

Calculus is a branch of mathematics. Limits, functions, derivatives, integrals, and infinite series are all topics in calculus. To do calculus in Python, we will utilize the SymPy package.

Derivatives:

How steep is a function at a given point? Derivatives can be used to get the answer to this question. It measures the rate of change of a function at a specific point.

Integration:

What is the area beneath the graph over a particular region? Integration can be used to get the answer to this question. It combines the function’s values over a range of numbers.

SymPy Module:

SymPy is a Python symbolic mathematics library. It aspires to be a full-featured computer algebra system (CAS) while keeping the code as basic or simple as possible in order to be understandable and easily expandable. SymPy is entirely written in Python.

Before performing calculus operations, install the sympy module as shown below:

Installation of sympy:

pip install sympy

To write any sympy expression, we must first declare its symbolic variables. To accomplish this, we can employ the following two functions:

sympy.Symbol(): This function is used to declare a single variable by passing it as a string into its parameter.

sympy.symbols(): This function is used to declare multiple variables by passing their names as a single string argument. The names in the string must be separated by spaces or commas.
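A minimal sketch of both functions, assuming sympy is installed, is shown below. The symbol names and the expression are chosen only for illustration.

```python
import sympy as sp

# Declare a single symbolic variable.
x = sp.Symbol('x')
# Declare several symbolic variables at once.
a, b = sp.symbols('a b')

# Build an expression from the symbols.
expr = a * x**2 + b
print(expr)

# Substitute numbers for the symbols: 2 * 3**2 + 1
print(expr.subs({a: 2, b: 1, x: 3}))  # 19
```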

1)Limits Calculation in Python

Limits are used in calculus to define the continuity, derivatives, and integrals of a function. In Python, we use the following syntax to calculate limits:

Syntax:

sympy.limit(function,variable,value)

For Example:

lim f(x) as x–>k

The parameters specified in the preceding syntax for computing the limit in Python are function, variable, and value.

f(x): The function on which the limit operation will be conducted is denoted by f(x).
x: The function’s variable is x.
k: k is the value that x tends to.
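As a small sketch of the syntax above, assuming sympy is installed, the snippet below computes a two-sided limit and then the two one-sided limits of 1/x at 0. The optional dir argument of sympy.limit() selects the side from which the variable approaches the value.

```python
import sympy as sp

x = sp.Symbol('x')

# sin(x)/x tends to 1 as x tends to 0.
print(sp.limit(sp.sin(x) / x, x, 0))   # 1

# 1/x behaves differently depending on the side of approach.
print(sp.limit(1 / x, x, 0, dir='+'))  # oo
print(sp.limit(1 / x, x, 0, dir='-'))  # -oo
```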

Example1: limit of sin(y) / y as y–>0.4

Approach:

Import sympy module as ‘sp’ using the import keyword
Pass the argument y to the Symbol() function to create the limit variable and store it in a variable.
Create the function sin(y) / y using the above variable and the sin() function of the sympy module.
Pass the given function, variable, value as the arguments to the limit() function to get the limit value.
Store it in a variable.
Print the above-obtained limit value for the given function.
The Exit of the Program.

Below is the implementation:

# limit of sin(y) / y as y–>0.4

# Import sympy module as 'sp' using the import keyword
import sympy as sp
# Pass the argument y to the Symbol() function to create the limit variable and store it in a variable
y = sp.Symbol('y')
# Create the function sin(y) / y using the above variable and the sin() function of the sympy module
func = sp.sin(y)/y
# Pass the given function, variable, value as the arguments to the limit() function
# to get the limit value.
# Store it in a variable.
rslt_lmt = sp.limit(func, y, 0.4)
# Print the above obtained limit value for the given function.
print("The result limit value for the given function = ", rslt_lmt)

Output:

The result limit value for the given function = 0.973545855771626

Example2: limit of sin(3y) / y as y–>0

# limit of sin(3y) / y as y–>0

# Import sympy module as 'sp' using the import keyword
import sympy as sp
# Pass the argument y to the Symbol() function to create the limit variable and store it in a variable
y = sp.Symbol('y')
# Create the function sin(3y) / y using the above variable and the sin() function of the sympy module
func = sp.sin(3*y)/y
# Pass the given function, variable, value as the arguments to the limit() function
# to get the limit value.
# Store it in a variable.
rslt_lmt = sp.limit(func, y, 0)
# Print the above obtained limit value for the given function.
print("The result limit value for the given function = ", rslt_lmt)

Output:

The result limit value for the given function = 3

2)Derivatives Calculation in Python

Derivatives are an important aspect of conducting calculus in Python. We use the following syntax to differentiate a function, that is, to find its derivative:

Syntax:

sympy.diff(function,variable)
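Since a derivative measures how steep a function is at a given point, the result of diff() can be evaluated at that point with subs(). The following is a minimal sketch, assuming sympy is installed; the function y**3 and the point y = 2 are chosen only for illustration.

```python
import sympy as sp

y = sp.Symbol('y')

# Differentiate y**3 with respect to y.
slope = sp.diff(y**3, y)
print(slope)             # 3*y**2

# Evaluate the slope at the point y = 2.
print(slope.subs(y, 2))  # 12
```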

Example: f(y) = sin(y) + y² + e^(3y)

Below is the implementation:

# f(y) = sin(y) + y^2 + e^(3y)

# Import sympy module as 'sp' using the import keyword
import sympy as sp
# Pass the argument y to the Symbol() function to create the variable of the function
y = sp.Symbol('y')
# Create the function using the above variable and the sin() and exp() functions of the sympy module
func = sp.sin(y) + y**2 + sp.exp(3*y)
# Get the first derivative by passing the function and the variable to the diff() function and print it
fst_diff = sp.diff(func, y)
print('The value of first differentiation of function', func, 'is :\n', fst_diff)
# Get the second derivative by passing the extra argument 2 (the order of differentiation) and print it
scnd_diff = sp.diff(func, y, 2)
print('The value of second differentiation of function', func, 'is :\n', scnd_diff)

Output:

The value of first differentiation of function y**2 + exp(3*y) + sin(y) is :
2*y + 3*exp(3*y) + cos(y)
The value of second differentiation of function y**2 + exp(3*y) + sin(y) is :
9*exp(3*y) - sin(y) + 2

Example: f(y) = cos(y) + y² + e^(3y)

# f(y) = cos(y) + y^2 + e^(3y)

# Import sympy module as 'sp' using the import keyword
import sympy as sp
# Pass the argument y to the Symbol() function to create the variable of the function
y = sp.Symbol('y')
# Create the function using the above variable and the cos() and exp() functions of the sympy module
func = sp.cos(y) + y**2 + sp.exp(3*y)
# Get the first derivative by passing the function and the variable to the diff() function and print it
fst_diff = sp.diff(func, y)
print('The value of first differentiation of function', func, 'is :\n', fst_diff)
# Get the second derivative by passing the extra argument 2 (the order of differentiation) and print it
scnd_diff = sp.diff(func, y, 2)
print('The value of second differentiation of function', func, 'is :\n', scnd_diff)

Output:

The value of first differentiation of function y**2 + exp(3*y) + cos(y) is :
2*y + 3*exp(3*y) - sin(y)
The value of second differentiation of function y**2 + exp(3*y) + cos(y) is :
9*exp(3*y) - cos(y) + 2

3)Integration Calculation in Python

SymPy’s integrals module provides the integrate() function. In Python, the syntax for calculating an integral is as follows:

Syntax:

integrate(function, variable)
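As a brief sketch, assuming sympy is installed, the snippet below computes an indefinite integral and then a definite one. For a definite integral, integrate() accepts a (variable, lower, upper) tuple instead of the bare variable. The bounds 0 and 2 are chosen only for illustration.

```python
import sympy as sp

x = sp.Symbol('x')
expr = x**3 + 2*x + 5

# Indefinite integral with respect to x.
print(sp.integrate(expr, x))          # x**4/4 + x**2 + 5*x

# Definite integral from 0 to 2: 4 + 4 + 10.
print(sp.integrate(expr, (x, 0, 2)))  # 18
```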

Example: x³ + 2x + 5

Below is the implementation:

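# Function: x^3 + 2x + 5

# Import the required functions from the sympy module using the import keyword
from sympy import symbols, integrate
# Declare the symbolic variable
x = symbols('x')
gvn_expresn = x**3 + 2*x + 5
print("The integration for the given expression is:")
# Integrate the expression with respect to x and print the result
print(integrate(gvn_expresn, x))

Output:

The integration for the given expression is:
x**4/4 + x**2 + 5*x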

Python Numpy Broadcasting

Python / By Vikram Chiluka

“The word broadcasting refers to how numpy handles arrays of varying shapes during arithmetic operations. The smaller array is ‘broadcast’ across the bigger array, subject to specific limits so that their shapes are consistent. Broadcasting allows you to vectorize array operations such that looping happens in C rather than Python.”

For Example:

To understand NumPy’s broadcasting method, we add an array and a number, which have different dimensions.

# Import numpy as np using the import keyword
import numpy as np
# Pass some random number (length of array) to the arange() function and store it in a variable.
gvn_arry = np.arange(4)
# Add a number to the given array and store it in another variable.
result = gvn_arry + 6
# Print the above result
print(result)

Output:

[6 7 8 9]

In this case, the given array has one dimension (axis) of length 4, whereas 6 is a simple integer with no dimensions. Because they have different dimensions, NumPy attempts to broadcast (simply stretch) the smaller operand along a specific axis, making it suitable for the mathematical operation.
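The "stretching" can be made visible with numpy.broadcast_to(), which returns the value as it looks after being broadcast to a given shape. This is only a sketch for illustration, assuming NumPy is installed; NumPy does this implicitly during the addition, and the name stretched is not from the original example.

```python
import numpy as np

gvn_arry = np.arange(4)

# Show what the scalar 6 looks like when stretched to the array's shape.
stretched = np.broadcast_to(6, gvn_arry.shape)
print(stretched.tolist())               # [6, 6, 6, 6]

# Adding the stretched array gives the same result as adding the scalar.
print((gvn_arry + stretched).tolist())  # [6, 7, 8, 9]
print(np.array_equal(gvn_arry + stretched, gvn_arry + 6))  # True
```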

The Numpy Broadcasting Rules

Numpy broadcasting follows a strict set of rules to ensure that array operations are consistent and fail-safe. The following are two general broadcasting rules in numpy:

When we perform an operation on two NumPy arrays, NumPy compares their shapes dimension by dimension, from right to left. Two dimensions are compatible only when they are equal or one of them is 1. If the two dimensions are equal, the arrays are left as they are along that dimension. If one of them is 1, that array is broadcast along the dimension. If neither condition is met, NumPy raises a ValueError, indicating that the arrays cannot be broadcast. The arrays are broadcast if and only if all dimensions are compatible.

The arrays being compared do not have to have the same number of dimensions. The array with fewer dimensions is simply stretched along the missing dimensions.
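The two rules can be sketched in plain Python as a shape check. The function broadcast_shape below is a hypothetical helper written for illustration; it is not part of NumPy and only mimics the rules described above.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: return the shape two arrays would broadcast to."""
    result = []
    # Compare dimensions from right to left; a missing dimension counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(reversed(result))


print(broadcast_shape((5, 3), (5, 1)))  # (5, 3)
print(broadcast_shape((4,), ()))        # (4,)
print(broadcast_shape((5, 3), (3,)))    # (5, 3)
```

Calling broadcast_shape((5, 3), (5, 2)) raises a ValueError, because 3 and 2 are neither equal nor 1.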

Implementation

Let the two arrays be arr_1 with shape (5, 3) and arr_2 with shape (5, 1).

The sum of arrays with compatible dimensions: The arrays have the compatible shapes (5, 3) and (5, 1). To match the shape of arr_1, the array arr_2 is stretched along the second dimension.

# Import numpy module as np using the import keyword.
import numpy as np
# Pass the rowsize*columnsize as argument to the arange() function and
# pass the rowsize, columnsize  as arguments to the reshape() function
# Store it in a variable.
arry1 = np.arange(15).reshape(5, 3)
# Print the shape(rowsize, columnsize) using the shape function
print("First Array shape = ", arry1.shape)
# Similarly, get the other array
arry2 = np.arange(5).reshape(5, 1)
print("Second Array shape = ", arry2.shape)
# Print the sum of both the arrays by adding the above two variables.
print("Adding both arrays and printing the sum of it:")
print(arry1 + arry2)

Output:

First Array shape = (5, 3) 
Second Array shape = (5, 1) 
Adding both arrays and printing the sum of it: 
[[ 0 1 2] 
 [ 4 5 6] 
 [ 8 9 10] 
 [12 13 14] 
 [16 17 18]]

Example2:

# Import numpy module as np using the import keyword.
import numpy as np
# Pass rowsize*columnsize as the argument to the arange() function and
# pass the rowsize, columnsize  as arguments to the reshape() function
# Store it in a variable.
arry1 = np.arange(15).reshape(5, 4)
# Print the shape(rowsize, columnsize) using the shape function
print("First Array shape = ", arry1.shape)
# Similarly, get the other array
arry2 = np.arange(5).reshape(5, 1)
print("Second Array shape = ", arry2.shape)
# Print the sum of both the arrays by adding the above two variables.
print("Adding both arrays and printing the sum of it:")
print(arry1 + arry2)

Output:

ValueError                                Traceback (most recent call last)
<ipython-input-17-3bb527438935> in <module>()
      4 # pass the rowsize, columnsize  as arguments to the reshape() function
      5 # Store it in a variable.
----> 6 arry1 = np.arange(15).reshape(5, 4)
      7 # Print the shape(rowsize, columnsize) using the shape function
      8 print("First Array shape = ", arry1.shape)

ValueError: cannot reshape array of size 15 into shape (5,4)

Explanation:

Here, the requested shape has 5 rows and 4 columns, which needs 5*4 = 20
elements, but np.arange(15) supplies only 15. The ValueError therefore comes
from reshape(), before any broadcasting is attempted.
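The error above comes from reshape(), not from broadcasting. For comparison, here is a small sketch of a shape mismatch that broadcasting itself rejects, together with one possible fix. The array names are illustrative and do not come from the original example.

```python
import numpy as np

matrix = np.arange(20).reshape(5, 4)   # shape (5, 4)
column_values = np.arange(5)           # shape (5,)

try:
    # The trailing dimensions are 4 and 5: not equal, and neither is 1.
    matrix + column_values
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Adding a trailing axis turns (5,) into (5, 1), which is compatible with (5, 4).
fixed_sum = matrix + column_values[:, np.newaxis]
print(fixed_sum.shape)
```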

Example3:

# Import numpy module as np using the import keyword.
import numpy as np
# Pass rowsize*columnsize as the argument to the arange() function and
# pass the rowsize, columnsize  as arguments to the reshape() function
# Store it in a variable.
arry1 = np.arange(18).reshape(6, 3)
# Print the shape(rowsize, columnsize) using the shape function
print("First Array shape = ", arry1.shape)
# Similarly, get the other array
arry2 = np.arange(3)
print("Second Array shape = ", arry2.shape)
# Print the sum of both the arrays by adding the above two variables.
print("Adding both arrays and printing the sum of it:")
print(arry1 + arry2)

Output:

First Array shape = (6, 3) 
Second Array shape = (3,) 
Adding both arrays and printing the sum of it: 
[[ 0 2 4] 
 [ 3 5 7] 
 [ 6 8 10] 
 [ 9 11 13] 
 [12 14 16] 
 [15 17 19]]

Example4:

# Import numpy module as np using the import keyword.
import numpy as np
arry1 = np.arange(120).reshape(5, 4, 3, 2)
print("First Array shape = ", arry1.shape)
 
arry2 = np.arange(24).reshape(4, 3, 2)
print("Second Array shape = ", arry2.shape)
 
print("Adding both arrays and printing the sum of it: \n", (arry1 + arry2).shape)

Output:

First Array shape = (5, 4, 3, 2) 
Second Array shape = (4, 3, 2) 
Adding both arrays and printing the sum of it: 
(5, 4, 3, 2)

Explanation:

It is important to realize that the arrays do not need to have the same number
of dimensions. Array1 has shape (5, 4, 3, 2), while array2 has shape (4, 3, 2).
Array2 is treated as if it had shape (1, 4, 3, 2) and is stretched along that
new first dimension, while array1 is left unchanged, yielding a result of
shape (5, 4, 3, 2).
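A quick way to see what broadcasting does in Example4 is to add the leading axis by hand and compare the results. This is only a sketch; the names explicit and implicit are illustrative.

```python
import numpy as np

arry1 = np.arange(120).reshape(5, 4, 3, 2)
arry2 = np.arange(24).reshape(4, 3, 2)

# Give arry2 an explicit leading axis of length 1: shape (1, 4, 3, 2).
explicit = arry1 + arry2[np.newaxis, ...]
# Let NumPy broadcast the three-dimensional array on its own.
implicit = arry1 + arry2

print(np.array_equal(explicit, implicit))
# Each of the 5 leading blocks of arry1 has the same arry2 added to it.
print(np.array_equal(implicit[3], arry1[3] + arry2))
```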

Broadcasting’s Speed Advantages

NumPy broadcasting is more efficient than looping through the array. Take the first example: instead of broadcasting, the user could loop over the array and add the same number to each element. That is slower for two reasons. First, the loop runs in the Python interpreter, so every iteration pays interpreter overhead instead of staying inside NumPy's compiled C code. Second, broadcasting does not copy the smaller operand: NumPy works with strides, and setting the stride of a broadcast dimension to 0 lets the same elements be reused as often as needed without any extra memory.
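The zero-stride idea can be observed with np.broadcast_to, which returns a read-only broadcast view. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

row = np.arange(3)
# broadcast_to returns a read-only view; no data is copied.
stretched = np.broadcast_to(row, (4, 3))
print(stretched.shape)                   # (4, 3)
print(stretched.strides[0])              # 0: moving to the next row re-reads the same memory
print(np.shares_memory(row, stretched))  # True

# An explicit Python loop and the broadcast expression give the same values.
looped = np.array([value + 6 for value in row])
vectorized = row + 6
print(np.array_equal(looped, vectorized))
```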


In Python, How do you Normalize Data?

Python / By Vikram Chiluka

This post will teach you how to normalize data in Pandas.

Pandas:

Pandas is an open-source library developed on top of the NumPy library. It is a Python module that contains a variety of data structures and procedures for manipulating numerical data and statistics. It’s mostly used to make importing and evaluating data easier. Pandas is fast, high-performance, and productive for users.

Data Normalization:

Data normalization is a common technique in machine learning that rescales numeric columns to a standard scale. Some features have values many times larger than others, and without rescaling the features with the largest values tend to dominate the learning process.

Before we get into normalisation, let us first grasp why it is necessary.

Feature scaling is an important stage in data analysis and data preparation for modelling. In this step, we make the data scale-free for easier analysis.
Normalisation is one of the feature scaling strategies. It is used most often when the data is skewed, i.e. when it does not follow a Gaussian distribution.
Normalization converts features measured on different scales to a common scale, which makes the data easier to handle during modelling. As a result, all of the features (variables) have a similar influence on the model.

We normalise each feature by subtracting the feature's minimum value from each data value and then dividing by the feature's range:

Formula:

$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

As a result, the data is converted to the range [0, 1].
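The min-max calculation can also be written directly in Pandas. The following is a minimal sketch on a small made-up sample; the values and the variable names are assumptions chosen for illustration only.

```python
import pandas as pd

# Small made-up sample with two numeric columns.
sample = pd.DataFrame({"calories": [70, 120, 50, 130], "fat": [1, 5, 0, 2]})

# (x - min) / (max - min), applied column by column.
normalized = (sample - sample.min()) / (sample.max() - sample.min())
print(normalized)
```

A column whose values are all identical has a range of 0, so this expression would produce NaN for it; such columns need separate handling.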

Methods for Normalizing Data in Python

Python has several approaches that you can use to do normalization.

Let us take an example of a dummy dataset here. You can download some other dataset and test it out.

1) MinMaxScaler

Importing the Dataset

Import the dataset into a Pandas Dataframe.

# Import pandas module as pd using the import keyword
import pandas as pd
# Import dataset using read_csv() function by passing the dataset name as
# an argument to it.
# Store it in a variable.
dummy_dataset = pd.read_csv("dummy_data.csv")
print(dummy_dataset)

Output:

    id  calories  protein  fat
0    0        70        4    1
1    1       120        3    5
2    2        70        4    1
3    3        50        4    0
4    4       110        2    2
5    5       110        2    2
6    6       110        2    0
7    7       130        3    2
8    8        90        2    1
9    9        90        3    0
10  10       120        1    2
11  11       110        6    2
12  12       120        1    3
13  13       110        3    2
14  14       110        1    1
15  15       110        2    0
16  16       100        2    0
17  17       110        1    0
18  18       110        1    1

Normalizing the above-given dataset by applying the MinMaxScaler function

Approach:

Import pandas module as pd using the import keyword.
Import the MinMaxScaler class from the sklearn.preprocessing module using the import keyword.
Import dataset using read_csv() function by passing the dataset name as an argument to it.
Store it in a variable.
Create an object of the MinMaxScaler class and store it in a variable.
Normalize (scale each column to the range 0 to 1) the given dataset using the fit_transform() function and store it in another variable.
Print the Normalized data values.
The Exit of the Program.

Below is the implementation:

# Import pandas module as pd using the import keyword
import pandas as pd
# Import the MinMaxScaler class from sklearn.preprocessing module using the import keyword
from sklearn.preprocessing import MinMaxScaler
# Import dataset using read_csv() function by passing the dataset name as
# an argument to it.
# Store it in a variable.
dummy_dataset = pd.read_csv("dummy_data.csv")
# Create an object of the MinMaxScaler class and store it in a variable.
scaler_val = MinMaxScaler()

# Normalize (scale each column to the range 0 to 1) the given dataset using the
# fit_transform() function and store it in another variable.
normalizd_data = pd.DataFrame(scaler_val.fit_transform(dummy_dataset),
                              columns=dummy_dataset.columns, index=dummy_dataset.index)
# Print the Normalized data values
print(normalizd_data)

Output:

As can be seen, we have processed and normalised the data values between 0 and 1.

          id  calories  protein  fat
0   0.000000     0.250      0.6  0.2
1   0.055556     0.875      0.4  1.0
2   0.111111     0.250      0.6  0.2
3   0.166667     0.000      0.6  0.0
4   0.222222     0.750      0.2  0.4
5   0.277778     0.750      0.2  0.4
6   0.333333     0.750      0.2  0.0
7   0.388889     1.000      0.4  0.4
8   0.444444     0.500      0.2  0.2
9   0.500000     0.500      0.4  0.0
10  0.555556     0.875      0.0  0.4
11  0.611111     0.750      1.0  0.4
12  0.666667     0.875      0.0  0.6
13  0.722222     0.750      0.4  0.4
14  0.777778     0.750      0.0  0.2
15  0.833333     0.750      0.2  0.0
16  0.888889     0.625      0.2  0.0
17  0.944444     0.750      0.0  0.0
18  1.000000     0.750      0.0  0.2

In Brief:

From the preceding explanation, the following conclusions can be drawn:

Normalisation is used when the data values are skewed and do not follow a Gaussian distribution.
The data values are transformed to the range between 0 and 1.
Normalization makes the data scale-free.

We can also use another method for normalization:

The maximum absolute scaling:

Maximum absolute scaling rescales each feature to the range -1 to 1 by dividing each observation by the feature's maximum absolute value. In Pandas, this can be done with the .abs() and .max() methods.
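As a rough sketch, maximum absolute scaling takes one line in Pandas. The sample data and variable names below are made up for illustration.

```python
import pandas as pd

# Made-up sample; the "change" column contains a negative value on purpose.
sample = pd.DataFrame({"change": [-4, 2, 1], "fat": [1, 5, 0]})

# Divide every column by its largest absolute value.
scaled = sample / sample.abs().max()
print(scaled)
```

Unlike min-max scaling, this keeps the sign of each value, so negative observations stay negative.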

However, MinMaxScaler is the more popular approach, so it is the only one covered in detail in this article.


Python Program to Remove Stop Words with NLTK

Python / By Vikram Chiluka

Pre-processing is the process of transforming data into something that a computer can work with. Filtering out data that carries little useful information is a common type of pre-processing. In natural language processing, such low-value words are called stop words.

Stop Words:

A stop word is a commonly used word (for example, “the”, “a”, “an”, “is” or “in”) that a search engine has been configured to ignore, both when indexing entries for searching and when retrieving them as the result of a search query.
We don’t want these words taking up space in our database or using precious processing time. We can easily eliminate them by keeping a list of the words we consider to be stop words. Python’s NLTK (Natural Language Toolkit) ships stop word lists for a number of languages (24 in the corpus version listed further below). You can find them in the nltk_data directory, which is located at home/folder/nltk_data/corpora/stopwords.

Note: Don’t forget to modify the name of your home directory.

Before going to the coding part, download the corpus including stop words from the NLTK module.

# Import nltk module using the import keyword.
import nltk
# Pass the 'stopwords' as an argument to the download() function to download all the
# stop words package
nltk.download('stopwords')

Output:

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
True

Printing the stop words list from the corpus:

# Import stopwords from nltk.corpus using the import keyword.
from nltk.corpus import stopwords
# Print all the stopwords in english language using the words() function in
# stopwords.
print(stopwords.words('english'))

Output:

['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're",
 "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he',
 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', 
"it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves',
 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those',
 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 
'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if',
 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 
'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after',
 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off',
 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when',
 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most',
 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 
'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't",
 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain',
 'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn',
 "doesn't", 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn',
 "isn't", 'ma', 'mightn', "mightn't", 'mustn', "mustn't", 'needn',
 "needn't", 'shan', "shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 
'weren', "weren't", 'won', "won't", 'wouldn', "wouldn't"]

You can also select stopwords from the other languages based on requirements.

Getting the list of all available languages:

The following languages are available in the NLTK ‘stopwords’ corpus.

# Import stopwords from nltk.corpus using the import keyword.
from nltk.corpus import stopwords
# Get all the Languages list that can be used using the fileids() function in
# stopwords
print(stopwords.fileids())

Output:

['arabic', 'azerbaijani', 'bengali', 'danish', 'dutch', 'english', 'finnish',
 'french', 'german', 'greek', 'hungarian', 'indonesian', 'italian', 'kazakh', 
'nepali', 'norwegian', 'portuguese', 'romanian', 'russian', 'slovene', 
'spanish', 'swedish', 'tajik', 'turkish']

Adding our own stop words to the corpus:

# Import stopwords from nltk.corpus using the import keyword.
from nltk.corpus import stopwords
# Get all the stopwords in english language using the words() function in
# stopwords.
# Store it in a variable
our_stopwords = stopwords.words('english')
# Append some random stop word to the above obtained stopwords list using the
# append() function
our_stopwords.append('forexample')
print(our_stopwords)

Output:

['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're",
 "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 
'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's",
 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 
'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 
'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 
'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 
'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for',
 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before',
 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on',
 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 
'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more',
 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same',
 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't",
 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 
'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn', "doesn't",
 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn', "isn't", 'ma',
 'mightn', "mightn't", 'mustn', "mustn't", 'needn', "needn't", 'shan', 
"shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 'weren', "weren't", 
'won', "won't", 'wouldn', "wouldn't", 'forexample']

The user-given stop word appears at the end of the output above. Note that append() changes only the list stored in our_stopwords; the NLTK corpus on disk is not modified.

Removal of stop words:

Below is the code for removing all the stop words from a string/sentence.

Tokenization:

Tokenization is the process of converting a piece of text into smaller parts known as tokens. These tokens are the core of NLP.

Tokenization is used to convert a sentence into a list of words.

Approach:

Import nltk module using the import keyword.
Import stopwords from nltk.corpus using the import keyword.
Download ‘stopwords’ and ‘punkt’ from the nltk module using the download() function.
Import word_tokenize from nltk.tokenize using the import keyword.
Give the random string as static input and store it in a variable.
Pass the given string to the word_tokenize() function to convert the given string into a list of words.
Remove the stop words from the given string using the list comprehension and store it in another variable.
Print the string after removing stopwords.
The Exit of the Program.

Below is the implementation:

# Import nltk module using the import keyword.
import nltk
# Import stopwords from nltk.corpus using the import keyword.
from nltk.corpus import stopwords
# Download 'stopwords','punkt' from nltk module using the download() function.
nltk.download('stopwords')
nltk.download('punkt')
# Import word_tokenize from nltk.tokenize using the import keyword.
from nltk.tokenize import word_tokenize
# Give the random string as static input and store it in a variable.
gvn_str = "hello this is btechgeeks in is good morning all is a"
# Pass the given string to the word_tokenize() function to convert the given
# string into a list of words.
text_tokens = word_tokenize(gvn_str)
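# Load the stop words once into a set. Called without an argument, words()
# returns the stop words of all languages; pass 'english' to restrict it.
all_stopwords = set(stopwords.words())
# Remove the stop words from the token list using the list comprehension
# and store it in another variable.
stopwrds_removd = [word for word in text_tokens if word not in all_stopwords]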
# Print the string after removing stopwords.
print(stopwrds_removd)

Output:

['hello', 'btechgeeks', 'good', 'morning']
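The filtering step itself does not depend on NLTK. As a rough sketch, the same list comprehension works with any collection of stop words. The helper remove_stop_words, the tiny stop word list and the whitespace split below are assumptions that stand in for stopwords.words('english') and word_tokenize(), chosen only so that the snippet runs without any downloads.

```python
def remove_stop_words(text, stop_words):
    """Return the words of `text` that are not stop words (case-insensitive)."""
    # A set gives fast membership checks; lowercasing makes the match case-insensitive.
    stop_set = {word.lower() for word in stop_words}
    return [token for token in text.split() if token.lower() not in stop_set]


# Hand-picked stand-in for a real stop word list.
small_stop_list = ["this", "is", "in", "all", "a"]
filtered = remove_stop_words(
    "Hello This is btechgeeks in is good morning all is a", small_stop_list
)
print(filtered)  # ['Hello', 'btechgeeks', 'good', 'morning']
```

Lowercasing each token before the comparison matters because NLTK's English list is lowercase, so a capitalised "This" would otherwise slip through.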


Python Program to Extract Digits from a String – 2 Easy Ways

Python / By Vikram Chiluka

When working with strings, we frequently need to extract all of the digits they contain. This kind of task is common in competitive programming as well as in web development. Let’s solve it now!

Program to Extract Digits from a String – 2 Easy Ways in Python

Using Built-in Functions (Static Input)
Using Built-in Functions (User Input)

Method #1: Using Built-in Functions (Static Input)

1) Using Python isdigit() Function:

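The Python isdigit() method returns True if every character in the string is a digit and the string is not empty; otherwise it returns False. Applied to a single character, it tells us whether that character is a digit.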

Syntax:

string.isdigit()
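A few quick checks illustrate this behaviour. The sample strings are made up for this sketch.

```python
samples = ["2024", "20a4", "", "-5", "3.5"]

# Map each sample string to the result of isdigit().
results = {sample: sample.isdigit() for sample in samples}
for sample, is_digit in results.items():
    print(repr(sample), is_digit)
```

Only "2024" gives True: a letter, a minus sign or a decimal point makes the result False, and so does the empty string.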

Approach:

Give the string as static input and store it in a variable.
Take a variable and initialize it with an empty string.
Iterate in the given string using the for loop.
Inside the for loop, check if the character in a given string is a digit or not using the isdigit() function and if conditional statement.
If it is true, then concatenate the character to the above declared empty string using the ‘+’ operator and store it in the same variable.
Print all the digits from a given string.
The Exit of the Program.

Below is the implementation:

# Give the string as static input and store it in a variable.
gvn_strng = "678_Goodmorning 123 hello all"
print("The Given string = ", gvn_strng)
# Take a variable and initialize it with an empty string.
new_str = ""
# Iterate in the given string using the for loop.
for chrctr in gvn_strng:
    # Inside the for loop, check if the character in a given string is a digit
    # or not using the isdigit() function and if conditional statement.
    if chrctr.isdigit():
        # If it is true, then concatenate the character to the above declared empty
        # string using the '+' operator and store it in the same variable.
        new_str = new_str + chrctr
# Print all the digits from a given string.
print("The digits present in a given string = ", new_str)

Output:

The Given string =  678_Goodmorning 123 hello all
The digits present in a given string =  678123

2)Using List comprehension:

# Give the string as static input and store it in a variable.
gvn_strng = "678_Goodmorning 123 hello all"
# Print the given string
print("The Given string = ", gvn_strng)
# Using list comprehension to get all the digits present in a given string
new_lst = [int(chrctr) for chrctr in gvn_strng if chrctr.isdigit()]
# Print all the digits from a given string.
print("The digits present in a given string = ", new_lst)

Output:

The Given string =  678_Goodmorning 123 hello all
The digits present in a given string =  [6, 7, 8, 1, 2, 3]

3)Using regex Library:

Python’s regular expression module, re, lets us detect the presence of specific characters in a string, such as digits, special characters, and so on.
Before proceeding, import the re module into the Python environment.

import re

r’\d+’ – to extract numbers from the string

‘\d+’ matches one or more consecutive digits, so the findall() function returns every run of digits in the string as a separate item.

Approach:

Import regex library using the import keyword.
Give the string as static input and store it in a variable.
Print the given string.
Pass r’\d+’ and given string as arguments to the re.findall() function to extract numbers from the string and store it in another variable.
Here ‘\d+’ helps the findall() function in identifying the existence of any digit.
Print all the digits from a given string.
The Exit of the Program.

Below is the implementation:

# Import regex library using the import keyword.
import re
# Give the string as static input and store it in a variable.
gvn_strng = "6 Goodmorning 17 hello all"
# Print the given string
print("The Given string = ", gvn_strng)
# Pass r'\d+'  and given string as arguments to the re.findall() function to
# extract numbers from the string and store it in another variable.
# Here '\d+' helps the findall() function in identifying the existence of any digit.
rslt_digts = re.findall(r'\d+', gvn_strng)
# Print all the digits from a given string.
print(rslt_digts)

Output:

The Given string = 6 Goodmorning 17 hello all
['6', '17']
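The difference between matching runs of digits and matching single digits can be sketched as follows; the variable names are illustrative.

```python
import re

text = "6 Goodmorning 17 hello all"

digit_runs = re.findall(r"\d+", text)        # ['6', '17']
single_digits = re.findall(r"\d", text)      # ['6', '1', '7']
numbers = [int(run) for run in digit_runs]   # [6, 17]

print(digit_runs)
print(single_digits)
print(numbers)
```

findall() always returns strings, so a conversion with int() is needed when the values are to be used as numbers.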

Method #2: Using Built-in Functions (User Input)

1) Using Python isdigit() Function:

Approach:

Give the string as user input using the input() function and store it in a variable.
Take a variable and initialize it with an empty string.
Iterate in the given string using the for loop.
Inside the for loop, check if the character in a given string is a digit or not using the isdigit() function and if conditional statement.
If it is true, then concatenate the character to the above declared empty string using the ‘+’ operator and store it in the same variable.
Print all the digits from a given string.
The Exit of the Program.

Below is the implementation:

# Give the string as user input using the input() function and store it in a variable.
gvn_strng = input("Enter some random string = ")
print("The Given string = ", gvn_strng)
# Take a variable and initialize it with an empty string.
new_str = ""
# Iterate in the given string using the for loop.
for chrctr in gvn_strng:
    # Inside the for loop, check if the character in a given string is a digit
    # or not using the isdigit() function and if conditional statement.
    if chrctr.isdigit():
        # If it is true, then concatenate the character to the above declared empty
        # string using the '+' operator and store it in the same variable.
        new_str = new_str + chrctr
# Print all the digits from a given string.
print("The digits present in a given string = ", new_str)

Output:

Enter some random string = welcome6477 to Python-programs
The Given string = welcome6477 to Python-programs
The digits present in a given string = 6477

2)Using List comprehension:

# Give the string as user input using the input() function and store it in a variable.
gvn_strng = input("Enter some random string = ")
# Print the given string
print("The Given string = ", gvn_strng)
# Using list comprehension to get all the digits present in a given string
new_lst = [int(chrctr) for chrctr in gvn_strng if chrctr.isdigit()]
# Print all the digits from a given string.
print("The digits present in a given string = ", new_lst)

Output:

Enter some random string = 65 hello 231 all
The Given string = 65 hello 231 all
The digits present in a given string = [6, 5, 2, 3, 1]

3)Using regex Library:

Approach:

Import regex library using the import keyword.
Give the string as user input using the input() function and store it in a variable.
Print the given string.
Pass r’\d+’ and given string as arguments to the re.findall() function to extract numbers from the string and store it in another variable.
Here ‘\d+’ helps the findall() function in identifying the existence of any digit.
Print all the digits from a given string.
The Exit of the Program.

Below is the implementation:

# Import regex library using the import keyword.
import re
# Give the string as user input using the input() function and store it in a variable.
gvn_strng = input("Enter some random string = ")
# Print the given string
print("The Given string = ", gvn_strng)
# Pass r'\d+'  and given string as arguments to the re.findall() function to
# extract numbers from the string and store it in another variable.
# Here '\d+' helps the findall() function in identifying the existence of any digit.
rslt_digts = re.findall(r'\d+', gvn_strng)
# Print all the digits from a given string.
print(rslt_digts)

Output:

Enter some random string = hello this is python 35 program
The Given string = hello this is python 35 program
['35']


Python astype() Method with Examples

Python / By Vikram Chiluka

In this tutorial, we will go over an important idea in detail: Data Type Conversion of Columns in a DataFrame Using Python astype() Method.

Python is a superb language for data analysis, owing to its fantastic ecosystem of data-centric Python packages. Pandas is one of these packages, and it greatly simplifies data import and analysis.

astype() Method:

DataFrame.astype() method is used to convert pandas object to a given datatype. The astype() function can also convert any acceptable existing column to a categorical type.

We frequently come across a stage in the realm of Data Science and Machine Learning when we need to pre-process and transform the data. To be more specific, the transformation of data values is the first step toward modeling.
This is when data column conversion comes into play.

The Python astype() method allows us to convert the data type of an existing data column in a dataset or data frame.

Using the astype() function, we can convert the data type of a single column or of multiple columns to a completely different type.

Syntax:

DataFrame.astype(dtype, copy=True, errors='raise')

Parameters

dtype: The data type to apply to the entire data frame, or a dictionary mapping column names to data types to convert only specific columns.
copy: If set to True, a new copy of the data with the changes incorporated is returned.
errors: If set to ‘raise’ (the default), the function raises an exception when a conversion fails. If set to ‘ignore’, exceptions are suppressed and the original object is returned.
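The two forms of the dtype argument and the default error behaviour can be sketched like this. The small data frame and the variable names are made up for illustration.

```python
import pandas as pd

# "ID" is stored as text on purpose so that a conversion is needed.
frame = pd.DataFrame({"ID": ["11", "12"], "salary": [10000, 25000]})

# A single dtype is applied to every column.
all_float = frame.astype("float64")

# A dict converts only the listed columns.
mixed = frame.astype({"ID": "int64"})

# With the default errors='raise', an impossible conversion raises an exception.
try:
    pd.Series(["1", "x"]).astype("int64")
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(all_float.dtypes)
print(mixed.dtypes)
print(conversion_failed)
```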

1)astype() – with DataFrame

Below is the implementation:

# Import pandas module using the import keyword
import pandas as pd
# Give the dictionary as static input and store it in a variable.
# (data given in the dictionary form)
gvn_data = {"ID": [11, 12, 13, 14, 15, 16],
            "Name": ["peter", "irfan", "mary", "riya", "virat", "sunny"],
            "salary": [10000, 25000, 15000, 50000, 30000, 22000]}
# Pass the given data to the DataFrame() function and store it in another variable
block_data = pd.DataFrame(gvn_data)
# Print the above result
print("The given input Dataframe: ")
print(block_data)
print()
# Print the data type of each column using the dtypes attribute
print(block_data.dtypes)

Output:

The given input Dataframe: 
   ID   Name  salary
0  11  peter   10000
1  12  irfan   25000
2  13   mary   15000
3  14   riya   50000
4  15  virat   30000
5  16  sunny   22000

ID         int64
Name      object
salary     int64
dtype: object

Now, apply the astype() method on the ‘Name’ column to change the data type to ‘category’

# Import pandas module using the import keyword
import pandas as pd
# Give the dictionary as static input and store it in a variable.
# (data given in the dictionary form)
gvn_data = {"ID": [11, 12, 13, 14, 15, 16],
            "Name": ["peter", "irfan", "mary", "riya", "virat", "sunny"],
            "salary": [10000, 25000, 15000, 50000, 30000, 22000]}
# Pass the given data to the DataFrame() function and store it in another variable
block_data = pd.DataFrame(gvn_data)
# Apply the astype() method on the 'Name' column to change the data type to 'category'
block_data['Name'] = block_data['Name'].astype('category')
# Print the data type of each column using the dtypes attribute
print(block_data.dtypes)

Output:

ID           int64
Name      category
salary       int64
dtype: object

Note:

 You can also change to datatype 'string'
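To see what the ‘category’ type stores, the categories and their integer codes can be inspected through the .cat accessor. This is a small sketch on a made-up Series with repeated names.

```python
import pandas as pd

names = pd.Series(["peter", "irfan", "mary", "irfan", "peter"])
as_category = names.astype("category")

# The distinct values are stored once, in sorted order ...
print(as_category.cat.categories.tolist())  # ['irfan', 'mary', 'peter']
# ... and each row holds only an integer code pointing at its category.
print(as_category.cat.codes.tolist())       # [2, 0, 1, 0, 2]
```

This layout is why a categorical column usually pays off when a column contains many repeated values.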

2)astype() Method – with a Dataset in Python

Use the pandas.read_csv() function to import the dataset. The examples below read a cereal dataset from a file named cereal.csv.

Approach:

Import pandas library using the import keyword.
Import some random dataset using the pandas.read_csv() function by passing the filename as an argument to it.
Store it in a variable.
Apply dtypes to the above dataset.
The Exit of the Program.

Below is the implementation:

# Import pandas library using the import keyword
import pandas
# Import some random dataset using the pandas.read_csv() function by passing
# the filename as an argument to it.
# Store it in a variable.
cereal_dataset = pandas.read_csv("cereal.csv")
# Print the data type of each column using the dtypes attribute
print(cereal_dataset.dtypes)

Output:

name         object
mfr          object
type         object
calories      int64
protein       int64
fat           int64
sodium        int64
fiber       float64
carbo       float64
sugars        int64
potass        int64
vitamins      int64
shelf         int64
weight      float64
cups        float64
rating      float64
dtype: object

Now change the data type of the columns ‘name’ and ‘fat’ to string and float64 respectively. This shows that the astype() function can change the data types of multiple columns in one go.

# Import pandas library using the import keyword
import pandas
# Import some random dataset using the pandas.read_csv() function by passing
# filename as an argument to it.
# Store it in a variable.
cereal_dataset = pandas.read_csv("cereal.csv")
# Change the datatype of the variables 'name' and 'fat' using the astype() function
cereal_dataset = cereal_dataset.astype({"name": 'string', "fat": 'float64'})
print("The dataset after changing datatypes:")
# Print the data type of each column using the dtypes attribute
print(cereal_dataset.dtypes)

Output:

The dataset after changing datatypes:
name         string
mfr          object
type         object
calories      int64
protein       int64
fat         float64
sodium        int64
fiber       float64
carbo       float64
sugars        int64
potass        int64
vitamins      int64
shelf         int64
weight      float64
cups        float64
rating      float64
dtype: object


Author name: Vikram Chiluka
