
Create Numpy Array of different shapes & initialize with identical values using numpy.full() in Python

Creating Numpy Array of different shapes & initialize with identical values using numpy.full()

In this article we will see how to create NumPy arrays of different shapes whose elements are all initialized with the same value. So, let's explore the concept to understand it well.

numpy.full() :

Numpy module provides a function numpy.full() to create a numpy array of given shape and initialized with a given value.

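Syntax : numpy.full(shape, fill_value, dtype=None, order='C')

Where,

shape : Shape of the new array, given as an int (1D) or a tuple of ints.
fill_value : Value used to initialize every element.
dtype : Data type of the elements (optional). If omitted, it is inferred from fill_value.
order : Memory layout of the array, 'C' (row-major, the default) or 'F' (column-major).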

To use NumPy we first have to import the module, i.e.

import numpy as np

Let's see the examples below to understand the concept.

Example-1 : Create a 1D Numpy Array of length 8 and all elements initialized with value 2

Here the array length is 8 and every element is initialized with 2.

Let's see the program below.

import numpy as np
# 1D Numpy Array created of length 8 & all elements initialized with value 2
sample_arr = np.full(8,2)
print(sample_arr)

Output :
[2 2 2 2 2 2 2 2]

Example-2 : Create a 2D Numpy Array of 3 rows | 4 columns and all elements initialized with value 5

Here the 2D array has 3 rows and 4 columns, and every element is initialized with 5.

Let's see the program below.

import numpy as np
# Create a 2D Numpy Array of 3 rows & 4 columns. All initialized with value 5
sample_arr = np.full((3,4), 5)
print(sample_arr)

Output :
[[5 5 5 5]
 [5 5 5 5]
 [5 5 5 5]]

Example-3 : Create a 3D Numpy Array of shape (3,3,4) & all elements initialized with value 1

Here the shape is (3,3,4) and every element is initialized with 1.

Let's see the program below.

import numpy as np
# Create a 3D Numpy array & all elements initialized with value 1
sample_arr = np.full((3,3,4), 1)
print(sample_arr)

Output :

[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]

Example-4 : Create initialized Numpy array of specified data type

Here the array length is 8, the value is 4 and the data type is float.

import numpy as np
# 1D Numpy array created & all float elements initialized with value 4
sample_arr = np.full(8, 4, dtype=float)
print(sample_arr)

Output :
[4. 4. 4. 4. 4. 4. 4. 4.]
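
The dtype behaviour described above can be checked directly. The following is a minimal sketch (the variable names are illustrative, not from the article) showing that the data type is inferred from the fill value when dtype is omitted and that an explicit dtype overrides it.

```python
import numpy as np

# When dtype is omitted, it is inferred from the fill value.
floats = np.full(3, 2.5)          # float fill value -> float64 array
flags = np.full((2, 2), True)     # bool fill value -> bool array

# An explicit dtype overrides the inferred one.
small_ints = np.full(4, 7, dtype=np.int8)

print(floats, floats.dtype)
print(flags.shape, flags.dtype)
print(small_ints, small_ints.dtype)
```

Choosing a small dtype such as int8 is only sensible when the fill value and any later values fit in its range.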


Pandas: Create Dataframe from List of Dictionaries

Methods of creating a dataframe from a list of dictionaries

In this article, we discuss different methods by which we can create a dataframe from a list of dictionaries. Before going further, let us make some observations that help to understand the concept easily. Suppose we have a list of dictionaries:

list_of_dict = [ {'Name': 'Mayank' , 'Age': 25, 'Marks': 91}, {'Name': 'Raj', 'Age': 21, 'Marks': 97}, {'Name': 'Rahul', 'Age': 23, 'Marks': 79}, {'Name': 'Manish' , 'Age': 23}, ]

We know that dictionaries consist of key-value pairs. If we treat each key as a column name and each value as the value in that column, a dataframe is easily created. Since we have a list of dictionaries, each dictionary becomes one row, giving a dataframe with multiple rows.

pandas.DataFrame

This constructor helps us to create a dataframe in Python.

syntax: pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

Let us see different methods to create dataframe from a list of dictionaries

Method 1-Create Dataframe from list of dictionaries with default indexes

As we see, pandas.DataFrame() has a parameter named data. We simply pass our list of dictionaries to it and it returns the dataframe. Let us see this with the help of an example.

import pandas as pd
import numpy as np

list_of_dict = [
    {'Name': 'Mayank' ,  'Age': 25,  'Marks': 91},
    {'Name': 'Raj',  'Age': 21,  'Marks': 97},
    {'Name': 'Rahul',  'Age': 23,  'Marks': 79},
    {'Name': 'Manish' ,  'Age': 23,  'Marks': 86},
]
#create dataframe
df=pd.DataFrame(list_of_dict)
print(df)

Output

     Name  Age  Marks
0  Mayank   25     91
1     Raj   21     97
2   Rahul   23     79
3  Manish   23     86

Here we see that the dataframe is created with the default indexes 0, 1, 2, 3. (Recent pandas versions keep the columns in the order of the dictionary keys; older versions sorted them alphabetically.)

Now a question may arise: what happens if one dictionary has fewer key-value pairs than the others? Let us understand it with the help of an example.

import pandas as pd
import numpy as np

list_of_dict = [
    {'Name': 'Mayank' ,  'Age': 25,  'Marks': 91},
    {'Name': 'Raj',  'Age': 21,  'Marks': 97},
    {'Name': 'Rahul',  'Marks': 79},
    {'Name': 'Manish' ,  'Age': 23},
]
#create dataframe
df=pd.DataFrame(list_of_dict)
print(df)

Output

     Name   Age  Marks
0  Mayank  25.0   91.0
1     Raj  21.0   97.0
2   Rahul   NaN   79.0
3  Manish  23.0    NaN

Here we see that a missing key-value pair shows up as NaN in the output. The Age and Marks columns also become float, because an integer column cannot hold NaN.
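
Once missing keys have turned into NaN, it is often useful to count or replace them. The following is a minimal sketch using the same data; the replacement value 0 is an arbitrary assumption chosen for illustration, not something the article prescribes.

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Marks': 79},
    {'Name': 'Manish', 'Age': 23},
]
df = pd.DataFrame(list_of_dict)

# Count the missing values in each column
missing_counts = df.isna().sum()
print(missing_counts)

# Replace every missing value with 0 (an arbitrary placeholder)
filled = df.fillna(0)
print(filled)
```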

Method 2- Create Dataframe from list of dictionary with custom indexes

Unlike the previous method, where we had the default indexes, we can also give custom indexes by passing a list of indexes in the index parameter of pandas.DataFrame(). Let us see this with the help of an example.

import pandas as pd
import numpy as np

list_of_dict = [
    {'Name': 'Mayank' ,  'Age': 25,  'Marks': 91},
    {'Name': 'Raj',  'Age': 21,  'Marks': 97},
    {'Name': 'Rahul',  'Marks': 79},
    {'Name': 'Manish' ,  'Age': 23},
]
#create dataframe
df=pd.DataFrame(list_of_dict,index=['a','b','c','d'])
print(df)

Output

     Name   Age  Marks
a  Mayank  25.0   91.0
b     Raj  21.0   97.0
c   Rahul   NaN   79.0
d  Manish  23.0    NaN

Here we see that instead of the default indexes 0, 1, 2, 3 we now have the indexes a, b, c, d.
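
Custom indexes are mainly useful because rows can then be selected by label. A minimal sketch, assuming the complete version of the data with no missing keys:

```python
import pandas as pd

list_of_dict = [
    {'Name': 'Mayank', 'Age': 25, 'Marks': 91},
    {'Name': 'Raj', 'Age': 21, 'Marks': 97},
    {'Name': 'Rahul', 'Age': 23, 'Marks': 79},
    {'Name': 'Manish', 'Age': 23, 'Marks': 86},
]
df = pd.DataFrame(list_of_dict, index=['a', 'b', 'c', 'd'])

# Select a whole row by its label
row_b = df.loc['b']
print(row_b)

# Select a single cell by row label and column name
name_d = df.loc['d', 'Name']
print(name_d)
```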

Method 3-Create Dataframe from list of dictionaries with changed order of columns

With pandas.DataFrame() we can easily arrange the order of the columns by passing a list of column names in the columns parameter, in the order in which we want them displayed in our dataframe. Let us see this with the help of an example.

import pandas as pd
import numpy as np

list_of_dict = [
    {'Name': 'Mayank' ,  'Age': 25,  'Marks': 91},
    {'Name': 'Raj',  'Age': 21,  'Marks': 97},
    {'Name': 'Rahul',  'Age': 23,  'Marks': 79},
    {'Name': 'Manish' ,  'Age': 23,  'Marks': 86},
]
#create dataframe
df=pd.DataFrame(list_of_dict,columns=['Name', 'Marks', 'Age'])
print(df)

Output

     Name  Marks  Age
0  Mayank     91   25
1     Raj     97   21
2   Rahul     79   23
3  Manish     86   23

Here too a question may arise: what happens if we pass fewer columns, or more columns, in the columns parameter than the dictionaries contain? Let us see both cases.

Case 1: Less column in column parameter

In this case the columns that we don't pass are dropped from the dataframe. Let us see this with the help of an example.

import pandas as pd
import numpy as np

list_of_dict = [
    {'Name': 'Mayank' ,  'Age': 25,  'Marks': 91},
    {'Name': 'Raj',  'Age': 21,  'Marks': 97},
    {'Name': 'Rahul',  'Age': 23,  'Marks': 79},
    {'Name': 'Manish' ,  'Age': 23,  'Marks': 86},
]
#create dataframe
df=pd.DataFrame(list_of_dict,columns=['Name', 'Marks'])
print(df)

Output

     Name  Marks
0  Mayank     91
1     Raj     97
2   Rahul     79
3  Manish     86

Here we see that we didn't pass the Age column, which is why the Age column is not in our dataframe.

Case 2: More column in column parameter

In this case a new column is added to the dataframe, but all of its values are NaN. Let us see this with the help of an example.

import pandas as pd
import numpy as np

list_of_dict = [
    {'Name': 'Mayank' ,  'Age': 25,  'Marks': 91},
    {'Name': 'Raj',  'Age': 21,  'Marks': 97},
    {'Name': 'Rahul',  'Age': 23,  'Marks': 79},
    {'Name': 'Manish' ,  'Age': 23,  'Marks': 86},
]
#create dataframe
df=pd.DataFrame(list_of_dict,columns=['Name', 'Marks', 'Age','city'])
print(df)

Output

     Name  Marks  Age  city
0  Mayank     91   25   NaN
1     Raj     97   21   NaN
2   Rahul     79   23   NaN
3  Manish     86   23   NaN

So these are the methods to create dataframe from list of dictionary in pandas.


Pandas: Add Two Columns into a New Column in Dataframe

Python / By Mayank Gupta

Methods to add two columns into a new column in Dataframe

In this article, we discuss how to add two columns of a dataframe together to get a new series, and how to store the result as a new column in the dataframe using pandas. We will also discuss how to deal with NaN values.

Method 1-Sum two columns together to make a new series

In this method, we simply select two columns by their column names and add them. Let us see this with the help of an example.

import pandas as pd 
import numpy as np 
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, np.nan, 81) , 
            ('Abhay', 25,'Rajasthan' , 90) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n') 
total = df['Age'] + df['Marks']
print("New Series \n") 
print(total)
print(type(total))

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22        NaN     81
3  Abhay   25  Rajasthan     90
4  Ajjet   21      Delhi     74 

New Series 

0    119
1    118
2    103
3    115
4     95
dtype: int64
<class 'pandas.core.series.Series'>

Here we see that adding two columns produces a series.

Note: We can't add a string to an int or float. We can only add a string to a string or a number to a number.

Let us see an example of adding a string to a string.

import pandas as pd 
import numpy as np 
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 25,'Rajasthan' , 90) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n') 
total = df['Name'] + " "+df['City']
print("New Series \n") 
print(total)
print(type(total))

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   25  Rajasthan     90
4  Ajjet   21      Delhi     74 

New Series 

0         Raj Mumbai
1        Rahul Delhi
2       Aadi Kolkata
3    Abhay Rajasthan
4        Ajjet Delhi
dtype: object
<class 'pandas.core.series.Series'>
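
The note above about not mixing strings and numbers has a common workaround: convert the numeric column to strings first. The following is a minimal sketch, assuming a shortened version of the students data; the label format is made up for illustration.

```python
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', 81)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])

# df['Name'] + df['Age'] would fail because one column holds strings and the other ints.
# Converting Age to strings with astype(str) makes the concatenation valid.
label = df['Name'] + ' (' + df['Age'].astype(str) + ')'
print(label)
```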

Method 2-Sum two columns together having NaN values to make a new series

In the previous method there were no NaN or missing values, but in this case we have them. When we add two columns and either one contains NaN in a row, the result for that row is also NaN. Let us see this with the help of an example.

import pandas as pd 
import numpy as np 
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', np.nan) , 
            ('Abhay', np.nan,'Rajasthan' , 90) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n') 
total = df['Marks'] + df['Age']
print("New Series \n") 
print(total)
print(type(total))

Output

Original Dataframe

    Name   Age       City  Marks
0    Raj  24.0     Mumbai   95.0
1  Rahul  21.0      Delhi   97.0
2   Aadi  22.0    Kolkata    NaN
3  Abhay   NaN  Rajasthan   90.0
4  Ajjet  21.0      Delhi   74.0 

New Series 

0    119.0
1    118.0
2      NaN
3      NaN
4     95.0
dtype: float64
<class 'pandas.core.series.Series'>
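
If NaN results are not wanted, one option is the Series.add() method with its fill_value parameter, which treats a missing value on one side as the given number. This is a sketch of an alternative, not the approach used in this article, and treating a missing Age or Marks as 0 is an assumption that may not suit real data.

```python
import numpy as np
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', np.nan),
            ('Abhay', np.nan, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])

# A value missing on one side is replaced by 0 before adding.
# If both sides were missing, the result would still be NaN.
total = df['Marks'].add(df['Age'], fill_value=0)
print(total)
```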

Method 3-Add two columns to make a new column

We know that a dataframe is a collection of series. We saw that adding two columns gives us a series, which we stored in a variable. If we instead assign that series to a new column of the dataframe, our work is done. Let us see this with the help of an example.

import pandas as pd 
import numpy as np 
students = [('Raj', 24, 'Mumbai', 95) ,
            ('Rahul', 21, 'Delhi' , 97) ,
            ('Aadi', 22, 'Kolkata',76) ,
            ('Abhay',23,'Rajasthan' , 90) ,
            ('Ajjet', 21, 'Delhi' , 74)]
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n') 
df['total'] = df['Marks'] + df['Age']
print("New Dataframe \n") 
print(df)

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     76
3  Abhay   23  Rajasthan     90
4  Ajjet   21      Delhi     74 

New Dataframe 

    Name  Age       City  Marks  total
0    Raj   24     Mumbai     95    119
1  Rahul   21      Delhi     97    118
2   Aadi   22    Kolkata     76     98
3  Abhay   23  Rajasthan     90    113
4  Ajjet   21      Delhi     74     95
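
Assigning to df['total'] changes the dataframe in place. When the original should stay untouched, DataFrame.assign() returns a new dataframe with the extra column. A minimal sketch with the same data (the name new_df is illustrative):

```python
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', 76),
            ('Abhay', 23, 'Rajasthan', 90),
            ('Ajjet', 21, 'Delhi', 74)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])

# assign() returns a copy with the new column; df itself is not modified
new_df = df.assign(total=df['Marks'] + df['Age'])
print(new_df)
print(list(df.columns))
```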

Method 4-Add two columns with NaN values to make a new column

The same approach works with NaN values, and the NaN results appear in the new column. Let us see this with the help of an example.

import pandas as pd 
import numpy as np 
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', np.nan) , 
            ('Abhay', np.nan,'Rajasthan' , 90) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n') 
df['total'] = df['Marks'] + df['Age']
print("New Dataframe \n") 
print(df)

Output

Original Dataframe

    Name   Age       City  Marks
0    Raj  24.0     Mumbai   95.0
1  Rahul  21.0      Delhi   97.0
2   Aadi  22.0    Kolkata    NaN
3  Abhay   NaN  Rajasthan   90.0
4  Ajjet  21.0      Delhi   74.0 

New Dataframe 

    Name   Age       City  Marks  total
0    Raj  24.0     Mumbai   95.0  119.0
1  Rahul  21.0      Delhi   97.0  118.0
2   Aadi  22.0    Kolkata    NaN    NaN
3  Abhay   NaN  Rajasthan   90.0    NaN
4  Ajjet  21.0      Delhi   74.0   95.0

So these are the methods to add two columns in the dataframe.


Matplotlib: Line plot with markers

Python / By Mayank Gupta

Methods to draw line plot with markers with the help of Matplotlib

In this article, we will discuss some basics of matplotlib and then discuss how to draw line plots with markers.

Matplotlib

Data in the form of raw numbers is difficult and boring to analyze. If we convert those numbers into graphs, bar plots, pie charts, etc., the data becomes easier and more interesting to read. This is where the Matplotlib library of Python comes into use. Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

To use this library we first have to import it into the program. We can use either

from matplotlib import pyplot as plt or import matplotlib.pyplot as plt.

In this article, we only discuss the line plot. So let us see the function in matplotlib that draws a line plot.

syntax: plt.plot(x, y, scalex=True, scaley=True, data=None, marker='marker style', **kwargs)

Parameters

x, y: The horizontal and vertical coordinates of the data points.
scalex, scaley: These parameters determine whether the view limits are adapted to the data limits. The default value is True.
marker: The marker style drawn at each data point, such as a point marker or a circle marker.

Here is the list of markers that can be used:

'.' point marker
',' pixel marker
'o' circle marker
'v' triangle_down marker
'^' triangle_up marker
'<' triangle_left marker
'>' triangle_right marker
'1' tri_down marker
'2' tri_up marker
'3' tri_left marker
'4' tri_right marker
's' square marker
'p' pentagon marker
'*' star marker
'h' hexagon1 marker
'H' hexagon2 marker
'+' plus marker
'x' x marker
'D' diamond marker
'd' thin_diamond marker
'|' vline marker
'_' hline marker
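
To compare a few of these marker styles side by side, they can be drawn in a loop. The following is a minimal sketch: the selection of markers, the vertical offsets and the output file name marker_styles.png are all assumptions made for illustration. It uses the non-interactive Agg backend so that it also runs without a display; in an interactive session, plt.show() can be used instead of saving.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, works without a display
import matplotlib.pyplot as plt
import numpy as np

markers = ['.', 'o', '^', 's', '*', 'D']
x = np.arange(0, 10)

fig, ax = plt.subplots()
for offset, marker in enumerate(markers):
    # Shift each line upwards so the markers do not overlap
    ax.plot(x, x + 3 * offset, marker=marker, label=f"marker='{marker}'")
ax.legend()
ax.set_title('A few marker styles')

# Hypothetical output file name
fig.savefig('marker_styles.png')
```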

Examples of Line plot with markers in matplotlib

Line Plot with the Point marker

Here we use marker='.'. Let us see this with the help of an example.

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-5,40,.5)
y = np.sin(x)
plt.plot(x,y, marker='.')
plt.title('Sin Function')
plt.xlabel('x values')
plt.ylabel('y= sin(x)')
plt.show()

Output

Line Plot with the Point marker and give marker some color

In the above example, the color of the marker is the same as the color of the line. The plt.plot() function accepts the markerfacecolor (or mfc) attribute, which sets the fill color of the marker. Let us see this with the help of an example.

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-5,40,.5)
y = np.sin(x)
plt.plot(x,y, marker='.',mfc='red')
plt.title('Sin Function')
plt.xlabel('x values')
plt.ylabel('y= sin(x)')
plt.show()

Output

Here we see that the color of the marker changes to red.

Line Plot with the Point marker and change the size of the marker

To change the size of the marker, plt.plot() accepts the markersize (or ms) attribute. We pass a number in ms and the marker grows or shrinks accordingly. Let us see this with the help of an example.

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-5,40,.5)
y = np.sin(x)
plt.plot(x,y, marker='.',mfc='red',ms=17)
plt.title('Sin Function')
plt.xlabel('x values')
plt.ylabel('y= sin(x)')
plt.show()

Output

Here we see that the size of the marker changes.

Line Plot with the Point marker and change the color of the edge of the marker

We can also change the color of the edge of marker with the help of markeredgecolor or mec attribute. Let see this with the help of an example.

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-5,40,.5)
y = np.sin(x)
plt.plot(x,y, marker='.',mfc='red',ms=17, mec='yellow')
plt.title('Sin Function')
plt.xlabel('x values')
plt.ylabel('y= sin(x)')
plt.show()

Output

Here we see that the color of the edge of the marker changes to yellow.

So here are some examples of how we can work with markers in line plots.

Note: These examples apply to any of the markers listed above.
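
The short names mfc, ms and mec used above have longer equivalents that can make code easier to read. The following is a minimal sketch with a circle marker and the long keyword names; the marker, the size 8 and the file name styled_markers.png are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, works without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-5, 40, .5)
y = np.sin(x)

fig, ax = plt.subplots()
# markerfacecolor, markersize and markeredgecolor are the long forms of mfc, ms and mec
(line,) = ax.plot(x, y, marker='o', markerfacecolor='red',
                  markersize=8, markeredgecolor='yellow')
ax.set_title('Sin Function')
ax.set_xlabel('x values')
ax.set_ylabel('y= sin(x)')

# Hypothetical output file name
fig.savefig('styled_markers.png')
```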


Read Csv File to Dataframe With Custom Delimiter in Python

Python / By Mayank Gupta

Different methods to read CSV files with custom delimiter in python

In this article, we will see what CSV files are, how to use them in pandas, and then how and why to use a custom delimiter with CSV files in pandas.

CSV file

A simple way to store big data sets is to use CSV (comma-separated values) files. CSV files contain plain text in a well-known format that can be read by almost any tool, including pandas. Generally, CSV files contain columns separated by commas, but they can also contain content separated by a tab, an underscore, a hyphen, etc. Generally, CSV files look like this:

total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.5,Male,No,Sun,Dinner,3
23.68,3.31,Male,No,Sun,Dinner,2
24.59,3.61,Female,No,Sun,Dinner,4

Here we see different columns and their values are separated by commas.

Use CSV file in pandas

The read_csv() method is used to import and read CSV files in pandas. It returns a dataframe, so we can apply any dataframe operation to the loaded data.

syntax: pandas.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, ...)

',' is the default separator in the read_csv() method.

Let us see this with an example.

import pandas as pd
data=pd.read_csv('example1.csv')
data.head()

Output

	total_bill	tip	sex	smoker	day	time	size
0	16.99	1.01	Female	No	Sun	Dinner	2
1	10.34	1.66	Male	No	Sun	Dinner	3
2	21.01	3.50	Male	No	Sun	Dinner	3
3	23.68	3.31	Male	No	Sun	Dinner	2
4	24.59	3.61	Female	No	Sun	Dinner	4

Why use separator or delimiter with read_csv() method

So far we have seen that CSV files usually contain comma-separated data, but sometimes the data is separated by a tab, a hyphen, or another character. To handle this we pass a separator. Let us understand this with the help of an example. Suppose we have a CSV file separated by underscores and we try to read it with the default separator, i.e. a comma. Let us see what happens in this case.

"total_bill"_tip_sex_smoker_day_time_size
16.99_1.01_Female_No_Sun_Dinner_2
10.34_1.66_Male_No_Sun_Dinner_3
21.01_3.5_Male_No_Sun_Dinner_3
23.68_3.31_Male_No_Sun_Dinner_2
24.59_3.61_Female_No_Sun_Dinner_4
25.29_4.71_Male_No_Sun_Dinner_4
8.77_2_Male_No_Sun_Dinner_2

Suppose this is our CSV file, separated by underscores. Reading it with the default separator gives:

	total_bill_tip_sex_smoker_day_time_size
0	16.99_1.01_Female_No_Sun_Dinner_2
1	10.34_1.66_Male_No_Sun_Dinner_3
2	21.01_3.5_Male_No_Sun_Dinner_3
3	23.68_3.31_Male_No_Sun_Dinner_2
4	24.59_3.61_Female_No_Sun_Dinner_4

Because the file was read with the default separator, every row ended up in a single column. To solve this we pass the separator explicitly. When we set the separator to an underscore, we get the same data properly split into columns.

import pandas as pd 
data=pd.read_csv('example2.csv',sep = '_',engine = 'python') 
data.head()

Output

	total_bill	tip	sex	smoker	day	time	size
0	16.99	1.01	Female	No	Sun	Dinner	2
1	10.34	1.66	Male	No	Sun	Dinner	3
2	21.01	3.50	Male	No	Sun	Dinner	3
3	23.68	3.31	Male	No	Sun	Dinner	2
4	24.59	3.61	Female	No	Sun	Dinner	4

This example shows why we need a separator or delimiter in pandas when working with a CSV file.

Now suppose there is a CSV file in which the data is separated by multiple separators. For example:
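
The difference between the default separator and sep='_' can be reproduced without any file on disk. The following is a minimal sketch that feeds the first rows of the sample through io.StringIO; the in-memory text stands in for example2.csv and is an assumption made so that the snippet is self-contained.

```python
import io
import pandas as pd

# First rows of the underscore-separated sample. The header keeps
# "total_bill" in quotes so that its own underscore is not treated as a separator.
raw = (
    '"total_bill"_tip_sex_smoker_day_time_size\n'
    '16.99_1.01_Female_No_Sun_Dinner_2\n'
    '10.34_1.66_Male_No_Sun_Dinner_3\n'
)

# Default separator (comma): everything lands in a single column
default_df = pd.read_csv(io.StringIO(raw))
print(default_df.shape)

# Underscore separator: the data is split into seven columns
underscore_df = pd.read_csv(io.StringIO(raw), sep='_')
print(underscore_df.shape)
print(list(underscore_df.columns))
```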

totalbill_tip,sex:smoker,day_time,size
16.99,1.01:Female|No,Sun,Dinner,2
10.34,1.66,Male,No|Sun:Dinner,3
21.01:3.5_Male,No:Sun,Dinner,3
23.68,3.31,Male|No,Sun_Dinner,2
24.59:3.61,Female_No,Sun,Dinner,4
25.29,4.71|Male,No:Sun,Dinner,4

Here we see that multiple separators are used, so no single-character delimiter can split the file. To solve this problem we pass a regular expression (regex) as the separator. Let us see this with the help of an example.

import pandas as pd 
data=pd.read_csv('example4.csv',sep = '[:, |_]',engine = 'python') 
data.head()

Output

	totalbill	tip	sex	smoker	day	time	size
0	16.99	1.01	Female	No	Sun	Dinner	2
1	10.34	1.66	Male	No	Sun	Dinner	3
2	21.01	3.50	Male	No	Sun	Dinner	3
3	23.68	3.31	Male	No	Sun	Dinner	2
4	24.59	3.61	Female	No	Sun	Dinner	4

Notice that we pass a regular-expression character class in the sep parameter, listing every separator that appears in our CSV file. pandas handles regex separators with its Python parsing engine.
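
The regex separator can also be tried on in-memory text. The following is a minimal sketch using the first three data rows of the sample above; the io.StringIO buffer stands in for example4.csv, and the character class here omits the space because the sample contains none. Passing engine='python' explicitly avoids the warning pandas emits when it has to fall back from its default C parser.

```python
import io
import pandas as pd

raw = (
    'totalbill_tip,sex:smoker,day_time,size\n'
    '16.99,1.01:Female|No,Sun,Dinner,2\n'
    '10.34,1.66,Male,No|Sun:Dinner,3\n'
    '21.01:3.5_Male,No:Sun,Dinner,3\n'
)

# The character class matches any one of : , | _ as a separator
data = pd.read_csv(io.StringIO(raw), sep=r'[:,|_]', engine='python')
print(data)
```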


Get Rows And Columns Names In Dataframe Using Python

Python / By Mayank Gupta

Methods to get rows and columns names in dataframe

In this article we will study different methods to get row and column names in a dataframe.

Methods to get column name in dataframe

Method 1: By iterating over columns

In this method, we simply iterate over all the columns and print the name of each one. Remember that dataframe_name.columns gives an Index object holding the column names. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print(df.columns,'\n')
print("columns are:")
for column in df.columns:
  print(column,end=" ")

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

Index(['Name', 'Age', 'City', 'Marks'], dtype='object') 

columns are:
Name Age City Marks

Here we see that df.columns gives the column labels, and by iterating over it we can easily get the column names.

Method 2-Using columns.values

columns.values returns an array of column names. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("columns are:")
print(df.columns.values,'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

columns are:
['Name' 'Age' 'City' 'Marks']

Method 3- using tolist() method

Calling the tolist() method on columns.values gives a list of column names. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("columns are:")
print(df.columns.values.tolist(),'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

columns are:
['Name', 'Age', 'City', 'Marks']

Method 4- Access specific column name using index

As we know, columns.values gives an array of column names, and we can access array elements using an index. In this method we use that idea. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("columns at second index:")
print(df.columns.values[2],'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

columns at second index:
City
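
The Index object returned by df.columns can also be indexed directly, and it supports the reverse lookup from a name to its position. A minimal sketch with the same data (the variable names are illustrative):

```python
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', 81),
            ('Abhay', 24, 'Rajasthan', 76),
            ('Ajjet', 21, 'Delhi', 74)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Marks'])

# Index the column labels directly, without going through .values
third = df.columns[2]
print(third)

# Reverse lookup: position of a column given its name
position = df.columns.get_loc('City')
print(position)

# list() is another way to get a plain Python list of column names
names = list(df.columns)
print(names)
```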

So these are the methods to get column names.

Method to get rows name in dataframe

Method 1-Using index.values

Just as columns.values gives an array of column names, index.values gives an array of row indexes. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("Rows are:")
print(df.index.values,'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

Rows are:
[0 1 2 3 4]

Method 2- Get Row name at a specific index

As we know, index.values gives an array of indexes, and we can access array elements using an index. In this method we use that idea. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) ,
            ('Aadi', 22, 'Kolkata', 81) ,
            ('Abhay', 24,'Rajasthan' ,76) ,
            ('Ajjet', 21, 'Delhi' , 74)]
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("Row at index 2:")
print(df.index.values[2],'\n')

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

Row at index 2:
2

Method 3-By iterating over indices

Just as dataframe_name.columns gives the column labels, dataframe_name.index gives the row labels. Hence we can simply iterate over the index and print the row names. Let us see this with the help of an example.

import pandas as pd
import numpy as np
students = [('Raj', 24, 'Mumbai', 95) , 
            ('Rahul', 21, 'Delhi' , 97) , 
            ('Aadi', 22, 'Kolkata', 81) , 
            ('Abhay', 24,'Rajasthan' ,76) , 
            ('Ajjet', 21, 'Delhi' , 74)] 
# Create a DataFrame object 
df = pd.DataFrame( students, columns=['Name', 'Age', 'City', 'Marks']) 
print("Original Dataframe\n") 
print(df,'\n')
print("List of indexes:")
print(df.index,'\n')
print("Indexes or rows names are:")
for row in df.index:
  print(row,end=" ")

Output

Original Dataframe

    Name  Age       City  Marks
0    Raj   24     Mumbai     95
1  Rahul   21      Delhi     97
2   Aadi   22    Kolkata     81
3  Abhay   24  Rajasthan     76
4  Ajjet   21      Delhi     74 

List of indexes:
RangeIndex(start=0, stop=5, step=1) 

Indexes or rows names are:
0 1 2 3 4
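
With the default index the row names are just positions. The same techniques become more informative when the dataframe has custom row labels. The following is a minimal sketch, assuming the labels 'a' to 'e', which are made up for illustration.

```python
import pandas as pd

students = [('Raj', 24, 'Mumbai', 95),
            ('Rahul', 21, 'Delhi', 97),
            ('Aadi', 22, 'Kolkata', 81),
            ('Abhay', 24, 'Rajasthan', 76),
            ('Ajjet', 21, 'Delhi', 74)]
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Marks'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Row names as a plain Python list
row_names = df.index.tolist()
print(row_names)

# iterrows() yields each row label together with the row itself
for label, row in df.iterrows():
    print(label, row['Name'])
```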

So these are the methods to get rows and column names in the dataframe using python.
