Converting a DataFrame into a list of rows or columns in Python (list of lists)
In this article, we will discuss how to convert a dataframe into a list of lists, by converting each row or column into a list and collecting those lists in a Python list.
Let’s first create a dataframe,
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])
print(studentId)

Output :
     Name  Age       City  Score
0    Arun   23    Chennai  127.0
1   Priya   31      Delhi  174.5
2   Ritik   24     Mumbai  181.0
3   Kimun   37  Hyderabad  125.0
4  Sinvee   16      Delhi  175.5
5    Kunu   28     Mumbai  115.0
6    Lisa   31       Pune  191.0
Convert a Dataframe into a list of lists – Row Wise :
From the dataframe created above, we want to fetch each row as a list and create a list of these lists.
Let’s see how we can do this,
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Convert the dataframe to a list of rows (list of lists)
listOfRows = studentId.to_numpy().tolist()
print(listOfRows)
print(type(listOfRows))

Output :
[['Arun', 23, 'Chennai', 127.0], ['Priya', 31, 'Delhi', 174.5], ['Ritik', 24, 'Mumbai', 181.0], ['Kimun', 37, 'Hyderabad', 125.0], ['Sinvee', 16, 'Delhi', 175.5], ['Kunu', 28, 'Mumbai', 115.0], ['Lisa', 31, 'Pune', 191.0]]
<class 'list'>
It converted the dataframe into a list of lists, that is, each nested list contains one row of the dataframe. But what happened in that single line?
How did it work?
Let’s break that single line into several steps to understand the concept behind it.
Step 1: Convert the Dataframe to a nested Numpy array using DataFrame.to_numpy() :
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Get the rows of the dataframe as a nested numpy array
numpy_2d_array = studentId.to_numpy()
print(numpy_2d_array)
print(type(numpy_2d_array))

Output :
[['Arun' 23 'Chennai' 127.0]
 ['Priya' 31 'Delhi' 174.5]
 ['Ritik' 24 'Mumbai' 181.0]
 ['Kimun' 37 'Hyderabad' 125.0]
 ['Sinvee' 16 'Delhi' 175.5]
 ['Kunu' 28 'Mumbai' 115.0]
 ['Lisa' 31 'Pune' 191.0]]
<class 'numpy.ndarray'>
DataFrame.to_numpy() converts the dataframe into a Numpy array, so we have a 2D Numpy array here. We confirmed that by printing the type of the returned object.
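One detail worth knowing at this step: because the columns hold different types, the array returned by to_numpy() has to fall back to the generic object dtype. The following is a minimal sketch of that behaviour; the small two-row dataframe and the variable names are made up for illustration and are not part of the example above.

```python
import pandas as pd

# Hypothetical two-row dataframe with mixed column types
df = pd.DataFrame([('Arun', 23, 127.0), ('Priya', 31, 174.5)],
                  columns=['Name', 'Age', 'Score'])

# Strings and numbers together force the generic object dtype
mixed_array = df.to_numpy()
print(mixed_array.dtype)

# Selecting only the numeric columns gives a float64 array,
# because int64 and float64 share float64 as a common type
numeric_array = df[['Age', 'Score']].to_numpy()
print(numeric_array.dtype)
```

This is why the integers in the Age column stay integers in the mixed array, while in a numeric-only selection they would be converted to floats.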
Step 2: Convert 2D Numpy array into a list of lists :
Numpy arrays provide a method tolist(), which converts the array into a (possibly nested) Python list. Let’s call that method on the 2D Numpy array created above,
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Get the rows of the dataframe as a nested numpy array
numpy_2d_array = studentId.to_numpy()

# Convert the 2D numpy array to a list of lists
listOfRows = numpy_2d_array.tolist()
print(listOfRows)
print(type(listOfRows))

Output :
[['Arun', 23, 'Chennai', 127.0], ['Priya', 31, 'Delhi', 174.5], ['Ritik', 24, 'Mumbai', 181.0], ['Kimun', 37, 'Hyderabad', 125.0], ['Sinvee', 16, 'Delhi', 175.5], ['Kunu', 28, 'Mumbai', 115.0], ['Lisa', 31, 'Pune', 191.0]]
<class 'list'>
It converted the 2D Numpy array into a list of lists.
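To see why tolist() is used here rather than the built-in list(), the short sketch below compares the two on a small array. The array and the variable names are hypothetical and only serve the illustration.

```python
import numpy as np

# Small hypothetical 2D array
arr = np.array([[1, 2], [3, 4]])

# tolist() converts every level: a list of lists of Python ints
nested = arr.tolist()
print(nested)
print(type(nested[0]), type(nested[0][0]))

# list() only converts the outer level: a list of 1D numpy arrays
shallow = list(arr)
print(type(shallow[0]))
```

So tolist() is the call that produces a true list of lists; list() would leave each row as a Numpy array.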
So, this is how we converted the dataframe to a 2D Numpy array and then to a list of lists, where each nested list represents one row of the dataframe.
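As a cross-check, the same list of rows can also be built without going through Numpy, by iterating over the rows with DataFrame.itertuples(). This is only an alternative sketch on a shortened, two-row version of the data; it is not the approach used in this article, and the variable names are made up.

```python
import pandas as pd

# Shortened version of the students data, for illustration only
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Approach from this article: dataframe -> 2D numpy array -> list of lists
rows_via_numpy = df.to_numpy().tolist()

# Alternative: iterate over the rows as plain tuples and turn each into a list
rows_via_itertuples = [list(row) for row in df.itertuples(index=False, name=None)]

print(rows_via_numpy)
print(rows_via_numpy == rows_via_itertuples)
```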
Convert a Dataframe into a list of lists – Column Wise :
Now turn each column into a list and create a list of these lists,
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Convert the dataframe to a list of columns, i.e. a list of lists
listOfColumns = studentId.transpose().to_numpy().tolist()
print(listOfColumns)
print(type(listOfColumns))

Output :
[['Arun', 'Priya', 'Ritik', 'Kimun', 'Sinvee', 'Kunu', 'Lisa'], [23, 31, 24, 37, 16, 28, 31], ['Chennai', 'Delhi', 'Mumbai', 'Hyderabad', 'Delhi', 'Mumbai', 'Pune'], [127.0, 174.5, 181.0, 125.0, 175.5, 115.0, 191.0]]
<class 'list'>
How did it work?
It works on the same concept we discussed above, with one extra step at the start: transposing the dataframe.
Step 1: Transpose the dataframe to convert rows as columns and columns as rows :
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Transposing the dataframe, rows are now columns and columns are now rows
transposedObj = studentId.transpose()
print(transposedObj)

Output :
             0      1       2          3       4       5      6
Name      Arun  Priya   Ritik      Kimun  Sinvee    Kunu   Lisa
Age         23     31      24         37      16      28     31
City   Chennai  Delhi  Mumbai  Hyderabad   Delhi  Mumbai   Pune
Score    127.0  174.5   181.0      125.0   175.5   115.0  191.0
transposedObj is the transpose of the original dataframe, i.e. rows in studentId are columns in transposedObj and columns in studentId are rows in transposedObj.
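The effect of the transpose on the labels can be checked directly. The sketch below uses a small hypothetical dataframe (not the one from this article) to show that the old column names become the index, the old index becomes the column labels, and that every transposed column ends up with the object dtype because it now mixes strings and numbers.

```python
import pandas as pd

# Hypothetical two-row dataframe with mixed column types
df = pd.DataFrame([('Arun', 23, 127.0), ('Priya', 31, 174.5)],
                  columns=['Name', 'Age', 'Score'])

# transpose() swaps rows and columns; df.T is a shorthand for the same call
transposed = df.transpose()

print(transposed.shape)          # rows and columns are swapped
print(list(transposed.index))    # old column names
print(list(transposed.columns))  # old row labels
print(transposed.dtypes)         # each column now holds mixed values
```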
Step 2: Convert the Dataframe to a nested Numpy array using DataFrame.to_numpy() :
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Transposing the dataframe, rows are now columns and columns are now rows
transposedObj = studentId.transpose()

# Get the rows of the transposed dataframe as a nested numpy array
numpy_2d_array = transposedObj.to_numpy()
print(numpy_2d_array)
print(type(numpy_2d_array))

Output :
[['Arun' 'Priya' 'Ritik' 'Kimun' 'Sinvee' 'Kunu' 'Lisa']
 [23 31 24 37 16 28 31]
 ['Chennai' 'Delhi' 'Mumbai' 'Hyderabad' 'Delhi' 'Mumbai' 'Pune']
 [127.0 174.5 181.0 125.0 175.5 115.0 191.0]]
<class 'numpy.ndarray'>
Step 3: Convert 2D Numpy array into a list of lists :
import pandas as pd

# The list of tuples
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5),
            ('Ritik', 24, 'Mumbai', 181),
            ('Kimun', 37, 'Hyderabad', 125),
            ('Sinvee', 16, 'Delhi', 175.5),
            ('Kunu', 28, 'Mumbai', 115),
            ('Lisa', 31, 'Pune', 191)]

# Creating DataFrame object
studentId = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Transposing the dataframe, rows are now columns and columns are now rows
transposedObj = studentId.transpose()

# Get the rows of the transposed dataframe as a nested numpy array
numpy_2d_array = transposedObj.to_numpy()

# Convert the 2D numpy array to a list of lists, one list per original column
listOfColumns = numpy_2d_array.tolist()
print(listOfColumns)
print(type(listOfColumns))

Output :
[['Arun', 'Priya', 'Ritik', 'Kimun', 'Sinvee', 'Kunu', 'Lisa'], [23, 31, 24, 37, 16, 28, 31], ['Chennai', 'Delhi', 'Mumbai', 'Hyderabad', 'Delhi', 'Mumbai', 'Pune'], [127.0, 174.5, 181.0, 125.0, 175.5, 115.0, 191.0]]
<class 'list'>
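For comparison, the column lists can also be collected without transposing, by calling tolist() on each column (a pandas Series) in turn. This is an alternative sketch rather than the method walked through above; it uses a shortened, two-row version of the data and made-up variable names.

```python
import pandas as pd

# Shortened version of the students data, for illustration only
students = [('Arun', 23, 'Chennai', 127),
            ('Priya', 31, 'Delhi', 174.5)]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Score'])

# Approach from this article: transpose -> 2D numpy array -> list of lists
cols_via_transpose = df.transpose().to_numpy().tolist()

# Alternative: convert each column (a Series) to a list, one at a time
cols_via_series = [df[col].tolist() for col in df.columns]

print(cols_via_series)
print(cols_via_transpose == cols_via_series)
```

The per-column version never builds the transposed dataframe, which may matter for large dataframes, while the transpose version keeps the whole conversion in a single expression.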