How to Convert Python Pandas DataFrame into a List

There are scenarios where you need to convert a Pandas DataFrame to a Python list.

I will be using the College.csv data, which has details about university admissions.

Let's start by importing the pandas library and reading the CSV file with read_csv.

import pandas as pd

df = pd.read_csv('College.csv')

df.head(1)

This data has too many columns for this exercise, so let's drop all but three.

We will keep only the columns Private, Apps, and Accept from the dataframe above.

dfn = df[['Private','Apps','Accept']]

Let's check how many rows and columns this dataframe has using pd.DataFrame.shape.

dfn.shape

(777, 3)

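OK, let's select just the first 5 rows from our dataframe. Note that loc slices by label and includes the end label, so :4 returns the rows labeled 0 through 4. Check out the tutorial Select Pandas Dataframe Rows And Columns Using iloc loc and ix.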

df5r = dfn.loc[:4,:]

df5r.shape

(5, 3)
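Because loc works on labels, the slice above only means "first five rows" while the index is the default 0, 1, 2, and so on. The sketch below compares it with the positional alternatives iloc and head. The demo frame and its values are hypothetical stand-ins, not the College data.

```python
import pandas as pd

# Hypothetical stand-in for dfn: eight rows on the default RangeIndex.
demo = pd.DataFrame({"Apps": [10, 20, 30, 40, 50, 60, 70, 80]})

by_label = demo.loc[:4, :]      # label slice: end label 4 is included
by_position = demo.iloc[:5, :]  # positional slice: end position 5 is excluded
first_five = demo.head(5)       # same rows as the positional slice

print(by_label.shape, by_position.shape, first_five.shape)

# With an index that starts at 1, the label slice stops at label 4,
# which is only the fourth row.
shifted = demo.copy()
shifted.index = range(1, 9)
print(shifted.loc[:4, :].shape)
print(shifted.iloc[:5, :].shape)
```

If the goal is "the first n rows" regardless of how the index is labeled, iloc or head is the safer choice.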

Remember that pd.DataFrame.size gives you the total number of cells in the dataframe, that is, rows times columns.

So we got the first 5 rows and 3 columns, which should give a size of 15.

df5r.size

15
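As a quick check of the relationship between shape and size, here is a small sketch. It rebuilds the five rows by hand (copied from the output shown in this post) so it runs without the CSV file.

```python
import pandas as pd

# The five rows used in this post, rebuilt by hand so no CSV is needed.
df5r = pd.DataFrame(
    [
        ["Yes", 1660, 1232],
        ["Yes", 2186, 1924],
        ["Yes", 1428, 1097],
        ["Yes", 417, 349],
        ["Yes", 193, 146],
    ],
    columns=["Private", "Apps", "Accept"],
)

rows, cols = df5r.shape
print(rows, cols)                # number of rows and number of columns
print(rows * cols == df5r.size)  # size is rows times columns
print(len(df5r) == rows)         # len() counts rows only
```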

df5r.head()

Now we have our DataFrame in the desired shape. Let's proceed with converting the DataFrame to a list.

The command to convert a DataFrame to a list is pd.DataFrame.values.tolist(). Let's go step by step and get the values first.

df5r.values

array([['Yes', 1660, 1232],
       ['Yes', 2186, 1924],
       ['Yes', 1428, 1097],
       ['Yes', 417, 349],
       ['Yes', 193, 146]], dtype=object)

Note that DataFrame.values gives us a NumPy array. To convert it to a list, use tolist().

Let's call tolist() on the values.

df5r.values.tolist()

[['Yes', 1660, 1232],
 ['Yes', 2186, 1924],
 ['Yes', 1428, 1097],
 ['Yes', 417, 349],
 ['Yes', 193, 146]]
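Two related variants may be useful here. The pandas documentation recommends DataFrame.to_numpy() over .values in newer code, and a single column can be converted directly with Series.tolist(). The sketch below rebuilds the five rows by hand so it runs on its own.

```python
import pandas as pd

# The five rows used in this post, rebuilt by hand so no CSV is needed.
df5r = pd.DataFrame(
    [
        ["Yes", 1660, 1232],
        ["Yes", 2186, 1924],
        ["Yes", 1428, 1097],
        ["Yes", 417, 349],
        ["Yes", 193, 146],
    ],
    columns=["Private", "Apps", "Accept"],
)

rows_a = df5r.values.tolist()      # the approach used in this post
rows_b = df5r.to_numpy().tolist()  # same result via to_numpy()
print(rows_a == rows_b)

# One column as a flat list instead of a list of lists.
apps = df5r["Apps"].tolist()
print(apps)
print(type(apps[0]))  # tolist() hands back plain Python scalars
```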

So we get a list of lists. We can loop through it like any normal Python list. Let's try that.

for l in df5r.values.tolist():
    print(l)

['Yes', 1660, 1232]
['Yes', 2186, 1924]
['Yes', 1428, 1097]
['Yes', 417, 349]
['Yes', 193, 146]

OK, that is good. But notice we lost the column names. How do we retain the column names when using the values.tolist() method?

It is very simple. We will use Python's zip function. Let's see how we can do this.

Let's first save the column names to a separate list.

cnames = df5r.columns.values.tolist()

Let's also save the row values to a variable.

cvalues = df5r.values.tolist()

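OK, now we have our two lists, and we can pair them with zip as shown below. Keep in mind that zip pairs each column name with a row, not with that column's values, and it stops at the shorter list, so only the first three rows are printed.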

for c,v in zip(cnames,cvalues):
    print(c,v)

Private ['Yes', 1660, 1232]
Apps ['Yes', 2186, 1924]
Accept ['Yes', 1428, 1097]

Let's join each row into a single string so it prints more cleanly.

for c,value in zip(cnames,cvalues):
    print(c, "-"," ".join(str(v) for v in value))

Private - Yes 1660 1232
Apps - Yes 2186 1924
Accept - Yes 1428 1097
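The zip loop above lines up each column name with a row. If what you want is each column name next to that column's own values, one option is to transpose the rows first with zip(*cvalues), or to ask pandas for a column-oriented dict with to_dict("list"). A minimal sketch, with the five rows rebuilt by hand:

```python
import pandas as pd

# The five rows used in this post, rebuilt by hand so no CSV is needed.
df5r = pd.DataFrame(
    [
        ["Yes", 1660, 1232],
        ["Yes", 2186, 1924],
        ["Yes", 1428, 1097],
        ["Yes", 417, 349],
        ["Yes", 193, 146],
    ],
    columns=["Private", "Apps", "Accept"],
)

cnames = df5r.columns.tolist()
cvalues = df5r.values.tolist()

# zip(*rows) turns a list of rows into a list of columns.
columns_as_tuples = list(zip(*cvalues))
for name, col in zip(cnames, columns_as_tuples):
    print(name, "-", " ".join(str(v) for v in col))

# pandas can build the same name-to-values mapping directly.
by_column = df5r.to_dict("list")
print(by_column["Apps"])
```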

OK, so far so good. But there is a better way to retain the spreadsheet format: put the column names in front of the rows as a header. Let's try that.

final_list = [cnames] + cvalues

final_list

[['Private', 'Apps', 'Accept'],
 ['Yes', 1660, 1232],
 ['Yes', 2186, 1924],
 ['Yes', 1428, 1097],
 ['Yes', 417, 349],
 ['Yes', 193, 146]]
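If you would rather keep the column name attached to every value instead of storing a single header row, to_dict("records") returns a list with one dict per row. This is an alternative to the header-row approach, not a replacement for it. The sketch rebuilds the five rows by hand.

```python
import pandas as pd

# The five rows used in this post, rebuilt by hand so no CSV is needed.
df5r = pd.DataFrame(
    [
        ["Yes", 1660, 1232],
        ["Yes", 2186, 1924],
        ["Yes", 1428, 1097],
        ["Yes", 417, 349],
        ["Yes", 193, 146],
    ],
    columns=["Private", "Apps", "Accept"],
)

# One dict per row, keyed by column name.
records = df5r.to_dict("records")
for rec in records:
    print(rec["Private"], rec["Apps"], rec["Accept"])
```

The header-row list is handy when writing out a table, while the records form is easier to work with when you look values up by column name.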

Let's check the data type.

type(final_list)

list

It is still a Python list. Let's loop through the list again, this time with a format string that left-aligns each value in a 10-character column.

f = '{:<10}|{:<10}|{:<10}'
for l in final_list:
    print(f.format(*l))

Private   |Apps      |Accept    
Yes       |1660      |1232      
Yes       |2186      |1924      
Yes       |1428      |1097      
Yes       |417       |349       
Yes       |193       |146

There we go, it looks better now.
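The format string above is hard-coded for three columns of width 10. As a sketch of one way to generalize it, the code below derives the column count and width from the data. The names width, fmt, and lines are ours for illustration, and the five rows are rebuilt by hand.

```python
import pandas as pd

# The five rows used in this post, rebuilt by hand so no CSV is needed.
df5r = pd.DataFrame(
    [
        ["Yes", 1660, 1232],
        ["Yes", 2186, 1924],
        ["Yes", 1428, 1097],
        ["Yes", 417, 349],
        ["Yes", 193, 146],
    ],
    columns=["Private", "Apps", "Accept"],
)

final_list = [df5r.columns.tolist()] + df5r.values.tolist()

# Widest cell (as text) plus a little padding.
width = max(len(str(cell)) for row in final_list for cell in row) + 2

# One left-aligned slot per column, joined with "|".
fmt = "|".join(f"{{:<{width}}}" for _ in final_list[0])

lines = [fmt.format(*(str(cell) for cell in row)) for row in final_list]
for line in lines:
    print(line)
```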
