How to Convert Python Pandas DataFrame into a List
There are scenarios in which you need to convert a Pandas DataFrame to a Python list.
I will be using the College.csv data, which has details about university admissions.
Let's start by importing the pandas library and using read_csv to read the CSV file.
import pandas as pd
df = pd.read_csv('College.csv')
df.head(1)
There are too many columns in this data for this exercise, so let's drop all but three of them.
We will keep only the columns Private, Apps, and Accept from the DataFrame above.
dfn = df[['Private','Apps','Accept']]
Let's check how many rows this DataFrame has using pd.DataFrame.shape, which returns a (rows, columns) tuple.
dfn.shape
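OK, let's select just the first 5 rows of our DataFrame. Because loc slices by label and includes the end label, :4 returns rows 0 through 4 when the DataFrame has the default integer index. For more on selection, check out the tutorial Select Pandas Dataframe Rows And Columns Using iloc loc and ix.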
df5r = dfn.loc[:4,:]
df5r.shape
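The difference between label-based and position-based slicing is easy to trip over, so here is a minimal sketch of it. The small DataFrame is a made-up stand-in for the college data (the values are illustrative, not taken from College.csv), and it assumes the default integer index.

```python
import pandas as pd

# Hypothetical stand-in for the college data; the values are made up.
df = pd.DataFrame({
    "Private": ["Yes", "Yes", "No", "Yes", "No", "Yes", "No"],
    "Apps": [1660, 2186, 1428, 417, 193, 587, 353],
    "Accept": [1232, 1924, 1097, 349, 146, 479, 340],
})

by_label = df.loc[:4, :]      # label-based: includes the row labelled 4
by_position = df.iloc[:5, :]  # position-based: stops before position 5

print(by_label.shape)     # (5, 3)
print(by_position.shape)  # (5, 3)
print(by_label.equals(by_position))  # True
```

With a non-default index (for example after filtering or sorting), the two slices would generally not select the same rows, so iloc is usually the safer choice when you mean "the first five rows".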
So we got the first 5 rows and 3 columns.
Remember that pd.DataFrame.size gives the total number of elements in the DataFrame, that is, rows × columns.
df5r.size
df5r.head()
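As a quick check of how shape and size relate, here is a small sketch. The DataFrame is a hypothetical five-row stand-in with made-up values, not the actual college data.

```python
import pandas as pd

# Hypothetical five-row sample; the values are made up.
sample = pd.DataFrame({
    "Private": ["Yes", "Yes", "No", "Yes", "No"],
    "Apps": [1660, 2186, 1428, 417, 193],
    "Accept": [1232, 1924, 1097, 349, 146],
})

n_rows, n_cols = sample.shape  # shape is a (rows, columns) tuple
n_elements = sample.size       # size is the total number of cells

print(n_rows, n_cols, n_elements)  # 5 3 15
```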
Now we have our DataFrame in the desired shape. Let's proceed with the main topic of this tutorial, converting a DataFrame to a list.
The command to convert a DataFrame to a list is pd.DataFrame.values.tolist(). Let's go step by step and get the values first.
df5r.values
Note that DataFrame.values gives us a NumPy array object. To convert it to a list, use tolist().
Let's call tolist() on top of values.
df5r.values.tolist()
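The pandas documentation recommends DataFrame.to_numpy() over the older .values attribute, and for this use case both should give the same list of rows. The sketch below uses a hypothetical three-row DataFrame with made-up values to compare them.

```python
import pandas as pd

# Hypothetical sample; the values are made up.
sample = pd.DataFrame({
    "Private": ["Yes", "Yes", "No"],
    "Apps": [1660, 2186, 1428],
    "Accept": [1232, 1924, 1097],
})

rows_from_values = sample.values.tolist()        # approach used in this tutorial
rows_from_to_numpy = sample.to_numpy().tolist()  # alternative via to_numpy()

print(rows_from_values)
print(rows_from_values == rows_from_to_numpy)  # True
```

Each inner list is one row, in column order.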
So we get a list of lists, with one inner list per row. We can loop through it like any normal Python list. Let's try that.
for row in df5r.values.tolist():
    print(row)
OK, that is good. But notice we lost the column names. How do we retain the column names when using the values.tolist() method?
It is very simple. We will use Python's built-in zip function. Let's see how we can do this.
Let's first save the column names to a separate list.
cnames = df5r.columns.values.tolist()
Let's also save the row values to a variable.
cvalues = df5r.values.tolist()
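OK, we now have our two lists, and we can simply use zip as shown below. Note that zip pairs the first column name with the first row, the second name with the second row, and so on, and it stops after three pairs because there are only three column names.
for c, v in zip(cnames, cvalues):
    print(c, v)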
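If the goal is instead to pair each column name with that column's values, one option is to transpose the rows with zip(*rows), and another is DataFrame.to_dict("list"). The sketch below shows both on a hypothetical three-row DataFrame with made-up values; it is one possible approach, not the only one.

```python
import pandas as pd

# Hypothetical sample; the values are made up.
sample = pd.DataFrame({
    "Private": ["Yes", "Yes", "No"],
    "Apps": [1660, 2186, 1428],
    "Accept": [1232, 1924, 1097],
})

cnames = sample.columns.tolist()
cvalues = sample.values.tolist()

# zip(*cvalues) turns the list of rows into a sequence of columns.
columns_as_tuples = list(zip(*cvalues))
for name, column in zip(cnames, columns_as_tuples):
    print(name, column)

# pandas can also build the name -> values mapping directly.
by_column = sample.to_dict("list")
print(by_column["Apps"])
```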
Let's join the values of each row into a single string so the output reads better.
for c, value in zip(cnames, cvalues):
    print(c, "-", " ".join(str(v) for v in value))
OK, so far so good. But there is a better way to retain the spreadsheet format: put the column names in front of the rows as a header row. Let's try that.
final_list = [cnames] + cvalues
final_list
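Another way to keep the column names attached to the data is to convert each row to a dictionary with DataFrame.to_dict("records"). This is offered as an alternative sketch rather than as part of the approach above, and the DataFrame is a hypothetical sample with made-up values.

```python
import pandas as pd

# Hypothetical sample; the values are made up.
sample = pd.DataFrame({
    "Private": ["Yes", "No"],
    "Apps": [1660, 1428],
    "Accept": [1232, 1097],
})

# One dictionary per row, keyed by column name.
records = sample.to_dict("records")

for record in records:
    print(record["Private"], record["Apps"], record["Accept"])
```

The header-row list is closer to a spreadsheet layout, while the list of dictionaries is often more convenient when each row is processed by name.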
Let's check the data type.
type(final_list)
It is still a Python list. Let's loop through the list again, this time with a format string that left-aligns each value in a 10-character column.
f = '{:<10}|{:<10}|{:<10}'
for row in final_list:
    print(f.format(*row))
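The fixed width of 10 works for this data, but it has to be picked by hand. As a sketch of one alternative, the column widths can be computed from the data itself. The final_list below is a hypothetical example with made-up values in the same header-plus-rows layout.

```python
# Hypothetical header-plus-rows list; the values are made up.
final_list = [
    ["Private", "Apps", "Accept"],
    ["Yes", 1660, 1232],
    ["No", 1428, 1097],
]

# Width of each column = length of its longest cell when rendered as text.
n_columns = len(final_list[0])
widths = [max(len(str(row[i])) for row in final_list) for i in range(n_columns)]

# Left-justify every cell to its column width and join with a separator.
lines = [
    " | ".join(str(cell).ljust(width) for cell, width in zip(row, widths))
    for row in final_list
]

for line in lines:
    print(line)
```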
There we go, the output now lines up in columns under the header row.