{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1>How to Convert Python Pandas DataFrame into a List</h1>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are scenarios when you need to convert Pandas DataFrame to Python list."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I will be using college.csv data which has details about university admissions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets start with importing pandas library and read_csv to read the csv file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('College.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Unnamed: 0</th>\n",
       "      <th>Private</th>\n",
       "      <th>Apps</th>\n",
       "      <th>Accept</th>\n",
       "      <th>Enroll</th>\n",
       "      <th>Top10perc</th>\n",
       "      <th>Top25perc</th>\n",
       "      <th>F.Undergrad</th>\n",
       "      <th>P.Undergrad</th>\n",
       "      <th>Outstate</th>\n",
       "      <th>Room.Board</th>\n",
       "      <th>Books</th>\n",
       "      <th>Personal</th>\n",
       "      <th>PhD</th>\n",
       "      <th>Terminal</th>\n",
       "      <th>S.F.Ratio</th>\n",
       "      <th>perc.alumni</th>\n",
       "      <th>Expend</th>\n",
       "      <th>Grad.Rate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Abilene Christian University</td>\n",
       "      <td>Yes</td>\n",
       "      <td>1660</td>\n",
       "      <td>1232</td>\n",
       "      <td>721</td>\n",
       "      <td>23</td>\n",
       "      <td>52</td>\n",
       "      <td>2885</td>\n",
       "      <td>537</td>\n",
       "      <td>7440</td>\n",
       "      <td>3300</td>\n",
       "      <td>450</td>\n",
       "      <td>2200</td>\n",
       "      <td>70</td>\n",
       "      <td>78</td>\n",
       "      <td>18.1</td>\n",
       "      <td>12</td>\n",
       "      <td>7041</td>\n",
       "      <td>60</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     Unnamed: 0 Private  Apps  Accept  Enroll  Top10perc  \\\n",
       "0  Abilene Christian University     Yes  1660    1232     721         23   \n",
       "\n",
       "   Top25perc  F.Undergrad  P.Undergrad  Outstate  Room.Board  Books  Personal  \\\n",
       "0         52         2885          537      7440        3300    450      2200   \n",
       "\n",
       "   PhD  Terminal  S.F.Ratio  perc.alumni  Expend  Grad.Rate  \n",
       "0   70        78       18.1           12    7041         60  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the this exercise there are too many columns in this data. Lets just drop all but 3 columns from dataframe."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets just keep the columns Private, Apps, and Accept from the dataframe above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "dfn = df[['Private','Apps','Accept']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets check how many rows are there in this dataframe using pd.DataFrame.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(777, 3)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dfn.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p>Ok lets just select first 5 rows from our dataframe. checkout tutorial <a href=\"https://www.nbshare.io/notebook/87764859/Select-Pandas-Dataframe-Rows-And-Columns-Using-iloc-loc-and-ix/\">Select Pandas Dataframe Rows And Columns Using iloc loc and ix</a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "df5r = dfn.loc[:4,:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(5, 3)"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5r.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember pd.DataFrame.size will give you the size of the dataframe rowsxcolumns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So We got first 5 rows and 3 columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "15"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5r.size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Private</th>\n",
       "      <th>Apps</th>\n",
       "      <th>Accept</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Yes</td>\n",
       "      <td>1660</td>\n",
       "      <td>1232</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Yes</td>\n",
       "      <td>2186</td>\n",
       "      <td>1924</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Yes</td>\n",
       "      <td>1428</td>\n",
       "      <td>1097</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Yes</td>\n",
       "      <td>417</td>\n",
       "      <td>349</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Yes</td>\n",
       "      <td>193</td>\n",
       "      <td>146</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Private  Apps  Accept\n",
       "0     Yes  1660    1232\n",
       "1     Yes  2186    1924\n",
       "2     Yes  1428    1097\n",
       "3     Yes   417     349\n",
       "4     Yes   193     146"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5r.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we got our desired Dataframe in the desired shape. lets proceed with the our current tutorial of converting DataFrame to list."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The command to convert Dataframe to list is pd.DataFrame.values.tolist(). Lets go step by step. Lets get the values first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([['Yes', 1660, 1232],\n",
       "       ['Yes', 2186, 1924],\n",
       "       ['Yes', 1428, 1097],\n",
       "       ['Yes', 417, 349],\n",
       "       ['Yes', 193, 146]], dtype=object)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5r.values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Note DataFrame.values is giving us array object. To convert it to list use tolist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets try values.tolist() on top of it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['Yes', 1660, 1232],\n",
       " ['Yes', 2186, 1924],\n",
       " ['Yes', 1428, 1097],\n",
       " ['Yes', 417, 349],\n",
       " ['Yes', 193, 146]]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df5r.values.tolist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So we get list of lists. we can loop through it as any normal Python list. Lets try that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['Yes', 1660, 1232]\n",
      "['Yes', 2186, 1924]\n",
      "['Yes', 1428, 1097]\n",
      "['Yes', 417, 349]\n",
      "['Yes', 193, 146]\n"
     ]
    }
   ],
   "source": [
    "for l in df5r.values.tolist():\n",
    "    print(l)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok that is good. But notice we lost the column names. How do we retain the column names when using values.tolist() method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "It is very simple. We will use Pythons zip method.  Lets see how we can do this."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets first save the columns and save it to a seperate list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "cnames = df5r.columns.values.tolist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets also save our columns to a variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "cvalues = df5r.values.tolist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Ok we have now our two lists, we can simply use zip method as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Private ['Yes', 1660, 1232]\n",
      "Apps ['Yes', 2186, 1924]\n",
      "Accept ['Yes', 1428, 1097]\n"
     ]
    }
   ],
   "source": [
    "for c,v in zip(cnames,cvalues):\n",
    "    print(c,v)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets flatten the list so it appears better."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Private - Yes 1660 1232\n",
      "Apps - Yes 2186 1924\n",
      "Accept - Yes 1428 1097\n"
     ]
    }
   ],
   "source": [
    "for c,value in zip(cnames,cvalues):\n",
    "    print(c, \"-\",\" \".join(str(v) for v in value))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok so far so good. But there is better way to retain the spreadsheet format. Lets try that."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "final_list = [cnames] + cvalues"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['Private', 'Apps', 'Accept'],\n",
       " ['Yes', 1660, 1232],\n",
       " ['Yes', 2186, 1924],\n",
       " ['Yes', 1428, 1097],\n",
       " ['Yes', 417, 349],\n",
       " ['Yes', 193, 146]]"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "final_list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets check the data type."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "final_list.__class__()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is still a python list. Lets loop through the list again. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Private   |Apps      |Accept    \n",
      "Yes       |1660      |1232      \n",
      "Yes       |2186      |1924      \n",
      "Yes       |1428      |1097      \n",
      "Yes       |417       |349       \n",
      "Yes       |193       |146       \n"
     ]
    }
   ],
   "source": [
    "f = '{:<10}|{:<10}|{:<10}'\n",
    "for l in final_list:\n",
    "    print(f.format(*l))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There we go, it looks better now."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}