Select Pandas Dataframe Rows And Columns Using iloc loc and ix

In this post, I will talk about how to use Python library Pandas iloc, loc and ix functions to select rows and columns from csv and excel files

I will be using college.csv data which has details about university admissions.

Lets start with importing pandas library and read_csv to read the csv file

import pandas as pd

df = pd.read_csv('data/College.csv')

df.head(2)

How To Use dataframe loc To Select Rows

Lets check what df.loc actually used for, if you do df.loc?, you will find following documentation...

Access a group of rows and columns by label(s) or a boolean array.

.loc[] is primarily label based, but may also be used with a boolean array.

Lets try to select the columns by labels first. We will have to use double index [[]]

df.loc[['Apps']]

I got following error

KeyError: "None of [Index(['Apps'], dtype='object')] are in the [index]"

The reason for above error is that Apps is not an index. We will have to set the column names as index first. Lets see what is index set to right now.

df.index

RangeIndex(start=0, stop=777, step=1)

The index is set from 0 to 777 which is actually row numbers. Of course we can use current index to select the rows. Lets try that.

How To Select Row By Index Using Pandas loc

df.loc[0]

Unnamed: 0     Abilene Christian University
Private                                 Yes
Apps                                   1660
Accept                                 1232
Enroll                                  721
Top10perc                                23
Top25perc                                52
F.Undergrad                            2885
P.Undergrad                             537
Outstate                               7440
Room.Board                             3300
Books                                   450
Personal                               2200
PhD                                      70
Terminal                                 78
S.F.Ratio                              18.1
perc.alumni                              12
Expend                                 7041
Grad.Rate                                60
Name: 0, dtype: object

As We see, we got the first row. Lets add column University Name as index and see what happens. We will have to rename the University Name which is Unname:0 in dataframe.

df.rename(columns={'Unnamed: 0':'univ'},inplace=True)

df.set_index('univ',inplace=True)

df.head(2)

Lets Try to select the row by university name

df.loc['Abilene Christian University']

Private         Yes
Apps           1660
Accept         1232
Enroll          721
Top10perc        23
Top25perc        52
F.Undergrad    2885
P.Undergrad     537
Outstate       7440
Room.Board     3300
Books           450
Personal       2200
PhD              70
Terminal         78
S.F.Ratio      18.1
perc.alumni      12
Expend         7041
Grad.Rate        60
Name: Abilene Christian University, dtype: object

How do you remove the indexes. Use reset_index()

df = df.reset_index()

df.head(1)

How To Use dataframe loc To Select Columns

Lets set the university as index again. This time, we want to select a particular column data for a partiular row.

df.set_index('univ',inplace=True)

df.loc['Abilene Christian University',['Apps']]

Apps    1660
Name: Abilene Christian University, dtype: object

Lets say we want to select Apps column for two rows. Check the double indexes [[]] for rows

df.loc[['Abilene Christian University','Adelphi University'],['Apps']]

Lets say we want to print all the rows for column 'Apps'. Look out for syntax : in the below command, it means all the rows

df.loc[:,'Apps']

univ
Abilene Christian University       1660
Adelphi University                 2186
Adrian College                     1428
Agnes Scott College                 417
Alaska Pacific University           193
                                  ...  
Worcester State College            2197
Xavier University                  1959
Xavier University of Louisiana     2097
Yale University                   10705
York College of Pennsylvania       2989
Name: Apps, Length: 777, dtype: int64

How To Use Pandas Dataframe iloc

Pandas iloc can be used to select both rows and columns.

Python Select Row By Index Using Pandas iloc

We can give the rows a range, lets say we want to select first 2 rows and just print all the columns.

df.iloc[:2,:]

We can give it different index numbers. Print rows 1 ,4 and 5

df.iloc[[1,4,5],:]

Lets try the above command by the univ index names but for that you will have to use the loc command

df.loc[['Adelphi University','Alaska Pacific University','Albertson College'],:]

Python Select Column By Index Using Pandas iloc

Let us print the first two columns only.

df.iloc[:,[1,2]].head(2)

Note the indices we are using [1,2], that means column 1 and 2 only. We can combine the indexing on both rows and columns.

Example: Print first two rows from first two columns only without using head(2) method this time.

df.iloc[[1,2],[1,2]]

How To Use .ix in Pandas

ix is a hybrid of both loc and iloc. Meaning we can use ix in place of loc and .loc. Pandas .ix has been deprecated in latest version. But If you are still using older version of Python, the following two commands would work.

Lets try with an example. The below command is same command as df.loc[['Adelphi University','Alaska Pacific University','Albertson College'],:]

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
df.ix[['Adelphi University','Alaska Pacific University','Albertson College'],:]

Note: I used "import warnings" module just to supress the future warnings. Otherwise you would see big warning message about ix that .ix has been deprecated.

Similarly the below command is same as df.iloc[:2,:]

df.ix[:2,:]

There You go, we got the same result that we got with the iloc.

	Private	Apps	Accept	Enroll	Top10perc	Top25perc	F.Undergrad	P.Undergrad	Outstate	Room.Board	Books	Personal	PhD	Terminal	S.F.Ratio	perc.alumni	Expend	Grad.Rate
univ
Adelphi University	Yes	2186	1924	512	16	29	2683	1227	12280	6450	750	1500	29	30	12.2	16	10527	56
Alaska Pacific University	Yes	193	146	55	16	44	249	869	7560	4120	800	1500	76	72	11.9	2	10922	15
Albertson College	Yes	587	479	158	38	62	678	41	13500	3335	500	675	67	73	9.4	11	9727	55

	Private	Apps	Accept	Enroll	Top10perc	Top25perc	F.Undergrad	P.Undergrad	Outstate	Room.Board	Books	Personal	PhD	Terminal	S.F.Ratio	perc.alumni	Expend	Grad.Rate
univ
Adelphi University	Yes	2186	1924	512	16	29	2683	1227	12280	6450	750	1500	29	30	12.2	16	10527	56
Alaska Pacific University	Yes	193	146	55	16	44	249	869	7560	4120	800	1500	76	72	11.9	2	10922	15
Albertson College	Yes	587	479	158	38	62	678	41	13500	3335	500	675	67	73	9.4	11	9727	55