Select Pandas Dataframe Rows And Columns Using iloc loc and ix
In this post, I will show how to use the pandas indexers iloc, loc and ix to select rows and columns from a DataFrame loaded from a CSV or Excel file.
I will be using the College.csv data, which has details about university admissions.
Let's start by importing the pandas library and using read_csv to read the CSV file.
import pandas as pd
df = pd.read_csv('data/College.csv')
df.head(2)
How To Use dataframe loc To Select Rows
Let's check what df.loc is actually used for. If you run df.loc? you will find the following documentation...
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
Let's try to select the Apps column by its label first, passing a list of labels inside the brackets ([[]]).
df.loc[['Apps']]
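I got the following error:
KeyError: "None of [Index(['Apps'], dtype='object')] are in the [index]"
The reason for the above error is that the first argument to loc selects rows by index label, and Apps is a column name, not an index label. To select rows by a meaningful label, we first have to set a column as the index. Let's see what the index is set to right now.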
df.index
The index is a RangeIndex running from 0 up to (but not including) 777, which is simply the row numbers. We can use these index labels to select rows.
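To see why the column label fails while the default integer labels work, here is a minimal sketch on a tiny stand-in DataFrame. The name toy and all of its values are made up for illustration and are not taken from College.csv.

```python
import pandas as pd

# Toy stand-in for the College data; the numbers are made up for illustration.
toy = pd.DataFrame({
    "univ": ["Abilene Christian University", "Adelphi University", "Albertson College"],
    "Apps": [100, 200, 300],
    "Accept": [80, 150, 250],
})

# loc's first argument is matched against the row index (0, 1, 2 here),
# so a column name is not found there and pandas raises KeyError.
try:
    toy.loc[["Apps"]]
    raised = False
except KeyError:
    raised = True

# Passing the column label in the second position works.
apps_column = toy.loc[:, ["Apps"]]

# The default integer labels select rows.
first_row = toy.loc[0]

print(raised)
print(apps_column)
print(first_row)
```

The second position of loc is for column labels, so df.loc[:, ['Apps']] is one way to get the column without touching the index.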
How To Select Row By Index Using Pandas loc
df.loc[0]
As we can see, we got the first row. Let's make the university name column the index and see what happens. We first have to rename that column, which is called Unnamed: 0 in the dataframe.
df.rename(columns={'Unnamed: 0':'univ'},inplace=True)
df.set_index('univ',inplace=True)
df.head(2)
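As a side note, a column named Unnamed: 0 usually shows up when the first header cell of the CSV is empty. I am assuming that is the case for this file. Under that assumption, read_csv can use that column as the index directly through index_col=0. The sketch below reads from an in-memory string with made-up values rather than the real file.

```python
import io

import pandas as pd

# In-memory CSV whose first header cell is empty (assumed layout, made-up values).
csv_text = (
    ",Private,Apps\n"
    "Abilene Christian University,Yes,100\n"
    "Adelphi University,Yes,200\n"
)

# Default read: the unnamed first column becomes 'Unnamed: 0'.
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col=0 makes the first column the row index at load time.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(df_default.columns.tolist())
print(df_indexed.index.tolist())
```

With this option the rename and set_index steps are not needed, although the index will not carry the name univ unless you set it yourself.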
Let's try to select a row by university name.
df.loc['Abilene Christian University']
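One detail worth knowing here: a single label returns a Series, while a list with one label returns a one-row DataFrame. A small sketch on a made-up frame (toy and its numbers are illustrative only):

```python
import pandas as pd

# Toy frame indexed by university name; the numbers are made up.
toy = pd.DataFrame(
    {"Apps": [100, 200], "Accept": [80, 150]},
    index=pd.Index(
        ["Abilene Christian University", "Adelphi University"], name="univ"
    ),
)

# A single label gives back the row as a Series (columns become its index).
one_row = toy.loc["Abilene Christian University"]

# A list of labels gives back a DataFrame, even with only one label in it.
one_row_frame = toy.loc[["Abilene Christian University"]]

print(type(one_row).__name__, type(one_row_frame).__name__)
```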
How do you remove the index? Use reset_index().
df = df.reset_index()
df.head(1)
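The rename, set_index and reset_index steps can also be written without inplace=True, since each method returns a new DataFrame. This is just a sketch of that style on a made-up two-row frame. The names raw, indexed, restored and dropped are my own.

```python
import pandas as pd

# Made-up frame that mimics the 'Unnamed: 0' column from the CSV.
raw = pd.DataFrame({
    "Unnamed: 0": ["Abilene Christian University", "Adelphi University"],
    "Apps": [100, 200],
})

# Each call returns a new DataFrame, so the steps can be chained.
indexed = raw.rename(columns={"Unnamed: 0": "univ"}).set_index("univ")

# reset_index() moves the index back into a regular column.
restored = indexed.reset_index()

# reset_index(drop=True) throws the index labels away instead.
dropped = indexed.reset_index(drop=True)

print(indexed.index.name)
print(restored.columns.tolist())
print(dropped.columns.tolist())
```

The original raw frame is left untouched, which makes it easier to re-run a notebook cell without hitting an error because the column was already renamed.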
How To Use dataframe loc To Select Columns
Let's set the university as the index again. This time, we want to select a particular column for a particular row.
df.set_index('univ',inplace=True)
df.loc['Abilene Christian University',['Apps']]
Let's say we want to select the Apps column for two rows. Note the double brackets [[]] for the rows.
df.loc[['Abilene Christian University','Adelphi University'],['Apps']]
Let's say we want to print all the rows for the column 'Apps'. Note the : in the command below; it means all the rows.
df.loc[:,'Apps']
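The shape of what loc returns depends on whether you pass a single label or a list, and the documentation quoted earlier also mentions boolean arrays. Here is a short sketch covering both on a made-up frame (the names and numbers are illustrative, not from the real data):

```python
import pandas as pd

# Toy frame indexed by university name; the values are made up.
toy = pd.DataFrame(
    {"Private": ["Yes", "Yes", "No"], "Apps": [100, 200, 300]},
    index=pd.Index(
        ["Abilene Christian University", "Adelphi University", "Albertson College"],
        name="univ",
    ),
)

# Single row label + single column label: one value.
scalar = toy.loc["Abilene Christian University", "Apps"]

# Single row label + list of columns: a Series.
as_series = toy.loc["Abilene Christian University", ["Apps"]]

# List of rows + list of columns: a DataFrame.
as_frame = toy.loc[["Abilene Christian University", "Adelphi University"], ["Apps"]]

# Boolean array as the row selector: keep rows where the condition is True.
many_apps = toy.loc[toy["Apps"] > 150, ["Apps"]]

print(scalar)
print(as_series)
print(as_frame)
print(many_apps)
```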
How To Use Pandas Dataframe iloc
Pandas iloc can be used to select both rows and columns by integer position, counting from 0.
Python Select Row By Index Using Pandas iloc
We can give the rows a range. Let's say we want to select the first 2 rows and print all the columns.
df.iloc[:2,:]
We can also give it a list of positions. Print the rows at positions 1, 4 and 5.
df.iloc[[1,4,5],:]
Let's get the same rows by their univ index names. For that you have to use loc.
df.loc[['Adelphi University','Alaska Pacific University','Albertson College'],:]
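If you only know the positions, you can look up the matching labels from the index and hand them to loc. A minimal sketch with placeholder labels u0 to u5 and made-up numbers:

```python
import pandas as pd

# Six placeholder rows; labels and values are made up for illustration.
toy = pd.DataFrame(
    {"Apps": [10, 20, 30, 40, 50, 60]},
    index=pd.Index([f"u{i}" for i in range(6)], name="univ"),
)

# Select by position with iloc.
by_position = toy.iloc[[1, 4, 5], :]

# Translate the same positions into labels, then select with loc.
labels = toy.index[[1, 4, 5]]
by_label = toy.loc[labels, :]

print(labels.tolist())
print(by_position.equals(by_label))
```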
Python Select Column By Index Using Pandas iloc
Let us print the first two columns only. Positions are zero-based, so the first two columns are at positions 0 and 1.
df.iloc[:,[0,1]].head(2)
Note the indices we are using, [0,1]; that means the columns at positions 0 and 1 only. We can combine the indexing on both rows and columns.
Example: Print the first two rows from the first two columns only, without using the head(2) method this time.
df.iloc[[0,1],[0,1]]
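Because iloc counts from 0, it is easy to be off by one. The sketch below, on a made-up three-column frame, shows which columns each selector actually picks. The frame and its values are illustrative only.

```python
import pandas as pd

# Toy frame with three columns; labels and values are made up.
toy = pd.DataFrame(
    {"Private": ["Yes", "Yes", "No"], "Apps": [100, 200, 300], "Accept": [80, 150, 250]},
    index=pd.Index(["u0", "u1", "u2"], name="univ"),
)

# A slice :2 and the list [0, 1] both pick the first two columns.
first_two_cols = toy.iloc[:, :2]
same_with_list = toy.iloc[:, [0, 1]]

# [1, 2] picks the second and third columns instead.
second_and_third = toy.iloc[:, [1, 2]]

# First two rows of the first two columns.
corner = toy.iloc[:2, :2]

print(first_two_cols.columns.tolist())
print(second_and_third.columns.tolist())
print(corner)
```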
How To Use .ix in Pandas
ix is a hybrid of both loc and iloc, meaning we can use ix in place of loc and iloc. Pandas .ix was deprecated in version 0.20.0 and removed in version 1.0. But if you are still using an older version of pandas, the following two commands will work.
Let's try an example. The command below is the same as df.loc[['Adelphi University','Alaska Pacific University','Albertson College'],:]
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
df.ix[['Adelphi University','Alaska Pacific University','Albertson College'],:]
Note: I used the warnings module just to suppress the future warnings. Otherwise you would see a big warning message saying that .ix has been deprecated.
Similarly, the command below is the same as df.iloc[:2,:]. Because the index holds strings, ix treats the integer slice as positions.
df.ix[:2,:]
There you go, we got the same result that we got with iloc.
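On current pandas versions, where .ix is no longer available, mixing positions and labels in one selection takes a small translation step. Here is one possible sketch using the index and columns.get_loc. The frame is made up, and this is only one way to do it.

```python
import pandas as pd

# Toy frame; labels and values are made up for illustration.
toy = pd.DataFrame(
    {"Apps": [100, 200, 300], "Accept": [80, 150, 250]},
    index=pd.Index(["u0", "u1", "u2"], name="univ"),
)

# Goal: first two rows (by position) of the 'Apps' column (by label).

# Option 1: turn the row positions into labels and use loc.
via_loc = toy.loc[toy.index[:2], "Apps"]

# Option 2: turn the column label into a position and use iloc.
via_iloc = toy.iloc[:2, toy.columns.get_loc("Apps")]

print(via_loc)
print(via_loc.equals(via_iloc))
```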