Let us first create a dummy dataframe for this tutorial.
In [3]:
import pandas as pd
df = pd.DataFrame({
    'Director': ['Steven Spielberg', 'Martin Scorsese', 'Steven Spielberg', 'Quentin Tarantino', 'Martin Scorsese', 'Steven Spielberg'],
    'Movie': ['Jaws', 'Goodfellas', 'Jurassic Park', 'Pulp Fiction', 'Raging Bull', 'E.T.'],
})
In [4]:
print(df)
            Director          Movie
0   Steven Spielberg           Jaws
1    Martin Scorsese     Goodfellas
2   Steven Spielberg  Jurassic Park
3  Quentin Tarantino   Pulp Fiction
4    Martin Scorsese    Raging Bull
5   Steven Spielberg           E.T.
To get the number of rows in each group of a Pandas groupby object, use the size() method. Here we group the movies data by director.
In [5]:
# Group the dataframe by the 'Director' column
grouped_df = df.groupby('Director')
# Get the size of each group
group_sizes = grouped_df.size()
print(group_sizes)
Director
Martin Scorsese      2
Quentin Tarantino    1
Steven Spielberg     3
dtype: int64
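As the output shows, size() returns a Series indexed by the group key. If a regular dataframe is more convenient, one possible approach is to call reset_index() on that Series. The sketch below rebuilds the movies data so it runs on its own; the column name 'n_movies' is only an illustrative choice, not something from the original notebook.

```python
import pandas as pd

# Same movies data as in the tutorial, rebuilt so this cell is self-contained.
df = pd.DataFrame({
    'Director': ['Steven Spielberg', 'Martin Scorsese', 'Steven Spielberg',
                 'Quentin Tarantino', 'Martin Scorsese', 'Steven Spielberg'],
    'Movie': ['Jaws', 'Goodfellas', 'Jurassic Park',
              'Pulp Fiction', 'Raging Bull', 'E.T.'],
})

# size() gives a Series whose index is the 'Director' group key.
group_sizes = df.groupby('Director').size()

# reset_index(name=...) turns the Series into a two-column dataframe.
# 'n_movies' is a hypothetical column name chosen for this example.
size_table = group_sizes.reset_index(name='n_movies')
print(size_table)
```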
You can also use the count() method. It counts the non-NA/null values in each column of each group, so it matches the row count only when a column has no missing values.
In [6]:
# Count the non-null values in each column for each group
group_counts = grouped_df.count()
print(group_counts)
                   Movie
Director
Martin Scorsese        2
Quentin Tarantino      1
Steven Spielberg       3
In short, size() returns the total number of rows in each group, including rows with missing values. count(), on the other hand, returns the number of non-NA/null values in each column for each group. Because the movies data has no missing values, both methods report the same numbers here.
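The difference only becomes visible when a group contains missing values. The following is a minimal sketch that assumes a variant of the movies data in which one Spielberg title is missing; the names df_missing and grouped_missing are made up for this example.

```python
import pandas as pd

# Hypothetical variant of the movies data: the third title is missing (None).
df_missing = pd.DataFrame({
    'Director': ['Steven Spielberg', 'Martin Scorsese', 'Steven Spielberg',
                 'Quentin Tarantino', 'Martin Scorsese', 'Steven Spielberg'],
    'Movie': ['Jaws', 'Goodfellas', None,
              'Pulp Fiction', 'Raging Bull', 'E.T.'],
})

grouped_missing = df_missing.groupby('Director')

# size() counts every row in the group, including the one with no title.
sizes = grouped_missing.size()

# count() counts only the non-null values, column by column.
counts = grouped_missing.count()

print(sizes)
print(counts)
```

For Steven Spielberg, size() still reports 3 rows, while count() reports 2 in the 'Movie' column because the missing title is skipped.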