NbShare

Pandas group by multiple custom aggregate functions on multiple columns

For a list of Pandas tutorials, see:
https://www.nbshare.io/notebooks/pandas/

To group a Pandas DataFrame by multiple columns and apply several custom aggregate functions, call the DataFrame's groupby method and then the apply method of the resulting GroupBy object. The example below walks through this step by step.

Let us first create a simple Pandas DataFrame.

In [1]:
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 3],
                   'B': [10, 20, 20, 30, 30, 40],
                   'C': [100, 200, 200, 300, 300, 400]})
In [2]:
print(df)
   A   B    C
0  1  10  100
1  2  20  200
2  2  20  200
3  3  30  300
4  3  30  300
5  3  40  400

Let us define our custom aggregate functions.

In [3]:
# Define custom aggregate functions
def custom_mean(x):
    return x.mean()

def custom_sum(x):
    return x.sum()

Both functions simply wrap the corresponding Series method. Any function that takes one group's column (a Series) and returns a single value can be used in the same way. Let us now apply them to our data. The code below groups the DataFrame by columns A and B and, for each group, applies custom_mean and custom_sum to column C:

In [4]:
# Group the DataFrame by columns 'A' and 'B' and apply the custom functions
result = df.groupby(['A', 'B']).apply(lambda x: pd.Series({'mean_C': custom_mean(x['C']),
                                                           'sum_C': custom_sum(x['C'])}))

print(result)
      mean_C  sum_C
A B                
1 10   100.0  100.0
2 20   200.0  400.0
3 30   300.0  600.0
  40   400.0  400.0
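
The call above hands every column of each group, including the grouping keys A and B, to the lambda. As a minimal sketch, assuming a recent pandas release (2.2 or later, where this pattern may emit a DeprecationWarning about operating on the grouping columns), the variant below selects column C before calling apply. The name result_selected is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 3],
                   'B': [10, 20, 20, 30, 30, 40],
                   'C': [100, 200, 200, 300, 300, 400]})

def custom_mean(x):
    return x.mean()

def custom_sum(x):
    return x.sum()

# Select only column 'C' so the lambda never receives the grouping columns
result_selected = (
    df.groupby(['A', 'B'])[['C']]
      .apply(lambda g: pd.Series({'mean_C': custom_mean(g['C']),
                                  'sum_C': custom_sum(g['C'])}))
)

print(result_selected)
```

The values should match the output above. Both columns come back as floats because each group's results are packed into a single Series, which holds one dtype.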

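You can also use the agg method of the GroupBy object, which takes a dictionary mapping each column to a list of aggregate functions. Here the built-in 'mean' and 'sum' aggregations are passed by name; custom functions such as custom_mean can be passed in the same list: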

In [5]:
# Group the DataFrame by columns 'A' and 'B' and apply the built-in 'mean' and 'sum' aggregations to column 'C'.
df.groupby(['A', 'B']).agg({'C': ['mean', 'sum']})
Out[5]:
          C
       mean  sum
A B
1 10  100.0  100
2 20  200.0  400
3 30  300.0  600
  40  400.0  400

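Note that the value A = 3 produces two rows in the result, one for each distinct value of B (30 and 40), because the grouping is on both A and B. The group (3, 30) contains two source rows, which is why its sum is 600. Also note that the sum is an integer here, whereas the apply version returned floats.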
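
The agg method also accepts the custom functions themselves and can cover more than one column. The following is a sketch rather than part of the original notebook: it groups by A alone so that both B and C remain available to aggregate, and the names multi and named are illustrative. The first call passes a dictionary of function lists; the second uses pandas named aggregation to set the output column names directly.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 3],
                   'B': [10, 20, 20, 30, 30, 40],
                   'C': [100, 200, 200, 300, 300, 400]})

def custom_mean(x):
    return x.mean()

def custom_sum(x):
    return x.sum()

# Dictionary form: each column maps to a list of custom functions
multi = df.groupby('A').agg({'B': [custom_mean, custom_sum],
                             'C': [custom_mean, custom_sum]})

# Flatten the two-level column index, e.g. ('B', 'custom_mean') -> 'B_custom_mean'
multi.columns = ['_'.join(col) for col in multi.columns]
print(multi)

# Named aggregation: output_name=(input_column, function)
named = df.groupby('A').agg(mean_B=('B', custom_mean),
                            sum_B=('B', custom_sum),
                            mean_C=('C', custom_mean),
                            sum_C=('C', custom_sum))
print(named)
```

Named aggregation avoids the flattening step, at the cost of spelling out every output column.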
