JIYIK CN >

Current Location:Home > Learning > PROGRAM > Python >

Pandas groupby by two columns

Author:JIYIK Last Updated:2025/04/30 Views:

This tutorial has shown how to use the method in Pandas DataFrame.groupby()to split a two-column DataFrame into several groups. We can also get more information from the created groups.

We will use the following DataFrame in this article.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

data = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
        "Employed": ["Yes", "No", "Yes", "No", "Yes", "No"],
        "Age": [30, 28, 27, 24, 28, 25],
    }
)

print(data)

Output:

       Name  Gender Employed  Age
0  Jennifer  Female      Yes   30
1    Travis    Male       No   28
2       Bob    Male      Yes   27
3      Emma  Female       No   24
4      Luna  Female      Yes   28
5     Anish    Male       No   25

Pandas Groupby Multiple Columns

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

data = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
        "Employed": ["Yes", "No", "Yes", "No", "Yes", "No"],
        "Age": [30, 28, 27, 24, 28, 25],
    }
)

print(data)
print("")
print("Groups in DataFrame:")
groups = data.groupby(["Gender", "Employed"])
for group_key, group_value in groups:
    group = groups.get_group(group_key)
    print(group)
    print("")

Output:

       Name  Gender Employed  Age
0  Jennifer  Female      Yes   30
1    Travis    Male       No   28
2       Bob    Male      Yes   27
3      Emma  Female       No   24
4      Luna  Female      Yes   28
5     Anish    Male       No   25

Groups in DataFrame:
   Name  Gender Employed  Age
3  Emma  Female       No   24

       Name  Gender Employed  Age
0  Jennifer  Female      Yes   30
4      Luna  Female      Yes   28

     Name Gender Employed  Age
1  Travis   Male       No   28
5   Anish   Male       No   25

  Name Gender Employed  Age
2  Bob   Male      Yes   27

It creates 4 groups from the DataFrame. All rows Genderwith Employedthe same values ​​of and columns are placed in the same group.


Count number of rows per group Pandas

DataFrame.groupby()To count the number of rows for each group created using the method, we can use size()the method.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

data = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
        "Employed": ["Yes", "No", "Yes", "No", "Yes", "No"],
        "Age": [30, 28, 27, 24, 28, 25],
    }
)

print(data)
print("")
print("Count of Each group:")
grouped_df = data.groupby(["Gender", "Employed"]).size().reset_index(name="Count")
print(grouped_df)

Output:

       Name  Gender Employed  Age
0  Jennifer  Female      Yes   30
1    Travis    Male       No   28
2       Bob    Male      Yes   27
3      Emma  Female       No   24
4      Luna  Female      Yes   28
5     Anish    Male       No   25

Count of Each group:
   Gender Employed  Count
0  Female       No      1
1  Female      Yes      2
2    Male       No      2
3    Male      Yes      1

It displays the DataFrame, the groups created from the DataFrame, and the number of elements in each group.

If we want to get Employedthe maximum count of each value in the column, we can form another group from the groups created above and count the values, and then use max()the method to get the maximum value of the count.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

data = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Gender": ["Female", "Male", "Male", "Female", "Female", "Male"],
        "Employed": ["Yes", "No", "Yes", "No", "Yes", "No"],
        "Age": [30, 28, 27, 24, 28, 25],
    }
)

print(data)
print("")

groups = data.groupby(["Gender", "Employed"]).size().groupby(level=1)
print(groups.max())

Output:

       Name  Gender Employed  Age
0  Jennifer  Female      Yes   30
1    Travis    Male       No   28
2       Bob    Male      Yes   27
3      Emma  Female       No   24
4      Luna  Female      Yes   28
5     Anish    Male       No   25

Employed
No     2
Yes    2
dtype: int64

It shows the maximum count of column values ​​in the groups created from Genderthe and columns.EmployedEmployed

For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.

Article URL:

Related Articles

Pandas DataFrame DataFrame.query() function

Publish Date:2025/04/30 Views:107 Category:Python

The pandas.DataFrame.query() method filters the rows of the caller DataFrame using the given query expression. pandas.DataFrame.query() grammar DataFrame . query(expr, inplace = False , ** kwargs) parameter expr Filter rows based on query e

Pandas DataFrame DataFrame.min() function

Publish Date:2025/04/30 Views:162 Category:Python

Python Pandas DataFrame.min() function gets the minimum value of the DataFrame object along the specified axis. pandas.DataFrame.min() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = None , ** kwargs) pa

Pandas DataFrame DataFrame.mean() function

Publish Date:2025/04/30 Views:85 Category:Python

Python Pandas DataFrame.mean() function calculates the mean of the values ​​of the DataFrame object over the specified axis. pandas.DataFrame.mean() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = No

Pandas DataFrame DataFrame.isin() function

Publish Date:2025/04/30 Views:133 Category:Python

The pandas.DataFrame.isin(values) function checks whether each element in the caller DataFrame contains values the value specified in the input . pandas.DataFrame.isin(values) grammar DataFrame . isin(values) parameter values iterable - lis

Pandas DataFrame DataFrame.groupby() function

Publish Date:2025/04/30 Views:161 Category:Python

pandas.DataFrame.groupby() takes a DataFrame as input and divides the DataFrame into groups based on a given criterion. We can use groupby() the method to easily process large datasets. pandas.DataFrame.groupby() grammar DataFrame . groupby

Pandas DataFrame DataFrame.fillna() function

Publish Date:2025/04/30 Views:60 Category:Python

The pandas.DataFrame.fillna() function replaces the values DataFrame ​​in NaN with a certain value. pandas.DataFrame.fillna() grammar DataFrame . fillna( value = None , method = None , axis = None , inplace = False , limit = None , down

Pandas DataFrame DataFrame.dropna() function

Publish Date:2025/04/30 Views:181 Category:Python

The pandas.DataFrame.dropna() function removes null values ​​(missing values) from a DataFrame by dropping rows or columns that contain null values DataFrame . NaN ( Not a Number ) and NaT ( Not a Time ) represent null values. DataFrame

Pandas DataFrame DataFrame.assign() function

Publish Date:2025/04/30 Views:55 Category:Python

Python Pandas DataFrame.assign() function assigns new columns to DataFrame . pandas.DataFrame.assign() grammar DataFrame . assign( ** kwargs) parameter **kwargs Keyword arguments, DataFrame the column names to be assigned to are passed as k

Pandas DataFrame DataFrame.transform() function

Publish Date:2025/04/30 Views:120 Category:Python

Python Pandas DataFrame.transform() DataFrame applies a function on and transforms DataFrame . The function to be applied is passed as an argument to the function. The axis length of transform() the transformed should be the same as the ori

Scan to Read All Tech Tutorials

Social Media
  • https://www.github.com/onmpw
  • qq:1244347461

Recommended

Tags

Scan the Code
Easier Access Tutorial