Pandas DataFrame DataFrame.merge() function

Author: JIYIK Last Updated: 2025/04/30

The Python Pandas `DataFrame.merge()` function merges a `DataFrame` with another `DataFrame` or a named `Series` object.

`pandas.DataFrame.merge()` syntax

DataFrame.merge(
    right,
    how="inner",
    on=None,
    left_on=None,
    right_on=None,
    left_index=False,
    right_index=False,
    sort=False,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
)

Parameters

`right`	`DataFrame` or named `Series`. The object to merge with.
`how`	`left`, `right`, `inner`, or `outer`. The type of merge to perform. The default is `inner`.
`on`	Label or list. Column or index level names to join on; they must be present in both objects.
`left_on`	Label or list. Column or index level names to join on in the left `DataFrame`.
`right_on`	Label or list. Column or index level names to join on in the right `DataFrame`.
`left_index`	Boolean. Use the index of the left `DataFrame` as the join key (`left_index=True`).
`right_index`	Boolean. Use the index of the right `DataFrame` as the join key (`right_index=True`).
`sort`	Boolean. Sort the join keys lexicographically in the output (`sort=True`).
`suffixes`	Pair of strings. Suffixes applied to overlapping column names from the left and right sides, respectively.
`copy`	Boolean. Avoid copying data where possible (`copy=False`).
`indicator`	Boolean or string. Add a column named `_merge` to the output `DataFrame` containing the source of each row (`indicator=True`), or give that column a custom name (`indicator=string`).
`validate`	String. Check whether the merge is of the specified type, such as `one_to_one`, `one_to_many`, `many_to_one`, or `many_to_many`.
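
The `indicator` and `validate` parameters are not used in the examples further down, so here is a minimal sketch of how they might be used. The frames are shortened versions of the ones in this article, and `df2_dup` is a hypothetical frame introduced only to trigger the validation error.

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Suraj", "Zeppy", "Alish"], "Working Hours": [1, 2, 3]})
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish"], "Pay": [5, 6, 7]})

# indicator=True adds a `_merge` column recording whether each row came from
# the left frame only, the right frame only, or both.
# validate="one_to_one" checks that "Name" is unique in both frames.
merged = df1.merge(df2, on="Name", how="outer", indicator=True, validate="one_to_one")
print(merged)

# Hypothetical frame with a duplicated key: it violates the one-to-one check,
# so pandas raises MergeError instead of silently duplicating rows.
df2_dup = pd.DataFrame({"Name": ["Suraj", "Suraj"], "Pay": [5, 9]})
try:
    df1.merge(df2_dup, on="Name", validate="one_to_one")
except pd.errors.MergeError as exc:
    print("Merge rejected:", exc)
```

Using `validate` this way can catch unexpected duplicate keys early, before they multiply rows in the merged result.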

Return Value

It returns a `DataFrame` containing the merge of the two objects.

Example code: Use the `DataFrame.merge()` function to merge two `DataFrame`s

import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2)
print("Merged DataFrame:")
print(merged_df)

Output:

1st DataFrame:
    Name  Working Hours
0  Suraj              1
1  Zeppy              2
2  Alish              3
3  Sarah              5
2nd DataFrame:
     Name  Pay
0   Suraj    5
1    Zack    6
2   Alish    7
3  Raphel    8
Merged DataFrame:
    Name  Working Hours  Pay
0  Suraj              1    5
1  Alish              3    7

It uses the SQL inner join technique to merge `df1` and `df2` into one `DataFrame`.

For the inner join method, we must ensure that the two `DataFrame`s have at least one column in common.

Here, the `merge()` function combines the rows of the two `DataFrame`s that have the same value in the common column, `Name`. Rows whose `Name` appears in only one of the `DataFrame`s are dropped.
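
As a rough illustration of the defaults described above, the sketch below spells out the inferred arguments and shows what may happen when the frames share no column. The `no_overlap` frame and the renamed `Employee` column are assumptions made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})

# With no extra arguments, merge() joins on the columns both frames share ("Name").
implicit = df1.merge(df2)

# The same call with the defaults written out.
explicit = df1.merge(df2, how="inner", on="Name")
print(implicit.equals(explicit))

# Hypothetical frame without a shared column: pandas cannot infer a join key
# and raises MergeError.
no_overlap = df2.rename(columns={"Name": "Employee"})
try:
    df1.merge(no_overlap)
except pd.errors.MergeError as exc:
    print("Merge rejected:", exc)
```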

Example code: Set the `how` parameter in the `merge` method to merge `DataFrame`s using various techniques

import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2, how="right")
print("Merged DataFrame:")
print(merged_df)

Output:

1st DataFrame:
    Name  Working Hours
0  Suraj              1
1  Zeppy              2
2  Alish              3
3  Sarah              5
2nd DataFrame:
     Name  Pay
0   Suraj    5
1    Zack    6
2   Alish    7
3  Raphel    8
Merged DataFrame:
     Name  Working Hours  Pay
0   Suraj            1.0    5
1   Alish            3.0    7
2    Zack            NaN    6
3  Raphel            NaN    8

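It uses the SQL right join technique to merge `df1` and `df2` into one `DataFrame`.

Here, the `merge()` function returns all rows from the right `DataFrame`. Rows that exist only in the right `DataFrame` get `NaN` in the columns that come from the left `DataFrame`, and rows that exist only in the left `DataFrame` are dropped. The `Working Hours` column is displayed as floating point because an integer column cannot hold `NaN`.

Similarly, we can also use the `left` and `outer` values of the `how` parameter.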
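
A minimal sketch of the `left` and `outer` variants, reusing the same two frames; the variable names `left_df` and `outer_df` are chosen here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})

# how="left" keeps every row of df1; "Pay" is NaN where df2 has no matching Name.
left_df = df1.merge(df2, how="left")
print("Left join:")
print(left_df)

# how="outer" keeps the rows of both frames; values missing on either side are NaN.
outer_df = df1.merge(df2, how="outer")
print("Outer join:")
print(outer_df)
```

The left join returns four rows (one per row of `df1`), while the outer join returns six rows, one for each distinct name across both frames.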

Example code: Use the `DataFrame.merge()` function in Pandas to merge on specific columns only

import pandas as pd

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2, on="Name")
print("Merged DataFrame:")
print(merged_df)

Output:


1st DataFrame:
    Name  Working Hours    Position
0  Suraj              1    Salesman
1  Zeppy              2         CEO
2  Alish              3     Manager
3  Sarah              5  Sales Head
2nd DataFrame:
     Name  Pay    Position
0   Suraj    5    Salesman
1    Zack    6         CEO
2   Alish    7     Manager
3  Raphel    8  Sales Head
Merged DataFrame:
    Name  Working Hours Position_x  Pay Position_y
0  Suraj              1   Salesman    5   Salesman
1  Alish              3    Manager    7    Manager

It merges `df1` and `df2` on the `Name` column only. Since the default join method is an inner join, only the rows common to the two `DataFrame`s are joined. The `Position` column is common to both `DataFrame`s but is not used as a join key, so the result has two position columns, namely `Position_x` and `Position_y`.

By default, the `_x` and `_y` suffixes are appended to the names of overlapping columns. We can specify the suffixes using the `suffixes` parameter.

import pandas as pd

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2, on="Name", suffixes=("_left", "_right"))
print("Merged DataFrame:")
print(merged_df)

Output:

1st DataFrame:
    Name  Working Hours    Position
0  Suraj              1    Salesman
1  Zeppy              2         CEO
2  Alish              3     Manager
3  Sarah              5  Sales Head
2nd DataFrame:
     Name  Pay    Position
0   Suraj    5    Salesman
1    Zack    6         CEO
2   Alish    7     Manager
3  Raphel    8  Sales Head
Merged DataFrame:
    Name  Working Hours Position_left  Pay Position_right
0  Suraj              1      Salesman    5       Salesman
1  Alish              3       Manager    7        Manager
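
The suffixed columns appear because `Position` exists in both frames but is not a join key. If both columns should match, one option is to pass a list to `on`. The following is a sketch under that assumption, not part of the original examples.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

# Both "Name" and "Position" must match for two rows to be joined, so
# "Position" appears once in the result and no suffixes are needed.
merged_df = df1.merge(df2, on=["Name", "Position"])
print(merged_df)
```

With these frames, only the rows for Suraj and Alish match on both keys.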

Example code: Merge `DataFrame`s using the index as the join key

import pandas as pd

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(
    df2, left_index=True, right_index=True, suffixes=("_left", "_right")
)
print("Merged DataFrame:")
print(merged_df)

Output:


1st DataFrame:
    Name  Working Hours    Position
0  Suraj              1    Salesman
1  Zeppy              2         CEO
2  Alish              3     Manager
3  Sarah              5  Sales Head
2nd DataFrame:
     Name  Pay    Position
0   Suraj    5    Salesman
1    Zack    6         CEO
2   Alish    7     Manager
3  Raphel    8  Sales Head
Merged DataFrame:
  Name_left  Working Hours Position_left Name_right  Pay Position_right
0     Suraj              1      Salesman      Suraj    5       Salesman
1     Zeppy              2           CEO       Zack    6            CEO
2     Alish              3       Manager      Alish    7        Manager
3     Sarah              5    Sales Head     Raphel    8     Sales Head

It merges the rows of the two `DataFrame`s that share the same index value, regardless of the values in their columns. Because both `DataFrame`s use the default integer index here, rows are paired by position. If the same column name appears in both `DataFrame`s, a suffix is appended to each copy after merging so that the column names stay distinct.
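
To pair rows by name through the index rather than by position, the names can be moved into the index first. The sketch below assumes `set_index("Name")` is acceptable for the data at hand, and also shows `left_on` combined with `right_index`; the variable names `by_name` and `mixed` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
).set_index("Name")
df2 = pd.DataFrame(
    {"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]}
).set_index("Name")

# Both indexes now hold names, so the index merge matches rows by name
# instead of by position.
by_name = df1.merge(df2, left_index=True, right_index=True)
print(by_name)

# left_on together with right_index matches a column of the left frame
# against the index of the right frame.
mixed = df1.reset_index().merge(df2, left_on="Name", right_index=True)
print(mixed)
```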
