JIYIK CN >

Current Location:Home > Learning > PROGRAM > Python >

Pandas DataFrame DataFrame.merge() function

Author:JIYIK Last Updated:2025/04/30 Views:

Python Pandas DataFrame.merge() function merges DataFrameor named Series objects.


pandas.DataFrame.merge()grammar

DataFrame.merge(
    right,
    how="inner",
    on=None,
    left_on=None,
    right_on=None,
    left_index=False,
    right_index=False,
    sort=False,
    suffixes="_x",
    "_y",
    copy=True,
    indicator=False,
    validate=None,
)

parameter

right DataFrameor named Series. The object to merge
how left, right, inneror outer. How to perform a merge operation
on label or list of column or index names to merge
left_on Label or list of DataFramecolumn or index names to merge onto the left side.
right_on Label or list. Column or index names to be merged into DataFrameon the right.
left_index Boolean. Use DataFramethe index of the left side as the join key ( left_index=True)
right_index Boolean. Use DataFramethe index on the right side as the join key ( right_index=True)
sort Boolean. Sort join keys alphabetically in output ( sort=True)
suffixes Suffixes are applied to overlapping column names on the left and right sides respectively
copy Boolean. Avoids copying copy=False.
indicator DataFrameAdd a column named to the output _mergecontaining the source information of each row ( ), and add a column named indicator=Trueto the output ( )DataFramestringindicator=string
validate Checks whether the merge is a parameter of the specified type

Return Value

It returns a merge of the given objects DataFrame.


Example code: DataFrame.merge()Function to merge twoDataFrame

import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2)
print("Merged DataFrame:")
print(merged_df)

Output:

1st DataFrame:
    Name  Working Hours
0  Suraj              1
1  Zeppy              2
2  Alish              3
3  Sarah              5
2nd DataFrame:
     Name  Pay
0   Suraj    5
1    Zack    6
2   Alish    7
3  Raphel    8
Merged DataFrame:
    Name  Working Hours  Pay
0  Suraj              1    5
1  Alish              3    7

It uses SQLthe inner join technique of to merge df1and df2into one DataFrame.

For inner-jointhe method, we must ensure that the two DataFramehave at least one column in common.

Here, merge()the function will concatenate the rows that have the same value for a common column into two DataFrame.


Example code: mergeSetting howthe parameters in the method, merging using various techniquesDataFrame

import pandas as pd

df1 = pd.DataFrame(
    {"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2, how="right")
print("Merged DataFrame:")
print(merged_df)

Output:

1st DataFrame:
    Name  Working Hours
0  Suraj              1
1  Zeppy              2
2  Alish              3
3  Sarah              5
2nd DataFrame:
     Name  Pay
0   Suraj    5
1    Zack    6
2   Alish    7
3  Raphel    8
Merged DataFrame:
     Name  Working Hours  Pay
0   Suraj            1.0    5
1   Alish            3.0    7
2    Zack            NaN    6
3  Raphel            NaN   8 

It uses SQLthe right-jointechnique of to merge df1and df2into one DataFrame.

Here, the function returns all rows merge()from the right side . However, rows that exist only in the left side will get the value of .DataFrameDataFrameNaN

Similarly, we can also use the and values how​​of the parameter .leftouter


Example Code: Use the function in Pandas DataFrame.merge()to merge only specific columns

import pandas as pd

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2, on="Name")
print("Merged DataFrame:")
print(merged_df)

Output:


1st DataFrame:
    Name  Working Hours    Position
0  Suraj              1    Salesman
1  Zeppy              2         CEO
2  Alish              3     Manager
3  Sarah              5  Sales Head
2nd DataFrame:
     Name  Pay    Position
0   Suraj    5    Salesman
1    Zack    6         CEO
2   Alish    7     Manager
3  Raphel    8  Sales Head
Merged DataFrame:
    Name  Working Hours Position_x  Pay Position_y
0  Suraj              1   Salesman    5   Salesman
1  Alish              3    Manager    7    Manager

It only merges the columns of df1and . Since the default join method is an inner join, only the common rows of the two are joined. The column is common to both , so there are two positional columns, namely and .df2NameDataFramepositionDataFramePosition_xPosition_y

By default, _xand _ysuffixes are appended to the names of overlapping columns. We can suffixesspecify the suffix using the parameter.

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(df2, on="Name", suffixes=("_left", "_right"))
print("Merged DataFrame:")
print(merged_df)

Output:

1st DataFrame:
    Name  Working Hours    Position
0  Suraj              1    Salesman
1  Zeppy              2         CEO
2  Alish              3     Manager
3  Sarah              5  Sales Head
2nd DataFrame:
     Name  Pay    Position
0   Suraj    5    Salesman
1    Zack    6         CEO
2   Alish    7     Manager
3  Raphel    8  Sales Head
Merged DataFrame:
    Name  Working Hours Position_left  Pay Position_right
0  Suraj              1      Salesman    5       Salesman
1  Alish              3       Manager    7        Manager

Example code: Merge using an index as the join keyDataFrame

import pandas as pd

df1 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
        "Working Hours": [1, 2, 3, 5],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)
df2 = pd.DataFrame(
    {
        "Name": ["Suraj", "Zack", "Alish", "Raphel"],
        "Pay": [5, 6, 7, 8],
        "Position": ["Salesman", "CEO", "Manager", "Sales Head"],
    }
)

print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)

merged_df = df1.merge(
    df2, left_index=True, right_index=True, suffixes=("_left", "_right")
)
print("Merged DataFrame:")
print(merged_df)

Output:


1st DataFrame:
    Name  Working Hours    Position
0  Suraj              1    Salesman
1  Zeppy              2         CEO
2  Alish              3     Manager
3  Sarah              5  Sales Head
2nd DataFrame:
     Name  Pay    Position
0   Suraj    5    Salesman
1    Zack    6         CEO
2   Alish    7     Manager
3  Raphel    8  Sales Head
Merged DataFrame:
  Name_left  Working Hours Position_left Name_right  Pay Position_right
0     Suraj              1      Salesman      Suraj    5       Salesman
1     Zeppy              2           CEO       Zack    6            CEO
2     Alish              3       Manager      Alish    7        Manager
3     Sarah              5    Sales Head     Raphel    8     Sales Head

It merges DataFramethe corresponding rows of two , regardless of the similarity of the columns. If DataFramethe same column name appears on two , then after merging, a suffix is ​​appended to the column name to make it a different column.

For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.

Article URL:

Related Articles

Pandas DataFrame DataFrame.query() function

Publish Date:2025/04/30 Views:107 Category:Python

The pandas.DataFrame.query() method filters the rows of the caller DataFrame using the given query expression. pandas.DataFrame.query() grammar DataFrame . query(expr, inplace = False , ** kwargs) parameter expr Filter rows based on query e

Pandas DataFrame DataFrame.min() function

Publish Date:2025/04/30 Views:162 Category:Python

Python Pandas DataFrame.min() function gets the minimum value of the DataFrame object along the specified axis. pandas.DataFrame.min() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = None , ** kwargs) pa

Pandas DataFrame DataFrame.mean() function

Publish Date:2025/04/30 Views:85 Category:Python

Python Pandas DataFrame.mean() function calculates the mean of the values ​​of the DataFrame object over the specified axis. pandas.DataFrame.mean() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = No

Pandas DataFrame DataFrame.isin() function

Publish Date:2025/04/30 Views:133 Category:Python

The pandas.DataFrame.isin(values) function checks whether each element in the caller DataFrame contains values the value specified in the input . pandas.DataFrame.isin(values) grammar DataFrame . isin(values) parameter values iterable - lis

Pandas DataFrame DataFrame.groupby() function

Publish Date:2025/04/30 Views:161 Category:Python

pandas.DataFrame.groupby() takes a DataFrame as input and divides the DataFrame into groups based on a given criterion. We can use groupby() the method to easily process large datasets. pandas.DataFrame.groupby() grammar DataFrame . groupby

Pandas DataFrame DataFrame.fillna() function

Publish Date:2025/04/30 Views:60 Category:Python

The pandas.DataFrame.fillna() function replaces the values DataFrame ​​in NaN with a certain value. pandas.DataFrame.fillna() grammar DataFrame . fillna( value = None , method = None , axis = None , inplace = False , limit = None , down

Pandas DataFrame DataFrame.dropna() function

Publish Date:2025/04/30 Views:181 Category:Python

The pandas.DataFrame.dropna() function removes null values ​​(missing values) from a DataFrame by dropping rows or columns that contain null values DataFrame . NaN ( Not a Number ) and NaT ( Not a Time ) represent null values. DataFrame

Pandas DataFrame DataFrame.assign() function

Publish Date:2025/04/30 Views:55 Category:Python

Python Pandas DataFrame.assign() function assigns new columns to DataFrame . pandas.DataFrame.assign() grammar DataFrame . assign( ** kwargs) parameter **kwargs Keyword arguments, DataFrame the column names to be assigned to are passed as k

Pandas DataFrame DataFrame.transform() function

Publish Date:2025/04/30 Views:120 Category:Python

Python Pandas DataFrame.transform() DataFrame applies a function on and transforms DataFrame . The function to be applied is passed as an argument to the function. The axis length of transform() the transformed should be the same as the ori

Scan to Read All Tech Tutorials

Social Media
  • https://www.github.com/onmpw
  • qq:1244347461

Recommended

Tags

Scan the Code
Easier Access Tutorial