Pandas DataFrame DataFrame.merge() function
Python Pandas DataFrame.merge() function merges DataFrameor named Series objects.
pandas.DataFrame.merge()grammar
DataFrame.merge(
right,
how="inner",
on=None,
left_on=None,
right_on=None,
left_index=False,
right_index=False,
sort=False,
suffixes="_x",
"_y",
copy=True,
indicator=False,
validate=None,
)
parameter
right |
DataFrameor named Series. The object to merge |
how |
left, right, inneror outer. How to perform a merge operation |
on |
label or list of column or index names to merge |
left_on |
Label or list of DataFramecolumn or index names to merge onto the left side. |
right_on |
Label or list. Column or index names to be merged into DataFrameon the right. |
left_index |
Boolean. Use DataFramethe index of the left side as the join key ( left_index=True) |
right_index |
Boolean. Use DataFramethe index on the right side as the join key ( right_index=True) |
sort |
Boolean. Sort join keys alphabetically in output ( sort=True) |
suffixes |
Suffixes are applied to overlapping column names on the left and right sides respectively |
copy |
Boolean. Avoids copying copy=False. |
indicator |
DataFrameAdd a column named to the output _mergecontaining the source information of each row ( ), and add a column named indicator=Trueto the output ( )DataFramestringindicator=string |
validate |
Checks whether the merge is a parameter of the specified type |
Return Value
It returns a merge of the given objects DataFrame.
Example code: DataFrame.merge()Function to merge twoDataFrame
import pandas as pd
df1 = pd.DataFrame(
{"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})
print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)
merged_df = df1.merge(df2)
print("Merged DataFrame:")
print(merged_df)
Output:
1st DataFrame:
Name Working Hours
0 Suraj 1
1 Zeppy 2
2 Alish 3
3 Sarah 5
2nd DataFrame:
Name Pay
0 Suraj 5
1 Zack 6
2 Alish 7
3 Raphel 8
Merged DataFrame:
Name Working Hours Pay
0 Suraj 1 5
1 Alish 3 7
It uses SQLthe inner join technique of to merge df1and df2into one DataFrame.
For inner-jointhe method, we must ensure that the two DataFramehave at least one column in common.
Here, merge()the function will concatenate the rows that have the same value for a common column into two DataFrame.
Example code: mergeSetting howthe parameters in the method, merging using various techniquesDataFrame
import pandas as pd
df1 = pd.DataFrame(
{"Name": ["Suraj", "Zeppy", "Alish", "Sarah"], "Working Hours": [1, 2, 3, 5]}
)
df2 = pd.DataFrame({"Name": ["Suraj", "Zack", "Alish", "Raphel"], "Pay": [5, 6, 7, 8]})
print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)
merged_df = df1.merge(df2, how="right")
print("Merged DataFrame:")
print(merged_df)
Output:
1st DataFrame:
Name Working Hours
0 Suraj 1
1 Zeppy 2
2 Alish 3
3 Sarah 5
2nd DataFrame:
Name Pay
0 Suraj 5
1 Zack 6
2 Alish 7
3 Raphel 8
Merged DataFrame:
Name Working Hours Pay
0 Suraj 1.0 5
1 Alish 3.0 7
2 Zack NaN 6
3 Raphel NaN 8
It uses SQLthe right-jointechnique of to merge df1and df2into one DataFrame.
Here, the function returns all rows merge()from the right side . However, rows that exist only in the left side will get the value of .DataFrameDataFrameNaN
Similarly, we can also use the and values howof the parameter .leftouter
Example Code: Use the function in Pandas DataFrame.merge()to merge only specific columns
import pandas as pd
df1 = pd.DataFrame(
{
"Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
"Working Hours": [1, 2, 3, 5],
"Position": ["Salesman", "CEO", "Manager", "Sales Head"],
}
)
df2 = pd.DataFrame(
{
"Name": ["Suraj", "Zack", "Alish", "Raphel"],
"Pay": [5, 6, 7, 8],
"Position": ["Salesman", "CEO", "Manager", "Sales Head"],
}
)
print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)
merged_df = df1.merge(df2, on="Name")
print("Merged DataFrame:")
print(merged_df)
Output:
1st DataFrame:
Name Working Hours Position
0 Suraj 1 Salesman
1 Zeppy 2 CEO
2 Alish 3 Manager
3 Sarah 5 Sales Head
2nd DataFrame:
Name Pay Position
0 Suraj 5 Salesman
1 Zack 6 CEO
2 Alish 7 Manager
3 Raphel 8 Sales Head
Merged DataFrame:
Name Working Hours Position_x Pay Position_y
0 Suraj 1 Salesman 5 Salesman
1 Alish 3 Manager 7 Manager
It only merges the columns of df1and . Since the default join method is an inner join, only the common rows of the two are joined. The column is common to both , so there are two positional columns, namely and .df2NameDataFramepositionDataFramePosition_xPosition_y
By default, _xand _ysuffixes are appended to the names of overlapping columns. We can suffixesspecify the suffix using the parameter.
df1 = pd.DataFrame(
{
"Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
"Working Hours": [1, 2, 3, 5],
"Position": ["Salesman", "CEO", "Manager", "Sales Head"],
}
)
df2 = pd.DataFrame(
{
"Name": ["Suraj", "Zack", "Alish", "Raphel"],
"Pay": [5, 6, 7, 8],
"Position": ["Salesman", "CEO", "Manager", "Sales Head"],
}
)
print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)
merged_df = df1.merge(df2, on="Name", suffixes=("_left", "_right"))
print("Merged DataFrame:")
print(merged_df)
Output:
1st DataFrame:
Name Working Hours Position
0 Suraj 1 Salesman
1 Zeppy 2 CEO
2 Alish 3 Manager
3 Sarah 5 Sales Head
2nd DataFrame:
Name Pay Position
0 Suraj 5 Salesman
1 Zack 6 CEO
2 Alish 7 Manager
3 Raphel 8 Sales Head
Merged DataFrame:
Name Working Hours Position_left Pay Position_right
0 Suraj 1 Salesman 5 Salesman
1 Alish 3 Manager 7 Manager
Example code: Merge using an index as the join keyDataFrame
import pandas as pd
df1 = pd.DataFrame(
{
"Name": ["Suraj", "Zeppy", "Alish", "Sarah"],
"Working Hours": [1, 2, 3, 5],
"Position": ["Salesman", "CEO", "Manager", "Sales Head"],
}
)
df2 = pd.DataFrame(
{
"Name": ["Suraj", "Zack", "Alish", "Raphel"],
"Pay": [5, 6, 7, 8],
"Position": ["Salesman", "CEO", "Manager", "Sales Head"],
}
)
print("1st DataFrame:")
print(df1)
print("2nd DataFrame:")
print(df2)
merged_df = df1.merge(
df2, left_index=True, right_index=True, suffixes=("_left", "_right")
)
print("Merged DataFrame:")
print(merged_df)
Output:
1st DataFrame:
Name Working Hours Position
0 Suraj 1 Salesman
1 Zeppy 2 CEO
2 Alish 3 Manager
3 Sarah 5 Sales Head
2nd DataFrame:
Name Pay Position
0 Suraj 5 Salesman
1 Zack 6 CEO
2 Alish 7 Manager
3 Raphel 8 Sales Head
Merged DataFrame:
Name_left Working Hours Position_left Name_right Pay Position_right
0 Suraj 1 Salesman Suraj 5 Salesman
1 Zeppy 2 CEO Zack 6 CEO
2 Alish 3 Manager Alish 7 Manager
3 Sarah 5 Sales Head Raphel 8 Sales Head
It merges DataFramethe corresponding rows of two , regardless of the similarity of the columns. If DataFramethe same column name appears on two , then after merging, a suffix is appended to the column name to make it a different column.
For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.
Related Articles
Pandas DataFrame DataFrame.query() function
Publish Date:2025/04/30 Views:107 Category:Python
-
The pandas.DataFrame.query() method filters the rows of the caller DataFrame using the given query expression. pandas.DataFrame.query() grammar DataFrame . query(expr, inplace = False , ** kwargs) parameter expr Filter rows based on query e
Pandas DataFrame DataFrame.min() function
Publish Date:2025/04/30 Views:162 Category:Python
-
Python Pandas DataFrame.min() function gets the minimum value of the DataFrame object along the specified axis. pandas.DataFrame.min() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = None , ** kwargs) pa
Pandas DataFrame DataFrame.mean() function
Publish Date:2025/04/30 Views:85 Category:Python
-
Python Pandas DataFrame.mean() function calculates the mean of the values of the DataFrame object over the specified axis. pandas.DataFrame.mean() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = No
Pandas DataFrame DataFrame.isin() function
Publish Date:2025/04/30 Views:133 Category:Python
-
The pandas.DataFrame.isin(values) function checks whether each element in the caller DataFrame contains values the value specified in the input . pandas.DataFrame.isin(values) grammar DataFrame . isin(values) parameter values iterable - lis
Pandas DataFrame DataFrame.groupby() function
Publish Date:2025/04/30 Views:161 Category:Python
-
pandas.DataFrame.groupby() takes a DataFrame as input and divides the DataFrame into groups based on a given criterion. We can use groupby() the method to easily process large datasets. pandas.DataFrame.groupby() grammar DataFrame . groupby
Pandas DataFrame DataFrame.fillna() function
Publish Date:2025/04/30 Views:60 Category:Python
-
The pandas.DataFrame.fillna() function replaces the values DataFrame in NaN with a certain value. pandas.DataFrame.fillna() grammar DataFrame . fillna( value = None , method = None , axis = None , inplace = False , limit = None , down
Pandas DataFrame DataFrame.dropna() function
Publish Date:2025/04/30 Views:181 Category:Python
-
The pandas.DataFrame.dropna() function removes null values (missing values) from a DataFrame by dropping rows or columns that contain null values DataFrame . NaN ( Not a Number ) and NaT ( Not a Time ) represent null values. DataFrame
Pandas DataFrame DataFrame.assign() function
Publish Date:2025/04/30 Views:55 Category:Python
-
Python Pandas DataFrame.assign() function assigns new columns to DataFrame . pandas.DataFrame.assign() grammar DataFrame . assign( ** kwargs) parameter **kwargs Keyword arguments, DataFrame the column names to be assigned to are passed as k
Pandas DataFrame DataFrame.transform() function
Publish Date:2025/04/30 Views:120 Category:Python
-
Python Pandas DataFrame.transform() DataFrame applies a function on and transforms DataFrame . The function to be applied is passed as an argument to the function. The axis length of transform() the transformed should be the same as the ori

