JIYIK CN >

Current Location:Home > Learning > PROGRAM > Python >

Combine values from multiple columns into one column in Pandas DataFrame

Author:JIYIK Last Updated:2025/05/01 Views:

This tutorial will demonstrate how to merge or return the first non-null values ​​from multiple columns into another column in a Python Pandas DataFrame.

For example, if it is not empty, the value of column 1 is used for the new column 3; otherwise, if column 1 is empty, the value of column 2 is used for the new column 3.

We can accomplish this task in a number of ways in a Pandas DataFrame.


Writing code from scratch to merge values ​​from multiple columns into one column in a Pandas DataFrame

We can write the logic from scratch to merge the values. In the following code, we create a Pandas DataFrame with three columns named , Age_in_Years, Age_in_Monthsand Age_in_Days.

The DataFrame also has some missing values. If we want to display the age, first, we will output the age in years.

If the value in that column is Null, we will display the age in months. Similarly, if the value in months is Null, we will display the age in days.

To do this, we write the code from scratch to get the value of the first non-null column. The function is iterating over all DataFrame columns and returns the value where a non-null value is found; otherwise, it checks the values ​​in other columns.

Sample code:

# Python 3.x
import pandas as pd

df_age = pd.DataFrame(
    {
        "Age_in_Years": ["4 y", None, None, None],
        "Age_in_Months": ["48 m", "24 m", None, None],
        "Age_in_Days": ["1440 d", None, "2520 d", None],
    }
)


def get_first_non_null(dfrow, cols):
    for c in cols:
        if pd.notnull(dfrow[c]):
            return dfrow[c]
    return None


cols = ["Age_in_Years", "Age_in_Months", "Age_in_Days"]
df_age["Age"] = df_age.apply(lambda x: get_first_non_null(x, cols), axis=1)
display(df_age)

Output:


Run a SQL query in a Pandas DataFrame using DuckDB to combine values ​​from multiple columns into one column

Sample code:

DuckDB is a Python API and a database management system for interacting with databases using SQL queries. This package has a built-in merge method that selects the first non-null value from a column.

We will pass the column names in the SQL query to the coalesce method.

# Python 3.x
import pandas as pd
import duckdb

df_age = pd.DataFrame(
    {
        "Age_in_Years": ["4 y", None, None, None],
        "Age_in_Months": ["48 m", "24 m", None, None],
        "Age_in_Days": ["1440 d", None, "2520 d", None],
    }
)
df_age = duckdb.query(
    """SELECT Age_in_Years, Age_in_Months, Age_in_Days, coalesce(Age_in_Years, Age_in_Months, Age_in_days) as Age from df_age"""
).to_df()
display(df_age)

Output:


In Pandas DataFrame, use combine_first()the method to combine values ​​from multiple columns into one column

combine_first()Method fills the empty values ​​in one DataFrame with the non-empty data from the second DataFrame to combine two DataFrame objects.

In the following code, we will return the column values. We will combine Age_in_Yearswith and with .Age_in_MonthsAge_in_MonthsAge_in_Days

It will return Age_in_yearsthe value from . If that is Null, it will return Age_in_Monthsthe value from . Likewise, if that is also Null, it will Age_in_Daysreturn a value from .

The data in the actual DataFrame will not change and we will Ageget the values ​​we want in the columns.

Sample code:

# Python 3.x
import pandas as pd

df_age = pd.DataFrame(
    {
        "Age_in_Years": ["4 y", None, None, None],
        "Age_in_Months": ["48 m", "24 m", None, None],
        "Age_in_Days": ["1440 d", None, "2520 d", None],
    }
)
df_age["Age"] = (
    df_age["Age_in_Years"]
    .combine_first(df_age["Age_in_Months"])
    .combine_first(df_age["Age_in_Days"])
)
df_age

Output:


In Pandas DataFrame, use bfill()the method to combine values ​​from multiple columns into one column

bfillStands for Backward Fill. This method replaces NaNs with the next row or column value.

Here, we specify to axis=1return the value from the next column if the value in the current column is Null.

Sample code:

# Python 3.x
import pandas as pd

df_age = pd.DataFrame(
    {
        "Age_in_Years": ["4 y", None, None, None],
        "Age_in_Months": ["48 m", "24 m", None, None],
        "Age_in_Days": ["1440 d", None, "2520 d", None],
    }
)
df_age["Age"] = df_age.bfill(axis=1).iloc[:, 0]
df_age

Output:


In Pandas DataFrame, use mask()the method to combine values ​​from multiple columns into one column

mask()Method works if-thensimilarly to .

If the null condition for a column is false, then its value will be used. Otherwise, it will get the value from the other specified column.

Sample code:

# Python 3.x
import pandas as pd

df_age = pd.DataFrame(
    {
        "Age_in_Years": ["4 y", None, None, None],
        "Age_in_Months": ["48 m", "24 m", None, None],
        "Age_in_Days": ["1440 d", None, "2520 d", None],
    }
)
df_age["Age"] = (
    df_age["Age_in_Years"]
    .mask(pd.isnull, df_age["Age_in_Months"])
    .mask(pd.isnull, df_age["Age_in_Days"])
)
df_age

Output:

For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.

Article URL:

Related Articles

Convert Pandas to CSV without index

Publish Date:2025/05/01 Views:159 Category:Python

As you know, an index can be thought of as a reference point used to store and access records in a DataFrame. They are unique for each row and usually range from 0 to the last row of the DataFrame, but we can also have serial numbers, dates

Convert Pandas DataFrame to Dictionary

Publish Date:2025/05/01 Views:197 Category:Python

This tutorial will show you how to convert a Pandas DataFrame into a dictionary with the index column elements as keys and the corresponding elements of other columns as values. We will use the following DataFrame in the article. import pan

Convert Pandas DataFrame columns to lists

Publish Date:2025/05/01 Views:191 Category:Python

When working with Pandas DataFrames in Python, you often need to convert the columns of the DataFrame into Python lists. This process is very important for various data manipulation and analysis tasks. Fortunately, Pandas provides several m

Subtracting Two Columns in Pandas DataFrame

Publish Date:2025/05/01 Views:120 Category:Python

Pandas can handle very large data sets and has a variety of functions and operations that can be applied to the data. One of the simple operations is to subtract two columns and store the result in a new column, which we will discuss in thi

Dropping columns by index in Pandas DataFrame

Publish Date:2025/05/01 Views:99 Category:Python

DataFrames can be very large and can contain hundreds of rows and columns. It is necessary to master the basic maintenance operations of DataFrames, such as deleting multiple columns. We can use dataframe.drop() the method to delete columns

Pandas Copy DataFrame

Publish Date:2025/05/01 Views:53 Category:Python

This tutorial will show you how to DataFrame.copy() copy a DataFrame object using the copy method. import pandas as pd items_df = pd . DataFrame( { "Id" : [ 302 , 504 , 708 ], "Cost" : [ "300" , "400" , "350" ], } ) print (items_df) Output:

Pandas DataFrame.ix[] Function

Publish Date:2025/05/01 Views:168 Category:Python

Python Pandas DataFrame.ix[] function slices rows or columns based on the value of the argument. pandas.DataFrame.ix[] grammar DataFrame . ix[index = None , label = None ] parameter index Integer or list of integers used to slice row indice

Pandas DataFrame.describe() Function

Publish Date:2025/05/01 Views:120 Category:Python

Python Pandas DataFrame.describe() function returns the statistics of a DataFrame. pandas.DataFrame.describe() grammar DataFrame . describe( percentiles = None , include = None , exclude = None , datetime_is_numeric = False ) parameter perc

Pandas DataFrame.astype() Function

Publish Date:2025/05/01 Views:160 Category:Python

Python Pandas DataFrame.astype() function changes the data type of an object to the specified data type. pandas.DataFrame.astype() grammar DataFrame . astype(dtype, copy = True , errors = "raise" ) parameter dtype The data type we want to a

Scan to Read All Tech Tutorials

Social Media
  • https://www.github.com/onmpw
  • qq:1244347461

Recommended

Tags

Scan the Code
Easier Access Tutorial