Filling Missing Values in Pandas DataFrame

Author: JIYIK Last Updated: 2025/05/01

A dataset may contain missing values, and Pandas offers several methods for replacing them.

ffill() (forward fill) is one of the methods for replacing missing values in a DataFrame. It replaces each NaN with the last valid value from the preceding row or column.


Syntax of the Pandas ffill() Method

# Python 3.x
dataframe.ffill(axis, inplace, limit, downcast)

The ffill() method takes four optional parameters:

  • axis: Specifies the axis along which to fill missing values. The value 0 (the default) fills down the rows of each column, and 1 fills across the columns of each row.
  • inplace: Can be True or False. True makes the changes in the current DataFrame, while False (the default) returns a new DataFrame with the filled values.
  • limit: Specifies the maximum number of consecutive missing values to fill along the axis.
  • downcast: A dictionary mapping items to the data type they should be downcast to where possible. This parameter is deprecated as of pandas 2.2.
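As a rough illustration of the parameters listed above, the sketch below passes axis and limit by keyword. The DataFrame demo and its columns A and B are made up for this illustration and are not part of the article's examples. downcast is left out because it is deprecated.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
demo = pd.DataFrame({"A": [1.0, None, None], "B": [None, 5.0, None]})

# Fill down each column (axis=0), filling at most one consecutive NaN per gap.
filled = demo.ffill(axis=0, limit=1)
print(filled)
```

Passing the arguments by keyword keeps the call readable and avoids relying on argument order.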

Fill missing values in a DataFrame using Pandas ffill()

Fill missing values along the row axis

In the following code, we have a DataFrame with missing values represented by None or NaN. We display the original DataFrame and then apply the ffill() method to it.

By default, the ffill() method replaces missing values along the row/index axis. Each NaN is replaced with the value from the previous row in the same column.

The NaN in the first row (column C2) remains in the output because there is no preceding row to take a value from.

Sample code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)
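print(df)  # original DataFrame; display() only exists in IPython/Jupyter
df2 = df.ffill()  # forward fill down each column (axis=0 is the default)
print(df2)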

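Output:

    C1   C2   C3   C4
0  2.0  NaN  2.0  5.0
1  7.0  2.0  NaN  2.0
2  NaN  NaN  6.0  8.0
3  4.0  3.0  5.0  NaN
    C1   C2   C3   C4
0  2.0  NaN  2.0  5.0
1  7.0  2.0  2.0  2.0
2  7.0  2.0  6.0  8.0
3  4.0  3.0  5.0  8.0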
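One way to confirm which cells ffill() could not fill along the row axis is to count the NaNs left in each column. The sketch below reuses the DataFrame from the sample code. isna() and sum() are standard pandas calls, and the variable name remaining is chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)
df2 = df.ffill()

# Count the NaNs still present in each column after the forward fill.
remaining = df2.isna().sum()
print(remaining)
```

Only C2 should report a leftover NaN, since its first row has no preceding value.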

Fill missing values along the column axis

Here, we specify axis=1 so that each missing value is filled with the value from the previous column in the same row.

In the output, all values are filled except two, both in the third row. Column C1 has no previous column, so its NaN remains.

The value in column C2 is still NaN because the corresponding cell in the previous column, C1, is also NaN.

Sample code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)
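print(df)  # original DataFrame; display() only exists in IPython/Jupyter
df2 = df.ffill(axis=1)  # forward fill across each row, left to right
print(df2)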

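Output:

    C1   C2   C3   C4
0  2.0  NaN  2.0  5.0
1  7.0  2.0  NaN  2.0
2  NaN  NaN  6.0  8.0
3  4.0  3.0  5.0  NaN
    C1   C2   C3   C4
0  2.0  2.0  2.0  5.0
1  7.0  2.0  2.0  2.0
2  NaN  NaN  6.0  8.0
3  4.0  3.0  5.0  5.0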
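A column-wise fill can also be read as a row-wise fill on the transposed DataFrame. The sketch below compares the two on the same data. It assumes all columns share one numeric type, as they do here, and the names by_columns and via_transpose are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)

# Fill across each row using the value from the previous column.
by_columns = df.ffill(axis=1)

# Transpose, fill down the default axis, then transpose back.
via_transpose = df.T.ffill().T

# Check whether both approaches give the same DataFrame.
print(by_columns.equals(via_transpose))
```

In practice axis=1 is the more direct choice. The transposed form is shown only to make the fill direction easier to picture.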

Use limit to limit the number of consecutive NaNs to be filled

We can use the limit parameter to limit the number of consecutive missing values to be filled along the row or column axis.

In the code below, column C2 of the DataFrame has consecutive NaNs in its last three rows. If we specify limit=2, no more than two consecutive NaNs are filled along the row axis.

That is why the NaN in the last row of C2 remains unfilled.

Sample code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)
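print(df)  # original DataFrame; display() only exists in IPython/Jupyter
df2 = df.ffill(axis=0, limit=2)  # fill at most two consecutive NaNs per gap
print(df2)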

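Output:

    C1   C2  C3   C4
0  2.0  4.0   6  NaN
1  7.0  NaN   6  2.0
2  NaN  NaN   6  8.0
3  4.0  NaN   5  NaN
    C1   C2  C3   C4
0  2.0  4.0   6  NaN
1  7.0  4.0   6  2.0
2  7.0  4.0   6  8.0
3  4.0  NaN   5  8.0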
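To see how the limit value changes the result, the sketch below applies ffill() to the values of column C2 alone with three different limits and counts the NaNs left over each time. The Series c2 and the dictionary unfilled are names chosen only for this illustration.

```python
import pandas as pd

# The values of column C2 from the sample code: one value, then three NaNs.
c2 = pd.Series([4, None, None, None], name="C2")

# For each limit, count how many NaNs remain after the forward fill.
unfilled = {n: int(c2.ffill(limit=n).isna().sum()) for n in (1, 2, 3)}
print(unfilled)
```

Each increase in limit fills one more of the three consecutive NaNs.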

Fill the values in the original DataFrame using inplace

Suppose that, instead of storing the filled values in a separate DataFrame, we want to change the original DataFrame. In this case, we can pass the inplace parameter with the value True.

Sample code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)
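print(df)  # original DataFrame; display() only exists in IPython/Jupyter
df.ffill(inplace=True)  # modify df itself instead of creating a copy
print(df)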

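Output:

    C1   C2  C3   C4
0  2.0  4.0   6  NaN
1  7.0  NaN   6  2.0
2  NaN  NaN   6  8.0
3  4.0  NaN   5  NaN
    C1   C2  C3   C4
0  2.0  4.0   6  NaN
1  7.0  4.0   6  2.0
2  7.0  4.0   6  8.0
3  4.0  4.0   5  8.0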
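The difference between the default call and inplace=True can be checked directly. The sketch below uses the same DataFrame. The names original, filled_copy, unchanged and matches are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)
original = df.copy()

# Default call: a new DataFrame is returned and df keeps its NaNs.
filled_copy = df.ffill()
unchanged = df.equals(original)

# inplace=True: df itself is modified.
df.ffill(inplace=True)
matches = df.equals(filled_copy)

print(unchanged, matches)
```

Assigning the returned DataFrame to a new name keeps the original data available. inplace=True is convenient when the unfilled version is no longer needed.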
