How to filter DataFrame rows based on column values in Pandas
We can select rows based on single or multiple column values DataFrame
. We can also get rows from a DataFrame that satisfy or do not satisfy one or more conditions. This can be achieved using Boolean indexing, positional indexing, label indexing, and the query() method.
Select Pandas rows based on specific column values
We can select Pandas rows from a DataFrame that contain or do not contain a specific value of a column. It is widely used to filter a DataFrame based on a column value.
Select Pandas rows containing specific column values
In boolean indexing, we first generate a mask, which is simply a series of boolean values that represent whether the column contains a particular element.
df_mask = df["col_name"] == "specific_value"
We then apply this mask to the original DataFrame to filter the desired values.
filtered_df = df[df_mask]
This returns a filtered DataFrame that DataFrame
contains only rows with values for col_name
the column .specific_value
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
df_mask = df["Sales"] == 300
filtered_df = df[df_mask]
print(filtered_df)
Output:
Date Sales Price
1 April-11 300 1
4 April-14 300 3
5 April-16 300 2
This will give df
all rows in which sales is 300.
It is similar to boolean indexing but does one more step. In this method, we first create a boolean mask and then find the positions where the boolean mask has true values. We then True
pass all the positions in the mask where the value is true to iloc()
the method in order to select only all the required rows.
import pandas as pd
import numpy as np
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
df_mask = df["Sales"] == 300
positions = np.flatnonzero(df_mask)
filtered_df = df.iloc[positions]
print(filtered_df)
Output:
Date Sales Price
1 April-11 300 1
4 April-14 300 3
5 April-16 300 2
This will also give df
all rows in which the sales value is 300.
We can also use Pandas Chaining to filter by column values pandas.DataFrame
. In this method, we use pandas.DataFrame.eq() method, whose values will be checked to compare element-wise equality in the DataFrame.
import pandas as pd
import numpy as np
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
filtered_df = df[df.Sales.eq(300)]
print(filtered_df)
Output:
Date Sales Price
1 April-11 300 1
4 April-14 300 3
5 April-16 300 2
We can select Pandas rows by column values in a column using pandas.DataFrame.query().
import pandas as pd
import numpy as np
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
filtered_df = df.query("Sales == 300")
print(filtered_df)
Output:
Date Sales Price
1 April-11 300 1
4 April-14 300 3
5 April-16 300 2
If we wish to update an existing DataFrame, we can do so in the query method inplace=True
.
Select Pandas rows that do not contain a specific column value
The method of selecting Pandas rows that do not contain a specific column value is similar to the method of selecting Pandas rows with a specific column value. The only thing we need to change is the condition, that is, when creating the mask or query, just ==
replace with !=
for the column not to contain a specific value.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
df_mask = df["Sales"] != 300
filtered_df = df[df_mask]
print(filtered_df)
Output:
Date Sales Price
0 April-10 200 3
2 April-12 400 2
3 April-13 200 4
This will select 300
all df
rows where the Sales value is not .
Select Pandas rows where column value is greater or less than a specific value
To select Pandas rows where column values are greater than or less than a specific value, use operators such as >
, <=
, when creating a mask or query .>=
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
df_mask = df["Sales"] >= 300
filtered_df = df[df_mask]
print(filtered_df)
Output:
Date Sales Price
1 April-11 300 1
2 April-12 400 2
4 April-14 300 3
5 April-16 300 2
This will result in a DataFrame Sales
with values greater than or equal to 300
.
Select Pandas rows based on multiple column values
We have introduced methods to select rows based on specific values of columns in DataFrame. In this section, we will discuss methods to select rows in Pandas based on multiple column values.
Select Pandas rows containing one of multiple column values
To select Pandas rows that contain one of multiple column values, we use pandas.DataFrame.isin(values), which returns a DataFrame of Boolean values representing whether each element in the DataFrame is contained in values. The DataFrame of Boolean values obtained in this way can be used to select rows.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
values = [200, 400]
filtered_df = df[df.Sales.isin(values)]
print(filtered_df)
Output:
Date Sales Price
0 April-10 200 3
2 April-12 400 2
3 April-13 200 4
It filters out all rows in the DataFrame Sales
where the value of is 200
or .400
Select Pandas rows that do not contain one of multiple specified column values
To select the rows of the DataFrame that do not contain any of multiple specified column values, we will negate ~
the value returned from pandas.DataFrame.isin(values) by placing the - sign booleans
in front.DataFrame
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
values = [200, 400]
filtered_df = df[~df.Sales.isin(values)]
print(filtered_df)
Output:
Date Sales Price
1 April-11 300 1
4 April-14 300 3
5 April-16 300 2
It filters all rows from the DataFrame whose sales value is neither 200 nor 400.
Select DataFrame rows with multiple conditions
If we want to filter rows considering row values of multiple columns, we create multiple conditions and &
combine them with the operator. Now, the row is selected only if the conditions of all the columns are met.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
sales = [200, 300, 400, 200, 300, 300]
prices = [3, 1, 2, 4, 3, 2]
df = pd.DataFrame({"Date": dates, "Sales": sales, "Price": prices})
values_1 = [200, 400]
values_2 = [2, 3]
filtered_df = df[df.Sales.isin(values_1) & ~df.Price.isin(values_2)]
print(filtered_df)
Output:
Date Sales Price
3 April-13 200 4
It filters all rows from the DataFrame where sales value is 200
or 400
and price is 2
or 3
. The rows in the output satisfy only those two conditions in the entire DataFrame.
For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.
Related Articles
Convert Tensor to NumPy array in Python
Publish Date:2025/05/03 Views:85 Category:Python
-
This tutorial will show you how to convert a Tensor to a NumPy array in Python. Use the function in Python Tensor.numpy() to convert a tensor to a NumPy array Eager Execution of TensorFlow library can be used to convert tensor to NumPy arra
Saving NumPy arrays as images in Python
Publish Date:2025/05/03 Views:193 Category:Python
-
In Python, numpy module is used to manipulate arrays. There are many modules available in Python that allow us to read and store images. An image can be thought of as an array of different pixels stored at specific locations with correspond
Transposing a 1D array in NumPy
Publish Date:2025/05/03 Views:98 Category:Python
-
Arrays and matrices form the core of this Python library. The transpose of these arrays and matrices plays a vital role in certain topics such as machine learning. In NumPy, it is easy to calculate the transpose of an array or a matrix. Tra
Find the first index of an element in a NumPy array
Publish Date:2025/05/03 Views:58 Category:Python
-
In this tutorial, we will discuss how to find the first index of an element in a numpy array. Use where() the function to find the first index of an element in a NumPy array The function in the numpy module where() is used to return an arra
Remove Nan values from NumPy array
Publish Date:2025/05/03 Views:118 Category:Python
-
This article discusses some built-in NumPy functions that you can use to remove nan values. Remove Nan values using logical_not() and methods in NumPy isnan() logical_not() is used to apply logical NOT to the elements of an array. isn
Normalizing a vector in Python
Publish Date:2025/05/03 Views:51 Category:Python
-
A common concept in the field of machine learning is to normalize a vector or dataset before passing it to the algorithm. When we talk about normalizing a vector, we say that its vector magnitude is 1, being a unit vector. In this tutorial,
Calculating Euclidean distance in Python
Publish Date:2025/05/03 Views:128 Category:Python
-
In the world of mathematics, the shortest distance between two points in any dimension is called the Euclidean distance. It is the square root of the sum of the squares of the differences between the two points. In Python, the numpy, scipy
Element-wise division in Python NumPy
Publish Date:2025/05/03 Views:199 Category:Python
-
This tutorial shows you how to perform element-wise division on NumPy arrays in Python. NumPy Element-Wise Division using numpy.divide() the function If we have two arrays and want to divide each element of the first array with each element
Convert 3D array to 2D array in Python
Publish Date:2025/05/03 Views:79 Category:Python
-
In this tutorial, we will discuss the methods to convert 3D array to 2D array in Python. numpy.reshape() Convert 3D array to 2D array using function in Python [ numpy.reshape() Function](numpy.reshape - NumPy v1.20 manual)Changes the shape