How to Shuffle DataFrame Rows in Pandas
We can use sample()the shuffle method of the Pandas Dataframe object, the shuffle function from the NumPy module permutation(), and the shuffle function from the sklearn package shuffle()to shuffle the rows of a Pandas DataFrame.
pandas.DataFrame.sample()Methods to Shuffle Rows in Pandas DataFrame
pandas.DataFrame.sample() can be used to return a random sample of items starting from an axis of a DataFrame object. We need to axisset the parameter to 0 because we need to sample elements row-wise, which is axisthe default value of the parameter.
fracThe parameter determines what fraction of the total number of instances to return. If you want a random order, fracset the value of to 1.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]
df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})
print(df)
df_shuffled = df.sample(frac=1).reset_index(drop=True)
print(df_shuffled)
Output:
Date Fruit Price
0 April-10 Apple 3
1 April-11 Papaya 1
2 April-12 Banana 2
3 April-13 Mango 4
Date Fruit Price
3 April-13 Mango 4
2 April-12 Banana 2
0 April-10 Apple 3
1 April-11 Papaya 1
As shown above, Dataframe.shuttlethe method shuffles the rows of a Pandas DataFrame. The index of the DataFrame rows is the same as the initial index.
We can add a reset_index() method to reset DataFramethe index.
import pandas as pd
dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]
df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})
print(df)
df_shuffled = df.sample(frac=1).reset_index(drop=True)
print(df_shuffled)
Output:
Date Fruit Price
0 April-10 Apple 3
1 April-11 Papaya 1
2 April-12 Banana 2
3 April-13 Mango 4
Date Fruit Price
0 April-11 Papaya 1
1 April-13 Mango 4
2 April-10 Apple 3
3 April-12 Banana 2
Here, drop=Truethe option prevents indexthe column from being added as a new column.
numpy.random.permutation() randomly permutes Pandas DataFrame rows
We can use numpy.random.permutation() to permute the index of a DataFrame. When iloc()the randomly permuted index is used to select rows using the method, we will get randomly permuted rows.
import pandas as pd
import numpy as np
dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]
df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})
df_shuffled = df.iloc[np.random.permutation(df.index)].reset_index(drop=True)
print(df_shuffled)
Output:
Date Fruit Price
0 April-13 Mango 4
1 April-12 Banana 2
2 April-10 Apple 3
3 April-11 Papaya 1
You may get different results when running the same code. This is because np.random.permutation()the function generates a different arrangement of numbers each time.
sklearn.utils.shuffle()Shuffle Pandas DataFrame Rows
We can also use sklearn.utils.shuffle() to shuffle the rows of a Pandas DataFrame.
import pandas as pd
import numpy as np
import sklearn
dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]
df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})
df_shuffled = sklearn.utils.shuffle(df)
print(df_shuffled)
Output:
Date Fruit Price
3 April-13 Mango 4
0 April-10 Apple 3
1 April-11 Papaya 1
2 April-12 Banana 2
If you don't have sklearnthe package installed, you can install it using the following script:
pip install -U scikit-learn
For reprinting, please send an email to 1244347461@qq.com for approval. After obtaining the author's consent, kindly include the source as a link.
Related Articles
Pandas DataFrame DataFrame.query() function
Publish Date:2025/04/30 Views:107 Category:Python
-
The pandas.DataFrame.query() method filters the rows of the caller DataFrame using the given query expression. pandas.DataFrame.query() grammar DataFrame . query(expr, inplace = False , ** kwargs) parameter expr Filter rows based on query e
Pandas DataFrame DataFrame.min() function
Publish Date:2025/04/30 Views:162 Category:Python
-
Python Pandas DataFrame.min() function gets the minimum value of the DataFrame object along the specified axis. pandas.DataFrame.min() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = None , ** kwargs) pa
Pandas DataFrame DataFrame.mean() function
Publish Date:2025/04/30 Views:85 Category:Python
-
Python Pandas DataFrame.mean() function calculates the mean of the values of the DataFrame object over the specified axis. pandas.DataFrame.mean() grammar DataFrame . mean(axis = None , skipna = None , level = None , numeric_only = No
Pandas DataFrame DataFrame.isin() function
Publish Date:2025/04/30 Views:133 Category:Python
-
The pandas.DataFrame.isin(values) function checks whether each element in the caller DataFrame contains values the value specified in the input . pandas.DataFrame.isin(values) grammar DataFrame . isin(values) parameter values iterable - lis
Pandas DataFrame DataFrame.groupby() function
Publish Date:2025/04/30 Views:161 Category:Python
-
pandas.DataFrame.groupby() takes a DataFrame as input and divides the DataFrame into groups based on a given criterion. We can use groupby() the method to easily process large datasets. pandas.DataFrame.groupby() grammar DataFrame . groupby
Pandas DataFrame DataFrame.fillna() function
Publish Date:2025/04/30 Views:60 Category:Python
-
The pandas.DataFrame.fillna() function replaces the values DataFrame in NaN with a certain value. pandas.DataFrame.fillna() grammar DataFrame . fillna( value = None , method = None , axis = None , inplace = False , limit = None , down
Pandas DataFrame DataFrame.dropna() function
Publish Date:2025/04/30 Views:181 Category:Python
-
The pandas.DataFrame.dropna() function removes null values (missing values) from a DataFrame by dropping rows or columns that contain null values DataFrame . NaN ( Not a Number ) and NaT ( Not a Time ) represent null values. DataFrame
Pandas DataFrame DataFrame.assign() function
Publish Date:2025/04/30 Views:55 Category:Python
-
Python Pandas DataFrame.assign() function assigns new columns to DataFrame . pandas.DataFrame.assign() grammar DataFrame . assign( ** kwargs) parameter **kwargs Keyword arguments, DataFrame the column names to be assigned to are passed as k
Pandas DataFrame DataFrame.transform() function
Publish Date:2025/04/30 Views:120 Category:Python
-
Python Pandas DataFrame.transform() DataFrame applies a function on and transforms DataFrame . The function to be applied is passed as an argument to the function. The axis length of transform() the transformed should be the same as the ori

