How to Shuffle DataFrame Rows in Pandas

Author: JIYIK  Last Updated: 2025/04/30

We can shuffle the rows of a Pandas DataFrame with the sample() method of the DataFrame object, the permutation() function from the NumPy random module, or the shuffle() function from the sklearn package.


pandas.DataFrame.sample() Method to Shuffle Rows in Pandas DataFrame

pandas.DataFrame.sample() returns a random sample of items from an axis of a DataFrame object. To sample rows, the axis parameter must be 0. This is already the default for a DataFrame, so we do not need to pass it.

The frac parameter determines what fraction of the total number of rows to return. Sampling is done without replacement by default, so setting frac to 1 returns every row exactly once, in a random order.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]

df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})
print(df)

df_shuffled = df.sample(frac=1)
print(df_shuffled)

Output:

       Date   Fruit  Price
0  April-10   Apple      3
1  April-11  Papaya      1
2  April-12  Banana      2
3  April-13   Mango      4
       Date   Fruit  Price
3  April-13   Mango      4
2  April-12  Banana      2
0  April-10   Apple      3
1  April-11  Papaya      1

As shown above, the DataFrame.sample() method shuffles the rows of a Pandas DataFrame. Each row keeps its original index label, so the index values appear in shuffled order too.

We can chain the reset_index() method to reset the index of the DataFrame.
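Because the labels travel with the rows, the shuffle can be undone. The sketch below assumes the original DataFrame has a unique index. The random_state value is an arbitrary seed added here only to make the run repeatable; it is not part of the example above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date": ["April-10", "April-11", "April-12", "April-13"],
        "Fruit": ["Apple", "Papaya", "Banana", "Mango"],
        "Price": [3, 1, 2, 4],
    }
)

# random_state=0 is an arbitrary seed, used only for repeatability.
df_shuffled = df.sample(frac=1, random_state=0)

# The rows moved, but each one still carries its original label,
# so a label lookup returns the same row as before the shuffle.
same_row = df_shuffled.loc[2].equals(df.loc[2])

# Sorting by the index puts the rows back in their original order.
restored = df_shuffled.sort_index()

print(same_row)
print(restored.equals(df))
```

This is one reason to keep the original index when the shuffled rows may need to be matched back to the source data later.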

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]

df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})
print(df)

df_shuffled = df.sample(frac=1).reset_index(drop=True)
print(df_shuffled)

Output:

       Date   Fruit  Price
0  April-10   Apple      3
1  April-11  Papaya      1
2  April-12  Banana      2
3  April-13   Mango      4
       Date   Fruit  Price
0  April-11  Papaya      1
1  April-13   Mango      4
2  April-10   Apple      3
3  April-12  Banana      2

Here, the drop=True option prevents the old index from being added to the DataFrame as a new column.
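If the same shuffled order is needed on every run, for example in a test, sample() also accepts a random_state parameter. The following is a minimal sketch. The seed value 42 is an arbitrary choice, not something the example above prescribes.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date": ["April-10", "April-11", "April-12", "April-13"],
        "Fruit": ["Apple", "Papaya", "Banana", "Mango"],
        "Price": [3, 1, 2, 4],
    }
)

# Passing the same seed twice yields the same row order both times.
first = df.sample(frac=1, random_state=42).reset_index(drop=True)
second = df.sample(frac=1, random_state=42).reset_index(drop=True)

print(first)
print(first.equals(second))
```

Without random_state, each call draws a fresh random order, which is usually what you want outside of tests.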


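numpy.random.permutation() to Shuffle Pandas DataFrame Rows

We can use numpy.random.permutation() to permute the index of a DataFrame. When the randomly permuted index is passed to the iloc indexer to select rows, we get the rows in a randomly permuted order. Because iloc selects rows by integer position, this works as written only when the DataFrame has the default integer index (0, 1, 2, ...), as it does here.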

import pandas as pd
import numpy as np

dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]

df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})

df_shuffled = df.iloc[np.random.permutation(df.index)].reset_index(drop=True)
print(df_shuffled)

Output:

       Date   Fruit  Price
0  April-13   Mango      4
1  April-12  Banana      2
2  April-10   Apple      3
3  April-11  Papaya      1

You may get different results each time you run the same code. This is because the np.random.permutation() function generates a different arrangement of numbers on each call unless the random seed is fixed.
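A variant that may be useful when the DataFrame does not have the default integer index: permute the row positions (0 to len(df) - 1) rather than the index labels, and draw them from a seeded NumPy Generator so the result is repeatable. This is a sketch under assumptions. The string labels "a" to "d" and the seed 0 are hypothetical values chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical non-default index labels, to show that positions still work.
df = pd.DataFrame(
    {
        "Date": ["April-10", "April-11", "April-12", "April-13"],
        "Fruit": ["Apple", "Papaya", "Banana", "Mango"],
        "Price": [3, 1, 2, 4],
    },
    index=["a", "b", "c", "d"],
)

# A seeded Generator returns the same permutation on every run.
rng = np.random.default_rng(seed=0)

# Permute integer positions, which is what iloc expects.
positions = rng.permutation(len(df))
df_shuffled = df.iloc[positions]

print(df_shuffled)
```

Chaining reset_index(drop=True) afterwards would replace the labels with a fresh 0, 1, 2, ... index, as in the earlier examples.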


sklearn.utils.shuffle() to Shuffle Pandas DataFrame Rows

We can also use sklearn.utils.shuffle() to shuffle the rows of a Pandas DataFrame.

import pandas as pd
from sklearn.utils import shuffle

dates = ["April-10", "April-11", "April-12", "April-13"]
fruits = ["Apple", "Papaya", "Banana", "Mango"]
prices = [3, 1, 2, 4]

df = pd.DataFrame({"Date": dates, "Fruit": fruits, "Price": prices})

# shuffle() returns a shuffled copy; the original DataFrame is unchanged.
df_shuffled = shuffle(df)
print(df_shuffled)

Output:

       Date   Fruit  Price
3  April-13   Mango      4
0  April-10   Apple      3
1  April-11  Papaya      1
2  April-12  Banana      2
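As with sample(), the rows keep their original index labels here. The sketch below shows one way to get a repeatable order and a clean index, assuming scikit-learn is installed. The random_state value 0 is an arbitrary seed chosen for illustration.

```python
import pandas as pd
from sklearn.utils import shuffle

df = pd.DataFrame(
    {
        "Date": ["April-10", "April-11", "April-12", "April-13"],
        "Fruit": ["Apple", "Papaya", "Banana", "Mango"],
        "Price": [3, 1, 2, 4],
    }
)

# The same seed gives the same shuffled order on both calls.
first = shuffle(df, random_state=0)
second = shuffle(df, random_state=0)
print(first.equals(second))

# Optionally discard the old labels and renumber the rows from 0.
shuffled_reset = first.reset_index(drop=True)
print(shuffled_reset)
```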

If you don't have the sklearn package installed, you can install it using the following command:

pip install -U scikit-learn
