Difference between Shallow Copy and Deep Copy in Pandas DataFrame

Author: JIYIK Last Updated: 2025/05/03

This tutorial explains the difference between shallow and deep copies of a Pandas DataFrame.

When we want to add, remove, or update data in a DataFrame, we can make a copy and perform the operation on the copy without modifying the original DataFrame.
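As a minimal sketch of this workflow, the snippet below makes a default (deep) copy and then adds, updates, and removes data on the copy only. The DataFrame contents and the column names a, b, and c are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
original = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# copy() defaults to deep=True, so the copy owns its own data.
working = original.copy()

# Add, update, and remove on the copy only.
working["c"] = working["a"] + working["b"]  # add a column
working.loc[0, "a"] = 100                   # update a value
working = working.drop(columns=["b"])       # remove a column

print(list(original.columns))  # ['a', 'b']
print(list(working.columns))   # ['a', 'c']
print(original.loc[0, "a"])    # 1
```

The original keeps its two columns and its first value, while all three changes appear only in the copy.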


Difference between Shallow and Deep Copy in Pandas DataFrames

Shallow and deep copies of a Pandas DataFrame differ in several ways. This article covers two of them: how fast the copy is created, and whether the copy depends on the original.

You can see the syntax of the Pandas DataFrame.copy() function below.

DataFrame.copy(deep=True)

The deep parameter is a Boolean value (True or False), and the default is True. Pandas supports both shallow copying and deep copying of its data structures. First, we discuss shallow copying.

Creating Shallow Copies in Pandas DataFrames is Faster Than Creating Deep Copies

deep=False does not copy the index or data of the original object. Use the df.copy(deep=False) method to make a shallow copy of a Pandas DataFrame.

A shallow copy creates a new object and populates it with references to the original data and index. Because the copy operation is not recursive, no copies of the underlying data are created.

As opposed to deep copies, creating shallow copies is faster.
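The speed difference can be measured with the standard timeit module. The sketch below is only illustrative: the DataFrame size and the repetition count are arbitrary assumptions, and the absolute timings depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Illustrative data: 200,000 rows x 10 float columns (sizes are arbitrary).
big = pd.DataFrame(np.zeros((200_000, 10)))

# Time 20 repetitions of each kind of copy.
shallow_time = timeit.timeit(lambda: big.copy(deep=False), number=20)
deep_time = timeit.timeit(lambda: big.copy(deep=True), number=20)

print(f"shallow: {shallow_time:.4f} s")
print(f"deep:    {deep_time:.4f} s")
```

A deep copy has to duplicate every value, so its time grows with the size of the data, whereas a shallow copy only creates a new wrapper around the existing data.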

pandas.DataFrame.copy(deep=False)

Import the Python Pandas library for this purpose.

import pandas as pd

After importing the Pandas library, create a DataFrame and assign it to df1.

df1 = pd.DataFrame([5, 6, 7, 8, 9])
print(df1)

Output:

   0
0  5
1  6
2  7
3  8
4  9

Now call id() on df1 and see what it returns.

>>> id(df1)

Output:

140509987701904

Create a variable df2, assign df1 to it, and view the id of df2.

>>> df2 = df1
>>> id(df2)

Output:

140509987701904

df2 and df1 have the same id because both names refer to the same object. Now, use the copy function to see if the id changes.

>>> df3 = df1.copy()
>>> id(df3)

Check out the output below to see the changes.

Output:

140509924069968
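The id comparison above can also be written with the is operator, which avoids reading raw id numbers. This is a small sketch using the same data; the variable names alias and copied are assumptions for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([5, 6, 7, 8, 9])

alias = df1          # plain assignment: a second name for the same object
copied = df1.copy()  # copy(): a new DataFrame object

same_object = alias is df1
new_object = copied is not df1
equal_values = copied.equals(df1)

print(same_object, new_object, equal_values)  # True True True
```

Plain assignment never creates a copy, while copy() returns a different object that holds equal values.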

Shallow copy:

>>> df4 = df1.copy(deep=False)
>>> print(df4)
>>> id(df4)

Output:

   0
0  5
1  6
2  7
3  8
4  9
140509923248976
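A different id only shows that a new DataFrame object was created; it does not reveal whether the underlying data is shared. One way to check this, sketched below under the assumption that the column is stored as a plain NumPy array, is numpy.shares_memory. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([5, 6, 7, 8, 9])
shallow = df1.copy(deep=False)
deep = df1.copy(deep=True)

# Column label 0 is the default name pandas gives the single column.
base = df1[0].to_numpy()
shares_shallow = np.shares_memory(base, shallow[0].to_numpy())
shares_deep = np.shares_memory(base, deep[0].to_numpy())

print("shallow copy shares data:", shares_shallow)
print("deep copy shares data:", shares_deep)
```

The shallow copy is expected to report shared memory with df1, while the deep copy is expected to own a separate buffer.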

Deep copy:

deep=True (the default) generates a new object with a copy of the calling object's data and indices. Changes to the copy's data or indices will not be reflected in the original object.

Use df.copy(deep=True), or simply df.copy(), to make a deep copy of a Pandas DataFrame. In a deep copy, the object's data is copied into a new, independent object.

This means that any modifications made to the copy will not be reflected in the original object. Because the data has to be duplicated, creating a deep copy takes longer than creating a shallow copy.

>>> df4 = df1.copy(deep=True)
>>> print(df4)
>>> id(df4)

Output:

   0
0  5
1  6
2  7
3  8
4  9
140509923248720

The ids of the shallow copy and the deep copy are different, because each call to copy() creates a new DataFrame object. Let's take another example to see the difference between shallow copy and deep copy.

Shallow copies depend on the original

import pandas as pd

df = pd.DataFrame({"in": [1, 2, 3, 4], "Maria": ["Man", "kon", "nerti", "Ba"]})
copydf = df.copy(deep=False)
print("\nBefore Operation:\n", copydf == df)
copydf["in"] = [0, 0, 0, 0]
print("\nAfter Operation:\n", copydf == df)
print("\nAfter operation original dataframe:\n", df)

Output:

Before Operation:
      in  Maria
0  True   True
1  True   True
2  True   True
3  True   True

After Operation:
      in  Maria
0  True   True
1  True   True
2  True   True
3  True   True

After operation original dataframe:
    in  Maria
0   0    Man
1   0    kon
2   0  nerti
3   0     Ba

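As shown in the output of the previous program, the modifications made to the shallow copy of the DataFrame are also applied to the original DataFrame. Note that this result depends on the pandas version: with Copy-on-Write enabled (the default from pandas 3.0), changes made to a shallow copy are no longer reflected in the original. Now use the same code, but make a deep copy with deep=True.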

Deep copy does not completely depend on the original

import pandas as pd

df = pd.DataFrame({"in": [1, 2, 3, 4], "Maria": ["Man", "kon", "nerti", "Ba"]})
copydf = df.copy(deep=True)
print("\nBefore Operation:\n", copydf == df)
copydf["in"] = [0, 0, 0, 0]
print("\nAfter Operation:\n", copydf == df)
print("\nAfter operation original dataframe:\n", df)

Output:

Before Operation:
      in  Maria
0  True   True
1  True   True
2  True   True
3  True   True

After Operation:
       in  Maria
0  False   True
1  False   True
2  False   True
3  False   True

After operation original dataframe:
    in  Maria
0   1    Man
1   2    kon
2   3  nerti
3   4     Ba
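The two experiments above can be combined into one check that runs on whatever pandas version is installed. In the sketch below, edit_reaches_original is a hypothetical helper written for this illustration, not a pandas function. It overwrites a single cell through the copy and reports whether the original changed. The shallow result is deliberately not hard-coded, since it depends on the pandas version and on whether Copy-on-Write is active.

```python
import pandas as pd


def edit_reaches_original(deep: bool) -> bool:
    """Write through a copy and report whether the original changed.

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame({"in": [1, 2, 3, 4], "Maria": ["Man", "kon", "nerti", "Ba"]})
    copydf = df.copy(deep=deep)
    copydf.iloc[0, 0] = 99  # overwrite one cell through the copy
    return bool(df.iloc[0, 0] == 99)


print("pandas version:", pd.__version__)
shallow_result = edit_reaches_original(deep=False)
deep_result = edit_reaches_original(deep=True)
print("shallow copy edit reached original:", shallow_result)
print("deep copy edit reached original:", deep_result)
```

With a deep copy the edit never reaches the original. With a shallow copy the answer reflects the copy semantics of the installed pandas version.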

Note that even in a deep copy, Python objects stored in the DataFrame are not copied recursively. Only the references to those objects are copied, so they still point to the same memory location.

For example, if a Series holds mutable Python objects such as lists, those objects are shared between the Series and its deep copy, and any change made to one of them through either Series will be visible in the other.
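This caveat can be seen with a Series of lists. The sketch below uses made-up data: it mutates a list through the deep copy and then inspects the original Series.

```python
import pandas as pd

# A Series whose elements are mutable Python lists (object dtype).
s = pd.Series([[1, 2], [3, 4]])
deep = s.copy(deep=True)

# Mutate the list object itself through the deep copy.
deep.iloc[0].append(99)

same_list = s.iloc[0] is deep.iloc[0]
print(s.iloc[0])   # [1, 2, 99]
print(same_list)   # True
```

Both Series hold references to the same list objects, so the appended value shows up in the original. If fully independent elements are needed, the inner objects have to be copied separately, for example with copy.deepcopy from the standard library.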
