Difference between Shallow Copy and Deep Copy in Pandas DataFrame
This tutorial explains the difference between shallow and deep copies of a Pandas DataFrame.
When we want to add, remove, or update data in a DataFrame, we can make a copy and perform the operation on it without modifying the original DataFrame.
Difference between Shallow and Deep Copy in Pandas Dataframes
There are many differences between shallow and deep copies in Pandas DataFrames. This article covers two of them: how fast the copy is created and whether the copy depends on the original.
DataFrame.copy()
The syntax of the Pandas DataFrame.copy() method is shown below.
DataFrame.copy(deep=True)
deep
A Boolean value (True or False); the default is True. It selects between the two ways Pandas can copy a data structure: shallow copying (deep=False) and deep copying (deep=True). First, we discuss shallow copying.
Creating Shallow Copies in Pandas DataFrames is Faster Than Creating Deep Copies
deep=False
Does not copy the index or data of the original object. Use the df.copy(deep=False) method to make a shallow copy of a Pandas DataFrame.
A shallow copy creates a new collection object and then populates it with references to the original sub-objects. Because the copy operation is not recursive, no copies of the sub-objects are created.
Because no data is duplicated, creating a shallow copy is faster than creating a deep copy.
pandas.DataFrame.copy(deep=False)
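The speed claim can be checked with a rough timing sketch. The frame size, the variable names, and the repeat count below are arbitrary choices for illustration, and the absolute numbers will vary with the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test frame: 200,000 rows x 10 float columns (size chosen arbitrarily).
frame = pd.DataFrame(np.random.rand(200_000, 10))

# Time 20 copies of each kind and report the totals.
shallow_seconds = timeit.timeit(lambda: frame.copy(deep=False), number=20)
deep_seconds = timeit.timeit(lambda: frame.copy(deep=True), number=20)

print(f"shallow: {shallow_seconds:.4f} s, deep: {deep_seconds:.4f} s")
```

On a frame of this size the shallow copy is typically much quicker, since it does not duplicate the underlying arrays.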
To try this out, first import the Python Pandas library.
import pandas as pd
After importing the Pandas library, create a DataFrame named df1.
df1 = pd.DataFrame([5, 6, 7, 8, 9])
print(df1)
Output:
   0
0  5
1  6
2  7
3  8
4  9
Now call id() on df1 to see the identity of the object.
>>> id(df1)
Output:
140509987701904
Create a variable df2, assign df1 to it, and view the id of df2.
>>> df2 = df1
>>> id(df2)
Output:
140509987701904
df2 and df1 have the same id, because a plain assignment only binds a second name to the same object.
Now, use the copy function to see if the id changes.
>>> df3 = df1.copy()
>>> id(df3)
Check out the output below to see the changes.
Output:
140509924069968
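The same comparison can be made without reading raw id values by using the is operator. This is a small sketch with hypothetical variable names (original, alias, shallow, deep) that are not part of the session above.

```python
import pandas as pd

original = pd.DataFrame({"value": [5, 6, 7, 8, 9]})

alias = original                      # plain assignment: a second name for the same object
shallow = original.copy(deep=False)   # new DataFrame object, shallow copy
deep = original.copy()                # new DataFrame object, deep=True is the default

print(alias is original)    # True
print(shallow is original)  # False
print(deep is original)     # False
```

Both kinds of copy return a new DataFrame object, so the id alone cannot tell a shallow copy from a deep one; the difference lies in whether the underlying data is duplicated.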
Shallow copy:
>>> df4 = df1.copy(deep=False)
>>> print(df4)
>>> id(df4)
Output:
   0
0  5
1  6
2  7
3  8
4  9
140509923248976
Deep copy:
deep=True
(Default) Generates a new object with a copy of the calling object's data and indices. Changes to the copy's data or indices will not be reflected in the original object.
Use the df.copy(deep=True) method to make a deep copy of a Pandas DataFrame. In a deep copy, the data of the original object is duplicated into the new object.
This means that modifications made to the copy are not reflected in the original object. Because the data must be duplicated, creating a deep copy takes longer than creating a shallow copy.
>>> df5 = df1.copy(deep=True)
>>> print(df5)
>>> id(df5)
Output:
   0
0  5
1  6
2  7
3  8
4  9
140509923248720
The ids of the shallow copy (df4) and the deep copy (df5) are different, because each call to copy() returns a new DataFrame object. Let's take another example to see the difference between shallow copy and deep copy.
Shallow copies depend on the original
import pandas as pd
df = pd.DataFrame({"in": [1, 2, 3, 4], "Maria": ["Man", "kon", "nerti", "Ba"]})
copydf = df.copy(deep=False)
print("\nBefore Operation:\n", copydf == df)
copydf["in"] = [0, 0, 0, 0]
print("\nAfter Operation:\n", copydf == df)
print("\nAfter operation original dataframe:\n", df)
Output:
Before Operation:
in Maria
0 True True
1 True True
2 True True
3 True True
After Operation:
in Maria
0 True True
1 True True
2 True True
3 True True
After operation original dataframe:
in Maria
0 0 Man
1 0 kon
2 0 nerti
3 0 Ba
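As shown in the output of the previous program, the modifications made to the shallow copy of the DataFrame are automatically applied to the original DataFrame. Whether this happens depends on the pandas version: the output above reflects older releases, newer releases may replace the column in the copy only when a whole column is assigned, and with Copy-on-Write enabled (the default starting with pandas 3.0) changes to a shallow copy are never reflected in the original. Now use the same code, but change the argument to deep=True to make a deep copy.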
Deep copy does not completely depend on the original
import pandas as pd
df = pd.DataFrame({"in": [1, 2, 3, 4], "Maria": ["Man", "kon", "nerti", "Ba"]})
copydf = df.copy(deep=True)
print("\nBefore Operation:\n", copydf == df)
copydf["in"] = [0, 0, 0, 0]
print("\nAfter Operation:\n", copydf == df)
print("\nAfter operation original dataframe:\n", df)
Output:
Before Operation:
in Maria
0 True True
1 True True
2 True True
3 True True
After Operation:
in Maria
0 False True
1 False True
2 False True
3 False True
After operation original dataframe:
in Maria
0 1 Man
1 2 kon
2 3 nerti
3 4 Ba
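Because the visible effect of writing to a shallow copy depends on the pandas version, a more stable way to compare the two kinds of copy is to ask NumPy whether their column data overlaps in memory with the original. The sketch below assumes a numeric column backed by a NumPy array; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"in": [1, 2, 3, 4]})
shallow = df.copy(deep=False)
deep = df.copy(deep=True)

# Compare the arrays behind the "in" column of each frame.
shares_with_shallow = np.shares_memory(df["in"].to_numpy(), shallow["in"].to_numpy())
shares_with_deep = np.shares_memory(df["in"].to_numpy(), deep["in"].to_numpy())

print(shares_with_shallow)  # True: the shallow copy points at the original data
print(shares_with_deep)     # False: the deep copy owns its own data
```

The shallow copy reuses the original buffer, which is why it is cheap to create, while the deep copy holds a separate buffer and is therefore independent of the original.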
Note, however, that even with deep=True the Python objects stored inside the data are not copied recursively; only references to them are copied, so they still point to the same memory location.
For example, if a Series holds mutable objects such as lists, those objects are shared between the Series and its deep copy, and an in-place change to one of them is reflected in the other.
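The caveat about mutable objects can be seen with a Series of lists. This is a minimal sketch; the variable names s and deep are chosen for illustration.

```python
import pandas as pd

# A Series whose elements are mutable Python lists (object dtype).
s = pd.Series([[1, 2], [3, 4]])
deep = s.copy(deep=True)

# Mutate the first list in place; this change does not go through pandas.
s.iloc[0][0] = 10

print(deep.iloc[0])               # [10, 2]: the deep copy sees the change
print(deep.iloc[0] is s.iloc[0])  # True: both Series hold the same list object
```

If fully independent nested objects are needed, one option is Python's copy.deepcopy applied to the stored objects themselves, at the cost of extra time and memory.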