This is one of most frequently asked questions in Python interviews since this gives the interviewer an idea of how well the interviewee has understood the language.
Let us begin with the basics. We generally use the assignment operator (=
) to copy an object from one memory location to the other. This doesn't really create a new object (which would be the copy of the original object, rather it creates a new variable that will have the reference to the object, i.e the original variable.
Below example will make it clear that using the assignment operator doesn't really copy data from original location to new location. Only the reference is shared, i.e a binding is created between the original object and the new variable.
my_list_old = [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
my_list_new = my_list_old
my_list_old[2][2] = "Studytonight"
print('The old List is', my_list_old)
print('ID of the Old List:', id(my_list_old))
print('The new List:', my_list_new)
print('ID of the new List:', id(my_list_new))
Output:
The old List is [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'Studytonight']]
ID of the Old List: 2315995030152
The new List: [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'Studytonight']]
ID of the new List: 2315995030152
Since the reference is shared, changes made to either of these variables will, in turn, reflect the change in both of them.
Sometimes, such a characteristic might not be desirable. We might want to change just the newly copied values or the original values and not both of them at a time.
This is when shallow copy and deep copy comes into play. Both these methods are used to create copies of the original object whose results are different.
The deep copy and shallow copy are present in the copy module. Let us begin by understanding the shallow copy method. The deep copy method will be covered in a different post.
To create a shallow copy of any object, the below is used:
import copy
copy.copy(object)
There is no explicit method named shallow copy. The copy
method from the copy module creates a shallow copy of the object provided as argument to it. This creates a new object that stores the reference to the elements in the original object. This means shallow copy doesn't create a new copy of the original object. Instead, it just copies the reference of the elements in the original object to a newly created object. In the end, both the objects would be referring to the same data. This process doesn't occur for every element in the original object, hence it doesn't create new copies of elements of the original object in the new object.
Time for an example:
import copy
my_list_old = [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
my_list_new = copy.copy(my_list_old)
print('The old List is', my_list_old)
print('ID of the Old List:', id(my_list_old))
print('The new List:', my_list_new)
print('ID of the new List:', id(my_list_new))
Output:
The old List is [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
ID of the Old List: 2316046585736
The new List: [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
ID of the new List: 2315995030664
The IDs of both the list objects are different. This means a new object is created that stores the reference of the original object.
This means, if anything new is added to the original object, it won't reflect in the shallow copied object. Below is an example demonstrating the same:
import copy
my_list_old = [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
my_list_new = copy.copy(my_list_old)
my_list_old.append(["I", "love", "Studytonight"])
print('The old List is', my_list_old)
print('The new List:', my_list_new)
Output:
The old List is [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a'], ['I', 'love', 'Studytonight']]
The new List: [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
But if values in the original object are changed, it will be reflected in the new object too. Understand this concept, any additions to the original object are not reflected in the new object, but changing the values of the original object would change the new object too. Below is an example demonstrating this:
import copy
my_list_old = [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
my_list_new = copy.copy(my_list_old)
print('The old List is', my_list_old)
my_list_old[1][2] = 9.99
print('The old List is', my_list_old)
print('The new List:', my_list_new)
Output:
The old List is [[1, 2, 's'], [3, 8, 9.78], [7, 8, 'a']]
The old List is [[1, 2, 's'], [3, 8, 9.99], [7, 8, 'a']]
The new List: [[1, 2, 's'], [3, 8, 9.99], [7, 8, 'a']]
Why does it work this way?
This is because both the list objects share the reference to the same data object. They don't have separate copies of the data objects inside the list. But both the objects have different id value because a new object is created, but the data stored is simply referencing to the existing data object.
Conclusion
In this post, we understood how elements of a data structure can be shallow copied. In the upcoming posts, we will understand what happens in case of deep copy.