Python Dataclass decorator - Part 1

If you are already using Python 3.7 or later, you may be aware of its new features, one of them being the dataclass. If you haven't upgraded yet, here is some news for you.

Python 3.7 introduced a new feature, the dataclass.

On Python 3.6, dataclasses are also available as a backport, which can be installed with the following command:

pip install dataclasses

So what is a dataclass?

Dataclasses are Python classes whose main job is to store data. The data can be of any type, including (but not limited to) numbers, strings, or instances of other classes.

They come with basic functionality already implemented, such as instantiation, a printable representation, and comparison of instances. A dataclass is created by applying the @dataclass decorator to a normal class.

Why a dataclass now?

Readability is one of Python's guiding principles: in Python, readability counts. Dataclasses were created for the same reason, since they remove the repetitive code that a data-holding class otherwise needs. You will see what we mean by readability in a few minutes.

How can I differentiate between a regular class and a dataclass?

This is quite simple. Python comes with a dataclass decorator (@dataclass), and a class with this decorator applied is a dataclass. This is usually done in the following way:

from dataclasses import dataclass

@dataclass
class class_name:
    # class definition goes here
    ...

Now let us compare a normal class and a dataclass to see what a dataclass has to offer to us.

Normal Python class

A normal class is implemented by using the class keyword followed by the name of the class.

class Website:
    def __init__(self, val):
        self.val = val

# creating class object
class_instance = Website(12)
print(class_instance.val)

Output:

12

Python Dataclass

The dataclass is indicated with the @dataclass decorator.

from dataclasses import dataclass

@dataclass
class Website:
    val: float

class_instance = Website(12.21)
print(class_instance)

Output:

Website(val=12.21)

Comparing the above two classes, the following can be inferred:

The __init__ method doesn't have to be written in the dataclass; it is generated automatically.
The variables inside the dataclass are declared with their type at class level, as opposed to being assigned through self (which represents the object of the class) inside __init__ in a normal class. Indicating the type of a value this way is known as type hinting.
Printing the dataclass instance shows the class name Website together with the field and its value, because a readable representation is generated as well.
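
One caveat about type hinting is worth a short sketch: the annotation is what turns val into a field, but Python does not check it at runtime. The example below reuses the Website class from above; the string argument is only there to illustrate the point.

```python
from dataclasses import dataclass, fields

@dataclass
class Website:
    val: float

# The annotation is recorded for each field of the dataclass...
field_types = {f.name: f.type for f in fields(Website)}
print(field_types)

# ...but it is not enforced: a string is accepted without any error.
instance = Website("not a float")
print(instance.val)
```

If the types need to be checked, that has to be done separately, for example with a static type checker.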

In addition to this, default values can be specified for the fields of a dataclass.
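
As a minimal sketch of default values, the class below adds a hypothetical name field (not part of the original example) next to val. Fields without a default must be listed before fields that have one.

```python
from dataclasses import dataclass

@dataclass
class Website:
    name: str          # no default: must be passed when instantiating
    val: float = 0.0   # default: used when the argument is omitted

with_default = Website("example")
with_value = Website("example", 12.21)

print(with_default.val)
print(with_value.val)
```

Here with_default.val falls back to 0.0, while with_value.val holds the value that was passed in.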

Under the hood, the dataclass decorator generates a __repr__() method that presents the object of the class in a readable string format. It also generates an __eq__() method, which comes into play when we compare two objects of the dataclass. We will cover both in detail below.

Well, this is simple. Is this the only reason I should use a dataclass?

No, this isn't the only reason. In addition to readability, the dataclasses (as mentioned previously) have pre-implemented methods. This means such methods don't need to be explicitly defined in a dataclass.

The dataclass decorator can be written in different ways. Below is a demonstration:

import dataclasses

# @dataclasses.dataclass() (with empty parentheses) is equivalent
@dataclasses.dataclass
class Website:
    val: int = 0

The init, repr and eq parameters of the decorator default to True, so the corresponding methods are generated automatically when a dataclass is created. In other words, the decorator is interpreted as follows:

@dataclasses.dataclass(init=True, repr=True, eq=True)
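
To check that the three spellings behave the same, here is a small sketch. The class names Plain, WithParens and Explicit are made up for this illustration.

```python
import dataclasses

@dataclasses.dataclass
class Plain:
    val: int = 0

@dataclasses.dataclass()
class WithParens:
    val: int = 0

@dataclasses.dataclass(init=True, repr=True, eq=True)
class Explicit:
    val: int = 0

# Each class gets a generated __init__, __repr__ and __eq__
results = []
for cls in (Plain, WithParens, Explicit):
    first, second = cls(5), cls(5)
    results.append(first == second)
    print(first, first == second)
```

All three classes accept the value through the generated __init__, print a readable representation, and report the two instances as equal.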

Let's cover these special methods one by one.

Representation:

When we create a normal class, we generally add only the __init__ method to it for initializing the object of the class:

class Website:
    def __init__(self, val):
        self.val = val

class_instance = Website(12)
print(class_instance)
print(class_instance.val)

Output:

<__main__.Website object at 0x000002C0B4FE4E80>
12

What do you understand from the above code?

You can see that the value stored in the Website instance is 12. But what about the first line, i.e. <__main__.Website object at 0x000002C0B4FE4E80>?

Well, that is how Python displays an object of a class by default: the class name followed by the object's memory address.

This makes debugging tough, since a normal class doesn't specify how its objects should be represented. A neat representation of the data in a normal class has to be implemented with the help of the __repr__ method. See the code below to understand the implementation of the __repr__ method in a normal class.

class Website:

    def __init__(self, val):
        self.val = val

    # special method __repr__ (it must return a string)
    def __repr__(self):
        return f"Website(val={self.val!r})"

class_instance = Website(12)
print(class_instance)

Output:

Website(val=12)

This means the __repr__ method must be explicitly defined in normal classes. In a dataclass, on the other hand, it is generated for us.

Consider the following code of a dataclass:

The code below shows that the __repr__ method doesn't have to be explicitly defined.

from dataclasses import dataclass

@dataclass
class data_class:
    value: int

class_instance = data_class(12)
print(class_instance)
print(type(class_instance))

Output:

data_class(value=12)
<class '__main__.data_class'>

The representation functionality above, as well as the other generated methods, is included by default in a dataclass because the corresponding keyword arguments of the decorator default to True.

If we want to exclude the default __repr__ method from our dataclass, we can do so by using the following code:

from dataclasses import dataclass

@dataclass(repr=False)
class data_class:
    value: int

class_instance = data_class(12)
print(class_instance)

Output:

<__main__.data_class object at 0x000002CF04FE4E80>

The exact address will differ on every run.

Comparing Objects:

In a dataclass, the __eq__ method is generated automatically; it is used for checking whether two objects of the class are equal.

We will compare the implementation of == (checking for the equality of two objects) in a normal class and a dataclass.

from dataclasses import dataclass

@dataclass
class data_class:
    value: int

class_instance = data_class(12)
print(class_instance)
print(type(class_instance))


class normal_class:
    def __init__(self, val):
        self.val = val

# Two objects instantiated for the dataclass
instance_one = data_class(12)
instance_two = data_class(12)

# Two objects instantiated for the normal class
instance_three = normal_class(12)
instance_four = normal_class(12)

print("DataClass Equal:", instance_one == instance_two)
print("Normal Class Equal:", instance_three == instance_four)

Output:

data_class(value=12)
<class '__main__.data_class'>
DataClass Equal: True
Normal Class Equal: False

The last two lines of this output might seem confusing. Here is the explanation for this behaviour of dataclass and normal class.

For a normal class that doesn't define __eq__, the equality operator falls back to the default behaviour inherited from object, which checks whether both operands are the same object in memory. Two different instances of the same class are always distinct objects, even when they hold the same data. Hence, the result is False in the case of a normal class.

On the other hand, when the == is used to compare objects of a dataclass, it checks to see if the contents of both the instances of the same class are the same or not. Since both instances contain the same data, it returns True.
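
To see what the dataclass saves us from writing, here is a minimal sketch of a normal class with a hand-written __eq__ method. The class name NormalClass and the comparison rule are assumptions made for this illustration.

```python
class NormalClass:
    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        # Only compare against other NormalClass instances
        if not isinstance(other, NormalClass):
            return NotImplemented
        return self.val == other.val

first = NormalClass(12)
second = NormalClass(12)

print(first == second)   # True: the contents match
print(first is second)   # False: still two distinct objects
```

With __eq__ defined by hand, == compares the contents, while the is operator still tells us that the two instances are different objects.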

When a dataclass generates an __eq__ method, it compares two instances of the same class. This is done by comparing the fields of one instance, taken in order as a tuple, with the fields of the other instance. If the two objects are not instances of the same class, the generated method does not treat them as equal.
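
The tuple comparison can be sketched roughly as follows. Point and OtherPoint are hypothetical classes used only for this illustration, and the explicit tuple comparison approximates what the generated method does rather than reproducing its actual source.

```python
from dataclasses import dataclass, astuple

@dataclass
class Point:
    x: int
    y: int

@dataclass
class OtherPoint:
    x: int
    y: int

p, q = Point(1, 2), Point(1, 2)

# Roughly what the generated __eq__ does for two Point instances
manual_result = (p.x, p.y) == (q.x, q.y)
print(manual_result)              # True
print(p == q)                     # True
print(astuple(p))                 # (1, 2)

# Same field values, but a different class: not equal
cross_class_result = p == OtherPoint(1, 2)
print(cross_class_result)         # False
```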

If our class has complex logic and we want to implement our own rule for equating its objects, we can define our own __eq__ method in the dataclass and specify eq=False so that the default __eq__ method is not generated.
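
A minimal sketch of such a custom comparison is shown below. The name field and the case-insensitive rule are assumptions made for this illustration, not something the dataclasses module prescribes.

```python
from dataclasses import dataclass

@dataclass(eq=False)
class Website:
    name: str
    val: float

    def __eq__(self, other):
        # Hypothetical rule: two sites are equal if their names match,
        # ignoring case; val is deliberately left out of the comparison.
        if not isinstance(other, Website):
            return NotImplemented
        return self.name.lower() == other.name.lower()

same_name = Website("Example", 1.0) == Website("example", 2.0)
different_name = Website("Example", 1.0) == Website("other", 1.0)

print(same_name)        # True
print(different_name)   # False
```

The generated __init__ and __repr__ methods are still available; only the equality logic is our own.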

Note: The ordering methods (which implement <, >, <=, and >=) can be generated by setting the keyword order to True, i.e. passing order=True to the dataclass decorator. Like the equality check, they compare the fields in order as tuples.
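
Here is a short sketch of order=True in action. The Version class and its fields are hypothetical and only serve to show the field-by-field ordering.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Version:
    major: int
    minor: int

older = Version(1, 2)
newer = Version(1, 10)

# Fields are compared in order: major first, then minor
print(older < newer)                    # True
print(Version(2, 0) > newer)            # True

# The generated ordering methods also make instances sortable
ordered = sorted([Version(2, 0), newer, older])
print([(v.major, v.minor) for v in ordered])   # [(1, 2), (1, 10), (2, 0)]
```

Note that order=True relies on the equality comparison, so it cannot be combined with eq=False.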

Conclusion

In today's post, we understood what a dataclass is, its significance, its usage and its advantages over normal classes. In the upcoming posts, we will dive deeper into dataclasses and understand more about them.
