This post continues the dataclass decorator series. It is strongly recommended that you read Part 1 and Part 2 of the series before going ahead with this post.
In this post, we will see how the properties of individual attributes in a dataclass can be changed with the help of the field function. This, in turn, changes how instances of the class behave. In short, the field function gives the user more control while defining a dataclass.
This function is used to control the attributes declared in a dataclass: providing a default value, including or excluding a particular attribute in the generated __repr__ method, including or excluding an attribute in the generated __init__ method, and so on.
Syntax of the field function
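Following is the syntax of the dataclass field function, shown with the default value of each parameter:
dataclasses.field(*, default=MISSING, default_factory=MISSING, init=True, repr=True, hash=None, compare=True, metadata=None)
Newer Python versions add further keyword-only parameters (for example, kw_only in Python 3.10), which are not covered in this post.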
The field function has the following parameters:
Parameters of Field Function
1. default Parameter
If no value is passed for an attribute when an instance of the dataclass is created, the default parameter of the field function supplies the value to use instead. In the example below, the class defines two attributes but the instance is created with only one value. The second attribute gets its value from the default parameter of the field function.
Time for an example:
from dataclasses import dataclass, field

@dataclass
class data_class:
    value: int
    # used whenever no title is passed to the constructor
    title: str = field(default='Python3')

class_instance_1 = data_class(10)
print(class_instance_1)
Output:
data_class(value=10, title='Python3')
2. default_factory Parameter
Using this parameter, we can provide a callable (a function, a class, etc. that returns a value) that acts as a factory for the default value of the attribute. If a value is provided to this parameter, it must be a zero-argument callable. It is called each time a default value is needed for the attribute, which makes it the right choice for mutable defaults such as lists and dictionaries.
A value can be provided to either default or default_factory, but not to both. Providing both raises a ValueError.
Time for an example:
from dataclasses import dataclass, field
from random import choice

# factory function: called each time a default value is needed
def get_language_version():
    languages = ['Python3', 'Python2']
    return choice(languages)

@dataclass
class data_class:
    value: int
    title: str
    language_version: str = field(default_factory=get_language_version)

class_instance_1 = data_class(10, 'Studytonight')
print(class_instance_1)
Output (language_version is picked at random, so it may also be 'Python2'):
data_class(value=10, title='Studytonight', language_version='Python3')
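The most common use of default_factory is giving every instance its own mutable default. The sketch below is illustrative only: the Playlist class and its attributes are hypothetical names, not part of the original example. It also shows that field rejects default and default_factory when they are passed together.

```python
from dataclasses import dataclass, field


@dataclass
class Playlist:  # hypothetical class, for illustration only
    name: str
    # list() is called once per instance, so no list is shared.
    # Writing `songs: list = []` instead would raise ValueError
    # when the class is defined.
    songs: list = field(default_factory=list)


first = Playlist("Morning")
second = Playlist("Evening")
first.songs.append("Track 1")

print(first.songs)   # ['Track 1']
print(second.songs)  # []

# Passing both default and default_factory is rejected by field() itself.
both_error = None
try:
    field(default=0, default_factory=int)
except ValueError as error:
    both_error = str(error)
    print("ValueError:", both_error)
```

Because the factory runs at instantiation time, appending to one playlist leaves the other untouched.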
Note: The default value of both the default and default_factory parameters is a special sentinel object named MISSING. It signals that the parameter has not been provided to the field function. Application code should not use MISSING as an ordinary value.
Your question might be, why not use None instead of MISSING?
The answer is that None can't be used because it is itself a valid default value for an attribute, so it could not be told apart from "no default provided".
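To see the sentinel in action, the following sketch inspects the Field objects of a small dataclass. The Article class is a hypothetical example, and the code only compares against MISSING for inspection; it never stores or passes the sentinel as a value.

```python
from dataclasses import MISSING, dataclass, field, fields
from typing import Optional


@dataclass
class Article:  # hypothetical class, for illustration only
    title: str                                 # no default at all
    subtitle: Optional[str] = None             # None is a real default value
    tags: list = field(default_factory=list)   # default comes from a factory


# Map each field name to (has a default, has a default_factory).
status = {
    f.name: (f.default is not MISSING, f.default_factory is not MISSING)
    for f in fields(Article)
}

for name, (has_default, has_factory) in status.items():
    print(name, has_default, has_factory)
# title False False
# subtitle True False
# tags False True
```

The subtitle field shows why None cannot serve as the sentinel: it reports a default even though that default is None.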
3. init Parameter
Its default value is True.
If it is True, the attribute is included as a parameter of the __init__ method that is generated for the dataclass.
If it is False, the attribute is left out of __init__ and cannot be passed when creating an instance. Its value then has to come from somewhere else: a default or default_factory given to the field function, or an assignment in the __post_init__ method.
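One way to give a value to an attribute declared with init=False is to compute it in __post_init__. This is a minimal sketch; the Rectangle class and its attributes are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:  # hypothetical class, for illustration only
    width: float
    height: float
    # Not a parameter of __init__; filled in by __post_init__ below.
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height


rect = Rectangle(2.0, 3.0)
print(rect)  # Rectangle(width=2.0, height=3.0, area=6.0)

# area is not accepted by the generated __init__.
init_rejected = False
try:
    Rectangle(2.0, 3.0, 6.0)
except TypeError:
    init_rejected = True
print(init_rejected)  # True
```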
4. repr Parameter
Its default value is True.
If it is True, the attribute is included in the string returned by the generated __repr__ method of the dataclass.
If it is False, the attribute is left out of that string. The attribute still exists on the object and can be read directly.
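A typical reason to set repr to False is to keep a sensitive or noisy value out of logs. The Account class below is a hypothetical example, sketched only to show the effect.

```python
from dataclasses import dataclass, field


@dataclass
class Account:  # hypothetical class, for illustration only
    username: str
    # Hidden from the generated __repr__, but still a normal attribute.
    password: str = field(repr=False)


account = Account("alice", "s3cret")
print(account)           # Account(username='alice')
print(account.password)  # s3cret
```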
5. compare Parameter
Its default value is True.
If it is True for an attribute, that attribute is included in the generated comparison methods: __eq__, and the ordering methods (__lt__, __le__, __gt__, __ge__) when the dataclass is created with order=True. If it is False, the attribute is ignored when dataclass objects are compared, so two objects that differ only in that attribute are still considered equal.
Time for an example:
from dataclasses import dataclass, field

@dataclass
class data_class:
    title: str = field(compare=False)   # ignored by == and !=
    name: str = field(repr=False)       # hidden from the printed object
    language: str = field(default='Python3')
    value: int = field(init=False, default=12)  # not an __init__ parameter

class_instance_1 = data_class('Dataclass', 'Studytonight')
class_instance_2 = data_class('Dataclass', 'Studytonight')
print(class_instance_1)
print(class_instance_2)
print(class_instance_1.value)
print(class_instance_2.language)
print(class_instance_1 != class_instance_2)
Output:
data_class(title='Dataclass', language='Python3', value=12)
data_class(title='Dataclass', language='Python3', value=12)
12
Python3
False
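The effect of compare=False is easiest to see when two objects differ only in the excluded attribute. The sketch below uses a hypothetical Version class and assumes order=True on the decorator so that the ordering methods are generated as well.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Version:  # hypothetical class, for illustration only
    major: int
    minor: int
    # Ignored by ==, <, <=, > and >=.
    label: str = field(default="", compare=False)


stable = Version(1, 2, "stable")
nightly = Version(1, 2, "nightly")
newer = Version(1, 3)

print(stable == nightly)  # True: the labels differ but are not compared
print(stable < newer)     # True: (1, 2) sorts before (1, 3)
```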
6. hash Parameter
It can be a bool or None. The default is None, and setting it to anything else is discouraged.
If it is True, the attribute is included in the __hash__ method when one is generated for the dataclass; if it is False, the attribute is left out. The hash is what allows objects to be used as dictionary keys or set members, so an attribute that takes part in comparisons should normally take part in the hash as well.
If it is None, the value of the compare parameter is used.
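The sketch below shows hash left at None so that it follows compare. It assumes a frozen dataclass, which is one of the cases in which a __hash__ method is generated; the Point class is a hypothetical example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Point:  # hypothetical class, for illustration only
    x: int
    y: int
    # hash is left as None, so it follows compare=False:
    # note is excluded from both __eq__ and __hash__.
    note: str = field(default="", compare=False)


p1 = Point(1, 2, "first")
p2 = Point(1, 2, "second")

print(p1 == p2)              # True
print(hash(p1) == hash(p2))  # True
print(len({p1, p2}))         # 1: the set treats them as the same element
```

Keeping hash and compare in agreement preserves the rule that objects which compare equal also hash equal.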
7. metadata Parameter
It is a mapping (usually a dictionary of key-value pairs) or None that carries extra information about the attribute/field. The dataclass machinery itself does not use this parameter at all; it exists as an extension mechanism so that your own code or third-party libraries can read the information. The mapping is exposed on the Field object as a read-only view.
Time for an example:
from dataclasses import dataclass, field

@dataclass
class data_class:
    title: str = field(compare=False)
    name: str = field(metadata={'Website': 'Studytonight'})
    language: str = field(default='Python3')
    value: int = field(init=False, default=12)

class_instance_1 = data_class('Dataclass', 'Studytonight')
print(class_instance_1)
# metadata is stored on the Field object as a read-only mappingproxy
print(class_instance_1.__dataclass_fields__['name'].metadata)
Output:
data_class(title='Dataclass', name='Studytonight', language='Python3', value=12)
mappingproxy({'Website': 'Studytonight'})
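Metadata can also be read through the public dataclasses.fields function instead of the __dataclass_fields__ attribute. The following sketch uses a hypothetical Measurement class and a made-up "unit" key to show one way your own code might consume metadata, and that the stored mapping cannot be modified.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Measurement:  # hypothetical class, for illustration only
    # "unit" is an arbitrary key chosen for this example.
    distance: float = field(metadata={"unit": "km"})
    duration: float = field(metadata={"unit": "h"})


# Collect the unit attached to each field.
units = {f.name: f.metadata["unit"] for f in fields(Measurement)}
print(units)  # {'distance': 'km', 'duration': 'h'}

# The metadata mapping is read-only.
is_read_only = False
try:
    fields(Measurement)[0].metadata["unit"] = "m"
except TypeError:
    is_read_only = True
print(is_read_only)  # True
```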
Conclusion
The field function in Python's dataclasses module is a powerful tool for customising the behaviour of individual attributes. With it, developers can specify default values and default factories, control whether an attribute takes part in __init__, __repr__, comparison and hashing, and attach metadata. This keeps class definitions concise while leaving the generated methods configurable. The field function does not validate values or enforce type hints by itself; that logic belongs in __post_init__ or in a third-party library. Fields declared with it are inherited by subclasses like any other dataclass field. By mastering the field function, developers can create more robust data classes in their Python applications.
Frequently Asked Questions (FAQs)
1. What is the "field" function in Python data classes?
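The "field" function is a regular function (not a decorator) in Python's dataclasses module. It lets the user configure a single class attribute: its default value or default factory, whether it appears in the generated __init__, __repr__, comparison and hash methods, and any metadata attached to it.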
2. How do you use the "field" function in Python data classes?
To use the "field" function, you assign its result to an annotated class attribute, for example title: str = field(default='Python3'), and pass in any optional keyword parameters you want to specify.
3. What are some optional parameters you can specify with the "field" function?
The optional parameters covered in this post are default, default_factory, init, repr, hash, compare, and metadata. The type hint and the field name are not parameters of "field"; they come from the attribute's annotation and name in the class body.
4. What are some use cases for the "field" function in Python data classes?
The "field" function can be used to define default values for fields, give each instance its own mutable default through a factory, keep attributes out of the constructor, the printed representation or comparisons, and attach metadata that your own code or third-party libraries can read. It does not enforce data types or validate values by itself.