The tree data structure is a widely used A.D.T (Abstract Data Type) in the field of computer science and technology to store various information in a hierarchical manner, unlike linked lists, stacks, and queues which store information linearly. The tree data structure can be thought of as an upside-down version of a natural tree, where the branches are at the downward side of the root. A tree data structure popularly known as the tree diagram consists of nodes and a head node (popularly known as the root node), a common tree data structure consists of a root node along with the parent node and its child node with root and the parent node connected to a minimum of one child node, above diagram represents a tree data structure with root node having three children, a total number of 9 nodes and 8 edges.
Conditions Of A Data Structure To Be A Tree:
A data structure is said to be a tree data structure if it obeys the following conditions:-
- It should not contain any cycle (i.e., it should not have any self-loop).
- The direction of the edge or the link should be from the parent node to the child node.
- The total number of edges of a tree must be (
n - 1
) where ‘n’ is the total number of nodes in a tree.
Terminologies Related To Tree Data Structure:
- Root Node: The topmost node of a tree data structure is known as the root node it is also known as the head node of a tree data structure, it is the only node that does not have any parent node or any ancestors, in the above diagram node with data 1 is the root node, if a root node has only one child then it is known as a skew tree or the degenerate tree.
- Edge: It is defined as the link between the parent node and the child node. It is always directed from the parent node to the child node, and the total number of edges in a tree with ‘n’ nodes is equal to (
n - 1
). In the above diagram of a tree, there are total 9 nodes with a total number of 8 edges.
- Leaf Node: The terminal node or the outer node of a tree data structure is known as the leaf node, or it can also be defined as the node which does not have any child node is known as the leaf node. In the above diagram, nodes with data 4, 10, 11, and 15 are known as the leaf nodes.
- Parent Node: A parent node is defined as the node with at least one child (in the case of a tree) or at most two children (in the case of a binary tree), but will only have one parent of its own, which will be its predecessor node. Every node of a tree data structure is a parent node except the leaf nodes. In the above diagram nodes with data 1, 2, 3, 5, and 7 are the parent node of nodes with data (2, 3), (4, 5), (7), (10, 11), (15) respectively, with node with data 4 having no child node.
- Child node: The child node is defined as the node which has a predecessor node of its own, or it is just the successor of any node in the tree data structure, there may be more than one child of a parent node in a tree data structure. In a tree data structure, except for the root node, all other nodes are child nodes. In the above diagram, the nodes with data 2, 3, 4, 5, 7, 10, 11 and 15 are the child node of the parent node with data 1, 2, 3, 4, 5, and 7 respectively.
- Sibling Node: The child node which has a common parent node or when a parent node has more than one child node, then the child node to each other is known as the sibling node. For the nodes to be siblings of each other there must be at least two child nodes There may be more than one sibling in a tree data structure but a maximum of two siblings in the case of a binary tree. In the above diagram, the nodes with data (2, 3), (4, 5), and (10, 11) are siblings of each other.
- Internal Node: The internal node can be defined as the total number of non-leaf nodes existing in the given tree data structure. The internal node comprises the parent node along with the root node. They are also called the non-terminal nodes. In the given diagram the nodes with data 1, 2, 3, 5, and 7 are the internal nodes or the non-terminal nodes.
- Degree: The degree of a tree data structure is defined as the maximum number of child nodes a parent node has including the root node as a parent node and the degree of a node is defined as the total number of its child node. In the above diagram, the node with data 1 has a degree of 2, the node with data 2 also has a degree of 2, the node with data 3 has a degree of 1, the node with data 5 has a degree of 2, and node with data 7 has a degree of 1 and node with data 4, 10 and 15 have a degree of 0.
- Level: The level of a node in the tree data structure is defined as the position of the node from the root node where it depends on the reader and the writer to take the level of the root node as 0 or 1, either of them is applicable. The level of a node is also known as the depth of the node. Here in the above-given diagram if the level of a node with data 1 is taken as 0 then the level of other nodes with data (2, 3) is 1, (4, 5, 7) is 2, and (10, 11, 15) is 3 and if the level of a node with data 1 is taken to be 1 then the respective levels of other nodes are 2, 3 and 4.
- Height: The height of a node is defined as the position of the node from the leaf node, and the height of a tree data structure is defined as the longest distance of the root node from the deepest leaf node present in the tree data structure. In the above-given diagram, the height of the node with data 1 is 4 and the height of other nodes with data (2, 3) is 3, (4, 5, 7) is 2, and (10, 11, 15) is 1.
- Path: Path is defined as the sequence of nodes and edges in a tree data structure.
- Forest: Forest is defined as the set of n>=0 disjoint trees.
Types Of Tree Data Structure:
The various types of tree data structures are :
- Null Tree: The tree data structure with 0 or no nodes is defined as the null tree.
- Skew Tree: The tree data structure which has only one child, either on the left subtree or on the right subtree. Skew trees are divided into two sub-parts namely the left skew tree and the right skew tree in the above diagram, the left one represents a left skew tree where all the child nodes of the parent node are towards the left except for the root node and the right one represents the right skew tree where all the child node of the tree data structure is towards the right except for the root node.
Representation Of Tree Data Structure:
The tree data structure can be represented in the following two ways :
- Linked List Representation: Linked List Representation of a tree data structure is a representation of a complete tree in the form of linked lists consisting of the first field to be the data field where the data of the node is stored and the rest of the fields are allocated for the link to its child nodes. The total number of fields allocated for the link to its child node is equal to the degree of the tree data structure. In the above diagram, the degree of the tree data structure is 3 as the node with data 1 has the highest number of child nodes connected to it, and hence the total number of link fields for every node is equal to 3.
Disadvantages:
- There is a waste of lots of space memory for example in the above-given diagram the total degree of the node with data 4 is 1 (i.e., a node with data 6) so, there is a wastage of two link fields in the representation.
- Insertion/Deletion of nodes takes a lot of time and space.
- Not suited for a full or complete tree.
Advantages:
- It gives a very clear picture to the programmer about how the nodes are connected to each other.
- Programming is easy.
- Very easy to move from the parent node to the child node and vice versa.
2. Alternate Approach To Linked List Representation:
It is also one of the approaches for the representation of tree data in a structure where there are three fields namely the tag field, data field, or the down-link field, and the link field to the sibling node. If there is 0 in the tag field then the data field is occupied by the down-link i.e., It points towards the node itself and the link field points towards the child node if any, or else if there is 1 in the tag field, then the down-link field is occupied by the data field and the link field points towards its sibling node, if any.
One advantage of this representation is that there is no wastage of space memory. In the above diagram, the tag of nodes with data 1, 2, 4, 6, and 7 will be 0 as they have a child node and their data field is occupied by the down-link field to point to there the node itself and their link field points towards its child node if any and the rest nodes with data 3, 8, 9, and 10 will have the tag of 1 as they do not have any child node and the link of all nodes will point to their sibling node if it exists.