Thursday, August 7, 2008

Binary Trees

Brief Intro

Trees are non-linear data structures. Trees are encountered frequently in everyday life. In a linked list each node has a link which points to another node. In a tree structure, however, each node may point to several other nodes (which may then point to several other nodes, etc.). Thus a tree is a very flexible and powerful data structure that can be used for a wide variety of applications. For example, suppose we wish to use a data structure to represent a person and all of his or her descendants.

Assume that the person's name is Rahul and that he has 3 children, Sanjay , Sameer and Nisha . Also suppose that Sameer has 3 children, Abhay , Ajit & Madhu and Nisha has one child Neha . We can represent Rahul and his descendants quite naturally with the tree structure shown in Figure 1.


Figure 1.

Notice that each tree node contains a name for data and one or more pointers to the other tree nodes. Although the nodes in a general tree may contain any number of pointers to the other tree nodes, a large number of data-structures have at the most two pointers to the other tree nodes. This type of a tree is called a binary tree.



Binary Trees


Let us begin our study of binary trees by discussing some basic concepts and terminology. A simple binary tree is shown in Figure 2.


Figure 2.

A binary tree is a finite set of elements that is either empty or is partitioned into three disjoint subsets. The first subset contains a single element called the root of the tree. The other two subsets are themselves binary trees, called the left and right subtrees of the original tree. A left or right subtree can be empty. Each element of a binary tree is called a node of the tree and the tree consists of nine nodes with A as its root. Its left subtree is rooted at B and its right subtree is rooted at C . This is indicated by the two branches emanating from A to B on the left and to C on the right. The absence of a branch indicates an empty subtree. For example, the left subtree of the binary tree rooted at C and the right subtree of the binary tree rooted at E are both empty. The binary trees rooted at D , G , H and I have empty right and left subtrees. Following figure illustrates some structures that are not binary trees. Be sure that you understand why each of them is not a binary tree as just defined.


Figure 3.

Let us now get used to some more terminology used in association with binary trees. If A is the root of a binary tree and B is the root of its left or right subtree, then A is said to be the father of B and B is said to be the left or right son of A . A node that has no sons (such as D , G , H , or I of Figure 2.) is called a leaf . In general any node say n1, is an ancestor of node n2 (and n2 is a descendant of n1) if n1 is either the father of n2 or the father of some ancestor of n2 . For example, in the tree of Figure 3, A is an ancestor of C . A node n2 is a left descendant of node n1 if n2 is either the left son of n1 or a descendant of the left son of n1 . A right descendant may be similary defined. Two nodes are brothers if they are left and right sons of the same father. Although natural trees grow with their roots in the ground and their leaves in the air, computer scientists almost universally portray tree data structures with the root at the top and the leaves at the bottom. The direction from the root to the leaves is "down" and the opposite direction is "up". Going from the leaves to the root is called climbing the tree, and going from the root to the leaves is called descending the tree. If every nonleaf node in a binary tree has nonempty left and right subtrees, the tree is termed a strictly binary tree . Thus the tree of Figure 4 is a strictly binary, whereas that of Figure 2 is not a strict binary tree (because nodes C and E have one son each).


Figure 4

The level of a node in a binary tree is defined as follows. The root of the tree has level 0, and the level of any other node in the tree is one more than the level of its father. For example, in the binary tree of Figure 2, node E is at level 2 and node H is at level 3. The depth of a binary tree is the maximum level of any leaf in the tree. This equals the length of the longest path from the root to any leaf. Thus the depth of the tree of Figure 2 is 3. A complete binary tree of depth d is a strictly binary tree all of whose leaves are at level d . Figure 5 illustrates the complete binary tree of depth 3.


Figure 5.

Go Top


Traversal Of A Binary Tree

The traversal of a binary tree is to visit each node in the tree exactly once. Binary tree traversal is useful in many applications. The order in which nodes of a linear list are visited is clearly from first to last. However, there is no such natural linear order for the nodes of a tree. The methods differ primarily in the order in which they visit the nodes. There are three popular methods of binary tree traversal. These methods are known as inorder traversal, preorder traversal and postorder traversal. In each of these methods nothing need be done to traverse an empty binary tree. The functions used to traverse a tree using these methods can be kept quite short if we understand the recursive nature of the binary tree. Recall that a binary tree is recursive in that each subtree is really a binary tree itself. Thus traversing a binary tree involves visiting the root node and traversing its left and right subtrees. The only difference among the methods is the order in which these three operations are performed.

To traverse a nonempty binary tree in preorder , we perform the following three operations:

  1. Visit the root.

  2. Traverse the left subtree in preorder.

  3. Traverse the right subtree in preorder.

To traverse a nonempty binary tree in inorder (or symmetric order):

  1. Traverse the left subtree in inorder.

  2. Visit the root.

  3. Traverse the right subtree in inorder.

To traverse a nonempty binary tree in postorder :

  1. Traverse the left subtree in postorder.

  2. Traverse the right subtree in postorder.

  3. Visit the root.

Many algorithms that use binary trees proceed in two phases. The first phase builds a binary tree, and the second traverses the tree. As an example of such an algorithm, consider the following sorting method. Given a list of numbers in an input file, we wish to print them in ascending order. As we read the numbers, they can be inserted into a binary tree such as the one of Figure 6. When a number is compared with the contents of a node in the tree, a left branch is taken if the number is smaller than the contents of the node and a right branch if it is greater or equal to the contents of the node. Thus if the input list is

20 17 6 8 10 20 7 18 13 12 5 6

The binary tree of Figure 6 is produced.



Figure 6

Such a binary tree has the property that all elements in the left subtree of a node n are less than the contents of n, and all elements in the right subtree of n are greater than or equal to the contents of n. A binary tree that has this property is called a Binary Search tree. If a binary search tree is traversed in inorder (left, root, right) and the contents of each node are printed as the node is visited, the numbers are printed in ascending order. Convince yourself that this is the case for the binary search tree of Figure 6. The program to implement this algorithm is given below

/* Program to implement a binary tree */

#include"alloc.h"

struct btreenode

{

struct btreenode *leftchild ;

int data ;

struct btreenode *rightchild ;

} ;

main( )

{

struct btreenode *bt ;

int req, i = 1, num ;

bt = NULL ; /* empty tree */

clrscr( ) ;

printf ( "Specify the number of data items to be inserted: " ) ;

scanf ( "%d", &req ) ;

while ( i++ <= req )

{

printf ( "Enter the data :" ) ;

scanf ( "%d", &num ) ;

insert (&bt, num ) ;

}

clrscr( ) ;

printf ( "\nInorder Traversal:" ) ;

inorder ( bt ) ;

printf ( "\nPreorder Traversal:" ) ;

preorder ( bt ) ;

printf ( "\nPostorder Traversal: " ) ;

postorder ( bt ) ;

}

/* inserts a new node in a binary search tree */

insert ( struct btreenode **sr, int num )

{

if ( *sr == NULL )

{

*sr = malloc ( sizeof ( struct btreenode ) ) ;

( *sr ) -> leftchild = NULL ;

( *sr ) -> data =num ;

( *sr ) -> rightchild = NULL ;

return ;

}

else /* search the node to which new node will be attached */

{

/* if new data is less, traverse to left */

if ( num < ( *sr ) -> data )

insert ( &( ( *sr ) -> leftchild ), num ) ;

else

/* else traverse to right */

insert (&( ( *sr ) -> rightchild ), num ) ;

}

return ;

}

/*traverse a binary search tree in a LDR (Left-Data-Right) fashion */

inorder ( struct btreenode *sr )

{

if ( sr != NULL )

{

inorder ( sr -> leftchild ) ; /* print the data of the node whose leftchild is NULL or the path

has already been traversed */

printf ( "%d ", sr-> data ) ;

inorder ( sr -> rightchild ) ;

}

else

return ;

}

/* traverse a binary search tree in a DLR (Data-Left-right) fashion */

preorder ( struct btreenode *sr )

{

if ( sr != NULL )

{

/* print the data of a node */

printf ( "%d ", sr -> data ) ; /* traverse till leftchild is not NULL */

preorder ( sr -> leftchild ) ;

/* traverse till rightchild is not NULL */

preorder ( sr -> rightchild ) ;

}

else

return ;

}

/* traverse a binary search tree in LRD (Left-Right-Data) fashion */

postorder ( struct btreenode *sr )

{

if ( sr != NULL )

{

postorder ( sr -> leftchild ) ;

postorder ( sr -> rightchild ) ;

printf ( "%d ", sr -> data ) ;

}

else

return ;

}

Go Top


Deletion From A Binary Tree

In addition to techniques for inserting data in a binary tree and traversing the tree, practical examples call for deleting data from the binary tree. Assuming that we will pass the specified data item that we wish to delete to the delete( ) function, there are four possible cases that we need to consider:

  1. No node in the tree contains the specified data.

  2. The node containing the data has no children.

  3. The node containing the data has exactly one child.

  4. The node containing the data has two children.

For case (a) we merely need to print the message that the data item is not present in the tree. In case (b) since the node to be deleted has no children the memory occupied by this should be freed and either the left link or the right link of the parent of this node should be set to NULL . Which of these to set to NULL depends upon whether the node being deleted is a left child or a right child of its parent. In case (c) since the node to be deleted has one child the solution is again rather simple. We have to adjust the pointer of the parent of the node to be deleted such that after deletion it points to the child of the node being deleted. This is illustrated in Figure 7. For case (d) in which the node to be deleted has two children the solution is more complex. Consider node C in Figure 7. Before Deletion. From the figure the inorder successor of the node C is node J. The data of this inoder successor should now be copied into the node to be deleted and a pointer should be setup to the inorder successor (node J). This inorder successor would have one or zero children. This node should then be deleted using the same procedure as for deleting a one child or a zero child node. Thus the whole logic of deleting a node with two children is to locate the inorder successor, copy its data and reducing the problem to a simple deletion of a node with one or zero children. A program to implement this deletion procedure is given below.


Figure 7.

/* Program to to delete a node form a binary search tree */

#include "alloc.h"

#define TRUE 1

#define FALSE 0

struct btreenode

{

struct btreenode *leftchild ;

int data ;

struct btreenode *rightchild ;

} ;

main( )

{

struct btreenode *bt ;

int req, i = 0, num, a[ ]= { 11, 9, 13, 8, 10, 12, 14, 15, 7 } ;

bt = NULL ;/* empty tree */

clrscr( ) ;

while ( i <= 8 )

{

insert ( &bt, a[i] ) ;

i++ ;

}

clrscr( ) ;

printf ( "Binary tree before deletion: " ) ;

inorder ( bt ) ;

delete ( &bt, 10 ) ;

printf ( "\nBinary tree after deletion: " ) ;

inorder ( bt ) ;

delete ( &bt, 14 ) ;

printf ( "\nBinary tree after deletion: " ) ;

inorder ( bt ) ;

delete ( &bt, 8 ) ;

printf ( "\nBinary tree after deletion: " ) ;

inorder ( bt ) ;

delete ( &bt, 13 ) ;

printf ( "\nBinary tree after deletion: " ) ;

inorder ( bt ) ;

}

/* inserts a new node in a binary search tree */

insert ( struct btreenode **sr, int num )

{

if ( *sr == NULL )

{

*sr = malloc ( sizeof ( struct btreenode ) ) ;

( *sr ) -> leftchild = NULL ;

( *sr ) -> data = num ;

( *sr ) -> rightchild = NULL ;

return ;

}

else /* search the node to which new node will be attached */

{

/* if new data is less, traverse to left */

if ( num < ( *sr ) -> data )

insert ( &( ( *sr ) -> leftchild ), num ) ;

else

/* else traverse to right */

insert ( &( ( *sr ) -> rightchild ), num ) ;

}

return ;

}

/* deletes a node from the binary search tree */

delete ( struct btreenode **root, int num )

{

int found ;

struct btreenode *parent, *x, *xsucc ;

/* if tree is empty */

if ( *root == NULL )

{

printf ( "\nTree is empty" ) ;

return ;

}

parent = x = NULL ;

/* call to search function to find the node to be deleted */

search ( root, num, &parent, &x, &found ) ;

/* if the node to deleted is not found */

if ( found == FALSE )

{

printf ( "\nData to be deleted, not found" ) ;

return ;

}

/* if the node to be deleted has two children */

if ( x -> leftchild != NULL && x -> rightchild != NULL)

{

parent = x ;

xsucc = x -> rightchild ;

while ( xsucc -> leftchild != NULL )

{

parent = xsucc ;

xsucc = xsucc -> leftchild ;

}

x -> data = xsucc -> data ;

x = xsucc ;

}

/* if the node to be deleted has no child */

if ( x -> leftchild == NULL && x -> rightchild == NULL )

{

if ( parent -> rightchild == x )

parent -> rightchild = NULL ;

else

parent -> leftchild = NULL ;

free ( x ) ;

return ;

}

/* if the node to be deleted has only rightchild */

if ( x -> leftchild == NULL && x -> rightchild != NULL )

{

if ( parent -> leftchild == x )

parent -> leftchild = x -> rightchild ;

else

parent -> rightchild = x -> rightchild ;

free ( x ) ;

return ;

}

/* if the node to be deleted has only left child */

if ( x -> leftchild != NULL && x -> rightchild == NULL )

{

if ( parent -> leftchild == x )

parent -> leftchild = x -> leftchild ;

else

parent -> rightchild = x -> leftchild ;

free ( x ) ;

return ;

}

}

/* returns the address of the node to be deleted, address of its parent and whether the node is found or not */

search ( struct btreenode **root, int num, struct btreenode **par, struct btreenode **x, int *found )

{

struct btreenode *q ;

q = *root ;

*found = FALSE ;

*par = NULL ;

while ( q != NULL )

{

/* if the node to be deleted is found */

if ( q -> data == num )

{

*found = TRUE ;

*x = q ;

return ;

}

if ( q -> data > num )

{

*par = q ;

q = q -> leftchild ;

}

else

{

*par = q ;

q = q -> rightchild ;

}

}

}

/* traverse a binary search tree in a LDR (Left-Data-Right) fashion */

inorder ( struct btreenode *sr )

{

if ( sr != NULL )

{

inorder ( sr -> leftchild ) ;

/* print the data of the node whose leftchild is NULL or the path has already been traversed */

printf ( "%d ", sr -> data ) ;

inorder ( sr -> rightchild ) ;

}

else

return ;

}

Go Top


Threaded Binary Trees

Both the recursive and non-recursive procedures for binary tree traversal require that pointers to all of the free nodes be kept temporarily on a stack. It is possible to write binary tree traversal procedure that do not require that any pointers to the nodes be put on the stack. Such procedures eliminate the overhead (time and memory) involved in initialising, ushing and popping the stack.

In order to get an idea of how such binary-tree-traversal procedures work, let us look at the tree in Figure 8. First we follow the left pointers until we reach node C , without, however, pushing the pointers to A , B and C onto a stack. For inorder traversal the data for node C is then printed, after which C 's right pointer is followed to node E . Then the data from node E is printed. The next step in our inorder traversal is to go back to node B and print its data; however, we did not save any pointers. But suppose that when we created the tree we had replaced the NULL right pointer in node E with a pointer back to node B . We could then easily follow this pointer back to node B (see Figure 9 ).

Similarly, suppose we replace the normal NULL right pointer of D with a pointer back up to A , as in Figure 10. Then after printing the data in D , we can easily jump up to A and print its data. These pointers which point inorder successor of a node are called right threads . Each right thread replaces a normal right pointer in a tree node. Likewise we can have left threads which point to the inorder predecessor of a node.


The only problem with threads is that the coding requires that we know whether a pointer is a normal pointer to a child or a thread that points back to a inorder successor or inorder predecessor node. One solution to this problem is to add to the data in each tree node two fields which indicate whether the left and right pointers in that node are normal pointers or threads. For example, these fields might be boolean variables, left and right . The variable right is true if the right pointer is a thread and false if it is a normal right pointer. Likewise the variable left is true if the left pointer is a thread and false if it is a normal left pointer. If we add these boolean variables to each tree node, we would make the following structure declaration for a node.

struct thtree

{

enum boolean left ;

struct thtree *leftchild ;

int data ;

struct thtree *rightchild ;

enum boolen right ;

} ;

Thus each node would contain data, a left pointer, a true or false value for left thread, a right pointer and a true or false value for right thread. The program to implement a threaded binary tree is given below. The program also shows how to insert nodes in a threaded binary tree, delete nodes from it and traverse it in inorder traversal.

/* Program to implement threaded binary tree */

#include "alloc.h"

enum boolean

{

false = 0 ,

true = 1 ,

} ;

struct thtree

{

enum boolean left ;

struct thtree *leftchild ;

int data ;

struct thtree *rightchild ;

enum boolen right ;

} ;

main( )

{

struct thtree *th_head ;

th_head = NULL ;/* empty tree */

insert ( &th_head, 11 ) ;

insert ( &th_head, 9 ) ;

insert ( &th_head, 13 ) ;

insert ( &th_head, 8 ) ;

insert ( &th_head, 10 ) ;

insert ( &th_head, 12 ) ;

insert ( &th_head, 14 ) ;

insert ( &th_head, 15 ) ;

insert ( &th_head, 7 ) ;

clrscr( ) ;

printf ( "Threaded binary tree before deletion: " ) ;

inorder ( th_head ) ;

delete ( &th_head, 10 ) ;

printf ( "\nThreaded binary tree after deletion: " ) ;

inorder ( th_head ) ;

delete ( &th_head, 14 ) ;

printf ( "\nThreaded binary tree after deletion: " ) ;

inorder ( th_head ) ;

delete ( &th_head, 8 ) ;

printf ( "\nThreaded binary tree after deletion: " ) ;

inorder ( th_head ) ;

delete ( &th_head, 13 ) ;

printf ( "\nThreaded binary tree after deletion: " ) ;

inorder ( th_head ) ;

}

/* inserts a node in a threaded binary tree */

insert ( struct thtree **s, int num )

{

struct thtree *head = *s , *p, *z ;

/* allocating a new node */

z = malloc ( sizeof ( struct thtree ) ) ;

z -> left = true ; /* indicates a thread */

z -> data = num ; /* assign new data */

z -> right = true ; /* indicates a thread */

/* if tree is empty */

if ( *s == NULL )

{

head = malloc ( sizeof ( struct thtree ) ) ;

/* the entire tree is treated as a left subtree of the head node */

head -> left = false ;

head -> leftchild = z ; /* z becomes leftchild of the head node */

head -> data = -9999 ; /* no data */

head -> rightchild = head ; /* right link will always be pointing to itself */

head -> right = false ;

*s = head ;

z -> leftchild = head ; /* left thread to head */

z -> rightchild = head ; /* right thread to head */

}

else /* if tree is non-empty */

{

p = head -> leftchild ;

/* traverse till the thread is found attached to the head */

while ( p != head )

{

if ( p -> data > num )

{

if ( p -> left != true ) /* checking for a thread */

p = p -> leftchild ;

else

{

z -> leftchild = p -> leftchild ;

p -> leftchild = z ;

p -> left = false ; /* indicates a link */

z -> right = true ;

z -> rightchild = p ;

return ;

}

}

else

{

if ( p -> data <>

{

if ( p -> right != true )

p = p -> rightchild ;

else

{

z -> rightchild = p -> rightchild ;

p -> rightchild = z ;

p -> right = false ; /* indicates a link */

z -> left = true ;

z -> leftchild = p ;

return ;

}

}

}

}

}

}

/* deletes a node from the binary search tree */

delete ( struct thtree **root, int num )

{

int found ;

struct thtree *parent, *x, *xsucc ;

/* if tree is empty */

if ( *root == NULL )

{

printf ( "\nTree is empty" ) ;

return ;

}

parent = x = NULL ;

/* call to search function to find the node to be deleted */

search ( root, num, &parent, &x, &found ) ;

/* if the node to deleted is not found */

if ( found == false )

{

printf ( "\nData to be deleted, not found" ) ;

return ;

}

/* if the node to be deleted has two children */

if ( x -> left == false && x -> right == false )

{

parent = x ;

xsucc = x -> rightchild ;

while ( xsucc -> left == false )

{

parent = xsucc ;

xsucc = xsucc -> leftchild ;

}

x -> data = xsucc -> data ;

x = xsucc ;

}

/* if the node to be deleted has no child */

if ( x -> left == true && x -> right == true )

{

if ( parent -> rightchild == x )

{

parent -> right = true ;

parent -> rightchild = x -> rightchild ;

}

else

{

parent -> left = true ;

parent -> leftchild = x -> leftchild ;

}

free ( x ) ;

return ;

}

/* if the node to be deleted has only rightchild */

if ( x -> left == true && x -> right == false )

{

if ( parent -> leftchild == x )

{

parent -> leftchild = x -> rightchild ;

x -> rightchild -> leftchild = x -> leftchild ;

}

else

{

parent -> rightchild = x -> rightchild ;

x -> rightchild -> leftchild = parent ;

}

free ( x ) ;

return ;

}

/* if the node to be deleted has only left child */

if ( x -> left == false && x -> right == true )

{

if ( parent -> leftchild == x )

{

parent -> leftchild = x -> leftchild ;

x -> leftchild -> rightchild = parent ;

}

else

{

parent -> rightchild = x -> leftchild ;

x -> leftchild -> rightchild = x -> rightchild ;

}

free ( x ) ;

return ;

}

}

/* returns the address of the node to be deleted, address of its parent and whether the node is found or not */

search ( struct thtree **root, int num, struct thtree **par, struct thtree **x, int *found )

{

struct thtree *q ;

q = ( *root ) -> leftchild ;

*found = false ;

*par = NULL ;

while ( q != root )

{

/* if the node to be deleted is found */

if ( q -> data == num )

{

*found = true ;

*x = q ;

return ;

}

if ( q -> data > num )

{

*par = q ;

q = q -> leftchild ;

}

else

{

*par = q ;

q = q -> rightchild ;

}

}

}

/* traverses the threaded binary tree in inorder */

inorder ( struct thtree *root )

{

struct thtree *p ;

p = root -> leftchild ;

while ( p != root )

{

while ( p -> left == false )

p = p -> leftchild ;

printf ( "%d ", p -> data ) ;

while ( p -> right == true )

{

p = p -> rightchild ;

if ( p == root )

break ;

printf ( "%d ", p -> data ) ;

}

p = p -> rightchild ;

}

}

And now a brief explanation about the program.

We have used an enumerated data type boolean is used to represent, if the thread is present or not. If the left is true there is a left thread means the node has no left child, if the right is true it shows the presence of right thread and the node has no right child. To insert a new node in a threaded tree, the insert( ) function is called which first checks for the empty tree. If the tree is found to be empty a head node is created and the node is joined as its left subtree with both links converted to threads by making left and right both true and leftchild and rightchild pointing back to the head node. Otherwise, the node is inserted into the tree so that a threaded binary search tree is created. Deletion of a node from the threaded binary tree is similar to that of a normal binary tree. That is, we have to identify the four possibilities about the node being deleted:

  1. No node in the tree contains the specified data.

  2. The node containing the data has no children.

  3. The node containing the data has exactly one child.

  4. The node containing the data has two children.

The treatment given to these possibilties is same as the one discussed in the previous section on binary trees except that some minor readjustment of threads.

The threaded binary tree's inorder traversal is different than a normal tree in the sense that we do not have to stack the pointers to nodes visited earlier so as to reach them later. This is avoided by using the threads to ancestors. The procedure to achieve this is as follows.

This procedure begins by first going to the left subtree of the head node using the statement

p = root -> leftchild ;

Then through a while loop we follow the leftmost pointers until a thread to a predecessor is found. On encountering this thread we print the data for the leftmost node. Next, through another while loop we check the boolean value of the right thread. If this value is true, we follow the thread back up to the ancestor node, print this ancestor node's data. This way we continue to move up till the right thread is true. When the right thread is found to be false we again proceed by going to the right child and checking it left subtree.

As we follow these steps we are sometimes likely to reach the head node, and that is the time to stop the procedure. This is what is being achieved by the statements,

if ( p == root ) break ;


No comments: