Threaded Search Trees

In many applications, binary tree traversals are carried out repeatedly, and the overhead of stack operations - the implicit ones carried out by the runtime system (when a recursively coded traversal is being executed), or the explicit pushes/pops of a non-recursive (but stack-driven) traversal procedure - is costly in terms of program execution time. (The traversal is O(n) in terms of computational effort, but the stack operations introduce overhead (a larger run-time constant) than might be desirable.) It is sometimes useful to modify the data structure (of pointers and records) which represents the binary tree so as to speed up - say - the inorder traversal process: make it "stack-free". Certainly this would expedite - for example - a sorting method which (1) builds a binary search tree from a sequence of data keys, and (2) carries out the inorder traversal of the tree, thereby printing the keys in key order.

Oddly, most of the pointer fields in our representation of binary trees are NULL! After all, since every node (except the root) is pointed to, there are only n-1 nonNULL pointers out of a possible 2n (for an n node tree), so that n+1 pointers are NULL. The threaded tree data structure will replace these NULL pointers with pointers to the inorder successor (predecessor) of a node as appropriate. Of course, we'll need to know whenever formerly NULL pointers have been replaced by nonNULL pointers to successor/predecessor nodes, since otherwise there's no way to distinguish those pointers from the customary pointers to children. Here are the datatypes for threaded tree nodes:



           typedef enum {child, thread} thrd_bit;

	   typedef struct nd {
               char data;      /* single-character data, say */
	       struct nd *L, *R;
	       thrd_bit LTH, RTH;
           } tt_node;
We will assign thread to a LTH/RTH field if the pointer in that record is a predecessor/successor pointer, and child if the pointer is simply a structural pointer (identifying a child of a node in the usual way.) Before we discuss how to build a threaded tree, let's suppose that we have a threaded tree in place, and see how we can utilize it for fast traversal. We note that (somehow, to be discussed) any node with a NULL left pointer in the original binary tree now contains a (nonNULL) pointer to the inorder predecessor of the node (and the node's LTH field is set to thread, and any node with a NULL right pointer now has a pointer in place that identifies the inorder successor of that node (and the RTH value is thread. (We'll also put off for later discussion the matter of : who is the predecessor of the left-most node in the binary tree, and who is the successor of the right-most node?) Let's use the thread information to build a simple (non-recursive) function to find the inorder successor of any node in a threaded binary tree. If we are "at" a node pointed to by p, where do we look for it's inorder successor? The successor is in the right subtree (if there is one): in fact, it's the leftmost node in the right subtree. On the other hand, if the right subtree is empty (the R pointer used to be NULL), then we simply follow the successor thread that was installed there (by a process we still have to discuss). So:


              tt_node *next_inorder(tt_node *p){
                   if(p->RTH == thread) return(p->R);
                   else {
                      p = p->R;
                      while(p->LTH == child)
                        p = p->L;
                      return p;
                   }
              }
Now, if we can get things started correctly, we can simply call next_inorder repeatedly (in a simple loop) and move rapidly around the tree inorder printing node labels (say) - without a stack. Note that if we call next_inorder with the root of the binary tree, we're going to have some difficulty. The code won't work at all the way we want. One neat way out of this is to establish a "dummy" node as the root of a BT with only a left subtree - namely the actual binary tree we're interested in, and to have the threading process (to be discussed) use the dummy as the predecessor of the leftmost node in the "real" tree (and the successor of the rightmost node in the "real" tree). We arrange the LTH and RTH bit of the dummy to be child, and - here's the clever "fix" - have R point to the dummy itself! Now when next_inorder is called with a pointer to the dummy node, it runs down to the left and stops when it encounters the thread bit on the leftmost node, so that node becomes the first node visited inorder - as it should be. From here on, everything works fine with iterative calls, until we get to the rightmost node in the tree, where next_inorder takes us back to the dummy node. The code below "traps" this, and terminates the traversal.

        void fast_inorder(tt_node *p) {
            while((p = next_inorder(p)) != dummy)
               printf("%c", p->data);
        }
and we call fast_inorder with dummy. How are threaded trees constructed? One way (sketched out below) is to thread the tree as it gets built, one node at a time (an incremental/dynamic approach). Another is to convert an existing binary tree to a threaded tree (an in-place approach. Can you think of how this might be done?) The first - incremental - approach is similar to adding a new node to a binary search tree. Suppose we have "treesearched" our way down the tree to look for the place where the new node is to be attached (as a new leaf) to the tree-so-far. Let p and t point to threaded_tree nodes, where t identifies the new node (t = malloc(sizeof(tt_node))), and p points to the (new) parent of the new node. Suppose that *t is the left child of *p (in the opposite case, the code below is converted by exchanging right with left). If the tree were not threaded, p->L would be NULL, but in a threaded tree (no NULL pointers) p->LTH will have the value thread, and p->L points to the inorder predecessor of *p. As usual, we write to the fields of the new node before we attach it to the tree:


                  t->L = p->L;         /* copy the thread */
                  t->LTH = thread;
                  t->R = p;            /* *p is the successor of *t */
                  t->RTH = thread;
                  
                  p->L = t;            /* attach the new leaf */
                  p->LTH = child;              
To get things started correctly - to install the first node (the root - also a leaf) of the tree, we initialize the dummy node as follows:


                  tt_node *dummy = malloc(sizeof(tt_node));
                  dummy->L = dummy->R = dummy;
                  dummy->LTH = thread;
                  dummy->RTH = child;
                  dummy->data = ;          
The last assignment guarantees that the root of the "real" binary tree will be the left child of the dummy node. It is also obviously possible to write a simple inorder_predecessor routine. And, with some thought, the inorder threads can be utilized to get a preorder_successor routine, a postorder_predecessor, and also - a little more clumsily - a parent routine. Try one or more of these. Lastly, threaded trees can be built with preorder threads or with postorder threads, rather than inorder. And sometimes only "right-threaded" trees are needed: any NULL left pointers in the binary tree are left unchanged by the conversion/building processes.