§4 Additional Binary Tree Operations Copying Binary Trees: LRV where V ::= copy ( Program 5.6 on p.207 ) Testing for Equality of Binary Trees: VLR where V ::= test ( Program 5.7 on p.207 ) Read the two programs in the textbook and think: Is there any particular reason to choose one order over the others for the tasks?
§5 Threaded Binary Trees Let’s look at the rules first 2n links, if each node has only the left and right-child links. Here comes the typical question of mine: Why do we need threaded binary trees? Because I enjoy giving you headaches ... Just kidding. Okay, think of a full binary tree with n nodes. How many links are there? Any clue on how to improve the situation? We can replace the null links by “threads” which will make traversals easier. Then who should take the credit? They are A. J. Perlis and C. Thornton. You are such a genius ! n + 1. Oh well, I wish I’d have really done it Of course not! How many of them are NULL? You got it! Can I stand that?
§5 Threaded Binary Trees Rule 1: If ptr->left_child is null, replace it with a pointer to the inorder predecessor of ptr. Rule 2: If ptr->right_child is null, replace it with a pointer to the inorder successor of ptr. Rule 3: There must not be any loose threads. Therefore a threaded binary tree must have a head node of which the left child points to the first node. Structure typedef struct threaded_tree *threaded_ptr; typedef struct threaded_tree { short int left_thread; /* if it is TRUE, then left_child */ threaded_ptr left_child; /* is a thread, not a child ptr. */ char data; short int right_thread; /* if it is TRUE, then right_child */ threaded_ptr right_child; /* is a thread, not a child ptr. */ }
〖Example〗 Given the syntax tree of an expression (infix) A + B C D §5 Threaded Binary Trees 〖Example〗 Given the syntax tree of an expression (infix) A + B C D F + T A D B C head node + A D B C
The time complexity is still O( n ). What’s the advantage? §5 Threaded Binary Trees Inorder Traversal of a Threaded Binary Tree void tinorder ( threaded_ptr tree ) { /* traverse the threaded binary tree inorder */ threaded_ptr temp = tree ; /* start from the head node */ for ( ; ; ) { temp = insucc ( temp ); /* find the inorder successor */ if ( temp == tree ) break ; /* if back to the head node, done */ printf ( “ %3c”, temp->data ) ; /* else print it out */ } /* end for-loop */ } The time complexity is still O( n ). What’s the advantage? The stack is gone so it runs faster. threaded_ptr insucc ( threaded_ptr tree ) { /* find the inorder successor of tree in a threaded binary tree */ threaded_ptr temp ; temp = tree->right_child ; /* go to the right_child of tree */ if ( !tree->right_thread ) /* if temp is indeed a right child */ while ( !temp->left_thread ) /* while temp has a left child */ temp = temp->left_child ; /* then keep going to the next left child */ return temp ; /* temp’s left thread must points to tree */ }
Inserting a Node into a Threaded Binary Tree child->left_thread §5 Threaded Binary Trees Inserting a Node into a Threaded Binary Tree child->left_thread = child->right_thread = TRUE Case 1 A P B X parent C child We will examine only the case of inserting a new node as the right child. child->left_child = parent child->right_child = parent->right_child parent->right_thread = FALSE parent->right_child = child P A B parent C child child->right_child = parent->right_child child->right_thread = parent->right_thread child->left_child = parent child->left_thread = TRUE parent->right_child = child parent->right_thread = FALSE Case 2 insucc(child)->left_child = child
void insert_right (threaded_pointer parent, threaded_pointer child) { /* insert child as the right child of parent in a threaded binary tree */ threaded_pointer temp; child->right_child=parent->right_child; child->right_therad=parent->right_thread; child->left_child=parent; child->left_thread=TRUE; parent->right_child=child; parent->right_thread=FALSE; if ( !child->right_thread ) { temp = insucc(child); temp->left_child = child; } }
Two important binary trees Huffman Tree Binary search tree Definition Searching Insertion Deletion
霍夫曼树 (Huffman Tree) 具有不同路径长度的二叉树 路径(Path) 从树中一个节点到另一个节点之间的分支构成这两个节点之间的路径。 路径长度 (Path Length) 两个结点之间的路径长度是连接两结点的路径上的分支数。树的路径长度是各叶结点到根结点的路径长度之和。 具有不同路径长度的二叉树
带权路径长度 树的带权路径长度是树的各叶结点所带的权值与该结点到根的路径长度的乘积的和。 带权路径长度 ( Weighted Path Length, WPL ) 树的带权路径长度是树的各叶结点所带的权值与该结点到根的路径长度的乘积的和。 具有不同带权路径长度的扩充二叉树
Huffman tree 设有n个权值{ω1, ω 2,…, ω n},试构造一棵有n个叶子结点的二叉树,每个叶子结点带权为ω i,则其中带权路径长度WPL最小的二叉树称作最优二叉树或Huffman树。 在霍夫曼树中,权值大的结点离根最近。 利用霍夫曼树可以在解某些判定问题时得到最佳判定算法。
霍夫曼树的构造算法 由给定的 n 个权值{w0, w1, w2, …, wn-1},构造具有 n 棵扩充二叉树的森林F = { T0, T1, T2, …, Tn-1 },其中每一棵扩充二叉树 Ti 只有一个带有权值 wi 的根结点,其左、右子树均为空。 重复以下步骤, 直到 F 中仅剩下一棵树为止: 在 F 中选取两棵根结点的权值最小的扩充二叉树, 做为左、右子树构造一棵新的二叉树。置新的二叉树的根结点的权值为其左、右子树上根结点的权值之和。 在 F 中删去这两棵二叉树。 把新的二叉树加入 F。
霍夫曼树的构造过程
霍夫曼编码 An example 设给出一段报文: CAST CAST SAT AT A TASA字符集合是 { C, A, S, T }, 各个字符出现的频度(次数)是 W={ 2, 7, 4, 5 }。 若给每个字符以等长编码 A : 00 T : 10 C : 01 S : 11 则总编码长度为 ( 2+7+4+5 ) * 2 = 36. 若按各个字符出现的概率不同而给予不等长编码,让出现次数多的字符采用尽可能短的编码,可望减少总编码长度。 前缀编码 任一个字符的编码都不是另一个字符的编码的前缀。 对每个字符设计长度不等的编码,且让电文中出现次数较多的字符采用尽可能短的编码。 但必须是任意一个字符的编码都不是另一个字符的编码的前缀--》前缀编码。
利用Huffman树设计前缀编码 因各字符出现的概率为{ 2/18, 7/18, 4/18, 5/18 },化整为 { 2, 7, 4, 5 },以它们为各叶结点上的权值,建立霍夫曼树。左分支赋 0,右分支赋 1,得霍夫曼编码(变长编码)。 结果: A : 0 T : 10 C : 110 S : 111 它的总编码长度: 7*1+5*2+( 2+4 )*3 = 35。 比等长编码的情形要短。 如何得到使电文总长最短的二进制前缀编码? 假设每种字符在电文中出现的次数为wi,其编码长度为li,电文中有n种字符, 则电文总长为∑wili。 对应到二叉树,设wi为叶结点的权,li恰为从根结点到叶结点的路径的长度, 则∑wili恰为二叉树上带权路径长度。 由此可见,设计电文总长最短的二进制前缀编码即为以n种字符出现的频率为权,设计一棵haffman树的问题。
Binary Search Trees 【Definition】A binary search tree is a binary tree. It may be empty. If it is not empty, it satisfies the following properties: (1) Every element has a key, and the keys are unique. (2) The keys in a nonempty left subtree must be smaller than the key in the root of the subtree. (3) The keys in a nonempty right subtree must be larger than the key in the root of the subtree. (4) The left and right subtrees are also binary search trees. 30 5 2 40 60 70 80 65 20 15 12 25 10 22
#include "tree.h" #include <stdlib.h> #include "fatal.h" #ifndef _Tree_H #define _Tree_H struct TreeNode; typedef struct TreeNode *Position; typedef struct TreeNode *SearchTree; SearchTree MakeEmpty( SearchTree T ); Position Find( ElementType X, SearchTree T ); Position FindMin( SearchTree T ); Position FindMax( SearchTree T ); SearchTree Insert( ElementType X, SearchTree T ); SearchTree Delete( ElementType X, SearchTree T ); ElementType Retrieve( Position P ); #endif /* _Tree_H */ #include "tree.h" #include <stdlib.h> #include "fatal.h" struct TreeNode { ElementType Element; SearchTree Left; SearchTree Right; }; Position Find( ElementType X, SearchTree T ) { if( T == NULL ) return NULL; if( X < T->Element ) return Find( X, T->Left ); else if( X > T->Element ) return Find( X, T->Right ); else return T; } SearchTree MakeEmpty( SearchTree T ) { if( T != NULL ) { MakeEmpty( T->Left ); MakeEmpty( T->Right ); free( T ); } return NULL; }
Position FindMin( SearchTree T ) { if( T == NULL ) return NULL; else if( T->Left == NULL ) return T; else return FindMin( T->Left ); } Position FindMax( SearchTree T ) { if( T != NULL ) while( T->Right != NULL ) T = T->Right; return T; }
Inserting into a Binary Search Tree Sketch of the idea: Insert 80 30 5 2 40 check if 80 is already in the tree 80 > 40, so it must be the right child of 40 25 35 80 This is the last node we encounter when search for the key number. It will be the parent of the new node. Insert 35 check if 35 is already in the tree 35 < 40, so it must be the left child of 40 Insert 25 check if 25 is already in the tree 25 > 5, so it must be the right child of 5
重复元的插入可以通过在节点记录中保留一个附加域以指示发生的频率来处理。 SearchTree Insert( ElementType X, SearchTree T ) { if( T == NULL ) { /* Create and return a one-node tree */ T = malloc( sizeof( struct TreeNode ) ); if( T == NULL ) FatalError( “Out of space!!!” ); else { T->Element = X; T->Left = T->Right = NULL; } } else if( X < T->Element ) T->Left = Insert( X, T->Left ); else if( X > T->Element ) T->Right = Insert( X, T->Right ); /* Else X is in the tree already; we‘ll do nothing */ return T; /* Do not forget this line!! */ } 重复元的插入可以通过在节点记录中保留一个附加域以指示发生的频率来处理。
Deletion from a binary search tree Note: These kinds of nodes have degree at most 1. Delete leaf node Set the child field of its parent to NULL and free the node. Delete node that has a single child Erase the node and place the single child in the place of the erased node. Delete node that has two child Replace the node with either the largest element in its left subtree or the smallest element in its right subtree. Delete the replacing node from the subtree. 〖Example〗 Delete 60 40 50 45 55 52 60 70 20 10 30 h Tp = O( ? ) Solution 1: reset left subtree. 55 Solution 2: reset right subtree. 52 Of course you can do it by yourselves
SearchTree Delete( ElementType X, SearchTree T ) { Position TmpCell; if( T == NULL ) Error( "Element not found" ); else if( X < T->Element ) /* Go left */ T->Left = Delete( X, T->Left ); else if( X > T->Element ) /* Go right */ T->Right = Delete( X, T->Right ); else /* Found element to be deleted */ if( T->Left && T->Right ) /* Two children */ { /* Replace with smallest in right subtree */ TmpCell = FindMin( T->Right ); T->Element = TmpCell->Element; T->Right = Delete( T->Element, T->Right ); } else /* One or zero children */ { TmpCell = T; if( T->Left == NULL ) /* Also handles 0 children */ T = T->Right; else if( T->Right == NULL ) T = T->Left; free( TmpCell ); } return T; }
Height of a binary search tree Question: Place n elements in a binary search tree. How high can this tree be? Answer: The height depends on the order of insertion. 〖Example〗 Given elements 1, 2, 3, 4, 5, 6, 7. Insert them into a binary search tree in the orders: 4, 2, 1, 3, 6, 5, 7 and 1, 2, 3, 4, 5, 6, 7 4 2 3 1 4 5 6 7 6 Average h = O( log2 n ) 2 所有的操作都花费 O(d ), d 是包含所访问的关键字的节点的深度。树的所有节点的平均深度是 O (LogN)一棵树的所有节点的深度的和称为内部路径长(internal path length) D( N ) = D ( i ) + D ( N - i - 1 ) + N - 1 D ( N ) 具有 N 个节点的某棵树 T 的内部路径长 D(1) = 0, 一棵N节点树是由一棵i节点左子树和一棵(N - i - 1)右子树 =O ( N log N ), 任意节点的期望深度为 O (LogN) h = 7 1 3 5 7 h = 3 Balanced search trees will be discussed in next class
Excercises Question: Place n elements in a binary search tree. How high can this tree be?
AVL Tree AVL(adelson-velskii and landis)树是带有平衡条件的二叉查找树。 平衡条件: (1)树的深度是 O ( Log N ) , 左右子树具有相同的高度 (2)每个节点的左子树和右子树的高度最多差1 大致上讲,一个 AVL树的高度最多=1.44 log( N + 2 ) - 0.328
在高度为h的 AVL树中,最少节点数, S ( h ),由 S ( h ) = S ( h - 1 ) + S ( h - 2 ) + 1给出 在插入以后,只有那些从插入点到根节点的路径上的节点的平衡可能被改变,因为只有这些节点的子树可能发生变化。
可能找到一个节点,它的新平衡破坏了 AVL 条件。 对 的左儿子的左子树进行一次插入 LL 单旋转 对 的左儿子的右子树进行一次插入 LR 双旋转 对 的右儿子的左子树进行一次插入 RL 对 的右儿子的右子树进行一次插入 RR
LL
RR
LR
RL