Question? Leave a message!




BINARY SEARCH TREES

BINARY SEARCH TREES
ROBERT SEDGEWICK KEVIN WAYNE Algorithms 3.2 BINARY SEARCH TREES BSTs ‣ ordered operations ‣ deletion ‣ Algorithms FOUR TH EDITION ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.edu3.2 BINARY SEARCH TREES BSTs ‣ ordered operations ‣ deletion ‣ Algorithms ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduBinary search trees Definition. A BST is a binary tree in symmetric order. root a left link a subtree A binary tree is either: Empty. right child of root Two disjoint binary trees (left and right). null links Anatomy of a binary tree parent of A and R Symmetric order. Each node has a key, key S left link of E and every node’s key is: E X 9 A R value Larger than all keys in its left subtree. associated C H with R Smaller than all keys in its right subtree. keys smaller than E keys larger than E Anatomy of a binary search tree 3Binary search tree demo Search. If less, go left; if greater, go right; if equal, search hit. successful search for H S E X A R C H M 4Binary search tree demo Insert. If less, go left; if greater, go right; if null, insert. insert G S E X A R C H G M 5BST representation in Java Java definition. A BST is a reference to a root Node. A Node is composed of four fields: A Key and a Value. A reference to the left and right subtree. smaller keys larger keys private class Node BST private Key key; private Value val; Node key val private Node left, right; public Node(Key key, Value val) left right this.key = key; this.val = val; BST with smaller keys BST with larger keys Binary search tree Key and Value are generic types; Key is Comparable 6BST implementation (skeleton) public class BSTKey extends ComparableKey, Value root of BST private Node root; private class Node / see previous slide / public void put(Key key, Value val) / see next slides / public Value get(Key key) / see next slides / public void delete(Key key) / see next slides / public IterableKey iterator() / see next slides / 7BST search: Java implementation Get. Return value corresponding to given key, or null if no such key. public Value get(Key key) Node x = root; while (x = null) int cmp = key.compareTo(x.key); if (cmp 0) x = x.left; else if (cmp 0) x = x.right; else if (cmp == 0) return x.val; return null; Cost. Number of compares is equal to 1 + depth of node. 8BST insert Put. Associate value with key. inserting L S E X A R H C Search for key, then two cases: M Key in tree ⇒ reset value. P search for L ends at this null link Key not in tree ⇒ add new node. S E X A R H C M P create new node L S E X A R C H M reset links L P on the way up Insertion into a BST 9BST insert: Java implementation Put. Associate value with key. concise, but tricky, recursive code; public void put(Key key, Value val) read carefully root = put(root, key, val); private Node put(Node x, Key key, Value val) if (x == null) return new Node(key, val); int cmp = key.compareTo(x.key); if (cmp 0) x.left = put(x.left, key, val); else if (cmp 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; return x; Cost. Number of compares is equal to 1 + depth of node. 10best case H C S Tree shape R X A E Many BSTs correspond to same set of keys. typical case S best case Number of compares for search/insert is equal to 1 + depth of node. H E X C S A R R X A E C H A best case typical case worst case S H C E X C S E A R R X A E R C H S X typical case S A worst case E X C BST possibilities E A R C H R S X A Bottom line. Tree shape depends on order of insertion. worst case C E BST possibilities 11 R S X BST possibilities H H HBST insertion: random order visualization Ex. Insert keys in random order. 12Sorting with a binary heap Q. What is this sorting algorithm 0. Shuffle the array of keys. 1. Insert all keys into a BST. 2. Do an inorder traversal of BST. A. It's not a sorting algorithm (if there are duplicate keys) Q. OK, so what if there are no duplicate keys Q. What are its properties 13Correspondence between BSTs and quicksort partitioning P H T S U D O E I A Y C M L Remark. Correspondence is 1–1 if array has no duplicate keys. 14BSTs: mathematical analysis Proposition. If N distinct keys are inserted into a BST in random order, the expected number of compares for a search/insert is 2 ln N. Pf. 1–1 correspondence with quicksort partitioning. Proposition. Reed, 2003 If N distinct keys are inserted in random order, expected height of tree is 4.311 ln N. How Tall is a Tree How Tall is a Tree Bruce Reed CNRS, Paris, France Bruce Reed CNRS, Paris, France reedmoka.ccr.jussieu.fr reedmoka.ccr.jussieu.fr 3 ABSTRACT purpose of this note to prove that for /3 ½ + , we 3 ABSTRACT have: Let H be the height of a random purpose binary of search this note tree to on prove n that for /3 ½ + , we nodes. We show that there exists constants a = 4.31107... have: Let H be the height of a random binary search tree on n THEOREM 1. E(H) = logn /31oglogn + O(1) and and/3 = 1.95... such that E(H) = clogn /31oglogn + nodes. We show that there exists constants a = 4.31107... Var(Hn) = O(1) . O(1), We also show that Var(H) = O(1). THEOREM 1. E(H) = logn /31oglogn + O(1) and and/3 = 1.95... such that E(H) = clogn /31oglogn + Var(Hn) = O(1) . O(1), We also show that Var(H) = O(1). Remark By the definition of a, /3 = 3 7"g" The first defi Categories and Subject Descriptors nition given is more suggestive of why this value is correct, Remark By the definition of a, /3 = 3 7"g" The first defi Categories and Subject Descriptors as we will see. E.2 Data Structures: Trees nition given is more suggestive of why this value is correct, But… Worstcase height is N. For more information on random binary search trees, one as we will see. E.2 Data Structures: Trees may consult 6,7, 1, 2, 9, 4, and 8. For more information on random binary search trees, one 1. THE RESULTS Remark After I announced these results, Drmota(unpublished) may consult 6,7, 1, 2, 9, 4, and 8. exponentially small chance when keys are inserted in random order A binary search tree is a binary tree to each node of which 1. THE RESULTS developed an alternative proof of the fact that Var(Hn) = Remark After I announced these results, Drmota(unpublished) we have associated a key; these keys axe drawn from some O(1) using completely different techniques. As our two A binary search tree is a binary tree to each node of which developed an alternative proof of the fact that Var(Hn) = totally ordered set and the key at v cannot be larger than proofs illuminate different aspects of the problem, we have we have associated a key; these keys axe drawn from some O(1) using completely different techniques. As 15our two the key at its right child nor smaller than the key at its left decided to submit the journal versions to the same journal totally ordered set and the key at v cannot be larger than proofs illuminate different aspects of the problem, we have child. Given a binary search tree T and a new key k, we and asked that they be published side by side. the key at its right child nor smaller than the key at its left decided to submit the journal versions to the same journal insert k into T by traversing the tree starting at the root child. Given a binary search tree T and a new key k, we and asked that they be published side by side. and inserting k into the first empty position at which we insert k into T by traversing the tree starting at the root 2. A MODEL arrive. We traverse the tree by moving to the left child of the and inserting k into the first empty position at which we If we construct a binary search tree from a permutation current node if k is smaller than the 2. current A MODEL key and moving arrive. We traverse the tree by moving to the left child of the of 1, ..., n and i is the first key in the permutation then: to the right child otherwise. Given some permutation of If we construct a binary search tree from a permutation current node if k is smaller than the current key and moving i appears at the root of the tree, the tree rooted at the a set of keys, we construct a binary search tree from this of 1, ..., n and i is the first key in the permutation then: to the right child otherwise. Given some permutation of left child of i contains the keys 1, ..., i 1 and its shape permutation by inserting them in the given order into an i appears at the root of the tree, the tree rooted at the a set of keys, we construct a binary search tree from this depends only on the order in which these keys appear in initially empty tree. left child of i contains the keys 1, ..., i 1 and its shape permutation by inserting them in the given order into an the permutation, mad the tree rooted at the right child of i The height Hn of a random binary search tree T, on n depends only on the order in which these keys appear in initially empty tree. contains the keys i + 1, ..., n and its shape depends only on nodes, constructed in this manner starting from a random the permutation, mad the tree rooted at the right child of i The height Hn of a random binary search tree T, on n the order in which these keys appear in the permutation. equiprobable permutation of 1,..., n, is known to be close contains the keys i + 1, ..., n and its shape depends only on nodes, constructed in this manner starting from a random From this observation, one deduces that Hn is also the num to alogn where a = 4.31107... is the unique solution on the order in which these keys appear in the permutation. equiprobable permutation of 1,..., n, is known to be close ber of levels of recursion required when Vazfilla Quicksort 2, ) of the equation a log((2e)/a) = 1 (here and elsewhere, From this observation, one deduces that Hn is also the num to alogn where a = 4.31107... is the unique solution on (i.e. the version of Quicksort in which the first element in log is the natural logarithm). First, Pittel10 showed that ber of levels of recursion required when Vazfilla Quicksort 2, ) of the equation a log((2e)/a) = 1 (here and elsewhere, the permuation is chosen as the pivot) is applied to a random H,/log n 3' almost surely as n + c for some positive (i.e. the version of Quicksort in which the first element in log is the natural logarithm). First, Pittel10 showed that permutation of 1, ..., n. constant 7. This constant was known not to exceed c 11, the permuation is chosen as the pivot) is applied to a random H,/log n 3' almost surely as n + c for some positive Our observation also allows us to construct Tn from the top and Devroye3 showed that "7 = a, as a consequence of the permutation of 1, ..., n. constant 7. This constant was known not to exceed c 11, down. To ease our exposition, we think of T, as a labelling fact that E(Hn) clogn. Robson12 has found that Hn Our observation also allows us to construct Tn from the top and Devroye3 showed that "7 = a, as a consequence of the of a subtree of T, the complete infinite binary tree. does not vary much from experiment to experiment, and down. To ease our exposition, we think of T, as a labelling fact that E(Hn) clogn. Robson12 has found that Hn We will expose the key associated with each node t of T. seems to have a fixed range of width not depending upon n. of a subtree of T, the complete infinite binary tree. does not vary much from experiment to experiment, and To underscore the relationship with Quicksort, we refer to Devroye and Reed5 proved that Var(Hn) = O((log log n)2), We will expose the key associated with each node t of T. seems to have a fixed range of width not depending upon n. the key at t as the pivot at t. Suppose then that we have but this does not quite confirm Robson's findings. It is the To underscore the relationship with Quicksort, we refer to Devroye and Reed5 proved that Var(Hn) = O((log log n)2), exposed the pivots for some of the nodes forming a subtree the key at t as the pivot at t. Suppose then that we have but this does not quite confirm Robson's findings. It is the of Too, rooted at the root of T. Suppose further that for exposed the pivots for some of the nodes forming a subtree some node t of T¢, all of the ancestors of t are in T, and of Too, rooted at the root of T. Suppose further that for Permission to make digital or hard copies of all or part of this work for we have chosen their pivots. Then, these choices determine some node t of T¢, all of the ancestors of t are in T, and personal or classroom use is granted without fee provided that copies Permission to make digital or hard copies of all or part of this work for the set of keys Kt which will appear at the (possibly empty) we have chosen their pivots. Then, these choices determine are not made or distributed for profit or commercial advantage and that personal or classroom use is granted without fee provided that copies subtree of T, rooted at t, but will have no effect on the order copies bear this notice and the full citation the on set the of first keys page. Kt To which copy will appear at the (possibly empty) are not made or distributed for profit or commercial advantage and that in which we expect the keys in Kt to appear. Indeed each otherwise, to republish, to post on servers subtree or to redistribute of T, rooted to lists, at t, but will have no effect on the order copies bear this notice and the full citation on the first page. To copy permutation of Kt is equally likely. Thus, each of the keys requires prior specific permission and/or a fee. in which we expect the keys in Kt to appear. Indeed each otherwise, to republish, to post on servers or to redistribute to lists, in Kt will be equally likely to be the pivot. We let nt be STOC 2000 Portland Oregon USA permutation of Kt is equally likely. Thus, each of the keys requires prior specific permission and/or a fee. the number of keys in this set and specify the pivot at t by Copyright ACM 2000 1581131844/00/5...5.00 in Kt will be equally likely to be the pivot. We let nt be STOC 2000 Portland Oregon USA the number of keys in this set and specify the pivot at t by Copyright ACM 2000 1581131844/00/5...5.00 479 479 ST implementations: summary guarantee guarantee average case average case operations operations implementation implementation on keys on keys search insert search hit insert sequential search equals() N N ½ N N (unordered list) binary search compareTo() lg N N lg N ½ N (ordered array) BST compareTo() N N 1.39 lg N 1.39 lg N Why not shuffle to ensure a (probabilistic) guarantee of 4.311 ln N 163.2 BINARY SEARCH TREES BSTs ‣ ordered operations ‣ deletion ‣ Algorithms ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduMinimum and maximum Minimum. Smallest key in table. Maximum. Largest key in table. max() max S min() E X min A R H C M Examples of BST order queries Q. How to find the min / max 18Floor and ceiling Floor. Largest key ≤ a given key. Ceiling. Smallest key ≥ a given key. floor(G) max() S min() E X A R ceiling(Q) H C M floor(D) Examples of BST order queries Q. How to find the floor / ceiling 19Computing the floor fififi nding nding nding floor(G) floor(G) floor(G) Case 1. k equals the key in the node SSS EEE XXX The floor of k is k. AAA RRR HHH CCC GGG is less than is less than is less than SSS s s so o o MMM Case 2. k is less than the key in the node floor(G) floor(G) floor(G) m m must b ust b ust be e e on the le on the le on the left ft ft The floor of k is in the left subtree. SSS EEE XXX AAA RRR Case 3. k is greater than the key in the node CCC HHH MMM The floor of k is in the right subtree GGG is g is g is gr r reat eat eate e er than r than r than EEE s s so o o floor(G) floor(G) floor(G) c c co o ould b uld b uld be e e (if there is any key ≤ k in right subtree); on the r on the r on the rig ig ight ht ht SSS EEE XXX otherwise it is the key in the node. AAA RRR CCC HHH MMM floor(G) floor(G) floor(G)in le in le in left ft ft s s sub ub ubt t tr r ree is ee is ee is null null null SSS EEE XXX AAA RRR r r res es esult ult ult CCC HHH MMM 20 C C Computing the omputing the omputing the flflfl oor func oor func oor function tion tionComputing the floor fi nding floor(G) S E X public Key floor(Key key) A R H C Node x = floor(root, key); G is less than S so M if (x == null) return null; floor(G) must be on the left return x.key; S private Node floor(Node x, Key key) E X A R if (x == null) return null; C H int cmp = key.compareTo(x.key); M G is greater than E so floor(G) could be if (cmp == 0) return x; on the right S E X if (cmp 0) return floor(x.left, key); A R C H Node t = floor(x.right, key); M if (t = null) return t; floor(G)in left else return x; subtree is null S E X A R result C H M 21 Computing the fl oor functionRank and select Q. How to implement rank() and select() efficiently A. In each node, we store the number of nodes in the subtree rooted at that node; to implement size(), return the count at the root. 8 node count S 6 1 E X 3 2 A R 2 1 C H 1 M 22BST implementation: subtree counts private class Node public int size() return size(root); private Key key; private Value val; private int size(Node x) private Node left; private Node right; if (x == null) return 0; ok to call private int count; return x.count; when x is null number of nodes in subtree initialize subtree private Node put(Node x, Key key, Value val) count to 1 if (x == null) return new Node(key, val, 1); int cmp = key.compareTo(x.key); if (cmp 0) x.left = put(x.left, key, val); else if (cmp 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; x.count = 1 + size(x.left) + size(x.right); return x; 23Rank Rank. How many keys k 8 node count S 6 1 E X Easy recursive algorithm (3 cases) 3 2 A R 2 1 C H 1 M public int rank(Key key) return rank(key, root); private int rank(Key key, Node x) if (x == null) return 0; int cmp = key.compareTo(x.key); if (cmp 0) return rank(key, x.left); else if (cmp 0) return 1 + size(x.left) + rank(key, x.right); else if (cmp == 0) return size(x.left); 24Inorder traversal Traverse left subtree. Enqueue key. Traverse right subtree. public IterableKey keys() BST QueueKey q = new QueueKey(); key val inorder(root, q); return q; left right private void inorder(Node x, QueueKey q) BST with smaller keys BST with larger keys if (x == null) return; smaller keys, in order key larger keys, in order inorder(x.left, q); q.enqueue(x.key); all keys, in order inorder(x.right, q); Property. Inorder traversal of a BST yields keys in ascending order. 25BST: ordered symbol table operations summary sequential binary BST search search search N lg N h insert N N h h = height of BST (proportional to log N min / max N 1 h if keys inserted in random order) floor / ceiling N lg N h rank N lg N h select N 1 h ordered iteration N log N N N order of growth of running time of ordered symbol table operations 263.2 BINARY SEARCH TREES BSTs ‣ ordered operations ‣ deletion ‣ Algorithms ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduST implementations: summary guarantee guarantee guarantee average case average case average case or order dered ed operations operations implementation implementation ops ops on keys on keys search insert delete search hit insert delete sequential search equals() N N N ½ N N ½ N (linked list) binary search compareTo() ✔ lg N N N lg N ½ N ½ N (ordered array) BST compareTo() ✔ N N N 1.39 lg N 1.39 lg N Next. Deletion in BSTs. 28BST deletion: lazy approach To remove a node with a given key: Set its value to null. Leave key in tree to guide search (but don't consider it equal in search). E E A S A S delete I C C tombstone I ☠ H R H R N N Cost. 2 ln N' per insert, search, and delete (if keys in random order), where N' is the number of keyvalue pairs ever inserted in the BST. Unsatisfactory solution. Tombstone (memory) overload. 29Deleting the minimum To delete the minimum key: go left until Go left until finding a node with a null left link. S reaching null X E left link Replace that node by its right link. R A Update subtree counts. C H M return that S node’s right link X E R A H C public void deleteMin() M available for root = deleteMin(root); garbage collection update links and node counts private Node deleteMin(Node x) after recursive calls 7 S 5 if (x.left == null) return x.right; X E x.left = deleteMin(x.left); R C x.count = 1 + size(x.left) + size(x.right); H return x; M Deleting the minimum in a BST 30Hibbard deletion To delete a node with key k: search for node t containing key k. Case 0. 0 children Delete t by setting parent link to null. deleting C update counts after recursive calls 7 S 5 S S E X E X 1 E X A R A R A R H H C H M M replace with M null link node to delete available for garbage collection C 31Hibbard deletion To delete a node with key k: search for node t containing key k. Case 1. 1 child Delete t by replacing parent link. deleting R update counts after recursive calls 7 S S 5 S E X E X E X A H A H A R M M C C H C node to delete M replace with available for child link garbage collection R 32deleting E Hibbard deletion node to delete S E X To delete a node with key k: search for node t containing key k. A R C H search for key E M Case 2. 2 children x has no left child Find successor x of t. t S but don't garbage collect x E X Delete the minimum in t's right subtree. x A R still a BST Put x in t's spot. successor C H deleting E min(t.right) M go right, then node to delete go left until S reaching null left link E X S A R E X x C H deleteMin(t.right) t.left search for key E H M A R C M t S 7 E X S 5 x A R H X successor C H min(t.right) A R M C M go right, then update links and go left until node counts after reaching null recursive calls left link S Deletion in a BST E X x 33 deleteMin(t.right) t.left H A R C M 7 S 5 H X A R C M update links and node counts after recursive calls Deletion in a BSTHibbard deletion: Java implementation public void delete(Key key) root = delete(root, key); private Node delete(Node x, Key key) if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp 0) x.left = delete(x.left, key); search for key else if (cmp 0) x.right = delete(x.right, key); else if (x.right == null) return x.left; no right child if (x.left == null) return x.right; no left child Node t = x; replace with x = min(t.right); successor x.right = deleteMin(t.right); x.left = t.left; update subtree x.count = size(x.left) + size(x.right) + 1; counts return x; 34Hibbard deletion: analysis Unsatisfactory solution. Not symmetric. Surprising consequence. Trees not random () ⇒ √N per op. Longstanding open problem. Simple and efficient delete for BSTs. 35ST implementations: summary guarantee guarantee guarantee average case average case average case or order dered ed operations operations implementation implementation ops ops on keys on keys search insert delete search hit insert delete sequential search equals() N N N ½ N N ½ N (linked list) binary search compareTo() ✔ lg N N N lg N ½ N ½ N (ordered array) BST compareTo() ✔ N N N 1.39 lg N 1.39 lg N √ N other operations also become √N if deletions allowed Next lecture. Guarantee logarithmic performance for all operations. 36
Website URL
Comment