Tries algorithm

R-way tries and tries ppt data structure and tries in analysis of algorithm and design
Dr.BenjaminClark Profile Pic
Dr.BenjaminClark,United States,Teacher
Published Date:21-07-2017
Your Website URL(Optional)
Comment
ROBERT SEDGEWICK KEVIN WAYNE Algorithms 5.2 TRIES R-way tries ‣ ternary search tries ‣ character-based operations ‣ Algorithms FOUR TH EDITION ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduSummary of the performance of symbol-table implementations Order of growth of the frequency of operations. typical case typical case typical case or order dered ed operations operations implementation implementation operations operations on keys on keys search insert delete compareTo() ✔ red-black BST log N log N log N equals() † † † hash table 1 1 1 hashCode() † under uniform hashing assumption use array accesses to make R-way decisions (instead of binary decisions) Q. Can we do better? A. Yes, if we can avoid examining the entire key, as with string sorting. 2String symbol table basic API String symbol table. Symbol table specialized to string keys. public class public class String StringSTValue STValue String StringST() ST() create an empty symbol table void put( put(String String key, Value val) key, Value val) put key-value pair into the symbol table Value get( get(String String key) key) return value paired with given key void delete( delete(String String key) key) delete key and corresponding value ⋮⋮ Goal. Faster than hashing, more flexible than BSTs. 3String symbol table implementations cost summary character accesses (typical case) character accesses (typical case) character accesses (typical case) character accesses (typical case) dedup dedup search search space implementation insert moby.txt actors.txt hit miss (references) 2 2 2 red-black BST L + c lg N c lg N c lg N 4N 1.40 97.4 hashing L L L 4N to 16N 0.76 40.6 (linear probing) Parameters file size words distinct • N = number of strings moby.txt 1.2 MB 210 K 32 K • L = length of string actors.txt 82 MB 11.4 M 900 K • R = radix Challenge. Efficient performance for string keys. 45.2 TRIES R-way tries ‣ ternary search tries ‣ character-based operations ‣ Algorithms ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduTries 6Tries Tries. from retrieval, but pronounced "try" Store characters in nodes (not keys). Each node has R children, one for each possible character. (for now, we do not draw null links) link to trie for all keys that start with s root link to trie for all keys that start with she b s t y e h h 4 key value by 4 o a l e e 0 5 6 sea 6 value for she in node sells 1 r l l corresponding to last key character she 0 shells 3 e s l 7 1 shore 7 the 5 s 3 7Search in a trie Follow links corresponding to each character in the key. Search hit: node where search ends has a non-null value. Search miss: reach null link or node where search ends has null value. get("shells") b s s t y e h h h 4 o a l e e e 0 5 6 r l ll e s ll 7 1 return value associated s s with last key character 3 3 (return 3) 8Search in a trie Follow links corresponding to each character in the key. Search hit: node where search ends has a non-null value. Search miss: reach null link or node where search ends has null value. get("she") b s s t y e h h h 4 o a l e e 0 0 e 5 6 search may terminated r l l at an intermediate node (return 0) e s l 7 1 s 3 9Search in a trie Follow links corresponding to each character in the key. Search hit: node where search ends has a non-null value. Search miss: reach null link or node where search ends has null value. get("shell") b s s t y e h h h 4 o a l e e e 0 5 6 r l ll e s ll 7 1 no value associated with last key character s 3 (return null) 10Search in a trie Follow links corresponding to each character in the key. Search hit: node where search ends has a non-null value. Search miss: reach null link or node where search ends has null value. get("shelter") b s s t y e h h h 4 o a l e e e 0 5 6 r l ll e s l 7 1 no link to t s 3 (return null) 11Insertion into a trie Follow links corresponding to each character in the key. Encounter a null link: create new node. Encounter the last character of the key: set value in that node. put("shore", 7) b s s t y e h h h 4 o a l e 0 e 5 6 r l l e s l 7 1 s 3 12Trie construction demo trieTrie construction demo trie b s t y e h h 4 o a l e e 0 5 6 r l l e 7 s 1 l s 3Trie representation: Java implementation Node. A value, plus references to R nodes. private static class Node use Object instead of Value since private Object value; no generic array creation in Java private Node next = new NodeR; characters are implicitly neither keys nor s defined by link index characters are s explicitly stored e h e h e a l a 0 e 2 2 0 l l each node has l an array of links s and a value 1 s 1 Trie representation 15R-way trie: Java implementation public class TrieSTValue extended ASCII private static final int R = 256; private Node root = new Node(); private static class Node / see previous slide / public void put(String key, Value val) root = put(root, key, val, 0); private Node put(Node x, String key, Value val, int d) if (x == null) x = new Node(); if (d == key.length()) x.val = val; return x; char c = key.charAt(d); x.nextc = put(x.nextc, key, val, d+1); return x; ⋮ 16R-way trie: Java implementation (continued) ⋮ public boolean contains(String key) return get(key) = null; public Value get(String key) Node x = get(root, key, 0); if (x == null) return null; cast needed return (Value) x.val; private Node get(Node x, String key, int d) if (x == null) return null; if (d == key.length()) return x; char c = key.charAt(d); return get(x.nextc, key, d+1); 17Trie performance Search hit. Need to examine all L characters for equality. Search miss. Could have mismatch on first character. Typical case: examine only a few characters (sublinear). Space. R null links at each leaf. (but sublinear space possible if many short strings share common prefixes) characters are implicitly s defined by link index s e h e h e a l a e 2 0 2 0 l l each node has l an array of links s and a value 1 s 1 Trie representation Bottom line. Fast search hit and even faster search miss, but wastes space. 18Deletion in an R-way trie To delete a key-value pair: Find the node corresponding to key and set value to null. If node has null value and all null links, remove that node (and recur). delete("shells") b s s t y e h h h 4 o a l e e e 0 5 6 r l ll e s l 7 1 l s s set value to null 3 19Deletion in an R-way trie To delete a key-value pair: Find the node corresponding to key and set value to null. If node has null value and all null links, remove that node (and recur). delete("shells") b s s t y e h h 4 o a l e e 0 5 6 r l l e s 7 1 l null value and links s (delete node) 20