Question? Leave a message!




Quadratic probing

Quadratic probing
ECE 250 Algorithms and Data Structures Quadratic probing Douglas Wilhelm Harder, M.Math. LEL Department of Electrical and Computer Engineering University of Waterloo Waterloo, Ontario, Canada ece.uwaterloo.ca dwharderalumni.uwaterloo.ca © 20062013 by Douglas Wilhelm Harder. Some rights reserved.Quadratic probing 2 Outline This topic covers quadratic probing – Similar to linear probing • Does not step forward one step at a time – Primary clustering no longer occurs – Affected by secondary clusteringQuadratic probing 3 Background Linear probing: – Look at bins k, k + 1, k + 2, k + 3, k + 4, … – Primary clusteringQuadratic probing 4 Background Linear probing causes primary clustering – All entries follow the same search pattern for bins: int initial = hashM( x.hash(), M ); for ( int k = 0; k M; ++k ) bin = (initial + k) M; // ... Quadratic probing 5 Description Quadratic probing suggests moving forward by different amounts For example, int initial = hashM( x.hash(), M ); for ( int k = 0; k M; ++k ) bin = (initial + kk) M; Quadratic probing 6 Description Problem: – Will initial + kk step through all of the bins – Here, the array size is 10: M = 10; initial = 5 for ( int k = 0; k = M; ++k ) std::cout (initial + kk) M ' '; – The output is 5 6 9 4 1 0 1 4 9 6 5Quadratic probing 7 Description Problem: – Will initial + kk step through all of the bins – Now the array size is 12: M = 12; initial = 5 for ( int k = 0; k = M; ++k ) std::cout (initial + kk) M ' '; – The output is now 5 6 9 2 9 6 5 6 9 2 9 6 5Quadratic probing 8 Making M Prime If we make the table size M = p a prime number quadratic probing p  is guaranteed to iterates through entries  2  Problems: – All operations must be done using • Cannot use , , or • The modulus operator is relatively slow – Doubling the number of bins is difficult: • What is the next prime after 2 × 263Quadratic probing 9 Generalization More generally, we could consider an approach like: int initial = hashM( x.hash(), M ); for ( int k = 0; k M; ++k ) bin = (initial + c1k + c2kk) M; Quadratic probing 10 m Using M = 2 m If we ensure M = 2 then choose c = c = ½ 1 2 int initial = hashM( x.hash(), M ); for ( int k = 0; k M; ++k ) bin = (initial + (k + kk)/2) M; – Note that k + kk is always even 2 – The growth is still Q(k ) – This guarantees that all M entries are visited before the pattern repeats • This only works for powers of twoQuadratic probing 11 m Using M = 2 For example: – Use an array size of 16: M = 16; initial = 5 for ( int k = 0; k = M; ++k ) std::cout (initial + (k + kk)/2) M ' '; – The output is now 5 6 8 11 15 4 10 1 9 2 12 7 3 0 14 13 13Quadratic probing 12 m Using M = 2 There is an even easier means of calculating this approach int bin = hashM( x.hash(), M ); for ( int k = 0; k M; ++k ) bin = (bin + k) M; 2 k kk  j – Recall that , so just keep adding the next highest value  2 j0Quadratic probing 13 Example Consider a hash table with M = 16 bins Given a 2digit hexadecimal number: – The leastsignificant digit is the primary hash function (bin) – Example: for 6B7A , the initial bin is A 16 Quadratic probing 14 Example Insert these numbers into this initially empty hash table 9A, 07, AD, 88, BA, 80, 4C, 26, 46, C9, 32, 7A, BF, 9C 0 1 2 3 4 5 6 7 8 9 A B C D E FQuadratic probing 15 Example Start with the first four values: 9A, 07, AD, 88 0 1 2 3 4 5 6 7 8 9 A B C D E FQuadratic probing 16 Example Start with the first four values: 9A, 07, AD, 88 0 1 2 3 4 5 6 7 8 9 A B C D E F 07 88 9A ADQuadratic probing 17 Example Next we must insert BA 0 1 2 3 4 5 6 7 8 9 A B C D E F 07 88 9A ADQuadratic probing 18 Example Next we must insert BA – The next bin is empty 0 1 2 3 4 5 6 7 8 9 A B C D E F 07 88 9A BA ADQuadratic probing 19 Example Next we are adding 80, 4C, 26 0 1 2 3 4 5 6 7 8 9 A B C D E F 07 88 9A BA ADQuadratic probing 20 Example Next we are adding 80, 4C, 26 – All the bins are empty—simply insert them 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 26 07 88 9A BA 4C ADQuadratic probing 21 Example Next, we must insert 46 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 26 07 88 9A BA 4C ADQuadratic probing 22 Example Next, we must insert 46 – Bin 6 is occupied – Bin 6 + 1 = 7 is occupied – Bin 7 + 2 = 9 is empty 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 26 07 88 46 9A BA 4C ADQuadratic probing 23 Example Next, we must insert C9 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 26 07 88 46 9A BA 4C ADQuadratic probing 24 Example Next, we must insert C9 – Bin 9 is occupied – Bin 9 + 1 = A is occupied – Bin A + 2 = C is occupied – Bin C + 3 = F is empty 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 26 07 88 46 9A BA 4C AD C9Quadratic probing 25 Example Next, we insert 32 – Bin 2 is unoccupied 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 32 26 07 88 46 9A BA 4C AD C9Quadratic probing 26 Example Next, we insert 7A – Bin A is occupied – Bins A + 1 = B, B + 2 = D and D + 3 = 0 are occupied – Bin 0 + 4 = 4 is empty 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 32 7A 26 07 88 46 9A BA 4C AD C9Quadratic probing 27 Example Next, we insert BF – Bin F is occupied – Bins F + 1 = 0 and 0 + 2 = 2 are occupied – Bin 2 + 3 = 5 is empty 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 32 7A BF 26 07 88 46 9A BA 4C AD C9Quadratic probing 28 Example Finally, we insert 9C – Bin C is occupied – Bins C + 1 = D, D + 2 = F, F + 3 = 2, 2 + 4 = 6 and 6 + 5 = B are occupied – Bin B + 6 = 1 is empty 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 9C 32 7A BF 26 07 88 46 9A BA 4C AD C9Quadratic probing 29 Example Having completed these insertions: – The load factor is l = 14/16 = 0.875 – The average number of probes is 32/14 ≈ 2.29 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 9C 32 7A BF 26 07 88 46 9A BA 4C AD C9Quadratic probing 30 Resizing the array To double the capacity of the array, each value must be rehashed – 80, 9C, 32, 7A, BF, 26, 07, 88 may be immediately placed • We use the leastsignificant five bits for the initial bin 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 32 7A 9C BF – If the next leastsignificant digit is • Even, use bins 0 – F • Odd, use bins 10 – 1FQuadratic probing 31 Resizing the array To double the capacity of the array, each value must be rehashed – 46 results in a collision • We place it in bin 9 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 46 32 7A 9C BFQuadratic probing 32 Resizing the array To double the capacity of the array, each value must be rehashed – 9A results in a collision • We place it in bin 1B 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 46 32 7A 9A 9C BFQuadratic probing 33 Resizing the array To double the capacity of the array, each value must be rehashed – BA also results in a collision • We place it in bin 1D 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 46 32 7A 9A 9C BA BFQuadratic probing 34 Resizing the array To double the capacity of the array, each value must be rehashed – 4C and AD don’t cause collisions 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 46 4C AD 32 7A 9A 9C BA BFQuadratic probing 35 Resizing the array To double the capacity of the array, each value must be rehashed – Finally, C9 causes a collision • We place it in bin A 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 46 C9 4C AD 32 7A 9A 9C BA BFQuadratic probing 36 Resizing the array To double the capacity of the array, each value must be rehashed – The load factor is l = 14/32 = 0.4375 – The average number of probes is 20/14 ≈ 1.43 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 80 26 07 88 46 C9 4C AD 32 7A 9A 9C BA BFQuadratic probing 37 Erase Can we erase an object like we did with linear probing – Consider erasing 9A from this table – There are M – 1 possible locations where an object which could have occupied a position could be located 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 21 43 76 9A 50 Instead, we will use the concept of lazy deletion – Mark a bin as ERASED; however, when searching, treat the bin as occupied and continue • We must have a separate ternaryvalued flag for each bin Quadratic probing 38 Erase If we erase AD, we must mark that bin as erased 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 9C 32 7A BF 26 07 88 46 9A BA 4C AD C9Quadratic probing 39 Find When searching, it is necessary to skip over this bin – For example, find AD: D, E find 5C: C, D, F, 2, 5, 9, F, 6, E 0 1 2 3 4 5 6 7 8 9 A B C D E F 80 9C 32 7A BF 26 07 88 46 9A BA 4C AD C9Quadratic probing 40 Modified insertion We must modify insert, as we may place new items into either – Unoccupied bins – Erased binsQuadratic probing 41 Implementation Storing three states can be achieved using an enumerated type: enum binstatet UNOCCUPIED, OCCUPIED, ERASED ; Now we can declare and initialize arrays: binstatet stateM; for ( int i = 0; i M; ++i ) statei = UNOCCUPIED; Quadratic probing 42 Multiple insertions and erases One problem which may occur after multiple insertions and removals is that numerous bins may be marked as ERASED – In calculating the load factor, an ERASED bin is equivalent to an OCCUPIED bin This will increase our run times…Quadratic probing 43 Multiple insertions and erases We can easily track the number of bins which are: – UNOCCUPIED – OCCUPIED – ERASED by updating appropriate counters If the load factor l grows too large, we have two choices: – If the load factor due to occupied bins is too large, double the table size – Otherwise, rehash all of the objects currently in the hash tableQuadratic probing 44 Expected number of probes It is possible to calculate the expected number of probes for quadratic probing, again, based on the load factor: 1  ln  – Successful searches: 1l  l 1 – Unsuccessful searches: 1l When l = 2/3, we requires 1.65 and 3 probes, respectively – Linear probing required 3 and 5 probes, respectively Load Factor (l) Unsuccessful search Successful search Reference: Knuth, The Art of Computer Programming, nd Vol. 3, 2 Ed., 1998, Addison Wesley, p. 530.Quadratic probing 45 Quadratic probing versus linear probing Comparing the two: Linear probing Unsuccessful search Successful search Quadratic probing Unsuccessful search Successful search Load Factor (l) Examined BinsQuadratic probing 46 Cache misses One benefit of quadratic probing: – The first few bins examined are close to the initial bin – It is unlikely to reference a section of the array far from the initial bin Modern computers use caches – 4 KiB pages of main memory are copied into faster caches – Pages are only brought into the cache when referenced – Accesses close to the initial bin are likely to reference the same pageQuadratic probing 47 Secondary clustering One weakness with quadratic problem – It reverts to linear probing if many of the hash function is not random – Objects placed in the same bin will follow the same sequenceQuadratic probing 48 Summary In this topic, we have looked at quadratic probing: – An open addressing technique – Steps forward by a quadratically growing steps – Insertions and searching are straight forward – Removing objects is more complicated: use lazy deletion – Still subject to secondary probingQuadratic probing 49 References Wikipedia, http://en.wikipedia.org/wiki/Quadraticprobing 1 Cormen, Leiserson, and Rivest, Introduction to Algorithms, McGraw Hill, 1990. rd 2 Weiss, Data Structures and Algorithm Analysis in C++, 3 Ed., Addison Wesley. These slides are provided for the ECE 250 Algorithms and Data Structures course. The material in it reflects Douglas W. Harder’s best judgment in light of the information available to him at the time of preparation. Any reliance on these course slides by any party for any other purpose are the responsibility of such parties. Douglas W. Harder accepts no responsibility for damages, if any, suffered by any party as a result of decisions made or actions based on these course slides for any other purpose than that for which it was intended.
Website URL
Comment
sharer
Presentations
Free
Document Information
Category:
Presentations
User Name:
Dr.SethPatton
User Type:
Teacher
Country:
United Kingdom
Uploaded Date:
22-07-2017