Question? Leave a message!




Radix sort

Radix sort
ECE 250 Algorithms and Data Structures Radix sort Douglas Wilhelm Harder, M.Math. LEL Department of Electrical and Computer Engineering University of Waterloo Waterloo, Ontario, Canada ece.uwaterloo.ca dwharderalumni.uwaterloo.ca © 20062013 by Douglas Wilhelm Harder. Some rights reserved.Radix sort 2 Outline This topic covers radix sort: – Bucket sort is Q(n + m) – Is it possible to capitalize on the linear behaviour – Consider sorting digitsRadix sort 3 Radix Sort Suppose we want to sort 10 digit numbers with repetitions 10 – We could use bucket sort, but this would require the use of 10 buckets – With one byte per counter, this would require 9 GiB This may not be very practical…Radix sort 4 Radix Sort Consider the following scheme – Given the numbers 16 31 99 59 27 90 10 26 21 60 18 57 17 – If we first sort the numbers based on their last digit only, we get: 90 10 60 31 21 16 26 27 57 17 18 99 59 – Now sort according to the first digit: 10 16 17 18 21 26 27 31 57 59 60 90 99Radix sort 5 Radix Sort The resulting sequence of numbers is a sorted list Thus, consider the following algorithm: – Suppose we are sorting decimal numbers – Create an array of 10 queues – For each digit, starting with the least significant • Place the ith number in to the bin corresponding with the current digit • Remove all digits in the order they were placed into the bins in the order of the binsRadix sort 6 Radix Sort Suppose that two ndigit numbers are equal for the first m digits: ... ... a = a a a a a a a n n – 1 n – 2 n – m + 1 n–m 1 0 ... ... b = a a a a b b b n n – 1 n – 2 n – m + 1 n–m 1 0 where a b n–m n–m For example, 103574 103892 because 1 = 1, 0 = 0, 3 = 3 but 5 8 Then, on iteration n – m, a will be placed in a lower bin than b When they are taken out, a will precede b in the listRadix sort 7 Radix Sort For all subsequent iterations, a and b will be placed in the same bin, and will therefore continue to be taken out in the same order Therefore, in the final list, a must precede bRadix sort 8 Example 1 Sort the following decimal numbers: 86 198 466 709 973 981 374 766 473 342 First, interpret 86 as 086Radix sort 9 Example 1 Next, create an array of 10 queues: 0 1 2 3 4 5 6 7 8 9Radix sort 10 Example 1 rd Push according to the 3 digit: 086 198 466 709 973 981 374 766 473 342 0 1 981 2 342 3 973 473 4 374 5 6 086 466 766 7 8 198 9 709 and dequeue: 981 342 973 473 374 086 466 766 198 709Radix sort 11 Example 1 nd Enqueue according to the 2 digit: 981 342 973 473 374 086 466 766 198 709 0 709 1 2 3 4 342 5 6 466 766 7 973 473 374 8 981 086 9 198 and dequeue: 709 342 466 766 973 473 374 981 086 198Radix sort 12 Example 1 st Enqueue according to the 1 digit: 709 342 466 766 973 473 374 981 086 198 0 086 1 198 2 3 342 374 4 466 473 5 6 7 709 766 8 9 973 981 and dequeue: 086 198 342 374 466 473 709 766 973 981Radix sort 13 Example 1 The numbers 086 198 342 374 466 473 709 766 973 981 are now in order The next example uses the binary representation of numbers, which is even easier to followRadix sort 14 Example 2 Sort the following base 2 numbers: 1111 11011 11001 10000 11010 101 11100 111 1011 10101 First, interpret each as a 5bit number: 01111 11011 11001 10000 11010 00101 11100 00111 01011 10101 Next, create an array of two queues: 0 1Radix sort 15 Example 2 Place the numbers 01111 11011 11001 10000 11010 00101 11100 00111 01011 10101 th into the queues based on the 5 bit: 00 10000 11010 11100 11 01111 11011 11001 00101 00111 01011 10101 Remove them in order: 10000 11010 11100 01111 11011 11001 00101 00111 01011 10101Radix sort 16 Example 2 Place the numbers 10000 11010 11100 01111 11011 11001 00101 00111 01011 10101 th into the queues based on the 4 bit: 0 10000 11100 11001 00101 10101 0 11 11010 01111 11011 00111 01011 Remove them in order: 10000 11100 11001 00101 10101 11010 01111 11011 00111 01011Radix sort 17 Example 2 Place the numbers 10000 11100 11001 00101 10101 11010 01111 11011 00111 01011 rd into the queues based on the 3 bit: 0 10000 11001 11010 11011 01011 0 11 11100 00101 10101 01111 00111 Remove them in order: 10000 11001 11010 11011 01011 11100 00101 10101 01111 00111Radix sort 18 Example 2 Place the numbers 10000 11001 11010 11011 01011 11100 00101 10101 01111 00111 nd into the queues based on the 2 bit: 0 10000 00101 10101 00111 0 11 11001 11010 11011 01011 11100 01111 Remove them in order: 10000 00101 10101 00111 11001 11010 11011 01011 11100 01111Radix sort 19 Example 2 Place the numbers 10000 00101 10101 00111 11001 11010 11011 01011 11100 01111 st into the queues based on the 1 bit: 0 00101 00111 01011 01111 0 11 10000 10101 11001 11010 11011 11100 Remove them in order: 00101 00111 01011 01111 10000 10101 11001 11010 11011 11100Radix sort 20 Example 2 The numbers 00101 00111 01011 01111 10000 10101 11001 11010 11011 11100 are now in order This required 5n enqueues and dequeues – In this case, it n = 10Radix sort 21 Sorting binary numbers The implementation of multiple queues requires a lot of memory – Note, however, that the sum of the entries in the two queues is always n – Create a new array of size n for a twoended queues where • If the relevant bit is 0, enqueue it at at the front • Otherwise, the relevant bit is 1, so enqueue it at the backRadix sort 22 Sorting binary numbers Once we finish, the two queues have n entries – Now, suppose the 0queue has current entries • To iterate through the entries: go from 0 to current – 1, then from n 1 down to current • Go to the next bit and now use the original array as the queueRadix sort 23 Example Consider sorting the following 20 3bit numbers 110 111 001 011 101 010 000 100 010 001 111 010 001 011 101 010 111 101 000 100 – These are the numbers 6 7 1 3 5 2 0 4 2 1 7 2 1 3 5 2 7 5 0 4Radix sort 24 Example Allocate a new array 110 111 001 011 101 010 000 100 010 001 111 010 001 011 101 010 111 101 000 100Radix sort 25 Example Sort on the leastsignificant bit 110 111 001 011 101 010 000 100 010 001 111 010 001 011 101 100 111 101 000 100 110 010 000 100 010 010 010 000 100 101 111 101 011 001 111 001 101 011 001 111Radix sort 26 Example Consider the original array as empty 110 010 000 100 010 010 010 000 100 101 111 101 011 001 111 001 101 011 001 111Radix sort 27 Example Sort on the intermediate bit 000 100 000 100 001 101 001 001 101 101 111 011 111 011 111 010 010 010 010 110 110 010 000 100 010 010 010 000 100 101 111 101 011 001 111 001 101 011 001 111Radix sort 28 Example Consider the other array as empty 000 100 000 100 001 101 001 001 101 101 111 011 111 011 111 010 010 010 010 110Radix sort 29 Example Sort on the mostsignificant bit 000 100 000 100 001 101 001 001 101 101 111 011 111 011 111 010 010 010 010 110 000 000 001 001 001 010 010 010 010 011 011 111 111 111 110 101 101 101 100 100Radix sort 30 Example Now we must copy back, swapping the second half 000 000 001 001 001 010 010 010 010 011 011 111 111 111 110 101 101 101 100 100Radix sort 31 Example Now we must copy back, swapping the second half 000 000 001 001 001 010 010 010 010 011 011 100 100 101 101 101 110 111 111 111 000 000 001 001 001 010 010 010 010 011 011 111 111 111 110 101 101 101 100 100Radix sort 32 Example Deleting the temporary array and we are finished 000 000 001 001 001 010 010 010 010 011 011 100 100 101 101 101 110 111 111 111 – Thus, 6 7 1 3 5 2 0 4 2 1 7 2 1 3 5 2 7 5 0 4 is now 0 0 1 1 1 2 2 2 2 3 3 4 4 5 5 5 6 7 7 7Radix sort 33 Implementation template typename Type void radixsort( Type array, sizet n ) // Acknowledgement: Jake Greenberg Type queue = new Typen; sizet q0 = n, q1 = 0; bool copyback = false; for ( Type bit = 1; bit; bit = 1 ) sizet current = q0; q0 = 0; q1 = n 1; // q0, q1 next tail location for 0 and 1queues copyback = copyback; // Copy the entries of 'array' to either queue based on // if the bit of arrayk is 0 or 1 // the 0queue in 'array' starts at 0 to 'current – 1', // while the 1queue starts at n 1 and goes down to 'current' std::swap( array, queue ); // Alternate between the two arrays // Clean upRadix sort 34 Copy the entries // Start at front and go up to 'current 1' for ( sizet k = 0; k current; ++k ) if ( arrayk bit ) queueq1 = arrayk; q1; else queueq0 = arrayk; ++q0; // Start at the end and return to 'current' for ( sizet k = n 1; k = current; k ) if ( arrayk bit ) queueq1 = arrayk; q1; else queueq0 = arrayk; ++q0; Radix sort 35 Clean up if ( copyback ) sizet k = 0; for ( ; k q0; ++k ) queuek = arrayk; for ( sizet j = n 1; j = q0; j, ++k ) queuek = arrayj; delete array; else for ( sizet j = q0, k = n 1; j k; ++j, k ) std::swap( arrayj, arrayk ); delete queue; Radix sort 36 Signed integers and floatingpoint numbers Some observations: – This implement works with positive doubles as well as unsigned integer datatypes – If we are using signed integers stored using 2s complement, we need only store all 1s first and then all 0s – If we are sorting arbitrary doubles, we must also reverse the order of the numbers identified as negative—those with a leading bit equal to 1Radix sort 37 A fast implementation void radixsort( int array, sizet n, bool findmax = true ) int queue = new intn; unsigned int copyfrom = 1; int const a02 = array, queue; int const r02 = array + (n 1), queue + (n 1); int const an = array + n; int q0 = an, q1 = r00; unsigned int all0 = 0, all1 = all0; if ( findmax ) for ( int p = a00; p = an; ) all0 = p; all1 = p++; Find any bit that are either all 0s or all 1s – This is better than finding the maximum entry – Don’t sort these bits all0 = all0; Radix sort 38 A fast implementation for ( unsigned int mask = 1; mask; mask = 1 ) // Do not sort a bit that is all 0s or all 1s if ( (all0 mask) (all1 mask) ) int const const c0 = q0; int const const c1 = q1; q0 = a0copyfrom; q1 = r0copyfrom; copyfrom = 1; Iterate forward, placing entries as appropriate for ( int p = a0copyfrom; p = c0; ) (p mask ) ( q1 = p++ ) : ( q0++ = p++ ); for ( int p = r0copyfrom; p = c1; ) ( p mask ) ( q1 = p ) : ( q0++ = p ); Iterate backward, placing entries as appropriate Radix sort 39 A fast implementation Reverse entries in the 1queue if ( copyfrom ) for ( int p = q0, q = r00; p q; ) std::swap( p++, q ); else int p = array; Copy entries back to the original array, reversing the entries in in the 1queue for ( int q = a01; q = q0; ) p++ = q++; for ( int q = r01; q = q1; ) p++ = q; 6.5 × faster than the previous implementation delete queue; Radix sort 40 A fast implementation Some run times: 30 – On ecelinux, it takes 132 s to sort 2 int • That’s 123 ns per entry • Oneandahalf days to sort a trillion integersRadix sort 41 A fast implementation We can also run these tests with the Standard Template Library include algorithm sort( array, array + n ); – In some cases, our radix sort is faster 30 – For sorting 2 entries, • The STL is 25 faster • Requires less memory • Has Q(n ln(n)) growth • The STL uses comparison, so it is more general than radix sortRadix sort 42 Runtime analysis Bucket sort is Q(n + m) where m is the number of buckets – Restricting ourselves to either decimal digits or bits ensures that m =Q(1) How often must we iterate to sort numbers on the range 0, …, m – 1 – We require ⌊log (m)⌋ + 1 digits or ⌊log (m)⌋ + 1 bits 10 2 – This is why Arabic numbers are so powerful Run time is therefore Q(n ln(m)) – For this to be faster than previous sorting algorithms, it must be true that ln(m) ln(n) or m n – Therefore, it is only truly useful if we are sorting lists of relatively small range of numbers with very significant amount of duplicationsRadix sort 43 Runtime analysis The following table summarizes the runtimes of bucket sort for sorting n numbers on the range 0, …, m – 1 Run Time Case Comments WorstQ(n ln(m)) No worst case AverageQ(n ln(m)) BestQ(n ln(m)) No best case It requires Q(n) memory for the queues – It is only useful to use radix sort over quicksort if n = w(m)Radix sort 44 Sorting a million 32bit integers Senator Obama was asked ―What is the most efficient way to sort a million 32bit integers.‖ Many claim that radix sort would be fastest What is the problem – Radix sort would require 32 million binary comparisons – Quicksort would require 20 million bitwise operationsRadix sort 45 Summary Radix sort uses bucket sort on each digit of a set of numbers – Interesting in theory, less useful in practice – Useful only if sorting numbers with significant duplication – The idea is used elsewhereRadix sort 46 References 1 Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting nd and Searching, 2 Ed., Addison Wesley, 1998, §5.2.3, p.1448. 2 Cormen, Leiserson, and Rivest, Introduction to Algorithms, McGraw Hill, 1990, Ch. 7, p.1409. rd 3 Weiss, Data Structures and Algorithm Analysis in C++, 3 Ed., Addison Wesley, §7.5, p.2704.Radix sort 47 References Wikipedia, http://en.wikipedia.org/wiki/Radixsort nd 1 Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, 2 Ed., Addison Wesley, 1998, §5.2.3, p.1448. 2 Cormen, Leiserson, and Rivest, Introduction to Algorithms, McGraw Hill, 1990, Ch. 7, p.1409. rd 3 Weiss, Data Structures and Algorithm Analysis in C++, 3 Ed., Addison Wesley, §7.5, p.2704. These slides are provided for the ECE 250 Algorithms and Data Structures course. The material in it reflects Douglas W. Harder’s best judgment in light of the information available to him at the time of preparation. Any reliance on these course slides by any party for any other purpose are the responsibility of such parties. Douglas W. Harder accepts no responsibility for damages, if any, suffered by any party as a result of decisions made or actions based on these course slides for any other purpose than that for which it was intended.
Website URL
Comment
sharer
Presentations
Free
Document Information
Category:
Presentations
User Name:
Dr.SethPatton
User Type:
Teacher
Country:
United Kingdom
Uploaded Date:
22-07-2017