Question? Leave a message!




Two classic sorting algorithms: mergesort and quicksort

Two classic sorting algorithms: mergesort and quicksort 2
ROBERT SEDGEWICK KEVIN WAYNE Algorithms 2.3 QUICKSORT quicksort ‣ selection ‣ duplicate keys ‣ Algorithms FOUR TH EDITION system sorts ‣ ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.edu Last updated on 3/29/17 9:02 AMTwo classic sorting algorithms: mergesort and quicksort Critical components in the world’s computational infrastructure. Full scientific understanding of their properties has enabled us
 to develop them into practical system sorts. th Quicksort honored as one of top 10 algorithms of 20 century
 in science and engineering. 
 Mergesort. last lecture 
 ... 
 
 
 Quicksort. this lecture ... 2Quicksort tshirt 3Quicksort tshirt 42.3 QUICKSORT quicksort ‣ selection ‣ duplicate keys ‣ Algorithms system sorts ‣ ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduQuicksort overview Step 1. Shuffle the array. Step 2. Partition the array so that, for some j Entry aj is in place. No larger entry to the left of j. No smaller entry to the right of j. Step 3. Sort each subarray recursively. input Q U I C K S O R T E X A M P L E shuffle K R A T E L E P U I M Q C X O S partitioning item partition E C A I E K L P U T M Q R X O S not greater not less sort left A C E E I K L P U T M Q R X O S sort right A C E E I K L M O P Q R S T U X result A C E E I K L M O P Q R S T U X Quicksort overview 6number). 9.9 X 10 45 is used to represent infinity. Imaginary conunent I and J are output variables, and A is the array (with Tony Hoare values of x may not be negative and reM values of x may not be subscript bounds M:N) which is operated upon by this procedure. smaller than 1. Partition takes the value X of a random element of the array A, Values of Qd'(x) may be calculated easily by hypergeometric and rearranges the values of the elements of the array in such a series if x is not too small nor (n m) too large. Qm(x) can be way that there exist integers I and J with the following properties : computed from an appropriate set of values of Pnm(X) if X is near M J I = NprovidedM N 1.0 or ix is near 0. Loss of significant digits occurs for x as small as AR = XforM = R J 1.1 if n is larger than 10. Loss of significant digits is a major diffi AR = XforJ R I Invented quicksort to translate Russian into English. culty in using finite polynomiM representations also if n is larger AR Xfor I = R N than m. However, QLEG has been tested in regions of x and n The procedure uses an integer procedure random (M,N) which both large and small; chooses equiprobably a random integer F between M and N, and but couldn't explain his algorithm or implement it procedure QLEG(m, nmax, x, ri, R, Q); value In, nmax, x, ri; also a procedure exchange, which exchanges the values of its two real In, mnax, x, ri; real array R, Q; parameters ; begin real t, i, n, q0, s; begin real X; integer F; Learned Algol 60 (and recursion). n := 20; F := random (M,N); X := AF; if nmax 13 then I:=M; J:=N; n := nmax +7; up: for I : = I step 1 until N do if ri = 0then if X A I then go to down; Implemented quicksort. begin ifm = 0then I:=N; Q0 := 0.5 X 10g((x + 1)/(x 1)) down: forJ := J step 1 until M do else if AJX then go to change; begin t := 1.0/sqrt(x X x 1); J:=M; q0 := 0; Tony Hoare
 change: if I J then begin exchange (AIL AJ); QO : = t; I := I+ 1;J:= J 1; for i : = 1 step 1 until m do 1980 Turing Award go to up begin s := (x+x)X(i1)Xt end 2: begin 4 ×Q0+ (3ii×i2)×q0; else if F then begin exchange (AIL AF) i if c = 0 then q0 := Q0; I:=I+l 3:begin Q0 := s end end; end e:= aXe;f := bXd;goto8 if x = 1 then else if F J tllen begin exchange (AF, AJ) ; end 3 ; Q0 := 9.9 I" 45; J:=J1 e:=bXc; Rn + 1 := x sqrt(x X x 1); end ; ifd 0then for i := n step 1 until 1 do end partition 4: begin Ri := (i + m)/((i + i + 1) X x f:=bXd; goto8 +(mi 1) X Ri+l); end 4; ALGORITHM 61 go to the end; f:=aXd; goto8 PROCEDURES FOR RANGE ARITHMETIC ALGORITHM 64 if m = 0 then 5: end 2; ALLAN GIBB begin if x 0.5 tben QUICKSORT ifb 0 then Q0 := arctan(x) 1.5707963 else 6: begin University of Alberta, Calgary, Alberta, Canada C. A. R. HOARE Q0 := aretan(1/x)end else if d 0 then Elliott Brothers Ltd., Borehamwood, Hertfordshire, Eng. begin begin t := 1/sqrt(x X x + 1); begin procedure RANGESUM (a, b, c, d, e, f); q0 := 0; procedure quicksort (A,M,N); value M,N; e := MIN(a X d, b X c); real a,b,c,d,e,f; q0 := t; array A; integer M,N; f:= MAX(a X c,b X d); go to8 comment The term "range number" was used by P. S. Dwyer, for i : = 2 step 1 until m do comment Quicksort is a very fast and convenient method of end 6; Linear Computations (Wiley, 1951). Machine procedures for begin s := (x + x) X (i 1) X t X Q0I sorting an array in the randomaccess store of a computer. The e:= bX c; f := aX c; go to8 range arithmetic were developed about 1958 by Ramon Moore, +(3i+iX i 2) × q0; entire contents of the store may be sorted, since no extra space is end 5; "Automatic Error Analysis in Digital Computation," LMSD qO := Q0; required. The average number of comparisons made is 2(MN) In f:=aXc; Report 48421, 28 Jan. 1959, Lockheed Missiles and Space Divi Q0 := s end end; (NM), and the average nmnber of exchanges is one sixth this if d O then sion, Palo Alto, California, 59 pp. If a x b and c y d, Rn + 1 := x sqrt(x × x + 1); amount. Suitable refinements of this method will be desirable for 7: begin then RANGESUM yields an interval e, f such that e = (x + y) for i := n step 1 until 1 do its implementation on any actual computer; e:=bXd; goto8 f. Because of machine operation (truncation or rounding) the Ri := (i + m)/((i m + 1) × Ri + 1 begin integer 1,J ; end 7 ; machine sums a 4 c and b 4 d may not provide safe endpoints (i+i+ 1) X x); if M N then begin partition (A,M,N,I,J); e:=aXd; of the output interval. Thus RANGESUM requires a nonlocal for i : = 1 step 2 until nmax do quicksort (A,M,J) ; 8: e := ADJUSTPROD (e, 1); real procedure ADJUSTSUM which will compensate for the Rill := Rill; quicksort (A, I, N) f := ADJUSTPROD (f, 1) machine arithmetic. The body of ADJUSTSUM will be de the: for i : = 1 step 1 until nnmx do end end RANGEMPY; pendent upon the type of machine for which it is written and so Qi := Qi 1 X Ri end quieksort procedure RANGEDVD (a, b, c, d, e, f) ; is not given here. (An example, however, appears below.) It end QLEG; real a, b, c, d, e, f; is assumed that ADJUSTSUM has as parameters real v and w, comment If the range divisor includes zero the program This procedure was developed in part under the sponsorship and integer i, and is accompanied by a nonlocal real procedure exists to a nonlocal label "zerodvsr". RANGEDVD assumes a of the Air Force Cambridge Research Center. CORRECTION which gives an upper bound to the magnitude ALGORITHM 65 Communications of the ACM (July 1961) nonlocal real procedure ADJUSTQUOT which is analogous of the error involved in the machine representation of a number. FIND (possibly identical) to ADJUSTPROD; The output ADJUSTSUM provides the left endpoint of the 7 begin C. A. R. HOARE output interval of RANGESUM when ADJUSTSUM is called ALGORITHM 63 if c = 0 A d 0 then go to zer0dvsr; Elliott Brothers Ltd., Borehamwood, Hertfordshire, Eng. with i = 1, and the right endpoint when called with i = 1 if c 0 then PARTITION The procedures RANGESUB, RANGEMPY, and RANGEDVD procedure find (A,M,N,K); value M,N,K; 1: begin C. A. R. HOARE provide for the remaining fundamental operations in range array A; integer M,N,K; ifb 0 then arithmetic. RANGESQR gives an interval within which the Elliott Brothers Ltd., Borehamwood, Hertfordshire, Eng. comment Find will assign to A K the value which it would 2: begin square of a range nmnber must lie. RNGSUMC, RNGSUBC, procedure partition (A,M,N,I,J); value M,N; have if the array A M:N had been sorted. The array A will be e := b/d; go to3 RNGMPYC and RNGDVDC provide for range arithmetic with array A; integer M,N,1,J; partly sorted, and subsequent entries will be faster than the first; end 2; complex range arguments, i.e. the real and imaginary parts e := b/c; are range numbers Communications of the ACM 321 3: ifa 0then begin 4: begin e := ADJUSTSUM (a, c, 1); f := a/c; go to 8 f := ADJUSTSUM (b, d, 1) end 4; end RANGESUM; f := a/d; go to 8 procedure RANGESUB (a, b, c, d, e, f) ; end 1 ; real a, b,c, d,e, f; if a 0 then comment RANGESUM is a nonlocal procedure; 5: begin begin e := a/c; go to6 RANGESUM (a, b, d, c, e, f) end 5 ; end RANGESUB ; e := a/d; procedure RANGEMPY (a, b, c, d, e, f); 6: ifb 0then real a, b, c, d, e, f; 7: begin comment ADJUSTPROD, which appears at the end of this f := b/c; go to8 procedure, is analogous to ADJUSTSUM above and is a non end 7 ; local real procedure. MAX and MIN find the maximum and f := b/d; minimum of a set of real numbers and are nonlocal; 8: e := ADJUSTQUOT (e, 1); f := ADJUSTQUOT (f,1) begin end RANGEDVD ; real v, w; procedure RANGESQR (a, b, e, f); if a 0A c = 0then real a, b, e, f; 1: begin v:=c; c:=a; a:=v; w:=d; d:=b; b:=w comment ADJUSTPROD is a non10cal procedure; end 1; begin if a = O then if a 0 then Communications of the CM 319 Tony Hoare Invented quicksort to translate Russian into English. but couldn't explain his algorithm or implement it Learned Algol 60 (and recursion). Implemented quicksort. Tony Hoare
 1980 Turing Award “ There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. ” “ I call it my billiondollar mistake. It was the invention of the null reference in 1965… This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years. ” 8Bob Sedgewick Refined and popularized quicksort. Analyzed many versions of quicksort. rithms, and we can learn from that experience to separate Bob Sedgewick good algorithms from bad ones. Third, if the tile fits into the memory of the computer, there is one algorithm, called Quicksort, which has been shown to perform well in a variety of situations. Not only is this algorithm Programming S. L. Graham, R. L. Rivest simpler than many other sorting algorithms, but empir Techniques Editors ical 2, ll, 13, 21 and analytic 9 studies show that Quicksort can be expected to be up to twice as fast as its Implementing nearest competitors. The method is simple enough to be Acta Informatica 7, 327355 (1977) learned by programmers who have no previous experi Quicksort Programs 9 by SpringerVerlag 1977 ence with sorting, and those who do know other sorting methods should also find it profitable to learn about Robert Sedgewick Quicksort. Brown University Because of its prominence, it is appropriate to study how Quicksort might be improved. This subject has The Analysis of Quicksort Programs received considerable attention (see, for example, 1, 4, This paper is a practical study of how to implement 11, 13, 14, 18, 20), but few real improvements have been Robert Sedgewick the Quicksort sorting algorithm and its best variants on suggested beyond those described by C.A.R. Hoare, the real computers, including how to apply various code Received January 19, t976 inventor of Quicksort, in his original papers 5, 6. Hoare optimization techniques. A detailed implementation also showed how to analyze Quicksort and predict its combining the most effective improvements to Summary. The Quicksort sorting algorithm and its best variants are presented running time. The analysis has since been extended to Quicksort is given, along with a discussion of how to and analyzed. Results are derived which make it possible to obtain exact formulas de the improvements that he suggested, and used to indicate implement it in assembly language. Analytic results scribing the total expected running time of particular implementations on real com how they may best be implemented 9, 15, 17. The describing the performance of the programs are puters of Quick, sort and an improvement called the medianofthree modification. subject of the careful implementation of Quicksort has Detailed analysis of the effect of an implementation technique called loop unwrapping summarized. A variety of special situations are not been studied as widely as global improvements to is presented. The paper is intended not only to present results of direct practical utility, considered from a practical standpoint to illustrate the algorithm, but but the also savings to illustrate to be the realized intriguing are as mathematics which arises in the complete analysis Quicksort's wide applicability as an internal sorting of this important algorithm. significant. The history of Quicksort is quite complex, method which requires negligible extra storage. and 15 contains a full survey of the many variants Key Words and Phrases: Quicksort, analysis of which, have been proposed. 1. Introduction algorithms, code optimization, sorting The purpose of this paper is to describe in detail how CR Categories: 4.0, 4.6, 5.25, 5.31, 5.5 In t96t62 C.A.R. Hoare presented a new algorithm called Quicksort 7, 8 Quicksort can best be implemented to handle actual which is suitable for putting files into order by computer. This method combines applications on real computers. A general description of elegance and efficiency, and it remains today the most useful generalpurpose the algorithm is followed by descriptions of the most sorting method for computers. The practical utility of the algorithm has meant effective improvements that have been proposed (as 9 not only that it has been sfibjected to countless modifications (though few real demonstrated in 15). Next, an implementation of improvements have been suggested beyond those described by Hoare), but also Quicksort in a typical high level language is presented, Introduction and assembly language that implementation it has been used issues .in are countless con applications, often to sort very large, files. sidered. This discussion should easily translate to real Consequently, it is important to be able to estimate how long an implementation One of the most widely studied practical problems in languages on real machines. Finally, a number of special of Quicksort can be expected to run, in order to be able to compare variants or computer science is sorting: the use of a computer to put issues are considered which may be of importance in estimate expenses. Fortunately, as we shall see, this is an algorithm which can be files in order. A person wishing to use a computer to sort particular sorting applications. analyzed. (Hoare recognized this, and gave some analytic results in 8.) It is is faced with the problem of determining which of the This paper is intended to be a selfcontained overview possible to derive exact formulas describing the average performance of real many available algorithms is best suited for his purpose. of the properties of Quicksort for use by those who need implementations of the algorithm. This task is becoming less difficult than it once was for to actually implement and use the algorithm. A compan The history of Quicksort is quite complex, and a full survey of the many variants three reasons. First, sorting is an area in which the ion paper 17 provides the analytical results which su which have been proposed is given in t 7. In addition, t 7 gives analytic results mathematical analysis of algorithms has been particu port much of the discussion presented here. describing many of the improvements which have been suggested for the purpose larly successful: we can predict the performance of many of determining which are the most effective. There are many examples in 7 sorting methods and compare them intelligently. Second, which illustrate that the simplicity of Quicksort is deceiving. The algorithm has we have a great deal of experience using sorting algo The Algofithm hidden subtleties which can have significant effects on performance. Furthermore, Permission to copy without fee all or part of this material is as we shall see, simple changes to the algorithm or its implementation can radically granted provided that the copies are not made or distributed for direct Quicksort is a recursive method for sorting an array commercial advantage, the ACM copyright notice and the title of the change the analysis. In this paper, we shall consider in detail how practical A1, A2 ..... AN by first "partitioning" it so that the publication and its date appear, and notice is given that copying is by implementations of the best versions of Quicksort may be analyzed. following conditions hold: permission of the Association for Computing Machinery. To copy In this paper, we will deal with the analysis of: (i) the basic Quicksort algo otherwise, or to republish, requires a fee and/or specific permission. (i) Some key v is in its final position in the array. (If it This work was supported in part by the Fannie and John Hertz rithm; (ii) an improvement called the "medianofthree" modification which is thejth smallest, it is in position Aj.) Foundation and in part by NSF Grants. No. GJ28074 and MCS75 23738. reduces the average number of comparisons required; and (iii) an implementation (ii) All elements to the left of Aj are less than or equal Author's address: Division of Applied Mathematics and Computer technique called "loop unwrapping" which reduces the amount of overhead per to it. (These elements A 1 , A 2 ..... A j 1 are Science Program, Brown University, Providence, RI 02912. comparison. These particular methods not only represent the most effective vari © 1978 ACM 00010782/78/10000847 00.75 called the "left subtile.") This work was supported in part by the Fannie and John Hertz Foundation, and 847 Communications October 1978 in part by the National Science Foundation Grants No. GJ28074 and MCS7523738. of Volume 21 the ACM Number 10 22 Acta Informatica, Vol. 7 Quicksort partitioning demo Repeat until i and j pointers cross. Scan i from left to right so long as (ai alo). Scan j from right to left so long as (aj alo). Exchange ai with aj. K R A T E L E P U I M Q C X O S lo i j 10Quicksort partitioning demo Repeat until i and j pointers cross. Scan i from left to right so long as (ai alo). Scan j from right to left so long as (aj alo). Exchange ai with aj. When pointers cross. Exchange alo with aj. E C A I E K L P U T M Q R X O S lo j hi partitioned 11The music of quicksort partitioning (by Brad Lyon) https://googledrive.com/host/0B2GQktuwcTicjRaRjV1NmRFN1U/index.html 12Quicksort: Java code for partitioning private static int partition(Comparable a, int lo, int hi) int i = lo, j = hi+1; while (true) while (less(a++i, alo)) find item on left to swap if (i == hi) break; while (less(alo, aj)) find item on right to swap if (j == lo) break; if (i = j) break; check if pointers cross exch(a, i, j); swap exch(a, lo, j); swap with partitioning item before v return j; return index of item now known to be in place lo hi during " v v v before v lo hi i j after v " v v during before " v v v v lo j hi lo hi i j after Quicksort partitioning overview during " v v v " v v v 13 lo j hi i j after Quicksort partitioning overview v " v v lo j hi Quicksort partitioning overviewQuicksort quiz 1 How many compares to partition an array of length n A. ¼ n B. ½ n C. n D. n lg n E. I don't know. scan until ≤ M scan until ≥ M M A B C D E V W X Y Z 0 1 2 3 4 5 6 7 8 9 10 n+1 compares in worst case (E and V are compared with M twice) 14Quicksort: Java implementation public class Quick private static int partition(Comparable a, int lo, int hi) / see previous slide / public static void sort(Comparable a) shuffle needed for StdRandom.shuffle(a); performance guarantee sort(a, 0, a.length 1); (stay tuned) private static void sort(Comparable a, int lo, int hi) if (hi = lo) return; int j = partition(a, lo, hi); sort(a, lo, j1); sort(a, j+1, hi); 15Quicksort trace lo j hi 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 initial values Q U I C K S O R T E X A M P L E random shuffle K R A T E L E P U I M Q C X O S 0 5 15 E C A I E K L P U T M Q R X O S 0 3 4 E C A E I K L P U T M Q R X O S 0 2 2 A C E E I K L P U T M Q R X O S 0 0 1 A C E E I K L P U T M Q R X O S 1 1 A C E E I K L P U T M Q R X O S 4 4 A C E E I K L P U T M Q R X O S 6 6 15 A C E E I K L P U T M Q R X O S no partition 7 9 15 A C E E I K L M O P T Q R X U S for subarrays 7 7 8 A C E E I K L M O P T Q R X U S of size 1 8 8 A C E E I K L M O P T Q R X U S 10 13 15 A C E E I K L M O P S Q R T U X 10 12 12 A C E E I K L M O P R Q S T U X 10 11 11 A C E E I K L M O P Q R S T U X 10 10 A C E E I K L M O P Q R S T U X 14 14 15 A C E E I K L M O P Q R S T U X 15 15 A C E E I K L M O P Q R S T U X result A C E E I K L M O P Q R S T U X Quicksort trace (array contents after each partition) 16Quicksort animation 50 random items algorithm position in order current subarray not in order http://www.sortingalgorithms.com/quicksort 17Quicksort: implementation details Partitioning inplace. Using an extra array makes partitioning easier
 (and stable), but is not worth the cost. 
 Terminating the loop. Testing whether the pointers cross is trickier
 than it might seem. 
 Equal keys. When duplicates are present, it is (counterintuitively)
 stay tuned better to stop scans on keys equal to the partitioning item's key. 
 Preserving randomness. Shuffling is needed for performance guarantee. Equivalent alternative. Pick a random partitioning item in each subarray. 18Quicksort: empirical analysis (1961) Running time estimates: Algol 60 implementation. National Elliott 405 computer. Elliott 405 magnetic disc sorting n 6word items with 1word keys (16K words) 19Quicksort: empirical analysis Running time estimates: 8 Home PC executes 10 compares/second. 12 Supercomputer executes 10 compares/second. 
 
 2 insertion sort (n ) mergesort (n log n) quicksort (n log n) 
 computer thousand million billion thousand million billion thousand million billion 
 
 317 home instant 2.8 hours instant 1 second 18 min instant 0.6 sec 12 min years 
 super instant 1 second 1 week instant instant instant instant instant instant 
 
 
 
 
 Lesson 1. Good algorithms are better than supercomputers. Lesson 2. Great algorithms are better than good ones. 20Quicksort: bestcase analysis Best case. Number of compares is n lg n. initial values random shuffle 21Quicksort: worstcase analysis 2 Worst case. Number of compares is ½ n . initial values random shuffle 22Quicksort: averagecase analysis Proposition. The average number of compares C to quicksort an array of
 n n distinct keys is 2n ln n (and the number of exchanges is ⅓ n ln n ). Pf. C satisfies the recurrence C = C = 0 and for n ≥ 2: n 0 1 right left partitioning C +C C +C C +C 0 N 1 1 N 2 N 1 0 C =(N +1) + + + ... + N N N N partitioning probability Multiply both sides by n and collect terms: NC = N(N +1) + 2(C + C + ... +C ) N 0 1 N 1 Subtract from this equation the same equation for n 1: NC (N 1)C =2N +2C N N 1 N 1 Rearrange terms and divide by n (n + 1): C C 2 N N 1 = + N+1 N N+1 23Quicksort: averagecase analysis Repeatedly apply previous equation: CC CC 22 C C C C 22 NN NN 11 NN NN 11 = + = + = = + + NN+1 +1 NN NN+1 +1 N N+1 +1 N N N N+1 +1 C 2 2 C C 22 22 N 2 NN 22 substitute previous equation = + + = = + + + + N 1 N N+1 N N 11 N N N N+1 +1 C 2 2 2 C C 22 22 22 N 3 NN 33 = + + + = = + + + + + + N 2 N 1 N N+1 N N 22 N N 11 N N N N+1 +1 2 2 2 2 22 22 22 22 = + + +... + = = + + + + + +...... + + 3 4 5 N+1 33 44 55 N N+1 +1 Approximate sum by an integral: ✓ ◆ ✓ ◆ 1 1 1 1 1 1 1 1 C = 2(N +1) + + +... C = 2(N +1) + + +... N N 3 4 5 N+1 3 4 5 N+1 Z Z N+1 N+1 1 1 ⇠ 2(N +1) dx ⇠ 2(N +1) dx x x 3 3 Finally, the desired result: C 2(N +1) lnN ⇥ 1.39N lgN N 24Quicksort: summary of performance characteristics Quicksort is a (Las Vegas) randomized algorithm. Guaranteed to be correct. Running time depends on random shuffle. 
 Average case. Expected number of compares is 1.39 n lg n. 39 more compares than mergesort. Faster than mergesort in practice because of less data movement. 
 Best case. Number of compares is n lg n. 2 Worst case. Number of compares is ½ n . but more likely that lightning bolt strikes computer during execution 25Quicksort properties Proposition. Quicksort is an inplace sorting algorithm. Pf. Partitioning: constant extra space. Depth of recursion: logarithmic extra space (with high probability). 
 can guarantee logarithmic depth by recurring 
 on smaller subarray before larger subarray (but requires using an explicit stack) 
 
 Proposition. Quicksort is not stable. Pf. by counterexample i j 0 1 2 3 B C C A 1 1 2 1 1 3 B C C A 1 1 2 1 1 3 B A C C 1 1 2 1 0 1 A B C C 1 1 2 1 26Quicksort: practical improvements Insertion sort small subarrays. Even quicksort has too much overhead for tiny subarrays. Cutoff to insertion sort for ≈ 10 items. private static void sort(Comparable a, int lo, int hi) if (hi = lo + CUTOFF 1) Insertion.sort(a, lo, hi); return; int j = partition(a, lo, hi); sort(a, lo, j1); sort(a, j+1, hi); 27Quicksort: practical improvements Median of sample. Best choice of pivot item = median. Estimate true median by taking median of sample. Medianof3 (random) items. 12/7 n ln n compares (14 fewer) 12/35 n ln n exchanges (3 more) private static void sort(Comparable a, int lo, int hi) if (hi = lo) return; int median = medianOf3(a, lo, lo + (hi lo)/2, hi); swap(a, lo, median); int j = partition(a, lo, hi); sort(a, lo, j1); sort(a, j+1, hi); 282.3 QUICKSORT quicksort ‣ selection ‣ duplicate keys ‣ Algorithms system sorts ‣ ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduSelection th Goal. Given an array of n items, find the k smallest item. Ex. Min (k = 0), max (k = n 1), median (k = n / 2). 
 Applications. Order statistics. Find the "top k." 
 Use theory as a guide. Easy n log n upper bound. How Easy n upper bound for k = 1, 2, 3. How Easy n lower bound. Why 
 Which is true n log n lower bound is selection as hard as sorting n upper bound is there a lineartime algorithm 30Quickselect Partition array so that: Entry aj is in place. No larger entry to the left of j. No smaller entry to the right of j. Repeat in one subarray, depending on j; finished when j equals k. before public static Comparable select(Comparable a, int k) v if ak is here if ak is here lo hi StdRandom.shuffle(a); set hi to j1 set lo to j+1 int lo = 0, hi = a.length 1; during " v v v while (hi lo)
 i j int j = partition(a, lo, hi); after v " v v if (j k) lo = j + 1; else if (j k) hi = j 1; lo hi j else return ak; Quicksort partitioning overview return ak; 31Quickselect: mathematical analysis Proposition. Quickselect takes linear time on average. 
 Pf sketch. Intuitively, each partitioning step splits array approximately in half:
 n + n / 2 + n / 4 + … + 1 2n compares. Formal analysis similar to quicksort analysis yields:
 
 C = 2 n + 2 k ln (n / k) + 2 (n – k) ln (n / (n – k)) n 
 ≤ (2 + 2 ln 2) n Ex: (2 + 2 ln 2) n ≈ 3.38 n compares to find median (k = n / 2). 32Theoretical context for selection Proposition. Blum, Floyd, Pratt, Rivest, Tarjan, 1973 Comparebased selection algorithm whose worstcase running time is linear. 
 Time Bounds for Selection 
 bY . Manuel Blum, Robert W. Floyd, Vaughan Watt, 
 Ronald L. Rive, and Robert E. Tarjan 
 
 Abstract L The number of comparisons required to select the ith smallest of 
 n numbers is shown to be at most a linear function of n by analysis of 
 i a new selection algorithm PICK. Specifically, no more than i 5.4305 n comparisons are ever required. This bound is improved for 
 L extreme values of i , and a new lower bound on the requisite number 
 L of comparisons is also proved. Remark. Constants are high ⇒ not used in practice. 
 Use theory as a guide. L Still worthwhile to seek practical lineartime (worstcase) algorithm. Until one is discovered, use quickselect (if you don’t need a full sort). 33 This work was supported by the National Science Foundation under grants GJ992 and GJ33170X. 12.3 QUICKSORT quicksort ‣ selection ‣ duplicate keys ‣ Algorithms system sorts ‣ ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduDuplicate keys Often, purpose of sort is to bring items with equal keys together. Sort population by age. Remove duplicates from mailing list. Sort job applicants by college attended. sorted by time sorted by city (unstable) sorted by city (stable) Chicago 09:00:00 Chicago 09:25:52 Chicago 09:00:00 Phoenix 09:00:03 Chicago 09:03:13 Chicago 09:00:59 Typical characteristics of such applications. Houston 09:00:13 Chicago 09:21:05 Chicago 09:03:13 Chicago 09:00:59 Chicago 09:19:46 Chicago 09:19:32 Huge array. Houston 09:01:10 Chicago 09:19:32 Chicago 09:19:46 Small number of key values. Chicago 09:03:13 Chicago 09:00:00 Chicago 09:21:05 Seattle 09:10:11 Chicago 09:35:21 Chicago 09:25:52 Seattle 09:10:25 Chicago 09:00:59 Chicago 09:35:21 Phoenix 09:14:25 Houston 09:01:10 Houston 09:00:13 NOT Chicago 09:19:32 Houston 09:00:13 Houston 09:01:10 sorted sorted Chicago 09:19:46 Phoenix 09:37:44 Phoenix 09:00:03 Chicago 09:21:05 Phoenix 09:00:03 Phoenix 09:14:25 Seattle 09:22:43 Phoenix 09:14:25 Phoenix 09:37:44 Seattle 09:22:54 Seattle 09:10:25 Seattle 09:10:11 Chicago 09:25:52 Seattle 09:36:14 Seattle 09:10:25 Chicago 09:35:21 Seattle 09:22:43 Seattle 09:22:43 Seattle 09:36:14 Seattle 09:10:11 Seattle 09:22:54 Phoenix 09:37:44 Seattle 09:22:54 Seattle 09:36:14 Stability when sorting on a second key key 35War story (system sort in C) A beautiful bug report. Allan Wilks and Rick Becker, 1991 We found that qsort is unbearably slow on "organpipe" inputs like "01233210":
 
 main (int argc, charargv) 
 int n = atoi(argv1), i, x100000;
 for (i = 0; i n; i++)
 xi = i;
 for ( ; i 2n; i++)
 xi = 2ni1;
 qsort(x, 2n, sizeof(int), intcmp);
 
 Here are the timings on our machine:
 time a.out 2000
 real 5.85s 
 time a.out 4000
 real 21.64s
 time a.out 8000
 real 85.11s 36War story (system sort in C) Bug. A qsort() call that should have taken seconds was taking minutes. 
 
 Why is qsort() so slow 
 
 
 
 
 
 At the time, almost all qsort() implementations based on those in: Version 7 Unix (1979): quadratic time to sort organpipe arrays. BSD Unix (1983): quadratic time to sort random arrays of 0s and 1s. 37Duplicate keys: stop on equal keys Our partitioning subroutine stops both scans on equal keys. 
 scan until ≥ P scan until ≤ P 
 
 P G E P A Q B P Y C O U P Z S R 
 
 
 
 
 Q. Why not continue scans on equal keys scan until P scan until P P G E P A Q B P Y C O U P Z S R 38Quicksort quiz 2 What is the result of partitioning the following array (skip over equal keys)
 
 scan until A scan until A 
 
 A A A A A A A A A A A A A A A A 
 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 A A A A A A A A A A A A A A A A A. 
 A A A A A A A A A A A A A A A A B. 
 A A A A A A A A A A A A A A A A C. 
 D. I don't know. 39Quicksort quiz 3 What is the result of partitioning the following array (stop on equal keys)
 
 scan until ≥ A scan until ≤ A 
 
 A A A A A A A A A A A A A A A A 
 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 A A A A A A A A A A A A A A A A A. 
 A A A A A A A A A A A A A A A A B. 
 A A A A A A A A A A A A A A A A C. 
 D. I don't know. 40Partitioning an array with all equal keys 41Duplicate keys: partitioning strategies Bad. Don't stop scans on equal keys.
 2 ½ n compares when all keys equal 
 B A A B A B B B C C C A A A A A A A A A A A 
 
 
 Good. Stop scans on equal keys.
 n lg n compares when all keys equal 
 B A A B A B C C B C B A A A A A A A A A A A 
 
 
 Better. Put all equal keys in place. How
 n compares when all keys equal A A A B B B B B C C C A A A A A A A A A A A 42DUTCH NATIONAL FLAG PROBLEM 
 Problem. Edsger Dijkstra Given an array of n buckets, each containing a red, white, or blue pebble, sort them by color. 
 input 
 
 sorted 
 
 Operations allowed. swap(i, j): swap the pebble in bucket i with the pebble in bucket j. color(i): color of pebble in bucket i. 
 Requirements. Exactly n calls to color(). At most n calls to swap(). Constant extra space. 433way partitioning Goal. Partition array into three parts so that: Entries between lt and gt equal to the partition item. No larger entries to left of lt. No smaller entries to right of gt. 
 before v 
 lo hi before v 
 =v v v during lo hi 
 lt i gt =v v v during 
 after v v =v lt i gt 
 lo hi lt gt after v v =v 
 3way partitioning 
 lo hi lt gt 
 3way partitioning Dutch national flag problem. Edsger Dijkstra Conventional wisdom until mid 1990s: not worth doing. Now incorporated into C library qsort() and Java 6 system sort. 44Dijkstra 3way partitioning demo Let v be partitioning item alo. Scan i from left to right. – (ai v): exchange alt with ai; increment both lt and i – (ai v): exchange agt with ai; decrement gt – (ai == v): increment i lt i gt P A B X W P P V P D P C Y Z lo hi before v invariant lo hi =v v during v lt i gt 45 after v v =v lo lt gt hi 3way partitioningDijkstra 3way partitioning demo Let v be partitioning item alo. Scan i from left to right. – (ai v): exchange alt with ai; increment both lt and i – (ai v): exchange agt with ai; decrement gt – (ai == v): increment i lt gt A B C D P P P P P V W Y Z X lo hi before v invariant lo hi =v v during v lt i gt 46 after v v =v lo lt gt hi 3way partitioning3way quicksort: Java implementation private static void sort(Comparable a, int lo, int hi) if (hi = lo) return; int lt = lo, gt = hi; Comparable v = alo; int i = lo; while (i = gt) int cmp = ai.compareTo(v); if (cmp 0) exch(a, lt++, i++); else if (cmp 0) exch(a, i, gt); else i++; before v sort(a, lo, lt 1); lo hi sort(a, gt + 1, hi); =v v v during lt i gt after v v =v lo lt gt hi 3way partitioning 473way quicksort: visual trace equal to partitioning element Visual trace of quicksort with 3way partitioning 48Duplicate keys: lower bound th Sorting lower bound. If there are n distinct keys and the i one occurs
 x times, then any comparebased sorting algorithm must use at least
 i ⇥ n ⇤ N x 
 i n lg n when all distinct;
 lg ⇤ x lg i x x ···x N 1 2 n linear when only a constant number of distinct keys i=1 
 compares in the worst case. 
 
 Proposition. The expected number of compares to 3way quicksort an array is entropy optimal (proportional to sorting lower bound). 
 Pf. beyond scope of course 
 
 
 Bottom line. Quicksort with 3way partitioning reduces running time
 from linearithmic to linear in broad class of applications. 49Sorting summary inplace stable best average worst remarks 2 2 2 selection ✔ ½ n ½ n ½ n n exchanges use for small n 2 2 insertion ✔ ✔ n ¼ n ½ n or partially ordered tight code; 3/2 shell ✔ n log n c n 3 subquadratic n log n guarantee; merge ✔ ½ n lg n n lg n n lg n stable improves mergesort timsort ✔ n n lg n n lg n when preexisting order n log n probabilistic guarantee;
 2 quick ✔ n lg n 2 n ln n ½ n fastest in practice improves quicksort
 2 3way quick ✔ n 2 n ln n ½ n when duplicate keys holy sorting grail ✔ ✔ n n lg n n lg n 502.3 QUICKSORT quicksort ‣ selection ‣ duplicate keys ‣ Algorithms system sorts ‣ ROBERT SEDGEWICK KEVIN WAYNE http://algs4.cs.princeton.eduSorting applications Sorting algorithms are essential in a broad variety of applications: Sort a list of names. Organize an MP3 library. obvious applications Display Google PageRank results. List RSS feed in reverse chronological order. Find the median. Identify statistical outliers. problems become easy once items are in sorted order Binary search in a database. Find duplicates in a mailing list. Data compression. Computer graphics. nonobvious applications Computational biology. Load balancing on a parallel computer.
 . . . 52Engineering a system sort (in 1993) BentleyMcIlroy quicksort. sample 9 items Cutoff to insertion sort for small subarrays. Partitioning item: median of 3 or Tukey's ninther. Partitioning scheme: BentleyMcIlroy 3way partitioning. 
 similar to Dijkstra 3way partitioning SOFTWARE—PRACTICEANDEXPERIENCE,VOL.23(11),1249–1265(NOVEMBER1993) 
 (but fewer exchanges when not many equal keys) 
 
 EngineeringaSortFunction 
 JONL.BENTLEY M.DOUGLASMcILROY 
 ATTBell Laboratories,600 Mountain Avenue, Murray Hill, NJ 07974, U.S.A. 
 SUMMARY 
 Werecountthehistoryofanewqsortfunctionfor a C library. Our function is clearer, faster and more robust than existing sorts. It chooses partitioning elements by a new sampling scheme; it partitions by a novel solution to Dijkstra’s Dutch National Flag problem; and it swaps efficiently. Its behavior was 
 assessed with timing and debugging testbeds, and with a program to certify performance. The design techniquesapplyindomainsbeyondsorting. KEYWORDS Quicksort Sortingalgorithms Performancetuning Algorithmdesignandimplementation Testing 
 INTRODUCTION 
 C libraries have long included a qsort function to sort an array, usually implemented by 1 Hoare’s Quicksort. Because existing qsorts are flawed, we built a new one. This paper Very widely used. C, C++, Java 6, …. summarizesitsevolution. Compared to existing library sorts, our new qsort is faster—typically about twice as 53 fast—clearer, and more robust under nonrandom inputs. It uses some standard Quicksort tricks, abandons others, and introduces some new tricks of its own. Our approach to build ingaqsortisrelevanttoengineeringotheralgorithms. 2 The qsort on our home system, based on Scowen’s ‘Quickersort’, had served faith fully sinceLeeMcMahonwroteit almost twodecadesago. Shipped with the landmark Sev 3 enth Edition Unix System, it became a model for other qsorts. Yet in the summer of 1991 our colleagues Allan Wilks and Rick Becker found that aqsort run that should have taken a few minutes was chewing up hours of CPU time. Had they not interrupted it, it 2 4 would have gone on for weeks. They found that it took n comparisons to sort an ‘organ pipe’arrayof2nintegers:123..nn..321. Shopping around for a betterqsort, we found that aqsort written at Berkeley in 1983 would consume quadratic time on arrays that contain a few elements repeated many 5 times—in particular arrays of random zeros and ones. In fact, among a dozen different Unix libraries we found noqsort that could not easily be driven to quadratic behavior; all were derived from the Seventh Edition or from the 1983 Berkeley function. The Seventh 00380644/93/111249–1713.50 Received 21 August 1992 1993byJohnWileySons,Ltd. Revised 10 May1993A beautiful mailing list post (Yaroslavskiy, September 2009) Replacement of quicksort in java.util.Arrays with new dualpivot quicksort Hello All, I'd like to share with you new DualPivot Quicksort which is faster than the known implementations (theoretically and experimental). I'd like to propose to replace the JDK's Quicksort implementation by new one. ... The new DualPivot Quicksort uses two pivots elements in this manner: 1. Pick an elements P1, P2, called pivots from the array. 2. Assume that P1 = P2, otherwise swap it. 3. Reorder the array into three parts: those less than the smaller pivot, those larger than the larger pivot, and in between are those elements between (or equal to) the two pivots. 4. Recursively sort the subarrays. The invariant of the DualPivot Quicksort is: P1 P1 = = P2 P2 ... http://mail.openjdk.java.net/pipermail/corelibsdev/2009September/002630.html 54A beautiful mailing list post (YaroslavskiyBlochBentley, October 2009) Replacement of quicksort in java.util.Arrays with new dualpivot quicksort Date: Thu, 29 Oct 2009 11:19:39 +0000 Subject: Replace quicksort in java.util.Arrays with dualpivot implementation Changeset: b05abb410c52 Author: alanb Date: 20091029 11:18 +0000 URL: http://hg.openjdk.java.net/jdk7/tl/jdk/rev/b05abb410c52 6880672: Replace quicksort in java.util.Arrays with dualpivot implementation Reviewedby: jjb Contributedby: vladimir.yaroslavskiy at sun.com, joshua.bloch at google.com, jbentley at avaya.com make/java/java/FILESjava.gmk src/share/classes/java/util/Arrays.java + src/share/classes/java/util/DualPivotQuicksort.java http://mail.openjdk.java.net/pipermail/compilerdev/2009October.txt 55Dualpivot quicksort Use two partitioning items p and p and partition into three subarrays: 1 2 Keys less than p . 1 Keys between p and p . 1 2 Keys greater than p . 2 
 
 p p ≥ p and ≤ p p p 1 1 1 2 2 2 
 
 lo lt gt hi 
 
 
 Recursively sort three subarrays. 
 
 degenerates to Dijkstra's 3way partitioning 
 Note. Skip middle subarray if p = p . 1 2 56Dualpivot partitioning demo Initialization. Choose alo and ahi as partitioning items. Exchange if necessary to ensure alo ≤ ahi. S E A Y R L F V Z Q T C M K lo hi exchange alo and ahiDualpivot partitioning demo Main loop. Repeat until i and gt pointers cross. If (ai alo), exchange ai with alt and increment lt and i. Else if (ai ahi), exchange ai with agt and decrement gt. Else, increment i. p p ≥ p and ≤ p p p 1 1 1 2 2 2 lo lt i gt hi K E A F R L M C Z Q T V Y S lo lt i gt hiDualpivot partitioning demo Finalize. Exchange alo with alt. Exchange ahi with a++gt. p p ≥ p and ≤ p p p 1 1 1 2 2 2 lo lt gt hi C E A F K L M R Q S Z V Y T lo lt gt hi 3way partitionedDualpivot quicksort Use two partitioning items p and p and partition into three subarrays: 1 2 Keys less than p . 1 Keys between p and p . 1 2 Keys greater than p . 2 
 
 p p ≥ p and ≤ p p p 1 1 1 2 2 2 
 
 lo lt gt hi 
 
 
 
 
 
 Now widely used. Java 7, Python unstable sort, Android, … 60Threepivot quicksort Use three partitioning items p , p , and p and partition into four subarrays: 1 2 3 Keys less than p . 1 Keys between p and p . 1 2 Keys between p and p . 2 3 Keys greater than p . 3 p p ≥ p and ≤ p p ≥ p and ≤ p p p 1 1 1 2 2 2 3 3 3 lo a1 a2 a3 hi MultiPivot Quicksort: Theory and Experiments Shrinu Kushagra Alejandro L´opezOrtiz J. Ian Munro skushagruwaterloo.ca alopezouwaterloo.ca imunrouwaterloo.ca University of Waterloo University of Waterloo University of Waterloo Aurick Qiao a2qiaouwaterloo.ca University of Waterloo November 7, 2013 61 Abstract In 2009, Vladimir Yaroslavskiy introduced a novel dualpivot partitioning algorithm. When run on a bat The idea of multipivot quicksort has recently received tery of tests under the JVM, it outperformed the stan the attention of researchers after Vladimir Yaroslavskiy dard quicksort algorithm 10. In the subsequent re proposed a dual pivot quicksort algorithm that, con lease of Java 7, the internal sorting algorithm was re trary to prior intuition, outperforms standard quicksort placed by Yaroslavskiy’s variant. Three years later, byaasignificantmarginundertheJavaJVM10. More Wild and Nebel 9 published a rigorous averagecase recently, this algorithm has been analysed in terms of analysis of the algorithm. They stated that the previ comparisonsandswapsbyWildandNebel9. Ourcon ous lower bound relied on assumptions that no longer tributions to the topic are as follows. First, we perform hold in Yaroslavskiy’s implementation. The dual pivot the previous experiments using a native C implementa approach actually uses less comparisons (1.9nlnn vs tion thus removing potential extraneous e↵ ects of the 2.0nlnn) on average. However, the di↵ erence in run JVM. Second, we provide analyses on cache behavior time is much greater than the di↵ erence in number of of these algorithms. We then provide strong evidence comparisons. We address this issue and provide an ex that cache behavior is causing most of the performance planation in §5. di↵ erences in these algorithms. Additionally, we build Aumu¨ller and Dietzfelbinger 1 (ICALP2013) have upon prior work in multipivot quicksort and propose recently addressed the following question: If the previ a 3pivot variant that performs very well in theory and ous lower bound does not hold, what is really the best practice. We show that it makes fewer comparisons and wecandowithtwopivots Theyprovea1.8nlnnlower has better cache behavior than the dual pivot quicksort bound on the number of comparisons for all dualpivot in the expected case. We validate this with experimen quicksort algorithms and introduced an algorithm that tal results, showing a 78 performance improvement actually achieves that bound. In their experimentation, in our tests. the algorithm is outperformed by Yaroslavskiy’s quick sort when sorting integer data. However, their algo 1 Introduction rithm does perform better with large data (eg. strings) 1.1 Background Up until about a decade ago it was since comparisons incur high cost. thought that the classic quicksort algorithm 3 using one pivot is superior to any multipivot scheme. It was 1.2 The ProcessorMemory Performance Gap previously believed that using more pivots introduces Bothpresentlyandhistorically,theperformanceofCPU too much overhead for the advantages gained. In 2002, registers have far outpaced that of main memory. Ad SedgewickandBentley7recognisedandoutlinedsome ditionally, this performance gap between the processor of the advantages to a dualpivot quicksort. However, and memory has been increasing since their introduc theimplementationdidnotperformaswellastheclassic tion. Every year, the performance of memory improves quicksort algorithm 9 and this path was not explored by about 10 while that of the processor improves by again until recent years. 60 5. The performance di↵ erence grows so quickly Copyright © 2014. 47 by the Society for Industrial and Applied Mathematics. Downloaded 02/07/14 to 96.248.80.75. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.phpQuicksort quiz 4 Why do 2pivot (and 3pivot) quicksort perform better than 1pivot A. Fewer compares. B. Fewer exchanges. C. Fewer cache misses. D. I don't know. 62Quicksort quiz 4 Why do 2pivot (and 3pivot) quicksort perform better than 1pivot A. Fewer compares. entries scanned is a good proxy B. Fewer exchanges. for cache performance when comparing quicksort variants C. Fewer cache misses.
 
 partitioning compares exchanges entries scanned 
 1pivot 2 n ln n 0.333 n ln n 2 n ln n 
 medianof3 1.714 n ln n 0.343 n ln n 1.714 n ln n 
 2pivot 1.9 n ln n 0.6 n ln n 1.6 n ln n 
 3pivot 1.846 n ln n 0.616 n ln n 1.385 n ln n 
 Reference: Analysis of Pivot Sampling in DualPivot Quicksort by WildNebelMartínez Bottom line. Caching can have a significant impact on performance. beyond scope of this course 63System sort in Java 7 Arrays.sort(). Has one method for objects that are Comparable. Has an overloaded method for each primitive type. Has an overloaded method for use with a Comparator. Has overloaded methods for sorting subarrays. 
 Algorithms. Dualpivot quicksort for primitive types. Timsort for reference types. 
 
 Q. Why use different algorithms for primitive and reference types 
 
 
 Bottom line. Use the system sort 64
Website URL
Comment