Dynamic programming
ECE 250 Algorithms and Data Structures
Douglas Wilhelm Harder, M.Math. LEL
Department of Electrical and Computer Engineering
University of Waterloo, Waterloo, Ontario, Canada
ece.uwaterloo.ca    dwharder@alumni.uwaterloo.ca
© 2006-2013 by Douglas Wilhelm Harder. Some rights reserved.


Dynamic programming

To begin, the word programming is used by mathematicians to describe a set of rules which must be followed to solve a problem
– Thus, linear programming describes the set of rules which must be followed to solve a linear problem
– In our context, the adjective dynamic describes how the set of rules works


Outline

We will begin by looking at three problems:
– Fibonacci numbers
– Newton polynomial coefficients
– Determining connectedness in a DAG

For each, we will examine the naïve top-down approach:
– We will implement the naïve top-down approach
– We will discuss the run-times of these algorithms
– We will consider the application of memoization
– We will discuss the benefits of a bottom-up approach


Fibonacci numbers

Consider this function calculating Fibonacci numbers:

    double F( int n ) {
        return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
    }

The run-time of this algorithm is

    T(n) = Θ(1)                          n ≤ 1
    T(n) = T(n - 1) + T(n - 2) + Θ(1)    n > 1

The run-time recurrence mirrors the actual definition of the Fibonacci numbers:

    F(n) = 1                     n ≤ 1
    F(n) = F(n - 1) + F(n - 2)   n > 1

Therefore T(n) = Ω(F(n)) = Ω(φⁿ), where φ is the golden ratio
– In actual fact, T(n) = Θ(φⁿ); indeed, lim_{n→∞} T(n)/F(n) = 2

To demonstrate, consider:

    #include <iostream>
    #include <ctime>
    using namespace std;

    double F( int n ) {
        return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
    }

    int main() {
        cout.precision( 16 );   // print 16 decimal digits of precision for doubles
                                // 53/lg(10) = 15.95458977...
        for ( int i = 33; i < 100; ++i ) {
            cout << "F(" << i << ") = " << F(i) << '\t' << time(0) << endl;
        }

        return 0;
    }

The output (the second column is the wall-clock time in seconds):

    F(33) = 5702887       1206474355   \
    F(34) = 9227465       1206474355    > F(33), F(34), and F(35) in 1 s
    F(35) = 14930352      1206474355   /
    F(36) = 24157817      1206474356
    F(37) = 39088169      1206474358
    F(38) = 63245986      1206474360
    F(39) = 102334155     1206474363
    F(40) = 165580141     1206474368
    F(41) = 267914296     1206474376
    F(42) = 433494437     1206474389
    F(43) = 701408733     1206474411
    F(44) = 1134903170    1206474469   <- 1 min to calculate F(44)

Problem:
– To calculate F(44), it is necessary to calculate F(43) and F(42)
– However, to calculate F(43), it is also necessary to calculate F(42)
– It gets worse; for example:
  • F(40) is called 5 times
  • F(30) is called 610 times
  • F(20) is called 75 025 times
  • F(10) is called 9 227 465 times
  • F(0) is called 433 494 437 times

Surely we don't have to recalculate F(10) almost ten million times…

Here is a possible solution:
– To avoid calculating values multiple times, store intermediate calculations in a table
– Storing intermediate results in this way is called memoization
  • The root is memo
– We save (memoize) computed answers for possible later reuse, rather than re-computing the answer multiple times

Once we have calculated a value, can we not store that value and return it if the function is called a second time?
– One solution: use an array
– Another: use an associative hash table
Here is the array-based solution:

    static const int ARRAY_SIZE = 1000;
    double *array = new double[ARRAY_SIZE];

    array[0] = 1.0;
    array[1] = 1.0;

    // use 0.0 to indicate we have not yet calculated a value
    for ( int i = 2; i < ARRAY_SIZE; ++i ) {
        array[i] = 0.0;
    }

    double F( int n ) {
        if ( array[n] == 0.0 ) {
            array[n] = F( n - 1 ) + F( n - 2 );
        }

        return array[n];
    }

Recall the characteristics of an associative container:

    template <typename S, typename T>
    class Hash_table {
        public:
            // is something stored with the given key?
            bool member( S key ) const;

            // returns the value associated with the inserted key
            T retrieve( S key ) const;

            void insert( S key, T value );
            // ...
    };

This program uses the Standard Template Library:

    #include <map>

    /* calculate the nth Fibonacci number */
    double F( int n ) {
        static std::map<int, double> memo;   // shared by all calls to F;
                                             // the key is int, the value is double
        if ( n <= 1 ) {
            return 1.0;
        } else if ( memo[n] == 0.0 ) {
            memo[n] = F(n - 1) + F(n - 2);
        }

        return memo[n];
    }

This prints the output up to F(1476):

    int main() {
        std::cout.precision( 16 );

        for ( int i = 0; i <= 1476; ++i ) {
            std::cout << "F(" << i << ") = " << F(i) << std::endl;
        }

        return 0;
    }

The last two lines are

    F(1475) = 1.306989223763399e+308
    F(1476) = inf

Exact value:

    F(1475) = 13069892237633993180363115538027198309839244390741264072
    600665946019279307047923174028868108777701772109546315497901227623432224693693964718536670636
    848936266084414744994134846280092275581896963474334898291642495406274413596986561540727649241
    0653721774590669544801490837649161732095972658064630033793347171632


Top-down and bottom-up algorithms

This also allows for another approach:
– Our implementation is top-down
– From the very top, we break the problem into sub-problems
– All divide-and-conquer algorithms we have seen are top-down

An alternative is the bottom-up approach:
– Solve smaller problems first and use them to build a solution for the full problem
– Merge sort could be implemented bottom-up (a sketch follows this list):
  • Divide the array into groups of size m and sort each group with insertion sort
  • Merge adjacent pairs into groups of size 2m where possible
  • Repeat this successively until the entire list is sorted
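The slides describe bottom-up merge sort only in prose; the following is a minimal sketch of the idea, not the course's implementation. For simplicity the initial runs have width 1 (so the insertion-sort pass over groups of size m is skipped), and std::inplace_merge performs the merge step:

    #include <algorithm>
    #include <vector>

    // A sketch of bottom-up merge sort: repeatedly merge adjacent sorted
    // runs of doubling width until a single sorted run remains.
    void merge_sort_bottom_up( std::vector<int> &a ) {
        int n = a.size();

        for ( int width = 1; width < n; width *= 2 ) {
            // Merge the adjacent runs [i, i + width) and [i + width, i + 2*width)
            for ( int i = 0; i + width < n; i += 2*width ) {
                std::inplace_merge( a.begin() + i,
                                    a.begin() + i + width,
                                    a.begin() + std::min( i + 2*width, n ) );
            }
        }
    }

Note that there is no recursion: the "divide" phase of top-down merge sort is replaced by the doubling of the run width.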
Here is a bottom-up approach to calculating Fibonacci numbers:

    double F( int n ) {
        if ( n <= 1 ) {
            return 1.0;
        }

        double a = 1.0, b = 1.0;

        for ( int i = 2; i <= n; ++i ) {
            a += b;
            b = a - b;
        }

        return a;
    }


Newton polynomials

The coefficients of the Newton polynomial of degree n − 1 passing through the n points (x₁, y₁), …, (xₙ, yₙ) are the divided differences

    y_{1,...,k} = ( y_{2,...,k} − y_{1,...,k−1} ) / ( x_k − x_1 )

where k = 1, …, n, and the divided differences are defined recursively:

    y_{i,...,j} = ( y_{i+1,...,j} − y_{i,...,j−1} ) / ( x_j − x_i )
    y_{j,j+1}   = ( y_{j+1} − y_j ) / ( x_{j+1} − x_j )

A recursive implementation for calculating the n Newton polynomial coefficients would have a run time of Θ(2ⁿ):

    double Newton_polynomials( int i, int j, double *xs, double *ys ) {
        if ( i == j ) {
            return ys[i];
        } else if ( i + 1 == j ) {
            return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
        } else {
            return ( Newton_polynomials( i + 1, j, xs, ys )
                   - Newton_polynomials( i, j - 1, xs, ys ) ) / ( xs[j] - xs[i] );
        }
    }

Note, however, that there are only n(n + 1)/2 coefficients
– This shouldn't require such recursion

The divided differences form a triangular table in which each entry depends only on the two entries to its left; for four points:

    x₀   y₀
                 y₁,₀
    x₁   y₁               y₂,₁,₀
                 y₂,₁                 y₃,₂,₁,₀
    x₂   y₂               y₃,₂,₁
                 y₃,₂
    x₃   y₃

where, for example, y₁,₀ = (y₁ − y₀)/(x₁ − x₀) and y₂,₁,₀ = (y₂,₁ − y₁,₀)/(x₂ − x₀)

Here is an array-based implementation:

    #include <cmath>

    static const int ARRAY_SIZE = 1000;
    double array[ARRAY_SIZE][ARRAY_SIZE];

    for ( int i = 0; i < ARRAY_SIZE; ++i ) {
        for ( int j = 0; j < ARRAY_SIZE; ++j ) {
            array[i][j] = nan( "" );
        }
    }

    double Newton_polynomials( int i, int j, double *xs, double *ys ) {
        if ( i == j ) {
            return ys[i];
        } else if ( i + 1 == j ) {
            return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
        } else if ( isnan( array[i][j] ) ) {
            array[i][j] = ( Newton_polynomials( i + 1, j, xs, ys )
                          - Newton_polynomials( i, j - 1, xs, ys ) ) / ( xs[j] - xs[i] );
        }

        return array[i][j];
    }

Here is a hash-table based implementation:

    #include <map>
    #include <cmath>
    #include <utility>

    double Newton_polynomials( int i, int j, double *xs, double *ys, bool clear = true ) {
        static std::map< std::pair<int, int>, double > memo;

        if ( clear ) {
            memo.clear();
        }

        if ( i == j ) {
            return ys[i];
        } else if ( i + 1 == j ) {
            return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
        } else if ( memo.find( std::pair<int, int>( i, j ) ) == memo.end() ) {
            memo[std::pair<int, int>( i, j )] =
                ( Newton_polynomials( i + 1, j, xs, ys, false )
                - Newton_polynomials( i, j - 1, xs, ys, false ) ) / ( xs[j] - xs[i] );
        }

        return memo[std::pair<int, int>( i, j )];
    }

Here is a bottom-up approach:

    void Newton_polynomials( int n, double *xs, double *ys, double *coeffs ) {
        for ( int i = 0; i < n; ++i ) {
            coeffs[i] = ys[i];
        }

        for ( int i = 1; i < n; ++i ) {
            for ( int j = n - 1; j >= i; --j ) {
                coeffs[j] = ( coeffs[j] - coeffs[j - 1] ) / ( xs[j] - xs[j - i] );
            }
        }
    }

Note that the memory required is now Θ(n) and not Θ(n²)


Connectedness in a DAG

To determine if two vertices are connected in a DAG, we could implement the following:

    bool Weighted_graph::connected( int i, int j ) {
        if ( adjacent( i, j ) ) {
            return true;
        }

        for ( int v : neighbors( i ) ) {
            if ( connected( v, j ) ) {
                return true;
            }
        }

        return false;
    }

– What are the issues with this implementation?
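Among other things, this implementation has overlapping sub-problems: a vertex reachable along many paths is re-explored once per path, so the run time can be exponential in the size of the DAG. Below is a minimal sketch of one fix; for self-containment it replaces the course's Weighted_graph interface with a plain adjacency list, and the visited array memoizes which vertices have already been fully explored, giving O(V + E) time:

    #include <vector>

    // A sketch, not the course's Weighted_graph class.  Once a vertex has
    // been explored, everything reachable from it has been examined, so it
    // is never expanded a second time.  Call with visited all false.
    bool connected( const std::vector<std::vector<int>> &adj,
                    int i, int j, std::vector<bool> &visited ) {
        if ( i == j ) {
            return true;
        }

        visited[i] = true;

        for ( int v : adj[i] ) {
            if ( !visited[v] && connected( adj, v, j, visited ) ) {
                return true;
            }
        }

        return false;
    }

Here the "memo" records only the fact that a vertex was explored, which is enough for reachability: this is exactly depth-first search.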
Dynamic programming

In solving optimization problems, the top-down approach may require repeatedly obtaining optimal solutions for the same sub-problem
– The mathematician Richard Bellman initially formulated the concept of dynamic programming in 1953 to solve such problems
– The idea isn't new, but Bellman formalized and defined the process

Dynamic programming is distinct from divide-and-conquer, as the divide-and-conquer approach works well if the sub-problems are essentially unique
– Storing intermediate results would only waste memory

If sub-problems re-occur, the problem is said to have overlapping sub-problems


Matrix chain multiplication

Suppose A is k × m and B is m × n
– Then AB is k × n, and calculating AB is Θ(kmn)
– The number of scalar multiplications is given exactly by kmn

Suppose we are multiplying three matrices ABC:
– Matrix multiplication is associative, so we may choose (AB)C or A(BC)
– The order of the multiplications may significantly affect the run time

For example, if A and B are n × n matrices and v is an n-dimensional column vector:
– Calculating (AB)v is Θ(n³)
– Calculating A(Bv) is Θ(n²)

Suppose we want to multiply four matrices ABCD
– There are many ways of parenthesizing this product:

    ((AB)C)D    (AB)(CD)    (A(BC))D    A((BC)D)    A(B(CD))

– Which has the least number of operations?

For example, consider these four:

    Matrix   Dimensions
    A        20 × 5
    B         5 × 40
    C        40 × 50
    D        50 × 10

Considering the first order, ((AB)C)D, the required number of multiplications is:

    AB          20 × 5 × 40  =  4000
    (AB)C       20 × 40 × 50 = 40000
    ((AB)C)D    20 × 50 × 10 = 10000

This totals 54000 multiplications

Considering the next order, (AB)(CD):

    AB          20 × 5 × 40  =  4000
    CD          40 × 50 × 10 = 20000
    (AB)(CD)    20 × 40 × 10 =  8000

This totals 32000 multiplications

Considering the next order, (A(BC))D:

    BC          5 × 40 × 50  = 10000
    A(BC)       20 × 5 × 50  =  5000
    (A(BC))D    20 × 50 × 10 = 10000

This totals 25000 multiplications

And the next order, A((BC)D):

    BC          5 × 40 × 50  = 10000
    (BC)D       5 × 50 × 10  =  2500
    A((BC)D)    20 × 5 × 10  =  1000

This totals 13500 multiplications

Repeating this for the last order, we get the following table:

    Order       Multiplications
    ((AB)C)D    54000
    (AB)(CD)    32000
    (A(BC))D    25000
    A((BC)D)    13500
    A(B(CD))    23000

The optimal run time uses the order A((BC)D)
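The per-order totals above follow directly from the kmn rule, and a few lines of code can recompute them as a sanity check. This snippet is not from the slides, and the cost() helper is purely illustrative:

    #include <iostream>

    // Cost of multiplying a (k x m) matrix by an (m x n) matrix: kmn.
    int cost( int k, int m, int n ) { return k * m * n; }

    int main() {
        // A: 20 x 5, B: 5 x 40, C: 40 x 50, D: 50 x 10
        std::cout << "((AB)C)D: " << cost(20,5,40) + cost(20,40,50) + cost(20,50,10) << std::endl; // 54000
        std::cout << "(AB)(CD): " << cost(20,5,40) + cost(40,50,10) + cost(20,40,10) << std::endl; // 32000
        std::cout << "(A(BC))D: " << cost(5,40,50) + cost(20,5,50)  + cost(20,50,10) << std::endl; // 25000
        std::cout << "A((BC)D): " << cost(5,40,50) + cost(5,50,10)  + cost(20,5,10)  << std::endl; // 13500
        std::cout << "A(B(CD)): " << cost(40,50,10) + cost(5,40,10) + cost(20,5,10)  << std::endl; // 23000
        return 0;
    }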
Thus, the minimal number of multiplications may be found by calculating the product in the order A((BC)D)

Problem: what if we are multiplying n matrices?

Can we generate a greedy algorithm to achieve this?
– Greedy by the smallest number of operations?
  • Unfortunately, with matrices of dimensions 2 × 1, 1 × 2 and 2 × 3:
        2 × 1 × 2 + 2 × 2 × 3 = 16 > 12 = 1 × 2 × 3 + 2 × 1 × 3
    even though
        2 × 1 × 2 = 4 < 6 = 1 × 2 × 3
– Greedy by generating the smallest matrices (sum of dimensions)?
  • Unfortunately, with matrices of dimensions 2 × 1, 1 × 2 and 2 × 4:
        2 × 1 × 2 + 2 × 2 × 4 = 20 > 16 = 1 × 2 × 4 + 2 × 1 × 4
    even though
        2 + 2 = 4 < 5 = 1 + 4

If we are multiplying A₁A₂A₃A₄⋅⋅⋅Aₙ, starting top-down there are n − 1 different ways of parenthesizing this sequence at the outermost level:

    (A₁)(A₂A₃A₄⋅⋅⋅Aₙ)
    (A₁A₂)(A₃A₄⋅⋅⋅Aₙ)
    (A₁A₂A₃)(A₄⋅⋅⋅Aₙ)
    ⋅⋅⋅
    (A₁⋅⋅⋅Aₙ₋₂)(Aₙ₋₁Aₙ)
    (A₁⋅⋅⋅Aₙ₋₂Aₙ₋₁)(Aₙ)

For each one we must ask:
– What is the work required to perform this last multiplication?
– What is the minimal amount of work required to perform each of the two sub-products?

For example, in finding the best product of (A₁⋅⋅⋅Aᵢ)(Aᵢ₊₁⋅⋅⋅Aₙ), the work required is:
– The final product: rows(A₁) · columns(Aᵢ) · columns(Aₙ)
  • Note that columns(Aᵢ) and rows(Aᵢ₊₁) must be equal
– The minimal work required to multiply A₁⋅⋅⋅Aᵢ
– The minimal work required to multiply Aᵢ₊₁⋅⋅⋅Aₙ

    #include <cassert>

    int matrix_chain( Matrix *ms, int i, int j ) {
        // There is only one matrix
        if ( i + 1 == j ) {
            return 0;
        } else if ( i + 2 == j ) {
            assert( ms[i].columns() == ms[i + 1].rows() );
            return ms[i].rows() * ms[i].columns() * ms[i + 1].columns();
        }

        // We are multiplying at least three matrices.
        // Start with calculating the work for M[i] (M[i + 1] ... M[j - 1])
        assert( ms[i].columns() == ms[i + 1].rows() );
        int minimum = matrix_chain( ms, i + 1, j )
                    + ms[i].rows() * ms[i].columns() * ms[j - 1].columns();

        for ( int k = i + 2; k < j; ++k ) {
            // Find the work for (M[i] ... M[k - 1]) (M[k] ... M[j - 1])
            // and update if it is less
            assert( ms[k - 1].columns() == ms[k].rows() );
            int current = matrix_chain( ms, i, k ) + matrix_chain( ms, k, j )
                        + ms[i].rows() * ms[k].rows() * ms[j - 1].columns();

            if ( current < minimum ) {
                minimum = current;
            }
        }

        return minimum;
    }

Because of the recursive nature, we will on numerous occasions be asking for the optimal behaviour of a given subsequence
– We will ask for the optimal way to multiply A₃⋅⋅⋅Aₙ₋₂ when we ask for the optimal way to multiply any of the following:

    A₁ (((A₂ (A₃⋅⋅⋅Aₙ₋₂)) Aₙ₋₁) Aₙ)
    A₁ ((A₂ ((A₃⋅⋅⋅Aₙ₋₂) Aₙ₋₁)) Aₙ)
    A₁ ((A₂ (A₃⋅⋅⋅Aₙ₋₂)) (Aₙ₋₁ Aₙ))
    A₁ (A₂ (((A₃⋅⋅⋅Aₙ₋₂) Aₙ₋₁) Aₙ))
    A₁ (A₂ ((A₃⋅⋅⋅Aₙ₋₂) (Aₙ₋₁ Aₙ)))
    (A₁ A₂) (((A₃⋅⋅⋅Aₙ₋₂) Aₙ₋₁) Aₙ)
    (A₁ A₂) ((A₃⋅⋅⋅Aₙ₋₂) (Aₙ₋₁ Aₙ))
    ((A₁ A₂) (A₃⋅⋅⋅Aₙ₋₂)) (Aₙ₋₁ Aₙ)
    (A₁ (A₂ (A₃⋅⋅⋅Aₙ₋₂))) (Aₙ₋₁ Aₙ)
    (((A₁ A₂) (A₃⋅⋅⋅Aₙ₋₂)) Aₙ₋₁) Aₙ
    ((A₁ (A₂ (A₃⋅⋅⋅Aₙ₋₂))) Aₙ₋₁) Aₙ
    ((A₁ A₂) ((A₃⋅⋅⋅Aₙ₋₂) Aₙ₋₁)) Aₙ
    (A₁ ((A₂ (A₃⋅⋅⋅Aₙ₋₂)) Aₙ₋₁)) Aₙ
    (A₁ (A₂ ((A₃⋅⋅⋅Aₙ₋₂) Aₙ₋₁))) Aₙ

The actual number of possible orderings is given by the following recurrence relation:

    P(n) = 1                               n = 1
    P(n) = Σ_{k=1}^{n−1} P(k) P(n − k)     n > 1

Without proof, this recurrence relation is solved by P(n) = C(n − 1), where C(n − 1) is the (n − 1)th Catalan number:

    C(n) = (1/(n + 1)) · binomial(2n, n) = Θ( 4ⁿ / n^{3/2} )

    C(1) = 1     C(6)  = 132
    C(2) = 2     C(7)  = 429
    C(3) = 5     C(8)  = 1430
    C(4) = 14    C(9)  = 4862
    C(5) = 42    C(10) = 16796
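The recurrence and the closed form are easy to cross-check; this short program (an addition, not from the slides) computes P(n) directly from the recurrence, and the printed values should match the Catalan numbers C(n − 1) listed above:

    #include <iostream>

    int main() {
        // P[n] = number of parenthesizations of a product of n matrices
        long long P[12] = { 0 };
        P[1] = 1;

        for ( int n = 2; n <= 11; ++n ) {
            for ( int k = 1; k < n; ++k ) {
                P[n] += P[k] * P[n - k];
            }
        }

        // Expect 1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796
        for ( int n = 1; n <= 11; ++n ) {
            std::cout << "P(" << n << ") = " << P[n] << std::endl;
        }

        return 0;
    }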
The number of function calls of this implementation is given by

    T(n) = 1              n = 1, 2
    T(n) = 4 · 3ⁿ⁻³       n ≥ 3

that is, T(n) = Θ(3ⁿ)

Can we speed this up?
– Memoization: once we've found the optimal number of multiplications for a given subsequence, store it

    #include <map>
    #include <utility>

    int matrix_chain_memo( Matrix *ms, int i, int j, bool clear = true ) {
        // associate a pair (i, j) with an integer
        static std::map< std::pair<int, int>, int > memo;

        if ( clear ) {
            memo.clear();
        }

        if ( i + 1 == j ) {
            return 0;
        } else if ( memo[std::pair<int, int>(i, j)] == 0 ) {
            if ( i + 2 == j ) {
                memo[std::pair<int, int>(i, j)] =
                    ms[i].rows() * ms[i].columns() * ms[i + 1].columns();
            } else {
                int minimum = matrix_chain_memo( ms, i + 1, j, false )
                            + ms[i].rows() * ms[i].columns() * ms[j - 1].columns();

                for ( int k = i + 2; k < j; ++k ) {
                    int current = matrix_chain_memo( ms, i, k, false )
                                + matrix_chain_memo( ms, k, j, false )
                                + ms[i].rows() * ms[k].rows() * ms[j - 1].columns();

                    if ( current < minimum ) {
                        minimum = current;
                    }
                }

                memo[std::pair<int, int>(i, j)] = minimum;
            }
        }

        return memo[std::pair<int, int>(i, j)];
    }

Our memoized version now solves each distinct sub-problem (i, j) at most once, and there are only Θ(n²) such sub-problems

This is a top-down implementation
– Can we implement a bottom-up version?

For a bottom-up implementation, we need a table that stores our best current solution for each product Aᵢ⋯Aⱼ; initially only the diagonal is known:

         j = 1    2    3    4   ···   n
    i = 1    0
        2         0
        3              0
        4                   0
      ···                       ···
        n                             0

As we calculate the minimum number of multiplications required for a specific sequence Aᵢ⋯Aⱼ, we fill in the entry aᵢⱼ of the table

This table has n² entries, and therefore our run time must be at least Ω(n²)
However, calculating each entry requires up to Θ(n) operations, so the actual run time is Θ(n³)

For example, given the previous example

    Matrix   Dimensions
    A₁       20 × 5
    A₂        5 × 40
    A₃       40 × 50
    A₄       50 × 10

we can calculate the table as follows (each partial product Aᵢ⋯Aⱼ is a rows(Aᵢ) × columns(Aⱼ) matrix)

We can calculate the first off-diagonal easily:

         1      2       3       4
    1    0    4000
    2           0    10000
    3                    0    20000
    4                            0

Next, for A₁⋯A₃ we may calculate either:

    A₁(A₂A₃)    20 × 5 × 50 + 10000 = 15000
    (A₁A₂)A₃    4000 + 20 × 40 × 50 = 44000

so a₁₃ = 15000. We continue with A₂⋯A₄:

    A₂(A₃A₄)    5 × 40 × 10 + 20000 = 22000
    (A₂A₃)A₄    10000 + 5 × 50 × 10 = 12500

so a₂₄ = 12500. Finally we calculate:

    A₁((A₂A₃)A₄)    20 × 5 × 10 + 12500         = 13500
    (A₁A₂)(A₃A₄)    4000 + 20 × 40 × 10 + 20000 = 32000
    (A₁(A₂A₃))A₄    15000 + 20 × 50 × 10        = 25000

giving the completed table:

         1      2       3       4
    1    0    4000   15000   13500
    2           0    10000   12500
    3                    0   20000
    4                            0

Thus, counting the number of calculations required (each Θ(1)): the first off-diagonal has n − 1 entries requiring 1 product each, the next has n − 2 entries requiring 2 each, then n − 3 entries requiring 3 each, and so on, which suggests the sum

    Σ_{i=1}^{n−1} (n − i) i
        = n Σ_{i=1}^{n−1} i − Σ_{i=1}^{n−1} i²
        = n · n(n − 1)/2 − (n − 1)n(2n − 1)/6
        = (n³ − n)/6

Therefore, the run time is Θ(n³)

Here is the bottom-up implementation:

    int matrix_chain_iterative( Matrix *ms, int n ) {
        int array[n][n];

        for ( int i = 0; i < n; ++i ) {
            array[i][i] = 0;
        }

        for ( int i = 1; i < n; ++i ) {
            for ( int j = 0; j < n - i; ++j ) {
                array[j][j + i] = array[j][j] + array[j + 1][j + i]
                                + ms[j].rows() * ms[j + 1].rows() * ms[j + i].columns();

                for ( int k = j + 1; k < j + i; ++k ) {
                    int current = array[j][k] + array[k + 1][j + i]
                                + ms[j].rows() * ms[k + 1].rows() * ms[j + i].columns();

                    if ( current < array[j][j + i] ) {
                        array[j][j + i] = current;
                    }
                }
            }
        }

        return array[0][n - 1];
    }
How can you estimate run times?
– Which is faster: the top-down implementation with memoization or the bottom-up implementation?
– You could plot run times…
– Question: what is the growth of this plotted data?
  • Quadratic?
  • Cubic?
  • Θ(n² ln(n))?

If a function grows polynomially, it is of the form

    T(n) = a nᵇ

Take the logarithm of both sides:

    ln(T(n)) = ln(a nᵇ) = ln(a) + b ln(n)

– On a log-log plot, the data grows linearly with slope b

For the two implementations, the measured slopes are:

    Top-down with memoization    b ≈ 2
    Bottom-up                    b ≈ 3
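As a sketch of how such a slope can be estimated (this code is not from the slides), fit a least-squares line through the points (ln n, ln T(n)); its slope is the estimate of b. The (n, T) pairs below are placeholder data, chosen to be exactly cubic so the expected answer is 3:

    #include <cmath>
    #include <iostream>
    #include <vector>

    // Least-squares slope of ln(T) against ln(n): an estimate of the
    // exponent b in T(n) = a * n^b.
    double estimate_exponent( const std::vector<double> &ns,
                              const std::vector<double> &ts ) {
        int m = ns.size();
        double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;

        for ( int i = 0; i < m; ++i ) {
            double x = std::log( ns[i] ), y = std::log( ts[i] );
            sx += x;  sy += y;  sxx += x*x;  sxy += x*y;
        }

        return ( m*sxy - sx*sy )/( m*sxx - sx*sx );
    }

    int main() {
        // Placeholder timings: T(n) = n^3, so the slope should be exactly 3.
        // In practice these values would come from actual measurements.
        std::vector<double> ns = { 100, 200,  400,   800 };
        std::vector<double> ts = { 1e6, 8e6, 6.4e7, 5.12e8 };

        std::cout << "b = " << estimate_exponent( ns, ts ) << std::endl;
        return 0;
    }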
Optimal polygon triangulation

In graphics and geometry, convex polygons are a basic unit, with applications in graphics and finite-element methods
– Dividing such a polygon into simpler triangles is a common operation
– Some triangulations may be better than others; for example, we may prefer fewer thin triangles

Now, a convex quadrilateral (tetragon) can only be triangulated in two different ways, a convex pentagon can be triangulated in five different ways, and a convex hexagon in 14 different ways

The sequence 2, 5, 14 may already appear familiar:
– the Catalan numbers…

If we can put a weight on each generated triangle, can we find an optimal triangulation?
– Can we come up with a good algorithm?
– Consider the previous problem of finding an optimal order for multiplying matrices

Choose a side and begin numbering the sides in order:
– Any two adjacent sides can be joined to create a triangle, just as any two adjacent matrices can be multiplied
– Taking two adjacent sides and creating a triangle is similar to bracketing that product
– Instead of sides 1 and 2, we could choose sides 2 and 3, or sides 3 and 4
– Suppose the triangle on sides 3 and 4 was optimal: next, do we add side 2, or do we add side 5?

The analogy is not exact, because there is no analogue to bracketing and multiplying the first and last matrices A₁ and A₇
Nevertheless, this strongly suggests that there is an efficient algorithm based on dynamic programming that will find an optimal triangulation
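The slides stop short of the algorithm itself, so here is a hedged sketch of the standard dynamic-programming formulation, structurally identical to the bottom-up matrix-chain table: table[i][j] holds the optimal cost of triangulating the sub-polygon with vertices p[i], …, p[j]. The triangle weight used here (perimeter) is only one illustrative choice, not something the slides specify:

    #include <cmath>
    #include <limits>
    #include <vector>

    struct Point { double x, y; };

    double dist( const Point &a, const Point &b ) {
        return std::hypot( a.x - b.x, a.y - b.y );
    }

    // One possible triangle weight: its perimeter (an assumption chosen
    // purely for illustration; any weight function works).
    double weight( const Point &a, const Point &b, const Point &c ) {
        return dist( a, b ) + dist( b, c ) + dist( c, a );
    }

    // Minimum-weight triangulation of a convex polygon with vertices
    // p[0..n-1] listed in order around the polygon.
    double min_weight_triangulation( const std::vector<Point> &p ) {
        int n = p.size();
        std::vector<std::vector<double>> table( n, std::vector<double>( n, 0.0 ) );

        for ( int len = 2; len < n; ++len ) {
            for ( int i = 0; i + len < n; ++i ) {
                int j = i + len;
                table[i][j] = std::numeric_limits<double>::infinity();

                // The edge (i, j) must be a side of some triangle (i, k, j),
                // which splits the polygon into two smaller sub-polygons
                for ( int k = i + 1; k < j; ++k ) {
                    double current = table[i][k] + table[k][j]
                                   + weight( p[i], p[k], p[j] );

                    if ( current < table[i][j] ) {
                        table[i][j] = current;
                    }
                }
            }
        }

        return table[0][n - 1];
    }

The choice of the intermediate vertex k plays exactly the role that the split position k plays in matrix-chain multiplication, so the run time is likewise Θ(n³).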
Interval scheduling

A problem we saw in the topic on greedy algorithms was interval scheduling:

    Process   Interval
    A          5 – 8
    B         10 – 13
    C          6 – 9
    D         12 – 15
    E          3 – 7
    F          8 – 11
    G          1 – 6
    H          8 – 12
    J          3 – 5
    K          2 – 4
    L         11 – 16
    M         10 – 15

The earliest-deadline-first greedy algorithm will maximize the number of processes that can be scheduled

Suppose, however, that there are weights associated with the processes
– Can we find the schedulable processes that have maximal total weight?

    Process   Interval   Weight
    A          5 – 8      1.7
    B         10 – 13     1.3
    C          6 – 9      3.9
    D         12 – 15     3.2
    E          3 – 7      5.9
    F          8 – 11     3.3
    G          1 – 6      5.4
    H          8 – 12     1.2
    J          3 – 5      5.8
    K          2 – 4      4.8
    L         11 – 16     3.9
    M         10 – 15     2.6

Currently, no greedy algorithm is known that can solve this problem

Note: if we wanted to optimize based on processor usage, the weight would be equal to the computation time (for example, process G (1 – 6) would have weight 5 and process K (2 – 4) would have weight 2)

First, as before, we sort the processes by their deadlines:

    Place   Process   Interval   Weight
    1       K          2 – 4      4.8
    2       J          3 – 5      5.8
    3       G          1 – 6      5.4
    4       E          3 – 7      5.9
    5       A          5 – 8      1.7
    6       C          6 – 9      3.9
    7       F          8 – 11     3.3
    8       H          8 – 12     1.2
    9       B         10 – 13     1.3
    10      D         12 – 15     3.2
    11      M         10 – 15     2.6
    12      L         11 – 16     3.9

For each process we record its previous process: the last place whose process could still finish before this one starts. For example, process L (starting at 11) could be preceded by nothing after the 7th process (F, finishing at 11), and process M (starting at 10) by nothing after the 6th process (C, finishing at 9). If a process must be run first, we mark its previous process as 0:

    Place   Process   Interval   Weight   Previous
    1       K          2 – 4      4.8      0
    2       J          3 – 5      5.8      0
    3       G          1 – 6      5.4      0
    4       E          3 – 7      5.9      0
    5       A          5 – 8      1.7      2
    6       C          6 – 9      3.9      3
    7       F          8 – 11     3.3      5
    8       H          8 – 12     1.2      5
    9       B         10 – 13     1.3      6
    10      D         12 – 15     3.2      8
    11      M         10 – 15     2.6      6
    12      L         11 – 16     3.9      7

Thus, given process n, the optimal weight is either:
– the weight of this process plus the optimal weight of the previous process that could have been run, or
– the optimal weight of the (n − 1)st process

This can be quickly converted to code (returning double, since the weights are not integers):

    double optimal_weight( int n ) const {
        // The zeroth process has weight 0
        if ( n == 0 ) {
            return 0.0;
        }

        return std::max( weight[n] + optimal_weight( previous[n] ),
                         optimal_weight( n - 1 ) );
    }

Unfortunately, consider the worst-case run-time analysis: suppose it is true that previous[n] == n - 1 for all n. Then the run time satisfies

    T(n) = 1                 n = 0
    T(n) = 2 T(n − 1) + 1    n > 0

Thus, in the optimal case where we can run every process, the run time is exponential: T(n) = Θ(2ⁿ)

However, using memoization:

    double optimal_weight_memo( int n, bool clear = true ) const {
        static std::map<int, double> memo;

        if ( clear ) {
            memo.clear();
        }

        if ( n == 0 ) {
            return 0.0;
        } else if ( memo[n] == 0.0 ) {
            memo[n] = std::max( weight[n] + optimal_weight_memo( previous[n], false ),
                                optimal_weight_memo( n - 1, false ) );
        }

        return memo[n];
    }

We can also implement an iterative bottom-up approach:

    double optimal_weight_iterative( int n ) const {
        double optimal[n + 1];

        optimal[0] = 0.0;

        for ( int k = 1; k <= n; ++k ) {
            optimal[k] = std::max( weight[k] + optimal[previous[k]],
                                   optimal[k - 1] );
        }

        return optimal[n];
    }

There is only one loop, so the run time is Θ(n)
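The tables above assume the Previous column is already known. The slides do not show how to compute it, but a standard approach is a binary search: with the processes sorted by finish time, previous[i] is the largest place whose finish time is at most process i's start time. A sketch, hard-coding the example's sorted table:

    #include <algorithm>
    #include <iostream>
    #include <vector>

    int main() {
        // The example's processes sorted by deadline (places 1..12):
        // K, J, G, E, A, C, F, H, B, D, M, L
        std::vector<int> start  = { 2, 3, 1, 3, 5, 6,  8,  8, 10, 12, 10, 11 };
        std::vector<int> finish = { 4, 5, 6, 7, 8, 9, 11, 12, 13, 15, 15, 16 };

        int n = start.size();
        std::vector<int> previous( n + 1, 0 );   // previous[0] unused; places are 1..n

        for ( int i = 1; i <= n; ++i ) {
            // Count of places p < i with finish[p - 1] <= start[i - 1]:
            // since the finish times are sorted, a binary search suffices
            int p = std::upper_bound( finish.begin(), finish.begin() + (i - 1),
                                      start[i - 1] ) - finish.begin();
            previous[i] = p;   // 0 means the process must be run first
        }

        for ( int i = 1; i <= n; ++i ) {
            std::cout << "previous[" << i << "] = " << previous[i] << std::endl;
        }

        return 0;
    }

Each of the n look-ups costs O(ln(n)), so building the table takes O(n ln(n)), matching the cost of the initial sort. The output reproduces the Previous column above.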
Project management: the 0/1 knapsack problem

Recall our problem of project management, the 0/1 knapsack problem:
– The constraint W is an integer value
– There are n items we could include:
  • the kth item has an integral weight wₖ
  • its value is vₖ
– The optimal value of the objects we can place in the knapsack is:

    knapsack(k, w) = 0                                     k = 0
    knapsack(k, w) = knapsack(k − 1, w)                    wₖ > w
    knapsack(k, w) = max( knapsack(k − 1, w),
                          knapsack(k − 1, w − wₖ) + vₖ )   wₖ ≤ w

– Again, this is a recursive optimization problem

Here is a hash-table based implementation:

    #include <map>
    #include <cmath>
    #include <utility>

    double knapsack( int *weight, double *value, int k, int W, bool clear = true ) {
        static std::map< std::pair<int, int>, double > memo;

        if ( clear ) {
            memo.clear();
        }

        if ( k == 0 ) {
            return 0;
        } else if ( memo.find( std::pair<int, int>( k, W ) ) == memo.end() ) {
            if ( weight[k - 1] > W ) {
                memo[std::pair<int, int>( k, W )] =
                    knapsack( weight, value, k - 1, W, false );
            } else {
                memo[std::pair<int, int>( k, W )] =
                    std::max( knapsack( weight, value, k - 1, W, false ),
                              knapsack( weight, value, k - 1, W - weight[k - 1], false )
                            + value[k - 1] );
            }
        }

        return memo[std::pair<int, int>( k, W )];
    }

The run-time is O(nW)
– This is significantly slower than our O(n ln(n)) greedy algorithm for the fractional problem
– The greedy algorithm also works if the weights are not integral

For our previous example, the output of

    int main() {
        int weight[10] = { 15, 12, 10, 9, 8, 7, 5, 4, 3, 1 };
        double value[10] = { 210, 220, 180, 120, 160, 170, 90, 40, 60, 10 };

        std::cout << knapsack( weight, value, 10, 26 ) << std::endl;

        return 0;
    }

is 520, which is the optimal result


Arrays versus hash tables

Both simple arrays and hash tables may be dynamically resized; as we saw before, this does not significantly affect the run time

Two problems with arrays are:
– They are indexed only by integers
– They are dense (we may not be solving all sub-problems)
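For contrast, here is a hedged sketch of a bottom-up knapsack (an addition, not from the slides) that illustrates the dense-array trade-off just described: it fills a one-dimensional table over every capacity 0, …, W, whether or not the top-down version would ever have visited that sub-problem:

    #include <algorithm>
    #include <iostream>
    #include <vector>

    // Bottom-up 0/1 knapsack: best[w] is the optimal value achievable with
    // capacity w using the items considered so far.  Iterating w downward
    // ensures each item is used at most once.  O(nW) time, O(W) memory.
    double knapsack_iterative( const std::vector<int> &weight,
                               const std::vector<double> &value, int W ) {
        std::vector<double> best( W + 1, 0.0 );

        for ( std::size_t k = 0; k < weight.size(); ++k ) {
            for ( int w = W; w >= weight[k]; --w ) {
                best[w] = std::max( best[w], best[w - weight[k]] + value[k] );
            }
        }

        return best[W];
    }

    int main() {
        // The example from the slides: the expected output is 520
        std::vector<int> weight = { 15, 12, 10, 9, 8, 7, 5, 4, 3, 1 };
        std::vector<double> value = { 210, 220, 180, 120, 160, 170, 90, 40, 60, 10 };

        std::cout << knapsack_iterative( weight, value, 26 ) << std::endl;

        return 0;
    }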
Remember tables in Maple

Maple is a programming language with built-in memoization:
– In Maple, memoization is achieved through the use of hash tables termed remember tables
– To demonstrate, we will implement the calculation of Fibonacci numbers in Maple

In Maple, we state that a function must have option remember; this gives the function a remember table:

    F := proc(n)
        option remember;
        printf( "Calling F(%d)\n", n );
        procname( n - 1 ) + procname( n - 2 );   # procname is Maple's version of "this" function
    end proc:

    F( 0 ) := 1:   # assign F(0) = 1 to the remember table
    F( 1 ) := 1:   # assign F(1) = 1 to the remember table

The first time F(5) is called, we call F(5), F(4), F(3), and F(2):

    > F(5);
    Calling F(5)
    Calling F(4)
    Calling F(3)
    Calling F(2)
                                   8

The second time, the result comes from the remember table:

    > F(5);
                                   8

Calling F(7) should only call F(7) and F(6):

    > F(7);
    Calling F(7)
    Calling F(6)
                                  21

Recall, however, that remember tables are hash tables: Maple uses a static chained hash table with 1024 bins
– Therefore they can fill up, and accessing previous results can become expensive
– Making n accesses to a hash table should occur in Θ(n) time
– Timing the calculation of F(1024n) for n up to 80 shows behaviour that is initially linear but reverts to quadratic growth as the chains lengthen


Applications

Other applications of dynamic programming include:
– the Floyd-Warshall algorithm, solving the all-pairs shortest path problem
– the longest common subsequence of two strings
  • this has applications to bioinformatics, such as DNA matching
– solving the travelling salesman problem in O(n² 2ⁿ) time
– the Earley parser for context-free languages (ECE 351 Compilers)
– the Levenshtein distance between two strings (used in spell checkers)

These will be discussed in the homework

And finally: the Duckworth-Lewis method for calculating the target score in an interrupted cricket match:
– In the 2006 One-Day International between India and Pakistan, India batted first and was all out in the 49th over for 328
– Pakistan was 7 wickets down for 311 when bad light stopped play after the 47th over
– With three full overs left to play and three wickets in hand, the Duckworth-Lewis method calculated the target to be 304, so the result of the match is officially listed as "Pakistan won by 7 runs (D/L Method)"

    http://en.wikipedia.org/wiki/Duckworth-Lewis_method


Summary

In this class, we have considered the algorithm design strategy of dynamic programming:
– It is useful when recursive algorithms have overlapping sub-problems
– Storing calculated values (memoization) allows significant reductions in run time
– We considered top-down and bottom-up approaches to solving problems
– We looked at both simple and more complex solutions:
  • Fibonacci numbers
  • Newton polynomial coefficients
  • matrix chain multiplication
  • optimal polygon triangulation
  • interval scheduling
– We also considered Maple's remember tables


These slides are provided for the ECE 250 Algorithms and Data Structures course. The material in them reflects Douglas W. Harder's best judgment in light of the information available to him at the time of preparation. Any reliance on these course slides by any party for any other purpose is the responsibility of such parties. Douglas W. Harder accepts no responsibility for damages, if any, suffered by any party as a result of decisions made or actions based on these course slides for any other purpose than that for which they were intended.