4. GREEDY ALGORITHMS II

‣ Dijkstra's algorithm
‣ minimum spanning trees
‣ Prim, Kruskal, Borůvka
‣ single-link clustering
‣ min-cost arborescences

Lecture slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. Copyright © 2013 Kevin Wayne. http://www.cs.princeton.edu/wayne/kleinbergtardos. Last updated on Sep 8, 2013.

DIJKSTRA'S ALGORITHM [SECTION 4.4]

Shortest-paths problem

Problem. Given a digraph G = (V, E), edge lengths ℓ_e ≥ 0, a source s ∈ V, and a destination t ∈ V, find the shortest directed path from s to t.

[figure: example digraph with source s and destination t; length of shortest path = 9 + 4 + 1 + 11 = 25]

Car navigation. [figure]

Shortest-path applications

- PERT/CPM.
- Map routing.
- Seam carving.
- Robot navigation.
- Texture mapping.
- Typesetting in LaTeX.
- Urban traffic planning.
- Telemarketer operator scheduling.
- Routing of telecommunications messages.
- Network routing protocols (OSPF, BGP, RIP).
- Optimal truck routing through given traffic congestion pattern.

Reference: Network Flows: Theory, Algorithms, and Applications, R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Prentice Hall, 1993.

Dijkstra's algorithm

Greedy approach. Maintain a set of explored nodes S for which the algorithm has determined the shortest-path distance d(u) from s to u.
- Initialize S = { s }, d(s) = 0.
- Repeatedly choose the unexplored node v which minimizes π(v), the length of a shortest path to some node u in the explored part, followed by a single edge (u, v); add v to S, and set d(v) = π(v).

Dijkstra's algorithm: proof of correctness

Invariant.
For each node u ∈ S, d(u) is the length of the shortest s↝u path.

Pf. [by induction on |S|]
- Base case: |S| = 1 is easy since S = { s } and d(s) = 0.
- Inductive hypothesis: assume true for |S| = k ≥ 1.
  - Let v be the next node added to S, and let (u, v) be the final edge.
  - The shortest s↝u path plus (u, v) is an s↝v path of length π(v).
  - Consider any s↝v path P. We show that it is no shorter than π(v).
  - Let (x, y) be the first edge in P that leaves S, and let P' be the subpath to x.
  - P is already too long as soon as it reaches y:
    ℓ(P) ≥ ℓ(P') + ℓ(x, y)   [non-negative lengths]
         ≥ d(x) + ℓ(x, y)    [inductive hypothesis]
         ≥ π(y)              [definition of π(y)]
         ≥ π(v)              [Dijkstra chose v instead of y] ▪

Dijkstra's algorithm: efficient implementation

Critical optimization 1. For each unexplored node v, explicitly maintain π(v) instead of computing it directly from the formula

  π(v) = min { d(u) + ℓ_e : e = (u, v), u ∈ S }.

- For each v ∉ S, π(v) can only decrease (because S only increases).
- More specifically, suppose u is added to S and there is an edge (u, v) leaving u. Then, it suffices to update:

  π(v) = min { π(v), d(u) + ℓ(u, v) }.

Critical optimization 2. Use a priority queue to choose the unexplored node that minimizes π(v).

Implementation.
- Algorithm stores d(v) for each explored node v.
- Priority queue stores π(v) for each unexplored node v.
- Recall: d(u) = π(u) when u is deleted from the priority queue.

DIJKSTRA (V, E, s)
  Create an empty priority queue.
  FOR EACH v ≠ s : d(v) ← ∞; d(s) ← 0.
  FOR EACH v ∈ V : insert v with key d(v) into priority queue.
  WHILE (the priority queue is not empty)
    u ← delete-min from priority queue.
    FOR EACH edge (u, v) ∈ E leaving u :
      IF d(v) > d(u) + ℓ(u, v)
        decrease-key of v to d(u) + ℓ(u, v) in priority queue.
        d(v) ← d(u) + ℓ(u, v).

Dijkstra's algorithm: which priority queue?

Performance. Depends on PQ: n insert, n delete-min, m decrease-key.
- Array implementation optimal for dense graphs.
- Binary heap much faster for sparse graphs.
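The binary-heap variant can be sketched in Python. This is a minimal sketch, not the slides' implementation: the standard-library `heapq` provides no decrease-key, so instead of updating keys we push duplicate entries and skip stale ones at delete-min (the common "lazy deletion" workaround). The adjacency-list format is an assumption for illustration.

```python
import heapq

def dijkstra(graph, s):
    """graph: dict mapping node -> list of (neighbor, length) pairs, lengths >= 0.
    Returns a dict of shortest-path distances from s to every reachable node."""
    dist = {}                          # d(u): settled shortest distances
    pq = [(0, s)]                      # entries are (pi(v), v)
    while pq:
        d, u = heapq.heappop(pq)       # delete-min
        if u in dist:                  # stale entry: u was already settled
            continue
        dist[u] = d
        for v, w in graph.get(u, ()):  # relax each edge (u, v) leaving u
            if v not in dist:
                heapq.heappush(pq, (d + w, v))
    return dist
```

For example, with `graph = {'s': [('a', 9), ('b', 14)], 'a': [('b', 2), ('t', 11)], 'b': [('t', 6)]}`, `dijkstra(graph, 's')` gives distances 0, 9, 11, 17 for s, a, b, t.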
- 4-way heap worth the trouble in performance-critical situations.
- Fibonacci/Brodal best in theory, but not worth implementing.

PQ implementation                       insert        delete-min    decrease-key   total
unordered array                         O(1)          O(n)          O(1)           O(n^2)
binary heap                             O(log n)      O(log n)      O(log n)       O(m log n)
d-way heap, d = m/n [Johnson 1975]      O(d log_d n)  O(d log_d n)  O(log_d n)     O(m log_{m/n} n)
Fibonacci heap [Fredman-Tarjan 1984]    O(1) †        O(log n) †    O(1) †         O(m + n log n)
Brodal queue [Brodal 1996]              O(1)          O(log n)      O(1)           O(m + n log n)

† amortized

Extensions of Dijkstra's algorithm

Dijkstra's algorithm and its proof extend to several related problems:
- Shortest paths in undirected graphs: d(v) ≤ d(u) + ℓ(u, v).
- Maximum capacity paths: d(v) ≥ min { π(u), c(u, v) }.
- Maximum reliability paths: d(v) ≥ d(u) × γ(u, v).
- …

Key algebraic structure. Closed semiring (tropical, bottleneck, Viterbi).

MINIMUM SPANNING TREES [SECTION 6.1]

Cycles and cuts

Def. A path is a sequence of edges which connects a sequence of nodes.
Def. A cycle is a path with no repeated nodes or edges other than the starting and ending nodes.

  cycle C = { (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1) }

Def. A cut is a partition of the nodes into two nonempty subsets S and V – S.
Def. The cutset of a cut S is the set of edges with exactly one endpoint in S.

  cutset D = { (3, 4), (3, 5), (5, 6), (5, 7), (8, 7) }

Cycle-cut intersection

Proposition. A cycle and a cutset intersect in an even number of edges.

  cycle C = { (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1) }
  cutset D = { (3, 4), (3, 5), (5, 6), (5, 7), (8, 7) }
  intersection C ∩ D = { (3, 4), (5, 6) }

Pf. [by picture]

Spanning tree properties

Proposition. Let T = (V, F) be a subgraph of G = (V, E).
The following are equivalent (TFAE):
- T is a spanning tree of G.
- T is acyclic and connected.
- T is connected and has n – 1 edges.
- T is acyclic and has n – 1 edges.
- T is minimally connected: removal of any edge disconnects it.
- T is maximally acyclic: addition of any edge creates a cycle.
- T has a unique simple path between every pair of nodes.

Minimum spanning tree

Given a connected graph G = (V, E) with edge costs c_e, an MST is a subset of the edges T ⊆ E such that T is a spanning tree whose sum of edge costs is minimized.

  [figure: example with MST cost = 50 = 4 + 6 + 8 + 5 + 11 + 9 + 7]

Cayley's theorem. There are n^{n–2} spanning trees of K_n. [can't solve by brute force]

Applications

MST is a fundamental problem with diverse applications.
- Dithering.
- Cluster analysis.
- Max bottleneck paths.
- Real-time face verification.
- LDPC codes for error correction.
- Image registration with Renyi entropy.
- Find road networks in satellite and aerial imagery.
- Reducing data storage in sequencing amino acids in a protein.
- Model locality of particle interactions in turbulent fluid flows.
- Autoconfig protocol for Ethernet bridging to avoid cycles in a network.
- Approximation algorithms for NP-hard problems (e.g., TSP, Steiner tree).
- Network design (communication, electrical, hydraulic, computer, road).

Fundamental cycle

Fundamental cycle. Adding any non-tree edge e to a spanning tree T forms a unique cycle C. Deleting any edge f ∈ C from T ∪ { e } results in a new spanning tree.

Observation. If c_e < c_f, then T is not an MST.

Fundamental cutset

Fundamental cutset. Deleting any tree edge f from a spanning tree T divides the nodes into two connected components. Let D be the cutset. Adding any edge e ∈ D to T – { f } results in a new spanning tree.

Observation. If c_e < c_f, then T is not an MST.

The greedy algorithm

Red rule. Let C be a cycle with no red edges. Select an uncolored edge of C of max weight and color it red.

Blue rule. Let D be a cutset with no blue edges.
Select an uncolored edge in D of min weight and color it blue.

Greedy algorithm. Apply the red and blue rules (nondeterministically!) until all edges are colored. The blue edges form an MST.
- Note: can stop once n – 1 edges are colored blue.

Greedy algorithm: proof of correctness

Color invariant. There exists an MST T containing all of the blue edges and none of the red edges.

Pf. [by induction on number of iterations]

Base case. No edges colored ⇒ every MST satisfies the invariant.

Induction step (blue rule). Suppose the color invariant is true before the blue rule.
- Let D be the chosen cutset, and let f be the edge colored blue.
- If f ∈ T, then T still satisfies the invariant.
- Otherwise, consider the fundamental cycle C formed by adding f to T.
- Let e ∈ C be another edge in D. Then e is uncolored and c_e ≥ c_f since
  - e ∈ T ⇒ e not red
  - blue rule ⇒ e not blue and c_e ≥ c_f.
- Thus, T ∪ { f } – { e } satisfies the invariant.

Induction step (red rule). Suppose the color invariant is true before the red rule.
- Let C be the chosen cycle, and let e be the edge colored red.
- If e ∉ T, then T still satisfies the invariant.
- Otherwise, consider the fundamental cutset D formed by deleting e from T.
- Let f ∈ D be another edge in C. Then f is uncolored and c_e ≥ c_f since
  - f ∉ T ⇒ f not blue
  - red rule ⇒ f not red and c_e ≥ c_f.
- Thus, T ∪ { f } – { e } satisfies the invariant. ▪

Theorem. The greedy algorithm terminates. Blue edges form an MST.

Pf. We need to show that either the red or blue rule (or both) applies.
- Suppose edge e is left uncolored.
- Blue edges form a forest.
- Case 1: both endpoints of e are in the same blue tree.
  ⇒ apply red rule to cycle formed by adding e to blue forest.
- Case 2: the endpoints of e are in different blue trees.
  ⇒ apply blue rule to cutset induced by either of the two blue trees. ▪

PRIM, KRUSKAL, BORŮVKA [SECTION 6.2]

Prim's algorithm

- Initialize S = { any node }.
- Repeat n – 1 times:
  - Add to the tree the min-weight edge with one endpoint in S.
  - Add the new node to S.

Theorem. Prim's algorithm computes the MST.
Pf. Special case of the greedy algorithm (blue rule repeatedly applied to S). ▪

Prim's algorithm: implementation

Theorem. Prim's algorithm can be implemented in O(m log n) time.
Pf. Implementation almost identical to Dijkstra's algorithm.
[d(v) = weight of cheapest known edge between v and S]

PRIM (V, E, c)
  Create an empty priority queue.
  s ← any node in V.
  FOR EACH v ≠ s : d(v) ← ∞; d(s) ← 0.
  FOR EACH v : insert v with key d(v) into priority queue.
  WHILE (the priority queue is not empty)
    u ← delete-min from priority queue.
    FOR EACH edge (u, v) ∈ E incident to u :
      IF d(v) > c(u, v)
        decrease-key of v to c(u, v) in priority queue.
        d(v) ← c(u, v).

Kruskal's algorithm

Consider edges in ascending order of weight:
- Add to tree unless it would create a cycle.

Theorem. Kruskal's algorithm computes the MST.
Pf. Special case of the greedy algorithm.
- Case 1: both endpoints of e in same blue tree.
  ⇒ color e red by applying red rule to the unique cycle.
  [all other edges in the cycle are blue]
- Case 2: both endpoints of e are in different blue trees.
  ⇒ color e blue by applying blue rule to the cutset defined by either tree.
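The PRIM pseudocode above can be sketched in Python. As with Dijkstra, this is a hedged sketch rather than the slides' implementation: `heapq` has no decrease-key, so stale queue entries are simply skipped, and the adjacency-list format is an assumption. It returns only the MST weight.

```python
import heapq

def prim_mst_cost(graph):
    """graph: dict node -> list of (neighbor, weight); undirected and connected.
    Returns the total weight of an MST (lazy-deletion variant of PRIM)."""
    s = next(iter(graph))      # s <- any node in V
    explored = set()           # the set S of tree nodes
    pq = [(0, s)]              # (d(v), v): cheapest known edge weight into S
    total = 0
    while pq:
        w, u = heapq.heappop(pq)
        if u in explored:      # stale entry
            continue
        explored.add(u)
        total += w             # w = 0 for the starting node
        for v, c in graph[u]:
            if v not in explored:
                heapq.heappush(pq, (c, v))
    return total
```

On the triangle with weights 1, 2, 3, the two cheapest edges form the MST, so the sketch returns 3.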
  [no edge in the cutset has smaller weight, since Kruskal considered it first] ▪

Kruskal's algorithm: implementation

Theorem. Kruskal's algorithm can be implemented in O(m log m) time.
- Sort edges by weight.
- Use a union-find data structure to dynamically maintain connected components.

KRUSKAL (V, E, c)
  SORT m edges by weight so that c(e_1) ≤ c(e_2) ≤ … ≤ c(e_m).
  S ← ∅
  FOR EACH v ∈ V : MAKE-SET(v).
  FOR i = 1 TO m
    (u, v) ← e_i
    IF FIND-SET(u) ≠ FIND-SET(v)     [are u and v in same component?]
      S ← S ∪ { e_i }
      UNION(u, v).                   [make u and v in same component]
  RETURN S

Reverse-delete algorithm

Consider edges in descending order of weight:
- Remove edge unless it would disconnect the graph.

Theorem. The reverse-delete algorithm computes the MST.
Pf. Special case of the greedy algorithm.
- Case 1: removing edge e does not disconnect the graph.
  ⇒ apply red rule to the cycle C formed by adding e to the existing path between its two endpoints.
  [any edge in C with larger weight would have been deleted when considered]
- Case 2: removing edge e disconnects the graph.
  ⇒ apply blue rule to the cutset D induced by either component. ▪
  [e is the only edge in the cutset: any other edges must have been colored red / deleted]

Fact. [Thorup 2000] Can be implemented in O(m log n (log log n)^3) time.

Review: the greedy MST algorithm

Red rule. Let C be a cycle with no red edges. Select an uncolored edge of C of max weight and color it red.
Blue rule. Let D be a cutset with no blue edges. Select an uncolored edge in D of min weight and color it blue.
Greedy algorithm. Apply the red and blue rules (nondeterministically!) until all edges are colored. The blue edges form an MST.
- Note: can stop once n – 1 edges colored blue.

Theorem. The greedy algorithm is correct.
Special cases. Prim, Kruskal, reverse-delete, ...

Borůvka's algorithm

Repeat until only one tree:
- Apply blue rule to the cutset corresponding to each blue tree.
- Color all selected edges blue.

Theorem. Borůvka's algorithm computes the MST. [assume edge costs are distinct]
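The KRUSKAL pseudocode above can be sketched in Python with a small union-find structure (path compression only; union by rank omitted for brevity). Stopping once k components remain, rather than one, is exactly the single-link k-clustering procedure discussed later in these slides. The edge format and the parameter `k` are assumptions for illustration.

```python
def kruskal(n, edges, k=1):
    """edges: list of (weight, u, v) with nodes labeled 0..n-1.
    Returns the chosen (u, v) edges; with k > 1, stops at k components
    (single-link k-clustering)."""
    parent = list(range(n))        # MAKE-SET for each node

    def find(x):                   # FIND-SET with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, comps = [], n
    for w, u, v in sorted(edges):  # ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:               # adding (u, v) creates no cycle
            parent[ru] = rv        # UNION
            tree.append((u, v))
            comps -= 1
            if comps == k:
                break
    return tree
```

For example, with edges (weight 1 between 0 and 1, weight 2 between 1 and 2, weight 3 between 0 and 2), the sketch returns the two cheapest edges.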
Pf. Special case of the greedy algorithm (repeatedly apply blue rule). ▪

Borůvka's algorithm: implementation

Theorem. Borůvka's algorithm can be implemented in O(m log n) time.
Pf.
- To implement a phase in O(m) time:
  - compute connected components of blue edges;
  - for each edge (u, v) ∈ E, check if u and v are in different components; if so, update each component's best edge in the cutset.
- At most log_2 n phases since each phase (at least) halves the total number of trees. ▪

Node contraction version.
- After each phase, contract each blue tree to a single supernode.
- Delete parallel edges (keeping only the cheapest one) and self-loops.
- A Borůvka phase becomes: take the cheapest edge incident to each node.

  [figure: graph G; contract nodes 2 and 5; delete parallel edges and self-loops]

Borůvka's algorithm on planar graphs

Theorem. Borůvka's algorithm runs in O(n) time on planar graphs.
Pf.
- To implement a Borůvka phase in O(n) time:
  - use the contraction version of the algorithm;
  - in planar graphs, m ≤ 3n – 6. [graph stays planar when we contract a blue tree]
- Number of nodes (at least) halves.
- At most log_2 n phases: cn + cn/2 + cn/4 + cn/8 + … = O(n). ▪

Borůvka-Prim algorithm

Borůvka-Prim algorithm.
- Run Borůvka (contraction version) for log_2 log_2 n phases.
- Run Prim on the resulting, contracted graph.

Theorem. The Borůvka-Prim algorithm computes an MST and can be implemented in O(m log log n) time.
Pf.
- Correctness: special case of the greedy algorithm.
- The log_2 log_2 n phases of Borůvka's algorithm take O(m log log n) time; the resulting graph has at most n / log_2 n nodes and m edges.
- Prim's algorithm (using Fibonacci heaps) takes O(m + n) time on a graph with n / log_2 n nodes and m edges.
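The Borůvka phases described above can be sketched in Python. This sketch tracks blue-tree components with union-find instead of contracting supernodes, assumes distinct edge weights (as the slides do) and a connected graph, and returns only the MST weight; the edge format is an assumption.

```python
def boruvka_mst_cost(n, edges):
    """edges: list of (weight, u, v), distinct weights, connected graph,
    nodes labeled 0..n-1. Returns the total MST weight."""
    parent = list(range(n))

    def find(x):                            # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, trees = 0, n
    while trees > 1:
        best = {}                           # blue tree -> cheapest edge leaving it
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:                    # edge crosses the tree's cutset
                if ru not in best or w < best[ru][0]:
                    best[ru] = (w, ru, rv)
                if rv not in best or w < best[rv][0]:
                    best[rv] = (w, ru, rv)
        for w, ru, rv in best.values():     # color the selected edges blue
            if find(ru) != find(rv):        # may already be merged this phase
                parent[find(ru)] = find(rv)
                total += w
                trees -= 1
    return total
```

On the weight-1/2/3 triangle used earlier, a single phase selects the two cheapest edges, giving total weight 3.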
  [since O(m + (n / log_2 n) log (n / log_2 n)) = O(m + n)] ▪

Does a linear-time MST algorithm exist?

Deterministic compare-based MST algorithms:

year   worst case                      discovered by
1975   O(m log log n)                  Yao
1976   O(m log log n)                  Cheriton-Tarjan
1984   O(m log* n), O(m + n log n)     Fredman-Tarjan
1986   O(m log (log* n))               Gabow-Galil-Spencer-Tarjan
1997   O(m α(n) log α(n))              Chazelle
2000   O(m α(n))                       Chazelle
2002   optimal                         Pettie-Ramachandran
20xx   O(m)                            ???

Remark 1. O(m) randomized MST algorithm. [Karger-Klein-Tarjan 1995]
Remark 2. O(m) MST verification algorithm. [Dixon-Rauch-Tarjan 1992]

SINGLE-LINK CLUSTERING [SECTION 4.7]

Clustering

Goal. Given a set U of n objects labeled p_1, …, p_n, partition into clusters so that objects in different clusters are far apart.

  [figure: outbreak of cholera deaths in London in 1850s (Nina Mishra)]

Applications.
- Routing in mobile ad hoc networks.
- Document categorization for web search.
- Similarity searching in medical image databases.
- Skycat: cluster 10^9 sky objects into stars, quasars, galaxies.
- ...

Clustering of maximum spacing

k-clustering. Divide objects into k nonempty groups.

Distance function. Numeric value specifying "closeness" of two objects.
- d(p_i, p_j) = 0 iff p_i = p_j   [identity of indiscernibles]
- d(p_i, p_j) ≥ 0                 [nonnegativity]
- d(p_i, p_j) = d(p_j, p_i)       [symmetry]

Spacing. Min distance between any pair of points in different clusters.

Goal. Given an integer k, find a k-clustering of maximum spacing.

Greedy clustering algorithm

"Well-known" algorithm in science literature for single-linkage k-clustering:
- Form a graph on the node set U, corresponding to n clusters.
- Find the closest pair of objects such that each object is in a different cluster, and add an edge between them.
- Repeat n – k times until there are exactly k clusters.

Key observation.
This procedure is precisely Kruskal's algorithm (except we stop when there are k connected components).

Alternative. Find an MST and delete the k – 1 longest edges.

Greedy clustering algorithm: analysis

Theorem. Let C* denote the clustering C*_1, …, C*_k formed by deleting the k – 1 longest edges of an MST. Then, C* is a k-clustering of max spacing.

Pf. Let C denote some other clustering C_1, …, C_k.
- The spacing of C* is the length d* of the (k – 1)st longest edge in the MST.
- Let p_i and p_j be in the same cluster in C*, say C*_r, but different clusters in C, say C_s and C_t.
- Some edge (p, q) on the p_i – p_j path in C*_r spans two different clusters in C.
- Edge (p, q) has length ≤ d* since it wasn't deleted.
- Spacing of C is ≤ d* since p and q are in different clusters. ▪

Dendrogram of cancers in human

Tumors in similar tissues cluster together. [figure: gene expression matrix (gene expressed / gene not expressed). Reference: Botstein & Brown group]

MIN-COST ARBORESCENCES [SECTION 4.9]

Arborescences

Def. Given a digraph G = (V, E) and a root r ∈ V, an arborescence (rooted at r) is a subgraph T = (V, F) such that:
- T is a spanning tree of G if we ignore the direction of edges.
- There is a directed path in T from r to each other node v ∈ V.

Warmup. Given a digraph G, find an arborescence rooted at r (if one exists).
Algorithm. A BFS or DFS tree from r is an arborescence (iff all nodes are reachable).

Proposition. A subgraph T = (V, F) of G is an arborescence rooted at r iff T has no directed cycles and each node v ≠ r has exactly one entering edge.

Pf.
- (⇒) If T is an arborescence, then it has no (directed) cycles, and every node v ≠ r has exactly one entering edge: the last edge on the unique r↝v path.
- (⇐) Suppose T has no directed cycles and each node v ≠ r has exactly one entering edge.
  - To construct an r↝v path, start at v and repeatedly follow edges in the backward direction.
  - Since T has no directed cycles, the process must terminate.
  - It must terminate at r since r is the only node with no entering edge. ▪

Min-cost arborescence problem

Problem. Given a digraph G with a root node r and a nonnegative cost c_e ≥ 0 on each edge e, compute an arborescence rooted at r of minimum cost.

Assumption 1. G has an arborescence rooted at r.
Assumption 2. No edge enters r (safe to delete such edges since they won't help).

Simple greedy approaches do not work

Observations. A min-cost arborescence need not:
- Be a shortest-paths tree.
- Include the cheapest edge (in some cut).
- Exclude the most expensive edge (in some cycle).

A sufficient optimality condition

Property. For each node v ≠ r, choose one cheapest edge entering v and let F denote this set of n – 1 edges. If (V, F) is an arborescence, then it is a min-cost arborescence.

Pf. An arborescence needs exactly one edge entering each node v ≠ r, and (V, F) is the cheapest way to make these choices. ▪

Note. F may not be an arborescence (since it may have directed cycles).

Reduced costs

Def. For each v ≠ r, let y(v) denote the min cost of any edge entering v. The reduced cost of an edge (u, v) is c'(u, v) = c(u, v) – y(v) ≥ 0.

Observation. T is a min-cost arborescence in G using costs c iff T is a min-cost arborescence in G using reduced costs c'.
Pf. Each arborescence has exactly one edge entering each node v ≠ r, so its cost under c and under c' differ by the constant Σ_{v ≠ r} y(v). ▪
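The reduced-cost transformation can be sketched in a few lines of Python (the edge-list format and function name are assumptions; at most one edge per ordered pair (u, v) is assumed so that a dict of reduced costs is well defined):

```python
def reduced_costs(edges, r):
    """edges: list of (u, v, c) with c >= 0; assumes no edge enters the root r.
    Returns y and a dict of reduced costs c'(u, v) = c(u, v) - y(v) >= 0."""
    y = {}
    for u, v, c in edges:                  # y(v) = min cost of an edge entering v
        y[v] = min(c, y.get(v, float("inf")))
    return y, {(u, v): c - y[v] for u, v, c in edges}
```

For example, with edges (0,1) of cost 3, (0,2) of cost 5, (2,1) of cost 1, and (1,2) of cost 4 rooted at 0, we get y(1) = 1 and y(2) = 4, and both of the cheapest entering edges get reduced cost 0.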
  [figure: example digraph showing costs c, reduced costs c', and the y(v) values]

Edmonds branching algorithm: intuition

Intuition. Recall F = set of cheapest edges entering v, for each v ≠ r.
- Now, all edges in F have 0 cost with respect to the costs c'(u, v).
- If F does not contain a cycle, then it is a min-cost arborescence.
- If F contains a cycle C, we can afford to use as many edges in C as desired.
- Contract the nodes in C to a supernode (removing any self-loops).
- Recursively solve the problem in the contracted network G' with costs c'(u, v).

Edmonds branching algorithm

EDMONDS-BRANCHING (G, r, c)
  FOR EACH v ≠ r :
    y(v) ← min cost of an edge entering v.
    c'(u, v) ← c(u, v) – y(v) for each edge (u, v) entering v.
  FOR EACH v ≠ r : choose one 0-cost edge entering v,
    and let F be the resulting set of edges.
  IF F forms an arborescence, RETURN T = (V, F).
  ELSE
    C ← directed cycle in F.
    Contract C to a single supernode, yielding G' = (V', E').
    T' ← EDMONDS-BRANCHING (G', r, c').
    Extend T' to an arborescence T in G by adding all but one edge of C.
    RETURN T.

Q. What could go wrong?
A.
- A min-cost arborescence in G' has exactly one edge entering a node in C (since C is contracted to a single node).
- But a min-cost arborescence in G might have more edges entering C.

Edmonds branching algorithm: key lemma

Lemma. Let C be a cycle in G consisting of 0-cost edges. There exists a min-cost arborescence rooted at r that has exactly one edge entering C.

Pf. Let T be a min-cost arborescence rooted at r.
- Case 0. T has no edges entering C. Since T is an arborescence, there is an r↝v path for each node v ⇒ at least one edge enters C, so this case cannot occur.
- Case 1. T has exactly one edge entering C. Then T satisfies the lemma.
- Case 2. T has more than one edge entering C. We construct another min-cost arborescence T' that has exactly one edge entering C.

Case 2 construction of T'.
- Let (a, b) be an edge in T entering C that lies on a shortest path from r. [this path from r to C uses only one node in C]
- Delete all edges of T that enter a node in C except (a, b).
- Add in all edges of C except the one that enters b.

Claim. T' is a min-cost arborescence.
- The cost of T' is at most that of T since we add only 0-cost edges.
- T' has exactly one edge entering each node v ≠ r. [T is an arborescence rooted at r]
- T' has no directed cycles. [T had no cycles before; there are no cycles within C; and the only path in T' to the cycle C is the path from r to a, since any path must follow the unique entering edge back to r] ▪

Edmonds branching algorithm: analysis

Theorem. [Chu-Liu 1965, Edmonds 1967] The greedy algorithm finds a min-cost arborescence.

Pf. [by induction on number of nodes in G]
- If the edges of F form an arborescence, then it is a min-cost arborescence.
- Otherwise, we use reduced costs, which is equivalent.
- After contracting a 0-cost cycle C to obtain a smaller graph G', the algorithm finds a min-cost arborescence T' in G' (by induction).
- Key lemma: there exists a min-cost arborescence T in G that corresponds to T'. ▪

Theorem. The greedy algorithm can be implemented in O(m n) time.

Pf.
- At most n contractions (since each contraction reduces the number of nodes).
- Finding and contracting the cycle C takes O(m) time.
- Transforming T' into T takes O(m) time. ▪

Min-cost arborescence

Theorem. [Gabow-Galil-Spencer-Tarjan 1985] There exists an O(m + n log n) time algorithm to compute a min-cost arborescence.
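The EDMONDS-BRANCHING recursion can be sketched iteratively in Python using the standard O(m n) contraction formulation. This is a sketch, not the slides' exact algorithm: it returns only the cost (rather than reconstructing T by un-contracting cycles), node labels 0..n-1 and the edge format are assumptions, and each round mirrors the intuition above: select the cheapest edge entering every node, charge those y(v) values, and contract any cycle under reduced costs c'.

```python
def edmonds_cost(n, edges, root):
    """Cost of a min-cost arborescence rooted at `root`.
    edges: list of directed (u, v, w); nodes labeled 0..n-1.
    Returns None if some node is unreachable (no arborescence exists)."""
    INF = float("inf")
    res = 0
    while True:
        # 1. Cheapest edge entering each node (the set F, in slide notation).
        in_w = [INF] * n
        in_u = [-1] * n
        for u, v, w in edges:
            if u != v and w < in_w[v]:
                in_w[v], in_u[v] = w, u
        in_w[root] = 0
        if INF in in_w:
            return None                   # some node has no entering edge
        res += sum(in_w)                  # charge y(v) for every node now
        # 2. Look for a cycle among the chosen edges.
        comp = [-1] * n                   # supernode id after contraction
        vis = [-1] * n                    # which walk first visited this node
        ncomp = 0
        for v in range(n):
            u = v
            while u != root and vis[u] == -1:
                vis[u] = v
                u = in_u[u]
            if u != root and vis[u] == v:  # walk from v returned to u: a cycle
                w = u
                while True:
                    comp[w] = ncomp
                    w = in_u[w]
                    if w == u:
                        break
                ncomp += 1
        if ncomp == 0:
            return res                    # F is an arborescence
        for v in range(n):
            if comp[v] == -1:             # non-cycle nodes get fresh ids
                comp[v] = ncomp
                ncomp += 1
        # 3. Contract cycles; surviving edges get reduced costs c' = c - y(v).
        edges = [(comp[u], comp[v], w - in_w[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root, n = comp[root], ncomp
```

For instance, on a 3-node digraph where the two non-root nodes form a 0-cost-after-reduction 2-cycle reachable from the root by cost-10 edges, either optimal arborescence costs 11, which the sketch returns.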