



graph connectivity and graph traversal

3. GRAPHS

‣ basic definitions and applications
‣ graph connectivity and graph traversal
‣ testing bipartiteness
‣ connectivity in directed graphs
‣ DAGs and topological ordering

Lecture slides by Kevin Wayne
Copyright © 2005 Pearson-Addison Wesley
http://www.cs.princeton.edu/~wayne/kleinberg-tardos
Last updated on 1/10/17 10:58 AM

‣ basic definitions and applications

Undirected graphs

Notation. G = (V, E)
V = nodes.
E = edges between pairs of nodes.
Captures pairwise relationship between objects.
Graph size parameters: n = |V|, m = |E|.

V = { 1, 2, 3, 4, 5, 6, 7, 8 }
E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6, 7-8 }
m = 11, n = 8

One week of Enron emails

[Figure: one week of Enron emails]

The evolution of FCC lobbying coalitions

[Figure: "The Evolution of FCC Lobbying Coalitions" by Pierre de Vries in JoSS Visualization Symposium 2010]

Framingham heart study

[Figure 1: Largest Connected Subcomponent of the Social Network in the Framingham Heart Study in the Year 2000. Each circle (node) represents one person in the data set. There are 2200 persons in this subcomponent of the social network. Circles with red borders denote women, and circles with blue borders denote men. The size of each circle is proportional to the person's body-mass index. The interior color of the circles indicates the person's obesity status: yellow denotes an obese person (body-mass index ≥ 30) and green denotes a nonobese person. The colors of the ties between the nodes indicate the relationship between them: purple denotes a friendship or marital tie and orange denotes a familial tie. From "The Spread of Obesity in a Large Social Network over 32 Years" by Christakis and Fowler in New England Journal of Medicine, 2007]

Some graph applications

graph                node                          edge
communication        telephone, computer           fiber optic cable
circuit              gate, register, processor     wire
mechanical           joint                         rod, beam, spring
financial            stock, currency               transactions
transportation       street intersection, airport  highway, airway route
internet             class C network               connection
game                 board position                legal move
social relationship  person, actor                 friendship, movie cast
neural network       neuron                        synapse
protein network      protein                       protein-protein interaction
molecule             atom                          bond

Graph representation: adjacency matrix

Adjacency matrix. n-by-n matrix with A_uv = 1 if (u, v) is an edge.
Two representations of each edge.
Space proportional to n^2.
Checking if (u, v) is an edge takes Θ(1) time.
Identifying all edges takes Θ(n^2) time.

    1 2 3 4 5 6 7 8
1   0 1 1 0 0 0 0 0
2   1 0 1 1 1 0 0 0
3   1 1 0 0 1 0 1 1
4   0 1 0 0 1 0 0 0
5   0 1 1 1 0 1 0 0
6   0 0 0 0 1 0 0 0
7   0 0 1 0 0 0 0 1
8   0 0 1 0 0 0 1 0

Graph representation: adjacency lists

Adjacency lists. Node-indexed array of lists.
Two representations of each edge.
Space is Θ(m + n).
Checking if (u, v) is an edge takes O(degree(u)) time.    [degree(u) = number of neighbors of u]
Identifying all edges takes Θ(m + n) time.

1: 2 3
2: 1 3 4 5
3: 1 2 5 7 8
4: 2 5
5: 2 3 4 6
6: 5
7: 3 8
8: 3 7

Paths and connectivity

Def. A path in an undirected graph G = (V, E) is a sequence of nodes v_1, v_2, …, v_k with the property that each consecutive pair v_{i–1}, v_i is joined by an edge in E.
Def. A path is simple if all nodes are distinct.
Def. An undirected graph is connected if for every pair of nodes u and v, there is a path between u and v.

Cycles

Def. A cycle is a path v_1, v_2, …, v_k in which v_1 = v_k, k > 2, and the first k – 1 nodes are all distinct.

cycle C = 1-2-4-5-3-1

Trees

Def. An undirected graph is a tree if it is connected and does not contain a cycle.

Theorem. Let G be an undirected graph on n nodes. Any two of the following statements imply the third.
G is connected.
G does not contain a cycle.
G has n – 1 edges.

Rooted trees

Given a tree T, choose a root node r and orient each edge away from r.
Importance. Models hierarchical structure.

[Figure: a tree, and the same tree rooted at 1, with labels: root r, parent of v, v, child of v]

Phylogeny trees

Describe evolutionary history of species.

GUI containment hierarchy

Describe organization of GUI widgets.
http://java.sun.com/docs/books/tutorial/uiswing/overview/anatomy.html

‣ graph connectivity and graph traversal

Connectivity

s-t connectivity problem. Given two nodes s and t, is there a path between s and t?
s-t shortest path problem. Given two nodes s and t, what is the length of the shortest path between s and t?

Applications.
Friendster.
Maze traversal.
Kevin Bacon number.
Fewest number of hops in a communication network.

Breadth-first search

BFS intuition. Explore outward from s in all possible directions, adding nodes one "layer" at a time.

[Figure: layers s, L_1, L_2, …, L_{n–1}]

BFS algorithm.
L_0 = { s }.
L_1 = all neighbors of L_0.
L_2 = all nodes that do not belong to L_0 or L_1, and that have an edge to a node in L_1.
L_{i+1} = all nodes that do not belong to an earlier layer, and that have an edge to a node in L_i.

Theorem. For each i, L_i consists of all nodes at distance exactly i
from s. There is a path from s to t iff t appears in some layer.

Breadth-first search

Property. Let T be a BFS tree of G = (V, E), and let (x, y) be an edge of G. Then, the levels of x and y differ by at most 1.

[Figure: BFS tree with layers L_0, L_1, L_2, L_3]

Breadth-first search: analysis

Theorem. The above implementation of BFS runs in O(m + n) time if the graph is given by its adjacency list representation.

Pf.
Easy to prove O(n^2) running time:
  at most n lists L_i
  each node occurs on at most one list; for loop runs ≤ n times
  when we consider node u, there are ≤ n incident edges (u, v), and we spend O(1) processing each edge
Actually runs in O(m + n) time:
  when we consider node u, there are degree(u) incident edges (u, v)
  total time processing edges is Σ_{u ∈ V} degree(u) = 2m. ▪
  [each edge (u, v) is counted exactly twice in sum: once in degree(u) and once in degree(v)]

Connected component

Connected component. Find all nodes reachable from s.

Connected component containing node 1 = { 1, 2, 3, 4, 5, 6, 7, 8 }.

Flood fill

Flood fill. Given lime green pixel in an image, change color of entire blob of neighboring lime pixels to blue.
Node: pixel.
Edge: two neighboring lime pixels.
Blob: connected component of lime pixels.

[Figure: recolor lime green blob to blue]

Connected component

Connected component. Find all nodes reachable from s.

[Figure: set R containing s and u, with an edge from u to a node v outside R; it's safe to add v]

Theorem. Upon termination, R is the connected component containing s.
BFS = explore in order of distance from s.
DFS = explore in a different way.

‣ testing bipartiteness

Bipartite graphs

Def. An undirected graph G = (V, E) is bipartite if the nodes can be colored blue or white such that every edge has one white and one blue end.

Applications.
Stable marriage: men = blue, women = white.
Scheduling: machines = blue, jobs = white.

[Figure: a bipartite graph]

Testing bipartiteness

Many graph problems become:
Easier if the underlying graph is bipartite (matching).
Tractable if the underlying graph is bipartite (independent set).

Before attempting to design an algorithm, we need to understand structure of bipartite graphs.

[Figure: a bipartite graph G on nodes v_1, …, v_7, and another drawing of G]

An obstruction to bipartiteness

Lemma. If a graph G is bipartite, it cannot contain an odd-length cycle.
Pf. Not possible to 2-color the odd cycle, let alone G.
[Figure: a bipartite (2-colorable) graph and a non-bipartite (not 2-colorable) graph]

Bipartite graphs

Lemma. Let G be a connected graph, and let L_0, …, L_k be the layers produced by BFS starting at node s. Exactly one of the following holds.
(i) No edge of G joins two nodes of the same layer, and G is bipartite.
(ii) An edge of G joins two nodes of the same layer, and G contains an odd-length cycle (and hence is not bipartite).

[Figure: layers L_1, L_2, L_3 in Case (i) and in Case (ii)]

Pf. (i)
Suppose no edge joins two nodes in the same layer.
By the BFS property, each edge joins two nodes in adjacent levels.
Bipartition: white = nodes on odd levels, blue = nodes on even levels.

Pf. (ii)
Suppose (x, y) is an edge with x, y in the same level L_j.
Let z = lca(x, y) = lowest common ancestor.
Let L_i be the level containing z.
Consider the cycle that takes the edge from x to y, then the path from y to z, then the path from z to x.
Its length is 1 + (j – i) + (j – i), which is odd. ▪

[Figure: cycle formed by edge (x, y), path from y to z, and path from z to x, where z = lca(x, y)]

The only obstruction to bipartiteness

Corollary. A graph G is bipartite iff it contains no odd-length cycle.

[Figure: a bipartite (2-colorable) graph and a 5-cycle C, which is not bipartite (not 2-colorable)]
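The BFS layering and the layer-based bipartiteness test can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation from the slides: the function names bfs_layers and is_bipartite are hypothetical, the graph is assumed to be a dict mapping each node to a list of its neighbors, and G is assumed to be connected (as in the lemma).

```python
from collections import deque


def bfs_layers(adj, s):
    """Return (layers, level): layers[i] is L_i, level[v] is the layer index of v.

    adj is assumed to map every node to a list of its neighbors.
    """
    level = {s: 0}
    layers = [[s]]
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:              # v is not in any earlier layer
                level[v] = level[u] + 1
                if level[v] == len(layers):
                    layers.append([])       # open a new layer L_{i+1}
                layers[level[v]].append(v)
                queue.append(v)
    return layers, level


def is_bipartite(adj, s):
    """Lemma: a connected G is bipartite iff no edge joins two nodes of the same layer."""
    _, level = bfs_layers(adj, s)           # assumes every node is reachable from s
    return all(level[u] != level[v] for u in adj for v in adj[u])


# The 8-node example graph from the adjacency-list slide.
example = {
    1: [2, 3], 2: [1, 3, 4, 5], 3: [1, 2, 5, 7, 8], 4: [2, 5],
    5: [2, 3, 4, 6], 6: [5], 7: [3, 8], 8: [3, 7],
}
layers, _ = bfs_layers(example, 1)
print(layers)                    # [[1], [2, 3], [4, 5, 7, 8], [6]]
print(is_bipartite(example, 1))  # False: edge 2-3 lies inside layer L_1
```

Each node is enqueued once and each adjacency list is scanned once, so the sketch stays within the O(m + n) bound from the analysis slide.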
‣ connectivity in directed graphs

Directed graphs

Notation. G = (V, E).
Edge (u, v) leaves node u and enters node v.

Ex. Web graph: hyperlink points from one web page to another.
Orientation of edges is crucial.
Modern web search engines exploit hyperlink structure to rank web pages by importance.

World wide web

Web graph.
Node: web page.
Edge: hyperlink from one page to another (orientation is crucial).
Modern search engines exploit hyperlink structure to rank web pages by importance.

[Figure: web graph on cnn.com, netscape.com, novell.com, cnnsi.com, timewarner.com, hbo.com, sorpranos.com]

Road network

Vertex = intersection; edge = one-way street.

[Figure: street map near Holland Tunnel, New York, NY 10013. ©2008 Google, map data ©2008 Sanborn, NAVTEQ™]

Political blogosphere graph

Vertex = political blog; edge = link.

[Figure 1: Community structure of political blogs (expanded set), shown using a GEM layout in the GUESS visualization and analysis tool. The colors reflect political orientation, red for conservative, and blue for liberal. Orange links go from liberal to conservative, and purple ones from conservative to liberal. The size of each blog reflects the number of other blogs that link to it. From "The Political Blogosphere and the 2004 U.S. Election: Divided They Blog", Adamic and Glance, 2005]
Ecological food web

Food web graph.
Node = species.
Edge = from prey to predator.

Reference: http://www.twingroves.district96.k12.il.us/Wetlands/Salamander/SalGraphics/salfoodweb.giff

Some directed graph applications

directed graph         node                 directed edge
transportation         street intersection  one-way street
web                    web page             hyperlink
food web               species              predator-prey relationship
WordNet                synset               hypernym
scheduling             task                 precedence constraint
financial              bank                 transaction
cell phone             person               placed call
infectious disease     person               infection
game                   board position       legal move
citation               journal article      citation
object graph           object               pointer
inheritance hierarchy  class                inherits from
control flow           code block           jump

Graph search

Directed reachability. Given a node s, find all nodes reachable from s.
Directed s-t shortest path problem. Given two nodes s and t, what is the length of the shortest path from s to t?
Graph search. BFS extends naturally to directed graphs.
Web crawler. Start from web page s. Find all web pages linked from s,
either directly or indirectly.

Strong connectivity

Def. Nodes u and v are mutually reachable if there is both a path from u to v and a path from v to u.
Def. A graph is strongly connected if every pair of nodes is mutually reachable.

Lemma. Let s be any node. G is strongly connected iff every node is reachable from s, and s is reachable from every node.
Pf. ⇒ Follows from definition.
Pf. ⇐ Path from u to v: concatenate u↝s path with s↝v path.
Path from v to u: concatenate v↝s path with s↝u path. ▪
[ok if paths overlap]

Strong connectivity: algorithm

Theorem. Can determine if G is strongly connected in O(m + n) time.
Pf.
Pick any node s.
Run BFS from s in G.
Run BFS from s in G^reverse.    [G^reverse = reverse orientation of every edge in G]
Return true iff all nodes reached in both BFS executions.
Correctness follows immediately from previous lemma. ▪

[Figure: a strongly connected graph and a graph that is not strongly connected]

Strong components

Def. A strong component is a maximal subset of mutually reachable nodes.
 
 
 
 
 
 A digraph and its strong components 
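The strong connectivity test from the preceding algorithm slide (BFS from s in G, then BFS from s in G^reverse) might be sketched as follows. This is a minimal illustration under stated assumptions: the names reachable, reverse_graph and is_strongly_connected are hypothetical, and the digraph is assumed to be a dict in which every node appears as a key, mapped to the list of nodes its outgoing edges enter.

```python
from collections import deque


def reachable(adj, s):
    """Return the set of nodes reachable from s, found by BFS."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def reverse_graph(adj):
    """Build G^reverse: every edge (u, v) of G becomes (v, u)."""
    rev = {u: [] for u in adj}   # assumes every node is a key of adj
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)
    return rev


def is_strongly_connected(adj):
    """True iff every node is reachable from s in both G and G^reverse."""
    nodes = set(adj)
    if not nodes:
        return True              # convention chosen here for the empty graph
    s = next(iter(nodes))        # pick any node s
    return (reachable(adj, s) == nodes
            and reachable(reverse_graph(adj), s) == nodes)


print(is_strongly_connected({1: [2], 2: [3], 3: [1]}))  # True: directed 3-cycle
print(is_strongly_connected({1: [2], 2: [3], 3: []}))   # False: no path back to 1
```

Building G^reverse and running the two searches each take one pass over the adjacency lists, which matches the O(m + n) bound in the theorem.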
 
 
Theorem. [Tarjan 1972] Can find all strong components in O(m + n) time.

[Figure: first page of "Depth-First Search and Linear Graph Algorithms" by Robert Tarjan, SIAM J. Comput., Vol. 1, No. 2, June 1972]

‣ DAGs and topological ordering

Directed acyclic graphs

Def. A DAG is a directed graph that contains no directed cycles.
Def. A topological order of a directed graph G = (V, E) is an ordering of its nodes as v_1, v_2, …, v_n so that for every edge (v_i, v_j) we have i < j.

[Figure: a DAG on nodes v_1, …, v_7, and a topological ordering of it]

Precedence constraints

Precedence constraints. Edge (v_i, v_j) means task v_i must occur before v_j.

Applications.
Course prerequisite graph: course v_i must be taken before v_j.
Compilation: module v_i must be compiled before v_j.
Pipeline of computing jobs: output of job v_i needed to determine input of job v_j.

Directed acyclic graphs

Lemma. If G has a topological order, then G is a DAG.
Pf. [by contradiction]
Suppose that G has a topological order v_1, v_2, …, v_n and that G also has a directed cycle C. Let's see what happens.
Let v_i be the lowest-indexed node in C, and let v_j be the node just before v_i; thus (v_j, v_i) is an edge.
By our choice of i, we have i < j.
On the other hand, since (v_j, v_i) is an edge and v_1, v_2, …, v_n is a topological order, we must have j < i, a contradiction. ▪

[Figure: the directed cycle C, drawn on the supposed topological order v_1, …, v_n]

Q. Does every DAG have a topological ordering?
Q. If so, how do we compute one?

Lemma. If G is a DAG, then G has a node with no entering edges.
Pf. [by contradiction]
Suppose that G is a DAG and every node has at least one entering edge. Let's see what happens.
Pick any node v, and begin following edges backward from v. Since v has at least one entering edge (u, v) we can walk backward to u.
Then, since u has at least one entering edge (x, u), we can walk backward to x.
Repeat until we visit a node, say w, twice.
Let C denote the sequence of nodes encountered between successive visits to w. C is a cycle. ▪

[Figure: backward walk through v, u, x, … that returns to w]

Lemma. If G is a DAG, then G has a topological ordering.
Pf. [by induction on n]
Base case: true if n = 1.
Given DAG on n > 1 nodes, find a node v with no entering edges.
G – { v } is a DAG, since deleting v cannot create cycles.
By inductive hypothesis, G – { v } has a topological ordering.
Place v first in topological ordering; then append nodes of G – { v } in topological order. This is valid since v has no entering edges. ▪

Topological sorting algorithm: running time

Theorem. Algorithm finds a topological order in O(m + n) time.
Pf.
Maintain the following information:
  count(w) = remaining number of incoming edges
  S = set of remaining nodes with no incoming edges
Initialization: O(m + n) via single scan through graph.
Update: to delete v
  remove v from S
  decrement count(w) for all edges from v to w; and add w to S if count(w) hits 0
  this is O(1) per edge ▪
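The count(w) / S bookkeeping from the running-time proof can be turned into a short Python sketch. The function name topological_order is hypothetical, and the digraph is assumed to be a dict in which every node appears as a key mapped to the list of nodes its outgoing edges enter. Raising an error on a cyclic input is an addition for the sketch, not something the slides specify.

```python
from collections import deque


def topological_order(adj):
    """Return the nodes of a DAG in a topological order, in O(m + n) time."""
    # Initialization: one scan through the graph computes count(w).
    count = {v: 0 for v in adj}
    for u in adj:
        for w in adj[u]:
            count[w] += 1
    # S = remaining nodes with no incoming edges.
    S = deque(v for v in adj if count[v] == 0)

    order = []
    while S:
        v = S.popleft()           # remove v from S and "delete" it from G
        order.append(v)
        for w in adj[v]:          # O(1) work per edge leaving v
            count[w] -= 1
            if count[w] == 0:
                S.append(w)

    if len(order) != len(adj):    # some node never reached count 0
        raise ValueError("graph has a directed cycle, so no topological order exists")
    return order


dag = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(topological_order(dag))     # [1, 2, 3, 4]
```

A queue is used for S here, but any container with O(1) insert and remove would give the same bound; different choices simply produce different valid orderings.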