Question? Leave a message!




k-d Trees

k-d Trees
ECE 250 Algorithms and Data Structures kdTrees Douglas Wilhelm Harder, M.Math. LEL Department of Electrical and Computer Engineering University of Waterloo Waterloo, Ontario, Canada ece.uwaterloo.ca dwharderalumni.uwaterloo.ca © 20062013 by Douglas Wilhelm Harder. Some rights reserved.kd trees 2 kd trees This may be a “KD” tree, but it is not a kd tree http://www.kraft.com/kd trees 3 kd trees In a vain attempt to massproduce and improve their product, Kraft considers genetic engineering... – This is not a kd tree, either my apologies to http://www.explorecrete.com/kd trees 4 Outline In this topic, we will cover: – The idea behind a kd tree – Partitioning points in a kdimensional space – How subtrees restrict the region of its elements – Counting elements, searching, insertions, and nearest neighbour – Difficulties with removals and balancing – Other applicationskd trees 5 kd trees Used in image processing as a means of partitioning points in a k dimensional space Binary search trees are excellent at storing and searching data 1 dimensional (i.e., linear) spacekd trees 6 kd trees As these examples demonstrate: – with binary trees, it is easy to find all points which fall inside a given interval (a 1d box) – how do we store points so that we can find all points which fall within a • rectangle (a 2d box), or • a 3d boxkd trees 7 kd trees To solve this problem, we will define a binary tree where st – depths 0, k, 2k, etc., contain binary search nodes sorted on the 1 coordinate – depths 1, k + 1, 2k + 1, etc., contain binary search nodes nodes sorted nd on the 2 coordinate, and in general – depths j, k + j, 2k + j, etc., contain binary search nodes sorted on the st (k + 1) coordinatekd trees 8 kd trees Comment on the naming convention: – while the k refers to the dimension, it is common to refer to such a tree as, for example, a 3dimensional kd tree instead of a 3d tree, as numerous data structures could be referred to as 3d trees kd trees 9 kd trees Given a set of points, we can construct a balanced kd tree by always using the median element with respect the coordinate being searched on We will assume no duplication, but if duplication occurs, we can choose any of the median elementskd trees 10 kd trees Suppose we wish to partition the following points in a 2dimensional kd tree: (0.03, 0.90), (0.37, 0.04), (0.56, 0.78), (0.01, 0.48), (0.41, 0.89), (0.95, 0.07), (0.97, 0.09), (0.54, 0.65), (0.04, 0.61), (0.73, 0.69), (0.46, 0.58), (0.08, 0.89), (0.04, 0.41), (0.94, 0.02), (0.33, 0.07), (0.55, 0.54), (0.06, 0.05), (0.04, 0.06), (0.74, 0.97), (0.29, 0.15), (0.05, 0.88), (0.23, 0.23), (0.55, 0.02), (0.02, 0.97), (0.05, 0.07), (0.06, 0.28), (0.09, 0.55), (0.02, 0.91), (0.05, 0.97), (0.68, 0.42), (0.97, 0.18)kd trees 11 kd trees st The first step is to order the points based on the 1 coordinate and find the median: (0.01, 0.48), (0.02, 0.91), (0.02, 0.97), (0.03, 0.90), (0.04, 0.06), (0.04, 0.41), (0.04, 0.61), (0.05, 0.07), (0.05, 0.88), (0.05, 0.97), (0.06, 0.05), (0.06, 0.28), (0.08, 0.89), (0.09, 0.55), (0.23, 0.23), (0.29, 0.15), (0.33, 0.07), (0.37, 0.04), (0.41, 0.89), (0.46, 0.58), (0.54, 0.65), (0.55, 0.02), (0.55, 0.54), (0.56, 0.78), (0.68, 0.42), (0.73, 0.69), (0.74, 0.97), (0.94, 0.02), (0.95, 0.07), (0.97, 0.09), (0.97, 0.18)kd trees 12 kd trees The median point, (0.29, 0.15), forms the root of our kd treekd trees 13 kd trees This partitions the remaining points into two sets: (0.01, 0.48), (0.02, 0.91), (0.02, 0.97), (0.03, 0.90), (0.04, 0.06), (0.04, 0.41), (0.04, 0.61), (0.05, 0.07), (0.05, 0.88), (0.05, 0.97), (0.06, 0.05), (0.06, 0.28), (0.08, 0.89), (0.09, 0.55), (0.23, 0.23) (0.33, 0.07), (0.37, 0.04), (0.41, 0.89), (0.46, 0.58), (0.54, 0.65), (0.55, 0.02), (0.55, 0.54), (0.56, 0.78), (0.68, 0.42), (0.73, 0.69), (0.74, 0.97), (0.94, 0.02), (0.95, 0.07), (0.97, 0.09), (0.97, 0.18)kd trees 14 kd trees nd Starting with the first partition, we order these according to the 2 coordinate: (0.06, 0.05), (0.04, 0.06), (0.05, 0.07), (0.23, 0.23), (0.06, 0.28), (0.04, 0.41), (0.01, 0.48), (0.09, 0.55), (0.04, 0.61), (0.03, 0.90), (0.02, 0.91), (0.02, 0.97), (0.05, 0.88), (0.08, 0.89), (0.05, 0.97)kd trees 15 kd trees This point creates the left child of the root kd trees 16 kd trees Starting with the second partition, we also order these according to nd the 2 coordinate: (0.55, 0.02), (0.94, 0.02), (0.37, 0.04), (0.33, 0.07), (0.95, 0.07), (0.97, 0.09), (0.97, 0.18), (0.68, 0.42), (0.55, 0.54), (0.46, 0.58), (0.54, 0.65), (0.73, 0.69), (0.56, 0.78), (0.41, 0.89), (0.74, 0.97)kd trees 17 kd trees This point creates the right child of the root kd trees 18 kd trees st Next, ordering the partitioned elements by the 1 coordinate, we choose the medians to find the children of the left child (0.09, 0.55): (0.01, 0.48), (0.04, 0.06), (0.04, 0.41), (0.05, 0.07), (0.06, 0.05), (0.06, 0.28), (0.23, 0.23), (0.02, 0.91), (0.02, 0.97), (0.03, 0.90), (0.04, 0.61), (0.05, 0.88), (0.05, 0.97), (0.08, 0.89) kd trees 19 kd trees Doing the same with the two right partitions, we get the children of the right child of the root: (0.33, 0.07), (0.37, 0.04), (0.55, 0.02), (0.94, 0.02), (0.95, 0.07), (0.97, 0.09), (0.97, 0.18) (0.41, 0.89), (0.46, 0.58), (0.54, 0.65), (0.55, 0.54), (0.56, 0.78), (0.73, 0.69), (0.74, 0.97)kd trees 20 kd trees nd At the next level, we order the points again based on the 2 coordinate and choose the medians: (0.04, 0.06), (0.04, 0.41), (0.01, 0.48) (0.06, 0.05), (0.23, 0.23), (0.06, 0.28) (0.03, 0.90), (0.02, 0.91), (0.02, 0.97) (0.05, 0.88), (0.08, 0.89), (0.05, 0.97) (0.55, 0.02), (0.37, 0.04), (0.33, 0.07) (0.95, 0.07), (0.97, 0.09), (0.97, 0.18) (0.46, 0.58), (0.54, 0.65), (0.41, 0.89) (0.73, 0.69), (0.56, 0.78), (0.74, 0.97)kd trees 21 kd trees The remaining points fit in their appropriate locationskd trees 22 kd trees The result is a 2dimensional kd tree of the given 31 pointskd trees 23 kd trees In this example, all points are in the box 0, 1 × 0, 1, however: – all points in the left subtree are in the box 0, 0.29 × 0, 1 – all points in the right subtree are in the box 0.29, 1 × 0, 1kd trees 24 kd trees Looking at the left subtree, all points fall within 0, 0.29 × 0, 1, and – all points in its left subtree are in 0, 0.29 × 0, 0.55 – all points in its right subtree are in 0.29, 1 × 0.55, 1kd trees 25 kd trees Following paths similarly limits the regions of the points within the subtrees In this example we will follow the red path starting at the rootkd trees 26 kd trees All points with (0.09, 0.55) as its root appear in the shaded area This region is restricted to 0, 0.29 × 0, 1kd trees 27 kd trees Moving down the tree, the points under (0.04, 0.61) are further restricted nd This restriction is in the 2 coordinate: 0, 0.29 × 0.55, 1kd trees 28 kd trees Stepping further down, all points below (0.08, 0.89) are again further restricted to 0.08, 0.29 × 0.55, 1kd trees 29 kd trees Finally, the last point, a leaf node, falls within the given boxkd trees 30 kd trees A useful application of a kd tree provides an efficient data structure for counting the number of points which fall within a given kdimensional rectanglekd trees 31 kd trees This is used in image processing: locating objects within a scene, ray tracing, etc. Find the points which lie in the quadrant 0.5, 1 × 0, 0.5kd trees 32 kd trees The traversal rules we will follow are: – we always match the coordinate corresponding to the level we are current at – if that coordinate is less than the corresponding interval of the box, we only need to visit the right subtree – if that coordinate is greater than the corresponding interval, we need only visit the left subtree – otherwise, we check if the root is in the box and we visit both subtreeskd trees 33 kd trees There is one special case: – Because we can track the subinterval under which a tree may fall, it may be possible to add an entire subtreekd trees 34 kd trees Starting at the root: st – We examine the 1 coordinate – Note: 0.29 0.5, 1 – We visit only the right subtreekd trees 35 kd trees Visiting the right subtree: 0.42 ∈ 0, 0.5 Also (0.68, 0.42) ∈ 0.5, 1 × 0, 0.5 and we visit both subtreeskd trees 36 kd trees Starting with the left subtree: 0.94 ∈ 0.5, 1 We note that (0.94, 0.02) ∈ 0.5, 1 × 0, 0.5 and we visit both subtreeskd trees 37 kd trees However, at this point, we notice that the right subtree is restricted to 0.94, 1 × 0, 0.42 and 0.94, 1 × 0, 0.42 ⊆ 0.5, 1 × 0, 0.5 Thus, the entire right subtree is in the boxkd trees 38 kd trees Continuing with the left subtree, we note that 0.04 ∈ 0, 0.5 We note that (0.37, 0.04) ∉ 0.5, 1 × 0, 0.5 but we still visit both subtreeskd trees 39 kd trees Inspecting the left leaf node (0.55, 0.02) ∈ 0.5, 1 × 0, 0.5 and we are finished kd trees 40 kd trees Inspecting the right leaf node, we note: (0.33, 0.07) ∉ 0.5, 1 × 0, 0.5 and we are finished kd trees 41 kd trees Continuing with the right subtree, we note that 0.55 ∈ 0.5, 1 (we must visit both subtrees) However, the point is not in the box 0.5, 1 × 0, 0.5kd trees 42 kd trees Visiting the next node, we note (0.54, 0.65) is not in the box and 0.65 0.5, hence we need only visit the left subtreekd trees 43 kd trees This leaf node is not in the regionkd trees 44 kd trees Visiting the next node, we note (0.56, 0.78) is not in the box and 0.56 0.5, hence we need only visit the left subtreekd trees 45 kd trees This leaf node is not in the region – The traversal is finishedkd trees 46 kd trees Thus, there are six points in the quadrant 0.5, 1 × 0, 0.5 – We had to visit fewer than half the points in the tree to determine thiskd trees 47 kd trees Suppose we want to add a new element to an existing kd tree For example, add the point (0.15, 0.66) to the kd tree which we just built: – we add the new element the same way we did with binary search trees, however, we must compare the appropriate coordinate at each levelkd trees 48 kd trees First, we compare the first coordinate: 0.15 0.29 Thus we add the new element to the left subtreekd trees 49 kd trees Next, we compare the second coordinate: 0.66 0.55 Thus we add the new element to the right subtreekd trees 50 kd trees Next, we compare the first coordinate again: 0.15 0.04 Thus we add the new element to the right subtreekd trees 51 kd trees Next, we compare the first coordinate: 0.66 0.89 Thus we add the new element to the left subtreekd trees 52 kd trees Finally, we compare the first coordinate again: 0.15 0.05 Thus we add the new element as a new leaf node to the rightkd trees 53 kd trees Thus, we have successfully added the point (0.15, 0.66) to the existing kd treekd trees 54 kd trees It is also a reasonably fast algorithm to find a point’s nearest neighbour The nearest neighbour is a point which is closest to, but not equal to the given node – first we search as if we are performing an insertion, but – we may have to back up the tree until we ensure that neighbouring regions may not have a point closerkd trees 55 kd trees Removing elements is more difficult, as we must maintain the structure Finding the “largest element” with respect to a particular coordinate may be very difficultkd trees 56 kd trees For example, consider removing the root node from the kd tree we previously constructed: We note: – the minimum element in the right subtree is not in the obvious location, and – neither is the maximum element in the left subtree...kd trees 57 kd trees It may be easier to flag removed nodes as being “deleted” and leave the tree unchanged otherwise With numerous removals, this may significantly increase the run time of other operations...kd trees 58 kd trees Balancing kd trees is much more difficult than balancing AVL trees You cannot perform rotations in the same way in which rotations are performed with AVL treeskd trees 59 kd trees Consider the following naïve rotation We will rotate (2, J) to the root and we reattach nodes (4, M) and (6, F) as appropriate, as well as two subtreeskd trees 60 kd trees Unfortunately, tree A may have: – elements less than 2, and – if nonempty, all elements are greater than Mkd trees 61 kd trees Similarly, tree B may have: – has all elements greater than 6 (good), but – there is no restriction which guarantees that the elements are all M or all Mkd trees 62 kd trees The use of kd trees is not only restricted to 2d, 3d, or higher dimensional spaces Consider the following problem: – suppose we have a database of phone numbers of individuals within a particular province – find all individuals 20 km from some point who are between the ages of 19 and 32 – we could design a 3dimensional kd tree based on the three coordinates (x, y, age)kd trees 63 kd trees The previous example introduces an interesting problem: 20 km from some point is a circular region – kd trees can only detect boxed regions – thus, we find the smallest bounding box of the region, find all points in the box, and select those within the regionkd trees 64 kd trees Suppose we wish to find all Blackberry users within UW east of Westmount Dr. kd trees 65 kd trees In the example, we started with a given set and by using the median, we arrived at a balanced tree If we are adding or removing points, we may require rotations to keep the tree balanced These rotations are more complex than rotations for, say, an AVL treekd trees 66 kd trees In this topic, we have covered redblack trees – Simple rules govern how nodes must be distributed based on giving each node a colour of either red or black – Insertions and deletions may be performed without recursing back to the root – Only one bit is required for the “colour” – This makes them, under some circumstances, more suited than AVL treeskd trees 67 References rd 1 Weiss, Data Structures and Algorithm Analysis in C++, 3 Ed., Addison Wesley, §12.6, p.54953.kd trees 68 Usage Notes • These slides are made publicly available on the web for anyone to use • If you choose to use them, or a part thereof, for a course at another institution, I ask only three things: – that you inform me that you are using the slides, – that you acknowledge my work, and – that you alert me of any mistakes which I made or changes which you make, and allow me the option of incorporating such changes (with an acknowledgment) in my set of slides Sincerely, Douglas Wilhelm Harder, MMath dwharderalumni.uwaterloo.ca
sharer
Presentations
Free
Document Information
Category:
Presentations
User Name:
Dr.SethPatton
User Type:
Teacher
Country:
United Kingdom
Uploaded Date:
22-07-2017