Chapter 7: Optimization of Distributed Queries
• Basic Concepts
• Distributed Cost Model
• Database Statistics
• Joins and Semijoins
• Query Optimization Algorithms
Acknowledgements: I am indebted to Arturas Mazeika for providing me with his slides for this course.
DDB 2008/09, J. Gamper

Basic Concepts
• Query optimization: Process of producing an optimal (or close to optimal) query execution plan, which represents an execution strategy for the query
– The main task in query optimization is to consider different orderings of the operations
• Centralized query optimization:
– Find (the best) query execution plan in the
space of equivalent query trees
– Minimize an objective cost function
– Gather statistics about relations
• Distributed query optimization brings additional issues
– Linear query trees are not necessarily a good choice
– Bushy query trees are not necessarily a bad choice
– What and where to ship the relations
– How to ship relations (ship as a whole, ship as needed)
– When to use semi-joins instead of joins
Basic Concepts . . .
• Search space: The set of alternative query execution plans (query trees)
– Typically very large
– The main issue is to optimize the joins
– For N relations, there are O(N!) equivalent join trees that can be obtained by applying commutativity and associativity rules
• Example: 3 equivalent query trees (join trees) of the joins in the following query
FROM EMP, ASG, PROJ
WHERE EMP.ENO=ASG.ENO AND ASG.PNO=PROJ.PNO
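For illustration, here is a small Python sketch (not part of the slides) that enumerates every ordered binary join tree over a set of relations, applying commutativity and associativity freely. For the three relations of the example it finds 12 trees, and the count grows factorially with N, consistent with the search-space size noted above.

```python
def join_trees(relations):
    """Enumerate all ordered binary join trees (as nested tuples) over the
    given relations, applying commutativity and associativity freely."""
    rels = list(relations)
    if len(rels) == 1:
        return [rels[0]]
    trees = []
    full = (1 << len(rels)) - 1
    # Split the relations into every non-empty left/right partition.
    # Commutativity is covered because both (L, R) and (R, L) are generated.
    for mask in range(1, full):
        left = [r for b, r in enumerate(rels) if mask >> b & 1]
        right = [r for b, r in enumerate(rels) if not mask >> b & 1]
        for lt in join_trees(left):
            for rt in join_trees(right):
                trees.append((lt, rt))
    return trees

print(len(join_trees(["EMP", "ASG", "PROJ"])))  # 12 ordered join trees
```

The exact count is (2(N-1))!/(N-1)! ordered trees (2 for two relations, 12 for three, 120 for four), which is why exhaustive enumeration quickly becomes infeasible.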
Basic Concepts . . .
• Reduction of the search space
– Restrict by means of heuristics
∗ Perform unary operations before binary operations, etc
– Restrict the shape of the join tree
∗ Consider the type of trees (linear trees vs. bushy trees)

[Figures: a linear join tree and a bushy join tree]
Basic Concepts . . .
• There are two main strategies for scanning the search space: deterministic and randomized
• Deterministic scan of the search space
– Start from base relations and build plans by adding one relation at each step
– Breadth-first strategy: build all possible plans before choosing the “best” plan (dynamic programming approach)
– Depth-first strategy: build only one plan (greedy approach)
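The depth-first (greedy) strategy can be sketched in a few lines of Python. This is a toy illustration, not from the slides: at each step it joins the pair of sub-plans with the smallest estimated result cardinality; the selectivity function and cardinalities passed in are hypothetical.

```python
def greedy_join_order(cards, sf):
    """Greedy (depth-first) plan builder: repeatedly join the pair of
    sub-plans with the smallest estimated join cardinality.
    cards: {relation_name: cardinality}; sf(a, b): selectivity factor."""
    plans = dict(cards)
    while len(plans) > 1:
        # Pick the cheapest pair by estimated join cardinality.
        a, b = min(
            ((a, b) for a in plans for b in plans if a < b),
            key=lambda p: sf(p[0], p[1]) * plans[p[0]] * plans[p[1]],
        )
        plans[f"({a} JOIN {b})"] = sf(a, b) * plans.pop(a) * plans.pop(b)
    return next(iter(plans))

cards = {"EMP": 400, "ASG": 1000, "PROJ": 100}
plan = greedy_join_order(cards, lambda a, b: 0.001)  # constant SF, for demo
print(plan)
```

Unlike the breadth-first dynamic-programming approach, only one plan is ever materialized, so optimization is cheap but the result may be far from optimal.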
Basic Concepts . . .
• Randomized scan of the search space
– Search for optimal solutions around a particular starting point
– e.g., iterative improvement or simulated annealing techniques
– Trades optimization time for execution time
∗ Does not guarantee that the best solution is obtained, but avoids the high cost of exhaustive optimization
– The strategy is better when more than 5-6 relations are involved
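An iterative-improvement search of the join-order space can be sketched as follows. This is my own minimal illustration: starting from random orders, it applies random swaps and keeps only non-worsening moves; the cost function is supplied by the caller.

```python
import random

def iterative_improvement(relations, cost, restarts=5, steps=200, seed=0):
    """Randomized scan: from several random join orders, apply random swaps
    and keep only moves that do not increase the caller-supplied cost()."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(restarts):
        order = list(relations)
        rng.shuffle(order)
        cur = cost(order)
        for _ in range(steps):
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]
            new = cost(order)
            if new <= cur:
                cur = new  # keep the improving (or equal-cost) move
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
        if cur < best_cost:
            best, best_cost = list(order), cur
    return best, best_cost
```

Simulated annealing works similarly but also accepts some worsening moves (with decreasing probability) to escape local minima.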
Distributed Cost Model
• Two different types of cost functions can be used
– Reduce total time
∗ Reduce each cost component (in terms of time) individually, i.e., do as little for each
cost component as possible
∗ Optimize the utilization of the resources (i.e., increase system throughput)
– Reduce response time
∗ Do as many things in parallel as possible
∗ May increase total time because of increased total activity
Distributed Cost Model . . .
• Total time: Sum of the time of all individual components
– Local processing time: CPU time + I/O time
– Communication time: fixed time to initiate a message + time to transmit the data

Total time = T_CPU ∗ #instructions + T_I/O ∗ #I/Os + T_MSG ∗ #messages + T_TR ∗ #bytes
• The individual components of the total cost have different weights:
– Wide area network
∗ Message initiation and transmission costs are high
∗ Local processing cost is low (fast mainframes or minicomputers)
∗ Ratio of communication to I/O costs is 20:1
– Local area networks
∗ Communication and local processing costs are more or less equal
∗ Ratio of communication to I/O costs is 1:1.6 (10MB/s network)
Distributed Cost Model . . .
• Response time: Elapsed time between the initiation and the completion of a query
Response time = T_CPU ∗ seq_#instructions + T_I/O ∗ seq_#I/Os + T_MSG ∗ seq_#messages + T_TR ∗ seq_#bytes

– where seq_#x (x ∈ {instructions, I/Os, messages, bytes}) is the maximum number of x which must be done sequentially
• Any processing and communication done in parallel is ignored
Distributed Cost Model . . .
• Example: Query at site 3 with data from sites 1 and 2.
– Assume that only the communication cost is considered
– Total time = 2 ∗ T_MSG + T_TR ∗ (x + y)
– Response time = max{T_MSG + T_TR ∗ x, T_MSG + T_TR ∗ y}
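Plugging illustrative numbers into the example makes the difference between the two measures concrete; the coefficient values and data volumes below are my own assumptions, not from the slides.

```python
T_MSG = 0.1    # time to initiate one message (s) -- assumed
T_TR = 0.001   # time to transmit one data unit (s) -- assumed
x, y = 1000, 2000  # units shipped from site 1 and site 2, respectively

# Total time counts both transfers sequentially.
total_time = 2 * T_MSG + T_TR * (x + y)
# Response time assumes the two transfers run in parallel.
response_time = max(T_MSG + T_TR * x, T_MSG + T_TR * y)

print(total_time, response_time)
```

Here total time is 3.2 s while response time is only 2.1 s: parallel communication shortens the elapsed time even though total system activity is unchanged.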
Database Statistics
• The primary cost factor is the size of intermediate relations
– that are produced during the execution and
must be transmitted over the network, if a subsequent operation is located on a different site
• It is costly to compute the size of the intermediate relations precisely.
• Instead, global statistics of relations and fragments are computed and used to approximate the sizes of intermediate relations
Database Statistics . . .
• Let R(A_1, A_2, ..., A_n) be a relation fragmented into R_1, R_2, ..., R_r.
• Relation statistics
– min and max values of each attribute: min(A_i), max(A_i)
– length of each attribute: length(A_i)
– number of distinct values in each fragment (cardinality): card(A_i)
• Fragment statistics
– cardinality of the fragment: card(R_j)
– cardinality of each attribute of each fragment: card(Π_{A_i}(R_j))
Database Statistics . . .
• Selectivity factor of an operation: the proportion of tuples of an operand relation that
participate in the result of that operation
• Assumption: independent attributes and uniform distribution of attribute values
• Selectivity factor of selection
SF_σ(A = value) = 1 / card(Π_A(R))
SF_σ(A > value) = (max(A) − value) / (max(A) − min(A))
SF_σ(A < value) = (value − min(A)) / (max(A) − min(A))
• Properties of the selectivity factor of the selection
SF_σ(p(A_i) ∧ p(A_j)) = SF_σ(p(A_i)) ∗ SF_σ(p(A_j))
SF_σ(p(A_i) ∨ p(A_j)) = SF_σ(p(A_i)) + SF_σ(p(A_j)) − SF_σ(p(A_i)) ∗ SF_σ(p(A_j))
SF_σ(A ∈ values) = SF_σ(A = value) ∗ card(values)
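These selection-selectivity formulas translate directly into code. The following sketch is not from the slides (function names are mine) and assumes, as stated above, independent attributes with uniformly distributed values.

```python
def sf_eq(card_a):
    """SF(A = value) = 1 / card(Pi_A(R)): one value out of card_a distinct ones."""
    return 1.0 / card_a

def sf_gt(value, min_a, max_a):
    """SF(A > value) = (max(A) - value) / (max(A) - min(A))."""
    return (max_a - value) / (max_a - min_a)

def sf_lt(value, min_a, max_a):
    """SF(A < value) = (value - min(A)) / (max(A) - min(A))."""
    return (value - min_a) / (max_a - min_a)

def sf_and(sf_p, sf_q):
    """Conjunction: product, under the independence assumption."""
    return sf_p * sf_q

def sf_or(sf_p, sf_q):
    """Disjunction: inclusion-exclusion, under the independence assumption."""
    return sf_p + sf_q - sf_p * sf_q

def sf_in(sf_eq_value, n_values):
    """SF(A IN values) = SF(A = value) * card(values)."""
    return sf_eq_value * n_values
```

For example, with 100 distinct values `sf_eq(100)` yields 0.01, and `sf_gt(50, 0, 100)` yields 0.5.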
Database Statistics . . .
• Cardinality of intermediate results
– Selection
card(σ_P(R)) = SF_σ(P) ∗ card(R)
– Projection
∗ More difficult: duplicates and correlations between projected attributes are unknown
∗ Simple if the projected attribute is a key
card(Π_A(R)) = card(R)
– Cartesian Product
card(R × S) = card(R) ∗ card(S)
– Union
∗ upper bound: card(R ∪ S) ≤ card(R) + card(S)
∗ lower bound: card(R ∪ S) ≥ max{card(R), card(S)}
– Set Difference
∗ upper bound: card(R − S) = card(R)
∗ lower bound: 0
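A minimal sketch of these cardinality estimates in Python (names are mine; where only bounds are known, a `(lower, upper)` pair is returned):

```python
def card_selection(sf_p, card_r):
    """card(sigma_P(R)) = SF(P) * card(R)."""
    return sf_p * card_r

def card_product(card_r, card_s):
    """card(R x S) = card(R) * card(S)."""
    return card_r * card_s

def card_union_bounds(card_r, card_s):
    """card(R union S): at least the larger operand, at most the sum."""
    return max(card_r, card_s), card_r + card_s

def card_difference_bounds(card_r, card_s):
    """card(R - S): 0 when S covers R, card(R) when they are disjoint."""
    return 0, card_r
```

A query optimizer typically propagates such point estimates (or bounds) bottom-up through the query tree.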
Database Statistics . . .
• Selectivity factor for joins
SF_⋈ = card(R ⋈ S) / (card(R) ∗ card(S))
• Cardinality of joins
– Upper bound: cardinality of the Cartesian product, card(R) ∗ card(S)
– General case (if SF_⋈ is given):
card(R ⋈ S) = SF_⋈ ∗ card(R) ∗ card(S)
– Special case: R.A is a key of R and S.A is a foreign key of S
∗ each S-tuple matches with at most one tuple of R
card(R ⋈_A S) = card(S)
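The join-cardinality cases above can be combined into one estimator. A sketch under the stated assumptions (the function and keyword-argument names are mine):

```python
def card_join(card_r, card_s, sf_join=None, s_references_r=False):
    """Estimate card(R JOIN S).

    - If S.A is a foreign key referencing key R.A, each S-tuple matches
      at most one R-tuple, so the result has card(S) tuples.
    - Otherwise, with a known join selectivity factor, use
      SF * card(R) * card(S).
    - With no information, fall back on the Cartesian-product upper bound.
    """
    if s_references_r:
        return card_s
    if sf_join is not None:
        return sf_join * card_r * card_s
    return card_r * card_s
```

For example, joining EMP (400 tuples) with ASG (1000 tuples) on a key/foreign-key pair yields 1000 tuples, whereas with an assumed SF of 0.001 the general formula yields 400.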
Database Statistics . . .
• Selectivity factor for semijoins: fraction of R-tuples that join with S-tuples
– An approximation is the selectivity of A in S:
SF_⋉(R ⋉_A S) = SF_⋉(S.A) = card(Π_A(S)) / card(dom[A])
• Cardinality of semijoin (general case):
card(R ⋉_A S) = SF_⋉(S.A) ∗ card(R)
• Example: R.A is a foreign key in S (S.A is a primary key)
– Then SF_⋉ = 1 and the result size corresponds to the size of R: card(R ⋉_A S) = card(R)
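A sketch of the semijoin estimate, plus a back-of-the-envelope check for when a semijoin-based strategy ships fewer bytes than moving R wholesale. The byte-size model in the second function is a simplification of my own, not from the slides:

```python
def card_semijoin(card_r, card_pi_a_s, card_dom_a):
    """card(R SEMIJOIN_A S) ~= SF(S.A) * card(R),
    with SF(S.A) estimated as card(Pi_A(S)) / card(dom[A])."""
    return (card_pi_a_s / card_dom_a) * card_r

def semijoin_pays_off(size_r, size_pi_a_s, sf):
    """Shipping Pi_A(S) to R's site and then the reduced R beats shipping
    all of R when size(Pi_A(S)) + SF * size(R) < size(R)."""
    return size_pi_a_s + sf * size_r < size_r
```

Under this model, a semijoin is attractive when the join column is small and the selectivity is low; when SF is close to 1 (as in the foreign-key example above), the reduction buys nothing and a plain join is cheaper.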
Join Ordering in Fragment Queries
• Join ordering is an important aspect in centralized DBMS, and it is even more
important in a DDBMS since joins between fragments that are stored at different sites
may increase the communication time.
• Two approaches exist:
– Optimize the ordering of joins directly
∗ INGRES and Distributed INGRES
∗ System R and System R*
– Replace joins by combinations of semijoins in order to minimize the communication time
∗ Hill Climbing and SDD-1
Join Ordering in Fragment Queries . . .
• Direct join ordering of two relations/fragments located at different sites
– Move the smaller relation to the other site
– We have to estimate the sizes of R and S
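For two relations this rule is trivial to encode; a sketch (sizes in bytes, names mine):

```python
def shipping_plan(size_r, size_s):
    """Direct join of R and S stored at different sites: move the smaller
    relation to the other site to minimize the data transferred."""
    return "ship R to site of S" if size_r <= size_s else "ship S to site of R"
```

The decision is only as good as the size estimates, which is why the statistics of the previous section matter.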
Join Ordering in Fragment Queries . . .
• Direct join ordering of queries involving more than two relations is substantially more complex
• Example: Consider the following query and the respective join graph, where we also make assumptions about the locations of the three relations/fragments (EMP at site 1, ASG at site 2, PROJ at site 3)

PROJ ⋈ ASG ⋈ EMP
Join Ordering in Fragment Queries . . .
• Example (contd.): The query can be evaluated in at least 5 different ways.
– Plan 1: EMP → Site 2
Site 2: EMP’ = EMP ⋈ ASG
EMP’ → Site 3
Site 3: EMP’ ⋈ PROJ
– Plan 2: ASG → Site 1
Site 1: EMP’ = EMP ⋈ ASG
EMP’ → Site 3
Site 3: EMP’ ⋈ PROJ
– Plan 3: ASG → Site 3
Site 3: ASG’ = ASG ⋈ PROJ
ASG’ → Site 1
Site 1: ASG’ ⋈ EMP
– Plan 4: PROJ → Site 2
Site 2: PROJ’ = PROJ ⋈ ASG
PROJ’ → Site 1
Site 1: PROJ’ ⋈ EMP
– Plan 5: EMP → Site 2, PROJ → Site 2
Site 2: EMP ⋈ PROJ ⋈ ASG
• To select a plan, a lot of information is needed, including
– size(EMP), size(ASG), size(PROJ), size(EMP ⋈ ASG), size(ASG ⋈ PROJ), etc.
– possibilities of parallel execution if response time is used
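Under purely hypothetical size estimates (the slides only say that such figures are needed), the transfer volumes of the five plans can be compared. In the sketch below, EMP’ = EMP ⋈ ASG, ASG’ = ASG ⋈ PROJ, and PROJ’ = PROJ ⋈ ASG; all numbers are invented for illustration.

```python
# Assumed sizes (in KB), including the intermediate join results.
size = {"EMP": 100, "ASG": 200, "PROJ": 300,
        "EMP'": 50,   # EMP JOIN ASG
        "ASG'": 120,  # ASG JOIN PROJ
        "PROJ'": 80}  # PROJ JOIN ASG

shipped = {  # relations each plan sends over the network
    "Plan 1": ["EMP", "EMP'"],    # EMP -> site 2, EMP' -> site 3
    "Plan 2": ["ASG", "EMP'"],    # ASG -> site 1, EMP' -> site 3
    "Plan 3": ["ASG", "ASG'"],    # ASG -> site 3, ASG' -> site 1
    "Plan 4": ["PROJ", "PROJ'"],  # PROJ -> site 2, PROJ' -> site 1
    "Plan 5": ["EMP", "PROJ"],    # EMP and PROJ -> site 2
}
cost = {plan: sum(size[r] for r in rels) for plan, rels in shipped.items()}
best = min(cost, key=cost.get)
print(best, cost[best])
```

With these assumed sizes, Plan 1 ships the least data (150 KB); different estimates of the intermediate results would change the winner, which is exactly why the optimizer needs the statistics listed above.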