Planning problems in Artificial Intelligence

In which we see how an agent can take advantage of the structure of a problem to construct complex plans of action.

The task of coming up with a sequence of actions that will achieve a goal is called planning. We have seen two examples of planning agents so far: the search-based problem-solving agent of Chapter 3 and the logical planning agent of Chapter 10. This chapter is concerned primarily with scaling up to complex planning problems that defeat the approaches we have seen so far. Section 11.1 develops an expressive yet carefully constrained language for representing planning problems, including actions and states. The language is closely related to the propositional and first-order representations of actions in Chapters 7 and 10. Section 11.2 shows how forward and backward search algorithms can take advantage of this representation, primarily through accurate heuristics that can be derived automatically from the structure of the representation. (This is analogous to the way in which effective heuristics were constructed for constraint satisfaction problems in Chapter 5.) Sections 11.3 through 11.5 describe planning algorithms that go beyond forward and backward search, taking advantage of the representation of the problem. In particular, we explore approaches that are not constrained to consider only totally ordered sequences of actions.

For this chapter, we consider only environments that are fully observable, deterministic, finite, static (change happens only when the agent acts), and discrete (in time, action, objects, and effects). These are called classical planning environments. In contrast, nonclassical planning is for partially observable or stochastic environments and involves a different set of algorithms and agent designs, outlined in Chapters 12 and 17.

11.1 THE PLANNING PROBLEM

Let us consider what can happen when an ordinary problem-solving agent using standard search algorithms (depth-first, A*, and so on) comes up against large, real-world problems. That will help us design better planning agents.

The most obvious difficulty is that the problem-solving agent can be overwhelmed by irrelevant actions. Consider the task of buying a copy of AI: A Modern Approach from an online bookseller. Suppose there is one buying action for each 10-digit ISBN number, for a total of 10 billion actions. The search algorithm would have to examine the outcome states of all 10 billion actions to find one that satisfies the goal, which is to own a copy of ISBN 0137903952. A sensible planning agent, on the other hand, should be able to work back from an explicit goal description such as Have(ISBN0137903952) and generate the action Buy(ISBN0137903952) directly. To do this, the agent simply needs the general knowledge that Buy(x) results in Have(x). Given this knowledge and the goal, the planner can decide in a single unification step that Buy(ISBN0137903952) is the right action.

The next difficulty is finding a good heuristic function. Suppose the agent's goal is to buy four different books online. Then there will be 10^40 plans of just four steps, so searching without an accurate heuristic is out of the question. It is obvious to a human that a good heuristic estimate for the cost of a state is the number of books that remain to be bought; unfortunately, this insight is not obvious to a problem-solving agent, because it sees the goal test only as a black box that returns true or false for each state.
Therefore, the problem-solving agent lacks autonomy; it requires a human to supply a heuristic function for each new problem. On the other hand, if a planning agent has access to an explicit representation of the goal as a conjunction of subgoals, then it can use a single domain-independent heuristic: the number of unsatisfied conjuncts. For the book-buying problem, the goal would be Have(A) ∧ Have(B) ∧ Have(C) ∧ Have(D), and a state containing Have(A) ∧ Have(C) would have cost 2. Thus, the agent automatically gets the right heuristic for this problem, and for many others. We shall see later in the chapter how to construct more sophisticated heuristics that examine the available actions as well as the structure of the goal.

Finally, the problem solver might be inefficient because it cannot take advantage of problem decomposition. Consider the problem of delivering a set of overnight packages to their respective destinations, which are scattered across Australia. It makes sense to find the nearest airport for each destination and divide the overall problem into several subproblems, one for each airport. Within the set of packages routed through a given airport, whether further decomposition is possible depends on the destination city. We saw in Chapter 5 that the ability to do this kind of decomposition contributes to the efficiency of constraint satisfaction problem solvers. The same holds true for planners: in the worst case, it can take O(n!) time to find the best plan to deliver n packages, but only O((n/k)! × k) time if the problem can be decomposed into k equal parts. As we noted in Chapter 5, perfectly decomposable problems are delicious but rare. (Notice that even the delivery problem is not perfectly decomposable. There may be cases in which it is better to assign packages to a more distant airport if that renders a flight to the nearest airport unnecessary. Nevertheless, most delivery companies prefer the computational and organizational simplicity of sticking with decomposed solutions.)

The design of many planning systems, particularly the partial-order planners described in Section 11.3, is based on the assumption that most real-world problems are nearly decomposable. That is, the planner can work on subgoals independently, but might need to do some additional work to combine the resulting subplans. For some problems, this assumption breaks down because working on one subgoal is likely to undo another subgoal. These interactions among subgoals are what makes puzzles (like the 8-puzzle) puzzling.

The language of planning problems

The preceding discussion suggests that the representation of planning problems (states, actions, and goals) should make it possible for planning algorithms to take advantage of the logical structure of the problem. The key is to find a language that is expressive enough to describe a wide variety of problems, but restrictive enough to allow efficient algorithms to operate over it. In this section, we first outline the basic representation language of classical planners, known as the STRIPS language. (STRIPS stands for STanford Research Institute Problem Solver.) Later, we point out some of the many possible variations in STRIPS-like languages.

Representation of states. Planners decompose the world into logical conditions and represent a state as a conjunction of positive literals. We will consider propositional literals; for example, Poor ∧ Unknown might represent the state of a hapless agent.
We will also use first-order literals; for example, At(Plane1, Melbourne) ∧ At(Plane2, Sydney) might represent a state in the package delivery problem. Literals in first-order state descriptions must be ground and function-free. Literals such as At(x, y) or At(Father(Fred), Sydney) are not allowed. The closed-world assumption is used, meaning that any conditions that are not mentioned in a state are assumed false.

Representation of goals. A goal is a partially specified state, represented as a conjunction of positive ground literals, such as Rich ∧ Famous or At(P2, Tahiti). A propositional state s satisfies a goal g if s contains all the atoms in g (and possibly others). For example, the state Rich ∧ Famous ∧ Miserable satisfies the goal Rich ∧ Famous.

Representation of actions. An action is specified in terms of the preconditions that must hold before it can be executed and the effects that ensue when it is executed. For example, an action for flying a plane from one location to another is:

    Action(Fly(p, from, to),
      PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
      EFFECT: ¬At(p, from) ∧ At(p, to))

This is more properly called an action schema, meaning that it represents a number of different actions that can be derived by instantiating the variables p, from, and to to different constants. In general, an action schema consists of three parts:

- The action name and parameter list, for example Fly(p, from, to), serves to identify the action.
- The precondition is a conjunction of function-free positive literals stating what must be true in a state before the action can be executed. Any variables in the precondition must also appear in the action's parameter list.
- The effect is a conjunction of function-free literals describing how the state changes when the action is executed. A positive literal P in the effect is asserted to be true in the state resulting from the action, whereas a negative literal ¬P is asserted to be false. Variables in the effect must also appear in the action's parameter list. To improve readability, some planning systems divide the effect into the add list for positive literals and the delete list for negative literals.

Having defined the syntax for representations of planning problems, we can now define the semantics. The most straightforward way to do this is to describe how actions affect states. (An alternative method is to specify a direct translation into successor-state axioms, whose semantics comes from first-order logic; see Exercise 11.3.) First, we say that an action is applicable in any state that satisfies the precondition; otherwise, the action has no effect. For a first-order action schema, establishing applicability will involve a substitution θ for the variables in the precondition. For example, suppose the current state is described by

    At(P1, JFK) ∧ At(P2, SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO).

This state satisfies the precondition

    At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)

with substitution {p/P1, from/JFK, to/SFO} (among others; see Exercise 11.2). Thus, the concrete action Fly(P1, JFK, SFO) is applicable.
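To make this concrete, here is a minimal sketch (an illustration, not code from the text) of how a ground state and a fully instantiated action could be represented in Python. A state is a set of positive ground literals, written here as plain strings; the GroundAction class and the literal spellings are assumptions made for this sketch, and the schema Fly(p, from, to) is shown already instantiated with {p/P1, from/JFK, to/SFO}.

    # A state is a frozenset of ground positive literals (closed-world assumption:
    # anything not in the set is false).
    state = frozenset({"At(P1,JFK)", "At(P2,SFO)", "Plane(P1)", "Plane(P2)",
                       "Airport(JFK)", "Airport(SFO)"})

    class GroundAction:
        """A fully instantiated STRIPS action: precondition, add list, delete list."""
        def __init__(self, name, precond, add, delete):
            self.name = name
            self.precond = frozenset(precond)   # positive literals that must hold
            self.add = frozenset(add)           # positive effect literals
            self.delete = frozenset(delete)     # literals made false by the action

        def applicable(self, state):
            # An action is applicable in any state that satisfies its precondition.
            return self.precond <= state

    fly_p1_jfk_sfo = GroundAction(
        name="Fly(P1,JFK,SFO)",
        precond={"At(P1,JFK)", "Plane(P1)", "Airport(JFK)", "Airport(SFO)"},
        add={"At(P1,SFO)"},
        delete={"At(P1,JFK)"})

    print(fly_p1_jfk_sfo.applicable(state))   # True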
Starting in state s, the result of executing an applicable action a is a state s′ that is the same as s, except that any positive literal P in the effect of a is added to s′ and any negative literal ¬P is removed from s′. Thus, after Fly(P1, JFK, SFO), the current state becomes

    At(P1, SFO) ∧ At(P2, SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO).

Note that if a positive effect is already in s it is not added twice, and if a negative effect is not in s, then that part of the effect is ignored. This definition embodies the so-called STRIPS assumption: every literal not mentioned in the effect remains unchanged. In this way, STRIPS avoids the representational frame problem described in Chapter 10.

Finally, we can define the solution for a planning problem. In its simplest form, this is just an action sequence that, when executed in the initial state, results in a state that satisfies the goal. Later in the chapter, we will allow solutions to be partially ordered sets of actions, provided that every action sequence that respects the partial order is a solution.
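Continuing the illustrative Python sketch from above (again an assumption made for exposition, not the book's code), the result of an action and the test of whether an action sequence is a solution follow directly from the STRIPS assumption:

    from typing import FrozenSet, Iterable

    Literal = str
    State = FrozenSet[Literal]

    def result(state: State, add: Iterable[Literal], delete: Iterable[Literal]) -> State:
        # STRIPS assumption: every literal not mentioned in the effect is unchanged.
        # Adding a literal that is already present is harmless for a set, and
        # deleting an absent literal is simply ignored.
        return frozenset((state - frozenset(delete)) | frozenset(add))

    def satisfies(state: State, goal: Iterable[Literal]) -> bool:
        return frozenset(goal) <= state

    def is_solution(initial: State, goal, plan) -> bool:
        """plan is a sequence of (precond, add, delete) triples of ground literals."""
        state = initial
        for precond, add, delete in plan:
            if not frozenset(precond) <= state:   # action not applicable
                return False
            state = result(state, add, delete)
        return satisfies(state, goal)

    # Example: fly P1 from JFK to SFO and check the goal At(P1,SFO).
    init = frozenset({"At(P1,JFK)", "Plane(P1)", "Airport(JFK)", "Airport(SFO)"})
    fly = ({"At(P1,JFK)", "Plane(P1)", "Airport(JFK)", "Airport(SFO)"},
           {"At(P1,SFO)"}, {"At(P1,JFK)"})
    print(is_solution(init, {"At(P1,SFO)"}, [fly]))   # True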
Expressiveness and extensions

The various restrictions imposed by the STRIPS representation were chosen in the hope of making planning algorithms simpler and more efficient, without making it too difficult to describe real problems. One of the most important restrictions is that literals be function-free. With this restriction, we can be sure that any action schema for a given problem can be propositionalized, that is, turned into a finite collection of purely propositional action representations with no variables. (See Chapter 9 for more on this topic.) For example, in the air cargo domain, for a problem with 10 planes and five airports we could translate the Fly(p, from, to) schema into 10 × 5 × 5 = 250 purely propositional actions. The planners in Sections 11.4 and 11.5 work directly with propositionalized descriptions. If we allowed function symbols, infinitely many states and actions could be constructed.

In recent years, it has become clear that STRIPS is insufficiently expressive for some real domains. As a result, many language variants have been developed. Figure 11.1 briefly describes one important one, the Action Description Language, or ADL, by comparing it with the basic STRIPS language. In ADL, the Fly action could be written as

    Action(Fly(p : Plane, from : Airport, to : Airport),
      PRECOND: At(p, from) ∧ (from ≠ to)
      EFFECT: ¬At(p, from) ∧ At(p, to)).

The notation p : Plane in the parameter list is an abbreviation for Plane(p) in the precondition; this adds no expressive power, but can be easier to read. (It also cuts down on the number of possible propositional actions that can be constructed.) The precondition (from ≠ to) expresses the fact that a flight cannot be made from an airport to itself. This could not be expressed succinctly in STRIPS.

    STRIPS: Only positive literals in states (Poor ∧ Unknown).
    ADL:    Positive and negative literals in states (¬Rich ∧ ¬Famous).

    STRIPS: Closed-world assumption: unmentioned literals are false.
    ADL:    Open-world assumption: unmentioned literals are unknown.

    STRIPS: Effect P ∧ ¬Q means add P and delete Q.
    ADL:    Effect P ∧ ¬Q means add P and ¬Q and delete ¬P and Q.

    STRIPS: Only ground literals in goals (Rich ∧ Famous).
    ADL:    Quantified variables in goals: ∃x At(P1, x) ∧ At(P2, x) is the goal
            of having P1 and P2 in the same place.

    STRIPS: Goals are conjunctions (Rich ∧ Famous).
    ADL:    Goals allow conjunction and disjunction (¬Poor ∧ (Famous ∨ Smart)).

    STRIPS: Effects are conjunctions.
    ADL:    Conditional effects allowed: when P: E means E is an effect only if P is satisfied.

    STRIPS: No support for equality.
    ADL:    Equality predicate (x = y) is built in.

    STRIPS: No support for types.
    ADL:    Variables can have types, as in (p : Plane).

    Figure 11.1  Comparison of the STRIPS and ADL languages for representing planning
    problems. In both cases, goals behave as the preconditions of an action with no parameters.

The various planning formalisms used in AI have been systematized within a standard syntax called the Planning Domain Definition Language, or PDDL. This language allows researchers to exchange benchmark problems and compare results. PDDL includes sublanguages for STRIPS, ADL, and the hierarchical task networks we will see in Chapter 12.

The STRIPS and ADL notations are adequate for many real domains. The subsections that follow show some simple examples. There are still some significant restrictions, however. The most obvious is that they cannot represent in a natural way the ramifications of actions. For example, if there are people, packages, or dust motes in the airplane, then they too change location when the plane flies. We could represent these changes as direct effects of flying, but it seems more natural to represent the location of the plane's contents as a logical consequence of the location of the plane. We will see more examples of such state constraints in Section 11.5. Classical planning systems do not even attempt to address the qualification problem: the problem of unrepresented circumstances that could cause an action to fail. We will see how to address qualifications in Chapter 12.

Example: Air cargo transport

Figure 11.2 shows an air cargo transport problem involving loading and unloading cargo onto and off of planes and flying it from place to place. The problem can be defined with three actions: Load, Unload, and Fly. The actions affect two predicates: In(c, p) means that cargo c is inside plane p, and At(x, a) means that object x (either plane or cargo) is at airport a.

    Init(At(C1, SFO) ∧ At(C2, JFK) ∧ At(P1, SFO) ∧ At(P2, JFK)
         ∧ Cargo(C1) ∧ Cargo(C2) ∧ Plane(P1) ∧ Plane(P2)
         ∧ Airport(JFK) ∧ Airport(SFO))
    Goal(At(C1, JFK) ∧ At(C2, SFO))
    Action(Load(c, p, a),
      PRECOND: At(c, a) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
      EFFECT: ¬At(c, a) ∧ In(c, p))
    Action(Unload(c, p, a),
      PRECOND: In(c, p) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
      EFFECT: At(c, a) ∧ ¬In(c, p))
    Action(Fly(p, from, to),
      PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
      EFFECT: ¬At(p, from) ∧ At(p, to))

    Figure 11.2  A STRIPS problem involving transportation of air cargo between airports.

Note that cargo is not At anywhere when it is In a plane, so At really means "available for use at a given location." It takes some experience with action definitions to handle such details consistently. The following plan is a solution to the problem:

    [Load(C1, P1, SFO), Fly(P1, SFO, JFK), Unload(C1, P1, JFK),
     Load(C2, P2, JFK), Fly(P2, JFK, SFO), Unload(C2, P2, SFO)].

Our representation is pure STRIPS. In particular, it allows a plane to fly to and from the same airport. Inequality literals in ADL could prevent this.
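As a small usage example (again an illustrative sketch, not the book's code), the air cargo problem of Figure 11.2 can be propositionalized and the six-step plan checked mechanically. The static type literals Cargo, Plane, and Airport are folded into the instantiation loop here, since we only generate well-typed ground actions, and the literal spellings are assumptions of this sketch.

    from itertools import product

    cargo, planes, airports = ["C1", "C2"], ["P1", "P2"], ["SFO", "JFK"]

    def lit(pred, *args):
        return f"{pred}({','.join(args)})"

    # Ground (propositionalized) actions as (name, precond, add, delete) tuples.
    actions = []
    for c, p, a in product(cargo, planes, airports):
        actions.append((f"Load({c},{p},{a})",
                        {lit("At", c, a), lit("At", p, a)},
                        {lit("In", c, p)}, {lit("At", c, a)}))
        actions.append((f"Unload({c},{p},{a})",
                        {lit("In", c, p), lit("At", p, a)},
                        {lit("At", c, a)}, {lit("In", c, p)}))
    for p, frm, to in product(planes, airports, airports):
        actions.append((f"Fly({p},{frm},{to})",
                        {lit("At", p, frm)},
                        {lit("At", p, to)}, {lit("At", p, frm)}))
    by_name = {name: (pre, add, dele) for name, pre, add, dele in actions}

    init = frozenset({"At(C1,SFO)", "At(C2,JFK)", "At(P1,SFO)", "At(P2,JFK)"})
    goal = {"At(C1,JFK)", "At(C2,SFO)"}

    plan = ["Load(C1,P1,SFO)", "Fly(P1,SFO,JFK)", "Unload(C1,P1,JFK)",
            "Load(C2,P2,JFK)", "Fly(P2,JFK,SFO)", "Unload(C2,P2,SFO)"]

    state = init
    for step in plan:
        pre, add, dele = by_name[step]
        assert pre <= state, f"{step} is not applicable"
        state = (state - frozenset(dele)) | frozenset(add)
    print(goal <= state)   # True: the plan achieves the goal

Note that, exactly as the text observes, this pure-STRIPS encoding also generates Fly actions from an airport to itself; they are harmless here but could be ruled out with ADL inequality preconditions.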
Example: The spare tire problem

Consider the problem of changing a flat tire. More precisely, the goal is to have a good spare tire properly mounted onto the car's axle, where the initial state has a flat tire on the axle and a good spare tire in the trunk. To keep it simple, our version of the problem is a very abstract one, with no sticky lug nuts or other complications. There are just four actions: removing the spare from the trunk, removing the flat tire from the axle, putting the spare on the axle, and leaving the car unattended overnight. We assume that the car is parked in a particularly bad neighborhood, so that the effect of leaving it overnight is that the tires disappear. The ADL description of the problem is shown in Figure 11.3. Notice that it is purely propositional. It goes beyond STRIPS in that it uses a negated precondition, ¬At(Flat, Axle), for the PutOn(Spare, Axle) action. This could be avoided by using Clear(Axle) instead, as we will see in the next example.

    Init(At(Flat, Axle) ∧ At(Spare, Trunk))
    Goal(At(Spare, Axle))
    Action(Remove(Spare, Trunk),
      PRECOND: At(Spare, Trunk)
      EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground))
    Action(Remove(Flat, Axle),
      PRECOND: At(Flat, Axle)
      EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground))
    Action(PutOn(Spare, Axle),
      PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle)
      EFFECT: ¬At(Spare, Ground) ∧ At(Spare, Axle))
    Action(LeaveOvernight,
      PRECOND:
      EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk)
              ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))

    Figure 11.3  The simple spare tire problem.

Example: The blocks world

One of the most famous planning domains is known as the blocks world. This domain consists of a set of cube-shaped blocks sitting on a table. (The blocks world used in planning research is much simpler than SHRDLU's version, shown on page 20.) The blocks can be stacked, but only one block can fit directly on top of another. A robot arm can pick up a block and move it to another position, either on the table or on top of another block. The arm can pick up only one block at a time, so it cannot pick up a block that has another one on it. The goal will always be to build one or more stacks of blocks, specified in terms of what blocks are on top of what other blocks. For example, a goal might be to get block A on B and block C on D.

We will use On(b, x) to indicate that block b is on x, where x is either another block or the table. The action for moving block b from the top of x to the top of y will be Move(b, x, y). Now, one of the preconditions on moving b is that no other block be on it. In first-order logic, this would be ¬∃x On(x, b) or, alternatively, ∀x ¬On(x, b). These could be stated as preconditions in ADL. We can stay within the STRIPS language, however, by introducing a new predicate, Clear(x), that is true when nothing is on x.

The action Move moves a block b from x to y if both b and y are clear. After the move is made, x is clear but y is not. A formal description of Move in STRIPS is

    Action(Move(b, x, y),
      PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y),
      EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)).

Unfortunately, this action does not maintain Clear properly when x or y is the table. When x = Table, this action has the effect Clear(Table), but the table should not become clear; and when y = Table, it has the precondition Clear(Table), but the table does not have to be clear to move a block onto it. To fix this, we do two things. First, we introduce another action to move a block b from x to the table:

    Action(MoveToTable(b, x),
      PRECOND: On(b, x) ∧ Clear(b),
      EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)).

Second, we take the interpretation of Clear(b) to be "there is a clear space on b to hold a block." Under this interpretation, Clear(Table) will always be true.
The only problem is that nothing prevents the planner from using Move(b, x, Table) instead of MoveToTable(b, x). We could live with this problem (it will lead to a larger-than-necessary search space, but will not lead to incorrect answers), or we could introduce the predicate Block and add Block(b) ∧ Block(y) to the precondition of Move. Finally, there is the problem of spurious actions such as Move(B, C, C), which should be a no-op but which has contradictory effects. It is common to ignore such problems, because they seldom cause incorrect plans to be produced. The correct approach is to add inequality preconditions, as shown in Figure 11.4.

    Init(On(A, Table) ∧ On(B, Table) ∧ On(C, Table)
         ∧ Block(A) ∧ Block(B) ∧ Block(C)
         ∧ Clear(A) ∧ Clear(B) ∧ Clear(C))
    Goal(On(A, B) ∧ On(B, C))
    Action(Move(b, x, y),
      PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b)
               ∧ (b ≠ x) ∧ (b ≠ y) ∧ (x ≠ y),
      EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y))
    Action(MoveToTable(b, x),
      PRECOND: On(b, x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x),
      EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x))

    Figure 11.4  A planning problem in the blocks world: building a three-block tower.
    One solution is the sequence Move(B, Table, C), Move(A, Table, B).

11.2 PLANNING WITH STATE-SPACE SEARCH

Now we turn our attention to planning algorithms. The most straightforward approach is to use state-space search. Because the descriptions of actions in a planning problem specify both preconditions and effects, it is possible to search in either direction: either forward from the initial state or backward from the goal, as shown in Figure 11.5. We can also use the explicit action and goal representations to derive effective heuristics automatically.

    Figure 11.5  Two approaches to searching for a plan. (a) Forward (progression) state-space
    search, starting in the initial state and using the problem's actions to search forward for the
    goal state. (b) Backward (regression) state-space search: a belief-state search (see page 84)
    starting at the goal state(s) and using the inverse of the actions to search backward for the
    initial state.

Forward state-space search

Planning with forward state-space search is similar to the problem-solving approach of Chapter 3. It is sometimes called progression planning, because it moves in the forward direction. We start in the problem's initial state, considering sequences of actions until we find a sequence that reaches a goal state. The formulation of planning problems as state-space search problems is as follows:

- The initial state of the search is the initial state from the planning problem. In general, each state will be a set of positive ground literals; literals not appearing are false.
- The actions that are applicable to a state are all those whose preconditions are satisfied. The successor state resulting from an action is generated by adding the positive effect literals and deleting the negative effect literals. (In the first-order case, we must apply the unifier from the preconditions to the effect literals.) Note that a single successor function works for all planning problems, a consequence of using an explicit action representation.
- The goal test checks whether the state satisfies the goal of the planning problem.
- The step cost of each action is typically 1. Although it would be easy to allow different costs for different actions, this is seldom done by STRIPS planners.
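A minimal sketch of this formulation (an illustration, not the book's code) is a breadth-first progression planner over the ground spare tire actions of Figure 11.3. The one negated ADL precondition of PutOn is special-cased in the applicability test, which is a simplification made only for this sketch.

    from collections import deque

    # Ground actions for the spare tire problem of Figure 11.3:
    # (name, precondition, add list, delete list)
    ACTIONS = [
        ("Remove(Spare,Trunk)", {"At(Spare,Trunk)"},
         {"At(Spare,Ground)"}, {"At(Spare,Trunk)"}),
        ("Remove(Flat,Axle)", {"At(Flat,Axle)"},
         {"At(Flat,Ground)"}, {"At(Flat,Axle)"}),
        ("PutOn(Spare,Axle)", {"At(Spare,Ground)"},   # negated precond handled below
         {"At(Spare,Axle)"}, {"At(Spare,Ground)"}),
        ("LeaveOvernight", set(), set(),
         {"At(Spare,Ground)", "At(Spare,Axle)", "At(Spare,Trunk)",
          "At(Flat,Ground)", "At(Flat,Axle)"}),
    ]

    def applicable(state, name, precond):
        # Approximates the single ADL negated precondition ¬At(Flat,Axle) of PutOn.
        if name == "PutOn(Spare,Axle)" and "At(Flat,Axle)" in state:
            return False
        return precond <= state

    def forward_search(init, goal):
        """Breadth-first progression planning: search forward from the initial state."""
        frontier = deque([(frozenset(init), [])])
        explored = {frozenset(init)}
        while frontier:
            state, plan = frontier.popleft()
            if goal <= state:
                return plan
            for name, pre, add, dele in ACTIONS:
                if applicable(state, name, pre):
                    nxt = frozenset((state - dele) | add)
                    if nxt not in explored:
                        explored.add(nxt)
                        frontier.append((nxt, plan + [name]))
        return None

    print(forward_search({"At(Flat,Axle)", "At(Spare,Trunk)"}, {"At(Spare,Axle)"}))
    # -> a three-step plan: the two Remove actions (in some order), then PutOn(Spare,Axle)

Uninformed breadth-first search is fine for this tiny domain, but, as the next paragraphs explain, it is hopeless for realistically sized problems without a good heuristic.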
Recall that, in the absence of function symbols, the state space of a planning problem is finite. Therefore, any graph search algorithm that is complete, for example A*, will be a complete planning algorithm.

From the earliest days of planning research (around 1961) until recently (around 1998), it was assumed that forward state-space search was too inefficient to be practical. It is not hard to come up with reasons why; just refer back to the start of Section 11.1. First, forward search does not address the irrelevant action problem: all applicable actions are considered from each state. Second, the approach quickly bogs down without a good heuristic. Consider an air cargo problem with 10 airports, where each airport has 5 planes and 20 pieces of cargo. The goal is to move all the cargo at airport A to airport B. There is a simple solution to the problem: load the 20 pieces of cargo into one of the planes at A, fly the plane to B, and unload the cargo. But finding the solution can be difficult, because the average branching factor is huge: each of the 50 planes can fly to 9 other airports, and each of the 200 packages can be either unloaded (if it is loaded) or loaded into any plane at its airport (if it is unloaded). On average, let's say there are about 1000 possible actions, so the search tree up to the depth of the obvious 41-step solution has about 1000^41 nodes. It is clear that a very accurate heuristic will be needed to make this kind of search efficient. We will discuss some possible heuristics after looking at backward search.

Backward state-space search

Backward state-space search was described briefly as part of bidirectional search in Chapter 3. We noted there that backward search can be difficult to implement when the goal states are described by a set of constraints rather than being listed explicitly. In particular, it is not always obvious how to generate a description of the possible predecessors of the set of goal states. We will see that the STRIPS representation makes this quite easy, because sets of states can be described by the literals that must be true in those states.

The main advantage of backward search is that it allows us to consider only relevant actions. An action is relevant to a conjunctive goal if it achieves one of the conjuncts of the goal. For example, the goal in our 10-airport air cargo problem is to have 20 pieces of cargo at airport B, or more precisely,

    At(C1, B) ∧ At(C2, B) ∧ ... ∧ At(C20, B).

Now consider the conjunct At(C1, B). Working backwards, we can seek actions that have this as an effect. There is only one: Unload(C1, p, B), where plane p is unspecified.

Notice that there are many irrelevant actions that can also lead to a goal state. For example, we can fly an empty plane from JFK to SFO; this action reaches a goal state from a predecessor state in which the plane is at JFK and all the goal conjuncts are satisfied. A backward search that allows irrelevant actions will still be complete, but it will be much less efficient. If a solution exists, it will be found by a backward search that allows only relevant actions. The restriction to relevant actions means that backward search often has a much lower branching factor than forward search. For example, our air cargo problem has about 1000 actions leading forward from the initial state, but only 20 actions working backward from the goal. Searching backwards is sometimes called regression planning.
The principal question in regression planning is this: what are the states from which applying a given action leads to the goal? Computing the description of these states is called regressing the goal through the action. To see how to do it, consider the air cargo example. We have the goal

    At(C1, B) ∧ At(C2, B) ∧ ... ∧ At(C20, B)

and the relevant action Unload(C1, p, B), which achieves the first conjunct. The action will work only if its preconditions are satisfied. Therefore, any predecessor state must include these preconditions: In(C1, p) ∧ At(p, B). Moreover, the subgoal At(C1, B) should not be true in the predecessor state. (If the subgoal were true in the predecessor state, the action would still lead to a goal state; on the other hand, such actions are irrelevant because they do not make the goal true.) Thus, the predecessor description is

    In(C1, p) ∧ At(p, B) ∧ At(C2, B) ∧ ... ∧ At(C20, B).

In addition to insisting that actions achieve some desired literal, we must insist that the actions not undo any desired literals. An action that satisfies this restriction is called consistent. For example, the action Load(C2, p) would not be consistent with the current goal, because it would negate the literal At(C2, B).

Given definitions of relevance and consistency, we can describe the general process of constructing predecessors for backward search. Given a goal description G, let A be an action that is relevant and consistent. The corresponding predecessor is as follows:

- Any positive effects of A that appear in G are deleted.
- Each precondition literal of A is added, unless it already appears.

Any of the standard search algorithms can be used to carry out the search. Termination occurs when a predecessor description is generated that is satisfied by the initial state of the planning problem. In the first-order case, satisfaction might require a substitution for variables in the predecessor description. For example, the predecessor description in the preceding paragraph is satisfied by the initial state with substitution {p/P12}. The substitution must then be applied to the actions leading from the state to the goal, producing the solution Unload(C1, P12, B).
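As a rough illustration (an assumption made for exposition, not code from the text), regressing a propositional goal through a ground action follows directly from the two rules above. The example regresses a two-conjunct goal through the ground instance Unload(C1, P12, B); a first-order regression would leave p unbound instead.

    def relevant(goal, add):
        # The action achieves at least one conjunct of the goal.
        return bool(goal & add)

    def consistent(goal, delete):
        # The action does not undo any desired literal.
        return not (goal & delete)

    def regress(goal, precond, add, delete):
        """Predecessor description: drop the achieved conjuncts, add the preconditions."""
        if not (relevant(goal, add) and consistent(goal, delete)):
            return None
        return (goal - add) | precond

    # Regress At(C1,B) ∧ At(C2,B) through Unload(C1,P12,B).
    goal = {"At(C1,B)", "At(C2,B)"}
    unload = ({"In(C1,P12)", "At(P12,B)"}, {"At(C1,B)"}, {"In(C1,P12)"})
    print(regress(goal, *unload))
    # -> {'At(C2,B)', 'In(C1,P12)', 'At(P12,B)'}  (a set, so the printed order may vary)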
Heuristics for state-space search

It turns out that neither forward nor backward search is efficient without a good heuristic function. Recall from Chapter 4 that a heuristic function estimates the distance from a state to the goal; in STRIPS planning, the cost of each action is 1, so the distance is the number of actions. The basic idea is to look at the effects of the actions and at the goals that must be achieved, and to guess how many actions are needed to achieve all the goals. Finding the exact number is NP-hard, but it is possible to find reasonable estimates most of the time without too much computation. We might also be able to derive an admissible heuristic, one that does not overestimate; this could be used with A* search to find optimal solutions.

There are two approaches that can be tried. The first is to derive a relaxed problem from the given problem specification, as described in Chapter 4. The optimal solution cost for the relaxed problem, which we hope is very easy to solve, gives an admissible heuristic for the original problem. The second approach is to pretend that a pure divide-and-conquer algorithm will work. This is called the subgoal independence assumption: the cost of solving a conjunction of subgoals is approximated by the sum of the costs of solving each subgoal independently. The subgoal independence assumption can be optimistic or pessimistic. It is optimistic when there are negative interactions between the subplans for each subgoal, for example when an action in one subplan deletes a goal achieved by another subplan. It is pessimistic, and therefore inadmissible, when subplans contain redundant actions, for instance two actions that could be replaced by a single action in the merged plan.

Let us consider how to derive relaxed planning problems. Since explicit representations of preconditions and effects are available, the process will work by modifying those representations. (Compare this approach with search problems, where the successor function is a black box.) The simplest idea is to relax the problem by removing all preconditions from the actions. Then every action will always be applicable, and any literal can be achieved in one step (if there is an applicable action; if not, the goal is impossible). This almost implies that the number of steps required to solve a conjunction of goals is the number of unsatisfied goals: almost but not quite, because (1) there may be two actions, each of which deletes the goal literal achieved by the other, and (2) some action may achieve multiple goals. If we combine our relaxed problem with the subgoal independence assumption, both of these issues are assumed away and the resulting heuristic is exactly the number of unsatisfied goals.

In many cases, a more accurate heuristic is obtained by considering at least the positive interactions arising from actions that achieve multiple goals. First, we relax the problem further by removing negative effects (see Exercise 11.6). Then, we count the minimum number of actions required such that the union of those actions' positive effects satisfies the goal. For example, consider

    Goal(A ∧ B ∧ C)
    Action(X, EFFECT: A ∧ P)
    Action(Y, EFFECT: B ∧ C ∧ Q)
    Action(Z, EFFECT: B ∧ P ∧ Q).

The minimal set cover of the goal {A, B, C} is given by the actions {X, Y}, so the set cover heuristic returns a cost of 2. This improves on the subgoal independence assumption, which gives a heuristic value of 3. There is one minor irritation: the set cover problem is NP-hard. A simple greedy set-covering algorithm is guaranteed to return a value that is within a factor of log n of the true minimum value, where n is the number of literals in the goal, and it usually does much better than this in practice. Unfortunately, the greedy algorithm loses the guarantee of admissibility for the heuristic.
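A minimal sketch of the greedy set-cover heuristic on the example above (illustrative only; the action and literal names follow the example in the text):

    def greedy_set_cover_heuristic(goal, state, actions):
        """Estimate remaining plan length: greedily pick the action whose positive
        effects cover the most still-unsatisfied goal literals. Preconditions and
        negative effects are ignored, as in the relaxed problem."""
        uncovered = set(goal) - set(state)
        cost = 0
        while uncovered:
            best = max(actions, key=lambda add: len(uncovered & add))
            gained = uncovered & best
            if not gained:          # some goal literal is unachievable
                return float("inf")
            uncovered -= gained
            cost += 1
        return cost

    goal = {"A", "B", "C"}
    actions = [frozenset({"A", "P"}),        # X
               frozenset({"B", "C", "Q"}),   # Y
               frozenset({"B", "P", "Q"})]   # Z
    print(greedy_set_cover_heuristic(goal, set(), actions))   # 2, i.e. the cover {X, Y}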
Since planning is exponentially hard? no algorithm will be efficient for all problems, but many practical problems can be solved with the heuristic methods in this chapter-far more than could be solved just a few years ago. Forward and backward state-space search are particular forms of totally ordered plan search. They explore only strictly linear sequences of actions directly connected to the start or goal. This means that they cannot take advantage of problem decomposition. Rather than work on each subproblem separately, they must always make decisions about how to sequence actions from all the subproblems. We would prefer an approach that works on several subgoals independently, solves them with several subplans, and then combines the subplans. Such an approach also has the advantage of flexibility in the order in which it cvnstructs 7 the plan. That is, the planner can work on "obvious" or "important ' decisions first, rather than being forced to work on steps in chronological order. For example, a planning agent that is in Berkeley and wishes to be in Monte Carlo might first try to find a flight from San Francisco to Paris; given information about the departure and arrival times, it can then work on ways to get to and from the airports. LEAST COMMITMENT The general strategy of delaying a choice during search is called a least comrmitment strategy. There is no formal definition of least commitment, and clearly some degree of commitment is necessary, lest the search would make no progress. Despite the infcrmality, lleast commitment is a useful concept for analyzing when decisions should be madle in any search problem. Vechnically, STRIPS-S planning is PSPACE-complete unless actions have only positive preconditions and only one effect literal (Bylander, 1994). 388 Chapter 1 1. Planning Our first concrete example will be much simpler than planning a vacation. Consider the simple problem of putting on a pair of shoes. We can describe this as a formal planning problem as follows: (RightShoeOn A LeftShoeOn) Goal Init () Action(RightShoe, PRECOND: RightSocFOn, EFFECT: RightShoeOn) Action(RightSocF, EEc:RightSockOn) Action(LeftShoe; PRECOND:LSOO, ET:LeftShoeon) Action (LeftSock, ET:Leftsockon) . A planner should be able to come up with the two-action sequence Rightsock followed by Rightshoe to achieve the first conjunct of the goal and the sequence Leftsock followed by LeftShoe for the second conjunct. Then the two sequences can be combined to yield the final plan. In doing this, the planner will be manipulating the two subsequences independently, without committing to whether an action in one sequence is before or after an action in the other. Any planning algorithm that can place two actions into a plan without specifying which comes first is called a partial-order planner. Figure 11.6 shows the partial-order plan that is PLANNER the solution to the shoes and socks problem. Note that the solution is represented as a graph of actions, not a sequence. Note also the "dummy" actions called Start and Finish, which mark the beginning and end of the plan. Calling them actions symplifies things, because now every step of a plan is an action. The partial-order solution corresponds to six possible LINEARIZATION total-order plans; each of these is called a linearization of the partial-order plan. Partial-order planning can be implemented as a search in the space of partial-order plans. (From now on, we will just call them "plans.") That is, we start with an empty plan. 
Then we consider ways of refining the plan until we come up with a complete plan that solves the problem. The actions in this search are not actions in the world, but actions on plans: adding a step to the plan, imposing an ordering that puts one action before another, and so on. We will define the POP algorithm for partial-order planning. It is traditional to write out the POP algorithm as a stand-alone program, but we will instead formulate partial-order planning as an instance of a search problem. This allows us to focus on the plan refinement steps that can be applied, rather than worrying about how the algorithm explores the space. In fact, a wide variety of uninformed or heuristic search methods can be applied once the search problem is formulated.

Remember that the states of our search problem will be (mostly unfinished) plans. To avoid confusion with the states of the world, we will talk about plans rather than states. Each plan has the following four components, where the first two define the steps of the plan and the last two serve a bookkeeping function to determine how plans can be extended (a minimal data-structure sketch of the four components follows the list):

- A set of actions that make up the steps of the plan. These are taken from the set of actions in the planning problem. The "empty" plan contains just the Start and Finish actions. Start has no preconditions and has as its effect all the literals in the initial state of the planning problem. Finish has no effects and has as its preconditions the goal literals of the planning problem.
- A set of ordering constraints. Each ordering constraint is of the form A ≺ B, which is read as "A before B" and means that action A must be executed sometime before action B, but not necessarily immediately before. The ordering constraints must describe a proper partial order. Any cycle, such as A ≺ B and B ≺ A, represents a contradiction, so an ordering constraint cannot be added to the plan if it creates a cycle.
- A set of causal links. A causal link between two actions A and B in the plan is written as A --p--> B and is read as "A achieves p for B." For example, the causal link RightSock --RightSockOn--> RightShoe asserts that RightSockOn is an effect of the RightSock action and a precondition of RightShoe. It also asserts that RightSockOn must remain true from the time of action RightSock to the time of action RightShoe. In other words, the plan may not be extended by adding a new action C that conflicts with the causal link. An action C conflicts with A --p--> B if C has the effect ¬p and if C could (according to the ordering constraints) come after A and before B. Some authors call causal links protection intervals, because the link A --p--> B protects p from being negated over the interval from A to B.
- A set of open preconditions. A precondition is open if it is not achieved by some action in the plan. Planners work to reduce the set of open preconditions to the empty set, without introducing a contradiction.
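These four components can be sketched as a small data structure (illustrative Python, assuming ground propositional actions identified by name; not the book's implementation):

    from dataclasses import dataclass, field

    @dataclass
    class PartialPlan:
        actions: set = field(default_factory=set)        # action names, incl. Start, Finish
        orderings: set = field(default_factory=set)      # pairs (A, B) meaning A before B
        causal_links: set = field(default_factory=set)   # triples (A, p, B): A achieves p for B
        open_preconds: set = field(default_factory=set)  # pairs (p, B): p not yet achieved for B

    def empty_plan(goal_literals):
        """The 'empty' plan: just Start and Finish, with every goal literal open.
        Start's effects are the problem's initial-state literals; Finish's
        preconditions are the goal literals."""
        return PartialPlan(
            actions={"Start", "Finish"},
            orderings={("Start", "Finish")},
            causal_links=set(),
            open_preconds={(g, "Finish") for g in goal_literals})

    plan = empty_plan({"At(Spare,Axle)"})
    print(plan.open_preconds)   # {('At(Spare,Axle)', 'Finish')}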
For example, the final plan in Figure 11.6 has the following components (not shown are the ordering constraints that put every other action after Start and before Finish):

    Actions: {RightSock, RightShoe, LeftSock, LeftShoe, Start, Finish}
    Orderings: {RightSock ≺ RightShoe, LeftSock ≺ LeftShoe}
    Links: {RightSock --RightSockOn--> RightShoe, LeftSock --LeftSockOn--> LeftShoe,
            RightShoe --RightShoeOn--> Finish, LeftShoe --LeftShoeOn--> Finish}
    Open Preconditions: {}

We define a consistent plan as a plan in which there are no cycles in the ordering constraints and no conflicts with the causal links. A consistent plan with no open preconditions is a solution. A moment's thought should convince the reader of the following fact: every linearization of a partial-order solution is a total-order solution whose execution from the initial state will reach a goal state. This means that we can extend the notion of "executing a plan" from total-order to partial-order plans. A partial-order plan is executed by repeatedly choosing any of the possible next actions. We will see in Chapter 12 that the flexibility available to the agent as it executes the plan can be very useful when the world fails to cooperate. The flexible ordering also makes it easier to combine smaller plans into larger ones, because each of the small plans can reorder its actions to avoid conflict with the other plans.

Now we are ready to formulate the search problem that POP solves. We will begin with a formulation suitable for propositional planning problems, leaving the first-order complications for later. As usual, the definition includes the initial state, actions, and goal test.

- The initial plan contains Start and Finish, the ordering constraint Start ≺ Finish, no causal links, and all the preconditions in Finish as open preconditions.
- The successor function arbitrarily picks one open precondition p on an action B and generates a successor plan for every possible consistent way of choosing an action A that achieves p. Consistency is enforced as follows:
  1. The causal link A --p--> B and the ordering constraint A ≺ B are added to the plan. Action A may be an existing action in the plan or a new one. If it is new, add it to the plan and also add Start ≺ A and A ≺ Finish.
  2. We resolve conflicts between the new causal link and all existing actions, and between the action A (if it is new) and all existing causal links. A conflict between A --p--> B and C is resolved by making C occur at some time outside the protection interval, either by adding B ≺ C or C ≺ A. We add successor states for either or both if they result in consistent plans.
- The goal test checks whether a plan is a solution to the original planning problem. Because only consistent plans are generated, the goal test just needs to check that there are no open preconditions.

Remember that the actions considered by the search algorithm under this formulation are plan refinement steps rather than the real actions from the domain itself. The path cost is therefore irrelevant, strictly speaking, because the only thing that matters is the total cost of the real actions in the plan to which the path leads. Nonetheless, it is possible to specify a path cost function that reflects the real plan costs: we charge 1 for each real action added to the plan and 0 for all other refinement steps. In this way, g(n), where n is a plan, will be equal to the number of real actions in the plan. A heuristic estimate h(n) can also be used.
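The two consistency conditions, no ordering cycles and no causal-link conflicts, can be checked mechanically. The sketch below is an illustrative assumption in the same style as the earlier PartialPlan sketch: negated literals are spelled with a "not " prefix, effects is a plain dictionary from action names to effect literals, and the conflict test consults only the directly stated ordering constraints (a full planner would test reachability in the ordering graph instead).

    def has_cycle(actions, orderings):
        """Detect a cycle in the ordering constraints by repeatedly removing
        actions with no remaining predecessors (Kahn's topological sort)."""
        remaining = set(actions)
        edges = set(orderings)                 # pairs (before, after)
        while remaining:
            free = {a for a in remaining
                    if not any(succ == a for (_, succ) in edges)}
            if not free:
                return True                    # every remaining action has a predecessor
            remaining -= free
            edges = {(a, b) for (a, b) in edges if a in remaining and b in remaining}
        return False

    def conflicts(link, action, effects, orderings):
        """Action C conflicts with causal link A --p--> B if C has effect ¬p and
        C is not already constrained to come before A or after B."""
        a, p, b = link
        if action in (a, b) or ("not " + p) not in effects[action]:
            return False
        before_a = (action, a) in orderings
        after_b = (b, action) in orderings
        return not (before_a or after_b)

    print(has_cycle({"Start", "A", "B"},
                    {("Start", "A"), ("A", "B"), ("B", "A")}))        # True: A ≺ B ≺ A

    # LeaveOvernight (effect ¬At(Spare,Ground)) threatens the link
    # Remove(Spare,Trunk) --At(Spare,Ground)--> PutOn(Spare,Axle).
    effects = {"LeaveOvernight": {"not At(Spare,Ground)", "not At(Flat,Axle)"}}
    link = ("Remove(Spare,Trunk)", "At(Spare,Ground)", "PutOn(Spare,Axle)")
    print(conflicts(link, "LeaveOvernight", effects, orderings=set()))   # True

The second example is exactly the threat that the worked example below resolves by ordering LeaveOvernight before Remove(Spare, Trunk).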
At first glance, one might think that the successor function should include successors for every open precondition p, not just for one of them. This would be redundant and inefficient, however, for the same reason that constraint satisfaction algorithms don't include successors for every possible variable: the order in which we consider open preconditions (like the order in which we consider CSP variables) is commutative. (See page 141.) Thus, we can choose an arbitrary ordering and still have a complete algorithm. Choosing the right ordering can lead to a faster search, but all orderings end up with the same set of candidate solutions.

A partial-order planning example

Now let's look at how POP solves the spare tire problem from Section 11.1. The problem description is repeated in Figure 11.7.

    Init(At(Flat, Axle) ∧ At(Spare, Trunk))
    Goal(At(Spare, Axle))
    Action(Remove(Spare, Trunk),
      PRECOND: At(Spare, Trunk)
      EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground))
    Action(Remove(Flat, Axle),
      PRECOND: At(Flat, Axle)
      EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground))
    Action(PutOn(Spare, Axle),
      PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle)
      EFFECT: ¬At(Spare, Ground) ∧ At(Spare, Axle))
    Action(LeaveOvernight,
      PRECOND:
      EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk)
              ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))

    Figure 11.7  The simple flat tire problem description.

The search for a solution begins with the initial plan, containing a Start action with the effect At(Spare, Trunk) ∧ At(Flat, Axle) and a Finish action with the sole precondition At(Spare, Axle). Then we generate successors by picking an open precondition to work on (irrevocably) and choosing among the possible actions to achieve it. For now, we will not worry about a heuristic function to help with these decisions; we will make seemingly arbitrary choices. The sequence of events is as follows:

1. Pick the only open precondition, At(Spare, Axle) of Finish. Choose the only applicable action, PutOn(Spare, Axle).

2. Pick the At(Spare, Ground) precondition of PutOn(Spare, Axle). Choose the only applicable action, Remove(Spare, Trunk), to achieve it. The resulting plan is shown in Figure 11.8.

    Figure 11.8  The incomplete partial-order plan for the tire problem, after choosing actions
    for the first two open preconditions. Boxes represent actions, with preconditions on the left
    and effects on the right. (Effects are omitted, except for that of the Start action.) Dark
    arrows represent causal links protecting the proposition at the head of the arrow.

3. Pick the ¬At(Flat, Axle) precondition of PutOn(Spare, Axle). Just to be contrary, choose the LeaveOvernight action rather than the Remove(Flat, Axle) action. Notice that LeaveOvernight also has the effect ¬At(Spare, Ground), which means it conflicts with the causal link

       Remove(Spare, Trunk) --At(Spare,Ground)--> PutOn(Spare, Axle).

   To resolve the conflict, we add an ordering constraint putting LeaveOvernight before Remove(Spare, Trunk). The resulting plan is shown in Figure 11.9. (Why does this resolve the conflict, and why is there no other way to resolve it?)
    Figure 11.9  The plan after choosing LeaveOvernight as the action for achieving
    ¬At(Flat, Axle). To avoid a conflict with the causal link from Remove(Spare, Trunk)
    that protects At(Spare, Ground), LeaveOvernight is constrained to occur before
    Remove(Spare, Trunk), as shown by the dashed arrow.

4. The only remaining open precondition at this point is the At(Spare, Trunk) precondition of the action Remove(Spare, Trunk). The only action that can achieve it is the existing Start action, but the causal link from Start to Remove(Spare, Trunk) is in conflict with the ¬At(Spare, Trunk) effect of LeaveOvernight. This time there is no way to resolve the conflict with LeaveOvernight: we cannot order it before Start (because nothing can come before Start), and we cannot order it after Remove(Spare, Trunk) (because there is already a constraint ordering it before Remove(Spare, Trunk)). So we are forced to back up, remove the LeaveOvernight action and the last two causal links, and return to the state in Figure 11.8. In essence, the planner has proved that LeaveOvernight doesn't work as a way to change a tire.

5. Consider again the ¬At(Flat, Axle) precondition of PutOn(Spare, Axle). This time, we choose Remove(Flat, Axle).

6. Once again, pick the At(Spare, Trunk) precondition of Remove(Spare, Trunk) and choose Start to achieve it. This time there are no conflicts.

7. Pick the At(Flat, Axle) precondition of Remove(Flat, Axle), and choose Start to achieve it. This gives us a complete, consistent plan, in other words a solution, as shown in Figure 11.10.

    Figure 11.10  The final solution to the tire problem. Note that Remove(Spare, Trunk) and
    Remove(Flat, Axle) can be done in either order, as long as they are completed before the
    PutOn(Spare, Axle) action.

Although this example is very simple, it illustrates some of the strengths of partial-order planning. First, the causal links lead to early pruning of portions of the search space that, because of irresolvable conflicts, contain no solutions. Second, the solution in Figure 11.10 is a partial-order plan. In this case the advantage is small, because there are only two possible linearizations; nonetheless, an agent might welcome the flexibility, for example if the tire has to be changed in the middle of heavy traffic.

The example also points to some possible improvements that could be made. For example, there is duplication of effort: Start is linked to Remove(Spare, Trunk) before the conflict causes a backtrack and is then unlinked by backtracking, even though it is not involved in the conflict. It is then relinked as the search continues. This is typical of chronological backtracking and might be mitigated by dependency-directed backtracking.

Partial-order planning with unbound variables

In this section, we consider the complications that can arise when POP is used with first-order action representations that include variables. Suppose we have a blocks world problem (Figure 11.4) with the open precondition On(A, B) and the action

    Action(Move(b, x, y),
      PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y),
      EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)).
That is, the action says to move block A from somewhere, without yet saying whence. This is another example of the least commitment principle: we can delay making the choice until some other step in the plan makes it for us. For example, suppose we have On(A, D) in the initial state. Then the Start action can be used to achieve On(A, x), binding x to D. This strategy of waiting for more information before choosing x is often more efficient than trying every possible value of x and backtracking for each one that fails. The presence of variables in preconditions and actions complicates the process of de- tecting and resolving conflicts. For example, when Move(A, x, B) is added to the plan, we will need a causal link Move(A, x, B) onB) Finish . If there is another action M2 with effect lOn(A, z), then M2 conflicts only if z is B. To ac- commodate this possibility, we extend the representation of plans to include a set of inequal- ity constraints of the form z X where z is a variable and X is either another variable or a constant symbol. In this case, we would resolve the conflict by adding z B, which means that future extensions to the plan can instantiate z to any value except B. Anytime we apply a substitution to a plan, we must check that the inequalities do not contradict the substitution. For example, a substitution that includes x/y conflicts with the inequality constraint x y. Such conflicts cannot be resolved, so the planner must backtrack. A more extensive example of POP planning with variables in the blocks world is given in Section 12.6. Heuristics for partial-order planning Compared with total-order planning, partial-order planning has a clear advantage in being able to decompose problems into subproblems. It also has a disadvantage in that it does not represent states directly, so it is harder to estimate how far a partial-order plan is from achieving a goal. At present, there is less understanding of how to compute accurate heuristics for partial-order planning than for total-order planning. The most obvious heuristic is to count the number of distinct open preconditions. This can be improved by subtracting the number of open preconditions that match literals in the Start state. As in the total-order case, this overestimates the cost when there are actions that achieve multiple goals and underestimates the cost when there are negative interactions between plan steps. The next section presents an approach that allows us to get much more accurate heuristics from a relaxed problem. The heuristic function is used to choose which plan to refine. Given this choice, the algorithm generates successors based on the selection of a single open precondition to work
