Lecture Notes: Real-Time Embedded Systems

Dr. David Walker, United States. Published: 11-07-2017.
Chapter 1: The Real-Time Environment

OVERVIEW

The purpose of this introductory chapter is to describe the environment of real-time computer systems from a number of different perspectives. A solid understanding of the technical and economic factors that characterize a real-time application helps to interpret the demands that the system designer must cope with. The chapter starts with the definition of a real-time system and with a discussion of its functional and metafunctional requirements. Particular emphasis is placed on the temporal requirements that are derived from the well-understood properties of control applications. The objective of a control algorithm is to drive a process so that a performance criterion is satisfied. Random disturbances occurring in the environment degrade system performance and must be taken into account by the control algorithm. Any additional uncertainty that is introduced into the control loop by the control system itself, e.g., a non-predictable jitter of the control loop, results in a degradation of the quality of control. In Sections 1.2 to 1.5, real-time applications are classified from a number of viewpoints. Special emphasis is placed on the fundamental differences between hard and soft real-time systems. Because soft real-time systems do not have catastrophic failure modes, a less rigorous approach to their design is often followed. Sometimes resource-inadequate solutions that will not handle the rarely occurring peak-load scenarios are accepted on economic grounds. In a hard real-time application, such an approach is unacceptable, because the safety of the design in all specified situations, even those that occur only very rarely, must be demonstrated vis-a-vis a certification agency. In Section 1.6, a brief analysis of the real-time system market is carried out, with emphasis on the field of embedded real-time systems. An embedded real-time system is part of a self-contained product, e.g., a television set or an automobile.
In the future, embedded real-time systems will form the most important market segment for real-time technology.

1.1 WHEN IS A COMPUTER SYSTEM REAL-TIME?

A real-time computer system is a computer system in which the correctness of the system behavior depends not only on the logical results of the computations, but also on the physical instant at which these results are produced. A real-time computer system is always part of a larger system; this larger system is called a real-time system. A real-time system changes its state as a function of physical time, e.g., a chemical reaction continues to change its state even after its controlling computer system has stopped. It is reasonable to decompose a real-time system into a set of subsystems called clusters (Figure 1.1), e.g., the controlled object (the controlled cluster), the real-time computer system (the computational cluster), and the human operator (the operator cluster). We refer to the controlled object and the operator collectively as the environment of the real-time computer system.

Figure 1.1: Real-time system.

If the real-time computer system is distributed, it consists of a set of (computer) nodes interconnected by a real-time communication network (see also Figure 2.1). The interface between the human operator and the real-time computer system is called the man-machine interface, and the interface between the controlled object and the real-time computer system is called the instrumentation interface. The man-machine interface consists of input devices (e.g., keyboard) and output devices (e.g., display) that interface to the human operator. The instrumentation interface consists of the sensors and actuators that transform the physical signals (e.g., voltages, currents) in the controlled object into a digital form and vice versa. A node with an instrumentation interface is called an interface node.
A real-time computer system must react to stimuli from the controlled object (or the operator) within time intervals dictated by its environment. The instant at which a result must be produced is called a deadline. If a result has utility even after the deadline has passed, the deadline is classified as soft, otherwise it is firm. If a catastrophe could result if a firm deadline is missed, the deadline is called hard. Consider a railway crossing a road with a traffic signal. If the traffic signal does not change to "red" before the train arrives, a catastrophe could result. A real-time computer system that must meet at least one hard deadline is called a hard real-time computer system or a safety-critical real-time computer system. If no hard real-time deadline exists, then the system is called a soft real-time computer system. The design of a hard real-time system is fundamentally different from the design of a soft real-time system. While a hard real-time computer system must sustain a guaranteed temporal behavior under all specified load and fault conditions, it is permissible for a soft real-time computer system to miss a deadline occasionally. The differences between soft and hard real-time systems will be discussed in detail in the following sections. The focus of this book is on the design of hard real-time systems.

1.2 FUNCTIONAL REQUIREMENTS

The functional requirements of real-time systems are concerned with the functions that a real-time computer system must perform. They are grouped into data collection requirements, direct digital control requirements, and man-machine interaction requirements.

1.2.1 Data Collection

A controlled object, e.g., a car or an industrial plant, changes its state as a function of time. If we freeze time, we can describe the current state of the controlled object by recording the values of its state variables at that moment.
Possible state variables of the controlled object "car" are the position of the car, the speed of the car, the position of switches on the dashboard, and the position of a piston in a cylinder. We are normally not interested in all state variables, but only in the subset of state variables that is significant for our purpose. A significant state variable is called a real-time (RT) entity. Every RT entity is in the sphere of control (SOC) of a subsystem, i.e., it belongs to a subsystem that has the authority to change the value of this RT entity. Outside its sphere of control, the value of an RT entity can be observed, but cannot be modified. For example, the current position of a piston in a cylinder of the engine of the controlled object "car" is in the sphere of control of the car. Outside the car, the current position of the piston can only be observed.

Figure 1.2: Temporal accuracy of the traffic light information.

The first functional requirement of a real-time computer system is the observation of the RT entities in a controlled object and the collection of these observations. An observation of an RT entity is represented by a real-time (RT) image in the computer system. Since the state of the controlled object is a function of real time, a given RT image is only temporally accurate for a limited time interval. The length of this time interval depends on the dynamics of the controlled object. If the state of the controlled object changes very quickly, the corresponding RT image has a very short accuracy interval.

Example: Consider the example of Figure 1.2, where a car enters an intersection controlled by a traffic light. How long is the observation "the traffic light is green" temporally accurate? If the information "the traffic light is green" is used outside its accuracy interval, i.e., a car enters the intersection after the traffic light has switched to red, a catastrophe may occur.
In this example, an upper bound for the accuracy interval is given by the duration of the yellow phase of the traffic light. The set of all temporally accurate real-time images of the controlled object is called the real-time database. The real-time database must be updated whenever an RT entity changes its value. These updates can be performed periodically, triggered by the progression of the real-time clock by a fixed period (time-triggered (TT) observation), or immediately after a change of state, which constitutes an event, occurs in the RT entity (event-triggered (ET) observation). A more detailed analysis of event-triggered and time-triggered observations will be presented in Chapter 5.

Signal Conditioning: A physical sensor, like a thermocouple, produces a raw data element (e.g., a voltage). Often, a sequence of raw data elements is collected and an averaging algorithm is applied to reduce the measurement error. In the next step, the raw data must be calibrated and transformed to standard measurement units. The term signal conditioning refers to all the processing steps that are necessary to obtain meaningful measured data of an RT entity from the raw sensor data. After signal conditioning, the measured data must be checked for plausibility and related to other measured data to detect a possible fault of the sensor. A data element that is judged to be a correct RT image of the corresponding RT entity is called an agreed data element.

Alarm Monitoring: An important function of a real-time computer system is the continuous monitoring of the RT entities to detect abnormal process behaviors. For example, the rupture of a pipe in a chemical plant will cause many RT entities (diverse pressures, temperatures, liquid levels) to deviate from their normal operating ranges and to cross some preset alarm limits, thereby generating a set of correlated alarms, which is called an alarm shower.
The computer system must detect and display these alarms and must assist the operator in identifying the primary event that was the initial cause of these alarms. For this purpose, observed alarms must be logged in a special alarm log with the exact time at which each alarm occurred. The exact time order of the alarms is helpful in eliminating the secondary alarms, i.e., all alarms that are a consequence of the primary event. In complex industrial plants, sophisticated knowledge-based systems are used to assist the operator in the alarm analysis. The predictable behavior of the computer system during peak-load alarm situations is of major importance in many application scenarios. A situation that occurs infrequently but is of utmost concern when it does occur is called a rare-event situation. The validation of the rare-event performance of a real-time computer system is a challenging task.

Example: The sole purpose of a nuclear power plant monitoring and shutdown system is reliable performance in a peak-load alarm situation (rare event). Hopefully, this rare event will never occur.

1.2.2 Direct Digital Control

Many real-time computer systems must calculate the set points for the actuators and control the controlled object directly (direct digital control, DDC), i.e., without any underlying conventional control system. Control applications are highly regular, consisting of an (infinite) sequence of control periods, each one starting with the sampling of the RT entities, followed by the execution of the control algorithm to calculate a new set point, and concluding with the output of the set point to the actuator. The design of a proper control algorithm that achieves the desired control objective and compensates for the random disturbances that perturb the controlled object is the topic of the field of control engineering. In the next section on temporal requirements, some basic notions of control engineering will be introduced.
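The control-period structure just described (sample the RT entity, run the control algorithm, output the new set point to the actuator) can be sketched in a few lines. The following is an illustrative sketch only, not an algorithm from the text: the proportional gain, the set point, and the toy first-order plant model are all invented for the example.

```python
# Minimal sketch of direct digital control (DDC): each control period
# samples the controlled variable, computes a new actuator set point,
# and outputs it. Plant model and gain are invented for illustration.

def control_step(measured_temp, set_point, gain=0.01):
    """One control period: compute a new valve setting from the error term."""
    error = set_point - measured_temp               # deviation from set point
    valve = max(0.0, min(1.0, 0.5 + gain * error))  # clamp actuator to [0, 1]
    return valve

def simulate(periods, set_point=50.0, temp=20.0):
    """Run a sequence of control periods against a toy first-order plant."""
    for _ in range(periods):
        valve = control_step(temp, set_point)   # sample + control algorithm
        temp += 0.2 * (100.0 * valve - temp)    # plant reacts to the actuator
    return temp
```

After enough control periods the simulated liquid temperature settles near the set point, which is all the sketch is meant to show; designing a control algorithm with guaranteed behavior is the subject of control engineering.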
1.2.3 Man-Machine Interaction

A real-time computer system must inform the operator of the current state of the controlled object, and must assist the operator in controlling the machine or plant object. This is accomplished via the man-machine interface, a critical subsystem of major importance. Many catastrophic computer-related accidents in safety-critical real-time systems have been traced to mistakes made at the man-machine interface [Lev95]. Most process-control applications contain, as part of the man-machine interface, an extensive data logging and data reporting subsystem that is designed according to the demands of the particular industry. For example, in some countries the pharmaceutical industry is required by law to record and store all relevant process parameters of every production batch in archival storage, so that the process conditions prevailing at the time of a production run can be reexamined in case a defective product is identified on the market at a later time. Man-machine interfacing has become such an important issue in the design of computer-based systems that a number of courses dealing with this topic have been developed. In the context of this book, we will introduce an abstract man-machine interface in Section 4.3.1, but we will not cover its design in detail. The interested reader is referred to standard textbooks on man-machine interfacing, such as the books by Ebert [Ebe94] or by Hix and Hartson [Hix93].

1.3 TEMPORAL REQUIREMENTS

1.3.1 Where Do Temporal Requirements Come From?

The most stringent temporal demands for real-time systems have their origin in the requirements of the control loops, e.g., in the control of a fast mechanical process such as an automotive engine.
The temporal requirements at the man-machine interface are, in comparison, less stringent, because the human perception delay, in the range of 50-100 msec, is orders of magnitude larger than the latency requirements of fast control loops.

Figure 1.3: A simple control loop.

A Simple Control Loop: Consider the simple control loop depicted in Figure 1.3, consisting of a vessel with a liquid, a heat exchanger connected to a steam pipe, and a controlling computer system. The objective of the computer system is to control the valve (control variable) determining the flow of steam through the heat exchanger so that the temperature of the liquid in the vessel remains within a small range around the set point selected by the operator. The focus of the following discussion is on the temporal properties of this simple control loop consisting of a controlled object and a controlling computer system.

Figure 1.4: Delay and rise time of the step response.

The Controlled Object: Assume that the system is in equilibrium. Whenever the steam flow is increased by a step function, the temperature of the liquid in the vessel will change according to Figure 1.4 until a new equilibrium is reached. This response function of the temperature depends on the amount of liquid in the vessel and the flow of steam through the heat exchanger, i.e., on the dynamics of the controlled object. (In the following, we will use d to denote a duration and t to denote a point in time.) There are two important temporal parameters characterizing this elementary step-response function: the object delay d_object, after which the measured variable temperature begins to rise (caused by the initial inertia of the process, called the process lag), and the rise time d_rise of the temperature until the new equilibrium state has been reached.
To determine the object delay d_object and the rise time d_rise from a given experimentally recorded shape of the step-response function, one finds the two points in time where the response function has reached 10% and 90% of the difference between the two stationary equilibrium values. These two points are connected by a straight line (Figure 1.4). The significant points in time that characterize the object delay d_object and the rise time d_rise of the step-response function are constructed by finding the intersection of this straight line with the two horizontal lines that extend the two liquid temperatures corresponding to the stable states before and after the application of the step function.

Controlling Computer System: The controlling computer system must sample the temperature of the vessel periodically to detect any deviation between the intended value and the actual value of the controlled variable. The constant duration between two sample points is called the sampling period d_sample, and the reciprocal 1/d_sample is the sampling frequency f_sample. A rule of thumb is that, in a digital system which is expected to behave like a quasi-continuous system, the sampling period should be less than one-tenth of the rise time d_rise of the step-response function of the controlled object, i.e., d_sample < (d_rise / 10). The computer compares the measured temperature to the temperature set point selected by the operator and calculates the error term. This error term forms the basis for the calculation of a new value of the control variable by a control algorithm. A given time interval after each sampling point, called the computer delay d_computer, the controlling computer will output this new value of the control variable to the control valve, thus closing the control loop. The delay d_computer should be smaller than the sampling period d_sample. The difference between the maximum and the minimum values of the delay is called the jitter of the delay, Δd_computer.
This jitter is a sensitive parameter for the quality of control, as will be discussed in Section 1.3.2. The dead time of the open control loop is the time interval between the observation of the RT entity and the start of a reaction of the controlled object due to a computer action based on this observation. The dead time is the sum of the object delay d_object, which is in the sphere of control of the controlled object and is thus determined by the controlled object's dynamics, and the computer delay d_computer, which is determined by the computer implementation. To reduce the dead time in a control loop and to improve the stability of the control loop, these delays should be as small as possible.

Figure 1.5: Delay and delay jitter.

The computer delay d_computer is defined by the time interval between the sampling point, i.e., the observation of the controlled object, and the use of this information (see Figure 1.5), i.e., the output of the corresponding actuator signal to the controlled object. Apart from the time necessary for performing the calculations, the computer delay is determined by the time required for communication.

Table 1.1: Parameters of an elementary control loop.

Parameters of a Control Loop: Table 1.1 summarizes the temporal parameters that characterize the elementary control loop depicted in Figure 1.3. The first two columns denote the symbol and the name of each parameter. The third column denotes the sphere of control in which the parameter is located, i.e., which subsystem determines the value of the parameter. Finally, the fourth column indicates the relationships between these temporal parameters.

Figure 1.6: The effect of jitter on the measured variable T.

1.3.2 Minimal Latency Jitter

The data items in control applications are state-based, i.e., they contain images of the RT entities.
The computational actions in control applications are mostly time-triggered, e.g., the control signal for obtaining a sample is derived from the progression of time within the computer system. This control signal is thus in the sphere of control of the computer system. It is known in advance when the next control action must take place. Many control algorithms are based on the assumption that the delay jitter Δd_computer is very small compared to the delay d_computer, i.e., that the delay is close to constant. This assumption is made because control algorithms can be designed to compensate for a known constant delay. Delay jitter brings an additional uncertainty into the control loop that has an adverse effect on the quality of control. The jitter Δd can be seen as an uncertainty about the instant at which the RT entity was observed. This jitter can be interpreted as causing an additional value error ΔT of the measured variable temperature T, as shown in Figure 1.6. Therefore, the delay jitter should always be a small fraction of the delay, i.e., if a delay of 1 msec is demanded, then the delay jitter should be in the range of a few µsec [SAE95].

1.3.3 Minimal Error-Detection Latency

Hard real-time applications are, by definition, safety-critical. It is therefore important that any error within the control system, e.g., the loss or corruption of a message or the failure of a node, is detected within a short time with a very high probability. The required error-detection latency must be of the same order of magnitude as the sampling period of the fastest critical control loop. It is then possible to perform some corrective action, or to bring the system into a safe state, before the consequences of an error can cause any severe system failure. Jitterless systems will always have a shorter error-detection latency than systems that allow for jitter, since in a jitterless system a failure can be detected as soon as the expected event fails to occur [Lin96].
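The interpretation of delay jitter as an additional value error (Figure 1.6) can be made concrete: if the measured variable changes with a maximum gradient dT/dt, then a delay jitter Δd translates into a worst-case value error ΔT = (dT/dt) · Δd. A minimal sketch follows; the numeric values are invented for illustration.

```python
def jitter_value_error(rate_of_change, delay_jitter):
    """Worst-case value error caused by not knowing the exact sampling instant.

    rate_of_change: maximum gradient of the measured variable (e.g., K/s)
    delay_jitter:   difference between maximum and minimum delay (s)
    """
    return rate_of_change * delay_jitter

# A temperature rising at 2 K/s observed with 5 ms of delay jitter carries
# an additional uncertainty of 0.01 K attributable to the jitter alone.
delta_T = jitter_value_error(2.0, 0.005)
```

This is why, for a fast process, a demanded delay of 1 msec goes together with a jitter budget of only a few µsec: the value error scales linearly with the jitter.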
1.4 DEPENDABILITY REQUIREMENTS

The notion of dependability covers the metafunctional attributes of a computer system that relate to the quality of service a system delivers to its users during an extended interval of time. (A user could be a human or another technical system.) The following measures of dependability attributes are of importance [Lap92]:

1.4.1 Reliability

The reliability R(t) of a system is the probability that the system will provide the specified service until time t, given that the system was operational at t = t_o. If a system has a constant failure rate of λ failures/hour, then the reliability at time t is given by

R(t) = exp(-λ(t - t_o)),

where t - t_o is given in hours. The inverse of the failure rate, 1/λ = MTTF, is called the Mean Time To Failure (MTTF, in hours). If the failure rate of a system is required to be in the order of 10^-9 failures/h or lower, then we speak of a system with an ultrahigh reliability requirement.

1.4.2 Safety

Safety is reliability regarding critical failure modes. A critical failure mode is said to be malign, in contrast with a noncritical failure mode, which is benign. In a malign failure mode, the cost of a failure can be orders of magnitude higher than the utility of the system during normal operation. Examples of malign failures are: an airplane crash due to a failure in the flight-control system, and an automobile accident due to a failure of a computer-controlled intelligent brake in the automobile. Safety-critical (hard) real-time systems must have a failure rate with regard to critical failure modes that conforms to the ultrahigh reliability requirement. Consider the example of a computer-controlled brake in an automobile. The failure rate of a computer-caused critical brake failure must be lower than the failure rate of a conventional braking system.
Under the assumption that a car is operated about one hour per day on average, one safety-critical failure per million cars per year translates into a failure rate in the order of 10^-9 failures/h. Similarly low failure rates are required in flight-control systems, train-signaling systems, and nuclear power plant monitoring systems.

Certification: In many cases, the design of a safety-critical real-time system must be approved by an independent certification agency. The certification process can be simplified if the certification agency can be convinced that:

(i) The subsystems that are critical for the safe operation of the system are protected by stable interfaces that eliminate the possibility of error propagation from the rest of the system into these safety-critical subsystems.

(ii) All scenarios that are covered by the given load- and fault-hypothesis can be handled according to the specification, without reference to probabilistic arguments. This makes a resource-adequate design necessary.

(iii) The architecture supports a constructive certification process, where the certification of subsystems can be done independently of each other, e.g., the proof that a communication subsystem meets all deadlines is independent of the proof of the performance of a node. This requires that subsystems have a high degree of autonomy and clairvoyance (knowledge about the future).

[Joh92] specifies the required properties for a system that is "designed for validation":

(i) A complete and accurate reliability model can be constructed. All parameters of the model that cannot be deduced analytically must be measurable in feasible time under test.

(ii) The reliability model does not include state transitions representing design faults; analytical arguments must be presented to show that design faults will not cause system failure.
(iii) Design tradeoffs are made in favor of designs that minimize the number of parameters that must be measured and that simplify the analytical arguments.

1.4.3 Maintainability

Maintainability is a measure of the time required to repair a system after the occurrence of a benign failure. Maintainability is measured by the probability M(d) that the system is restored within a time interval d after the failure. In keeping with the reliability formalism, a constant repair rate μ (repairs per hour) and a Mean Time To Repair (MTTR) are introduced to define a quantitative maintainability measure. There is a fundamental conflict between reliability and maintainability. A maintainable design requires the partitioning of a system into a set of smallest replaceable units (SRUs) connected by serviceable interfaces that can be easily disconnected and reconnected to replace a faulty SRU in case of a failure. A serviceable interface, e.g., a plug connection, has a significantly higher physical failure rate than a non-serviceable interface, e.g., a solder connection. Furthermore, a serviceable interface is more expensive to produce. These conflicts between reliability and maintainability are the reason why many mass-produced consumer products are designed for reliability at the expense of maintainability.

1.4.4 Availability

Availability is a measure of the delivery of correct service with respect to the alternation of correct and incorrect service, and is measured by the fraction of time that the system is ready to provide the service. Consider the example of a telephone switching system. Whenever a user picks up the phone, the system should be ready to provide the telephone service with a very high probability. A telephone exchange is allowed to be out of service for only a few minutes per year.
In systems with constant failure and repair rates, the reliability (MTTF), maintainability (MTTR), and availability (A) measures are related by

A = MTTF / (MTTF + MTTR).

The sum MTTF + MTTR is sometimes called the Mean Time Between Failures (MTBF). Figure 1.7 shows the relationship between MTTF, MTTR, and MTBF.

Figure 1.7: Relationship between MTTF, MTBF and MTTR.

A high availability can be achieved either by a long MTTF or by a short MTTR. The designer thus has some freedom in the selection of her/his approach to the construction of a high-availability system.

1.4.5 Security

A fifth important attribute of dependability, the security attribute, is concerned with the ability of a system to prevent unauthorized access to information or services. There are difficulties in defining a quantitative security measure, e.g., the specification of a standard burglar that takes a certain time to intrude into a system. Traditionally, security issues have been associated with large databases, where the concerns are confidentiality, privacy, and authenticity of information. During the last few years, security issues have also become important in real-time systems, e.g., a cryptographic theft-avoidance system that locks the ignition of a car if the user cannot present the specified access code.

1.5 CLASSIFICATION OF REAL-TIME SYSTEMS

In this section we classify real-time systems from different perspectives. The first two classifications, hard real-time versus soft real-time (on-line) and fail-safe versus fail-operational, depend on the characteristics of the application, i.e., on factors outside the computer system. The other three classifications, guaranteed-timeliness versus best-effort, resource-adequate versus resource-inadequate, and event-triggered versus time-triggered, depend on the design and implementation, i.e., on factors inside the computer system.
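The dependability measures of Section 1.4 can be checked numerically: the reliability formula R(t) = exp(-λt) of Section 1.4.1, the availability relation A = MTTF/(MTTF + MTTR) of Section 1.4.4, and the failure-rate arithmetic of the braking example in Section 1.4.2. The MTTF and MTTR values below are invented for illustration.

```python
import math

def reliability(failure_rate, hours):
    """R(t) = exp(-lambda * t): probability of correct service until time t,
    for a constant failure rate lambda (Section 1.4.1), with t_o = 0."""
    return math.exp(-failure_rate * hours)

def availability(mttf, mttr):
    """A = MTTF / (MTTF + MTTR) for constant failure and repair rates."""
    return mttf / (mttf + mttr)

# Braking example (Section 1.4.2): one safety-critical failure per million
# cars per year, with each car operated about one hour per day (~365
# operating hours per year), gives a failure rate in the order of
# 10^-9 failures per operating hour.
lam = 1.0 / (1_000_000 * 365)   # roughly 2.7e-9 failures/h

# Invented example values: MTTF of 10,000 h and MTTR of 10 h.
A = availability(10_000.0, 10.0)   # close to 0.999
```

The example confirms the point made in the text: a high availability can be reached either by stretching the MTTF or by shrinking the MTTR.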
1.5.1 Hard Real-Time System versus Soft Real-Time System

The design of a hard real-time system, which must produce its results at the correct instant, is fundamentally different from the design of a soft real-time or an on-line system, such as a transaction processing system. In this section we will elaborate on these differences. Table 1.2 compares the characteristics of hard real-time systems versus soft real-time systems.

Table 1.2: Hard real-time versus soft real-time systems.

Response Time: The demanding response time requirements of hard real-time applications, often in the order of milliseconds or less, preclude direct human intervention during normal operation and in critical situations. A hard real-time system must be highly autonomous to maintain safe operation of the process. In contrast, the response time requirements of soft real-time and on-line systems are often in the order of seconds. Furthermore, if a deadline is missed in a soft real-time system, no catastrophe can result.

Peak-Load Performance: In a hard real-time system, the peak-load scenario must be well defined. It must be guaranteed by design that the computer system meets the specified deadlines in all situations, since the utility of many hard real-time applications depends on their predictable performance during the rare-event scenarios that lead to a peak load. This is in contrast to the situation in a soft real-time system, where the average performance is important, and a degraded operation in a rarely occurring peak-load case is tolerated for economic reasons.

Control of Pace: A hard real-time computer system must remain synchronous with the state of the environment (the controlled object and the human operator) in all operational scenarios. It is thus paced by the state changes occurring in the environment. This is in contrast to an on-line system, which can exercise some control over the environment in case it cannot process the offered load.
Consider the case of a transaction processing system, such as an airline reservation system. If the computer cannot keep up with the demands of the operators, it just extends the response time and forces the operators to slow down.

Safety: The safety criticality of many real-time applications has a number of consequences for the system designer. In particular, error detection must be autonomous, so that the system can initiate appropriate recovery actions within the time intervals dictated by the application.

Size of Data Files: Real-time systems have small data files, which constitute the real-time database that is composed of the temporally accurate images of the RT entities. The key concern in hard real-time systems is the short-term temporal accuracy of the real-time database, which is invalidated by the flow of real time. In contrast, in on-line transaction processing systems, the maintenance of the long-term integrity of large data files is the key issue.

Redundancy Type: After an error has been detected in an on-line system, the computation is rolled back to a previously established checkpoint to initiate a recovery action. In hard real-time systems, roll-back/recovery is of limited utility for the following reasons:

(i) It is difficult to guarantee the deadline after the occurrence of an error, since the roll-back/recovery action can take an unpredictable amount of time.

(ii) An irrevocable action (see Section 5.5.1) that has been effected on the environment cannot be undone.

(iii) The temporal accuracy of the checkpoint data is invalidated by the time difference between the checkpoint time and the instant now.

The topic of data integrity is discussed at length in Section 5.4, while the issues of error detection and types of redundancy are dealt with in Chapter 6.
1.5.2 Fail-safe versus Fail-Operational

For some hard real-time systems, one or more safe states that can be reached in case of a system failure can be identified. Consider the example of a railway signaling system. In case a failure is detected, it is possible to stop all the trains and to set all the signals to red to avoid a catastrophe. If such a safe state can be identified and quickly reached upon the occurrence of a failure, then we call the system fail-safe. Fail-safeness is a characteristic of the controlled object, not the computer system. In fail-safe applications the computer system must have a high error-detection coverage, i.e., the probability that an error is detected, provided it has occurred, must be close to one.

In many real-time computer systems a special external device, a watchdog, is provided to monitor the operation of the computer system. The computer system must send a periodic life-sign (e.g., a digital output of predefined form) to the watchdog. If this life-sign fails to arrive at the watchdog within the specified time interval, the watchdog assumes that the computer system has failed and forces the controlled object into a safe state. In such a system, timeliness is needed only to achieve high availability, but is not needed to maintain safety, since the watchdog forces the controlled object into a safe state in case of a timing violation.

There are, however, applications where a safe state cannot be identified, e.g., a flight control system aboard an airplane. In such an application the computer system must provide a minimal level of service to avoid a catastrophe even in the case of a failure. This is why these applications are called fail-operational.
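The watchdog side of the life-sign protocol described above can be sketched as follows. This is a minimal illustrative model, not an implementation from the text; the names and the timeout value are hypothetical, and forcing the safe state is reduced to setting a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeout: number of watchdog ticks that may elapse
 * without a life-sign before the computer is assumed to have failed. */
#define LIFE_SIGN_TIMEOUT_TICKS 10u

typedef struct {
    uint32_t ticks_since_life_sign;  /* ticks since last life-sign     */
    bool     safe_state_forced;      /* true after a timing violation  */
} watchdog_t;

/* Called whenever the monitored computer emits its periodic life-sign. */
void watchdog_life_sign(watchdog_t *wd)
{
    wd->ticks_since_life_sign = 0;
}

/* Called by the watchdog's own periodic timer: on a missing life-sign
 * it forces the controlled object into the safe state (e.g., all
 * railway signals set to red). */
void watchdog_tick(watchdog_t *wd)
{
    if (++wd->ticks_since_life_sign > LIFE_SIGN_TIMEOUT_TICKS)
        wd->safe_state_forced = true;
}
```

Note that the watchdog maintains safety on its own; the monitored computer's timeliness only determines availability, as stated above.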
1.5.3 Guaranteed-Response versus Best-Effort

If we start out with a specified fault- and load-hypothesis and deliver a design that makes it possible to reason about the adequacy of the design without reference to probabilistic arguments, then, even in the case of a peak load and fault scenario, we can speak of a system with a guaranteed response. The probability of failure of a perfect system with guaranteed response is reduced to the probability that the assumptions about the peak load and the number and types of faults hold in reality (see Section 4.1.1 on assumption coverage). Guaranteed response systems require careful planning and extensive analysis during the design phase.

If such an analytic response guarantee cannot be given, we speak of a best-effort design. Best-effort systems do not require a rigorous specification of the load- and fault-hypothesis. The design proceeds according to the principle "best possible effort taken", and the sufficiency of the design is established during the test and integration phases. It is very difficult to establish that a best-effort design operates correctly in rare-event scenarios. At present, many non-safety-critical real-time systems are designed according to the best-effort paradigm.

1.5.4 Resource-Adequate versus Resource-Inadequate

Guaranteed response systems are based on the principle of resource adequacy, i.e., there are enough computing resources available to handle the specified peak load and the fault scenario [Law92]. Many non-safety-critical real-time system designs are based on the principle of resource inadequacy. It is assumed that the provision of sufficient resources to handle every possible situation is not economically viable, and that a dynamic resource allocation strategy based on resource sharing and probabilistic arguments about the expected load and fault scenarios is acceptable.
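A simple, standard way to begin a resource-adequacy argument, not taken from the text but commonly used in schedulability analysis, is a processor utilization check: under the specified peak-load hypothesis, the worst-case execution times of all periodic tasks must fit within the processor's capacity. The sketch below shows only this necessary condition (utilization not exceeding 1 on a single processor); a full guarantee needs a complete scheduling analysis.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a periodic task under the load-hypothesis. */
typedef struct {
    double wcet_ms;    /* worst-case execution time per activation */
    double period_ms;  /* activation period                        */
} task_t;

/* Necessary (not sufficient) resource-adequacy condition for one
 * processor: the total utilization of the task set must not exceed 1. */
bool resources_adequate(const task_t *tasks, size_t n)
{
    double utilization = 0.0;
    for (size_t i = 0; i < n; i++)
        utilization += tasks[i].wcet_ms / tasks[i].period_ms;
    return utilization <= 1.0;
}
```

The point of the guaranteed-response paradigm is that such checks are performed analytically at design time, against the stated peak-load hypothesis, rather than discovered during testing.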
It is expected that, in the future, there will be a paradigm shift to resource-adequate designs in many applications. The use of computers in important volume-based applications, e.g., in cars, will raise both the public awareness of, and the concerns about, computer-related incidents, and will force the designer to provide convincing arguments that the design will function properly under all stated conditions. Hard real-time systems must be designed according to the guaranteed-response paradigm that requires the availability of adequate resources.

1.5.5 Event-Triggered versus Time-Triggered

The flow of real time can be modeled by a directed time line that extends from the past into the future. Any occurrence that happens at a cut of this time line is called an event. Information that describes an event (see also Section 5.2.4 on event observation) is called event information. The present point in time, now, is a very special event that separates the past from the future (the presented model of time is based on Newtonian physics and disregards relativistic effects). An interval on the time line is defined by two events, the start event and the terminating event. The duration of the interval is the time of the terminating event minus the time of the start event. Any property of an RT entity or an object that remains valid during a finite duration is called a state attribute; the corresponding information is called state information. A change of state is thus an event. An observation is an event that records the state of an RT entity at a particular instant, the point of observation. A digital clock partitions the time line into a sequence of equally spaced durations, called the granules of the clock, which are bounded by special periodic events, the ticks of the clock. A trigger is an event that causes the start of some action, e.g., the execution of a task or the transmission of a message.
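When the periodic ticks of a clock are the only triggers in a node, all activity can be driven from a static dispatch table. The following is a minimal sketch of this idea, with hypothetical task names; the run counters exist only to make the sketch observable, and a real node would sample sensors and drive actuators in these slots.

```c
#define CYCLE_TICKS 4u

/* Observation counters for the sketch (one per schedule slot). */
static unsigned run_count[CYCLE_TICKS];

static void read_sensors(void)    { run_count[0]++; }  /* sample controlled object */
static void control_law(void)     { run_count[1]++; }  /* compute actuator values  */
static void write_actuators(void) { run_count[2]++; }  /* output to environment    */
static void send_message(void)    { run_count[3]++; }  /* communication slot       */

typedef void (*task_fn)(void);

/* Static dispatch table fixed at design time: one entry per clock granule. */
static const task_fn schedule[CYCLE_TICKS] = {
    read_sensors, control_law, write_actuators, send_message
};

/* Invoked by the single periodic clock interrupt of the node: every
 * activity is triggered by the progression of time alone. */
void on_clock_tick(unsigned tick)
{
    schedule[tick % CYCLE_TICKS]();
}
```

Note that this structure leaves nothing to dynamic decisions at run time, which is precisely what makes the time-triggered approach discussed next analyzable at design time.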
Depending on the triggering mechanisms for the start of communication and processing activities in each node of a computer system, two distinctly different approaches to the design of real-time computer applications can be identified [Kop93b, Tis95]. In the event-triggered (ET) approach, all communication and processing activities are initiated whenever a significant change of state, i.e., an event other than the regular event of a clock tick, is noted. In the time-triggered (TT) approach, all communication and processing activities are initiated at predetermined points in time.

In an ET system, the signaling of significant events is realized by the well-known interrupt mechanism, which brings the occurrence of a significant event to the attention of the CPU. ET systems require a dynamic scheduling strategy to activate the appropriate software task that services the event. In a time-triggered (TT) system, all activities are initiated by the progression of time. There is only one interrupt in each node of a distributed TT system, the periodic clock interrupt, which partitions the continuum of time into the sequence of equally spaced granules.

In a distributed TT real-time system, it is assumed that the clocks of all nodes are synchronized to form a global notion of time, and that every observation of the controlled object is timestamped with this synchronized time. The granularity of the global time must be chosen such that the time order of any two observations made anywhere in a distributed TT system can be established from their timestamps [Kop92]. The topics of global time and clock synchronization will be discussed at length in Chapter 3.

1.6 THE REAL-TIME SYSTEMS MARKET

In a market economy, the cost/performance relation is a decisive parameter for the market success of any product. There are only a few scenarios where cost arguments are not the major concern.
The total life-cycle cost of a product can be broken down into three rough categories: development cost, production cost, and maintenance cost. Depending on the product type, the distribution of the total life-cycle cost over these three cost categories can vary significantly. We will examine this life-cycle cost distribution by looking at two important examples of real-time systems, embedded systems and plant-automation systems.

1.6.1 Embedded Real-Time Systems

The ever-decreasing price/performance ratio of microcontrollers makes it economically attractive to replace the conventional mechanical or electronic control system within many products by an embedded real-time computer system. There are numerous examples of products with embedded computer systems: engine controllers in cars, heart pacemakers, FAX machines, cellular phones, computer printers, television sets, washing machines; even some electric razors contain a microcontroller with a few thousand instructions of software code [Ran94]. Because the external interfaces of the product, and in particular the man-machine interface, often remain unchanged relative to the previous product generation, it is often not visible from the outside that a real-time computer system is controlling the product behavior.

Characteristics: An embedded real-time computer system is always part of a well-specified larger system, which we call an intelligent product. An intelligent product consists of a mechanical subsystem, the controlling embedded computer, and, most often, a man-machine interface. The ultimate success of any intelligent product depends on the relevance and quality of service it can provide to its users. A focus on the genuine user needs is thus of utmost importance.
Embedded systems have a number of distinctive characteristics that influence the system development process:

(i) Mass Production: embedded systems are designed for a mass market and consequently for mass production in highly automated assembly plants. This implies that the production cost of a single unit must be as low as possible, i.e., efficient memory and processor utilization are of concern.

(ii) Static Structure: the computer system is embedded in an intelligent product of given functionality and rigid structure. The a priori known static environment can be analyzed at design time to simplify the software, to increase the robustness, and to improve the efficiency of the embedded computer system. In an embedded system there is little need for flexible dynamic software mechanisms that increase the resource requirements, reduce the error-detection coverage, and lead to unnecessary complexity of the implementation.

(iii) Man-Machine Interface: if an embedded system has a man-machine interface, it must be specifically designed for the stated purpose and must be easy to operate. Ideally, the use of the intelligent product should be self-explanatory, and not require any training or reference to an operating manual.

(iv) Minimization of the Mechanical Subsystem: to reduce the manufacturing cost and to increase the reliability of the intelligent product, the complexity of the mechanical subsystem is minimized.

(v) Functionality Determined by Software in Read-Only Memory: the functionality of an intelligent product is determined by the integrated software that resides in read-only memory. Because there is hardly any possibility to modify the software after its release, the quality standards for this software are high.

(vi) Maintenance Strategy: many intelligent products are designed to be non-maintainable, because the partitioning of the product into replaceable units is too expensive.
If, however, a product is designed to be maintained in the field, the provision of an excellent diagnostic interface and a self-evident maintenance strategy is of importance.

(vii) Ability to Communicate: although most intelligent products start out as stand-alone units, many intelligent products are required to interconnect with some larger system at a later stage. The protocol controlling the data transfer should be simple and robust. An optimization of the transmission speed is seldom an issue.

By far the largest fraction of the life-cycle cost of an intelligent product is in the production, i.e., in the hardware, whereas the development cost and software cost are only a small part, sometimes less than 5% of the life-cycle cost. The a priori known static configuration of the intelligent product can be used to reduce the resource requirements, and thus the production cost, and also to increase the robustness of the embedded computer system. Maintenance cost can become significant, particularly if an undetected design fault (software fault) requires a recall of the product and the replacement of a complete production series.

Example: In [Neu96] we find the following laconic one-liner (see also Problem 1.19): General Motors recalls almost 300K cars for engine software flaw.

The Four Phases: During the short history of embedded real-time systems, a characteristic pattern has emerged for the deployment of computer technology within a product family [Bou95]. In the first phase, an ad hoc stand-alone computer implementation on a microcomputer without an operating system realizes the given function of the conventional control system. The software is developed by engineers who understand the application and have little training in computer technology. To be cost-competitive with the conventional control system, this first implementation tries to minimize resource requirements (e.g., memory) at the expense of software structure.
In the second phase, the functionality of the product is augmented by adding software functions to improve the utility of the intelligent product. The increasing software complexity leads to reliability problems and forces the system designer to step back and to introduce a software architecture and an operating system in the third phase. This third phase requires a fundamental redesign of the software, which produces additional development cost without any significant increase in visible functions. It is thus a critical phase for the organization that is developing a product.

Finally, in the fourth phase, the intelligent product is seen as part of a larger system that needs to communicate with its environment. Communication interfaces are first developed within a company, and then standardized across an industrial sector. This standardization makes it possible to define standard subsystems that can be implemented cost-effectively by application-specific VLSI solutions with large production numbers for the entire industrial sector. Different industries have started this transition process from conventional technology to computer technology at different times. Therefore, at present, some industries are already further along in this transition than others.

Future Trends: During the last few years, the variety and number of embedded computer applications have grown to the point that, now, this segment is by far the most important one in the real-time systems market. The embedded systems market is driven by the continuing improvements in the cost/performance ratio of the semiconductor industry that make computer-based control systems cost-competitive relative to their mechanical, hydraulic, and electronic counterparts. Among the key mass markets are the fields of consumer electronics and automotive electronics.
The automotive electronics market is of particular interest because of stringent timing, dependability, and cost requirements that act as "technology catalysts". After a conservative approach to computer control during the last ten years, a number of automotive manufacturers now view the proper exploitation of computer technology as a key competitive element in the never-ending quest for increased vehicle performance and reduced manufacturing cost. While some years ago the computer applications on board a car focused on non-critical body electronics or comfort functions, there is now substantial growth in the computer control of core vehicle functions, e.g., engine control, brake control, transmission control, and suspension control. In the not-too-distant future we will observe an integration of many of these functions with the goal of increasing the vehicle stability in critical driving maneuvers. Obviously, an error in any of these core vehicle functions has severe safety implications.

At present the topic of computer safety in cars is approached at two levels. At the basic level a mechanical system provides the proven safety level that is considered sufficient to operate the car. The computer system provides optimized performance on top of the basic mechanical system. In case the computer system fails cleanly, the mechanical system takes over. Consider, for example, an Antilock Braking System (ABS). If the computer fails, the conventional mechanical brake system is still operational. Soon, this approach to safety may reach its limits for two reasons:

(i) If the computer-controlled system is further improved, the magnitude of the difference between the performance of the computer-controlled system and the performance of the basic mechanical system is further increased.
A driver who is used to the high performance of the computer-controlled system might consider the fallback to the inferior performance of the mechanical system a safety risk.

(ii) The improved price/performance of microelectronic devices will make the implementation of fault-tolerant computer systems cheaper than the implementation of mixed computer/mechanical systems. Thus, there will be economic pressure to eliminate the redundant mechanical system and to replace it with a computer system using active redundancy.

The automotive industry operates in a highly competitive worldwide market under extreme economic pressure. Although the design of a new automotive model is a major effort requiring the cooperation of thousands of engineers over a period of three to four years, it is important to realize that more than 95% of the cost of delivering a car lies in manufacturing and marketing, and only 5% of the cost is related to development. The cost-effective and highly dependable computer solutions that are being developed for the automotive market will thus be adopted in many other real-time system applications. It is expected that the automotive market will be the driving force for the real-time systems market. The embedded system market is expected to grow significantly during the next ten years. Compared to other information technology markets, this market will offer, according to a recent study [Ran94], the best employment opportunities for the computer engineers of the future.

1.6.2 Plant Automation Systems

Characteristics: Historically, industrial plant automation was the first field for the application of real-time digital computer control. This is understandable, since the benefits that can be gained by the computerization of a sizable plant are much larger than the cost of even an expensive process control computer of the late 1960's. In the
