DISCRETE EVENT STOCHASTIC PROCESSES
Lecture Notes for an Engineering Curriculum

Anurag Kumar
Department of Electrical Communication Engineering
Indian Institute of Science

© Anurag Kumar, 2012. All rights reserved. No part of these notes may be reproduced, stored in a retrieval system, or transmitted in any form or by any means – electronic, mechanical, photocopying, scanning, or otherwise – without prior written permission of the author.

Contents

Preface
1 A Review of Some Basics
   1.1 Axioms of Probability
       1.1.1 Continuity of Probability
   1.2 Random Variables
       1.2.1 Expectation
   1.3 Stochastic Processes
       1.3.1 Finite Dimensional Distributions
   1.4 Convergence of Random Sequences
       1.4.1 Convergence of Expectation
   1.5 Laws of Large Numbers
   1.6 Notation
   1.7 Notes on the Bibliography
   1.8 Problems
2 Discrete Time Markov Chains
   2.1 Conditional Independence
   2.2 The Markov Property
       2.2.1 Finite Dimensional Distributions
   2.3 The Strong Markov Property
   2.4 Hitting Times and Recurrence
       2.4.1 First Passage Time Distribution
       2.4.2 Number of Returns to a State
   2.5 Communicating Classes and Class Properties
   2.6 Positive Recurrence and the Invariant Probability Vector
   2.7 Transience: A Criterion
   2.8 An Example: The Discrete Time M/M/1 Queue
   2.9 Mean Drift Criteria
   2.10 Notes on the Bibliography
   2.11 Problems
3 Renewal Theory
   3.1 Definition and Some Related Processes
   3.2 The Elementary Renewal Theorem (ERT)
       3.2.1 Application to DTMCs
   3.3 Renewal Reward Processes
       3.3.1 Application to Time Averages
       3.3.2 Length and Batch Biasing
   3.4 The Poisson Process
       3.4.1 Stopping Times
       3.4.2 Other Characterisations
       3.4.3 Splitting and Superposition
   3.5 Regenerative Processes
       3.5.1 Time Averages of a Regenerative Process
   3.6 The Renewal Equation
   3.7 Stationary Renewal Process
   3.8 From Time Averages to Limits
   3.9 Limits for Regenerative Processes
   3.10 Some Topics in Markov Chains
       3.10.1 Relative Rate of Visits
       3.10.2 Limits of DTMCs
   3.11 Appendix
   3.12 Notes on the Bibliography
   3.13 Problems
4 Continuous Time Markov Chains
   4.1 Transition Probability Function
   4.2 Sojourn Time in a State
   4.3 Structure of a Pure Jump CTMC
   4.4 Regular CTMC
   4.5 Communicating Classes
   4.6 Recurrence and Positivity
   4.7 Birth and Death Processes
   4.8 Differential Equations for P(t)
   4.9 Notes on the Bibliography
   4.10 Problems
5 Markov Renewal Theory
   5.1 Markov Renewal Sequences
   5.2 Semi-Markov Processes
   5.3 Markov Regenerative Processes
   5.4 Notes on the Bibliography
   5.5 Problems
Preface

Over a period of 15 years, I taught a course titled Stochastic Processes and Queueing Theory to classes mainly comprising communication engineers, and a few computer scientists. The course (popularly called "SPQT" by the students) was aimed primarily at providing material that prepares students for graduate thesis work in communication networking, an area that draws heavily from the tools of stochastic modeling, optimisation, and control. These notes are essentially a transcription of a part of the material I delivered during my lectures. I have dropped "Queueing Theory" from the title, since I have included here only the material on discrete event stochastic processes, with queues being given as important and useful examples.

The emphasis of the course derives mainly from the textbook by Wolff [17]. It is from this source that the course derives its essentially renewal theoretic emphasis, which distinguishes it from most traditional courses in random processes and queueing theory taught in electrical sciences curricula. The latter typically comprise discrete and continuous time Markov chains (using a primarily matrix algebraic treatment), followed by the analysis of standard queueing models, such as M/M/1, M/G/1, etc. We have found the renewal theoretic approach very appropriate for two reasons:

1. The generality of the approach permits the student to understand and conceive of stochastic models more general than those provided by standard text-book queueing theory. Standard queueing models have been found to be of limited utility in applications such as communication networking; see, for example, Kumar et al. [12]. On the other hand, semi-Markov models, regenerative models, and Markov regenerative models have been found to be very useful, and are essentially developed out of renewal theory.

2.
A renewal theory training provides the student with technical ability that allows him/her to be comfortable with the stochastic analysis that accompanies more advanced techniques, such as sequential statistical analysis and semi-Markov decision theory.

In these notes, several technical details and proofs have been taken from Wolff [17] and from the texts by Çinlar [5], and Karlin and Taylor [10]. Chapter 5 also depends heavily on Kulkarni [11].

Students who attended this course were also taking, or had taken, a first course on probability, random variables, and random processes, from a book such as the classic by Papoulis [15]. With this background, the material presented in these notes can be easily covered in about 28 lectures, each of 1.5 hours duration. After a review of probability theory in Chapter 1, Chapter 2 treats the topic of discrete time Markov chains (DTMCs) in a mainly traditional manner, though some proofs are deferred to results in the following chapter on renewal theory. The reason for taking this approach is that I have found that engineering students develop a physical feel for DTMCs far more easily than for renewal theory. Chapter 3, on renewal theory, always took a substantial amount of time, as it presents somewhat abstract material that students take time to absorb. This is followed by Chapter 4 on continuous time Markov chains, and Chapter 5 on Markov renewal processes. Each chapter is accompanied by several class-tested problems. A solution manual is also available. Readers looking for other compact treatments of topics covered in this book might wish to consider the books by Ross [16] or Gallager [8].

While this course takes the training of a student interested in stochastic modeling beyond that of a first course in probability, random variables, and random processes, there are several important advanced topics that are needed to round off the stochastic systems training of an engineering researcher.
These are martingale theory, weak convergence, and diffusion processes, topics that could reasonably comprise the syllabus of a third course.

Acknowledgements: I lectured from these notes over one and a half decades before deciding to typeset them as a book. Over this period, hundreds of Masters and PhD students in IISc served as sounding boards, testing my presentation with their many questions. Our lab secretary, Chandrika Sridhar, typed up a "raw" version from my handwritten notes. After polishing up this raw version, I handed these notes out to students, many of whom helped as proof-readers, marking up errors and typos on copies of the printed notes. In recent years, I began to use parts of these notes in our first random processes course, for which I had several teaching assistants, namely, P.T. Akhil, M. Ashok Kumar, K.P. Naveen, K. Premkumar, Chandramani K. Singh, and Vineeth B.S. Chandramani K. Singh was particularly helpful in going through all the problem sets, and checking the correctness of the problems and the solutions. When it was nearing completion, the manuscript was read in its entirety by Arpan Chattopadhyay, who spotted several errors. I am grateful to all these individuals who have helped bring these notes to a wider audience. Finally, of course, the responsibility for their correctness rests with me, and I welcome readers to report any errors to me by email.

Anurag Kumar
IISc, Bangalore

Chapter 1

A Review of Some Basics

1.1 Axioms of Probability

Probability provides a mathematical framework for reasoning under uncertainty. Although there are several such frameworks, we will be exclusively concerned with the one motivated by the relative frequency interpretation, and codified in Kolmogorov's axioms of probability. In any situation of probabilistic inference we limit our domain of reasoning to a set of possible basic/elementary outcomes, which we denote by $\Omega$, and usually call the sample space.
For example, if a coin is tossed once then $\Omega = \{\text{Heads}, \text{Tails}\}$. Abbreviating Heads to $H$ and Tails to $T$, if a coin is tossed twice then $\Omega = \{HH, HT, TH, TT\}$. Elements of the set $\Omega$ are denoted by $\omega$, and the empty set is customarily denoted by $\emptyset$. It should be noted that, in general, $\Omega$ may remain abstract, i.e., we do not necessarily have to pin it down in every problem (for example, the sample space of a complex queueing network may be very hard to describe, and we may proceed to analyse this queueing network without stopping to identify the exact structure of $\Omega$).

Probability is assigned to events, or subsets of $\Omega$. $P$ is a real-valued set function defined on a collection of subsets $\mathcal{F}$ of $\Omega$.

Definition 1.1. The set function $P : \mathcal{F} \to \mathbb{R}$, where $\mathcal{F}$ is a collection of subsets of $\Omega$, is said to be a probability measure on $(\Omega, \mathcal{F})$ if the following hold.

I. $\mathcal{F}$ is a $\sigma$-algebra, i.e., $\mathcal{F}$ is closed under complements, finite unions, and countable unions.

II. (a) $P(\Omega) = 1$ (i.e., $P$ is normalised),
(b) for every $A \in \mathcal{F}$, $P(A) \geq 0$ (i.e., $P$ is nonnegative), and
(c) for every countable collection of sets $\{A_i \in \mathcal{F}, i \geq 1\}$ such that, for all $i \neq j$, $A_i \cap A_j = \emptyset$, we have $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$ (i.e., $P$ is countably additive).

We say that the 3-tuple $(\Omega, \mathcal{F}, P)$ is a probability space. In all probability models there is an underlying probability space. In carrying out calculations, it is not always necessary to explicitly describe the underlying probability space, but knowing how to do so often proves useful.

The following theorem is easily proved by using the additivity and the nonnegativity axioms.

Theorem 1.1. $P$ is a monotone set function on $\mathcal{F}$, i.e., if $A, B \in \mathcal{F}$, with $A \subseteq B$, then $P(A) \leq P(B)$.

The second part of the following theorem is called the union bound.

Theorem 1.2.
(i) For every $A, B \in \mathcal{F}$, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
(ii) If $A_i \in \mathcal{F}$, $i \geq 1$, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P\left(A_i \cap \left(\bigcup_{j=1}^{i-1} A_j\right)^c\right) \leq \sum_{i=1}^{\infty} P(A_i)$$

Exercise 1.1. Prove Theorems 1.1 and 1.2.
Hint: Theorem 1.1 follows from the additivity and the nonnegativity axioms, and the first part of Theorem 1.2 follows from finite additivity. The second part of Theorem 1.2 follows from the countable additivity axiom after defining a sequence of sets $E_n = A_n \cap \left(\bigcup_{k=1}^{n-1} A_k\right)^c$, for $n \geq 1$, noticing that the sequence of sets $\{E_n, n \geq 1\}$ partitions $\bigcup_{k=1}^{\infty} A_k$ into disjoint sets, and then applying the monotonicity property. Note that for $n = 1$, $\bigcup_{k=1}^{0} A_k$ is the union of an empty collection of sets, which is the empty set.

Example 1.1. Bernoulli trials

Consider an infinite sequence of coin tosses in each of which the probability of heads (and tails) is 0.5. Let us denote the outcome of heads by 1 and that of tails by 0. The sample space then becomes $\Omega = \{0, 1\}^{\infty}$, i.e., all countably infinite binary sequences. Note that such a sample space will arise in any repeated experiment where we are interested only in two outcomes. For example, die tossing where we only observe whether the toss is even or odd, or packet transmission in a communication network where we are only interested in whether the packet is received correctly or is not. We often call such an experiment Bernoulli trials.

Observe that the probability of any sequence of outcomes, $\omega$, is $\lim_{n \to \infty} \left(\frac{1}{2}\right)^n = 0$. By placing a decimal point to the left of any $\omega \in \Omega$ we can think of each $\omega$ as the binary representation of a number in $[0, 1]$. In this viewpoint each rational number in $[0, 1]$ will correspond to two $\omega \in \Omega$, the recurring and the nonrecurring representation. For example, 0.5 corresponds to $10000\cdots$, and also to $01111\cdots$. Let us take the recurring representation in each such case. By doing this we can associate each number in $[0, 1]$ with all but countably many elements of $\Omega$. The set we have left out will have probability 0 (why?). Thus Bernoulli trials can be viewed as yielding an outcome in $[0, 1]$. Now consider the interval $[0, 0.5]$; this corresponds to all the outcomes in which the first trial yields tails.
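The correspondence between coin-toss sequences and numbers in $[0, 1]$ can be checked numerically. The following Python sketch (not part of the original notes; the helper name `tosses_to_number`, the 30-bit truncation, and the sample size are illustrative choices) draws many toss sequences, maps each to its binary-expansion value, and checks that the intervals discussed in Example 1.1 receive approximately their uniform-distribution probabilities.

```python
import random

def tosses_to_number(bits):
    """Interpret a finite prefix of a coin-toss sequence (heads = 1,
    tails = 0) as the binary expansion 0.b1 b2 b3 ... in [0, 1]."""
    return sum(b / 2 ** (k + 1) for k, b in enumerate(bits))

random.seed(0)
n_trials, prefix_len = 100_000, 30
samples = [tosses_to_number([random.randint(0, 1) for _ in range(prefix_len)])
           for _ in range(n_trials)]

# [0, 0.5] corresponds to "first toss is tails" and should have empirical
# probability near 0.5; (0.5, 0.75] corresponds to "heads then tails" and
# should be near 0.25, matching the uniform distribution on [0, 1].
p_first_tails = sum(x <= 0.5 for x in samples) / n_trials
p_heads_tails = sum(0.5 < x <= 0.75 for x in samples) / n_trials
print(round(p_first_tails, 2), round(p_heads_tails, 2))
```

With 100,000 sampled sequences the two empirical frequencies come out close to 0.5 and 0.25, as the uniform distribution on $[0, 1]$ predicts.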
Hence the probability of this interval is 0.5. Consider the interval $(0.5, 0.75]$; this corresponds to all the outcomes with heads in the first trial and tails in the second trial, and hence has probability 0.25. We see that the probability measure we obtain is the uniform distribution on $[0, 1]$. It turns out that the appropriate $\sigma$-algebra of events is the smallest $\sigma$-algebra containing all the intervals in $[0, 1]$. This is a complicated collection of events to describe and we will not pursue this point further in this book. An important point that this example illustrates is that while each elementary outcome has probability 0, sets of outcomes can have positive probability. Obviously such positive probability sets must have an uncountable infinity of elements (why?).

1.1.1 Continuity of Probability

Consider $A_i \in \mathcal{F}$, $i \geq 1$, such that $A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots$ (or, respectively, $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$); then $\lim_{i \to \infty} A_i := \bigcup_{i=1}^{\infty} A_i$ (respectively, $\bigcap_{i=1}^{\infty} A_i$), and we say that $A_i \uparrow A = \bigcup_{i=1}^{\infty} A_i$ (respectively, $A_i \downarrow A = \bigcap_{i=1}^{\infty} A_i$). Consider the sequence of numbers $\{P(A_i), i \geq 1\}$. By Theorem 1.1 this is a nondecreasing sequence, and the sequence is also bounded above by 1. Hence $\lim_{i \to \infty} P(A_i)$ exists. Similarly, if $A_i \downarrow A$ then $\lim_{i \to \infty} P(A_i)$ exists. The following theorem asserts that in either case the limit of the probabilities of $A_i$ is the probability of the limit set; i.e., probability is a continuous set function.

Definition 1.2. A monotone set function $\mu$ is said to be continuous from above if for every sequence $\{A_i, i \geq 1\}$ such that $A_i \downarrow A$,
$$\lim_{n \to \infty} \mu(A_n) = \mu(A)$$
and similarly for continuity from below. $\mu(\cdot)$ is continuous if it is continuous from above and from below.

Theorem 1.3. A probability measure $P$ is a continuous set function.

Proof: Continuity from below: Given that $\{A_i, i \geq 1\}$ is such that $A_i \uparrow A$, define $E_1 = A_1$, $E_i = A_i \setminus A_{i-1}$, $i \geq 2$ (where $A_i \setminus A_{i-1}$ means $A_i \cap A_{i-1}^c$). Notice that $P(E_i) = P(A_i) - P(A_{i-1})$. It is easily seen that $E_i \cap E_j = \emptyset$, and that $\bigcup_{i=1}^{\infty} A_i = \bigcup_{i=1}^{\infty} E_i$.
The following sequence of equalities can then be written down:
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i) = \lim_{n \to \infty} \sum_{i=1}^{n} P(E_i)$$
$$= \lim_{n \to \infty} \left(P(A_1) + P(A_2) - P(A_1) + \cdots + P(A_n) - P(A_{n-1})\right)$$
$$= \lim_{n \to \infty} P(A_n).$$
In the above calculation, the first equality follows from set equality, the second from countable additivity, and the third is just the meaning of $\sum_{i=1}^{\infty}$. For continuity from above, consider $A_i \downarrow A$. Define $B_i = A_i^c$. Then $B_i \uparrow B$ and we can use continuity from below to reach the desired conclusion.

The following example is a simple illustration of some of the above concepts of limits of sets, continuity of $P(\cdot)$, and countable additivity.

Example 1.2. A close study of this example will help the reader to understand the basic concepts behind countable additivity and continuity. Consider the sample space generated by an infinite sequence of coin tosses, where we denote heads by 1 and tails by 0; i.e., $\Omega = \{0, 1\}^{\infty}$. Denote by $\omega$ an element of $\Omega$, with $\omega_k$ denoting the $k$th element of $\omega$, i.e., the $k$th outcome in the sequence of outcomes denoted by $\omega$. Let, for $n \geq 1$,
$$A_n := \{\omega : \omega_1 = 0, \omega_2 = 0, \ldots, \omega_{n-1} = 0, \omega_n = 1\} = \{\omega : \omega_k = 0 \text{ for all } k < n, \ \omega_n = 1\}$$
Thus, for each $n \geq 1$, $A_n$ is the set of outcomes in which the first 1 occurs at the $n$th trial. Define
$$B := \{\omega : \text{there exists } n \text{ s.t. } \omega_n = 1\}$$
It is evident that the $A_i$ are disjoint. Also, $B = \bigcup_{n \geq 1} A_n$, since for each $\omega \in B$ there is some $n$ such that the first 1 in $\omega$ occurs at that $n$. Hence $P(B) = \sum_{n \geq 1} P(A_n)$ by additivity. Let us obtain this conclusion in an alternate way. To this end, define
$$B_n = \{\omega : \text{there exists } i \leq n \text{ s.t. } \omega_i = 1\}$$
Now, clearly $B_1 \subseteq B_2 \subseteq B_3 \subseteq \cdots$, $B_n = \bigcup_{i=1}^{n} A_i$, and also $B = \bigcup_{i=1}^{\infty} B_i$. Using the continuity of $P(\cdot)$ from below, we obtain
$$P(B) = \lim_{n \to \infty} P(B_n) \quad (1.1)$$
$$= \lim_{n \to \infty} \sum_{i=1}^{n} P(A_i) \quad (1.2)$$
$$= \sum_{i=1}^{\infty} P(A_i) \quad (1.3)$$
where the second equality is obtained by finite additivity of $P(\cdot)$.

1.2 Random Variables

Definition 1.3.
A (real valued) random variable $X$ on $(\Omega, \mathcal{F})$ is a function $X : \Omega \to \mathbb{R}$ such that, for every $x \in \mathbb{R}$, $\{\omega : X(\omega) \leq x\} \in \mathcal{F}$.

Remark: $\{\omega : X(\omega) \leq x\}$ can be read as "the set of all $\omega \in \Omega$ that are mapped by $X$ into the semi-infinite interval $(-\infty, x]$". This subset of $\Omega$ can also be written as the inverse image under $X$, or $X^{-1}((-\infty, x])$. The requirement that, for every $x \in \mathbb{R}$, $X^{-1}((-\infty, x]) \in \mathcal{F}$ is also called the measurability condition, and a random variable is said to be a measurable function from the sample space to $\mathbb{R}$. The following example illustrates why this condition is imposed.

Example 1.3. Consider a single toss of a die. Then $\Omega = \{1, 2, 3, 4, 5, 6\}$. When forming the probability space we are only told whether the die fell even or odd. Hence we are only able to assign probabilities to events in
$$\mathcal{F} = \{\emptyset, \{2, 4, 6\}, \{1, 3, 5\}, \Omega\}$$
Now consider the following mapping from $\Omega$ to $\mathbb{R}$:
$$X(\omega) = \begin{cases} 1 & \text{if } \omega \text{ is divisible by } 3 \\ 0 & \text{otherwise} \end{cases}$$
$X$ is not a random variable, as defined above, since $\{\omega : X(\omega) \in (-\infty, 0]\} = \{1, 2, 4, 5\} \notin \mathcal{F}$.

Remark: Thus a random variable is basically a question that we ask about the experiment (e.g., $X$ asks: "Was the die toss outcome divisible by 3?") and should be answerable based on the available information. In the above example the available information permitted us to only assign probabilities to the events in $\mathcal{F}$. Hence $X$ is not a valid question.

Definition 1.4. If $X$ is a random variable on $(\Omega, \mathcal{F})$ then the function $F : \mathbb{R} \to [0, 1]$ defined by, for $x \in \mathbb{R}$, $F(x) = P(X \leq x)$, i.e., $F(x) = P\{\omega : X(\omega) \leq x\}$, is called the distribution function of $X$ or the cumulative distribution function (c.d.f.) of $X$.

Theorem 1.4. For a random variable $X$, its c.d.f. $F(\cdot)$ satisfies the following:
1. For all $x \in \mathbb{R}$, $F(x) \geq 0$, i.e., $F(\cdot)$ is a nonnegative function.
2. For all $x \in \mathbb{R}$, $\lim_{\epsilon \downarrow 0} F(x + \epsilon) = F(x)$, i.e., $F(\cdot)$ is continuous from the right.
3. If $x_1, x_2 \in \mathbb{R}$, $x_1 \leq x_2$, then $F(x_1) \leq F(x_2)$, i.e., $F(\cdot)$ is a monotone increasing function.
4. If $P(-\infty < X < +\infty) = 1$, then $\lim_{x \to -\infty} F(x) = 0$, and $\lim_{x \to +\infty} F(x) = 1$.
5.
Let $D := \{x \in \mathbb{R} : x \text{ is a point of discontinuity of } F(\cdot)\}$; then $D$ is countable, i.e., $D = \{x_1, x_2, x_3, \ldots\}$ under some indexing of the points of discontinuity ("jump" points) of $F(\cdot)$.

Remarks 1.1.
1. The proofs of the first four parts of Theorem 1.4 are easy exercises, the second part following from the continuity from above of $P(\cdot)$. If $P(X = -\infty) > 0$ (respectively, $P(X = +\infty) > 0$), then the limits in the fourth part will be $> 0$ (respectively, $< 1$). If $P(-\infty < X < +\infty) = 1$ then we say that $X$ is a proper random variable; otherwise we say that $X$ is defective. Correspondingly, we also say that the c.d.f. $F(\cdot)$ of the random variable $X$ is proper or defective.

2. For any $x \in \mathbb{R}$, let us define $F(x+) = \lim_{\epsilon \downarrow 0} F(x + \epsilon)$ and $F(x-) = \lim_{\epsilon \downarrow 0} F(x - \epsilon)$. By the second part of Theorem 1.4, we see that $F(x) = F(x+)$. By the monotone increasing property, it is clear that $F(x-) \leq F(x)$. At a discontinuity $x_i$ of $F(\cdot)$, define $p_i := F(x_i) - F(x_i-) > 0$. We call $p_i$ the point mass at $x_i$. If $F(\cdot)$ is defective then there would be point masses at $+\infty$, or at $-\infty$, or both. It can be shown that the set of discontinuities of a c.d.f. $F(\cdot)$ is countable; hence, the point masses can be indexed $p_i, i = 1, 2, \ldots$. In general, $\sum_{i=1}^{\infty} p_i \leq 1$. If $\sum_{i=1}^{\infty} p_i = 1$ then $X$ is called a discrete random variable.

3. Given a c.d.f. $F(\cdot)$, it can essentially (in the sense of computation of probabilities, expectations, etc.) be decomposed as follows. There is a function $f : \mathbb{R} \to [0, \infty)$ such that, for all $x \in \mathbb{R}$, $F(x) = \sum_{i : x_i \leq x} p_i + \int_{-\infty}^{x} f(u)\,du$. Clearly, $\int_{-\infty}^{\infty} f(u)\,du \leq 1$. This function $f(\cdot)$ is called the density of the continuous part of the distribution. In general $f(\cdot)$ need not be less than 1; as an example, consider the uniform distribution over $[0, 0.5]$. Also, observe that $f(u)\,du$ can be interpreted as $P(X \in (u, u + du))$.

Remark: At this point, the reader should review the concept of independence of events and of random variables from a textbook on basic probability.
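The decomposition in Remark 3 of Remarks 1.1 can be made concrete with a small numerical sketch (not part of the original notes). It assumes, purely for illustration, a mixed distribution with a single point mass $p_1 = 0.3$ at $x_1 = 0$ and continuous density $f(u) = 0.7\,e^{-u}$ for $u > 0$:

```python
import math

# Illustrative mixed distribution: point mass p1 = 0.3 at x1 = 0, plus a
# continuous part with density f(u) = 0.7 * exp(-u) for u > 0, so the
# total probability (point mass + integral of the density) is 1.
p1, x1 = 0.3, 0.0

def F(x):
    """c.d.f.: F(x) = sum_{i : x_i <= x} p_i + integral_{-inf}^{x} f(u) du."""
    if x < x1:
        return 0.0
    return p1 + 0.7 * (1.0 - math.exp(-x))

# The jump F(x1) - F(x1-) recovers the point mass p1 (right-continuity
# means F(0) already includes the mass at 0).
jump = F(0.0) - F(-1e-9)
print(round(jump, 3))      # the point mass at 0
print(round(F(50.0), 6))   # far in the tail, F approaches 1
```

The jump $F(x_1) - F(x_1-)$ recovers the point mass, and $\sum_i p_i + \int_{-\infty}^{\infty} f(u)\,du = 1$, as the decomposition of a proper c.d.f. requires.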
1.2.1 Expectation

A random variable $X$ such that $X(\omega) \geq 0$ for all $\omega \in \Omega$ is called a nonnegative random variable.

Definition 1.5. For a nonnegative random variable $X$, with distribution $F(\cdot)$, with point mass $p_i$ at $x_i$, $i \geq 1$, and density $f(\cdot)$, we define the expectation $E(X)$ by
$$E(X) = \sum_{i=1}^{\infty} x_i p_i + \int_{0}^{\infty} x f(x)\,dx$$

The above general expression for $E(X)$ can be written compactly in terms of the Riemann-Stieltjes integral,¹ as follows:
$$E(X) = \int_{0}^{\infty} x\,dF(x) \quad (1.4)$$

Expectations of continuous functions of random variables can be treated in an analogous manner.² Thus, the Laplace-Stieltjes Transform (LST) of the c.d.f. $F(\cdot)$, for all $s \in \mathbb{C}$, is given by
$$\tilde{F}(s) := E\left(e^{-sX}\right) = \int_{0}^{\infty} e^{-sx}\,dF(x).$$

Exercise 1.2. As an exercise in using Equation (1.4), establish the following useful expressions for the expectation of a nonnegative random variable:
(i) $E(X) = \int_{0}^{+\infty} (1 - F(x))\,dx$
(ii) If $f(x) = 0$ for all $x \in \mathbb{R}$ (with the point masses at the nonnegative integers), then $E(X) = \sum_{k=1}^{\infty} P(X \geq k)$
(Hint: Write $\int_{0}^{\infty} x\,dF(x) = \int_{0}^{\infty} \left(\int_{0}^{x} dy\right) dF(x)$, and then interchange the order of integration.)

¹ Readers unfamiliar with the Riemann-Stieltjes integral, but familiar with the Dirac $\delta$-function, can think formally in terms of writing $dF(x) = \left(\sum_{i=1}^{\infty} p_i \delta(x - x_i) + f(x)\right) dx$. Further, as usual, $\int_{0}^{\infty} x\,dF(x) = \lim_{a \to \infty} \int_{0}^{a} x\,dF(x)$, and this limit can be $\infty$.
² Technical difficulties that arise in more general cases are handled by the rigorous definition of expectation via the Lebesgue-Stieltjes integral.

For a real valued random variable $X$, we denote $X^+ = X I_{\{X \geq 0\}}$, and $X^- = -X I_{\{X < 0\}}$; $X^+$ and $X^-$ are both nonnegative random variables, and are called the positive part and the negative part of $X$. Clearly, we can write $X = X^+ - X^-$, and $|X| = X^+ + X^-$.

Definition 1.6. For a random variable $X$, $E(X) = E(X^+) - E(X^-)$, provided at least one of $E(X^+)$ or $E(X^-)$ is finite.

Example 1.4.
Consider the discrete random variable $X \in \{\ldots, -3, -2, -1, +1, +2, +3, \ldots\}$, which takes the values $+k$ and $-k$, each with probability $\frac{3}{\pi^2 k^2}$. Also, consider the random variable $Y \in \{+1, +2, +3, \ldots\}$, which takes the value $+k$ with probability $\frac{6}{\pi^2 k^2}$. Since $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$, we see that $X$ and $Y$ are proper random variables. Since $\sum_{k=1}^{\infty} k \cdot \frac{3}{\pi^2 k^2} = +\infty$, by Definition 1.6, we conclude that the expectation of $X$ is not defined, whereas $E(Y) = +\infty$.

The following inequality provides a bound on the tail of the distribution of $|X|$ based on the $k$th absolute moment of $X$, i.e., $E\left(|X|^k\right)$.

Theorem 1.5 (Markov Inequality). For every $k \geq 0$, $\epsilon > 0$,
$$P\{|X| \geq \epsilon\} \leq \frac{E\left(|X|^k\right)}{\epsilon^k}$$

Proof: Let $A = \{\omega : |X(\omega)| \geq \epsilon\}$, and write $|X|^k = |X|^k I_A + |X|^k I_{A^c}$. Then
$$E\left(|X|^k\right) = E\left(|X|^k I_A\right) + E\left(|X|^k I_{A^c}\right)$$
$$\geq E\left(|X|^k I_A\right) \quad \text{(because each term is nonnegative)}$$
$$\geq \epsilon^k E(I_A) \quad \text{(because } |X|^k I_A \geq \epsilon^k I_A\text{)}$$
$$= \epsilon^k P(A)$$
Therefore
$$P(A) \leq \frac{E\left(|X|^k\right)}{\epsilon^k}$$

Corollary 1.1 (Chebychev Inequality). If $X$ is a random variable with finite variance, then
$$P\{\omega : |X - E(X)| \geq \epsilon\} \leq \frac{\mathrm{Var}(X)}{\epsilon^2}$$

Proof: Let $Y = X - E(X)$ and apply the Markov Inequality (Theorem 1.5) with $k = 2$.

Remark: Note that if a random variable has finite variance then it has finite mean. In general, if $E\left(|X|^k\right) < \infty$ for some $k \geq 1$, then $E\left(|X|^j\right) < \infty$ for all $j \leq k$. Further, $E(|X|) < \infty$ if and only if $E(X)$ is finite.

Corollary 1.2. If $X \geq 0$ with probability 1, then $P(X \geq \epsilon) \leq \frac{E(X)}{\epsilon}$.

Example 1.5. The above corollary implies that for a nonnegative random variable $X$, $P(X \geq 10\,E(X)) \leq 0.1$, i.e., the probability that a nonnegative random variable is more than 10 times its average value is at most 0.1. Thus expectation itself can be used to obtain a simple bound (usually very weak) on the tail probability of $X$.

1.3 Stochastic Processes

Definition 1.7. A collection of random variables $\{X_t, t \in T\}$ defined on the same probability space $(\Omega, \mathcal{F})$ is called a random process. $T$ is called the parameter set or index set. When $T$ is a finite set, then we just have a random vector. In general, $T$ is infinite.
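As an aside, the tail bounds of Theorem 1.5 and Corollary 1.2 above can be checked numerically. The following Python sketch (not part of the original notes; the sample size and the choice of an exponential distribution with mean 2 are illustrative assumptions) compares the empirical tail probability, the exact tail, and the Markov bound $E(X)/\epsilon$:

```python
import math
import random

random.seed(1)
n = 200_000
mean = 2.0
# Exponential random variable with mean 2 (rate 1/mean); nonnegative, so
# Corollary 1.2 applies: P(X >= eps) <= E(X) / eps.
samples = [random.expovariate(1.0 / mean) for _ in range(n)]

results = []
for eps in (4.0, 10.0, 20.0):
    tail = sum(x >= eps for x in samples) / n   # empirical P(X >= eps)
    exact = math.exp(-eps / mean)               # true tail of the distribution
    bound = mean / eps                          # Markov bound with k = 1
    results.append((eps, tail, exact, bound))
    print(f"eps={eps}: tail~{tail:.4f}, exact={exact:.4f}, bound={bound:.4f}")
```

The bound always holds, but is loose by an order of magnitude or more here, consistent with the remark in Example 1.5 that the bound obtained from the expectation alone is usually very weak.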
- If $T$ is a countable set then $\{X_t\}$ is said to be a discrete parameter process.
- If $T$ is uncountable then $\{X_t\}$ is said to be a continuous parameter process. In this case we may also write the parameter as an argument rather than as a subscript, i.e., $X(t)$.

We note that $T$ may not only have the interpretation of time, discrete or continuous. The parameter set could represent, for example, individuals in a population, or points in one dimensional or two dimensional space.

Remark: It is important to note that, given a stochastic process $\{X_t, t \in T\}$, by definition, for each $t \in T$, $X_t$ is a random variable, but for each given $\omega \in \Omega$, $X_t(\omega), t \in T$, is a real valued function over the parameter set $T$, and is called the sample path of the process corresponding to the sample point $\omega$.

Example 1.6. The following are some examples of stochastic processes (see Figure 1.1).

[Figure 1.1: A sample path of the queue length process $X(t)$ and the work in system process $V(t)$ for the sample point $\omega$ shown at the top. In the depiction of $\omega$, the service requirement of customer $k$ is shown as a vertical line with height $b_k$.]

1. $\{X_i, i \geq 1\}$ is a sequence of independent and identically distributed (i.i.d.) random variables, with common distribution function $F(\cdot)$. By "independent" we mean that the random variables in the collection are mutually independent, i.e., any finite subset of this collection is a set of independent random variables. Thus for any $m \in \{1, 2, 3, \ldots\}$, and $k_1, k_2, \ldots, k_m \in \{1, 2, 3, \ldots\}$,
$$F_{k_1, k_2, \ldots, k_m}(x_1, x_2, \ldots, x_m) = \prod_{j=1}^{m} F(x_j).$$
It easily follows that these finite dimensional distributions are consistent. $\{X_i, i \geq 1\}$ is a discrete parameter stochastic process that can take continuous values.

2. Processes in a Single Station Queue: Customers arrive to a waiting room at random times, and bring random amounts of work (expressed in units of time). The customers wait in line in first-come-first-served (FCFS) order.
A server works at the rate of 1 second per second on the head-of-the-line (HOL) customer. The server is nonidling, which is to say that it does not idle when there is work to be done. It is easy to see that the evolution of such a queueing system is completely specified when we specify the arrival instants of the customers (say, $(t_1, t_2, t_3, \ldots)$), and the amounts of time required to serve each customer (say, $(b_1, b_2, b_3, \ldots)$). Thus we can take a realisation or outcome of the experiment to
In the example of Figure 1.1 W = b and, since b t t , W = 1 1 1 2 1 3 (b (t t )) +b . 2 3 2 3 (a) (e) The number of customers found by each arrival (X ;k2f1; 2; 3;g) : For k (a) each, X () is the number of customers found in the system by thekth k arrival. This is a discrete parameter (the indices of the arriving customers) and (a) (a) discrete valued process. In the example of Figure 1.1X = 0 andX = 1. 1 3 1.3.1 Finite Dimensional Distributions Definition 1.8. For anym2f1; 2; 3;g and a finite setft ;t ; ;t gT 1 2 m F (x ;x ; ;x ) =P (X x ;X x ; ;X x ) t ;t ;;t 1 2 m t 1 t 2 t m 1 2 m 1 2 m is called a finite dimensional distribution of the process X ;t 2 T . The collection of t all such finite dimensional distributions (i.e., for allm2f1; 2; 3;g and all finite sets ft ;t ; ;t g T ) is called the set of finite dimensional distributions of the process 1 2 m fX ;t2Tg. t12 CHAPTER1. AREVIEWOFSOMEBASICS Given a real valued random variableX on ( ;F), its distributionF () can be used to obtain an appropriate probability measure on ( ;F). In a similar way, a stochastic process can be characterised by the collection of its finite dimensional distributions. But clearly an arbitrary collection of such distributions will not do. To see this, note that if (X;Y ) is a random vector then the joint distribution F (x;y) and the individual distributions (X;Y) F (x) and F (y) cannot be an arbitrary collection of distributions, but need to be X Y consistent; i.e., it must hold that F (y) = F (1;y), and F (x) = F (x;1). Y X (X;Y) (X;Y) In the same way, the finite dimensional distributions of a stochastic process must satisfy the following condition. Definition 1.9. A collection of finite dimensional distributions onT is said to be consistent if for all finite setsT T T , m n F (x ;x ; ;x ) =F (1; ;1;x ;1; ;1;x ;1; ;1;x ;1; ;1); T 1 2 m T 1 2 m m n where the dimensions that are not inT are shown as1. 
m In this course we will often define real valued random processes, X ;t 2 T; via t structural properties. For example, one such structural property, which we will encounter in Chapter 2, is the Markov property. Such properties can be used to prove that the process has consistent finite dimensional distributions. The question then arises: “Is there a probability space ( ;F;P ), and a random process,X ;t2T; whose finite dimensional t distributions are the ones derived from the structural properties?” Remarkably, the Kolmogorov Extension Theorem states that if the finite dimensional distributions that T are derived are consistent, then, with = R and an appropriately defined -algebra and probability measure, there exists a random process with the same finite dimensional distributions. We will be concerned only with the situation in which two processes that have the same finite dimensional distributions are essentially the same for all practical purposes. Hence when we define a new kind of stochastic process, one of the first programs that we will have is to determine its finite dimensional distributions. If we are able to do this, it will mean that we have a well defined process. Also, carrying out this program will demonstrate how a definition of a process can end up in completely specifying its finite dimensional distributions, and hence its essential properties. 1.4 Convergence of Random Sequences Definition 1.10. A sequence of random variablesfX ;n  1g is said to converge in n probability to a random variableX if, for all 0, lim P (jX Xj) = 0, and n1 n p we denote this byX X. n
