Linearity of Expectation

Stat 110 Unit 4: Expectation (Chapter 4 in the text)

Unit 4 Outline
• Definition of Expectation
• Linearity and Monotonicity of Expectation
• LOTUS (expectation of a function)
• Variance and Standard Deviation
• Geometric and Negative Binomial distributions
• Indicator r.v.s and the Fundamental Bridge
• Poisson distribution

Summarizing a r.v.'s Distribution
• Suppose a discrete random variable has the following distribution: [PMF plot shown on the original slide]
• How would you summarize this distribution? Center? Spread? Shape?
• The concept of the expectation of a r.v. formalizes these ideas.

Definition: Expectation
• The expected value (or expectation, or mean), E(X), of a discrete random variable X is defined as (text, p. 138-139):

    E(X) = \sum_{\text{all } x} x P(X = x)

• Sometimes the mean of X is written as \mu_X.
• Intuitively, what is E(X) measuring? It is the theoretical weighted average of X, weighted by the probabilities (hence the name, mean).
• It is a measure of the "center" of a distribution, but it says nothing about the spread or shape.
• Physically, it is the balance point of the PMF.

Expected Value Examples
• Concrete example: let X have the following distribution:

    x        | 0   | 1   | 2
    P(X = x) | 0.5 | 0.4 | 0.1

• Intuitively, what should be the mean of X? Calculate it:

    E(X) = \sum_{\text{all } x} x P(X = x) = 0(0.5) + 1(0.4) + 2(0.1) = 0.6

• Let X ~ Bern(p). Intuitively, what should be the mean of X? Calculate it:

    E(X) = \sum_{\text{all } x} x P(X = x) = 0(1 - p) + 1(p) = p

Non-uniqueness of Expectation
• Based on its definition, it can be shown that E(X) depends only on the distribution of X. So two r.v.s with the same distribution (same PMF) will have the same expected value.
• The converse is not true: two r.v.s could have the same expected value but completely different distributions.
• Example: X ~ Bin(n = 2, p = 0.5) and Y ~ DUnif(0, 1, 2) both have mean 1, yet their PMFs differ.

Linearity of Expectation
• For any r.v.s X, Y and any constants a and b, linearity of expectation holds (text, p. 140):

    E(X + Y) = E(X) + E(Y)
    E(aX + b) = a E(X) + b

• What do these results mean?
• The second equation says that we can take constant factors out of an expectation; this is both intuitively reasonable and easily verified from the definition.
• The first equation, E(X + Y) = E(X) + E(Y), also seems reasonable when X and Y are independent. What may be surprising is that it holds even if X and Y are dependent.

Why E(X + Y) = E(X) + E(Y)
• We can equivalently calculate expectations by summing over all outcomes s in the sample space S directly:

    E(X) = \sum_{\text{all } s} X(s) P(s)

• What does X(s) mean? How about P(s)? X(s) is the value X assigns to the outcome s, and P(s) is the probability of that outcome.
• So if X and Y are both functions of the outcomes (key: they have to be if they are dependent), then:

    E(X) + E(Y) = \sum_{\text{all } s} X(s) P(s) + \sum_{\text{all } s} Y(s) P(s) = \sum_{\text{all } s} (X + Y)(s) P(s) = E(X + Y)

• An example is worth a thousand words... (see the simulation sketch below)

Linearity of Expectation is Handy
• Let X ~ Bin(n, p). Intuitively, what should be the mean of X? Calculate it.
• Two ways to do this:
  1) Brute force: evaluate E(X) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1 - p)^{n - k} directly.
  2) Applying linearity (much easier): write X = I_1 + ... + I_n as a sum of n Bern(p) indicators, so E(X) = n E(I_1) = np.
• Let X ~ HGeom(w, b, n). Intuitively, what should be the mean of X? Calculate it. Hint: define X as a sum of dependent Bernoulli r.v.s; each draw is white with probability w/(w + b), so linearity gives E(X) = nw/(w + b) even though the indicators are dependent.
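To see linearity survive dependence concretely, here is a minimal simulation sketch (Python, standard library only; the function name sample_hypergeometric and the choice w = 6, b = 4, n = 5 are illustrative, not from the slides). Each draw without replacement is a dependent Bernoulli indicator, yet the sample mean of X = I_1 + ... + I_n settles near nw/(w + b) = 3.

```python
import random

def sample_hypergeometric(w, b, n):
    """One HGeom(w, b, n) draw: number of white balls among n drawn
    without replacement from w white and b black balls."""
    urn = [1] * w + [0] * b           # 1 = white, 0 = black
    draws = random.sample(urn, n)     # without replacement -> dependent indicators
    return sum(draws)                 # X = I_1 + ... + I_n

w, b, n = 6, 4, 5
trials = 100_000
mean_est = sum(sample_hypergeometric(w, b, n) for _ in range(trials)) / trials

# Linearity predicts E(X) = n * w / (w + b) = 3.0 despite the dependence.
print(f"simulated mean: {mean_est:.3f}, linearity prediction: {n * w / (w + b):.3f}")
```

Note that the joint distribution of the indicators never has to be worked out; linearity does all the work.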
Monotonicity of Expectation
• Let X and Y be r.v.s such that X ≥ Y with probability 1. Then E(X) ≥ E(Y), with equality holding if and only if X = Y with probability 1.
• Proof (for discrete r.v.s only, although the result holds for all r.v.s): the r.v. Z = X − Y is nonnegative with probability 1, so E(Z) ≥ 0, since E(Z) is defined as a sum of nonnegative terms. By linearity, E(X) − E(Y) = E(X − Y) ≥ 0, as desired.
• If E(X) = E(Y), then by linearity we also have E(Z) = 0, which implies P(X = Y) = P(Z = 0) = 1, since if even one term in the sum defining E(Z) were positive, the whole sum would be positive.
• Monotonicity is not as useful as linearity.

Law of the Unconscious Statistician (LOTUS)
• If X is a discrete r.v. and g is a function from ℝ to ℝ, then (text, p. 156):

    E[g(X)] = \sum_{\text{all } x} g(x) P(X = x)

• What does this mean? We can get the expected value of g(X) knowing only P(X = x), the PMF of X; we don't need to know the PMF of g(X).
• The name comes from the fact that in going from E(X) to E[g(X)] it is tempting just to change x to g(x) in the definition, which can be done very easily and mechanically, perhaps in a state of unconsciousness.
• Be careful: E[g(X)] does not necessarily equal g(E(X)).

Example to Illustrate LOTUS
• Concrete example: let X have the following distribution:

    x        | -2  | -1  | 0   | 1   | 2
    P(X = x) | 0.1 | 0.2 | 0.3 | 0.2 | 0.2

• Let Y = X^2. Find E(Y) two ways: based on the distribution of Y, and using LOTUS.
• Why does it work out to be the same answer? (A numeric check appears in the first sketch at the end of this section.)

Variance and Standard Deviation
• The variance of a r.v. X is defined as (text, p. 158):

    Var(X) = E[(X - \mu)^2], where \mu = E(X).

  Sometimes the variance is written as \sigma^2_X.
• The square root of the variance is called the standard deviation:

    SD(X) = \sqrt{Var(X)}

• What is variance measuring? What are its units? What is standard deviation measuring? What are its units? Which is more interpretable?

Variance: an Equivalent Formula
• For any r.v. X:

    Var(X) = E(X^2) - \mu^2, where \mu = E(X).

• Proof (expand (X - \mu)^2 and use linearity):

    Var(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2]
           = E(X^2) - 2\mu E(X) + \mu^2
           = E(X^2) - 2\mu^2 + \mu^2
           = E(X^2) - \mu^2

• This result is often useful when calculating the variance of a r.v.

Variance Examples
• Concrete example: let X have the following distribution:

    x        | 0   | 1   | 2
    P(X = x) | 0.5 | 0.4 | 0.1

• What is the variance of X? What is its standard deviation?

    Var(X) = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 P(X = x)
           = (0 - 0.6)^2 (0.5) + (1 - 0.6)^2 (0.4) + (2 - 0.6)^2 (0.1) = 0.44

    SD(X) = \sqrt{Var(X)} = \sqrt{0.44} ≈ 0.663

• Let X ~ Bern(p). Calculate Var(X):

    E(X^2) = \sum_{\text{all } x} x^2 P(X = x) = 0^2 (1 - p) + 1^2 (p) = p

    Var(X) = E(X^2) - [E(X)]^2 = p - p^2 = p(1 - p)

Properties of Variance
• Var(X + c) = Var(X)
• Var(cX) = c^2 Var(X)
• Var(X) ≥ 0
• If X and Y are independent, then Var(X + Y) = Var(X) + Var(Y)
• Why do these properties make sense, intuitively?
• The 4th property we will prove later (we need to define covariance first). But intuitively, what should Var(X + X) be? (See the second sketch below.)
• When is Var(X) = 0?
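As a numeric check on LOTUS and the variance formulas above, here is a small sketch (Python; the helper name expect and the dictionaries pmf_x and pmf3 are my own, not course code). It computes E(Y) for Y = X^2 both from the PMF of Y and via LOTUS, then verifies Var(X) = E[(X − μ)^2] = E(X^2) − μ^2 = 0.44 for the three-point distribution.

```python
from collections import defaultdict

def expect(pmf, g=lambda x: x):
    """LOTUS: E[g(X)] = sum of g(x) * P(X = x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())

# PMF from the LOTUS example slide.
pmf_x = {-2: 0.1, -1: 0.2, 0: 0.3, 1: 0.2, 2: 0.2}

# Way 1: construct the PMF of Y = X^2 explicitly, then average over it.
pmf_y = defaultdict(float)
for x, p in pmf_x.items():
    pmf_y[x ** 2] += p                # P(Y = y) pools every x with x^2 = y
e_y_from_pmf = expect(pmf_y)

# Way 2: LOTUS -- the PMF of Y is never built.
e_y_lotus = expect(pmf_x, g=lambda x: x ** 2)
print(e_y_from_pmf, e_y_lotus)        # both ~1.6

# Variance checks on the three-point distribution from the variance slides.
pmf3 = {0: 0.5, 1: 0.4, 2: 0.1}
mu = expect(pmf3)                                         # ~0.6
var_def = expect(pmf3, g=lambda x: (x - mu) ** 2)         # E[(X - mu)^2]
var_short = expect(pmf3, g=lambda x: x ** 2) - mu ** 2    # E(X^2) - mu^2
print(mu, var_def, var_short)         # ~0.6 ~0.44 ~0.44; SD = sqrt(0.44) ~ 0.663
```

Both routes to E(Y) return 1.6, which is why the LOTUS example "works out to be the same answer": grouping the terms of the LOTUS sum by the value of x^2 is exactly how the PMF of Y is built.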
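Finally, a quick simulation sketch of the variance properties (again Python standard library only; Bern(0.3), the shift 5, the scale 3, and the sample size are arbitrary choices of mine). It estimates Var(X + c), Var(cX), Var(X + Y) for independent X and Y, and Var(X + X).

```python
import random

def var(samples):
    """Plug-in variance estimate: mean of squared deviations from the mean."""
    m = sum(samples) / len(samples)
    return sum((s - m) ** 2 for s in samples) / len(samples)

random.seed(0)
n, p = 200_000, 0.3
x = [1 if random.random() < p else 0 for _ in range(n)]   # X ~ Bern(0.3)
y = [1 if random.random() < p else 0 for _ in range(n)]   # Y independent of X

print(var(x))                                   # ~ p(1 - p) = 0.21
print(var([xi + 5 for xi in x]))                # Var(X + c) = Var(X):   ~ 0.21
print(var([3 * xi for xi in x]))                # Var(cX) = c^2 Var(X):  ~ 1.89
print(var([xi + yi for xi, yi in zip(x, y)]))   # independent sum:       ~ 0.42
print(var([xi + xi for xi in x]))               # Var(2X) = 4 Var(X):    ~ 0.84
```

The last line addresses the slide's question: X + X = 2X, so Var(X + X) = 4 Var(X) ≈ 0.84, twice what the independence formula would give; the fourth property genuinely requires independence.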