Conditional Probability

Stat S110 Unit 2: Conditional Probability
Chapter 2 in the text

Unit 2 Outline
• Conditional Probability Definition
• Bayes’ Rule
• Independence
• The Gambler’s Ruin
• Conditionally Paradoxical

Why is conditional probability important?
• Conditional probability allows us to update a probability statement given other, related information.
• Suppose you have plans to attend the Red Sox game (or a picnic, an outdoor concert, etc.) tonight, and you are trying to determine whether it will rain later in the day.
• Before any information is gathered, you may put a probability on the event that it will rain: $P(R)$.
• However, you may look out the window and see that it is very cloudy. You would certainly want to update the probability of rain given this information: $P(R \mid C)$.
• You may gather more information throughout the day, so your conditional probability may change: $P(R \mid A \cap B \cap C)$.
• By conditioning on known information, we can improve our probability predictions.

Definition: Conditional Probability
• conditional probability: the probability of one event occurring under the condition that we know the outcome of another event
• Let A and B be two events in a sample space, with $P(B) > 0$. The conditional probability of event A, given that B has occurred, written $P(A \mid B)$, is:
\[ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \]
• $P(A \mid B)$ is read as “the probability of A, given that B has happened,” or “the probability of A if B is true.”
• $P(A)$, which ignores B, is often called the prior probability, and $P(A \mid B)$ is often called the posterior probability, since it incorporates the knowledge/information of event B.
• We’ve essentially been using conditional probability already (sampling without replacement).

Conditional probability in Pebble World
• We are trying to determine $P(A \mid B)$.
• On the left is the entire sample space. But we do not need to incorporate anything in $B^c$ to determine $P(A \mid B)$.
• So in the end, we just restrict ourselves to considering only the outcomes in B and ask: how much of B does A take up?

Key Concept
• Conditional probability is probability.
• So all of the rules/axioms, results, and properties hold.
• For example:
\[ P(S \mid E) = 1, \qquad P(\emptyset \mid E) = 0 \]
\[ P(A^c \mid E) = 1 - P(A \mid E) \]
\[ P(A \cup B \mid E) = P(A \mid E) + P(B \mid E) - P(A \cap B \mid E) \]
\[ P(A \mid B \cap E) = \frac{P(A \text{ and } B \mid E)}{P(B \mid E)} \]
• And many more…

Simplest Example
• There is a bag with 3 balls in it: 1 is red, and 2 are black.
• You draw two balls out of the bag, one at a time (without replacement). What is the probability the second ball is black given the first ball is black?
• Define the events:
  A: the first ball drawn is black
  B: the second ball drawn is black
• Of the 6 equally likely ordered draws (first ball, second ball), 2 have both balls black and 4 have a black first ball:
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(\text{both balls are black})}{P(\text{first ball is black})} = \frac{2/6}{4/6} = \frac{1}{2} \]
• What is the probability the 1st ball was black given the 2nd is black?

Very tricky…
• The Monty Hall Problem
  - There are prizes behind 3 doors: two are ‘worthless’ (an ant farm) and one is expensive (like a new car).
  - You are asked to choose one of the 3 doors.
  - Then, Monty Hall (from Let’s Make a Deal) opens one of the other 2 doors and shows you a worthless prize.
• Should you switch doors?
• NYTimes take: http://www.nytimes.com/2008/04/08/science/08monty.html

A general multiplication rule (from conditional probability)
• Suppose A and B are two events in a sample space (not necessarily independent). Then
• $P(A \cap B) = P(B \mid A)P(A) = P(A)P(B \mid A)$
• $P(A \cap B \cap C) = P(A)P(B \mid A)P(C \mid A \cap B)$
• In fact, for general $A_1, \dots, A_n$:
\[ P(A_1 \cap \dots \cap A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap \dots \cap A_{n-1}) \]
• The first relationship is a simple algebraic rearrangement of the definition of conditional probability:
\[ P(B \mid A) = \frac{P(B \text{ and } A)}{P(A)} \]

A Personable Example
It is known that approximately 20% of men and 3% of women in the US are taller than 6 feet. Let F = the event that someone is female and T = the event that someone is taller than 6 feet.
a) What is $P(T \mid F)$? What is $P(T \mid F^c)$?
b) What is the probability that the next person walking through the door is a woman and taller than 6 feet?
c) What is the probability that the next person walking through the door is taller than 6 feet?

Example (cont.)
c) What is the probability that the next person walking through the door is taller than 6 feet?
Two ways for this to happen: (T and F) or (T and $F^c$). Think Venn diagrams. Taking $P(F) = P(F^c) = 0.50$:
\[
\begin{aligned}
P(T) &= P(T \cap F) + P(T \cap F^c) \\
     &= P(T \mid F)P(F) + P(T \mid F^c)P(F^c) \\
     &= (0.03)(0.50) + (0.20)(0.50) = 0.115
\end{aligned}
\]

Two-way tables can help organize your thinking…

                Tall (6' or more): Yes                                    Tall (6' or more): No
Female: Yes     $P(F \cap T) = P(F)P(T \mid F) = (0.5)(0.03) = 0.015$     $P(F)P(T^c \mid F) = (0.5)(0.97) = 0.485$
Female: No      $P(F^c)P(T \mid F^c) = (0.5)(0.20) = 0.100$               $P(F^c)P(T^c \mid F^c) = (0.5)(0.80) = 0.400$

Law of Total Probability
• Let $A_1, \dots, A_n$ be a partition of the sample space S (that is, the $A_i$ are disjoint and their union makes up the entire S) with $P(A_i) > 0$ for all $A_i$. Then:
\[ P(B) = \sum_{i=1}^{n} P(B \cap A_i) = \sum_{i=1}^{n} P(B \mid A_i)P(A_i) \]
• Illustration of law of total probability:

Bayes’ Rule
• Bayes’ rule (formula) provides a way to go from $P(B \mid A)$ to $P(A \mid B)$ (they are in general not equal…).
• If A and B are two events whose probabilities are not 0 or 1, then:
\[ P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)} \]
• It is often written using the law of total probability:
\[ P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A^c)P(A^c)} \]
• Or:
\[ P(A_1 \mid B) = \frac{P(B \mid A_1)P(A_1)}{\sum_{i=1}^{n} P(B \mid A_i)P(A_i)} \]

d) What is the probability that a person known to be taller than 6 feet is a woman?
Bayes’ rule directly:
\[ P(F \mid T) = \frac{P(T \mid F)P(F)}{P(T \mid F)P(F) + P(T \mid F^c)P(F^c)} = \frac{0.03(0.5)}{0.03(0.5) + 0.2(0.5)} = 0.130 \]
Pretty simple from the 2x2 table…
\[ P(F \mid T) = \frac{P(F \text{ and } T)}{P(T)} = \frac{0.015}{0.015 + 0.100} = 0.130 \]

Bayes’ Rule Example: Random Coin
• You have one fair coin, and one biased coin which lands Heads with probability 3/4. You pick one of the coins at random and flip it three times.
It lands Heads all three times. Given this information, what is the probability that the coin you picked is the fair one? With F = the chosen coin is fair and A = all three flips land Heads:
\[ P(F \mid A) = \frac{P(A \mid F)P(F)}{P(A \mid F)P(F) + P(A \mid F^c)P(F^c)} = \frac{(1/2)^3(1/2)}{(1/2)^3(1/2) + (3/4)^3(1/2)} \approx 0.23 \]
• So what?

Random Coin Continued
• Suppose that we have now seen our chosen coin land Heads three times. If we toss the coin a fourth time, what is the probability that it will land Heads once more? With H = the fourth toss lands Heads:
\[
\begin{aligned}
P(H \mid A) &= P(H \mid F, A)P(F \mid A) + P(H \mid F^c, A)P(F^c \mid A) \\
            &= (1/2)(0.23) + (3/4)(0.77) \approx 0.69
\end{aligned}
\]

Independent events
• Two events A and B are independent if and only if knowing that one event occurs does not change the probability that the other event occurs. What does that mean for the conditional probabilities?
\[ P(A \mid B) = P(A) \]
• The following are equivalent definitions of independence:
\[ P(A \cap B) = P(A)P(B) \]
\[ P(A \mid B) = P(A \mid B^c) \]
\[ P(B \mid A) = P(B) \]
Note: it takes care to draw independence in a Venn diagram…

An Example
• There is a bag with 3 balls in it: 1 is red, and 2 are black.
• You draw two balls out of the bag, one at a time (without replacement). Define the events:
  A: the first ball drawn is black
  B: the second ball drawn is black
• Are A and B independent? How do you know?

A practice problem
Ultrasound is often used to determine the sex of an unborn baby. However, because the procedure relies on visual detection of anatomic differences between male and female babies, the error rates differ according to whether the baby is a boy or a girl.
  - P(ultrasound predicts male | baby is male) = 75%
  - P(ultrasound predicts female | baby is female) = 90%
(a) Consider an individual woman who comes to the clinic for an ultrasound to predict her baby’s sex. What is the probability that the ultrasound gives the wrong result?
(b) Suppose a clinic performs 10 ultrasounds a day, and the ultrasound results are independent.
What is the probability of one or more incorrect sex determinations?
(c) What is the probability that a baby is male, given that the ultrasound predicts a male?

Solution
Let B = baby is a boy, $B^c$ = baby is a girl,
A = ultrasound predicts boy, $A^c$ = ultrasound predicts girl.
The calculations take $P(B) = P(B^c) = 0.50$.
(a)
\[
\begin{aligned}
P(\text{wrong result}) &= P(A \text{ and } B^c) + P(A^c \text{ and } B) \\
                       &= P(A \mid B^c)P(B^c) + P(A^c \mid B)P(B) \\
                       &= 0.10(0.50) + 0.25(0.50) = 0.175
\end{aligned}
\]
(b)
\[
\begin{aligned}
P(\text{at least 1 of 10 predictions is wrong}) &= 1 - P(\text{0 wrong predictions}) = 1 - P(\text{10 correct predictions}) \\
                                               &= 1 - (0.825)^{10} = 1 - 0.146 = 0.854
\end{aligned}
\]
(c)
\[
\begin{aligned}
P(\text{boy} \mid \text{ultrasound predicts boy}) = P(B \mid A) &= \frac{P(B \text{ and } A)}{P(A)} = \frac{P(B \text{ and } A)}{P(B \text{ and } A) + P(B^c \text{ and } A)} \\
&= \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B^c)P(B^c)} \\
&= \frac{0.75(0.50)}{0.75(0.50) + 0.10(0.50)} = \frac{0.375}{0.425} = 0.882
\end{aligned}
\]

Independence of 3 events
• Three events A, B, and C are independent if and only if all of the following hold:
\[ P(A \cap B) = P(A)P(B) \]
\[ P(A \cap C) = P(A)P(C) \]
\[ P(B \cap C) = P(B)P(C) \]
\[ P(A \cap B \cap C) = P(A)P(B)P(C) \]
• If the first 3 conditions hold, then we say A, B, and C are pairwise independent, but this does not mean that the 4th condition holds. Example (two tosses of a fair coin): A = first toss is heads, B = second toss is heads, C = both tosses have the same result.
• Independence of 4+ events gets unwieldy very quickly…

Conditional Independence
• Two events A and B are conditionally independent given E if and only if:
\[ P(A \cap B \mid E) = P(A \mid E)P(B \mid E) \]
• Two events can be conditionally independent given E, but not independent themselves. And vice versa. It’s easy to make either mistake.
• For example (conditional independence given E vs. given $E^c$): Suppose there are 2 types of classes: good classes (G) and bad classes. In a good class, if you work hard (W) you are likely to get an A (A). In bad classes the professor randomly assigns grades to students. Then W and A are conditionally independent given $G^c$, but not given G.
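
The two-toss example from the “Independence of 3 events” slide can be checked by brute-force enumeration. The sketch below is only an illustration, not part of the original slides: it assumes the coin is fair, so the four outcomes are equally likely, and the helper names `prob` and `both` are made up for this example.

```python
from fractions import Fraction
from itertools import product

# Assumption: two tosses of a FAIR coin, so all 4 outcomes are equally likely.
OUTCOMES = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event, given as a predicate on a single outcome."""
    favorable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favorable, len(OUTCOMES))


def both(*events):
    """Intersection of events: true only when every event occurs."""
    return lambda outcome: all(e(outcome) for e in events)


def A(outcome):
    return outcome[0] == "H"          # first toss is heads


def B(outcome):
    return outcome[1] == "H"          # second toss is heads


def C(outcome):
    return outcome[0] == outcome[1]   # both tosses have the same result


# First three conditions: every pair satisfies P(X and Y) = P(X)P(Y).
pairwise = all(
    prob(both(x, y)) == prob(x) * prob(y)
    for x, y in [(A, B), (A, C), (B, C)]
)

# Fourth condition: P(A and B and C) = P(A)P(B)P(C).
mutual = prob(both(A, B, C)) == prob(A) * prob(B) * prob(C)

print("pairwise independent:", pairwise)
print("fourth condition holds:", mutual)
print("P(A and B and C) =", prob(both(A, B, C)))
print("P(A)P(B)P(C)     =", prob(A) * prob(B) * prob(C))
```

Exact fractions are used instead of floats so that the equality checks are not affected by rounding. Under the fairness assumption, the three pairwise conditions hold while the fourth does not, because A and B together force C to occur.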

The Gambler’s Ruin
• Suppose there are two gamblers, A and B (B could be the house), and they make a sequence of 1-dollar bets. For each bet, gambler A has probability p of winning, and gambler B has probability $q = 1 - p$ of winning. Gambler A starts with i dollars and gambler B starts with $N - i$ dollars.
• The amount of money that gambler A has can be viewed as a random walk on the integers 0 through N, with probability p of going to the right in a given step and probability $q = 1 - p$ of going to the left. Visually:

The Gambler’s Ruin
• What is the probability that A walks away with all the money?
• Key: define W = event that A wins all the money and $p_i$ = probability that A wins all the money when starting with i dollars; then condition on the first step, use LOTP, and reset your starting point.
• Thus:
\[
\begin{aligned}
p_i &= P(W \mid A \text{ starts at } i, \text{ wins 1st round})\,p + P(W \mid A \text{ starts at } i, \text{ loses 1st round})\,q \\
    &= P(W \mid A \text{ starts at } i+1)\,p + P(W \mid A \text{ starts at } i-1)\,q \\
    &= p_{i+1}\,p + p_{i-1}\,q
\end{aligned}
\]

The Gambler’s Ruin
\[ p_i = p_{i+1}\,p + p_{i-1}\,q \]
• This is a difference equation, which can be solved based on the fact that $p_i$ will have the form $x^i$ (see Math Review for more details). This can then be reduced to the polynomial:
• $px^2 - x + q = 0$, which has two roots: 1 and $q/p$. So unless $q = p = 1/2$, the general solution is:
\[ p_i = a \cdot 1^i + b\left(\frac{q}{p}\right)^i \]
• Since we know that $p_0 = 0$ and $p_N = 1$, we can solve the above to find the values of a and b. Specifically:
\[ a = -b = \frac{1}{1 - \left(\frac{q}{p}\right)^N} \]

The Gambler’s Ruin
• What about when $p = 1/2$? There is only one (repeated) root, $x = 1$, and the solution is of the form:
\[ p_i = a \cdot 1^i + b \cdot i \cdot 1^i \]
• And again solving for the boundary conditions, we find that $a = 0$ and $b = 1/N$.
• So the probability that A wins when starting with i dollars is:
\[
p_i =
\begin{cases}
\dfrac{1 - (q/p)^i}{1 - (q/p)^N} & \text{if } p \neq 1/2 \\[2ex]
i/N & \text{if } p = 1/2
\end{cases}
\]
• How do these two conditions make sense? In what way are they consistent with each other?

The Gambler’s Ruin: Take-Home
• So what’s the take-home message?
• Well, first let’s explore the probability of B winning all the money. By symmetry, we just need to change the starting wealth to $N - i$ and switch the roles of q and p.
• So for all values of both i and p, it can be shown that P(A wins) + P(B wins) = 1.
• What does this mean?
• This means the game is guaranteed to end (with probability 1)! The game can never be stuck in never-ending oscillation.

Prosecutor’s Fallacy (http://en.wikipedia.org/wiki/SallyClark)
• In 1998 Sally Clark was arrested and tried for murder after her two sons died shortly after birth (possibly due to sudden infant death syndrome, or SIDS). During the trial an expert witness testified that 1/8500 newborns die of SIDS, so the probability of two babies in the same family dying in a row of SIDS was $(1/8500)^2 \approx$ 1 in 73 million. The expert continued and said the probability of Clark’s innocence was 1 in 73 million.
• What are two major problems with this line of reasoning?
  1) Independence of children dying of SIDS
  2) Incorrectly interpreting conditional probability (not using Bayes’ rule to flip the conditioning).

Simpson’s Paradox
• Two doctors, Dr. Hibbert and Dr.
Nick, each perform two types of surgeries: heart surgery and band-aid removal, with the following results:
• What is the overall probability of success for each doctor?
• What is the probability of success of each surgery type separately, for each doctor?
• Which doctor would you see?

Simpson’s Paradox
• For events A, B, and C, we have a Simpson’s paradox if:
\[ P(A \mid B, C) < P(A \mid B^c, C) \]
\[ P(A \mid B, C^c) < P(A \mid B^c, C^c) \]
but,
\[ P(A \mid B) > P(A \mid B^c) \]
• What are events A, B, and C in the Dr. Hibbert v. Dr. Nick example?

UC Berkeley 1973 Sex Bias Case
http://en.wikipedia.org/wiki/Simpson27sparadox
• UC Berkeley was sued in 1973 for sex bias in admissions to graduate schools.
• Women appeared to have lower acceptance rates than men.
• Key idea: it is important to condition on the correct events/variables. UC Berkeley was admitting students based on School/Major, so that should be taken into account.

Last Word: The other Simpsons’ Paradox
HOMER: Hey, I got a question for you. (pulls out a piece of paper) "Could Jesus microwave a burrito so hot that he himself could not eat it?"
NED: Well sir, of course, he could, but then again... wow, as melon scratchers go that's a honeydoodle.