Hypothesis testing examples in real life

what is hypothesis testing and why is it important and hypothesis testing binomial distribution
HalfoedGibbs,United Kingdom,Professional
Published Date:02-08-2017
Chapter 5: Hypothesis testing

Say it ain’t so

“That marlin I caught was 10,000 pounds, and we had to let it go before it sank the boat… what? Well, I’d like to see you prove me wrong!”

The world can be tricky to explain. And it can be fiendishly difficult when you have to deal with complex, heterogeneous data to anticipate future events. This is why analysts don’t just take the obvious explanations and assume them to be true: the careful reasoning of data analysis lets you evaluate a range of options meticulously, so that you can incorporate all the information you have into your models. You’re about to learn about falsification, an unintuitive but powerful way to do just that.

Gimme some skin…

Your new client is ElectroSkinny, a maker of phone skins. Your assignment is to figure out whether PodPhone is going to release a new phone next month. PodPhone is a huge product, and there’s a lot at stake.

ElectroSkinny hipster: “With my active lifestyle, I need a great skin for my PodPhone. That’s why I’m all about ElectroSkinny!”

PodPhone will release a phone at some point in the future, and ElectroSkinny needs to start manufacturing skins a month before the phone is released in order to get in on the first wave of phone sales. If they don’t have skins ready for a release, their competitors will beat them to the punch and sell a lot of skins before ElectroSkinny can put its own on the market. But if they manufacture skins and PodPhone isn’t released, they’ll have wasted money on skins with no known date on which they can be sold.

When do we start making new phone skins?

The decision of when to start manufacturing a new line of skins is a big deal. Does ElectroSkinny manufacture the skin? If PodPhone releases, we want our skins manufactured. If there’s a delay, but ElectroSkinny hasn’t started manufacturing, they’re in great shape.
Manufacture skins?    New PodPhone out            New PodPhone delayed
Yes                   Skins ready for the release Situation to avoid
No                    Situation to avoid          In great shape

PodPhone releases are always a surprise, so ElectroSkinny has to figure out when they’re about to happen. If they can start manufacturing a month before a PodPhone release, they’re in great shape. Can you help them? (Your client here is the ElectroSkinny CEO.)

What sort of data or information would help you get started on this analytical problem?

What do you need to know in order to get started?

PodPhone wants their releases to be a surprise, so they’ll probably take measures to avoid letting people figure out when those releases happen. We’ll need some sort of insight into how they think about their releases, and we’ll need to know what kind of information they use in their decision.

PodPhone doesn’t want you to predict their next move

PodPhone takes surprise seriously: they really don’t want you to know what they’re up to. So you can’t just look at publicly available data and expect an answer of when they’re releasing the PodPhone to pop out at you.

Stuff that everyone knows:
Blogs
Patents
Phone specs for accessory manufacturers
Consumer news
PodPhone government filings
Public economic data
Competitor product lines
PodPhone press releases

PodPhone knows you’ll see all this information, so they won’t want any of it to let on their release date. These data points really aren’t going to be of much help… unless you’ve got a really smart way to think about them.

You need to figure out how to compare the data you do have with your hypotheses about when PodPhone will release their new phone. But first, let’s take a look at the key pieces of information we do have about PodPhone…

Here’s everything we know

Here’s what little information ElectroSkinny has been able to piece together about the release.
Some of it is publicly available, some of it is secret, and some of it is rumor.

PodPhone has invested more in the new phone than any other company ever has.
There is going to be a huge increase in features compared to competitor phones.
The CEO of PodPhone said “No way we’re launching the new phone tomorrow.”
The economy and consumer spending are both up, so it’s a good time to sell phones.
There is a rumor that the PodPhone CEO said there’d be no release for a year.
There was just a big new phone released from a competitor.

CEO of ElectroSkinny: “Internally, we don’t expect a release, because their product line is really strong. They’ll want to ride out their success with this line as long as possible. I’m thinking we should start several months from now…”

Do you think her hypothesis makes sense in light of the above evidence we have to consider?

ElectroSkinny’s analysis does fit the data

The CEO has a pretty straightforward account of step-by-step thinking on the part of PodPhone. Here’s what she said in a schematic form:

1. PodPhone’s current product line is strong.
2. PodPhone will want to ride out their success for now.
3. The new product release will be delayed.

This model of the world fits your evidence. The model, or hypothesis, fits because there is nothing in the evidence that proves it wrong. Of course, there is nothing in the evidence that strongly supports the model either. (The evidence is the same six items listed above.)
Seems like pretty solid reasoning…

The three steps are what the ElectroSkinny CEO thinks is going to be PodPhone’s thinking, and nothing in the evidence disagrees with her hypothesis.

ElectroSkinny obtained this confidential strategy memo

ElectroSkinny watches PodPhone really closely, and sometimes stuff like this just falls in your lap. This strategy memo outlines a number of the factors that PodPhone considers when it’s calculating its release dates. It’s quite a bit more subtle than the reasoning the ElectroSkinny CEO imagined they are using. Can this memo help you figure out when a new PodPhone will be released?

Think carefully about how PodPhone thinks the variables mentioned in the memo relate. Do the pairs below rise and fall together, or do they go in opposite directions? Put a “+” in each circle if the two variables rise and fall together. Write a “–” sign if the variables move in opposite directions. The first one is already filled in.

Economy ( + ) Consumer Spending
Competitor Product Releases ( ) PodPhone Product Releases
Economy ( ) PodPhone Sales
PodPhone Product Releases ( ) PodPhone Sales
PodPhone Sales ( ) Competitor Sales
Supplier Output ( ) PodPhone Sales

PodPhone phone release strategy memo

We want to time our releases to maximize sales and to beat out our competitors. We have to take into account a variety of factors to do it.

First, we watch the economy, because an increase in overall economic performance drives up consumer spending, while economic decline depresses consumer spending. And consumer spending is where all phone sales come from. But we and our competitors are after the same pot of consumer spending. Every phone we sell is one they don’t sell, and vice versa.

We don’t usually want to release a phone when they have a new phone on the market. We take a bigger bite out of competitor sales if we release when they have a stale product portfolio.
Our suppliers and internal development team place limits on our ability to drop new phones, too.

CONFIDENTIAL

Linked variables

In the mind of PodPhone, how are the pairs of variables below linked to each other quantitatively?

Economy (+) Consumer Spending: the economy goes up, and so does consumer spending.
Competitor Product Releases (–) PodPhone Product Releases: if a competitor has a recent product release, PodPhone avoids releasing.
Economy (+) PodPhone Sales
PodPhone Product Releases (+) PodPhone Sales
PodPhone Sales (–) Competitor Sales: every phone PodPhone sells is a phone that their competitor doesn’t sell, and vice versa.
Supplier Output (+) PodPhone Sales

Variables can be negatively or positively linked

When you are looking at data variables, it’s a good idea to ask whether they are positively linked, where more of one means more of the other (and vice versa), or negatively linked, where more of one means less of the other.

Here are a few of the other relationships that can be read from PodPhone’s strategy memo. These are all positively linked.

Internal development activity (+) PodPhone Product Releases
Competitor Product Releases (+) Competitor Sales
Competitor Product Releases (+) PodPhone Product Releases

How can you use these relationships to develop a bigger model of their beliefs, one that might predict when they’re going to release their new phone?

Let’s tie those positive and negative links between variables into an integrated model. Using the relationships listed above, draw a network that incorporates all of them.

[Figure: starter network in which Economy, Consumer Spending, and PodPhone Sales are joined by two “+” links. These two relationships are already done.]

PodPhone seems to be watching the interaction of a lot of variables. There are a bunch of things going on here.

Networked causality

How does your model of PodPhone’s worldview look once you’ve put it in the form of a network?
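One way to approach that question is to write the links down as a small signed graph. The sketch below is only an illustration, not the chapter’s own method: the arrow directions, the `LINKS` table, and the helper names `path_sign` and `downstream` are assumptions introduced here. It uses the six pairs from the memo exercise plus two of the additional relationships, and leaves out the third additional pair because the same two variables already appear in the exercise.

```python
from collections import deque

# Signed links: +1 means the two variables rise and fall together,
# -1 means they move in opposite directions.
# The direction of each arrow is an assumption made for this sketch.
LINKS = {
    ("Economy", "Consumer Spending"): +1,
    ("Economy", "PodPhone Sales"): +1,
    ("Competitor Product Releases", "PodPhone Product Releases"): -1,
    ("PodPhone Product Releases", "PodPhone Sales"): +1,
    ("PodPhone Sales", "Competitor Sales"): -1,
    ("Supplier Output", "PodPhone Sales"): +1,
    ("Internal Development Activity", "PodPhone Product Releases"): +1,
    ("Competitor Product Releases", "Competitor Sales"): +1,
}


def path_sign(path):
    """Multiply the signs along a chain of linked variables."""
    sign = 1
    for pair in zip(path, path[1:]):
        sign *= LINKS[pair]
    return sign


def downstream(start):
    """Return every variable reachable from `start` by following links."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for src, dst in LINKS:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


chain = [
    "Competitor Product Releases",
    "PodPhone Product Releases",
    "PodPhone Sales",
]
print(path_sign(chain))
print(sorted(downstream("Economy")))
```

Multiplying signs along a chain is a simple way to see how a change in one variable propagates: a negative link followed by a positive one gives a negative overall effect. It ignores the strength of each link, which the memo does not quantify.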
[Figure: PodPhone’s mental model, drawn as a network that joins Economy, Consumer Spending, Competitor Sales, Competitor Product Releases, PodPhone Sales, PodPhone Product Releases, Supplier Output, and Internal development activity with the “+” and “–” links listed earlier.]

One of these can’t really change without affecting all the other variables.

Causes in the real world are networked, not linear

Linearity is intuitive. A linear explanation of the causes for why PodPhone might decide to delay their release is simple and straightforward.

1. PodPhone’s current product line is strong.
2. PodPhone will want to ride out their success for now.
3. The new product release will be delayed.

This is way too simple. A careful look at PodPhone’s strategy memo suggests that their actual thinking, whatever the details are, is much more complex and sophisticated than a simple linear, step-by-step diagram would suggest. PodPhone realizes that they are making decisions in the context of an active, volatile, interlinked system.

As an analyst, you need to see beyond simple models like this and expect to see causal networks. In the real world, causes propagate across a network of related variables… why should your models be any different?

So how do we use that to figure out when PodPhone is going to release their new phone? What about the data?

Hypothesize PodPhone’s options

Here are a few estimates of when the new PodPhone might be released. The hypothesis that we consider strongest will determine ElectroSkinny’s manufacturing schedule. You’ll somehow combine your hypotheses with the evidence and PodPhone’s mental model to get your answer.

Sooner or later, PodPhone is going to release a new phone. The question is when. And different answers to that question are your hypotheses for this analysis.
Below are a few options that specify when a release might occur, and picking the right hypothesis is what ElectroSkinny needs you to do.

Your evidence: the six pieces of information listed earlier under “Here’s everything we know.”

Your hypotheses:
H1: Release will be tomorrow
H2: Release will be next month
H3: Release will be in six months
H4: Release will be in a year
H5: No release, product canceled

You have what you need to run a hypothesis test

Between your understanding of PodPhone’s mental model and the evidence, you have amassed quite a bit of knowledge about the issue that ElectroSkinny cares about most: when PodPhone is going to release their product. You just need a method to put all this intelligence together and form a solid prediction.

[Figure: PodPhone’s mental model (the same network as before), with PodPhone Product Releases marked as the variable you care most about, leading to your big prediction, which is what ElectroSkinny is looking for.]

But how do we do it? We’ve already seen how complex this problem is… with all that complexity, how can we possibly pick the right hypothesis?

Falsify your way to the truth

Falsification is the heart of hypothesis testing

Don’t try to pick the right hypothesis; just eliminate the disconfirmed hypotheses. This is the method of falsification, which is fundamental to hypothesis testing.
Picking the first hypothesis that seems best is called satisficing and looks like this:

[Figure: the five hypotheses H1 to H5, with one picked and the others never examined. This is satisficing.]

Don’t satisfice. Satisficing is really simple: it’s picking the first option without ruling out the others. On the other hand, falsification looks like this:

[Figure: the five hypotheses H1 to H5, with the disconfirmed ones crossed out. What remains is all that’s left.]

Falsification is more reliable.

It looks like both satisficing and falsification get you the same answer, right? They don’t always. The big problem with satisficing is that when people pick a hypothesis without thoroughly analyzing the alternatives, they often stick with it even as evidence piles up against it. Falsification enables you to have a more nimble perspective on your hypotheses and avoid a huge cognitive trap.

Use falsification in hypothesis testing and avoid the danger of satisficing.

Give falsification a try and cross out any hypotheses that are falsified by the evidence. Which ones does your evidence suggest are wrong?

Your hypotheses:
H1: Release will be tomorrow
H2: Release will be next month
H3: Release will be in six months
H4: Release will be in a year
H5: No release, product canceled

Your evidence: the same six pieces of information as before.

Why do you believe that the hypotheses you picked are falsified by the evidence?
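The difference between the two approaches can also be sketched in a few lines of code. This is a rough illustration under stated assumptions rather than a formal procedure: the `CONTRADICTS` table and the function names `satisfice` and `falsify` are hypothetical, and the two contradictions entered are the ones this chapter itself goes on to identify (the CEO’s statement against H1, the size of the investment against H5).

```python
HYPOTHESES = {
    "H1": "Release will be tomorrow",
    "H2": "Release will be next month",
    "H3": "Release will be in six months",
    "H4": "Release will be in a year",
    "H5": "No release, product canceled",
}

# Evidence item -> hypotheses it rules out (hypothetical table layout).
CONTRADICTS = {
    "CEO said there is no way the phone launches tomorrow": {"H1"},
    "PodPhone invested more in the phone than any other company": {"H5"},
}


def satisfice(hypotheses, seems_good):
    """Return the first hypothesis that seems good, ignoring the rest."""
    for name in hypotheses:
        if seems_good(name):
            return name
    return None


def falsify(hypotheses, contradicts):
    """Return the hypotheses that no piece of evidence rules out."""
    ruled_out = set().union(*contradicts.values())
    return [name for name in hypotheses if name not in ruled_out]


for name in falsify(HYPOTHESES, CONTRADICTS):
    print(name, HYPOTHESES[name])
```

Note that `satisfice` stops at the first acceptable option and never looks at the evidence against the others, while `falsify` examines every hypothesis and may return several survivors.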
Hypotheses eliminated

Which hypotheses did you find to be falsified?

H1: Release will be tomorrow (ruled out)
H2: Release will be next month
H3: Release will be in six months
H4: Release will be in a year
H5: No release, product canceled (ruled out)

The CEO’s statement, “No way we’re launching the new phone tomorrow,” rules out H1. The fact that PodPhone has invested more in the new phone than any other company ever has rules out H5.

Why do you believe that the hypotheses you picked are falsified by the evidence?

H1 is definitely falsified by the evidence, because the CEO has gone on record saying that there is no way it’ll happen tomorrow. The CEO might be lying, but that would be so weird that we can still rule out H1. H5 is falsified because PodPhone has put so much money into the phone. The phone might be delayed or changed, but unless the company ceases to exist, it’s hard to imagine that they’d cancel the new phone.

Q: Falsification seems like a really elaborate way to think about analyzing situations. Is it really necessary?

A: It’s a great way to overcome the natural tendency to focus on the wrong answer and ignore alternative explanations. By forcing you to think in a really formal way, you’ll be less likely to make mistakes that stem from your ignorance of important features of a situation.

Q: How does this sort of falsification relate to statistical hypothesis testing?

A: What you might have learned in statistics class (or better yet, in Head First Statistics) is a method of comparing a candidate hypothesis (the “alternate” hypothesis) to a baseline hypothesis (the “null” hypothesis). The idea is to identify a situation that, if true, would make the null hypothesis darn near impossible.

Q: So why aren’t we using that method?

A: One of the virtues of this approach is that it enables you to aggregate heterogeneous data of widely varying quality. This method is falsification in a very general form, which makes it useful for very complex problems. But it’s definitely a good idea to bone up on the “frequentist” hypothesis testing described above, because for tests where the data fit its parameters, you would not want to use anything else.

Q: Where’s the data in all this? I’d expect to see a lot more numbers.

A: Data is not just a grid of numbers. Falsification in hypothesis testing lets you take a more expansive view of “data” and aggregate a lot of heterogeneous data. You can put virtually any sort of data into the falsification framework.

Q: What’s the difference between using falsification to solve a problem and using optimization to solve it?

A: They’re different tools for different contexts. In certain situations, you’ll want to break out Solver to tweak your variables until you have the optimal values, and in other situations, you’ll want to use falsification to eliminate possible explanations of your data.

Q: I think that if my coworkers saw me reasoning like this they’d think I was crazy.

A: They certainly won’t think you’re crazy if you catch something really important. The aspiration of good data analysts is to uncover unintuitive answers to complex problems. Would you hire a conventionally minded data analyst? If you are really interested in learning something new about your data, you’ll go for the person who thinks outside the box.

Q: It seems like not all hypotheses could be falsified definitively. Like certain evidence might count against a hypothesis without disproving it.

A: That’s totally correct.

Q: OK. What if I can’t use falsification to eliminate all the hypotheses?

A: That’s the $64,000 question! Let’s see what we can do…

CEO of ElectroSkinny: “Nice work! I definitely know more now than I did when I brought you on board. But can you do even better than this? What about eliminating two more?”

Beyond falsification

We still have three hypotheses left. Looks like falsification didn’t solve the whole problem. So what’s the plan now?

How do you choose among the last three hypotheses?

You know that it’s a bad idea to pick the one that looks like it has the most support, and falsification has helped you eliminate only two of the hypotheses, so what should you do now?

H2: Release will be next month
H3: Release will be in six months
H4: Release will be in a year

Which one of these will you ultimately consider to be the strongest?

What are the benefits and drawbacks of each hypothesis-elimination technique?

Compare each hypothesis to the evidence and pick the one that has the most confirmation.
Just present all of the hypotheses and let the client decide whether to start manufacturing skins.
Use the evidence to rank hypotheses in the order of which has the fewest evidence-based knocks against it.

Weigh your hypotheses

Did you pick a hypothesis-elimination technique that you like best?

Compare each hypothesis to the evidence and pick the one that has the most confirmation.

This is dangerous. The problem is that the information I have is incomplete. It could be that there is something really important that I don’t know. And if that’s true, then picking the hypothesis based on what I do know will probably give me the wrong answer.

Just present all of the hypotheses and let the client decide whether to start manufacturing skins.
This is certainly an option, but the problem with it is that I’m not really taking any responsibility for the conclusions. In other words, I’m not really acting as a data analyst as much as someone who just delivers data. This is the wimpy approach.

Use the evidence to rank hypotheses in the order of which has the fewest evidence-based knocks against it.

This one is the best. I’ve already used falsification to rule out things that I’m sure can’t be true. Now, even though I can’t rule out my remaining hypotheses, I can still use the evidence to see which ones are the strongest.
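As a rough sketch of that ranking idea, the snippet below orders the surviving hypotheses by how many pieces of evidence count against each one. Everything in it is an assumption for illustration: the chapter has not yet said which evidence counts against which hypothesis, so the `KNOCKS` assignments are placeholder judgments, and `rank_by_knocks` is a hypothetical helper name. The order it prints reflects only those placeholders, not a conclusion about PodPhone.

```python
# Hypothesis -> evidence that counts against it without disproving it.
# These assignments are placeholders invented for the example.
KNOCKS = {
    "H2": [
        "A competitor just released a big new phone.",
        "Rumor: the CEO said there'd be no release for a year.",
    ],
    "H3": [
        "Rumor: the CEO said there'd be no release for a year.",
    ],
    "H4": [
        "The economy and consumer spending are both up.",
    ],
}


def rank_by_knocks(knocks):
    """Order hypotheses from fewest to most knocks; ties break by name."""
    return sorted(knocks, key=lambda name: (len(knocks[name]), name))


for name in rank_by_knocks(KNOCKS):
    print(name, len(KNOCKS[name]))
```

Counting knocks treats every piece of evidence as equally weighty. Evidence of very different quality, such as a rumor versus an on-the-record statement, would call for weighting each knock rather than simply counting it.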