Some criticisms of null hypothesis testing focus on researchers’ misunderstanding of it. We have already seen, for example, that the *p* value is widely misinterpreted as the
probability that the null hypothesis is true. (Recall that it is really the probability of the sample result *if* the null hypothesis were true.) A closely related
misinterpretation is that 1 − *p* is the probability of replicating a statistically significant result. In one study, 60% of a sample of professional researchers thought that a
*p* value of .01 (for an independent-samples *t* test with 20 participants in each sample) meant there was a 99% chance of replicating the statistically significant
result (Oakes, 1986). Our earlier discussion of power should make it clear that this is far too optimistic.
As Table 13.5 shows, even if there were a large difference between means in the population, it would require 26 participants per sample to achieve a power of .80. And the program G*Power shows
that it would require 59 participants per sample to achieve a power of .99.
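
The sample sizes quoted above can be roughly checked with a short calculation. The sketch below is not how Table 13.5 or G*Power computes its values. It makes three assumptions: a normal approximation to the *t* test, a "large" difference meaning Cohen's *d* = 0.8, and a two-tailed test at α = .05. The function names are hypothetical, chosen only for this illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used here as an approximation to the t distribution.
_STD_NORMAL = NormalDist()


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed independent-samples test.

    Assumes equal group sizes and ignores the negligible chance of
    rejecting the null hypothesis in the wrong direction.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect size is d.
    expected_z = d * sqrt(n_per_group / 2)
    return _STD_NORMAL.cdf(expected_z - z_crit)


def approx_n_per_group(d, power, alpha=0.05):
    """Approximate participants needed per group to reach the given power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    # Assumed "large" effect size (Cohen's d = 0.8).
    print(approx_n_per_group(0.8, 0.80))
    print(approx_n_per_group(0.8, 0.99))
    print(round(approx_power(0.8, 20), 2))
```

Under these assumptions the approximation gives about 25 and 58 participants per sample, one fewer in each case than the *t*-based values of 26 and 59, because the normal distribution has slightly thinner tails than the *t* distribution. It also puts the power of a study with 20 participants per sample at only about .72 even when the true difference is large, which is well short of the 99% replication rate that many of the researchers in the Oakes study expected.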

Another set of criticisms focuses on the logic of null hypothesis testing. To many, the strict convention of rejecting the null hypothesis when *p* is less than .05 and retaining it when
*p* is greater than .05 makes little sense. This criticism does not have to do with the specific value of .05 but with the idea that there should be any rigid dividing line between
results that are considered significant and results that are not. Imagine two studies on the same statistical relationship with similar sample sizes. One has a *p* value of .04 and the
other a *p* value of .06. Although the two studies have produced essentially the same result, the former is likely to be considered interesting and worthy of publication and the latter
simply not significant. This convention is likely to prevent good research from being published and to contribute to the file drawer problem.

Yet another set of criticisms focuses on the idea that null hypothesis testing, even when understood and carried out correctly, is simply not very informative. Recall that the null hypothesis is
that there is no relationship between variables in the population (e.g., Cohen’s *d* or Pearson’s *r* is precisely 0). So to reject the null hypothesis is simply to say
that there is *some* nonzero relationship in the population. But this is not really saying very much. Imagine if chemistry could tell us only that there is
*some* relationship between the temperature of a gas and its volume, as opposed to providing a precise equation to describe that relationship. Some critics even
argue that the relationship between two variables in the population is never precisely 0 if it is carried out to enough decimal places. In other words, the null hypothesis is never literally
true. So rejecting it does not tell us anything we did not already know!

To be fair, many researchers have come to the defense of null hypothesis testing. One of them, Robert Abelson, has argued that when it is correctly understood and carried out, null hypothesis testing does serve an important purpose (Abelson, 1995). Especially when dealing with new phenomena, it gives researchers a principled way to convince others that their results should not be dismissed as mere chance occurrences.
