Statistical Testing Error
In science and in life, we are always dealing with a large amount of annoying uncertainty. That’s just how it goes. If we waited to be certain in all situations before speaking or acting, we’d never say or do anything. When we do our probabilistic testing of hypotheses, we can expect to make errors periodically. There are two types of errors that we make, and from which we hope to learn.
TYPE I, OR ALPHA ERROR
We reject the null hypothesis when it is in reality true. (We accuse our friend of using a loaded coin when our friend was really playing fair.)
The alpha level not only sets the cutoff point at which we will reject the null hypothesis; it also sets the likelihood of committing this type of error. When we set an alpha level of .05, we are committing ourselves to a 5% chance of rejecting a true null hypothesis; at an alpha level of .01 it’s 1%, and at .001 it’s .1%. The lower the alpha level, the less chance of committing this type of error. But this certainty has a cost. The lower we set the alpha level, the greater the chance that we will commit a beta error and fail to reject a false null hypothesis. We sacrifice what we call POWER in order to use a very low alpha level. This competing type of error, whose likelihood we must balance against that of an alpha error, is:
TYPE II, OR BETA ERROR
We fail to reject the null hypothesis when it is in reality false. (We assume our friend is playing fair when in fact he or she is playing with a loaded coin.)
Beta is the probability of making this type of error. The implication of this type of error is that, in our quest to be scrupulous about not mistaking a chance or random effect for an actual systematic effect of interest, we overlook an effect that is really there.
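To make the two error types concrete, here is a minimal simulation sketch of the coin example. The specifics are assumptions chosen for illustration, not part of the discussion above: 20 flips per experiment, a decision rule that rejects the null ("the coin is fair") at 15 or more heads (a fair coin does this only about 2% of the time), and a hypothetical loaded coin that lands heads 70% of the time.

```python
import random

def count_heads(p_heads, n_flips, rng):
    """Flip a coin n_flips times and return the number of heads."""
    return sum(rng.random() < p_heads for _ in range(n_flips))

def rejection_rate(p_heads, n_flips=20, cutoff=15, n_experiments=20000, seed=0):
    """Fraction of simulated experiments in which we reject 'the coin is fair'.

    The cutoff of 15 heads in 20 flips is an illustrative assumption.
    """
    rng = random.Random(seed)
    rejections = sum(
        count_heads(p_heads, n_flips, rng) >= cutoff
        for _ in range(n_experiments)
    )
    return rejections / n_experiments

# Null is true (fair coin): every rejection is a Type I (alpha) error.
type_1_rate = rejection_rate(0.5)

# Null is false (hypothetical 70% heads coin): every non-rejection
# is a Type II (beta) error.
type_2_rate = 1 - rejection_rate(0.7)

print(f"Estimated Type I error rate:  {type_1_rate:.3f}")
print(f"Estimated Type II error rate: {type_2_rate:.3f}")
```

Under these assumptions the fair friend is rarely accused, but the friend with the loaded coin escapes detection in more than half of the experiments.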
Why all this testing?
We need our test and our significance levels to allow us a sufficient level of power as well as to guard carefully against finding effects that aren’t there. Power is the probability that our research and tests will reject an actually false null hypothesis; it is equal to 1 minus beta. If we set our alpha level so low that we have little or no chance of doing so, real effects will go undetected.
Think of alpha and beta as lying on a diagonal line on a graph: with everything else (such as the number of subjects) held fixed, as one goes up, the other goes down. The .05 significance level, in most cases, is regarded as the best compromise between alpha and beta errors, although significance of results at the .01 level is generally more highly prized in the world of research. Although a stricter level is usually a very good thing, let’s say your research results show that the effect you were looking at would have occurred only 2% of the time by chance. At an alpha level of .01, you still fail to reject the null hypothesis. If there really has been an effect from your treatment, you have committed a beta error; you probably did not allow yourself sufficient power in setting the alpha level.
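The trade-off can be checked with exact binomial arithmetic for the coin example. This is a sketch under assumptions not stated in the text: 20 flips, a one-sided test that rejects for too many heads, and a hypothetical loaded coin with a 70% chance of heads. The names `upper_tail`, `critical_heads`, and `error_rates` are illustrative helpers.

```python
from math import comb

N_FLIPS = 20     # assumed number of flips per experiment
LOADED_P = 0.7   # assumed heads probability of the hypothetical loaded coin

def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def critical_heads(n, alpha):
    """Smallest head count k such that a fair coin gives k or more heads
    with probability at most alpha."""
    for k in range(n + 2):
        if upper_tail(n, k, 0.5) <= alpha:
            return k

def error_rates(alpha):
    """Return (critical head count, power, beta) for the assumed loaded coin."""
    k = critical_heads(N_FLIPS, alpha)
    power = upper_tail(N_FLIPS, k, LOADED_P)
    return k, power, 1 - power

for alpha in (0.05, 0.01, 0.001):
    k, power, beta = error_rates(alpha)
    print(f"alpha={alpha}: reject at {k}+ heads, power={power:.3f}, beta={beta:.3f}")
```

Each stricter alpha level pushes the cutoff to a higher head count, so power falls and beta rises. Because head counts are whole numbers, the actual chance of an alpha error at each cutoff is at most the nominal alpha level rather than exactly equal to it.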
The alpha level, along with the number of subjects (which determines the degrees of freedom), is used in conjunction with statistical tables in order to set a "critical" value for our statistic ("t" for t tests and "F" for ANOVA). If our testing procedure yields a t or F value higher than the critical value, we can reject the null. If our obtained t or F does not exceed the critical value, we FAIL TO REJECT the null (by custom we do not "Accept" it, because nothing has been proven and a future research study may not replicate our non-significant results). Because of the reciprocal nature of these types of error, we need to carefully consider the consequences of each before we set our alpha level.
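The table lookup described above can be sketched in a few lines. The dictionary holds a small excerpt of two-tailed critical t values as found in a standard t table; the choice of a two-tailed test, the degrees of freedom shown, the obtained t of 2.5, and the helper name `decide` are all assumptions for illustration.

```python
# Excerpt of two-tailed critical t values, keyed by (alpha, degrees of freedom).
CRITICAL_T = {
    (0.05, 10): 2.228, (0.05, 20): 2.086, (0.05, 30): 2.042,
    (0.01, 10): 3.169, (0.01, 20): 2.845, (0.01, 30): 2.750,
}

def decide(obtained_t, alpha, df):
    """Compare an obtained t with the tabled critical value (two-tailed)."""
    critical = CRITICAL_T[(alpha, df)]
    if abs(obtained_t) > critical:
        return "reject the null"
    return "fail to reject the null"

# Hypothetical study: obtained t = 2.5 with 20 degrees of freedom.
for alpha in (0.05, 0.01):
    print(f"alpha={alpha}: {decide(2.5, alpha, 20)}")
```

The same obtained t leads to rejection at the .05 level but not at the .01 level, which is why the alpha level has to be chosen before the data are examined.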