Hypothesis testing allows us to evaluate a hypothesis, or compare two hypotheses. I include it here as part of the theoretical framework necessary for validation of the statistical models offered.
In hypothesis testing, there are two hypotheses.
- $H_0$ is the null hypothesis. This is generally the hypothesis we are trying to disprove. We attempt to mount evidence against the null hypothesis.
- $H_1$ is the alternative hypothesis. This is the hypothesis we are ready to accept if the null hypothesis is rejected.
Types of Error – There are 2 types of error associated with hypothesis testing.
- $\alpha$ is the probability of rejecting $H_0$ given $H_0$ is true (a Type I error).
- $\beta$ is the probability of failing to reject $H_0$ given $H_0$ is false (a Type II error).
- Power of a test is given as $1 - \beta$. It is the probability of rejecting $H_0$ given $H_0$ is false.
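To make these quantities concrete, here is a minimal sketch in Python that computes power (and so $\beta$) for a two-sided test on a normal mean with known variance, the test covered in the first example below. The function name `z_test_power` and all the numbers (a hypothesized mean of 100, a true mean of 103, a standard deviation of 10, a sample of 50) are assumptions chosen purely for illustration. In practice the true mean is unknown, so power is usually computed over a range of plausible values.

```python
import math
from statistics import NormalDist

def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a two-sided z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is mu_true, the test statistic is N(shift, 1).
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # P(Z > z_crit)
    lower_tail = std_normal.cdf(-z_crit - shift)     # P(Z < -z_crit)
    return upper_tail + lower_tail

# Hypothetical numbers, for illustration only.
power = z_test_power(mu0=100, mu_true=103, sigma=10, n=50)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")

# If H0 is actually true, the rejection probability is just alpha.
print(f"rejection rate under H0 = {z_test_power(100, 100, 10, 50):.3f}")
```

Setting the true mean equal to the hypothesized mean recovers $\alpha$, which is a quick check that the three definitions fit together.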
A $p$-value, associated with a test statistic, is the smallest $\alpha$-level at which we would reject the null hypothesis, given the data on hand. It is interpreted as the probability of seeing a sample (the data) as extreme or more extreme than what you saw, given $H_0$ (the null hypothesis) is true. We can NOT interpret the p-value as the probability that the null hypothesis is true. Data cannot serve to prove the null hypothesis is true; data only offers evidence that it is untrue. We evaluate the strength of that evidence using the hypothesis test.
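As a small illustration of the "as extreme or more extreme" reading, the sketch below computes a two-sided p-value for a test statistic that is standard normal under the null hypothesis. The function name `two_sided_p_value` and the example statistics are my own assumptions, not values from any data set.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. assuming H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical test statistics, for illustration only.
for z in (0.5, 1.96, 3.0):
    print(f"z = {z:4.2f} -> p = {two_sided_p_value(z):.4f}")
```

The further the statistic falls from zero, the smaller the p-value, and the stronger the evidence against the null hypothesis.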
Example: Normal Hypothesis Testing, Known Variance, Unknown Mean
Let us assume that a sample of size $n$ is taken from a population with a known variance $\sigma^2$ and unknown mean $\mu$. We know this population to be normally distributed, so we can model the sample mean $\bar{x}$ using a normal distribution without any further assumptions.
Let’s say we have an initial guess, $\mu_0$, as to the population mean, that we’re trying to disprove. We represent that with our null hypothesis. Its complement, that the population mean is something other than $\mu_0$, is the alternative hypothesis.
- $H_0$: $\mu = \mu_0$
- $H_1$: $\mu \neq \mu_0$
By nature of their definitions, we can’t calculate power or $\beta$ levels without an actual value for $\mu$, so I’m not going to here.
The test statistic for the normal distribution, $Z$, is defined as $Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$. Under the null hypothesis, $Z$ is normally distributed with mean 0 and variance 1. In this case, because we’re conducting a two-sided test, we calculate the rejection region of our test as being outside the range $(-z_{\alpha/2}, z_{\alpha/2})$. Depending on our intended level of significance, we go to the Z-table to get those values. If we use an $\alpha$ level of .05, then that range is approximately (-1.96, 1.96).
So if the $Z$ calculated from $\bar{x}$ is greater than 1.96, or less than -1.96, then we reject the null hypothesis.
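The decision rule above can be sketched in a few lines of Python. The function name `z_test` and the inputs (a sample mean of 52.1 from 36 observations, a hypothesized mean of 50, a known standard deviation of 6) are hypothetical values picked for illustration. The critical value comes from the standard library's `NormalDist` rather than a printed Z-table.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 with known standard deviation sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    reject = abs(z) > z_crit                      # outside (-z_crit, z_crit)
    return z, z_crit, reject

# Hypothetical inputs, for illustration only.
z, z_crit, reject = z_test(sample_mean=52.1, mu0=50.0, sigma=6.0, n=36)
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Here the standard error is $6 / \sqrt{36} = 1$, so $Z = 2.1$, which falls outside (-1.96, 1.96) and leads us to reject the null hypothesis.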
Example: Normal Hypothesis Testing, Unknown Variance
Let us assume we have a sample of size $n$, taken from a population with an unknown variance $\sigma^2$ and an unknown mean $\mu$. We believe this population to be normally distributed, but since we don’t know the variance we cannot use the normal model. Thus enters the Student’s t-test.
Because we don’t know the variance, we have to use an estimator of variance. The unbiased estimator of variance is $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$. We use $n-1$ degrees of freedom here because we subtract 1 for calculation of $\bar{x}$.
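A quick sketch of the estimator, written out by hand and checked against Python's standard library. The function name `unbiased_variance` and the data values are made up for illustration.

```python
from statistics import mean, pvariance, variance

def unbiased_variance(xs):
    """Sample variance with the n - 1 denominator."""
    n = len(xs)
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = unbiased_variance(data)

print(f"s^2 by hand (n - 1 denominator): {s2:.4f}")
print(f"statistics.variance:             {variance(data):.4f}")
print(f"statistics.pvariance (divides by n): {pvariance(data):.4f}")
```

Dividing by $n$ instead of $n-1$ gives a smaller number, which on average underestimates the population variance. That is why the $n-1$ version is the one used here.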
Since we used an estimator of variance, we must use a different test statistic. Enter the $t$ statistic, $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$.
The $t$ statistic follows a Student’s t distribution with $n-1$ degrees of freedom. The distribution itself asymptotically approaches the normal distribution as the degrees of freedom go up. In practice, it’s generally safe for sample sizes larger than 30 to simply use the normal distribution. However, because we’re performing calculations by computer, we can let the computer do the calculations and use the t distribution with however many degrees of freedom we have.
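To see how quickly the t distribution closes in on the normal, the sketch below compares two-sided critical values at an $\alpha$ level of .05. The t values are copied from a standard t-table (rounded to three decimals) rather than computed, and the variable names are my own.

```python
from statistics import NormalDist

# Normal critical value for a two-sided test at alpha = 0.05.
z_crit = NormalDist().inv_cdf(0.975)

# Two-sided t critical values at alpha = 0.05 from a standard t-table,
# keyed by degrees of freedom (rounded to three decimals).
t_crit_table = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}

gaps = {df: t_crit - z_crit for df, t_crit in t_crit_table.items()}

for df, t_crit in t_crit_table.items():
    print(f"df = {df:3d}: t = {t_crit:.3f}, z = {z_crit:.3f}, gap = {gaps[df]:.3f}")
```

The gap shrinks steadily as the degrees of freedom grow, but it never quite reaches zero, so using the t distribution directly costs nothing and is slightly more accurate.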
But anyways, we find the appropriate critical value $t_{\alpha/2,\,n-1}$ for the given degrees of freedom and $\alpha$ level. If $t$ as calculated from $\bar{x}$ and $s$ is less than $-t_{\alpha/2,\,n-1}$ or greater than $t_{\alpha/2,\,n-1}$, then we reject the null hypothesis.
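Here is a minimal sketch of the whole test in Python. The standard library has no t distribution, so this version gets the two-sided p-value by numerically integrating the t density with Simpson's rule. That is a stand-in I chose to keep the example self-contained; a statistics package such as SciPy would normally do this step. All function names and the sample values are hypothetical. Rejecting when the p-value is below $\alpha$ is equivalent to the critical-value rule described above.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_t_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) via Simpson's rule on [0, |t_stat|]; steps must be even."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|), by symmetry
    return 1 - central_mass

def one_sample_t_test(sample, mu0, alpha=0.05):
    """Two-sided one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    # stdev() uses the n - 1 denominator, matching the unbiased estimator.
    t_stat = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    p = two_sided_t_p_value(t_stat, df=n - 1)
    return t_stat, p, p < alpha

# Hypothetical sample, for illustration only.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.4, 5.9, 5.5]
t_stat, p, reject = one_sample_t_test(sample, mu0=5.0)
print(f"t = {t_stat:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With 10 observations there are 9 degrees of freedom, and the t-table critical value at an $\alpha$ level of .05 is about 2.262, so a statistic well above that is rejected under either rule.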