Power and Sample Size for Normally Distributed Data

A common task in applied statistics is determining the sample size necessary to detect a statistically significant effect. Given the desired power, we can calculate the required sample size; given the intended sample size, we can calculate the resulting power. Before we go into how this works, we need to define a few terms.

Error Types

                                        Truth
                              H0                   H1
  Test  Negative              True Negative        False Negative
        (Don't Reject)                             (β)
        Positive              False Positive       True Positive
        (Reject)              (α)                  (Power = 1 − β)
  • \alpha = False Positive Rate.
    This is the chance of rejecting the null hypothesis H_0, given that the null hypothesis is true. Note that this value does not depend on the existence of an alternative hypothesis.
  • \beta = False Negative Rate.
    This is the chance of failing to reject the null hypothesis, given that the alternative hypothesis is true. Note that this value does depend on the alternative hypothesis.
  • Power is the complement of \beta, the false negative rate: power = 1-\beta. The power of the test is the probability of rejecting the null hypothesis, given that the alternative hypothesis is true (i.e., the null hypothesis is false).
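As a concrete illustration, the quantities above can be computed directly for a one-sided Z-test using Python's standard-library `statistics.NormalDist`. The means, standard deviation, sample size, and \alpha below are hypothetical values chosen for the sketch:

```python
from statistics import NormalDist

# Hypothetical example: H0: mu = 100 vs H1: mu = 105,
# known sigma = 15, sample size n = 36, alpha = 0.05 (one-sided).
mu0, mu1, sigma, n, alpha = 100.0, 105.0, 15.0, 36, 0.05

z = NormalDist()                 # standard normal distribution
z_alpha = z.inv_cdf(1 - alpha)   # one-sided critical value Z_alpha

# beta = P(Z < Z_alpha + (mu0 - mu1) / (sigma / sqrt(n)))
se = sigma / n ** 0.5
beta = z.cdf(z_alpha + (mu0 - mu1) / se)
power = 1 - beta

print(f"beta  = {beta:.4f}")
print(f"power = {power:.4f}")
```

With these inputs, roughly a third of true shifts of this size would be missed, so a larger n would be needed to reach conventional 80% power.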

Using these error types, we can estimate the sample size necessary to achieve statistically significant results in support of our alternative hypotheses.

Sample Size Calculation

  • Case 1: One-Sided Test: In a one-sided test, we will only reject the null hypothesis if our sample mean falls to the side of it that we specified in advance. That means that if our null hypothesis is \mu = \mu_0, our alternative hypothesis is either \mu > \mu_0 or \mu < \mu_0, but not both.
    Given \alpha, \beta

        \begin{align*} H_0:\hspace{1cm}\mu &= \mu_0\\ H_1:\hspace{1cm}\mu &= \mu_1 > \mu_0\\ \end{align*}

    We’re using \mu_1 > \mu_0, though this calculation will work for \mu_1 < \mu_0 as well. We’re only concerned here that this is a one-sided test.

        \begin{align*} \beta &= P(Z < \frac{\mu_0 - \mu_1}{\sigma/\sqrt{n}} + Z_{\alpha})\\ -Z_{\beta} &= \frac{\mu_0 - \mu_1}{\sigma/\sqrt{n}} + Z_{\alpha}\\ (-Z_{\beta} - Z_{\alpha})(\frac{\sigma}{\sqrt{n}}) &= \mu_0 - \mu_1\\ (Z_{\beta} + Z_{\alpha})(\frac{\sigma}{\sqrt{n}}) &= \mu_1 - \mu_0\\ (Z_{\beta} + Z_{\alpha})\sigma &= (\mu_1 - \mu_0)\sqrt{n}\\ \sqrt{n} &= \frac{(Z_{\beta} + Z_{\alpha})\sigma}{\mu_1 - \mu_0}\\ n &= \frac{\sigma^2(Z_{\beta} + Z_{\alpha})^2}{(\mu_1 - \mu_0)^2}\\ \end{align*}

    Because \mu_0 and \mu_1 appear only in the squared difference in the denominator, which is always positive, interchanging them leaves n unchanged, so this calculation also covers the case \mu_1 < \mu_0.
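The final formula above can be sketched in Python (standard library only; the function name and the example inputs are hypothetical choices for illustration):

```python
from statistics import NormalDist
from math import ceil

def one_sided_sample_size(mu0, mu1, sigma, alpha=0.05, beta=0.20):
    """n = sigma^2 (Z_beta + Z_alpha)^2 / (mu1 - mu0)^2, rounded up."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # one-sided critical value
    z_beta = z.inv_cdf(1 - beta)
    n = (sigma ** 2) * (z_beta + z_alpha) ** 2 / (mu1 - mu0) ** 2
    return ceil(n)  # round up to the next whole observation

# Hypothetical example: detect a shift from mu0 = 100 to mu1 = 105
# with sigma = 15, alpha = 0.05, and 80% power (beta = 0.20).
print(one_sided_sample_size(100, 105, 15))
```

Note that swapping mu0 and mu1 gives the same n, matching the observation that the formula works in either direction.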

  • Case 2: Two-sided Test: In a two-sided test, we will accept sample means that fall to either side of the null hypothesis as evidence against the null hypothesis. By this, if we have a null hypothesis \mu = \mu_0, our alternative hypothesis is that \mu = \mu_1, and that \mu_1 \neq \mu_0. That means that we will reject the null hypothesis if the sample mean is sufficiently far away from the null hypothesis, in either direction.

        \begin{align*} H_0:&\hspace{1cm}\mu = \mu_0\\ H_1:&\hspace{1cm}\mu = \mu_1 \neq \mu_0\\ \end{align*}

    In a two-sided test, we must correct the Z value at which we reject the null hypothesis, to reflect that we accept evidence on both sides of the null hypothesis. Splitting \alpha across both tails shifts the critical value out from Z_{\alpha} to Z_{\frac{\alpha}{2}}.

        \begin{align*} \beta &= P(Z < Z_{\frac{\alpha}{2}} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}})\\ -Z_{\beta} &= Z_{\frac{\alpha}{2}} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\\ n &= \frac{\sigma^2(Z_{\frac{\alpha}{2}} + Z_{\beta})^2}{(\mu_0 - \mu_1)^2} \end{align*}
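The two-sided formula differs from the one-sided case only in using Z_{\frac{\alpha}{2}} in place of Z_{\alpha}. A sketch, again with hypothetical inputs and a made-up function name:

```python
from statistics import NormalDist
from math import ceil

def two_sided_sample_size(mu0, mu1, sigma, alpha=0.05, beta=0.20):
    """n = sigma^2 (Z_{alpha/2} + Z_beta)^2 / (mu0 - mu1)^2, rounded up."""
    z = NormalDist()
    z_alpha_2 = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)
    n = (sigma ** 2) * (z_alpha_2 + z_beta) ** 2 / (mu0 - mu1) ** 2
    return ceil(n)

# Same hypothetical inputs as the one-sided case; the two-sided test
# requires a larger n because the critical value shifts out.
print(two_sided_sample_size(100, 105, 15))
```

For the same inputs, the two-sided requirement exceeds the one-sided one, which is the cost of guarding against shifts in both directions.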

Additional Links