What is a hypothesis test?
Decision problem
We begin with a baseline model $H_0$ (the null hypothesis) and summarize the data by a test statistic. If the observed statistic falls in a region of values that are unlikely under $H_0$, we reject $H_0$; otherwise we do not reject it.
Three numbers to track
| Symbol | Meaning |
|---|---|
| $\alpha$ | probability of rejecting a true $H_0$ |
| $p$-value | probability, computed under $H_0$, of a statistic at least as extreme as the one observed |
| power | probability of rejecting $H_0$ when an alternative is true |
Normal one-sample $z$ test
This is the template behind many large-sample tests.
Rejection region under $H_0$
The shaded tail is the rejection region. The green line is the observed test statistic.
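As a minimal sketch of the template, assume a known standard deviation $\sigma$ and an upper-tailed alternative, so the statistic is $Z=(\bar X-\mu_0)/(\sigma/\sqrt n)$, which is standard normal under $H_0$. The function name `z_test_upper` and the numbers in the call are illustrative assumptions, not values from this page.

```python
from math import sqrt
from statistics import NormalDist

def z_test_upper(xbar, mu0, sigma, n, alpha=0.05):
    """Upper-tailed one-sample z test of H0: mu = mu0 with known sigma."""
    std_normal = NormalDist()
    z = (xbar - mu0) / (sigma / sqrt(n))     # standardized test statistic
    cutoff = std_normal.inv_cdf(1 - alpha)   # left edge of the rejection region
    p_value = 1 - std_normal.cdf(z)          # upper-tail area beyond z
    return z, cutoff, p_value, z > cutoff

# Illustrative inputs (assumed): sample mean 10.5 from n = 64 observations
z, cutoff, p_value, reject = z_test_upper(xbar=10.5, mu0=10.0, sigma=2.0, n=64)
print(f"z = {z:.3f}, cutoff = {cutoff:.3f}, p = {p_value:.4f}, reject = {reject}")
```

Rejecting when `z > cutoff` and rejecting when `p_value < alpha` are the same decision; the two forms correspond to the shaded tail and the tail area beyond the observed statistic.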
The likelihood-ratio idea
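The likelihood-ratio statistic compares how well the null model explains the data with the best explanation available in the larger parameter space:

$$\Lambda(x)=\frac{\sup_{\theta\in\Theta_0}L(\theta;x)}{\sup_{\theta\in\Theta}L(\theta;x)},$$

where $L$ is the likelihood, $\Theta_0$ is the set of parameter values allowed by $H_0$, and $\Theta$ is the full parameter space, so $0\le\Lambda(x)\le 1$.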
Small $\Lambda(x)$ means the null model explains the observed data poorly relative to the best-fitting model.
For one-sided mean tests
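In common exponential-family examples, the LRT often reduces to a simple rule: reject $H_0$ when the sample mean $\bar X$ exceeds a cutoff $c$ chosen so that the test has level $\alpha$.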
This is why the Poisson and exponential tests below are driven by $\bar X$.
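To see the reduction in one case, the sketch below evaluates the log likelihood ratio for i.i.d. Poisson data with $H_0:\lambda=\lambda_0$ against the larger space $\lambda\ge\lambda_0$. It assumes the restricted maximum-likelihood estimate is $\max(\bar x,\lambda_0)$; the function name and the grid of sample means are hypothetical.

```python
from math import log

def poisson_log_lrt(xbar, n, lam0):
    """log Lambda for H0: lambda = lam0 within lambda >= lam0 (i.i.d. Poisson)."""
    lam_hat = max(xbar, lam0)   # MLE restricted to lambda >= lam0
    if lam_hat == lam0:
        return 0.0              # the null value is already the best fit
    # log L(lam) = -n*lam + n*xbar*log(lam) + const; the constant cancels
    return n * ((lam_hat - lam0) - xbar * log(lam_hat / lam0))

n, lam0 = 20, 5.0
xbars = (4.0, 5.0, 5.5, 6.0, 7.0)
values = [poisson_log_lrt(xbar, n, lam0) for xbar in xbars]
for xbar, value in zip(xbars, values):
    print(f"xbar = {xbar:.1f}  log Lambda = {value:.3f}")
```

Once $\bar x$ passes $\lambda_0$, the log ratio decreases as $\bar x$ grows, so "small $\Lambda(x)$" and "large $\bar x$" describe the same rejection region.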
$X_i\sim\mathrm{Poisson}(\lambda)$
Test $H_0:\lambda=\lambda_0$ vs. $H_1:\lambda>\lambda_0$.
Poisson rejection region
This display uses the normal approximation for $\bar X$ under $H_0$.
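A rough sketch of the normal-approximation version of this test follows. It assumes the statistic $Z=(\bar X-\lambda_0)/\sqrt{\lambda_0/n}$, which uses the fact that a Poisson variable has variance equal to its mean; the counts are made-up illustrative data.

```python
from math import sqrt
from statistics import NormalDist, fmean

def poisson_z_test(counts, lam0, alpha=0.05):
    """Approximate upper-tailed test of H0: lambda = lam0 for Poisson counts."""
    n = len(counts)
    xbar = fmean(counts)
    z = (xbar - lam0) / sqrt(lam0 / n)        # Var(Xbar) = lam0 / n under H0
    cutoff = NormalDist().inv_cdf(1 - alpha)
    p_value = 1 - NormalDist().cdf(z)
    return xbar, z, p_value, z > cutoff

counts = [7, 4, 6, 8, 5, 9, 6, 7, 5, 8]       # made-up data for illustration
xbar, z, p_value, reject = poisson_z_test(counts, lam0=5.0)
print(f"xbar = {xbar:.2f}, z = {z:.3f}, p = {p_value:.4f}, reject = {reject}")
```

For small $n\lambda_0$ the approximation can be poor, and an exact tail probability of the Poisson sum would be the safer choice.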
$X_i\sim\mathrm{Exp}(\mu)$
Test $H_0:\mu=\mu_0$ vs. $H_1:\mu>\mu_0$, where $\mu$ is the mean.
Chi-square pivot
For speed, the graph and p-value use the Wilson-Hilferty approximation to the chi-square CDF.
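The pivot here is $2\sum_i X_i/\mu_0$, which has a chi-square distribution with $2n$ degrees of freedom under $H_0$. The sketch below pairs it with the Wilson-Hilferty cube-root approximation to the chi-square CDF; the function names and the sample values are assumptions for illustration, not the page's own implementation.

```python
from math import sqrt
from statistics import NormalDist

def chi2_cdf_wh(x, k):
    """Wilson-Hilferty approximation to P(chi-square with k df <= x)."""
    if x <= 0:
        return 0.0
    c = 2.0 / (9.0 * k)
    z = ((x / k) ** (1.0 / 3.0) - (1.0 - c)) / sqrt(c)
    return NormalDist().cdf(z)

def exp_mean_test(xs, mu0, alpha=0.05):
    """Upper-tailed test of H0: mu = mu0 for exponential data with mean mu."""
    n = len(xs)
    pivot = 2.0 * sum(xs) / mu0               # chi-square, 2n df, under H0
    p_value = 1.0 - chi2_cdf_wh(pivot, 2 * n)
    return pivot, p_value, p_value < alpha

xs = [2.1, 3.4, 1.2, 5.6, 2.7]                # made-up waiting times
pivot, p_value, reject = exp_mean_test(xs, mu0=2.0)
print(f"pivot = {pivot:.2f}, p = {p_value:.4f}, reject = {reject}")
```

The approximation only needs a cube root and a normal CDF, which is why it is cheap enough to redraw on every slider change; it is less accurate for very small degrees of freedom.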
Type I error, Type II error, and power
| | $H_0$ true | $H_1$ true |
|---|---|---|
| Reject $H_0$ | Type I error $\alpha$ | Power $1-\beta$ |
| Do not reject | Correct | Type II error $\beta$ |
Power curve for the Poisson test
The curve shows approximate rejection probability as the true Poisson mean changes.
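One way to compute such a curve, sketched under the normal approximation: fix the cutoff for $\bar X$ using $\lambda_0$, then evaluate the probability of exceeding it when $\bar X$ is approximately normal with mean $\lambda$ and variance $\lambda/n$. The function name and the settings $\lambda_0=5$, $n=20$ are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def poisson_power(lam, lam0, n, alpha=0.05):
    """Approximate P(reject H0: lambda = lam0) when the true mean is lam."""
    std_normal = NormalDist()
    # Cutoff for the sample mean, set using the null variance lam0 / n
    xbar_cutoff = lam0 + std_normal.inv_cdf(1 - alpha) * sqrt(lam0 / n)
    # Under the true mean, Xbar is roughly Normal(lam, lam / n)
    return 1 - std_normal.cdf((xbar_cutoff - lam) / sqrt(lam / n))

grid = (5.0, 5.5, 6.0, 6.5, 7.0)
powers = [poisson_power(lam, lam0=5.0, n=20) for lam in grid]
for lam, power in zip(grid, powers):
    print(f"lambda = {lam:.1f}  power = {power:.3f}")
```

At $\lambda=\lambda_0$ the formula returns $\alpha$ by construction, and the curve rises toward 1 as the true mean moves away from $\lambda_0$.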
Posterior probability decision rule
For a one-sided Bayesian test, compute the posterior probability of the null region and reject $H_0$ if that probability falls below a chosen threshold.
Posterior density and null region
The shaded area is the posterior probability assigned to $H_0$.
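The page does not state a model here, so the following is only a sketch under loud assumptions: Poisson counts, a conjugate Gamma prior with integer shape and a rate parameter, the null region $\{\lambda\le\lambda_0\}$, and a decision threshold of 0.05. With an integer shape, the Gamma CDF can be written as a finite Poisson sum, which keeps the example within the standard library.

```python
from math import exp

def gamma_cdf_int_shape(x, shape, rate):
    """CDF of Gamma(shape, rate) at x, valid for a positive integer shape."""
    if x <= 0:
        return 0.0
    m = rate * x
    term = exp(-m)            # Poisson(m) probability of 0
    below = 0.0
    for j in range(shape):    # accumulate P(Poisson(m) <= shape - 1)
        below += term
        term *= m / (j + 1)
    return 1.0 - below        # equals P(Poisson(m) >= shape)

prior_shape, prior_rate = 2, 0.4            # assumed prior, mean 5
counts = [7, 4, 6, 8, 5, 9, 6, 7, 5, 8]     # made-up data for illustration
lam0 = 5.0

post_shape = prior_shape + sum(counts)      # conjugate Gamma update
post_rate = prior_rate + len(counts)
prob_null = gamma_cdf_int_shape(lam0, post_shape, post_rate)
reject = prob_null < 0.05                   # assumed decision threshold
print(f"P(lambda <= {lam0} | data) = {prob_null:.4f}, reject = {reject}")
```

Unlike a p-value, `prob_null` is a direct probability statement about $\lambda$ given the data, and it depends on the prior that was assumed.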
Simulate the Poisson test
This experiment repeats the Poisson test many times. When the true mean equals $\lambda_0$, the rejection rate should be close to $\alpha$. When the true mean is larger, the rejection rate estimates power.
Simulated $Z$ statistics
The vertical red line is the rejection cutoff from the Poisson test settings.
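The experiment can be approximated offline with the sketch below. It assumes the normal-approximation statistic $Z=(\bar X-\lambda_0)/\sqrt{\lambda_0/n}$ and draws Poisson variates with Knuth's multiplication method, since the standard library has no Poisson sampler; the settings $\lambda_0=5$, $n=30$, 4000 repetitions, and the seed are arbitrary choices.

```python
import random
from math import exp, sqrt
from statistics import NormalDist, fmean

def poisson_draw(lam, rng):
    """One Poisson(lam) variate via Knuth's multiplication method."""
    threshold = exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def rejection_rate(true_lam, lam0, n, alpha=0.05, reps=4000, seed=0):
    """Fraction of simulated samples in which the upper-tailed test rejects."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(reps):
        xbar = fmean([poisson_draw(true_lam, rng) for _ in range(n)])
        z = (xbar - lam0) / sqrt(lam0 / n)
        rejections += z > cutoff
    return rejections / reps

size_est = rejection_rate(true_lam=5.0, lam0=5.0, n=30)    # should be near alpha
power_est = rejection_rate(true_lam=6.0, lam0=5.0, n=30)   # estimates power
print(f"rejection rate at lambda0: {size_est:.3f}")
print(f"rejection rate at lambda = 6: {power_est:.3f}")
```

Because the counts are discrete and the cutoff comes from a normal approximation, the rate at $\lambda_0$ will usually sit near $\alpha$ rather than exactly on it, even with many repetitions.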
Self-check questions
Q1
For $H_0:\lambda=5$ vs. $H_1:\lambda>5$, large sample means should lead to:
Q2
A p-value is computed assuming:
Q3
Power means: