MATH 5010 Sections 14–15 — Hypothesis Tests

1. Big picture

What is a hypothesis test?

Decision problem

We begin with a baseline model $H_0$ and summarize the data by a test statistic. If the observed statistic is so extreme that a value like it would be unlikely under $H_0$, we reject $H_0$.

$$H_0:\theta=\theta_0 \quad\text{versus}\quad H_1:\theta>\theta_0,$$ $$p\text{-value}=P_{H_0}(\text{test statistic at least as extreme as observed}).$$

Three numbers to track

Symbol	Meaning
$\alpha$	probability of rejecting a true $H_0$
$p$-value	tail probability computed under $H_0$
power	probability of rejecting $H_0$ when an alternative is true

Null model, Alternative model, Test statistic, Rejection region, p-value, Likelihood-ratio test, Type I error, Power

2. Interactive test builder

Normal one-sample $z$ test

This is the template behind many large-sample tests.

Alternative
Null mean $\mu_0$: 0.00
Observed sample mean $\bar x$: 1.20
Standard error: 0.50
Significance level $\alpha$: 0.05

Rejection region under $H_0$

The shaded tail is the rejection region. The green line is the observed test statistic.
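
As a rough sketch of the computation behind this display, the Python snippet below evaluates the $z$ statistic and its p-value for the settings listed above. The function name `z_test` and the `alternative` keyword are illustrative choices, not part of the original page, and the alternative actually selected in the display is not recorded here, so both a right-tailed and a two-sided version are shown.

```python
from statistics import NormalDist

def z_test(xbar, mu0, se, alpha=0.05, alternative="greater"):
    """One-sample z test. Returns (z, p_value, reject)."""
    z = (xbar - mu0) / se
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value, p_value < alpha

# Settings from the display: mu0 = 0.00, xbar = 1.20, standard error = 0.50.
z, p_right, reject_right = z_test(1.20, 0.00, 0.50)
_, p_two, reject_two = z_test(1.20, 0.00, 0.50, alternative="two-sided")
print(f"z = {z:.2f}, right-tailed p = {p_right:.4f}, two-sided p = {p_two:.4f}")
```

With these inputs $z=2.4$, so both versions reject at $\alpha=0.05$; the right-tailed p-value is about 0.008.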

3. Likelihood-ratio test

The likelihood-ratio idea

The likelihood-ratio statistic compares how well the null model explains the data against the best explanation available anywhere in the full parameter space $\Theta$, which contains the null set $\Theta_0$.

$$\Lambda(x)=\frac{\sup_{\theta\in\Theta_0}L(\theta;x)}{\sup_{\theta\in\Theta}L(\theta;x)}.$$

Small $\Lambda(x)$ means the null model explains the observed data poorly relative to the best-fitting model.

For one-sided mean tests

In common exponential-family examples, the LRT often reduces to a simple rule:

Reject $H_0$ when the sample mean is sufficiently large for $H_1:\theta>\theta_0$, or sufficiently small for $H_1:\theta<\theta_0$.

This is why the Poisson and exponential tests below are driven by $\bar X$.
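
To make the reduction concrete, here is a small sketch for the Poisson case, assuming $\Theta_0=\{\lambda_0\}$ and $\Theta=[\lambda_0,\infty)$. Under that assumption the maximizer over $\Theta$ is $\max(\bar x,\lambda_0)$, and $\log\Lambda = n\,[\bar x\log(\lambda_0/\hat\lambda)-(\lambda_0-\hat\lambda)]$. The function name `poisson_log_lrt` and the grid of sample means are illustrative only.

```python
import math

def poisson_log_lrt(xbar, n, lam0):
    """log Lambda for H0: lam = lam0 vs H1: lam > lam0, with Theta = [lam0, inf)."""
    lam_hat = max(xbar, lam0)  # maximizer of the likelihood over Theta
    if lam_hat == lam0:
        return 0.0  # the null value is already the best fit, so Lambda = 1
    return n * (xbar * math.log(lam0 / lam_hat) - (lam0 - lam_hat))

# Illustrative grid of sample means around lam0 = 5 with n = 40.
xbars = [4.5, 5.0, 5.4, 5.8, 6.2]
log_lams = [poisson_log_lrt(x, n=40, lam0=5.0) for x in xbars]
for x, ll in zip(xbars, log_lams):
    print(f"xbar = {x:.1f}  log Lambda = {ll:.3f}")
```

The printed values stay at zero up to $\lambda_0$ and then fall as $\bar x$ grows, so "reject when $\Lambda$ is small" is the same rule as "reject when $\bar x$ is large".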

4. Poisson mean test

$X_i\sim\mathrm{Poisson}(\lambda)$

Test $H_0:\lambda=\lambda_0$ vs. $H_1:\lambda>\lambda_0$.

$$Z=\frac{\bar X-\lambda_0}{\sqrt{\lambda_0/n}}\approx N(0,1),\qquad \text{reject if }Z>z_{1-\alpha}.$$

Sample size $n$: 40
Null mean $\lambda_0$: 5.00
Observed $\bar x$: 5.80
$\alpha$: 0.05

Poisson rejection region

This display uses the normal approximation for $\bar X$ under $H_0$.
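
A minimal sketch of this test in Python, using only the standard library and the settings listed above ($n=40$, $\lambda_0=5$, $\bar x=5.8$, $\alpha=0.05$). The function name `poisson_mean_test` is an illustrative choice, not something defined in the notes.

```python
import math
from statistics import NormalDist

def poisson_mean_test(xbar, n, lam0, alpha=0.05):
    """Approximate one-sided test of H0: lam = lam0 vs H1: lam > lam0."""
    se0 = math.sqrt(lam0 / n)               # sd of the sample mean under H0
    z = (xbar - lam0) / se0
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    p_value = 1.0 - NormalDist().cdf(z)     # upper-tail probability under H0
    return z, z_crit, p_value, z > z_crit

z_pois, z_crit, p_pois, reject_pois = poisson_mean_test(5.80, 40, 5.00)
print(f"Z = {z_pois:.3f}, cutoff = {z_crit:.3f}, p = {p_pois:.4f}, reject = {reject_pois}")
```

Here $Z\approx 2.26$ exceeds $z_{0.95}\approx 1.645$, so the approximate test rejects $H_0$ with a p-value near 0.012.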

5. Exponential mean test

$X_i\sim\mathrm{Exp}(\mu)$

Test $H_0:\mu=\mu_0$ vs. $H_1:\mu>\mu_0$, where $\mu$ is the mean.

$$T=\frac{2n\bar X}{\mu_0}\sim\chi^2_{2n}\quad\text{under }H_0,$$ $$\text{reject if }T>\chi^2_{2n,1-\alpha}.$$

Sample size $n$: 25
Null mean $\mu_0$: 10.00
Observed $\bar x$: 11.20
$\alpha$: 0.05

Chi-square pivot

For speed, the graph and p-value use the Wilson-Hilferty approximation to the chi-square CDF.
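
The sketch below shows one way the pivot and an approximate p-value could be computed. It uses the standard Wilson-Hilferty cube-root form, in which $(T/k)^{1/3}$ is treated as roughly normal with mean $1-2/(9k)$ and variance $2/(9k)$ for $k$ degrees of freedom. The function names are hypothetical, and this is not necessarily how the original display implements the approximation.

```python
import math
from statistics import NormalDist

def chi2_sf_wilson_hilferty(t, df):
    """Approximate P(chi-square with df degrees of freedom > t)."""
    mean = 1.0 - 2.0 / (9.0 * df)
    sd = math.sqrt(2.0 / (9.0 * df))
    z = ((t / df) ** (1.0 / 3.0) - mean) / sd   # cube-root transform
    return 1.0 - NormalDist().cdf(z)

def exponential_mean_test(xbar, n, mu0, alpha=0.05):
    """One-sided test of H0: mu = mu0 vs H1: mu > mu0 for exponential data."""
    t = 2.0 * n * xbar / mu0                    # chi-square pivot with 2n df
    p_value = chi2_sf_wilson_hilferty(t, 2 * n)
    return t, p_value, p_value < alpha

t_exp, p_exp, reject_exp = exponential_mean_test(11.20, 25, 10.00)
print(f"T = {t_exp:.1f} on 50 df, approximate p = {p_exp:.3f}, reject = {reject_exp}")
```

With the settings above, $T=56$ on 50 degrees of freedom and the approximate p-value is about 0.26, so $H_0$ is not rejected at $\alpha=0.05$.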

6. Errors and power

Type I error, Type II error, and power

	$H_0$ true	$H_1$ true
Reject $H_0$	Type I error $\alpha$	Power $1-\beta$
Do not reject	Correct	Type II error $\beta$

Poisson alternative $\lambda_1$: 6.00

Power curve for the Poisson test

The curve shows approximate rejection probability as the true Poisson mean changes.
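
One point on such a curve can be sketched as follows, assuming the same normal approximation is applied to $\bar X$ under the true mean (variance $\lambda/n$) and using $n=40$, $\lambda_0=5$, $\alpha=0.05$ from the Poisson test settings. The name `poisson_power` is illustrative.

```python
import math
from statistics import NormalDist

def poisson_power(lam_true, n, lam0, alpha=0.05):
    """Approximate P(reject H0) when the true Poisson mean is lam_true."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    cutoff = lam0 + z_crit * math.sqrt(lam0 / n)    # reject when xbar > cutoff
    z_alt = (cutoff - lam_true) / math.sqrt(lam_true / n)
    return 1.0 - NormalDist().cdf(z_alt)

power_at_null = poisson_power(5.0, 40, 5.0)   # equals alpha by construction
power_at_alt = poisson_power(6.0, 40, 5.0)    # the alternative lambda_1 = 6
print(f"power at 5.0: {power_at_null:.3f}, power at 6.0: {power_at_alt:.3f}")
```

At $\lambda=\lambda_0$ the rejection probability is $\alpha$, and at $\lambda_1=6$ it rises to roughly 0.86 under this approximation.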

7. Bayesian hypothesis tests

Posterior probability decision rule

For a one-sided Bayesian test, compute the posterior probability of the null region and reject if it is small.

$$H_0:\theta\le \theta_0,\quad H_1:\theta>\theta_0,\qquad \text{reject if }P(H_0\mid\text{data})<c.$$

Model
Sample size $n$: 20
Prior shape $a$: 2.0
Prior rate/scale $b$: 1.0
Null cutoff $\theta_0$: 5.0
Observed total $S=\sum X_i$: 130
Decision cutoff $c$: 0.05

Posterior density and null region

The shaded area is the posterior probability assigned to $H_0$.
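
The following sketch illustrates the rule under stated assumptions: the data are Poisson with mean $\theta$, the prior is Gamma with shape $a$ and rate $b$, and the posterior is therefore Gamma with shape $a+S$ and rate $b+n$. The model actually selected in the display is not recorded in the text, so this choice is an assumption. To stay within the standard library, the Gamma CDF is computed through the Poisson-sum identity, which requires the posterior shape to be a positive integer. All function names are hypothetical.

```python
import math

def gamma_cdf_integer_shape(x, shape, rate):
    """P(Gamma(shape, rate) <= x) for a positive integer shape and x > 0.

    Uses P(Gamma(k, rate) <= x) = P(Poisson(rate * x) >= k).
    """
    m = rate * x
    below = sum(
        math.exp(-m + j * math.log(m) - math.lgamma(j + 1))  # Poisson pmf at j
        for j in range(shape)                                # j = 0, ..., k - 1
    )
    return 1.0 - below

def posterior_null_prob(total, n, a, b, theta0):
    """Posterior P(theta <= theta0) under the assumed Gamma-Poisson model."""
    return gamma_cdf_integer_shape(theta0, a + total, b + n)

# Settings from the display: n = 20, a = 2, b = 1, theta0 = 5, S = 130, c = 0.05.
p_null = posterior_null_prob(total=130, n=20, a=2, b=1, theta0=5.0)
reject_bayes = p_null < 0.05
print(f"P(H0 | data) = {p_null:.4f}, reject = {reject_bayes}")
```

Under these assumptions the posterior mean is $132/21\approx 6.3$, the posterior probability of $\theta\le 5$ is well below $c=0.05$, and the rule rejects $H_0$.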

8. Simulation

Simulate the Poisson test

This experiment repeats the Poisson test many times. When the true mean equals $\lambda_0$, the rejection rate should be close to $\alpha$. When the true mean is larger, the rejection rate estimates power.

True mean used to generate data $\lambda_{true}$: 5.0
Number of simulated experiments: 500

Simulated $Z$ statistics

The vertical red line is the rejection cutoff from the Poisson test settings.
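
A self-contained sketch of this experiment is given below. It draws Poisson samples with Knuth's multiplication method (adequate for small means), applies the one-sided $Z$ test with $n=40$, $\lambda_0=5$, $\alpha=0.05$, and reports the fraction of rejections. The function names and the fixed seed are illustrative choices, and the original page may generate its samples differently.

```python
import math
import random
from statistics import NormalDist

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by multiplying uniforms (Knuth's method)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_rejection_rate(lam_true, n, lam0, alpha, reps, seed=0):
    """Fraction of simulated experiments in which the Z test rejects H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    se0 = math.sqrt(lam0 / n)
    rejections = 0
    for _ in range(reps):
        xbar = sum(sample_poisson(lam_true, rng) for _ in range(n)) / n
        if (xbar - lam0) / se0 > z_crit:
            rejections += 1
    return rejections / reps

size_hat = simulate_rejection_rate(5.0, 40, 5.0, 0.05, reps=500)
power_hat = simulate_rejection_rate(6.0, 40, 5.0, 0.05, reps=500)
print(f"rejection rate at 5.0: {size_hat:.3f}, at 6.0: {power_hat:.3f}")
```

With only 500 experiments the estimated rates carry noticeable Monte Carlo error, so the first number should be read as "near $\alpha$" rather than exactly 0.05.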

9. Quick checks

Self-check questions

Q1

For $H_0:\lambda=5$ vs. $H_1:\lambda>5$, large sample means should lead to:

Q2

A p-value is computed assuming:

Q3

Power means: