MATH 5010 Sections 12–13 — Point Estimation

1. Big picture

From sample to parameter estimate

An estimator is a statistic used to estimate an unknown parameter. In Sections 12–13, the central examples are the Poisson and Exponential models, where the sample mean appears repeatedly: as the method-of-moments (MoM) estimator, as the maximum likelihood estimator (MLE), and often as the limit of Bayes estimators.

$$\widehat\theta = T(X_1,\dots,X_n),\qquad \text{MSE}(\widehat\theta)=E[(\widehat\theta-\theta)^2]=\operatorname{Var}(\widehat\theta)+\operatorname{Bias}(\widehat\theta)^2.$$

Key terms: Method of Moments, Maximum Likelihood, Bayes estimator, Unbiasedness, Consistency, Efficiency, Cramér–Rao lower bound

2. Two classical estimation methods

MoM and MLE

Method	Main idea	Typical equation
Method of Moments	Match population moments to sample moments	$E_\theta[X]=\bar X$
Maximum Likelihood	Choose the parameter that makes observed data most likely	$\frac{d}{d\theta}\ell(\theta)=0$
Bayes estimator	Combine prior and likelihood, then estimate from posterior	$E[\theta\mid X]$ under squared error loss

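Teaching message: for many one-parameter exponential-family models, including the Poisson and Exponential models in these sections, MoM and MLE give the same estimator. Their derivations still rest on different ideas: matching moments versus maximizing the likelihood.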

Why the sample mean appears so often

The sample mean is the empirical version of the first moment. If the parameter is itself the mean, MoM immediately gives the sample mean. MLE often gives the same estimator because the likelihood depends on the data through $\sum X_i$.

3. Poisson estimation

$X_i\sim\mathrm{Poisson}(\lambda)$

For Poisson data, $E[X]=\lambda$ and $\operatorname{Var}(X)=\lambda$. Both method of moments and maximum likelihood give

$$\widehat\lambda_{MOM}=\widehat\lambda_{MLE}=\bar X=\frac1n\sum_{i=1}^n X_i.$$
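The two derivations can be sketched in a few lines. The shorthand $S=\sum_{i=1}^n X_i$ and $\ell(\lambda)=\log L(\lambda)$ are introduced here only for the derivation. For MoM, set $E_\lambda[X]=\lambda$ equal to $\bar X$, which gives $\widehat\lambda_{MOM}=\bar X$ directly. For MLE:

$$L(\lambda)=\prod_{i=1}^n \frac{e^{-\lambda}\lambda^{X_i}}{X_i!},\qquad \ell(\lambda)=-n\lambda+S\log\lambda-\sum_{i=1}^n\log(X_i!),$$

$$\ell'(\lambda)=-n+\frac{S}{\lambda}=0\ \Longrightarrow\ \widehat\lambda_{MLE}=\frac{S}{n}=\bar X,\qquad \ell''(\lambda)=-\frac{S}{\lambda^2}<0\ \text{ when } S>0.$$

The negative second derivative confirms that the stationary point is a maximum. If $S=0$, the likelihood $e^{-n\lambda}$ is decreasing in $\lambda$, so it is maximized at the boundary $\lambda=0$, which again equals $\bar X$.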

Sample size $n$: 25; true $\lambda$: 4.00

Poisson likelihood curve

The curve is the relative likelihood $L(\lambda)/\max L(\lambda)$. The peak occurs at $\bar X$.
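A minimal sketch of how such a relative likelihood curve could be computed. The data values, the grid, and the function names are assumptions made for illustration and are not taken from the course materials; only the Python standard library is used.

```python
import math

def poisson_log_likelihood(lam, counts):
    """Log-likelihood of i.i.d. Poisson(lam) counts (requires lam > 0)."""
    n = len(counts)
    total = sum(counts)
    log_factorials = sum(math.lgamma(x + 1) for x in counts)
    return total * math.log(lam) - n * lam - log_factorials

def relative_likelihood(lam, counts):
    """L(lam) / max L, using the fact that the maximum is at the sample mean."""
    lam_hat = sum(counts) / len(counts)
    return math.exp(poisson_log_likelihood(lam, counts)
                    - poisson_log_likelihood(lam_hat, counts))

# Hypothetical counts, chosen only for illustration (their sample mean is 4.0).
counts = [3, 5, 4, 2, 6, 4, 3, 5]
x_bar = sum(counts) / len(counts)

# Evaluate the curve on a grid and locate its peak.
grid = [0.5 + 0.01 * k for k in range(951)]   # 0.50, 0.51, ..., 10.00
curve = [relative_likelihood(lam, counts) for lam in grid]
peak_lambda = grid[curve.index(max(curve))]

print(f"sample mean = {x_bar:.2f}, grid peak = {peak_lambda:.2f}")
```

Working on the log scale and exponentiating the difference avoids underflow when $n$ is large, which is one reason to plot the relative likelihood rather than $L(\lambda)$ itself.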

4. Exponential estimation

$Y_i\sim\mathrm{Exp}(\theta)$ with mean $\theta$

Using the scale/mean parameterization, $E[Y]=\theta$ and $\operatorname{Var}(Y)=\theta^2$. Both MoM and MLE give

$$\widehat\theta_{MOM}=\widehat\theta_{MLE}=\bar Y=\frac1n\sum_{i=1}^n Y_i.$$

Sample size $n$: 25; true mean $\theta$: 2.50

Exponential likelihood curve

For $L(\theta)=\theta^{-n}\exp(-\sum Y_i/\theta)$, the likelihood peaks at $\widehat\theta=\bar Y$.
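The location of the peak can be checked by differentiating the log-likelihood. This is a short sketch; the shorthand $S=\sum_{i=1}^n Y_i$ and $\ell(\theta)=\log L(\theta)$ are introduced here only for the derivation.

$$\ell(\theta)=-n\log\theta-\frac{S}{\theta},\qquad \ell'(\theta)=-\frac{n}{\theta}+\frac{S}{\theta^{2}}=0\ \Longrightarrow\ \widehat\theta=\frac{S}{n}=\bar Y,$$

$$\ell''(\widehat\theta)=\frac{n}{\widehat\theta^{2}}-\frac{2S}{\widehat\theta^{3}}=-\frac{n}{\bar Y^{2}}<0,$$

so the stationary point is a maximum.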

5. Quality of an estimator

Bias, variance, MSE, and consistency

An estimator is unbiased if $E[\widehat\theta]=\theta$ for every value of $\theta$. It is consistent if $\widehat\theta$ converges in probability to the true parameter as $n\to\infty$. Its MSE is the sum of its variance and its squared bias.

$$\operatorname{Bias}(\widehat\theta)=E[\widehat\theta]-\theta,\qquad \operatorname{MSE}(\widehat\theta)=\operatorname{Var}(\widehat\theta)+\operatorname{Bias}(\widehat\theta)^2.$$
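The decomposition follows from adding and subtracting $E[\widehat\theta]$ inside the square. The shorthand $\mu=E[\widehat\theta]$ is introduced here only for this step.

$$E[(\widehat\theta-\theta)^2]=E[(\widehat\theta-\mu)^2]+2(\mu-\theta)\,E[\widehat\theta-\mu]+(\mu-\theta)^2=\operatorname{Var}(\widehat\theta)+\operatorname{Bias}(\widehat\theta)^2,$$

where the cross term vanishes because $E[\widehat\theta-\mu]=0$.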

Distribution; parameter value: 4.00; sample size $n$: 25

Sampling distribution simulation

As $n$ grows, the estimator becomes more concentrated around the true parameter.
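The concentration can be illustrated with a small simulation. This is a sketch under stated assumptions: the Exponential model with mean $\theta=2.5$, the sample sizes, the number of replications, and the seed are all illustrative choices, and the function names are hypothetical.

```python
import random
import statistics

def simulate_sample_means(theta, n, reps, rng):
    """Draw `reps` samples of size n from Exp(mean=theta); return their sample means."""
    # random.expovariate takes the rate, which is 1 / mean.
    return [statistics.fmean(rng.expovariate(1.0 / theta) for _ in range(n))
            for _ in range(reps)]

def summarize(estimates, theta):
    """Empirical bias, variance and MSE of a list of estimates of theta."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - theta
    variance = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean((e - theta) ** 2 for e in estimates)
    return bias, variance, mse

rng = random.Random(5010)      # fixed seed so the run is reproducible
theta = 2.5                    # true mean (illustrative value)
results = {}
for n in (5, 25, 100):
    means = simulate_sample_means(theta, n, reps=4000, rng=rng)
    results[n] = summarize(means, theta)
    bias, variance, mse = results[n]
    print(f"n={n:3d}  bias={bias:+.4f}  var={variance:.4f}  "
          f"theory var={theta**2 / n:.4f}  mse={mse:.4f}")
```

The empirical variance should track $\theta^2/n$ and shrink as $n$ increases, while the empirical bias stays near zero. The empirical MSE equals the empirical variance plus squared bias up to rounding, because the variance here is computed with divisor equal to the number of replications.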

6. Bayesian point estimation

Posterior mean under squared error loss

In Bayesian estimation, the parameter is treated as a random quantity with a prior distribution. After the data are observed, Bayes' rule combines the prior and the likelihood into the posterior distribution, which is proportional to likelihood times prior.

Model	Prior	Posterior mean
$X_i\sim\mathrm{Poisson}(\lambda)$	$\lambda\sim\mathrm{Gamma}(\alpha,\beta)$, rate parameterization	$\dfrac{\alpha+\sum X_i}{\beta+n}$
$Y_i\sim\mathrm{Exp}(\theta)$, mean parameterization	$\theta\sim\mathrm{InvGamma}(\alpha,\beta)$	$\dfrac{\beta+\sum Y_i}{\alpha+n-1}$ (requires $\alpha+n>1$)

Bayes estimators are generally biased in the frequentist sense, but they can have smaller MSE than the unbiased sample mean when the prior places most of its mass near the true parameter.

Poisson Gamma-prior explorer

Prior shape $\alpha$: 2.0; prior rate $\beta$: 1.0; sample size $n$: 20; total count $S=\sum X_i$: 80
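A small sketch of the Poisson posterior mean for the settings $\alpha=2$, $\beta=1$, $n=20$, $S=80$. It also rewrites the estimator as a weighted average of the prior mean $\alpha/\beta$ and the sample mean $S/n$, which is an algebraic identity rather than a separate method. The function names are hypothetical, and the larger sample sizes in the loop are illustrative values that keep the sample mean fixed at 4.

```python
def poisson_gamma_posterior_mean(alpha, beta, n, total):
    """Posterior mean of lambda for Poisson data with a Gamma(alpha, rate=beta) prior."""
    return (alpha + total) / (beta + n)

def shrinkage_form(alpha, beta, n, total):
    """The same estimator written as a weighted average of prior mean and sample mean."""
    weight_prior = beta / (beta + n)
    prior_mean = alpha / beta
    sample_mean = total / n
    return weight_prior * prior_mean + (1 - weight_prior) * sample_mean

alpha, beta, n, total = 2.0, 1.0, 20, 80
post_mean = poisson_gamma_posterior_mean(alpha, beta, n, total)
print(f"prior mean = {alpha / beta:.3f}, sample mean = {total / n:.3f}, "
      f"posterior mean = {post_mean:.3f}")

# Keep the sample mean at 4 and let n grow: the prior's weight beta / (beta + n) shrinks.
limits = [poisson_gamma_posterior_mean(alpha, beta, m, 4 * m) for m in (20, 200, 2000)]
print([round(v, 4) for v in limits])
```

With these settings the posterior mean is $82/21\approx 3.905$, between the prior mean 2 and the sample mean 4 and much closer to the sample mean, since the prior carries weight $1/21$.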

7. Cramér–Rao lower bound

Best possible variance for unbiased estimators

For an unbiased estimator based on an i.i.d. sample, under regularity conditions, the variance cannot be smaller than the reciprocal of the Fisher information.

$$\operatorname{Var}(\widehat\theta)\ge \frac{1}{I_n(\theta)},\qquad I_n(\theta)=nI_1(\theta).$$

Model	$I_1$	CRLB	Estimator variance
Poisson$(\lambda)$	$1/\lambda$	$\lambda/n$	$\operatorname{Var}(\bar X)=\lambda/n$
Exponential mean $\theta$	$1/\theta^2$	$\theta^2/n$	$\operatorname{Var}(\bar Y)=\theta^2/n$

Efficiency checker

Model; parameter: 4.00; sample size $n$: 25

For these two models, the MLE/sample mean achieves the CRLB, so it is efficient among unbiased estimators.
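The efficiency claim can be checked numerically by computing $I_1$ from its definition as the expected squared score and comparing the resulting bound with the sample-mean variances in the table. This is a rough sketch: the truncation point of the Poisson sum, the midpoint-rule integration for the Exponential model, the parameter values, and the function names are all assumptions made for illustration.

```python
import math

def poisson_fisher_information(lam, x_max=200):
    """I_1(lambda) = E[(d/d lambda log f(X; lambda))^2], summed over x = 0..x_max."""
    info = 0.0
    for x in range(x_max + 1):
        log_pmf = -lam + x * math.log(lam) - math.lgamma(x + 1)
        score = x / lam - 1.0                   # derivative of the log pmf in lambda
        info += math.exp(log_pmf) * score ** 2
    return info

def exponential_fisher_information(theta, steps=20000, upper_multiple=50.0):
    """I_1(theta) for Exp(mean=theta), midpoint rule on [0, upper_multiple * theta]."""
    h = upper_multiple * theta / steps
    info = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        density = math.exp(-y / theta) / theta
        score = -1.0 / theta + y / theta ** 2   # derivative of the log density in theta
        info += density * score ** 2 * h
    return info

n = 25
lam, theta = 4.0, 2.5   # illustrative parameter values

crlb_poisson = 1.0 / (n * poisson_fisher_information(lam))
crlb_exponential = 1.0 / (n * exponential_fisher_information(theta))

# Efficiency = CRLB / Var(estimator), with Var(sample mean) taken from the table.
eff_poisson = crlb_poisson / (lam / n)
eff_exponential = crlb_exponential / (theta ** 2 / n)

print(f"Poisson:     CRLB = {crlb_poisson:.4f}, efficiency = {eff_poisson:.4f}")
print(f"Exponential: CRLB = {crlb_exponential:.4f}, efficiency = {eff_exponential:.4f}")
```

An efficiency ratio of 1, up to numerical error, means the estimator's variance equals the bound. The Exponential value is only approximate because the integral is truncated and discretized.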

8. Quick checks

Self-check questions

Q1

For $X_i\sim\mathrm{Poisson}(\lambda)$, what is the MLE of $\lambda$?

Q2

For $Y_i\sim\mathrm{Exp}(\theta)$ with mean $\theta$, is $\bar Y$ unbiased?

Q3

Under squared error loss, what is the Bayesian point estimator?