MATH 5010 · Sections 12–13

Point Estimation

Estimate unknown parameters from data. Compare method-of-moments, maximum-likelihood, and Bayesian posterior-mean estimators, and judge them by bias, variance, MSE, efficiency, and the Cramér–Rao lower bound.

1. Big picture

From sample to parameter estimate

An estimator is a statistic used to estimate an unknown parameter. In Sections 12–13, the central examples are the Poisson and Exponential models, where the sample mean appears repeatedly: as the MoM estimator, as the MLE, and often as a limit of Bayes estimators.

$$\widehat\theta = T(X_1,\dots,X_n),\qquad \text{MSE}(\widehat\theta)=E[(\widehat\theta-\theta)^2]=\operatorname{Var}(\widehat\theta)+\operatorname{Bias}(\widehat\theta)^2.$$
Key topics: Method of Moments, Maximum Likelihood, Bayes estimator, Unbiasedness, Consistency, Efficiency, Cramér–Rao lower bound

2. Two classical estimation methods

MoM and MLE

| Method | Main idea | Typical equation |
|---|---|---|
| Method of Moments | Match population moments to sample moments | $E_\theta[X]=\bar X$ |
| Maximum Likelihood | Choose the parameter that makes the observed data most likely | $\frac{d}{d\theta}\ell(\theta)=0$ |
| Bayes estimator | Combine prior and likelihood, then estimate from the posterior | $E[\theta\mid X]$ under squared error loss |

Teaching message: for many one-parameter exponential-family models, MoM and MLE are the same. But their derivations teach different statistical languages.

Why the sample mean appears so often

The sample mean is the empirical version of the first moment. If the parameter is itself the mean, MoM immediately gives the sample mean. MLE often gives the same estimator because the likelihood depends on the data through $\sum X_i$.

3. Poisson estimation

$X_i\sim\mathrm{Poisson}(\lambda)$

For Poisson data, $E[X]=\lambda$ and $\operatorname{Var}(X)=\lambda$. Both method of moments and maximum likelihood give

$$\widehat\lambda_{MOM}=\widehat\lambda_{MLE}=\bar X=\frac1n\sum_{i=1}^n X_i.$$
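
A sketch of the likelihood derivation behind this formula, assuming an i.i.d. sample with at least one nonzero observation (so that the maximizer lies in the interior of the parameter space):

$$\begin{aligned}
L(\lambda) &= \prod_{i=1}^n \frac{e^{-\lambda}\lambda^{X_i}}{X_i!},\\
\ell(\lambda) &= \log L(\lambda) = -n\lambda + \Big(\sum_{i=1}^n X_i\Big)\log\lambda - \sum_{i=1}^n \log(X_i!),\\
\frac{d}{d\lambda}\ell(\lambda) &= -n + \frac{1}{\lambda}\sum_{i=1}^n X_i = 0 \;\Longrightarrow\; \widehat\lambda_{MLE}=\bar X,\\
\frac{d^2}{d\lambda^2}\ell(\lambda) &= -\frac{1}{\lambda^2}\sum_{i=1}^n X_i < 0.
\end{aligned}$$

The negative second derivative confirms a maximum. The method-of-moments equation $E_\lambda[X]=\lambda=\bar X$ gives the same estimator in one step.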

Poisson likelihood curve

The curve is the relative likelihood $L(\lambda)/\max L(\lambda)$. The peak occurs at $\bar X$.
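
The relative likelihood can be evaluated on a grid to see where it peaks. The following is a minimal sketch, not the page's own plotting code: the data values, the grid, and the function names `poisson_loglik` and `relative_likelihood` are all illustrative assumptions.

```python
import math

def poisson_loglik(lam, data):
    """Poisson log-likelihood up to the additive constant -sum(log(x_i!))."""
    return -len(data) * lam + sum(data) * math.log(lam)

def relative_likelihood(lam, data):
    """L(lam) / max L, computed on the log scale; the factorial constant cancels."""
    xbar = sum(data) / len(data)
    return math.exp(poisson_loglik(lam, data) - poisson_loglik(xbar, data))

data = [2, 4, 3, 0, 5, 3, 1, 2]             # hypothetical counts
xbar = sum(data) / len(data)                # sample mean, 20 / 8 = 2.5
grid = [0.05 * k for k in range(1, 201)]    # lambda values from 0.05 to 10.0
curve = [relative_likelihood(lam, data) for lam in grid]
peak = grid[curve.index(max(curve))]        # grid point with the largest value

print(f"sample mean = {xbar}, grid maximizer = {peak:.2f}")
```

Working on the log scale avoids underflow when $n$ is large, and the ratio form means the $\prod X_i!$ factor never has to be computed.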

4. Exponential estimation

$Y_i\sim\mathrm{Exp}(\theta)$ with mean $\theta$

Using the scale/mean parameterization, $E[Y]=\theta$ and $\operatorname{Var}(Y)=\theta^2$. Both MoM and MLE give

$$\widehat\theta_{MOM}=\widehat\theta_{MLE}=\bar Y=\frac1n\sum_{i=1}^n Y_i.$$

Exponential likelihood curve

For $L(\theta)=\theta^{-n}\exp\left(-\sum_{i=1}^n Y_i/\theta\right)$, the likelihood peaks at $\widehat\theta=\bar Y$.
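
One way to check the peak numerically is to confirm that the score (the derivative of the log-likelihood) vanishes at $\bar Y$ and that the log-likelihood is lower on either side. This is a sketch under the mean parameterization used above; the data and the names `exp_loglik` and `score` are hypothetical.

```python
import math

def exp_loglik(theta, data):
    """Log of L(theta) = theta^(-n) * exp(-sum(y) / theta), with theta the mean."""
    return -len(data) * math.log(theta) - sum(data) / theta

def score(theta, data):
    """Derivative of the log-likelihood with respect to theta."""
    return -len(data) / theta + sum(data) / theta ** 2

waits = [0.8, 2.1, 0.3, 1.7, 4.2, 0.9]    # hypothetical waiting times
ybar = sum(waits) / len(waits)             # candidate maximizer

print(f"ybar = {ybar:.4f}, score at ybar = {score(ybar, waits):.2e}")
print(exp_loglik(0.9 * ybar, waits) < exp_loglik(ybar, waits))   # lower to the left
print(exp_loglik(1.1 * ybar, waits) < exp_loglik(ybar, waits))   # lower to the right
```

Setting the score to zero gives $-n/\theta+\sum Y_i/\theta^2=0$, that is, $\theta=\bar Y$, which is what the numerical check reflects.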

5. Quality of an estimator

Bias, variance, MSE, and consistency

An estimator is unbiased if $E[\widehat\theta]=\theta$ for every value of $\theta$. It is consistent if it converges in probability to the true parameter as $n$ grows. Its MSE is the sum of its variance and its squared bias.

$$\operatorname{Bias}(\widehat\theta)=E[\widehat\theta]-\theta,\qquad \operatorname{MSE}=\operatorname{Var}(\widehat\theta)+\operatorname{Bias}^2.$$
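
The decomposition follows by adding and subtracting the estimator's mean. A short derivation, writing $m=E[\widehat\theta]$ (a symbol introduced here only for brevity):

$$\begin{aligned}
E[(\widehat\theta-\theta)^2] &= E\big[\big((\widehat\theta-m)+(m-\theta)\big)^2\big]\\
&= E[(\widehat\theta-m)^2] + 2(m-\theta)\,E[\widehat\theta-m] + (m-\theta)^2\\
&= \operatorname{Var}(\widehat\theta) + 0 + \operatorname{Bias}(\widehat\theta)^2.
\end{aligned}$$

The cross term vanishes because $m-\theta$ is a constant and $E[\widehat\theta-m]=0$.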

Sampling distribution simulation

As $n$ grows, the sampling distribution of the estimator becomes more concentrated around the true parameter.
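
A small Monte Carlo experiment can illustrate this concentration. The sketch below is not the page's simulation: it assumes Exponential data with mean $\theta=2$, a fixed seed, and hypothetical helper names, and it reports the empirical bias, variance, and MSE of $\bar Y$ for three sample sizes.

```python
import random

def simulate_sample_means(theta, n, reps, rng):
    """Return `reps` sample means, each from n draws of Exp(mean=theta)."""
    # random.expovariate takes the rate, which is 1/theta in the mean parameterization.
    return [sum(rng.expovariate(1.0 / theta) for _ in range(n)) / n
            for _ in range(reps)]

def summarize(estimates, theta):
    """Empirical bias, variance, and MSE of a list of estimates."""
    k = len(estimates)
    mean_est = sum(estimates) / k
    bias = mean_est - theta
    var = sum((e - mean_est) ** 2 for e in estimates) / k
    mse = sum((e - theta) ** 2 for e in estimates) / k
    return bias, var, mse

rng = random.Random(5010)   # fixed seed so the run is reproducible
theta = 2.0                 # assumed true mean
results = {}
for n in (5, 50, 500):
    estimates = simulate_sample_means(theta, n, reps=1000, rng=rng)
    results[n] = summarize(estimates, theta)
    bias, var, mse = results[n]
    print(f"n={n:3d}  bias={bias:+.4f}  var={var:.4f}  mse={mse:.4f}  "
          f"theta^2/n={theta ** 2 / n:.4f}")
```

With the divisor `k` used for both the variance and the MSE, the empirical values satisfy MSE = Var + Bias² exactly (up to rounding), mirroring the population identity.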

6. Bayesian point estimation

Posterior mean under squared error loss

In Bayesian estimation, parameters are treated as random quantities with a prior distribution. After observing data, the posterior distribution combines the prior with the likelihood.

| Model | Prior | Posterior mean |
|---|---|---|
| $X_i\sim\mathrm{Poisson}(\lambda)$ | $\lambda\sim\mathrm{Gamma}(\alpha,\beta)$, rate form | $\dfrac{\alpha+\sum X_i}{\beta+n}$ |
| $Y_i\sim\mathrm{Exp}(\theta)$, mean parameter | $\theta\sim\mathrm{InvGamma}(\alpha,\beta)$ | $\dfrac{\beta+\sum Y_i}{\alpha+n-1}$ (for $\alpha+n>1$) |

Bayesian estimators may be biased in the frequentist sense, but they can have smaller MSE when the prior information is useful.
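
For the Poisson row, the posterior mean can be rewritten as a weighted average of the sample mean and the prior mean $\alpha/\beta$, with weight $n/(\beta+n)$ on the data. The sketch below checks that identity numerically; the data, the prior values, and the function names are illustrative assumptions.

```python
def poisson_gamma_posterior_mean(data, alpha, beta):
    """Posterior mean of lambda: Poisson data, Gamma(alpha, beta) prior in rate form."""
    return (alpha + sum(data)) / (beta + len(data))

def shrinkage_weight(n, beta):
    """Weight placed on the sample mean in the weighted-average form."""
    return n / (beta + n)

data = [2, 4, 3, 0, 5, 3, 1, 2]    # hypothetical counts with sample mean 2.5
alpha, beta = 4.0, 2.0              # hypothetical prior with mean alpha / beta = 2.0
n = len(data)
xbar = sum(data) / n

post_mean = poisson_gamma_posterior_mean(data, alpha, beta)   # (4 + 20) / (2 + 8)
w = shrinkage_weight(n, beta)                                  # 8 / (2 + 8)
weighted = w * xbar + (1 - w) * (alpha / beta)

print(f"posterior mean = {post_mean:.3f}, weighted average = {weighted:.3f}")
```

Because the weight on $\bar X$ tends to 1 as $n$ grows, the posterior mean is pulled toward the prior mean in small samples and approaches $\bar X$ in large ones.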

7. Cramér–Rao lower bound

Best possible variance for unbiased estimators

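For unbiased estimators, under regularity conditions, the variance cannot be smaller than the reciprocal of the Fisher information. For an i.i.d. sample of size $n$, the total information is $n$ times the information in one observation.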

$$\operatorname{Var}(\widehat\theta)\ge \frac{1}{I_n(\theta)},\qquad I_n(\theta)=nI_1(\theta).$$
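
As a worked instance, the per-observation information for the two models in these sections can be computed from the second derivative of the log-density. This sketch assumes the usual regularity conditions, under which $I_1(\theta)=-E\left[\frac{\partial^2}{\partial\theta^2}\log f(X;\theta)\right]$:

$$\begin{aligned}
\text{Poisson: }& \log f(x;\lambda) = -\lambda + x\log\lambda - \log(x!), \quad
\frac{\partial^2}{\partial\lambda^2}\log f = -\frac{x}{\lambda^2}, \quad
I_1(\lambda) = \frac{E[X]}{\lambda^2} = \frac{1}{\lambda},\\
\text{Exponential: }& \log f(y;\theta) = -\log\theta - \frac{y}{\theta}, \quad
\frac{\partial^2}{\partial\theta^2}\log f = \frac{1}{\theta^2} - \frac{2y}{\theta^3}, \quad
I_1(\theta) = -\frac{1}{\theta^2} + \frac{2E[Y]}{\theta^3} = \frac{1}{\theta^2}.
\end{aligned}$$

Taking reciprocals of $nI_1$ gives the bounds $\lambda/n$ and $\theta^2/n$.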

| Model | $I_1$ | CRLB | Estimator variance |
|---|---|---|---|
| Poisson$(\lambda)$ | $1/\lambda$ | $\lambda/n$ | $\operatorname{Var}(\bar X)=\lambda/n$ |
| Exponential, mean $\theta$ | $1/\theta^2$ | $\theta^2/n$ | $\operatorname{Var}(\bar Y)=\theta^2/n$ |

For these two models, the MLE/sample mean achieves the CRLB, so it is efficient among unbiased estimators.
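
Efficiency can be illustrated by simulation in the Poisson case. The sketch below compares two unbiased estimators of $\lambda$: the sample mean, and the sample variance $S^2$ (unbiased for $\lambda$ because $\operatorname{Var}(X)=\lambda$). The choice of $S^2$ as a competitor, the parameter values, the seed, and the simple Poisson sampler are all assumptions made for illustration.

```python
import math
import random

def draw_poisson(lam, rng):
    """One Poisson(lam) draw via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_variance(xs):
    """Unbiased sample variance with divisor n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def empirical_variance(values):
    """Variance of a list of simulated estimates."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(1213)          # fixed seed for reproducibility
lam, n, reps = 3.0, 20, 4000       # hypothetical true rate, sample size, replications
means, s2s = [], []
for _ in range(reps):
    xs = [draw_poisson(lam, rng) for _ in range(n)]
    means.append(sum(xs) / n)
    s2s.append(sample_variance(xs))

crlb = lam / n                                  # Cramér–Rao lower bound
eff_mean = crlb / empirical_variance(means)     # should be close to 1
eff_s2 = crlb / empirical_variance(s2s)         # should be well below 1

print(f"CRLB = {crlb:.4f}")
print(f"efficiency of sample mean     = {eff_mean:.3f}")
print(f"efficiency of sample variance = {eff_s2:.3f}")
```

Efficiency here is the ratio CRLB / Var: a value near 1 means the estimator essentially attains the bound, while a value well below 1 means an unbiased estimator with needlessly large variance.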

8. Quick checks

Self-check questions

Q1

For $X_i\sim\mathrm{Poisson}(\lambda)$, what is the MLE of $\lambda$?

Q2

For $Y_i\sim\mathrm{Exp}(\theta)$ with mean $\theta$, is $\bar Y$ unbiased?

Q3

Under squared error loss, what is the Bayesian point estimator?