MATH 5010 · Section 20

Bayesian Inference: Prior, Posterior, Prediction, and Decision

An interactive teaching page connecting point estimation, interval estimation, and hypothesis testing through one Bayesian workflow. Use the sliders and calculators to see how data update prior beliefs.

1. The Bayesian update

Prior (what we believe before data) × Likelihood (what the data say) → Posterior (updated belief after data)

π(θ | x) = L(θ; x)π(θ) / ∫ L(u; x)π(u)du
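
The formula can be checked numerically on a grid, where the integral in the denominator becomes a sum. The following is a minimal sketch, assuming hypothetical data (7 successes in 10 Bernoulli trials) and a flat prior on (0, 1); neither comes from this page, and the function name `grid_posterior` is illustrative.

```python
import math

def grid_posterior(log_likelihood, prior_density, grid):
    """Approximate pi(theta | x) on an evenly spaced grid of theta values."""
    step = grid[1] - grid[0]
    # Numerator of Bayes' rule: L(theta; x) * pi(theta) at each grid point.
    unnorm = [math.exp(log_likelihood(t)) * prior_density(t) for t in grid]
    # Denominator: the integral of L(u; x) * pi(u) du, approximated by a sum.
    evidence = sum(unnorm) * step
    return [u / evidence for u in unnorm]

# Hypothetical data (an assumption for illustration): x successes in n trials.
n, x = 10, 7
grid = [(i + 0.5) / 1000 for i in range(1000)]  # midpoints of (0, 1)

post = grid_posterior(
    lambda t: x * math.log(t) + (n - x) * math.log(1 - t),  # Bernoulli log-likelihood
    lambda t: 1.0,                                           # flat prior on (0, 1)
    grid,
)

step = grid[1] - grid[0]
post_mean = sum(t * p for t, p in zip(grid, post)) * step
print(f"grid posterior mean: {post_mean:.4f}")
```

With a flat prior the exact posterior here is Beta(8, 4), so the printed value should be close to 8/12.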

Classical inference often treats the parameter as fixed and the data as random. Bayesian inference treats the parameter as unknown and represents uncertainty about it by a probability distribution.

Why conjugate priors?

A conjugate prior produces a posterior in the same family as the prior. This makes the update easy to compute and well suited to teaching: the posterior parameters are the prior parameters plus data summaries.

Learning goals

  • Translate prior + likelihood into posterior.
  • Compute Bayes estimators under squared error loss.
  • Interpret credible intervals.
  • Compare Bayesian tests with p-value tests.
  • Use posterior predictive simulation.
  • Understand why MCMC is needed when posteriors are not closed form.

Conjugate-pair summary

Data model               Prior              Posterior                Useful statistic   Bayes estimate under squared error
Binomial(n, θ)           Beta(a, b)         Beta(a+x, b+n−x)         x successes        E[θ|x]
Poisson(λ)               Gamma(a, b rate)   Gamma(a+Σxᵢ, b+n)        Σxᵢ                E[λ|x]
Exponential(mean μ)      Inv-Gamma(a, b)    Inv-Gamma(a+n, b+Σxᵢ)    Σxᵢ                E[μ|x], if a+n>1
Normal mean μ, known σ   Normal(m₀, s₀²)    Normal(mₙ, sₙ²)          x̄                  posterior mean mₙ

2. Beta–Binomial update

Use this for conversion rates, coin bias, success probability, or any binary outcome.

Calculator outputs: posterior, mean, MAP, approx. 95% interval.
θ | x ~ Beta(a+x, b+n−x)
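
The update and its summaries can be sketched in a few lines. The inputs below (a = 2, b = 2, n = 20, x = 14) are hypothetical, and the 95% interval uses a normal approximation, which is an assumption: the page does not say how its "Approx. 95%" value is computed.

```python
import math

def beta_binomial_update(a, b, n, x):
    """Posterior (A, B) for a Beta(a, b) prior after x successes in n trials."""
    return a + x, b + n - x

def beta_summary(A, B):
    """Posterior mean, MAP, and a normal-approximation 95% interval for Beta(A, B)."""
    mean = A / (A + B)
    # The interior mode formula only applies when A > 1 and B > 1.
    mode = (A - 1) / (A + B - 2) if A > 1 and B > 1 else None
    sd = math.sqrt(A * B / ((A + B) ** 2 * (A + B + 1)))
    # Assumed approximation: mean +/- 1.96 sd, clipped to [0, 1].
    interval = (max(0.0, mean - 1.96 * sd), min(1.0, mean + 1.96 * sd))
    return mean, mode, interval

# Hypothetical inputs, chosen only for illustration.
A, B = beta_binomial_update(a=2, b=2, n=20, x=14)
mean, mode, interval = beta_summary(A, B)
print(f"Beta({A}, {B}): mean={mean:.3f}, MAP={mode:.3f}, "
      f"approx 95%=({interval[0]:.3f}, {interval[1]:.3f})")
```

The normal approximation is rough when A or B is small; exact Beta quantiles would be the better choice there.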

3. Gamma–Poisson update

This is the setting of the Poisson Bayesian estimation and Bayesian test examples: if the counts are Poisson with mean λ and λ has a Gamma prior, the posterior is also Gamma.

Calculator outputs: posterior, posterior mean, P(λ ≤ λ₀ | x), decision.
λ | x ~ Gamma(a+S, b+n) in the rate parameterization, where S = Σxᵢ
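
Why the posterior stays in the Gamma family can be seen by multiplying the likelihood and prior and keeping only the factors that involve λ. A short worked derivation, writing S for Σxᵢ and using the rate parameterization:

```latex
\begin{aligned}
\pi(\lambda \mid x)
  &\propto \underbrace{\prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}}_{L(\lambda;\,x)}
     \;\underbrace{\frac{b^{a}}{\Gamma(a)}\,\lambda^{a-1}e^{-b\lambda}}_{\pi(\lambda)} \\
  &\propto \lambda^{S}e^{-n\lambda}\cdot\lambda^{a-1}e^{-b\lambda} \\
  &= \lambda^{(a+S)-1}\,e^{-(b+n)\lambda},
  \qquad S=\sum_{i=1}^{n}x_i , \\[4pt]
E[\lambda \mid x]
  &= \frac{a+S}{b+n}
   = \frac{b}{b+n}\cdot\frac{a}{b} \;+\; \frac{n}{b+n}\cdot\frac{S}{n}.
\end{aligned}
```

The last line shows the posterior mean as a weighted average of the prior mean a/b and the sample mean S/n, with the weight on the data growing with n.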

4. Inverse-Gamma prior for an Exponential mean

If Xᵢ ~ Exp(μ) with density f(x | μ) = μ⁻¹ e^(−x/μ), then an inverse-gamma prior for μ is conjugate.

Calculator outputs: posterior, posterior mean, P(μ ≤ μ₀ | x), decision.
μ | x ~ Inv-Gamma(a+n, b+S), where S = Σxᵢ
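
A small sketch of this update, assuming the shape/scale parameterization in which Inv-Gamma(a, b) has mean b/(a−1) for a > 1. The prior (a = 3, b = 10) and the five waiting times are hypothetical values chosen for illustration.

```python
def inv_gamma_update(a, b, xs):
    """Posterior (shape, scale) for an Inv-Gamma(a, b) prior on the Exponential mean."""
    n, S = len(xs), sum(xs)
    return a + n, b + S

def inv_gamma_mean(shape, scale):
    """Mean of Inv-Gamma(shape, scale); it exists only when shape > 1."""
    if shape <= 1:
        raise ValueError("the inverse-gamma mean requires shape > 1")
    return scale / (shape - 1)

# Hypothetical prior and data (assumptions, not values from the page).
a, b = 3, 10
xs = [3.0, 7.5, 1.2, 9.3, 6.0]

shape, scale = inv_gamma_update(a, b, xs)
post_mean = inv_gamma_mean(shape, scale)

prior_mean = inv_gamma_mean(a, b)
sample_mean = sum(xs) / len(xs)
print(f"prior mean={prior_mean:.3f}, sample mean={sample_mean:.3f}, "
      f"posterior mean={post_mean:.3f}")
```

The posterior mean (b+S)/(a+n−1) falls between the prior mean and the sample mean, which is a quick sanity check on any hand calculation.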

5. Normal–Normal update for a mean

Assume the sampling standard deviation σ is known. The posterior mean is then a precision-weighted average of the prior mean and the sample mean.

Calculator outputs: posterior mean, posterior sd, data weight, approx. 95% interval.
sₙ² = 1 / (1/s₀² + n/σ²), mₙ = sₙ²(m₀/s₀² + nx̄/σ²)
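
The two formulas translate directly into code. In this sketch the inputs are hypothetical, and "data weight" is taken to mean the share of total precision contributed by the data, (n/σ²)/(1/s₀² + n/σ²); that reading of the calculator field is an assumption.

```python
import math

def normal_normal_update(m0, s0, sigma, n, xbar):
    """Posterior mean, posterior sd, and data weight for a Normal mean with known sigma."""
    prior_precision = 1.0 / s0 ** 2
    data_precision = n / sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * m0 + data_precision * xbar)
    # Assumed definition: fraction of the total precision that comes from the data.
    data_weight = data_precision / (prior_precision + data_precision)
    return post_mean, math.sqrt(post_var), data_weight

# Hypothetical inputs, for illustration only.
m_n, s_n, w = normal_normal_update(m0=0.0, s0=2.0, sigma=4.0, n=16, xbar=1.5)
print(f"posterior mean={m_n:.3f}, posterior sd={s_n:.3f}, data weight={w:.2f}")
```

The posterior mean equals w·x̄ + (1−w)·m₀, so a large n or a diffuse prior (large s₀) pushes w toward 1.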

6. Credible interval interpretation

A 95% Bayesian credible interval means: after observing the data and using the prior, the posterior probability that θ lies in the interval is 0.95.

Classical confidence interval: the random procedure has long-run 95% coverage.
Bayesian credible interval: the posterior distribution assigns 95% probability to this interval.

Normal posterior interval calculator

Calculator outputs: lower, upper, width. Meaning: posterior prob.
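
An equal-tailed interval for a Normal posterior needs only two quantiles. A minimal sketch using the standard library, with a hypothetical posterior mean of 1.2 and posterior sd of 0.5:

```python
from statistics import NormalDist

def normal_credible_interval(post_mean, post_sd, level=0.95):
    """Equal-tailed credible interval (lower, upper, width) for a Normal posterior."""
    tail = (1.0 - level) / 2.0
    posterior = NormalDist(mu=post_mean, sigma=post_sd)
    lower = posterior.inv_cdf(tail)
    upper = posterior.inv_cdf(1.0 - tail)
    return lower, upper, upper - lower

# Hypothetical posterior summaries (assumptions for illustration).
lower, upper, width = normal_credible_interval(1.2, 0.5)

# Check the interpretation: posterior probability inside the interval.
posterior = NormalDist(1.2, 0.5)
prob_inside = posterior.cdf(upper) - posterior.cdf(lower)
print(f"[{lower:.3f}, {upper:.3f}], width={width:.3f}, P(inside)={prob_inside:.3f}")
```

The final check is the Bayesian reading of the interval: the posterior itself assigns the stated probability to it.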

Loss functions and Bayes estimators

Loss             Bayes estimator      Intuition
Squared error    posterior mean       balances squared distance
Absolute error   posterior median     balances posterior probability
0–1 loss         posterior mode/MAP   most likely parameter value

Teaching connection

Earlier in the course, the mean minimized expected squared error and the median minimized expected absolute error. Bayesian point estimation applies the same principle, but expectation is taken with respect to the posterior distribution.
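
This principle can be checked by brute force: take draws from a posterior, compute the average loss for many candidate estimates, and see which candidate wins. The sketch below assumes a hypothetical skewed posterior (Gamma with shape 3 and scale 1) so that the mean and median differ visibly.

```python
import random
import statistics

random.seed(0)
# Hypothetical skewed posterior. Note that random.gammavariate takes (shape, scale).
draws = [random.gammavariate(3.0, 1.0) for _ in range(1000)]

def expected_loss(estimate, loss):
    """Average loss of a point estimate over the posterior draws."""
    return sum(loss(estimate - t) for t in draws) / len(draws)

def squared(e):
    return e * e

candidates = [i / 100 for i in range(100, 501)]  # 1.00, 1.01, ..., 5.00

best_sq = min(candidates, key=lambda d: expected_loss(d, squared))
best_abs = min(candidates, key=lambda d: expected_loss(d, abs))

print(f"squared error:  best={best_sq:.2f}, sample mean={statistics.mean(draws):.3f}")
print(f"absolute error: best={best_abs:.2f}, sample median={statistics.median(draws):.3f}")
```

Up to the grid spacing of 0.01, the squared-error winner should coincide with the mean of the draws and the absolute-error winner with their median.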

7. Bayesian hypothesis testing

For one-sided tests such as H₀: θ ≤ θ₀ versus H₁: θ > θ₀, a direct Bayesian rule is based on posterior probability.

Reject H₀ if P(θ ≤ θ₀ | data) < α_B

Poisson mean example

Using the Gamma–Poisson calculator above with n=20, a=2, b=1, S=130, λ₀=5 gives the posterior Gamma(132, 21), since a+S = 132 and b+n = 21. The decision rule rejects H₀: λ ≤ 5 when P(λ ≤ 5 | data) < 0.05.
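
Because the posterior shape 132 is an integer, the Gamma CDF can be computed exactly from a Poisson sum (the Erlang identity P(G ≤ x) = P(N ≥ shape) with N ~ Poisson(rate·x)). The sketch below uses that identity with the numbers from this example; the helper names are illustrative, and the identity does not apply to non-integer shapes.

```python
import math

def poisson_cdf(k, mean):
    """P(N <= k) for N ~ Poisson(mean), with each term computed in log space."""
    if k < 0:
        return 0.0
    return sum(
        math.exp(-mean + j * math.log(mean) - math.lgamma(j + 1))
        for j in range(k + 1)
    )

def gamma_cdf_integer_shape(x, shape, rate):
    """P(G <= x) for G ~ Gamma(shape, rate) when shape is a positive integer."""
    return 1.0 - poisson_cdf(shape - 1, rate * x)

# Values from the example in the text.
n, a, b, S, lam0 = 20, 2, 1, 130, 5
shape, rate = a + S, b + n          # Gamma(132, 21)

p_h0 = gamma_cdf_integer_shape(lam0, shape, rate)
reject = p_h0 < 0.05
print(f"P(lambda <= {lam0} | data) = {p_h0:.4f}  ->  reject H0: {reject}")
```

The posterior mean 132/21 ≈ 6.29 sits more than two posterior standard deviations above 5, so the posterior probability of H₀ is small and the rule rejects.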

Exponential mean example

Using the inverse-gamma calculator above with n=15, a=3, b=10, S=160, μ₀=8 gives the posterior Inv-Gamma(18, 170), since a+n = 18 and b+S = 170. The decision rule rejects H₀: μ ≤ 8 when P(μ ≤ 8 | data) < 0.05.
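
One way to approximate this probability without special functions is simulation: if G ~ Gamma(shape, rate = scale), then 1/G ~ Inv-Gamma(shape, scale). The sketch below applies that fact to the numbers in this example; the seed and the number of draws are arbitrary choices.

```python
import random

random.seed(1)

# Values from the example in the text.
n, a, b, S, mu0 = 15, 3, 10, 160, 8
shape, scale = a + n, b + S          # Inv-Gamma(18, 170)

# random.gammavariate(alpha, beta) treats beta as a SCALE, so a Gamma with
# rate `scale` is obtained by passing 1/scale. Inverting gives inverse-gamma draws.
draws = [1.0 / random.gammavariate(shape, 1.0 / scale) for _ in range(20000)]

p_h0 = sum(mu <= mu0 for mu in draws) / len(draws)
post_mean_mc = sum(draws) / len(draws)
reject = p_h0 < 0.05

print(f"Monte Carlo posterior mean = {post_mean_mc:.2f} (exact: {scale / (shape - 1):.2f})")
print(f"P(mu <= {mu0} | data) ~ {p_h0:.3f}  ->  reject H0: {reject}")
```

Here the estimated posterior probability of H₀ is far above 0.05, so the rule does not reject H₀: μ ≤ 8, even though the posterior mean 170/17 = 10 is above 8.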

8. Posterior predictive distribution

After learning about θ, Bayesian inference can predict a future observation by averaging over posterior uncertainty.

p(x_new | data) = ∫ p(x_new | θ)π(θ | data)dθ
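
The integral has a direct simulation reading: draw θ from the posterior, then draw x_new given that θ. A minimal sketch for the Normal–Normal case, where the exact predictive variance sₙ² + σ² is available as a check. The posterior summaries and σ below are hypothetical.

```python
import random
import statistics

random.seed(2)

# Hypothetical posterior mean/sd and known sampling sd (assumptions for illustration).
m_n, s_n, sigma = 1.2, 0.5, 4.0

def draw_predictive():
    """One posterior predictive draw via the two-step scheme."""
    theta = random.gauss(m_n, s_n)        # step 1: theta ~ pi(theta | data)
    return random.gauss(theta, sigma)     # step 2: x_new ~ p(x_new | theta)

sims = [draw_predictive() for _ in range(50000)]

sim_mean = statistics.fmean(sims)
sim_var = statistics.variance(sims)
exact_var = s_n ** 2 + sigma ** 2         # parameter uncertainty + sampling noise

print(f"simulated mean={sim_mean:.3f} (exact {m_n})")
print(f"simulated var ={sim_var:.3f} (exact {exact_var})")
```

The predictive variance exceeds σ² because it carries the remaining uncertainty about θ in addition to sampling noise.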

Beta–Binomial predictive probability

For m future binary trials with posterior θ | data ~ Beta(A, B), the predictive mean number of successes is mA/(A+B).

Calculator outputs: predictive mean, approx. P(X ≥ r), posterior A, posterior B.
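
The predictive distribution of the number of future successes is Beta–Binomial, and its probabilities can be computed exactly with log-gamma functions. The sketch below assumes hypothetical values A = 16, B = 8, m = 10, r = 7. It computes the tail probability exactly, which may differ from whatever approximation the page's calculator uses.

```python
import math

def log_beta(p, q):
    """Logarithm of the Beta function B(p, q)."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

def beta_binomial_pmf(k, m, A, B):
    """P(X = k) for X successes in m future trials when theta ~ Beta(A, B)."""
    return math.comb(m, k) * math.exp(log_beta(A + k, B + m - k) - log_beta(A, B))

def predictive_tail(r, m, A, B):
    """P(X >= r) under the Beta-Binomial predictive distribution."""
    return sum(beta_binomial_pmf(k, m, A, B) for k in range(r, m + 1))

# Hypothetical posterior and future-trial settings (assumptions for illustration).
A, B, m, r = 16, 8, 10, 7

pmf = [beta_binomial_pmf(k, m, A, B) for k in range(m + 1)]
pred_mean = sum(k * p for k, p in enumerate(pmf))
tail = predictive_tail(r, m, A, B)

print(f"predictive mean = {pred_mean:.3f} (formula m*A/(A+B) = {m * A / (A + B):.3f})")
print(f"P(X >= {r}) = {tail:.3f}")
```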

9. Why MCMC?

When the posterior cannot be written in a convenient closed form, we often approximate it by simulation. Markov chain Monte Carlo builds a dependent sequence whose long-run distribution is the posterior.

θ⁽¹⁾, θ⁽²⁾, … approximately sampled from π(θ | data)

Random-walk Metropolis toy sampler

Target density: a two-component posterior-like mixture. Adjust the proposal scale to see how the acceptance rate and mixing change.

Sampler outputs: acceptance rate, sample mean. Target idea: posterior sample. Warning: check mixing.
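
A random-walk Metropolis sampler fits in a few lines. The sketch below is not the page's own sampler: the target is an assumed equal-weight mixture of Normal(−2, 1) and Normal(2, 1), and the step counts, seed, and proposal scales are arbitrary choices for illustration.

```python
import math
import random

def log_target(theta):
    """Log density, up to a constant, of an assumed two-component normal mixture."""
    l1 = -0.5 * (theta + 2.0) ** 2
    l2 = -0.5 * (theta - 2.0) ** 2
    top = max(l1, l2)
    # Log-sum-exp keeps the value finite even far out in the tails.
    return top + math.log(math.exp(l1 - top) + math.exp(l2 - top))

def random_walk_metropolis(n_steps, proposal_sd, start=0.0, seed=0):
    """Return (chain, acceptance rate) for a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    theta, logp = start, log_target(start)
    chain, accepted = [], 0
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, proposal_sd)
        logp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(theta)).
        if rng.random() < math.exp(min(0.0, logp_prop - logp)):
            theta, logp = proposal, logp_prop
            accepted += 1
        chain.append(theta)  # a rejected move repeats the current value
    return chain, accepted / n_steps

for scale in (0.1, 2.5, 50.0):
    chain, rate = random_walk_metropolis(20000, scale)
    print(f"scale={scale:5.1f}  acceptance={rate:.2f}  mean={sum(chain) / len(chain):+.2f}")
```

Very small scales accept almost every move but crawl between the two modes, while very large scales propose far into the tails and are mostly rejected. A high acceptance rate alone therefore does not indicate good mixing.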

10. Quick self-checks

Q1

For Poisson data with Gamma(a,b) prior, what statistic is sufficient for updating λ?

Q2

Under squared error loss, the Bayes estimator is the posterior...

Q3

A 95% credible interval means...