MATH 5010 Section 10 — Monte Carlo Simulation and Importance Sampling

1. Why Monte Carlo?

Turn integration into averaging

Monte Carlo methods replace a difficult deterministic calculation with a random experiment. The key principle is simple: if $U_1,\dots,U_n$ are independent copies of $U\sim$ Uniform$(0,1)$, then

$$\int_0^1 g(x)\,dx=E[g(U)]\approx \frac1n\sum_{i=1}^n g(U_i).$$

Teaching message: Monte Carlo accuracy is governed by variance. The error shrinks only like $1/\sqrt n$ as samples are added, so reducing the variance through smarter sampling can help much more.

2. Monte Carlo integration

Approximate an integral on $[0,1]$

Number of random points: 1000

Estimator and standard error

$$\hat I_n=\frac1n\sum_{i=1}^n g(U_i),\qquad SE(\hat I_n)\approx \frac{s_g}{\sqrt n}.$$

Here $s_g$ is the sample standard deviation of $g(U_1),\dots,g(U_n)$.
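The estimator and its standard error can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the interactive plot: the integrand $g(x)=x^2$ (exact integral $1/3$), the function name `mc_integrate`, and the fixed seed are all assumptions made for the example.

```python
import math
import random

def mc_integrate(g, n, seed=0):
    """Estimate the integral of g over [0, 1] and its standard error."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    values = [g(rng.random()) for _ in range(n)]
    mean = sum(values) / n              # the estimator I_hat_n
    # Sample variance with the n - 1 denominator; its square root is s_g.
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)     # (I_hat_n, s_g / sqrt(n))

# Assumed example integrand: g(x) = x^2, whose exact integral is 1/3.
estimate, se = mc_integrate(lambda x: x * x, 1000)
print(f"estimate = {estimate:.4f}, standard error = {se:.4f}")
```

Reporting the standard error alongside the estimate makes it possible to judge whether the distance to a reference value is plausible sampling noise.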

Running estimate

The green horizontal line is a high-accuracy reference value computed on a deterministic grid.

3. Geometric Monte Carlo

Estimate $\pi$ from random points

Throw points $(X_i,Y_i)$ uniformly in the unit square $[0,1]^2$. The quarter disk $x^2+y^2\le1$ has area $\pi/4$, so the fraction of points that land inside it estimates $\pi/4$.

Number of points: 1500

$$\hat \pi=4\cdot\frac{\#\{(X_i,Y_i):X_i^2+Y_i^2\le1\}}{n}.$$
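A short sketch of this hit-or-miss estimator, using only the Python standard library. The function name `estimate_pi` and the seed are illustrative choices, not part of the original notes.

```python
import random

def estimate_pi(n, seed=0):
    """Hit-or-miss estimate of pi from n uniform points in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:        # point falls inside the quarter disk
            inside += 1
    return 4.0 * inside / n             # fraction inside estimates pi / 4

pi_hat = estimate_pi(1500)
print(f"pi_hat = {pi_hat:.4f}")
```

Each point contributes a Bernoulli indicator with success probability $\pi/4$, so the standard error of $\hat\pi$ is $4\sqrt{(\pi/4)(1-\pi/4)/n}$, roughly $0.04$ for $n=1500$.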

Unit square experiment

4. Estimating probabilities and expectations

Tail probability by simulation

Estimate $P(Z>a)$ for $Z\sim N(0,1)$. Direct simulation works well for moderate thresholds but becomes inefficient far in the tail, where the event is rare.

Threshold $a$: 2.00; Monte Carlo samples: 10000

For rare events, almost all of the indicators $1_{Z_i>a}$ are zero, so the estimate rests on a handful of hits or on none at all. Importance sampling addresses this by sampling more often from the important region and reweighting.
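The direct indicator estimator might look like the sketch below. The names `tail_prob_direct` and `normal_tail` are hypothetical helpers introduced for this example; the exact tail is computed from `math.erfc` purely for comparison.

```python
import math
import random

def tail_prob_direct(a, n, seed=0):
    """Direct Monte Carlo estimate of P(Z > a) and its standard error."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > a)
    p_hat = hits / n
    # Binomial standard error; it collapses to 0 when there are no hits.
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, se

def normal_tail(a):
    """Exact P(Z > a) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

p_hat, se = tail_prob_direct(2.0, 10000)
exact = normal_tail(2.0)
print(f"p_hat = {p_hat:.5f}, se = {se:.5f}, exact = {exact:.5f}")
```

At $a=4$ the exact probability is about $3.2\times10^{-5}$, so $10000$ draws contain on average about $0.3$ hits, and most runs of this estimator return exactly zero.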

Indicator estimates

Each plotted point is a block estimate using a subset of the simulation.

5. Inverse transform sampling

Generate exponential samples from uniforms

If $U\sim\mathrm{Unif}(0,1)$, then $X=-\theta\log(1-U)$ has an Exponential distribution with mean $\theta$.

Mean $\theta$: 1.50; Number of samples: 3000

$$F_X(x)=1-e^{-x/\theta},\qquad X=F_X^{-1}(U)=-\theta\log(1-U).$$
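The inverse transform above translates directly into code. The following is a minimal sketch; the function name `sample_exponential` and the seed are assumptions made for illustration.

```python
import math
import random

def sample_exponential(theta, n, seed=0):
    """Draw n Exponential(mean theta) samples by inverting the CDF."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is finite.
    return [-theta * math.log(1.0 - rng.random()) for _ in range(n)]

samples = sample_exponential(1.5, 3000)
sample_mean = sum(samples) / len(samples)
print(f"sample mean = {sample_mean:.3f} (target 1.5)")
```

Since $1-U$ is itself Uniform$(0,1)$, the form $-\theta\log U$ has the same distribution; the version with $1-U$ is used here so that the logarithm never receives zero.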

Histogram and density

6. Importance sampling

Rare normal tail $P(Z>a)$

Instead of sampling $Z\sim N(0,1)$ directly, sample $Y\sim N(a,1)$ and reweight.

Threshold $a$: 4.00; Samples: 10000

Why the weight appears

$$p=P(Z>a)=\int_a^\infty f(x)\,dx=\int_a^\infty \frac{f(x)}{q(x)}q(x)\,dx=E_q\left[1_{Y>a}\frac{f(Y)}{q(Y)}\right].$$

Here $f$ is the $N(0,1)$ density and $q$ is the $N(a,1)$ proposal density.
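For these two normal densities the weight simplifies to $f(y)/q(y)=\exp(-ay+a^2/2)$, because the quadratic terms in the exponents cancel. The sketch below uses that closed form; the function name `tail_prob_importance` and the seed are illustrative assumptions, and the exact value from `math.erfc` is included only as a check.

```python
import math
import random

def tail_prob_importance(a, n, seed=0):
    """Importance-sampling estimate of P(Z > a) with proposal N(a, 1)."""
    rng = random.Random(seed)
    terms = []
    for _ in range(n):
        y = rng.gauss(a, 1.0)                       # draw from the proposal q
        # Indicator times likelihood ratio f(y)/q(y) = exp(-a*y + a^2/2).
        terms.append(math.exp(-a * y + 0.5 * a * a) if y > a else 0.0)
    p_hat = sum(terms) / n
    var = sum((t - p_hat) ** 2 for t in terms) / (n - 1)
    return p_hat, math.sqrt(var / n)                # estimate and standard error

p_is, se_is = tail_prob_importance(4.0, 10000)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
print(f"p_is = {p_is:.3e}, se = {se_is:.1e}, exact = {exact:.3e}")
```

About half of the proposal draws land above $a$, so the estimate is built from thousands of nonzero terms rather than a rare hit, which is why its relative error is far smaller than that of direct sampling at the same $n$.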

Estimator comparison

Lower variability means a more stable estimate for the same sample size.

7. Quick checks

Self-check questions

Q1

Why does Monte Carlo integration work?

Q2

What happens to Monte Carlo standard error when $n$ is multiplied by 100?

Q3

What is the purpose of importance sampling?