The central object: $X_1,\ldots,X_n$
A random sample from a distribution $F$ is a collection $X_1,\ldots,X_n$ that satisfies two conditions:
Identically distributed
Each $X_i$ has the same CDF $F$, and therefore the same mean and variance whenever those moments exist.
Independent
Knowing the values of some observations does not change the distribution of the others. Equivalently, the joint CDF factors as $F(x_1)\cdots F(x_n)$.
Statistics
A statistic is any function of the sample, such as $\bar X$, $S^2$, $X_{(1)}$, or $X_{(n)}$.
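As a minimal sketch, the four statistics named above can be computed from one simulated sample. The Uniform(0,1) parent, the seed, and the sample size are assumptions chosen only for illustration.

```python
import random
import statistics

rng = random.Random(0)  # seed 0 is an arbitrary choice, used only for reproducibility
n = 10                  # sample size, also arbitrary
sample = [rng.random() for _ in range(n)]  # X_1, ..., X_n drawn from Uniform(0,1)

x_bar = statistics.mean(sample)     # sample mean
s2 = statistics.variance(sample)    # sample variance S^2 (divides by n - 1)
x_min = min(sample)                 # smallest order statistic X_(1)
x_max = max(sample)                 # largest order statistic X_(n)

print(f"mean={x_bar:.4f}  S^2={s2:.4f}  min={x_min:.4f}  max={x_max:.4f}")
```

Each quantity depends only on the observed values, not on unknown parameters of $F$, which is what makes it a statistic.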
Generate one sample
Sample histogram
The histogram shows a single realized sample. Repeated samples fluctuate, but their long-run behavior is governed by the parent distribution.
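One way to see this fluctuation without a plotting library is a text histogram of two independent samples. The helper `bin_counts`, the Uniform(0,1) parent, the seed, and the bin count are all hypothetical choices for this sketch.

```python
import random

def bin_counts(values, n_bins=5):
    """Count how many values in [0, 1) fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int(v * n_bins), n_bins - 1)  # the clamp guards the right edge
        counts[idx] += 1
    return counts

rng = random.Random(1)  # arbitrary seed for reproducibility
n = 50                  # arbitrary sample size

for label in ("sample A", "sample B"):
    sample = [rng.random() for _ in range(n)]  # a fresh realized sample
    counts = bin_counts(sample)
    print(label)
    for i, c in enumerate(counts):
        lo, hi = i / len(counts), (i + 1) / len(counts)
        print(f"  [{lo:.1f}, {hi:.1f}) {'#' * c}")
```

The two bar patterns differ, yet both scatter around the flat shape of the Uniform(0,1) density.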
Many sample means
Generate many independent samples of size $n$. Each sample gives one value of $\bar X$. The histogram of these values approximates the sampling distribution of $\bar X$.
Histogram of $\bar X$
As $n$ grows, the distribution of $\bar X$ becomes more concentrated around $\mu$, because $\operatorname{Var}(\bar X)=\sigma^2/n$ shrinks toward zero.
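The concentration of $\bar X$ around $\mu$ can be checked numerically. The sketch below assumes a Uniform(0,1) parent (so $\mu=1/2$ and $\sigma^2=1/12$); the helper `sample_means`, the seed, the sample sizes, and the number of repetitions are illustrative choices.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Return reps sample means, each from a fresh Uniform(0,1) sample of size n."""
    return [statistics.fmean([rng.random() for _ in range(n)]) for _ in range(reps)]

rng = random.Random(2)  # arbitrary seed for reproducibility
reps = 2000             # number of independent samples per sample size

results = {}
for n in (5, 50):
    means = sample_means(n, reps, rng)
    # Store the center and spread of the empirical sampling distribution.
    results[n] = (statistics.fmean(means), statistics.variance(means))
    print(f"n={n:3d}  mean of means={results[n][0]:.4f}  "
          f"variance of means={results[n][1]:.5f}  sigma^2/n={1 / (12 * n):.5f}")
```

The simulated variance of the means should sit close to $\sigma^2/n$, shrinking by roughly a factor of ten between the two sample sizes.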
Empirical CDF
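The empirical CDF estimates the true CDF using the sample:
$$\hat F_n(x)=\frac{1}{n}\sum_{i=1}^{n}\mathbf 1\{X_i\le x\},$$
the fraction of observations that are at most $x$.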
For Uniform(0,1), the true CDF is $F(x)=x$ on $0\le x\le 1$. Generate uniform samples and watch the step function approach the diagonal.
ECDF vs true CDF
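A minimal sketch of the comparison follows, assuming Uniform(0,1) data so that the true CDF is $F(x)=x$. The helper names `make_ecdf` and `sup_distance_uniform`, the seed, and the sample sizes are hypothetical.

```python
import bisect
import random

def make_ecdf(sample):
    """Return a function x -> fraction of sample values that are <= x."""
    xs = sorted(sample)
    n = len(xs)
    def ecdf(x):
        return bisect.bisect_right(xs, x) / n
    return ecdf

def sup_distance_uniform(sample):
    """Largest gap between the ECDF and F(x) = x on [0, 1].

    The ECDF jumps from i/n to (i+1)/n at the (i+1)-th smallest value,
    so the gap only needs to be checked on both sides of each jump.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

rng = random.Random(3)  # arbitrary seed for reproducibility
distances = {}
for n in (10, 100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    distances[n] = sup_distance_uniform(sample)
    print(f"n={n:6d}  max |ECDF - F| = {distances[n]:.4f}")
```

The maximum gap is random, so it need not fall at every step, but it tends toward zero as $n$ increases.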
$k$-th smallest observation
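Sort the sample:
$$X_{(1)}\le X_{(2)}\le\cdots\le X_{(n)}.$$
The $k$-th order statistic $X_{(k)}$ is the $k$-th smallest observation.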
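For $X_i\sim \operatorname{Uniform}(0,1)$, the $k$-th order statistic has density
$$f_{X_{(k)}}(x)=\frac{n!}{(k-1)!\,(n-k)!}\,x^{k-1}(1-x)^{n-k},\qquad 0\le x\le 1,$$
which is the $\operatorname{Beta}(k,\,n+1-k)$ density.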
Simulation vs Beta curve
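The comparison between simulated order statistics and the Beta curve can be sketched as below. The values $n=5$ and $k=2$, the seed, the number of repetitions, and the helper names `beta_order_density` and `simulate_kth` are assumptions made for illustration.

```python
import math
import random
import statistics

def beta_order_density(x, k, n):
    """Beta(k, n+1-k) density; k * C(n, k) equals n! / ((k-1)! (n-k)!)."""
    return k * math.comb(n, k) * x ** (k - 1) * (1 - x) ** (n - k)

def simulate_kth(k, n, reps, rng):
    """Draw reps Uniform(0,1) samples of size n and keep the k-th smallest of each."""
    return [sorted(rng.random() for _ in range(n))[k - 1] for _ in range(reps)]

rng = random.Random(4)       # arbitrary seed for reproducibility
n, k, reps = 5, 2, 20000     # illustrative choices
draws = simulate_kth(k, n, reps, rng)
sim_mean = statistics.fmean(draws)

# Compare histogram heights (count / (reps * bin width)) with the density at bin midpoints.
n_bins = 10
width = 1 / n_bins
counts = [0] * n_bins
for d in draws:
    counts[min(int(d * n_bins), n_bins - 1)] += 1
for i, c in enumerate(counts):
    mid = (i + 0.5) * width
    print(f"x={mid:.2f}  simulated={c / (reps * width):.3f}  "
          f"beta={beta_order_density(mid, k, n):.3f}")

print(f"simulated mean={sim_mean:.4f}  k/(n+1)={k / (n + 1):.4f}")
```

The simulated heights should track the Beta density bin by bin, and the simulated mean should land near $k/(n+1)$.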
Monte Carlo integration
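Random samples can approximate integrals. For iid $U_i\sim\operatorname{Uniform}(0,1)$ and an integrable function $g$,
$$\int_0^1 g(u)\,du=E[g(U_1)]\approx\frac{1}{n}\sum_{i=1}^{n}g(U_i).$$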
Estimate as points accumulate
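A minimal sketch of a running Monte Carlo estimate follows. The integrand $g(u)=u^2$, whose exact integral over $[0,1]$ is $1/3$, is a hypothetical choice and not necessarily the one used in the original demonstration; the helper `running_estimates`, the seed, and the checkpoints are also assumptions.

```python
import random

def g(u):
    """Hypothetical integrand; its exact integral over [0, 1] is 1/3."""
    return u * u

def running_estimates(func, n, rng, checkpoints):
    """Average func(U_i) over i = 1..n and record the running mean at each checkpoint."""
    total = 0.0
    out = {}
    for i in range(1, n + 1):
        total += func(rng.random())
        if i in checkpoints:
            out[i] = total / i
    return out

rng = random.Random(5)  # arbitrary seed for reproducibility
checkpoints = {10, 100, 1000, 10000, 100000}
estimates = running_estimates(g, 100000, rng, checkpoints)

for i in sorted(estimates):
    print(f"n={i:6d}  estimate={estimates[i]:.5f}  error={estimates[i] - 1 / 3:+.5f}")
```

The error typically shrinks at a rate of order $1/\sqrt{n}$, so each extra digit of accuracy costs roughly a hundred times more points.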
Practice questions
1. If $X_1,\ldots,X_n$ are iid with mean $\mu$ and variance $\sigma^2$, what are $E[\bar X]$ and $\operatorname{Var}(\bar X)$?
Answer: $E[\bar X]=\mu$ and $\operatorname{Var}(\bar X)=\sigma^2/n$.
2. For a Uniform(0,1) sample of size $n$, what is $E[X_{(k)}]$?
Answer: Since $X_{(k)}\sim \operatorname{Beta}(k,n+1-k)$, $E[X_{(k)}]=k/(n+1)$.
3. For $n=5$, what are the expected minimum, median, and maximum of a Uniform(0,1) sample?
Answer: $E[X_{(1)}]=1/6$, $E[X_{(3)}]=3/6=1/2$, and $E[X_{(5)}]=5/6$.
4. Why is $S^2$ divided by $n-1$ instead of $n$?
Answer: The deviations are measured from $\bar X$, which is computed from the same data, so the $n$ deviations $X_i-\bar X$ sum to zero and carry only $n-1$ degrees of freedom. Dividing by $n-1$ makes $S^2$ an unbiased estimator of $\sigma^2$.
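The answer to question 4 can be checked by simulation. The sketch below assumes a Uniform(0,1) parent (so $\sigma^2=1/12$) and averages both versions of the variance estimator over many samples; the helper `variance_estimates`, the seed, $n=5$, and the number of repetitions are illustrative choices.

```python
import random
import statistics

def variance_estimates(sample):
    """Return (sum of squares / (n - 1), sum of squares / n) about the sample mean."""
    n = len(sample)
    x_bar = sum(sample) / n
    ss = sum((x - x_bar) ** 2 for x in sample)
    return ss / (n - 1), ss / n

rng = random.Random(6)  # arbitrary seed for reproducibility
n, reps = 5, 20000      # illustrative choices
true_var = 1 / 12       # variance of Uniform(0,1)

unbiased, biased = [], []
for _ in range(reps):
    sample = [rng.random() for _ in range(n)]
    u, b = variance_estimates(sample)
    unbiased.append(u)
    biased.append(b)

avg_unbiased = statistics.fmean(unbiased)
avg_biased = statistics.fmean(biased)
print(f"true variance        = {true_var:.5f}")
print(f"average with n - 1   = {avg_unbiased:.5f}")
print(f"average with n       = {avg_biased:.5f}  "
      f"(theory: (n-1)/n * sigma^2 = {(n - 1) / n * true_var:.5f})")
```

The divide-by-$n$ average falls short of $\sigma^2$ by roughly the factor $(n-1)/n$, while the divide-by-$(n-1)$ average lands near $\sigma^2$.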