MATH 5010 Section 3
Joint and Conditional Distributions

An interactive teaching page for joint distributions, marginals, conditional distributions, independence, total probability, Bayes' theorem, and visual intuition for two random variables.

1. Big picture

Section 1 studied events. Section 2 studied one random variable. Section 3 studies two random variables together: how they vary jointly, how to summarize each one separately, and how knowing one variable changes the distribution of the other.

Joint

The joint distribution answers questions such as $P(X=x,Y=y)$ or $P((X,Y)\in A)$.

Marginal

The marginal distribution of $X$ ignores $Y$ by summing or integrating over all $Y$ values.

Conditional

The conditional distribution of $X$ given $Y=y$ updates probabilities after information about $Y$ is known.

$$p_{X,Y}(x,y)=P(X=x,Y=y),\qquad p_X(x)=\sum_y p_{X,Y}(x,y),\qquad p_{X\mid Y}(x\mid y)=\frac{p_{X,Y}(x,y)}{p_Y(y)}.$$
Teaching message: The joint distribution is the full story. Marginal and conditional distributions are summaries of that story, and the marginals alone do not determine the joint distribution.

2. Joint PMF table

For discrete random variables, the joint PMF can be stored in a table. The entries must be nonnegative and the total sum must be $1$.

$$p_{X,Y}(x,y)\ge 0,\qquad \sum_x\sum_y p_{X,Y}(x,y)=1.$$
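As a minimal sketch of these two conditions, the snippet below checks a table for negative entries and rescales it so that it sums to $1$, which is presumably what a Normalize step does. The `normalize` helper and the example weights are illustrative assumptions, not the page's own code.

```python
def normalize(weights):
    """Return a valid joint PMF proportional to a table of nonnegative weights."""
    if any(w < 0 for row in weights for w in row):
        raise ValueError("joint PMF entries must be nonnegative")
    total = sum(sum(row) for row in weights)
    if total == 0:
        raise ValueError("at least one entry must be positive")
    return [[w / total for w in row] for row in weights]


# Hypothetical weights: rows are X=0,1 and columns are Y=0,1.
weights = [[4, 1], [2, 3]]
joint = normalize(weights)
print(joint)  # [[0.4, 0.1], [0.2, 0.3]]
```

Dividing by the total keeps the ratios between cells unchanged, so normalization does not alter any conditional probability read from the table.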

Interactive joint table

Edit the four probabilities below. Click Normalize if the total is not $1$.

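$$\begin{array}{c|cc|c} & Y=0 & Y=1 & \text{Row sum } p_X(x)\\ \hline X=0 & p_{X,Y}(0,0) & p_{X,Y}(0,1) & p_X(0)\\ X=1 & p_{X,Y}(1,0) & p_{X,Y}(1,1) & p_X(1)\\ \hline \text{Column sum } p_Y(y) & p_Y(0) & p_Y(1) & 1 \end{array}$$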

Read from the table

$P(X=1,Y=0)=$

$P(X=1)=$

$P(Y=1)=$

$P(X=1\mid Y=1)=$

3. Marginal distributions

The word marginal comes from the margins of a table: row sums and column sums.

Discrete case

$$p_X(x)=\sum_y p_{X,Y}(x,y),\qquad p_Y(y)=\sum_x p_{X,Y}(x,y).$$

Continuous case

$$f_X(x)=\int_{-\infty}^{\infty}f_{X,Y}(x,y)\,dy,\qquad f_Y(y)=\int_{-\infty}^{\infty}f_{X,Y}(x,y)\,dx.$$
Common mistake: A marginal distribution is not found by choosing one row or one column. It is found by summing or integrating out the other variable.
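The row-sum and column-sum recipe can be sketched in a few lines. The joint table below is a hypothetical example (the page's table is user-editable), and exact fractions are used only to avoid floating-point noise in the printed values.

```python
from fractions import Fraction as F

# Hypothetical joint PMF: rows are X=0,1 and columns are Y=0,1.
joint = [[F(4, 10), F(1, 10)],
         [F(2, 10), F(3, 10)]]

p_x = [sum(row) for row in joint]        # sum out Y: one value per row
p_y = [sum(col) for col in zip(*joint)]  # sum out X: one value per column

print([str(p) for p in p_x])  # ['1/2', '1/2']
print([str(p) for p in p_y])  # ['3/5', '2/5']
```

Each marginal sums to $1$ on its own, which is a quick sanity check after summing out the other variable.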

4. Conditional distributions

A conditional distribution rescales one slice of the joint distribution so that the slice has total probability $1$. It is defined only for values $y$ with $p_Y(y)>0$ (or $f_Y(y)>0$ in the continuous case).

$$p_{X\mid Y}(x\mid y)=\frac{p_{X,Y}(x,y)}{p_Y(y)},\qquad f_{X\mid Y}(x\mid y)=\frac{f_{X,Y}(x,y)}{f_Y(y)}.$$

Conditional probabilities from current table

The bars in the plot should add to $1$ because a conditional distribution is a complete probability distribution after conditioning.
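The rescaling step can be sketched as follows. The table is a hypothetical example and `conditional_x_given_y` is an illustrative helper name, not part of the page.

```python
from fractions import Fraction as F

# Hypothetical joint PMF: rows are X=0,1 and columns are Y=0,1.
joint = [[F(4, 10), F(1, 10)],
         [F(2, 10), F(3, 10)]]


def conditional_x_given_y(joint, y):
    """Return [P(X=x | Y=y) for each x] by rescaling column y of the table."""
    p_y = sum(row[y] for row in joint)
    if p_y == 0:
        raise ValueError("cannot condition on an event of probability 0")
    return [row[y] / p_y for row in joint]


cond = conditional_x_given_y(joint, y=1)
print([str(p) for p in cond])  # ['1/4', '3/4']
print(sum(cond))               # 1
```

The column for $Y=1$ holds $1/10$ and $3/10$, which sum to $p_Y(1)=2/5$. Dividing by $2/5$ gives $1/4$ and $3/4$, a complete distribution.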

5. Independence checker

Two discrete random variables are independent if and only if every joint entry equals the product of the corresponding marginal probabilities.

$$X\perp Y \quad \Longleftrightarrow \quad p_{X,Y}(x,y)=p_X(x)p_Y(y)\text{ for all }x,y.$$

Current table result

Maximum factorization error:
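One plausible way to compute a maximum factorization error is the largest absolute gap $|p_{X,Y}(x,y)-p_X(x)p_Y(y)|$ over all cells. The sketch below assumes that definition, and both tables are hypothetical examples.

```python
from fractions import Fraction as F


def max_factorization_error(joint):
    """Largest |p(x,y) - p_X(x) p_Y(y)| over all cells of the table."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return max(abs(joint[i][j] - p_x[i] * p_y[j])
               for i in range(len(p_x))
               for j in range(len(p_y)))


# Hypothetical dependent table.
dependent = [[F(4, 10), F(1, 10)],
             [F(2, 10), F(3, 10)]]

# Hypothetical independent table built as an outer product of marginals.
px, py = [F(3, 10), F(7, 10)], [F(6, 10), F(4, 10)]
independent = [[a * b for b in py] for a in px]

print(max_factorization_error(dependent))    # 1/10
print(max_factorization_error(independent))  # 0
```

With floating-point entries, the error should be compared against a small tolerance (for example `1e-9`) rather than tested for exact equality with zero.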

6. Total probability and Bayes' theorem

The law of total probability combines several conditional probabilities. Bayes' theorem reverses the conditioning direction.

$$P(T)=P(T\mid D)P(D)+P(T\mid D^c)P(D^c),\qquad P(D\mid T)=\frac{P(T\mid D)P(D)}{P(T)}.$$

Interactive medical test example

Prevalence $P(D)=0.010$
Sensitivity $P(T\mid D)=0.950$
False-positive rate $P(T\mid D^c)=0.100$

Computed probabilities

Overall positive rate $P(+)=$

Posterior probability $P(D\mid +)=$

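Here $T$ is the event of a positive test, written $+$ above. With the default values, the overall positive probability is $P(+)=(0.95)(0.01)+(0.10)(0.99)=0.1085$, matching the homework-style hand calculation from the course. The posterior is then $P(D\mid +)=0.0095/0.1085\approx 0.0876$: because the disease is rare, most positive results are false positives.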
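The two formulas can be checked numerically with a short sketch. The function name `bayes_test` and its parameter names are illustrative assumptions. The inputs are the default values used in the calculation above.

```python
def bayes_test(prevalence, sensitivity, false_positive_rate):
    """Return (P(T), P(D | T)) for a binary test with event T = positive."""
    # Law of total probability over D and its complement.
    p_pos = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    # Bayes' theorem reverses the conditioning direction.
    posterior = sensitivity * prevalence / p_pos
    return p_pos, posterior


p_pos, posterior = bayes_test(prevalence=0.01,
                              sensitivity=0.95,
                              false_positive_rate=0.10)
print(round(p_pos, 4), round(posterior, 4))  # 0.1085 0.0876
```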

7. Continuous joint density

For two continuous random variables, the probability of a region $A$ is the volume under the joint density surface over $A$.

$$P((X,Y)\in A)=\iint_A f_{X,Y}(x,y)\,dx\,dy.$$

Uniform unit square: geometric probability

Let $(X,Y)$ be uniform on $[0,1]^2$. Then $f_{X,Y}(x,y)=1$ on the square, so probabilities equal areas.

$$P(X^2+Y\le 1)=\int_0^1(1-x^2)\,dx=\frac{2}{3}.$$
Exponent slider: $a=2.00$

For any exponent $a>0$, the same calculation gives

$$P(Y\le 1-X^a)=\int_0^1(1-x^a)\,dx=\frac{a}{a+1}.$$

Current probability:

At $a=2$, this reduces to the familiar $2/3$ geometric probability.
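Because probabilities on the unit square are areas, the formula $a/(a+1)$ can be checked by Monte Carlo. This is a minimal sketch in which the function name, sample size, and seed are arbitrary illustrative choices.

```python
import random


def estimate_region_probability(a, n=100_000, seed=0):
    """Estimate P(Y <= 1 - X^a) for (X, Y) uniform on the unit square."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if y <= 1 - x ** a:
            hits += 1
    return hits / n


a = 2.0
estimate = estimate_region_probability(a)
exact = a / (a + 1)
print(f"estimate={estimate:.3f}, exact={exact:.3f}")
```

With $n=100{,}000$ points the standard error is roughly $0.0015$, so the estimate should agree with $2/3$ to about two decimal places.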

8. Bivariate normal explorer

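A standard bivariate normal pair has standard normal marginals and a correlation parameter $\rho$ with $|\rho|<1$. When $\rho=0$, the variables are independent (for jointly normal variables, zero correlation implies independence). As $|\rho|$ approaches $1$, the cloud concentrates around the line $y=x$ (for $\rho>0$) or $y=-x$ (for $\rho<0$).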

$$f(x,y)=\frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left[-\frac{x^2-2\rho xy+y^2}{2(1-\rho^2)}\right].$$
Correlation slider: $\rho=0.60$

Conditional normal intuition

For a standard bivariate normal pair,

$$X\mid Y=y\sim N(\rho y,1-\rho^2).$$

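Conditioning value (slider): $y=1.0$

Conditional mean $E[X\mid Y=y]=\rho y$ (equal to $0.60$ at the displayed values $\rho=0.60$, $y=1.0$)

Conditional variance $\mathrm{Var}(X\mid Y=y)=1-\rho^2$ (equal to $0.64$ at $\rho=0.60$)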
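The conditional formula can be checked by simulation. The sketch below assumes the standard construction $Y=Z_1$, $X=\rho Z_1+\sqrt{1-\rho^2}\,Z_2$ with independent standard normals $Z_1,Z_2$. It approximates conditioning on $Y=y$ by keeping draws with $Y$ in a narrow window around $y$. The window width, sample size, and function names are illustrative choices.

```python
import math
import random


def conditional_normal(rho, y):
    """Theoretical mean and variance of X given Y=y (standard bivariate normal)."""
    return rho * y, 1 - rho ** 2


def simulate_slice(rho, y, width=0.05, n=200_000, seed=0):
    """Sample mean and variance of X among draws with |Y - y| < width."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        y_draw = rng.gauss(0, 1)
        x_draw = rho * y_draw + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        if abs(y_draw - y) < width:
            xs.append(x_draw)
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    return mean, var


theory_mean, theory_var = conditional_normal(rho=0.6, y=1.0)
sim_mean, sim_var = simulate_slice(rho=0.6, y=1.0)
print(f"theory: mean={theory_mean:.2f}, var={theory_var:.2f}")  # mean=0.60, var=0.64
print(f"simulated: mean={sim_mean:.2f}, var={sim_var:.2f}")
```

The conditional variance $1-\rho^2$ does not depend on $y$: knowing $Y$ shifts the center of $X$ but always shrinks its spread by the same factor.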

9. Simulation lab

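Simulation connects joint distributions to empirical frequencies: as the number of simulated pairs grows, the fraction of draws landing in each cell approaches that cell's joint probability.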

Simulate the current joint table

Number of simulated pairs: 3000
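A simulation of this kind might look like the sketch below. The joint table is a hypothetical example, the sample size of 3000 matches the value shown above, and `simulate_joint` and the seed are illustrative assumptions.

```python
import random
from collections import Counter

# Hypothetical joint PMF keyed by (x, y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def simulate_joint(joint, n=3000, seed=0):
    """Draw n pairs from the joint PMF and return empirical cell frequencies."""
    rng = random.Random(seed)
    outcomes = list(joint)
    draws = rng.choices(outcomes, weights=[joint[o] for o in outcomes], k=n)
    counts = Counter(draws)
    return {o: counts[o] / n for o in outcomes}


freq = simulate_joint(joint)
for outcome, p in joint.items():
    print(outcome, f"true={p:.3f}", f"empirical={freq[outcome]:.3f}")
```

With 3000 draws, each empirical frequency typically lands within about $0.01$ to $0.02$ of its true probability. Increasing `n` tightens the match.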

10. Self-check quiz