Chapter 8. When Diagonalization Fails: Jordan Canonical Form

Generalized eigenvectors, nilpotent motion, and the hidden structure of repeated eigenvalues

Guiding question.
If a matrix does not have enough eigenvectors to be diagonalized, what is the next best coordinate system?

In Chapter 7, diagonalization told a beautiful story. If a matrix has enough eigenvectors, then we can choose an eigenvector basis and the matrix becomes diagonal:

\[ A=PDP^{-1}. \]

In that coordinate system, every coordinate evolves independently. Powers, exponentials, and dynamical systems become easy.

But not every matrix has enough eigenvectors. A repeated eigenvalue may give only one eigendirection. At first, this looks like failure. Jordan canonical form explains that the failure is structured. Missing eigenvectors are replaced by generalized eigenvectors, and these generalized eigenvectors form chains. Each chain becomes one Jordan block.

Note: Main idea

Jordan form is diagonalization plus correction terms for missing eigenvectors:

\[ \boxed{A=PJP^{-1},\qquad J=\text{block diagonal matrix of Jordan blocks}.} \]

A diagonal matrix is the special case in which every Jordan block has size \(1\).

Tip: AI and coding companion

Ask an AI tool to explain the difference between an eigenvector and a generalized eigenvector using a simple \(2\times2\) Jordan block. Then verify the formulas by hand and with Python.

8.1 The obstruction: not enough eigenvectors

Consider

\[ A=\begin{bmatrix}2&1\\0&2\end{bmatrix}. \]

The only eigenvalue is \(\lambda=2\) because

\[ \det(A-\lambda I)=(2-\lambda)^2. \]

But

\[ A-2I=\begin{bmatrix}0&1\\0&0\end{bmatrix}, \]

so

\[ \ker(A-2I)=\left\{\begin{bmatrix}x\\0\end{bmatrix}:x\in\mathbb R\right\}. \]

Thus the eigenspace is one-dimensional, even though the algebraic multiplicity of \(2\) is two. We do not have enough eigenvectors to diagonalize \(A\).

Warning

A repeated eigenvalue does not automatically cause a problem. The problem occurs when the eigenspace is too small.

Proof: why this matrix is not diagonalizable

A \(2\times2\) matrix is diagonalizable over \(\mathbb R\) if and only if it has two linearly independent eigenvectors. Here the eigenspace for the only eigenvalue \(2\) is one-dimensional, so any two eigenvectors are scalar multiples of each other. Hence \(A\) is not diagonalizable.
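The gap between the two multiplicities can be checked mechanically. The following is a minimal SymPy sketch for the matrix above; the variable names are illustrative choices, not notation from the text.

```python
import sympy as sp

A = sp.Matrix([[2, 1], [0, 2]])

# Algebraic multiplicity: how often each eigenvalue is a root of det(A - t*I).
algebraic = A.eigenvals()  # dictionary {eigenvalue: multiplicity}

# Geometric multiplicity: dimension of the eigenspace ker(A - 2I).
eigenspace_basis = (A - 2 * sp.eye(2)).nullspace()
geometric = len(eigenspace_basis)

print("algebraic multiplicities:", algebraic)
print("dimension of ker(A - 2I):", geometric)
print("diagonalizable:", A.is_diagonalizable())
```

The eigenspace basis has a single vector while the eigenvalue is a double root, which is exactly the obstruction described in this section.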

8.2 Nilpotent matrices: motion that eventually disappears

Definition 8.1: Nilpotent matrix

A square matrix \(N\) is called nilpotent if there exists a positive integer \(k\) such that

\[ N^k=0. \]

The smallest such \(k\) is called the nilpotency index of \(N\).

A basic nilpotent matrix is

\[ N=\begin{bmatrix}0&1&0\\0&0&1\\0&0&0\end{bmatrix}. \]

It satisfies

\[ N^2=\begin{bmatrix}0&0&1\\0&0&0\\0&0&0\end{bmatrix}, \qquad N^3=0. \]

So \(N\) is nilpotent of index \(3\).

Important: Interpretation

Nilpotent means “eventually zero.” A nilpotent matrix may move a vector at first, but after enough applications, every vector is sent to zero.
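To make "eventually zero" concrete, here is a small sketch that finds the nilpotency index by repeated multiplication and follows one vector under the shift matrix \(N\) above. The helper name `nilpotency_index` is hypothetical, introduced only for this illustration.

```python
import sympy as sp

def nilpotency_index(N):
    """Return the smallest k >= 1 with N**k == 0, or None if N is not nilpotent."""
    n = N.shape[0]
    power = sp.eye(n)
    for k in range(1, n + 1):
        power = power * N
        if power == sp.zeros(n, n):
            return k
    # An n x n nilpotent matrix always satisfies N**n == 0,
    # so reaching this point means N is not nilpotent.
    return None

N = sp.Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
index = nilpotency_index(N)

# Follow the third standard basis vector: e3 -> e2 -> e1 -> 0.
e3 = sp.Matrix([0, 0, 1])
orbit = [N**j * e3 for j in range(4)]

print("nilpotency index:", index)
for j, vec in enumerate(orbit):
    print(f"N^{j} e3 =", list(vec))
```

Each application of \(N\) moves the vector one step along the chain until it reaches zero.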

Proof: a nilpotent matrix has only eigenvalue \(0\)

Suppose \(Nv=\lambda v\) with \(v\neq0\). If \(N^k=0\), then

\[ 0=N^kv=\lambda^k v. \]

Since \(v\neq0\), this implies \(\lambda^k=0\), hence \(\lambda=0\).

8.3 Jordan blocks

Definition 8.2: Jordan block

For \(\lambda\in\mathbb F\) and \(k\ge1\), the \(k\times k\) Jordan block with eigenvalue \(\lambda\) is

\[ J_{\lambda,k}=\begin{bmatrix} \lambda&1&0&\cdots&0\\ 0&\lambda&1&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots\\ 0&0&\cdots&\lambda&1\\ 0&0&\cdots&0&\lambda \end{bmatrix}. \]

A Jordan block can be written as

\[ J_{\lambda,k}=\lambda I+N_k, \]

where

\[ N_k=\begin{bmatrix} 0&1&0&\cdots&0\\ 0&0&1&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots\\ 0&0&\cdots&0&1\\ 0&0&\cdots&0&0 \end{bmatrix}. \]

The matrix \(N_k\) is nilpotent.

Theorem 8.3: Meaning of a Jordan block

On a Jordan block \(J_{\lambda,k}\), the matrix action is the sum of two parts:

  1. the diagonal eigenvalue part \(\lambda I\);
  2. the nilpotent shift part \(N_k\).

Thus a Jordan block represents an eigendirection together with a chain of generalized directions.

Proof idea

The formula \(J_{\lambda,k}=\lambda I+N_k\) is immediate from the entries. The diagonal entries are \(\lambda\), and the only off-diagonal nonzero entries are the \(1\)’s just above the diagonal. Those \(1\)’s form the nilpotent shift \(N_k\).

8.4 Generalized eigenvectors and Jordan chains

Definition 8.4: Generalized eigenvector

Let \(A\in\mathbb F^{n\times n}\) and let \(\lambda\) be an eigenvalue of \(A\). A nonzero vector \(v\) is a generalized eigenvector for \(\lambda\) if

\[ (A-\lambda I)^m v=0 \]

for some positive integer \(m\).

Ordinary eigenvectors correspond to \(m=1\).

Definition 8.5: Jordan chain

A sequence of nonzero vectors

\[ v_1,v_2,\ldots,v_k \]

is called a Jordan chain for \(A\) with eigenvalue \(\lambda\) if

\[ (A-\lambda I)v_1=0, \]

and

\[ (A-\lambda I)v_j=v_{j-1},\qquad j=2,\ldots,k. \]

In this chain, \(v_1\) is an ordinary eigenvector. The vectors \(v_2,\ldots,v_k\) are generalized eigenvectors.

Proposition 8.6: Matrix of a Jordan chain

If \(v_1,\ldots,v_k\) is a Jordan chain for \(A\) with eigenvalue \(\lambda\), then the matrix of \(A\), restricted to \(\operatorname{span}\{v_1,\ldots,v_k\}\) and written in the ordered basis

\[ v_1,v_2,\ldots,v_k \]

is the Jordan block \(J_{\lambda,k}\).

Proof

From the chain conditions,

\[ (A-\lambda I)v_1=0, \qquad (A-\lambda I)v_j=v_{j-1}\quad (j\ge2). \]

Therefore

\[ Av_1=\lambda v_1, \]

and

\[ Av_j=v_{j-1}+\lambda v_j\quad (j\ge2). \]

These equations say that the coordinate columns of \(A\) in the basis \(v_1,\ldots,v_k\) are exactly the columns of the Jordan block.
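As an illustration of the proposition, the sketch below takes a \(2\times2\) matrix that is not already in Jordan form, uses a chain \(v_1,v_2\) as the columns of \(P\), and checks that \(P^{-1}AP\) is the block \(J_{2,2}\). The matrix and the chain vectors are illustrative assumptions chosen for this example, not taken from the text.

```python
import sympy as sp

A = sp.Matrix([[3, 1], [-1, 1]])  # illustrative matrix with the single eigenvalue 2
lam = 2
M = A - lam * sp.eye(2)

v1 = sp.Matrix([1, -1])  # ordinary eigenvector: M*v1 = 0
v2 = sp.Matrix([1, 0])   # generalized eigenvector: M*v2 = v1

# Chain conditions from Definition 8.5.
assert M * v1 == sp.zeros(2, 1)
assert M * v2 == v1

# Use the chain, in order, as the columns of P.
P = sp.Matrix.hstack(v1, v2)
J = P.inv() * A * P
print(J)
```

The order of the columns matters: placing \(v_2\) before \(v_1\) would put the \(1\) below the diagonal instead of above it.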

8.5 Generalized eigenspaces

Definition 8.7: Generalized eigenspace

The generalized eigenspace of \(A\) associated with \(\lambda\) is

\[ G_\lambda=\ker\bigl((A-\lambda I)^n\bigr), \]

where \(A\) is an \(n\times n\) matrix.

The ordinary eigenspace is

\[ E_\lambda=\ker(A-\lambda I). \]

The generalized eigenspace may be larger. It consists of all vectors that are sent to zero by some power of \(A-\lambda I\); the exponent \(n\) always suffices because the kernels of the powers stop growing after at most \(n\) steps.

Theorem 8.8: Generalized eigenspace decomposition

Let \(A\in\mathbb C^{n\times n}\) and let \(\lambda_1,\ldots,\lambda_s\) be its distinct eigenvalues, so that the characteristic polynomial splits as

\[ p_A(t)=\prod_{i=1}^s(t-\lambda_i)^{a_i}. \]

Then

\[ \mathbb C^n=G_{\lambda_1}\oplus\cdots\oplus G_{\lambda_s}. \]

Proof idea

The main idea is that different factors \((t-\lambda_i)^{a_i}\) of the characteristic polynomial are relatively prime. Using polynomial identities, one builds projection operators onto the corresponding generalized eigenspaces. This separates the whole space into independent generalized eigenspace components.
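The decomposition can be observed numerically on a small case. The sketch below uses an illustrative \(3\times3\) matrix (an assumption for this example) and two hypothetical helper functions to compare \(E_\lambda\) with \(G_\lambda\) and to check that the generalized eigenspaces together fill the whole space.

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 5]])  # illustrative example, not from the text

def eigenspace_basis(A, lam):
    """Basis of E_lam = ker(A - lam*I)."""
    return (A - lam * sp.eye(A.shape[0])).nullspace()

def generalized_eigenspace_basis(A, lam):
    """Basis of G_lam = ker((A - lam*I)**n) for an n x n matrix A."""
    size = A.shape[0]
    return ((A - lam * sp.eye(size)) ** size).nullspace()

E2 = eigenspace_basis(A, 2)
G2 = generalized_eigenspace_basis(A, 2)
G5 = generalized_eigenspace_basis(A, 5)

# Stack all basis vectors side by side. Full rank means the sum of the
# generalized eigenspaces is direct and equals the whole space.
combined = sp.Matrix.hstack(*(G2 + G5))

print("dim E_2 =", len(E2))
print("dim G_2 =", len(G2))
print("dim G_5 =", len(G5))
print("rank of combined basis =", combined.rank())
```

Here \(\dim G_2\) matches the algebraic multiplicity of \(2\) even though \(\dim E_2\) does not.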

8.6 Jordan canonical form

Theorem 8.9: Jordan canonical form

Let \(A\in\mathbb C^{n\times n}\). Then there exists an invertible matrix \(P\) such that

\[ A=PJP^{-1}, \]

where \(J\) is block diagonal and each block is a Jordan block:

\[ J=\begin{bmatrix} J_{\lambda_1,k_1}&&0\\ &\ddots&\\ 0&&J_{\lambda_r,k_r} \end{bmatrix}. \]

The matrix \(J\) is called a Jordan canonical form of \(A\). It is unique up to the order of the blocks.

The theorem says that every complex square matrix can be understood by generalized eigenvector chains.

Important: Story

Diagonalization asks for enough eigenvectors. Jordan form asks for enough generalized eigenvectors. Over \(\mathbb C\), generalized eigenvectors always provide a basis.

Proof idea

The proof has two layers. First, decompose \(\mathbb C^n\) into generalized eigenspaces. Second, on each generalized eigenspace, study the nilpotent operator \(N=A-\lambda I\). Nilpotent operators can be organized into chains. Each chain gives one Jordan block.
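For small exact matrices, SymPy can carry out this construction through `Matrix.jordan_form()`, which returns a pair \((P, J)\). The sketch below hides a known Jordan structure behind a change of basis and then recovers it; the matrices `J0` and `P0` are illustrative assumptions, and the order in which SymPy lists the blocks is not relied upon.

```python
import sympy as sp

# Known Jordan structure: one 2x2 block for eigenvalue 2, one 1x1 block for 5.
J0 = sp.Matrix([[2, 1, 0],
                [0, 2, 0],
                [0, 0, 5]])

# Illustrative invertible change of basis that disguises the structure.
P0 = sp.Matrix([[1, 0, 0],
                [1, 1, 0],
                [0, 1, 1]])
A = P0 * J0 * P0.inv()

# Recover a Jordan form and a transformation matrix from A alone.
P, J = A.jordan_form()

print("A =", A)
print("J =", J)
print("A == P*J*P^{-1}:", A == P * J * P.inv())
```

The recovered \(P\) generally differs from `P0`, since Jordan chains are not unique, but the block sizes agree.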

8.7 Diagonalization as a special case

A diagonal matrix is a Jordan form in which every Jordan block has size \(1\):

\[ J_{\lambda,1}=[\lambda]. \]

Theorem 8.10: Diagonalizability and Jordan blocks

A matrix \(A\) is diagonalizable over \(\mathbb C\) if and only if every Jordan block of \(A\) has size \(1\).

Proof

If every Jordan block has size \(1\), then the Jordan matrix is diagonal, so \(A\) is similar to a diagonal matrix. Conversely, if \(A\) is diagonalizable, it has a basis of ordinary eigenvectors, and in such a basis the matrix is diagonal. A diagonal matrix is itself a Jordan form whose blocks all have size \(1\), and the Jordan form is unique up to the order of the blocks, so no Jordan block of size larger than \(1\) can appear.

8.8 How to find Jordan block sizes

Let

\[ N=A-\lambda I. \]

Define

\[ d_j=\dim\ker(N^j),\qquad j=1,2,\ldots. \]

Theorem 8.11: Kernel dimensions detect block sizes

For a fixed eigenvalue \(\lambda\), the number

\[ d_j-d_{j-1} \]

equals the number of Jordan blocks for \(\lambda\) of size at least \(j\), where \(d_0=0\).

Therefore the number of Jordan blocks of size exactly \(j\) is

\[ (d_j-d_{j-1})-(d_{j+1}-d_j). \]

Proof idea

For one Jordan block of size \(k\), the nilpotent part \(N\) satisfies

\[ \dim\ker(N^j)=\min(j,k). \]

Thus this block contributes \(1\) to \(d_j-d_{j-1}\) exactly when \(j\le k\). Adding over all blocks gives the formula.

8.8.1 Example 8.12: block sizes from kernel dimensions

Suppose the algebraic multiplicity of \(\lambda\) is \(5\) and

\[ d_1=2, \qquad d_2=4, \qquad d_3=5. \]

Then

\[ d_1-d_0=2, \qquad d_2-d_1=2, \qquad d_3-d_2=1. \]

So there are two blocks of size at least \(1\), two blocks of size at least \(2\), and one block of size at least \(3\). Therefore the block sizes are

\[ 3 \quad\text{and}\quad 2. \]
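The counting rule of Theorem 8.11 is easy to automate. The following sketch uses two hypothetical helpers: `kernel_dims` computes \(d_1,d_2,\ldots\) from a matrix until the dimensions stop growing, and `jordan_block_sizes` turns such a list into block sizes. It assumes the list of dimensions is given up to the point where it stabilizes.

```python
import sympy as sp

def kernel_dims(A, lam):
    """Return [d_1, d_2, ...] with d_j = dim ker((A - lam*I)**j), until stable."""
    n = A.shape[0]
    N = A - lam * sp.eye(n)
    dims = []
    power = sp.eye(n)
    for _ in range(n):
        power = power * N
        dims.append(n - power.rank())  # rank-nullity theorem
        if len(dims) >= 2 and dims[-1] == dims[-2]:
            dims.pop()  # drop the repeated value: the kernels have stopped growing
            break
    return dims

def jordan_block_sizes(dims):
    """Block sizes (largest first) from kernel dimensions d_1, d_2, ..."""
    d = [0] + list(dims)
    # at_least[j-1] = number of blocks of size >= j; none beyond the last index.
    at_least = [d[j] - d[j - 1] for j in range(1, len(d))] + [0]
    sizes = []
    for j in range(1, len(d)):
        exactly_j = at_least[j - 1] - at_least[j]
        sizes.extend([j] * exactly_j)
    return sorted(sizes, reverse=True)

# The numbers from Example 8.12.
print(jordan_block_sizes([2, 4, 5]))

# A 5x5 matrix with blocks J_{2,3} and J_{2,2}, built by hand for illustration.
A = 2 * sp.eye(5)
A[0, 1] = A[1, 2] = A[3, 4] = 1
print(kernel_dims(A, 2), jordan_block_sizes(kernel_dims(A, 2)))
```

Using the rank to get the kernel dimension avoids computing a full null space at every step.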

8.9 Python computation: Jordan blocks and powers

The following code constructs a Jordan block and compares direct powers with the binomial formula.

Code
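import sympy as sp

lam = sp.Symbol('lambda')
k = 3  # block size

# Nilpotent shift N_k: ones on the superdiagonal, zeros elsewhere.
N = sp.zeros(k)
for i in range(k-1):
    N[i, i+1] = 1

# Jordan block J = lambda*I + N_k.
J = lam*sp.eye(k) + N
J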

\(\displaystyle \left[\begin{matrix}\lambda & 1 & 0\\0 & \lambda & 1\\0 & 0 & \lambda\end{matrix}\right]\)

For a Jordan block,

\[ J_{\lambda,k}^m=(\lambda I+N)^m =\sum_{r=0}^{k-1}\binom{m}{r}\lambda^{m-r}N^r. \]

Code
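lam_value = 2
m = 5
J2 = J.subs(lam, lam_value)

# Binomial formula: sum over r < k of C(m, r) * lam^(m-r) * N^r.
binomial_sum = sum((sp.binomial(m, r) * lam_value**(m - r) * N**r for r in range(k)), sp.zeros(k))
assert J2**m == binomial_sum  # the direct power agrees with the formula
J2**m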

\(\displaystyle \left[\begin{matrix}32 & 80 & 80\\0 & 32 & 80\\0 & 0 & 32\end{matrix}\right]\)

8.10 Powers of a Jordan block

Proposition 8.13: Powers of a Jordan block

Let

\[ J=J_{\lambda,k}=\lambda I+N, \]

where \(N^k=0\). Then for every positive integer \(m\),

\[ J^m=\sum_{r=0}^{k-1}\binom{m}{r}\lambda^{m-r}N^r. \]

Proof

Since \(\lambda I\) commutes with \(N\), the binomial theorem gives

\[ (\lambda I+N)^m=\sum_{r=0}^{m}\binom{m}{r}(\lambda I)^{m-r}N^r. \]

But \(N^r=0\) for \(r\ge k\). Therefore only the terms \(r=0,1,\ldots,k-1\) remain.

8.10.1 Example 8.14: a \(2\times2\) block

Let

\[ J=\begin{bmatrix}\lambda&1\\0&\lambda\end{bmatrix}. \]

Then

\[ J^m=\begin{bmatrix}\lambda^m&m\lambda^{m-1}\\0&\lambda^m\end{bmatrix}. \]

The off-diagonal term \(m\lambda^{m-1}\) is the polynomial correction caused by the missing eigenvector.
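The closed form can be spot-checked with a symbolic \(\lambda\) and several concrete exponents. This is a sanity check for small \(m\), not a proof; the helper `closed_form` is a hypothetical name used only here.

```python
import sympy as sp

lam = sp.Symbol('lambda')
J = sp.Matrix([[lam, 1], [0, lam]])

def closed_form(m):
    """Claimed formula for J**m from Example 8.14."""
    return sp.Matrix([[lam**m, m * lam**(m - 1)],
                      [0, lam**m]])

# Compare the direct power with the closed form for m = 1, ..., 8.
checks = [(J**m - closed_form(m)).expand() == sp.zeros(2, 2)
          for m in range(1, 9)]
print(all(checks))
```

The factor \(m\) in the off-diagonal entry grows linearly with the exponent, which is the polynomial correction mentioned above.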

8.11 Matrix exponentials

Matrix exponentials appear in differential equations

\[ \frac{d}{dt}x(t)=Ax(t), \qquad x(t)=e^{tA}x(0). \]

For a Jordan block,

\[ e^{tJ}=e^{t(\lambda I+N)}=e^{\lambda t}e^{tN}. \]

Since \(N^k=0\),

\[ e^{tN}=I+tN+\frac{t^2}{2!}N^2+\cdots+\frac{t^{k-1}}{(k-1)!}N^{k-1}. \]

Proposition 8.15: Exponential of a Jordan block

If \(J=J_{\lambda,k}=\lambda I+N\), then

\[ e^{tJ}=e^{\lambda t}\sum_{r=0}^{k-1}\frac{t^r}{r!}N^r. \]

Proof

The matrices \(\lambda I\) and \(N\) commute. Hence

\[ e^{t(\lambda I+N)}=e^{t\lambda I}e^{tN}=e^{\lambda t}e^{tN}. \]

Because \(N\) is nilpotent, the exponential series for \(e^{tN}\) stops after finitely many terms.
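One way to check the formula without calling a built-in matrix exponential is to use the fact that \(e^{tJ}\) is the unique solution of \(X'(t)=JX(t)\) with \(X(0)=I\). The sketch below builds the finite sum for \(k=3\) and tests both conditions symbolically; the choice \(k=3\) and the variable names are assumptions made for illustration.

```python
import sympy as sp

lam, t = sp.symbols('lambda t')
k = 3

# Nilpotent shift and Jordan block of size k.
N = sp.zeros(k, k)
for i in range(k - 1):
    N[i, i + 1] = 1
J = lam * sp.eye(k) + N

# Candidate for e^{tJ}: e^{lam*t} * sum_{r<k} t^r / r! * N^r.
finite_sum = sum((t**r / sp.factorial(r) * N**r for r in range(k)),
                 sp.zeros(k, k))
X = sp.exp(lam * t) * finite_sum

# The two defining properties of the matrix exponential.
residual = (X.diff(t) - J * X).applyfunc(sp.simplify)  # should vanish
initial = X.subs(t, 0)                                 # should be the identity

print(residual)
print(initial)
```

The entries of `X` are \(e^{\lambda t}\) times polynomials in \(t\) of degree at most \(k-1\), which is where the polynomial growth in solutions of \(x'=Ax\) comes from.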

8.12 Minimal polynomial and largest Jordan blocks

Definition 8.16: Minimal polynomial

The minimal polynomial of a square matrix \(A\) is the monic polynomial \(m_A(t)\) of least degree such that

\[ m_A(A)=0. \]

Theorem 8.17: Minimal polynomial and Jordan blocks

Suppose \(A\) has distinct eigenvalues \(\lambda_1,\ldots,\lambda_s\). For each eigenvalue \(\lambda_i\), let \(r_i\) be the size of the largest Jordan block associated with \(\lambda_i\). Then

\[ m_A(t)=\prod_{i=1}^s(t-\lambda_i)^{r_i}. \]

Proof idea

A Jordan block of size \(r\) for eigenvalue \(\lambda\) is killed by \((t-\lambda)^r\) but not by any smaller power. If several blocks share the same eigenvalue, the largest block requires the largest exponent. Multiplying over all distinct eigenvalues kills all blocks.
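The role of the largest block can be seen by evaluating candidate polynomials at a matrix. The sketch below uses an illustrative Jordan matrix with blocks \(J_{1,2}\), \(J_{1,1}\), and \(J_{-2,1}\) (an assumption for this example), so the theorem predicts \(m_A(t)=(t-1)^2(t+2)\).

```python
import sympy as sp

# Illustrative Jordan matrix: blocks J_{1,2}, J_{1,1}, J_{-2,1}.
A = sp.Matrix([[1, 1, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, -2]])
I = sp.eye(4)
zero = sp.zeros(4, 4)

# (t - 1)^2 (t + 2) evaluated at A: predicted minimal polynomial.
candidate = (A - I)**2 * (A + 2 * I)

# (t - 1)(t + 2) evaluated at A: exponent too small for the 2x2 block.
too_small = (A - I) * (A + 2 * I)

# (t - 1)^3 (t + 2) evaluated at A: the characteristic polynomial.
characteristic = (A - I)**3 * (A + 2 * I)

print("candidate vanishes:", candidate == zero)
print("smaller exponent vanishes:", too_small == zero)
print("characteristic polynomial vanishes:", characteristic == zero)
```

The characteristic polynomial also annihilates \(A\), but it has degree \(4\), while the minimal polynomial here has degree \(3\).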

8.13 Numerical warning

Jordan form is powerful for exact theory, but it is unstable for numerical computation. A tiny perturbation can change the Jordan structure.

For example,

\[ \begin{bmatrix}1&1\\0&1\end{bmatrix} \]

is not diagonalizable, but

\[ \begin{bmatrix}1&1\\0&1+\varepsilon\end{bmatrix} \]

has two distinct eigenvalues when \(\varepsilon\ne0\), so it is diagonalizable.
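The price of this sudden diagonalizability shows up in the eigenvector matrix. The sketch below uses exact arithmetic with an illustrative value \(\varepsilon=10^{-6}\) and a hand-built eigenvector matrix \(P\); both are assumptions chosen for this example.

```python
import sympy as sp

eps = sp.Rational(1, 10**6)  # illustrative perturbation size
A0 = sp.Matrix([[1, 1], [0, 1]])
A_eps = sp.Matrix([[1, 1], [0, 1 + eps]])

# Eigenvectors of A_eps: (1, 0) for eigenvalue 1 and (1, eps) for 1 + eps.
P = sp.Matrix([[1, 1], [0, eps]])
D = sp.diag(1, 1 + eps)
assert A_eps == P * D * P.inv()

# The two eigenvectors are nearly parallel, so P^{-1} has huge entries.
largest_entry = max(abs(x) for x in P.inv())

print("A0 diagonalizable:", A0.is_diagonalizable())
print("A_eps diagonalizable:", A_eps.is_diagonalizable())
print("largest entry of P^{-1}:", largest_entry)
```

As \(\varepsilon\to0\) the entries of \(P^{-1}\) blow up like \(1/\varepsilon\), so the diagonalization exists but is badly conditioned.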

Warning: Practical message

Jordan form is a theoretical microscope. In numerical linear algebra, stable tools such as Schur decomposition, QR methods, and SVD are often preferred.

8.14 AI companion activities

8.14.1 Activity A: Explain generalized eigenvectors

Ask an AI tool:

Explain the difference between an eigenvector and a generalized eigenvector using the matrix \(\begin{bmatrix}2&1\\0&2\end{bmatrix}\).

Then verify its answer by computing \(\ker(A-2I)\) and \(\ker((A-2I)^2)\).

8.14.2 Activity B: Find the chain

Ask:

For \(A=\begin{bmatrix}2&1\\0&2\end{bmatrix}\), find a Jordan chain and explain why it gives a Jordan block.

Check that

\[ (A-2I)e_1=0, \qquad (A-2I)e_2=e_1. \]

8.14.3 Activity C: Compare diagonalization and Jordan form

Ask:

Why does a nontrivial Jordan block create polynomial factors in \(A^m\) and \(e^{tA}\)?

Then connect the answer to the formulas in Sections 8.10 and 8.11.

8.15 Challenge questions

8.15.1 Challenge 1

Let

\[ A=\begin{bmatrix}3&1&0\\0&3&1\\0&0&3\end{bmatrix}. \]

Find \(A^m\).

Solution

Write \(A=3I+N\), where

\[ N=\begin{bmatrix}0&1&0\\0&0&1\\0&0&0\end{bmatrix}, \qquad N^3=0. \]

Therefore

\[ A^m=3^mI+m3^{m-1}N+\binom{m}{2}3^{m-2}N^2. \]

Thus

\[ A^m= \begin{bmatrix} 3^m&m3^{m-1}&\binom{m}{2}3^{m-2}\\ 0&3^m&m3^{m-1}\\ 0&0&3^m \end{bmatrix}. \]

8.15.2 Challenge 2

Suppose for one eigenvalue \(\lambda\) the kernel dimensions are

\[ d_1=3, \qquad d_2=5, \qquad d_3=6. \]

Find the Jordan block sizes.

Solution

We compute

\[ d_1-d_0=3, \qquad d_2-d_1=2, \qquad d_3-d_2=1. \]

So there are three blocks of size at least \(1\), two blocks of size at least \(2\), and one block of size at least \(3\). Hence the block sizes are

\[ 3,2,1. \]

8.16 Practice problems

8.16.1 Problem 1

Determine whether

\[ A=\begin{bmatrix}4&1\\0&4\end{bmatrix} \]

is diagonalizable.

Solution

The only eigenvalue is \(4\). Since

\[ A-4I=\begin{bmatrix}0&1\\0&0\end{bmatrix}, \]

the eigenspace is one-dimensional. Therefore \(A\) is not diagonalizable.

8.16.2 Problem 2

Find a Jordan chain for

\[ A=\begin{bmatrix}4&1\\0&4\end{bmatrix}. \]

Solution

Let \(v_1=e_1\) and \(v_2=e_2\). Then

\[ (A-4I)v_1=0, \qquad (A-4I)v_2=v_1. \]

Thus \(v_1,v_2\) is a Jordan chain.

8.16.3 Problem 3

Let \(J=J_{2,2}\). Compute \(J^5\).

Solution

For a \(2\times2\) Jordan block,

\[ J^m=\begin{bmatrix}2^m&m2^{m-1}\\0&2^m\end{bmatrix}. \]

Thus

\[ J^5=\begin{bmatrix}32&80\\0&32\end{bmatrix}. \]

8.16.4 Problem 4

If the largest Jordan block for eigenvalue \(1\) has size \(3\) and the largest Jordan block for eigenvalue \(-2\) has size \(1\), find the minimal polynomial.

Solution

The minimal polynomial is

\[ m_A(t)=(t-1)^3(t+2). \]

8.17 Summary

Jordan canonical form explains the precise structure behind the failure of diagonalization.

| Concept | Meaning |
|---|---|
| Eigenvector | A vector killed by \(A-\lambda I\) |
| Generalized eigenvector | A vector eventually killed by powers of \(A-\lambda I\) |
| Jordan chain | A sequence moved downward by \(A-\lambda I\) |
| Jordan block | The matrix representation of one chain |
| Diagonalizable matrix | All Jordan blocks have size \(1\) |
| Minimal polynomial | Records the largest Jordan block for each eigenvalue |

The central message is:

\[ \boxed{\text{Jordan form replaces missing eigenvectors with generalized eigenvector chains.}} \]