Generalized eigenvectors, nilpotent motion, and the hidden structure of repeated eigenvalues
Guiding question.
If a matrix does not have enough eigenvectors to be diagonalized, what is the next best coordinate system?
In Chapter 7, diagonalization told a beautiful story. If a matrix has enough eigenvectors, then we can choose an eigenvector basis and the matrix becomes diagonal:
\[ A=PDP^{-1}. \]
In that coordinate system, every coordinate evolves independently. Powers, exponentials, and dynamical systems become easy.
But not every matrix has enough eigenvectors. A repeated eigenvalue may give only one eigendirection. At first, this looks like failure. Jordan canonical form explains that the failure is structured. Missing eigenvectors are replaced by generalized eigenvectors, and these generalized eigenvectors form chains. Each chain becomes one Jordan block.
Jordan form is diagonalization plus correction terms for missing eigenvectors:
\[ \boxed{A=PJP^{-1},\qquad J=\text{block diagonal matrix of Jordan blocks}.} \]
A diagonal matrix is the special case in which every Jordan block has size \(1\).
Ask an AI tool to explain the difference between an eigenvector and a generalized eigenvector using a simple \(2\times2\) Jordan block. Then verify the formulas by hand and with Python.
Consider
\[ A=\begin{bmatrix}2&1\\0&2\end{bmatrix}. \]
The only eigenvalue is \(\lambda=2\) because
\[ \det(A-\lambda I)=(2-\lambda)^2. \]
But
\[ A-2I=\begin{bmatrix}0&1\\0&0\end{bmatrix}, \]
so
\[ \ker(A-2I)=\left\{\begin{bmatrix}x\\0\end{bmatrix}:x\in\mathbb R\right\}. \]
Thus the eigenspace is one-dimensional, even though the algebraic multiplicity of \(2\) is two. We do not have enough eigenvectors to diagonalize \(A\).
A repeated eigenvalue does not automatically cause a problem. The problem occurs when the eigenspace is too small.
A \(2\times2\) matrix is diagonalizable over \(\mathbb R\) if and only if it has two linearly independent eigenvectors. Here the eigenspace for the only eigenvalue \(2\) is one-dimensional, so there is only one independent eigenvector. Hence \(A\) is not diagonalizable.
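The multiplicity count above can be checked with SymPy. This is a minimal sketch; the variable names are illustrative, and it relies only on `nullspace`, `eigenvals`, and `is_diagonalizable`.

```python
import sympy as sp

A = sp.Matrix([[2, 1], [0, 2]])

# Basis of the eigenspace ker(A - 2I); its length is the geometric multiplicity
eigenspace_basis = (A - 2 * sp.eye(2)).nullspace()
geometric_multiplicity = len(eigenspace_basis)

# eigenvals() maps each eigenvalue to its algebraic multiplicity
algebraic_multiplicity = A.eigenvals()[2]

print(geometric_multiplicity, algebraic_multiplicity)
print(A.is_diagonalizable())
```

The geometric multiplicity is smaller than the algebraic multiplicity, which is exactly the situation in which diagonalization fails.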
A square matrix \(N\) is called nilpotent if there exists a positive integer \(k\) such that
\[ N^k=0. \]
The smallest such \(k\) is called the nilpotency index of \(N\).
A basic nilpotent matrix is
\[ N=\begin{bmatrix}0&1&0\\0&0&1\\0&0&0\end{bmatrix}. \]
It satisfies
\[ N^2=\begin{bmatrix}0&0&1\\0&0&0\\0&0&0\end{bmatrix}, \qquad N^3=0. \]
So \(N\) is nilpotent of index \(3\).
Nilpotent means “eventually zero.” A nilpotent matrix may move a vector at first, but after enough applications, every vector is sent to zero.
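As a small sketch of the definition, the helper below (a hypothetical function, not part of SymPy) searches for the nilpotency index by repeated multiplication. It assumes the standard fact that an \(n\times n\) nilpotent matrix satisfies \(N^n=0\), so checking powers up to \(n\) is enough.

```python
import sympy as sp

def nilpotency_index(N):
    """Return the smallest k >= 1 with N**k == 0, or None if N is not nilpotent."""
    n = N.shape[0]
    power = sp.eye(n)
    for k in range(1, n + 1):
        power = power * N              # power now holds N**k
        if power == sp.zeros(n, n):
            return k
    return None                        # no power up to n vanished

N = sp.Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
print(nilpotency_index(N))             # the shift matrix from the text
print(nilpotency_index(sp.eye(3)))     # the identity is not nilpotent
```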
The only eigenvalue of a nilpotent matrix is \(0\). Indeed, suppose \(Nv=\lambda v\) with \(v\neq0\). If \(N^k=0\), then
\[ 0=N^kv=\lambda^k v. \]
Since \(v\neq0\), this implies \(\lambda^k=0\), hence \(\lambda=0\).
For \(\lambda\in\mathbb F\) and \(k\ge1\), the \(k\times k\) Jordan block with eigenvalue \(\lambda\) is
\[ J_{\lambda,k}=\begin{bmatrix} \lambda&1&0&\cdots&0\\ 0&\lambda&1&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots\\ 0&0&\cdots&\lambda&1\\ 0&0&\cdots&0&\lambda \end{bmatrix}. \]
A Jordan block can be written as
\[ J_{\lambda,k}=\lambda I+N_k, \]
where
\[ N_k=\begin{bmatrix} 0&1&0&\cdots&0\\ 0&0&1&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots\\ 0&0&\cdots&0&1\\ 0&0&\cdots&0&0 \end{bmatrix}. \]
The matrix \(N_k\) is nilpotent.
On a Jordan block \(J_{\lambda,k}\), the matrix action is the sum of two parts: scaling by \(\lambda\) (the term \(\lambda I\)) and the nilpotent shift \(N_k\).
Thus a Jordan block represents an eigendirection together with a chain of generalized directions.
The formula \(J_{\lambda,k}=\lambda I+N_k\) is immediate from the entries. The diagonal entries are \(\lambda\), and the only off-diagonal nonzero entries are the \(1\)’s just above the diagonal. Those \(1\)’s form the nilpotent shift \(N_k\).
Let \(A\in\mathbb F^{n\times n}\) and let \(\lambda\) be an eigenvalue of \(A\). A nonzero vector \(v\) is a generalized eigenvector for \(\lambda\) if
\[ (A-\lambda I)^m v=0 \]
for some positive integer \(m\).
Ordinary eigenvectors correspond to \(m=1\).
A sequence of nonzero vectors
\[ v_1,v_2,\ldots,v_k \]
is called a Jordan chain for \(A\) with eigenvalue \(\lambda\) if
\[ (A-\lambda I)v_1=0, \]
and
\[ (A-\lambda I)v_j=v_{j-1},\qquad j=2,\ldots,k. \]
In this chain, \(v_1\) is an ordinary eigenvector. The vectors \(v_2,\ldots,v_k\) are generalized eigenvectors.
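If \(v_1,\ldots,v_k\) is a Jordan chain for \(A\) with eigenvalue \(\lambda\), then the chain vectors are linearly independent, their span is invariant under \(A\), and the matrix of \(A\) restricted to that span in the ordered basis
\[ v_1,v_2,\ldots,v_k \]
is the Jordan block \(J_{\lambda,k}\).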
From the chain conditions,
\[ (A-\lambda I)v_1=0, \qquad (A-\lambda I)v_j=v_{j-1}\quad (j\ge2). \]
Therefore
\[ Av_1=\lambda v_1, \]
and
\[ Av_j=v_{j-1}+\lambda v_j\quad (j\ge2). \]
These equations say that the coordinate columns of \(A\) in the basis \(v_1,\ldots,v_k\) are exactly the columns of the Jordan block.
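The chain equations can be tested on a matrix that is not already in Jordan form. The matrix \(A\) and the vectors \(v_1,v_2\) below are illustrative choices, not taken from the text; the sketch checks the chain conditions and then changes basis.

```python
import sympy as sp

# Illustrative matrix with characteristic polynomial (t - 2)^2
A = sp.Matrix([[3, 1], [-1, 1]])
lam = 2
B = A - lam * sp.eye(2)

v1 = sp.Matrix([1, -1])   # ordinary eigenvector: B*v1 = 0
v2 = sp.Matrix([1, 0])    # generalized eigenvector: B*v2 = v1

assert B * v1 == sp.zeros(2, 1)
assert B * v2 == v1

# Columns of P are the chain vectors in order v1, v2
P = sp.Matrix.hstack(v1, v2)
J = P.inv() * A * P
print(J)
```

In the chain basis the matrix becomes the Jordan block \(J_{2,2}\), as the column-by-column argument predicts.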
The generalized eigenspace of \(A\) associated with \(\lambda\) is
\[ G_\lambda=\ker\bigl((A-\lambda I)^n\bigr), \]
where \(A\) is an \(n\times n\) matrix.
The ordinary eigenspace is
\[ E_\lambda=\ker(A-\lambda I). \]
The generalized eigenspace may be larger. It consists of the zero vector together with every generalized eigenvector for \(\lambda\), that is, every vector sent to zero by some power of \(A-\lambda I\).
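To see the two spaces side by side, here is a short sketch for the matrix \(\begin{bmatrix}2&1\\0&2\end{bmatrix}\) used earlier. It computes bases of \(\ker(A-2I)\) and \(\ker((A-2I)^n)\) with SymPy's `nullspace`; the variable names are illustrative.

```python
import sympy as sp

A = sp.Matrix([[2, 1], [0, 2]])
n = A.shape[0]
N = A - 2 * sp.eye(n)

E_basis = N.nullspace()        # ordinary eigenspace E_2 = ker(A - 2I)
G_basis = (N**n).nullspace()   # generalized eigenspace G_2 = ker((A - 2I)^n)

print(len(E_basis), len(G_basis))
```

Here \((A-2I)^2=0\), so the generalized eigenspace is all of \(\mathbb R^2\) even though the eigenspace is only a line.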
Assume the characteristic polynomial of \(A\in\mathbb C^{n\times n}\) splits as
\[ p_A(t)=\prod_{i=1}^s(t-\lambda_i)^{a_i}. \]
Then
\[ \mathbb C^n=G_{\lambda_1}\oplus\cdots\oplus G_{\lambda_s}. \]
The main idea is that different factors \((t-\lambda_i)^{a_i}\) of the characteristic polynomial are relatively prime. Using polynomial identities, one builds projection operators onto the corresponding generalized eigenspaces. This separates the whole space into independent generalized eigenspace components.
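The decomposition can be illustrated numerically on a small example. The matrix below is an assumed test case with eigenvalues \(2\) (algebraic multiplicity two) and \(3\); the sketch collects a basis of each generalized eigenspace and checks that together they span the whole space.

```python
import sympy as sp

# Illustrative upper triangular matrix: eigenvalue 2 twice, eigenvalue 3 once
A = sp.Matrix([[2, 1, 1], [0, 2, 1], [0, 0, 3]])
n = A.shape[0]

basis = []
dims = {}
for lam, alg_mult in A.eigenvals().items():
    # Basis of G_lam = ker((A - lam*I)^n)
    G = ((A - lam * sp.eye(n))**n).nullspace()
    dims[lam] = len(G)
    basis.extend(G)

# Stack all generalized eigenvectors as columns; full rank means a direct sum
P = sp.Matrix.hstack(*basis)
print(dims)
print(P.rank())
```

Each generalized eigenspace has dimension equal to the algebraic multiplicity of its eigenvalue, and the combined basis has full rank.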
Let \(A\in\mathbb C^{n\times n}\). Then there exists an invertible matrix \(P\) such that
\[ A=PJP^{-1}, \]
where \(J\) is block diagonal and each block is a Jordan block:
\[ J=\begin{bmatrix} J_{\lambda_1,k_1}&&0\\ &\ddots&\\ 0&&J_{\lambda_r,k_r} \end{bmatrix}. \]
The matrix \(J\) is called a Jordan canonical form of \(A\).
The theorem says that every complex square matrix can be understood by generalized eigenvector chains.
Diagonalization asks for enough eigenvectors. Jordan form asks for enough generalized eigenvectors. Over \(\mathbb C\), generalized eigenvectors always provide a basis.
The proof has two layers. First, decompose \(\mathbb C^n\) into generalized eigenspaces. Second, on each generalized eigenspace, study the nilpotent operator \(N=A-\lambda I\). Nilpotent operators can be organized into chains. Each chain gives one Jordan block.
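SymPy can compute a Jordan form exactly for small matrices with `Matrix.jordan_form`, which returns a pair \((P,J)\) with \(A=PJP^{-1}\). The matrix below is an illustrative choice; this is a sketch for checking hand computations, not a numerically robust method.

```python
import sympy as sp

# Illustrative non-diagonalizable matrix with the single eigenvalue 2
A = sp.Matrix([[3, 1], [-1, 1]])

P, J = A.jordan_form()

print(J)
print(P * J * P.inv() == A)   # confirms A = P J P^{-1}
```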
A diagonal matrix is a Jordan form in which every Jordan block has size \(1\):
\[ J_{\lambda,1}=[\lambda]. \]
A matrix \(A\) is diagonalizable over \(\mathbb C\) if and only if every Jordan block of \(A\) has size \(1\).
If every Jordan block has size \(1\), then the Jordan matrix is diagonal, so \(A\) is similar to a diagonal matrix. Conversely, if \(A\) is diagonalizable, it has a basis of ordinary eigenvectors. In such a basis, the matrix is diagonal, so no Jordan block of size larger than \(1\) can appear.
Let
\[ N=A-\lambda I. \]
Define
\[ d_j=\dim\ker(N^j),\qquad j=1,2,\ldots. \]
For a fixed eigenvalue \(\lambda\), the number
\[ d_j-d_{j-1} \]
equals the number of Jordan blocks for \(\lambda\) of size at least \(j\), where \(d_0=0\).
Therefore the number of Jordan blocks of size exactly \(j\) is
\[ (d_j-d_{j-1})-(d_{j+1}-d_j). \]
For one Jordan block of size \(k\), the nilpotent part \(N\) satisfies
\[ \dim\ker(N^j)=\min(j,k). \]
Thus this block contributes \(1\) to \(d_j-d_{j-1}\) exactly when \(j\le k\). Adding over all blocks gives the formula.
Suppose the algebraic multiplicity of \(\lambda\) is \(5\) and
\[ d_1=2, \qquad d_2=4, \qquad d_3=5. \]
Then
\[ d_1-d_0=2, \qquad d_2-d_1=2, \qquad d_3-d_2=1. \]
So there are two blocks of size at least \(1\), two blocks of size at least \(2\), and one block of size at least \(3\). Therefore the block sizes are
\[ 3 \quad\text{and}\quad 2. \]
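The counting rule can be turned into a few lines of plain Python. The function name below is hypothetical; it assumes the input list \([d_1,d_2,\ldots,d_m]\) ends at the point where the kernel dimensions stop growing, so that \(d_{m+1}-d_m=0\).

```python
def block_sizes_from_kernel_dims(dims):
    """Given [d_1, ..., d_m] with d_j = dim ker(N^j), return the Jordan block sizes."""
    # at_least[j-1] = d_j - d_{j-1} = number of blocks of size at least j (d_0 = 0)
    at_least = []
    prev = 0
    for d in dims:
        at_least.append(d - prev)
        prev = d
    at_least.append(0)  # no blocks larger than m once the dimensions stabilize

    sizes = []
    for j in range(len(dims)):
        exactly = at_least[j] - at_least[j + 1]   # blocks of size exactly j+1
        sizes.extend([j + 1] * exactly)
    return sorted(sizes, reverse=True)

print(block_sizes_from_kernel_dims([2, 4, 5]))
```

For the dimensions \(d_1=2,\ d_2=4,\ d_3=5\) this returns the sizes \(3\) and \(2\), matching the hand computation.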
The following code constructs a Jordan block and compares direct powers with the binomial formula.
```python
import sympy as sp

lam = sp.Symbol('lambda')
k = 3

# Nilpotent shift N_k: ones on the superdiagonal
N = sp.zeros(k)
for i in range(k-1):
    N[i, i+1] = 1

# Jordan block J = lambda*I + N
J = lam*sp.eye(k) + N
J
```
\(\displaystyle \left[\begin{matrix}\lambda & 1 & 0\\0 & \lambda & 1\\0 & 0 & \lambda\end{matrix}\right]\)
For a Jordan block,
\[ J_{\lambda,k}^m=(\lambda I+N)^m =\sum_{r=0}^{k-1}\binom{m}{r}\lambda^{m-r}N^r. \]
```python
lam_value = 2
m = 5

# Substitute a numeric eigenvalue and take the m-th power directly
J2 = J.subs(lam, lam_value)
J2**m
```
\(\displaystyle \left[\begin{matrix}32 & 80 & 80\\0 & 32 & 80\\0 & 0 & 32\end{matrix}\right]\)
Let
\[ J=J_{\lambda,k}=\lambda I+N, \]
where \(N^k=0\). Then for every positive integer \(m\),
\[ J^m=\sum_{r=0}^{k-1}\binom{m}{r}\lambda^{m-r}N^r. \]
Since \(\lambda I\) commutes with \(N\), the binomial theorem gives
\[ (\lambda I+N)^m=\sum_{r=0}^{m}\binom{m}{r}(\lambda I)^{m-r}N^r. \]
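But \(N^r=0\) for \(r\ge k\), so every term with \(r\ge k\) vanishes. When \(m<k-1\), the extra terms with \(m<r\le k-1\) in the stated sum are read as zero because \(\binom{m}{r}=0\) for \(r>m\). Therefore only the terms \(r=0,1,\ldots,k-1\) remain.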
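The binomial formula can be compared with a direct matrix power symbolically. The sketch below rebuilds a \(3\times3\) Jordan block so that it runs on its own; the choices \(k=3\) and \(m=5\) are illustrative.

```python
import sympy as sp

lam = sp.Symbol('lambda')
k, m = 3, 5

# Nilpotent shift and Jordan block J = lambda*I + N
N = sp.zeros(k, k)
for i in range(k - 1):
    N[i, i + 1] = 1
J = lam * sp.eye(k) + N

# Left side: direct power. Right side: truncated binomial sum over r = 0..k-1.
direct = J**m
formula = sp.zeros(k, k)
for r in range(k):
    formula += sp.binomial(m, r) * lam**(m - r) * N**r

difference = (direct - formula).applyfunc(sp.expand)
print(difference == sp.zeros(k, k))
```

Expanding the entrywise difference gives the zero matrix, so the two computations agree for this choice of \(k\) and \(m\).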
Let
\[ J=\begin{bmatrix}\lambda&1\\0&\lambda\end{bmatrix}. \]
Then
\[ J^m=\begin{bmatrix}\lambda^m&m\lambda^{m-1}\\0&\lambda^m\end{bmatrix}. \]
The off-diagonal term \(m\lambda^{m-1}\) is the polynomial correction caused by the missing eigenvector.
Matrix exponentials solve linear systems of differential equations:
\[ \frac{d}{dt}x(t)=Ax(t), \qquad x(t)=e^{tA}x(0). \]
For a Jordan block,
\[ e^{tJ}=e^{t(\lambda I+N)}=e^{\lambda t}e^{tN}. \]
Since \(N^k=0\),
\[ e^{tN}=I+tN+\frac{t^2}{2!}N^2+\cdots+\frac{t^{k-1}}{(k-1)!}N^{k-1}. \]
If \(J=J_{\lambda,k}=\lambda I+N\), then
\[ e^{tJ}=e^{\lambda t}\sum_{r=0}^{k-1}\frac{t^r}{r!}N^r. \]
The matrices \(\lambda I\) and \(N\) commute. Hence
\[ e^{t(\lambda I+N)}=e^{t\lambda I}e^{tN}=e^{\lambda t}e^{tN}. \]
Because \(N\) is nilpotent, the exponential series for \(e^{tN}\) stops after finitely many terms.
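One way to check the closed form without computing a matrix exponential directly is to use the fact that \(e^{tJ}\) is the unique matrix function \(E(t)\) with \(E'(t)=JE(t)\) and \(E(0)=I\). The sketch below builds the right-hand side of the formula for an illustrative \(3\times3\) block and tests both conditions symbolically.

```python
import sympy as sp

lam, t = sp.symbols('lambda t')
k = 3

# Nilpotent shift and Jordan block J = lambda*I + N
N = sp.zeros(k, k)
for i in range(k - 1):
    N[i, i + 1] = 1
J = lam * sp.eye(k) + N

# Candidate for e^{tJ}: e^{lambda t} * sum_{r<k} t^r / r! * N^r
E = sp.zeros(k, k)
for r in range(k):
    E += t**r / sp.factorial(r) * N**r
E = sp.exp(lam * t) * E

residual = (E.diff(t) - J * E).applyfunc(sp.simplify)   # should vanish
initial = E.subs(t, 0)                                  # should be the identity

print(residual == sp.zeros(k, k), initial == sp.eye(k))
```

Both conditions hold, so the finite sum is the matrix exponential for this block; the polynomial factors \(t\) and \(t^2/2\) appear above the diagonal.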
The minimal polynomial of a square matrix \(A\) is the monic polynomial \(m_A(t)\) of least degree such that
\[ m_A(A)=0. \]
Suppose the distinct eigenvalues of \(A\) are \(\lambda_1,\ldots,\lambda_s\). For each eigenvalue \(\lambda_i\), let \(r_i\) be the size of the largest Jordan block associated with \(\lambda_i\). Then
\[ m_A(t)=\prod_{i=1}^s(t-\lambda_i)^{r_i}. \]
A Jordan block of size \(r\) for eigenvalue \(\lambda\) is killed by \((t-\lambda)^r\) but not by any smaller power. If several blocks share the same eigenvalue, the largest block requires the largest exponent. Multiplying over all distinct eigenvalues kills all blocks.
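As a concrete sketch, take an assumed matrix built from one block \(J_{1,3}\) and one block \(J_{-2,1}\). The code checks that \((t-1)^3(t+2)\) annihilates it while the smaller exponent \((t-1)^2(t+2)\) does not.

```python
import sympy as sp

# Illustrative block diagonal matrix diag(J_{1,3}, J_{-2,1})
J1 = sp.Matrix([[1, 1, 0], [0, 1, 1], [0, 0, 1]])
J2 = sp.Matrix([[-2]])
A = sp.diag(J1, J2)
I = sp.eye(4)

kills = (A - I)**3 * (A + 2 * I)       # candidate minimal polynomial evaluated at A
too_small = (A - I)**2 * (A + 2 * I)   # exponent lowered by one

print(kills == sp.zeros(4, 4))
print(too_small == sp.zeros(4, 4))
```

The first product is zero and the second is not, which is consistent with the exponent being the size of the largest block for each eigenvalue.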
Jordan form is powerful for exact theory, but it is unstable for numerical computation. A tiny perturbation can change the Jordan structure.
For example,
\[ \begin{bmatrix}1&1\\0&1\end{bmatrix} \]
is not diagonalizable, but
\[ \begin{bmatrix}1&1\\0&1+\varepsilon\end{bmatrix} \]
has two distinct eigenvalues when \(\varepsilon\ne0\), so it is diagonalizable.
Jordan form is a theoretical microscope. In numerical linear algebra, stable tools such as Schur decomposition, QR methods, and SVD are often preferred.
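The jump in Jordan structure under perturbation can be observed in exact arithmetic. The value \(\varepsilon=10^{-6}\) below is an arbitrary illustrative choice; the sketch only reports whether each matrix is diagonalizable.

```python
import sympy as sp

eps = sp.Rational(1, 10**6)   # small exact perturbation (illustrative value)

A0 = sp.Matrix([[1, 1], [0, 1]])
A_eps = sp.Matrix([[1, 1], [0, 1 + eps]])

print(A0.is_diagonalizable())      # single 2x2 Jordan block
print(A_eps.is_diagonalizable())   # two distinct eigenvalues

# The perturbed matrix has an exact diagonalization A_eps = P D P^{-1}
P, D = A_eps.diagonalize()
print(P * D * P.inv() == A_eps)
```

An arbitrarily small change in one entry switches the Jordan form from a single \(2\times2\) block to two \(1\times1\) blocks, which is why floating-point software generally avoids computing Jordan forms.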
Ask an AI tool:
Explain the difference between an eigenvector and a generalized eigenvector using the matrix \(\begin{bmatrix}2&1\\0&2\end{bmatrix}\).
Then verify its answer by computing \(\ker(A-2I)\) and \(\ker((A-2I)^2)\).
Ask:
For \(A=\begin{bmatrix}2&1\\0&2\end{bmatrix}\), find a Jordan chain and explain why it gives a Jordan block.
Check that
\[ (A-2I)e_1=0, \qquad (A-2I)e_2=e_1. \]
Ask:
Why does a nontrivial Jordan block create polynomial factors in \(A^m\) and \(e^{tA}\)?
Then connect the answer to the formulas in Sections 8.10 and 8.11.
Let
\[ A=\begin{bmatrix}3&1&0\\0&3&1\\0&0&3\end{bmatrix}. \]
Find \(A^m\).
Write \(A=3I+N\), where
\[ N=\begin{bmatrix}0&1&0\\0&0&1\\0&0&0\end{bmatrix}, \qquad N^3=0. \]
Therefore
\[ A^m=3^mI+m3^{m-1}N+\binom{m}{2}3^{m-2}N^2. \]
Thus
\[ A^m= \begin{bmatrix} 3^m&m3^{m-1}&\binom{m}{2}3^{m-2}\\ 0&3^m&m3^{m-1}\\ 0&0&3^m \end{bmatrix}. \]
Suppose for one eigenvalue \(\lambda\) the kernel dimensions are
\[ d_1=3, \qquad d_2=5, \qquad d_3=6. \]
Find the Jordan block sizes.
We compute
\[ d_1-d_0=3, \qquad d_2-d_1=2, \qquad d_3-d_2=1. \]
So there are three blocks of size at least \(1\), two blocks of size at least \(2\), and one block of size at least \(3\). Hence the block sizes are
\[ 3,2,1. \]
Determine whether
\[ A=\begin{bmatrix}4&1\\0&4\end{bmatrix} \]
is diagonalizable.
The only eigenvalue is \(4\). Since
\[ A-4I=\begin{bmatrix}0&1\\0&0\end{bmatrix}, \]
the eigenspace is one-dimensional. Therefore \(A\) is not diagonalizable.
Find a Jordan chain for
\[ A=\begin{bmatrix}4&1\\0&4\end{bmatrix}. \]
Let \(v_1=e_1\) and \(v_2=e_2\). Then
\[ (A-4I)v_1=0, \qquad (A-4I)v_2=v_1. \]
Thus \(v_1,v_2\) is a Jordan chain.
Let \(J=J_{2,2}\). Compute \(J^5\).
For a \(2\times2\) Jordan block,
\[ J^m=\begin{bmatrix}2^m&m2^{m-1}\\0&2^m\end{bmatrix}. \]
Thus
\[ J^5=\begin{bmatrix}32&80\\0&32\end{bmatrix}. \]
If the largest Jordan block for eigenvalue \(1\) has size \(3\) and the largest Jordan block for eigenvalue \(-2\) has size \(1\), find the minimal polynomial.
The minimal polynomial is
\[ m_A(t)=(t-1)^3(t+2). \]
Jordan canonical form explains the precise structure behind the failure of diagonalization.
| Concept | Meaning |
|---|---|
| Eigenvector | A nonzero vector killed by \(A-\lambda I\) |
| Generalized eigenvector | A nonzero vector killed by some power of \(A-\lambda I\) |
| Jordan chain | Vectors \(v_1,\ldots,v_k\) with \((A-\lambda I)v_1=0\) and \((A-\lambda I)v_j=v_{j-1}\) |
| Jordan block | The matrix representation of one chain |
| Diagonalizable matrix | All Jordan blocks have size \(1\) |
| Minimal polynomial | Records the largest Jordan block for each eigenvalue |
The central message is:
\[ \boxed{\text{Jordan form replaces missing eigenvectors with generalized eigenvector chains.}} \]