Chapter 5. Coordinates, Matrices, and Change of Basis

How one linear transformation can wear many coordinate costumes

Guiding question.
A vector can be written in many coordinate systems. A linear transformation can be represented by many matrices. What stays the same, and what changes, when we change the basis?

In Chapter 4 we learned that a basis is a coordinate system for a vector space. Once a basis is chosen, every vector becomes a column of numbers.

Chapter 5 takes the next step. If vectors can be translated into coordinate columns, then linear transformations can be translated into matrices. This is one of the deepest practical ideas in linear algebra:

\[ \text{abstract linear map} \quad \longleftrightarrow \quad \text{matrix after choosing bases}. \]

The matrix is not the transformation itself. It is a coordinate description of the transformation. When the basis changes, the matrix changes. The transformation remains the same.

This chapter is about that translation.

Note: How to read this chapter

Keep three ideas separate.

  1. A vector \(v\) is an object in a vector space \(V\).
  2. A coordinate vector \([v]_{\mathcal B}\) is a column of numbers describing \(v\) relative to a basis \(\mathcal B\).
  3. A matrix \([T]_{\mathcal C\leftarrow \mathcal B}\) describes a linear transformation \(T:V\to W\) after choosing a basis \(\mathcal B\) for the input and a basis \(\mathcal C\) for the output.

Tip: AI and coding companion

This chapter is a good place to use Python or an AI assistant as a checking partner. Ask it to compute coordinate vectors, change-of-basis matrices, and matrix representations. Then ask: Which basis is being used? What is the domain? What is the codomain? Does the matrix represent a vector or a map?

Code
import sympy as sp
sp.init_printing()

5.1 Coordinates relative to a basis

A basis is a dictionary. It gives names to directions. Once the dictionary is chosen, every vector has a unique sentence written in that dictionary.

5.1.1 Unique coordinates

Theorem 5.1: Unique representation theorem

Let \(V\) be a vector space over a field \(\mathbb F\), and let

\[ \mathcal B=(b_1,b_2,\ldots,b_n) \]

be a basis for \(V\). Then every vector \(v\in V\) can be written uniquely in the form

\[ v=c_1b_1+c_2b_2+\cdots+c_nb_n, \]

where \(c_1,\ldots,c_n\in\mathbb F\).

The word uniquely is essential. A spanning set lets us describe every vector, but possibly in many ways. A linearly independent spanning set describes every vector in exactly one way.
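The contrast between a spanning set and a basis can be seen in a small computation. The sketch below is illustrative only: the vector \((2,3)^T\), the spanning set \((e_1,e_2,e_1+e_2)\), and the names `S` and `E` are choices made here, not part of the theorem.

```python
import sympy as sp

c1, c2, c3 = sp.symbols("c1 c2 c3")
v = sp.Matrix([2, 3])

# Columns e1, e2, e1+e2 span R^2 but are linearly dependent.
S = sp.Matrix([[1, 0, 1],
               [0, 1, 1]])
spanning_solutions = sp.linsolve((S, v), [c1, c2, c3])

# Columns e1, e2 form a basis of R^2.
E = sp.Matrix([[1, 0],
               [0, 1]])
basis_solutions = sp.linsolve((E, v), [c1, c2])

print(spanning_solutions)  # a one-parameter family of coefficient triples
print(basis_solutions)     # a single coefficient pair
```

With the dependent spanning set, one coefficient remains free, so the same vector has infinitely many descriptions. With the basis, exactly one coefficient list survives.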

Definition 5.2: Coordinate vector

Let \(\mathcal B=(b_1,\ldots,b_n)\) be a basis for \(V\). If

\[ v=c_1b_1+\cdots+c_nb_n, \]

then the coordinate vector of \(v\) relative to \(\mathcal B\) is

\[ [v]_{\mathcal B}=\begin{bmatrix}c_1\\ \vdots\\ c_n\end{bmatrix}\in \mathbb F^n. \]

Warning: Common mistake

The vector \(v\) and the coordinate vector \([v]_{\mathcal B}\) are not usually the same object. The vector \(v\) lives in \(V\). The coordinate vector \([v]_{\mathcal B}\) lives in \(\mathbb F^n\). The basis \(\mathcal B\) is the translation rule.

5.1.2 Example: coordinates in \(\mathbb R^2\)

Let

\[ b_1=\begin{bmatrix}1\\1\end{bmatrix},\qquad b_2=\begin{bmatrix}-1\\2\end{bmatrix},\qquad \mathcal B=(b_1,b_2). \]

Find \([x]_{\mathcal B}\) for

\[ x=\begin{bmatrix}2\\8\end{bmatrix}. \]

We solve

\[ c_1\begin{bmatrix}1\\1\end{bmatrix} +c_2\begin{bmatrix}-1\\2\end{bmatrix} = \begin{bmatrix}2\\8\end{bmatrix}. \]

This gives

\[ c_1-c_2=2,\qquad c_1+2c_2=8. \]

Subtracting the first equation from the second gives \(3c_2=6\), so \(c_2=2\) and \(c_1=4\). Therefore

\[ [x]_{\mathcal B}=\begin{bmatrix}4\\2\end{bmatrix}. \]

SymPy verifies the calculation: with the basis vectors as the columns of \(P_{\mathcal B}\), the coordinates are the solution \(c\) of \(P_{\mathcal B}c=x\).

Code
b1 = sp.Matrix([1, 1])
b2 = sp.Matrix([-1, 2])
P_B = sp.Matrix.hstack(b1, b2)
x = sp.Matrix([2, 8])
coords = P_B.LUsolve(x)
P_B, coords

\(\displaystyle \left( \left[\begin{matrix}1 & -1\\1 & 2\end{matrix}\right], \ \left[\begin{matrix}4\\2\end{matrix}\right]\right)\)

The matrix

\[ P_{\mathcal B}=\begin{bmatrix}1&-1\\1&2\end{bmatrix} \]

reconstructs the vector from its coordinates:

\[ x=P_{\mathcal B}[x]_{\mathcal B}. \]

Code
P_B * coords

\(\displaystyle \left[\begin{matrix}2\\8\end{matrix}\right]\)

5.1.3 Example: coordinates in a polynomial space

Let \(V=P_2(\mathbb R)\) and let

\[ \mathcal B=(1,\ 1+t,\ 1+t+t^2). \]

Find the coordinate vector of

\[ p(t)=3+5t+2t^2 \]

relative to \(\mathcal B\).

We solve

\[ p(t)=c_1(1)+c_2(1+t)+c_3(1+t+t^2). \]

Matching coefficients gives

\[ c_1+c_2+c_3=3,\qquad c_2+c_3=5,\qquad c_3=2. \]

Therefore

\[ c_3=2,\qquad c_2=3,\qquad c_1=-2, \]

and hence

\[ [p]_{\mathcal B}=\begin{bmatrix}-2\\3\\2\end{bmatrix}. \]

Code
# Polynomial coordinates are found by matching coefficients.
# Basis: 1, 1+t, 1+t+t^2.
P = sp.Matrix([[1, 1, 1],   # constant coefficients
               [0, 1, 1],   # t coefficients
               [0, 0, 1]])  # t^2 coefficients
p_std = sp.Matrix([3, 5, 2])
P.LUsolve(p_std)

\(\displaystyle \left[\begin{matrix}-2\\3\\2\end{matrix}\right]\)

5.1.4 Example: coordinates in a matrix space

Let \(V=M_2(\mathbb R)\) and use the standard matrix basis \(\mathcal B=(E_{11},E_{12},E_{21},E_{22})\), where

\[ E_{11}=\begin{bmatrix}1&0\\0&0\end{bmatrix},\quad E_{12}=\begin{bmatrix}0&1\\0&0\end{bmatrix},\quad E_{21}=\begin{bmatrix}0&0\\1&0\end{bmatrix},\quad E_{22}=\begin{bmatrix}0&0\\0&1\end{bmatrix}. \]

For

\[ A=\begin{bmatrix}2&-1\\5&4\end{bmatrix}, \]

we have

\[ A=2E_{11}-E_{12}+5E_{21}+4E_{22}. \]

Therefore

\[ [A]_{\mathcal B}=\begin{bmatrix}2\\-1\\5\\4\end{bmatrix}. \]

This example shows a key idea: matrices are themselves vectors in \(M_2(\mathbb R)\), and once a basis is chosen each one is described by a coordinate column in \(\mathbb R^4\).
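The same bookkeeping can be checked in SymPy. This is a small sketch, assuming the basis order \(E_{11},E_{12},E_{21},E_{22}\) used above; the names `basis`, `coords`, and `rebuilt` are illustrative.

```python
import sympy as sp

# Standard basis of M_2(R), in the order E11, E12, E21, E22.
E11 = sp.Matrix([[1, 0], [0, 0]])
E12 = sp.Matrix([[0, 1], [0, 0]])
E21 = sp.Matrix([[0, 0], [1, 0]])
E22 = sp.Matrix([[0, 0], [0, 1]])
basis = [E11, E12, E21, E22]

A = sp.Matrix([[2, -1], [5, 4]])

# Reading the entries row by row gives the coordinate column.
coords = sp.Matrix([A[0, 0], A[0, 1], A[1, 0], A[1, 1]])

# Rebuild A as the linear combination of the basis matrices.
rebuilt = sp.zeros(2, 2)
for c, E in zip(coords, basis):
    rebuilt += c * E

print(coords.T, rebuilt == A)
```

A different ordering of the basis matrices would permute the entries of the coordinate column, which is why the basis is written as an ordered list.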

5.2 The coordinate map

A coordinate map converts an abstract vector-space problem into a problem in \(\mathbb F^n\) without losing the linear structure.

Definition 5.3: Coordinate map

Let \(\mathcal B=(b_1,\ldots,b_n)\) be a basis for a vector space \(V\). The map

\[ \Phi_{\mathcal B}:V\to \mathbb F^n, \qquad \Phi_{\mathcal B}(v)=[v]_{\mathcal B} \]

is called the coordinate map determined by \(\mathcal B\).

Theorem 5.4: Coordinate maps are isomorphisms

For every basis \(\mathcal B\) of an \(n\)-dimensional vector space \(V\), the coordinate map

\[ \Phi_{\mathcal B}:V\to \mathbb F^n \]

is a one-to-one and onto linear transformation. Therefore

\[ V\cong \mathbb F^n. \]

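Note: Proof idea

Linearity follows from the uniqueness of coordinates. If \(v=\sum a_ib_i\) and \(w=\sum d_ib_i\), then \(v+w=\sum (a_i+d_i)b_i\) and \(cv=\sum (ca_i)b_i\). Since coordinates are unique, these must be the coordinate expansions of \(v+w\) and \(cv\). Hence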

\[ [v+w]_{\mathcal B}=[v]_{\mathcal B}+[w]_{\mathcal B}, \qquad [cv]_{\mathcal B}=c[v]_{\mathcal B}. \]

Injectivity follows because \([v]_{\mathcal B}=0\) means all coordinates of \(v\) are zero, so \(v=0\). Surjectivity follows because every column \((c_1,\ldots,c_n)^T\) comes from the vector \(c_1b_1+\cdots+c_nb_n\).

Different bases give different coordinate maps. The vector is the same; the coordinate column changes.
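Linearity of the coordinate map can be checked symbolically in a concrete case. The sketch below assumes the basis \(((1,1)^T,(-1,2)^T)\) of \(\mathbb R^2\) from Section 5.1.2; the helper `coords` and the symbol names are illustrative.

```python
import sympy as sp

v1, v2, w1, w2, c = sp.symbols("v1 v2 w1 w2 c")
P_B = sp.Matrix([[1, -1], [1, 2]])   # basis vectors (1,1) and (-1,2) as columns

def coords(x):
    """B-coordinates of x in R^2: the solution of P_B * coords = x."""
    return P_B.inv() * x

v = sp.Matrix([v1, v2])
w = sp.Matrix([w1, w2])

# Both differences should be the zero column if the coordinate map is linear.
additivity_gap = (coords(v + w) - coords(v) - coords(w)).applyfunc(sp.expand)
homogeneity_gap = (coords(c * v) - c * coords(v)).applyfunc(sp.expand)

print(additivity_gap.T, homogeneity_gap.T)
```

Both gaps vanish for arbitrary symbolic vectors, which is the computational face of the two identities in the proof idea.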

5.3 Matrices of linear transformations

A linear transformation is a machine that respects addition and scalar multiplication. To represent it by a matrix, we must choose coordinates for the input and output spaces.

Definition 5.5: Matrix of a linear transformation

Let \(T:V\to W\) be a linear transformation. Let

\[ \mathcal B=(b_1,\ldots,b_n) \]

be a basis for \(V\), and let

\[ \mathcal C=(w_1,\ldots,w_m) \]

be a basis for \(W\). The \(\mathcal B\)-\(\mathcal C\) matrix of \(T\) is the \(m\times n\) matrix

\[ [T]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} [T(b_1)]_{\mathcal C} & [T(b_2)]_{\mathcal C} & \cdots & [T(b_n)]_{\mathcal C} \end{bmatrix}. \]

The arrow notation reminds us of direction:

\[ [T]_{\mathcal C\leftarrow \mathcal B} \quad\text{takes}\quad \mathcal B\text{-coordinates as input and returns }\mathcal C\text{-coordinates as output.} \]

Theorem 5.6: Matrix representation theorem

Let \(T:V\to W\) be linear, and let

\[ M=[T]_{\mathcal C\leftarrow \mathcal B}. \]

Then for every \(v\in V\),

\[ [T(v)]_{\mathcal C}=M[v]_{\mathcal B}. \]

Tip: Workflow

To build the matrix of \(T\):

  1. Apply \(T\) to each basis vector of the domain.
  2. Write each output in the codomain basis.
  3. Place those coordinate vectors as columns.
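The three steps above can be sketched as a short function. This is only an illustration under stated assumptions: it handles polynomial spaces with the standard bases \((1,t,\ldots,t^d)\), the helper names `standard_coords` and `matrix_of_map` are made up here, and the shift map \(T(p)(t)=p(t+1)\) on \(P_2(\mathbb R)\) is a test case chosen for this sketch rather than an example from the text.

```python
import sympy as sp

t = sp.symbols("t")

def standard_coords(poly, degree):
    """Coordinates of poly relative to the basis (1, t, ..., t^degree)."""
    p = sp.Poly(poly, t)
    return sp.Matrix([p.nth(k) for k in range(degree + 1)])

def matrix_of_map(T, domain_basis, codomain_degree):
    """Matrix of T, built with the three workflow steps."""
    columns = []
    for b in domain_basis:
        image = T(b)                                              # step 1
        columns.append(standard_coords(image, codomain_degree))  # step 2
    return sp.Matrix.hstack(*columns)                             # step 3

def shift(p):
    """Hypothetical example map: replace t by t + 1."""
    return sp.expand(p.subs(t, t + 1))

domain_basis = [sp.Integer(1), t, t**2]
M = matrix_of_map(shift, domain_basis, 2)
print(M)
```

Each column of `M` records one basis polynomial after the shift: \(1\mapsto 1\), \(t\mapsto 1+t\), and \(t^2\mapsto 1+2t+t^2\).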

5.3.1 Example: derivative as a matrix

Let

\[ D:P_3(\mathbb R)\to P_2(\mathbb R),\qquad D(p)=p'. \]

Use the bases

\[ \mathcal B=(1,t,t^2,t^3),\qquad \mathcal C=(1,t,t^2). \]

Then

\[ D(1)=0, \quad D(t)=1, \quad D(t^2)=2t, \quad D(t^3)=3t^2. \]

Thus

\[ [D]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} 0&1&0&0\\ 0&0&2&0\\ 0&0&0&3 \end{bmatrix}. \]

For

\[ p(t)=4-2t+5t^2+t^3, \qquad [p]_{\mathcal B}=\begin{bmatrix}4\\-2\\5\\1\end{bmatrix}, \]

we get

\[ [D(p)]_{\mathcal C} = \begin{bmatrix} 0&1&0&0\\ 0&0&2&0\\ 0&0&0&3 \end{bmatrix} \begin{bmatrix}4\\-2\\5\\1\end{bmatrix} = \begin{bmatrix}-2\\10\\3\end{bmatrix}. \]

Therefore

\[ D(p)=-2+10t+3t^2. \]

Code
D = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 2, 0],
               [0, 0, 0, 3]])
p = sp.Matrix([4, -2, 5, 1])
D * p

\(\displaystyle \left[\begin{matrix}-2\\10\\3\end{matrix}\right]\)

5.3.2 Example: trace map as a row matrix

Let

\[ T:M_2(\mathbb R)\to \mathbb R, \qquad T(A)=\operatorname{tr}(A). \]

Use the basis

\[ \mathcal B=(E_{11},E_{12},E_{21},E_{22}) \]

for \(M_2(\mathbb R)\) and \(\mathcal C=(1)\) for \(\mathbb R\). Since

\[ T(E_{11})=1, \quad T(E_{12})=0, \quad T(E_{21})=0, \quad T(E_{22})=1, \]

the matrix of \(T\) is

\[ [T]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix}1&0&0&1\end{bmatrix}. \]

Thus, for

\[ A=\begin{bmatrix}a&b\\c&d\end{bmatrix}, \qquad [A]_{\mathcal B}=\begin{bmatrix}a\\b\\c\\d\end{bmatrix}, \]

we obtain

\[ [T(A)]_{\mathcal C} =\begin{bmatrix}1&0&0&1\end{bmatrix} \begin{bmatrix}a\\b\\c\\d\end{bmatrix} =a+d. \]
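A quick symbolic check of this row-matrix description, written as a sketch with illustrative names (`T_matrix`, `A_coords`):

```python
import sympy as sp

a, b, c, d = sp.symbols("a b c d")
A = sp.Matrix([[a, b], [c, d]])

T_matrix = sp.Matrix([[1, 0, 0, 1]])   # matrix of the trace map in the bases above
A_coords = sp.Matrix([a, b, c, d])     # coordinates of A in (E11, E12, E21, E22)

via_matrix = (T_matrix * A_coords)[0, 0]   # coordinates first, then the matrix
direct = A.trace()                         # the abstract map applied directly

print(via_matrix, direct)
```

Both routes return \(a+d\), as the matrix representation theorem predicts.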

5.3.3 Example: integration as a matrix

Let

\[ I:P_2(\mathbb R)\to P_3(\mathbb R), \qquad I(p)(t)=\int_0^t p(s)\,ds. \]

Using standard bases

\[ \mathcal B=(1,t,t^2),\qquad \mathcal C=(1,t,t^2,t^3), \]

we have

\[ I(1)=t,\qquad I(t)=\frac{t^2}{2},\qquad I(t^2)=\frac{t^3}{3}. \]

Therefore

\[ [I]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} 0&0&0\\ 1&0&0\\ 0&\frac12&0\\ 0&0&\frac13 \end{bmatrix}. \]

Code
I = sp.Matrix([[0, 0, 0],
               [1, 0, 0],
               [0, sp.Rational(1, 2), 0],
               [0, 0, sp.Rational(1, 3)]])
q = sp.Matrix([6, -2, 3]) # 6 - 2t + 3t^2
I * q

\(\displaystyle \left[\begin{matrix}0\\6\\-1\\1\end{matrix}\right]\)

The output coordinate vector is

\[ \begin{bmatrix}0\\6\\-1\\1\end{bmatrix}, \]

so

\[ I(6-2t+3t^2)=6t-t^2+t^3. \]
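The matrix answer can be compared with direct symbolic integration. The sketch below also multiplies the derivative matrix of Section 5.3.1 by the integration matrix; the names `I_mat`, `D_mat`, and `round_trip` are illustrative, and the product is offered only as a sanity check that differentiating an antiderivative returns the original polynomial.

```python
import sympy as sp

s, t = sp.symbols("s t")

I_mat = sp.Matrix([[0, 0, 0],
                   [1, 0, 0],
                   [0, sp.Rational(1, 2), 0],
                   [0, 0, sp.Rational(1, 3)]])
D_mat = sp.Matrix([[0, 1, 0, 0],
                   [0, 0, 2, 0],
                   [0, 0, 0, 3]])

# Direct computation of the integral from 0 to t of 6 - 2s + 3s^2.
q = 6 - 2*s + 3*s**2
direct = sp.integrate(q, (s, 0, t))

# Matrix computation: coordinates in (1, t, t^2, t^3), then back to a polynomial.
coords = I_mat * sp.Matrix([6, -2, 3])
from_matrix = sum(coords[k] * t**k for k in range(4))

# Differentiate after integrating, in coordinates.
round_trip = D_mat * I_mat

print(sp.expand(direct), sp.expand(from_matrix), round_trip)
```

The product `D_mat * I_mat` is the \(3\times 3\) identity, while `I_mat * D_mat` is not the \(4\times 4\) identity, because differentiation forgets the constant term.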

5.4 The commutative diagram viewpoint

Coordinate matrices make the same transformation visible in two languages: abstract vectors and coordinate columns.

For a linear transformation \(T:V\to W\), the matrix

\[ M=[T]_{\mathcal C\leftarrow \mathcal B} \]

makes the following diagram commute:

\[ \begin{array}{ccc} V & \xrightarrow{\ T\ } & W \\ \downarrow {[\cdot]_{\mathcal B}} & & \downarrow {[\cdot]_{\mathcal C}} \\ \mathbb F^n & \xrightarrow{\ M\ } & \mathbb F^m. \end{array} \]

Commutativity means that the two paths from \(V\) to \(\mathbb F^m\) give the same answer:

\[ [T(v)]_{\mathcal C}=M[v]_{\mathcal B}. \]

Tip: Story interpretation

There are two ways to use the machine \(T\).

  1. Use the abstract machine first, then translate the output into coordinates.
  2. Translate the input into coordinates first, then multiply by the coordinate matrix.

The theorem says both routes agree.
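The two routes can be traced for the derivative map of Section 5.3.1 with a general cubic. This is a sketch under the assumption that both spaces use their standard bases; `standard_coords`, `route_1`, and `route_2` are illustrative names.

```python
import sympy as sp

t, a0, a1, a2, a3 = sp.symbols("t a0 a1 a2 a3")

def standard_coords(poly, degree):
    """Coordinates of poly relative to (1, t, ..., t^degree)."""
    p = sp.Poly(poly, t)
    return sp.Matrix([p.nth(k) for k in range(degree + 1)])

M = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 2, 0],
               [0, 0, 0, 3]])            # derivative matrix from Section 5.3.1

p = a0 + a1*t + a2*t**2 + a3*t**3        # a general element of P_3

route_1 = standard_coords(sp.diff(p, t), 2)   # apply D, then take coordinates
route_2 = M * standard_coords(p, 3)           # take coordinates, then multiply by M

print(route_1.T, route_2.T)
```

Both routes give \((a_1,\,2a_2,\,3a_3)^T\) for every choice of coefficients, which is what commutativity of the diagram means.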

5.5 Change of coordinates

Now suppose there are two coordinate systems for the same vector space. How do we translate from one to the other?

5.5.1 From a nonstandard basis to the standard basis

Definition 5.7: Basis matrix in \(\mathbb F^n\)

Let

\[ \mathcal B=(b_1,\ldots,b_n) \]

be a basis for \(\mathbb F^n\). The matrix

\[ P_{\mathcal B}=\begin{bmatrix}b_1&b_2&\cdots&b_n\end{bmatrix} \]

is called the basis matrix or change-of-coordinate matrix from \(\mathcal B\)-coordinates to standard coordinates.

Proposition 5.8: Converting coordinates

Let \(\mathcal B\) be a basis for \(\mathbb F^n\). Then

\[ x=P_{\mathcal B}[x]_{\mathcal B} \]

and

\[ [x]_{\mathcal B}=P_{\mathcal B}^{-1}x. \]

The first formula reconstructs the vector. The second formula finds its coordinates.

5.5.2 Example: changing to standard coordinates

Let

\[ \mathcal B=\left( \begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}-1\\2\end{bmatrix} \right). \]

Then

\[ P_{\mathcal B}=\begin{bmatrix}1&-1\\1&2\end{bmatrix}. \]

If

\[ [x]_{\mathcal B}=\begin{bmatrix}4\\2\end{bmatrix}, \]

then

\[ x=P_{\mathcal B}[x]_{\mathcal B} = \begin{bmatrix}1&-1\\1&2\end{bmatrix} \begin{bmatrix}4\\2\end{bmatrix} = \begin{bmatrix}2\\8\end{bmatrix}. \]

Code
P_B = sp.Matrix([[1, -1], [1, 2]])
coords_B = sp.Matrix([4, 2])
P_B * coords_B

\(\displaystyle \left[\begin{matrix}2\\8\end{matrix}\right]\)

5.5.3 Changing between two nonstandard bases

Let \(\mathcal B\) and \(\mathcal C\) be two bases of \(\mathbb F^n\). Then

\[ x=P_{\mathcal B}[x]_{\mathcal B}=P_{\mathcal C}[x]_{\mathcal C}. \]

Solving for \([x]_{\mathcal C}\) gives

\[ [x]_{\mathcal C}=P_{\mathcal C}^{-1}P_{\mathcal B}[x]_{\mathcal B}. \]

Theorem 5.9: Change-of-basis matrix

Let \(\mathcal B\) and \(\mathcal C\) be bases for \(\mathbb F^n\). The matrix that converts \(\mathcal B\)-coordinates into \(\mathcal C\)-coordinates is

\[ P_{\mathcal C\leftarrow \mathcal B}=P_{\mathcal C}^{-1}P_{\mathcal B}. \]

That is,

\[ [x]_{\mathcal C}=P_{\mathcal C\leftarrow \mathcal B}[x]_{\mathcal B}. \]

Warning: Direction matters

The matrix \(P_{\mathcal C\leftarrow \mathcal B}\) starts with \(\mathcal B\)-coordinates and ends with \(\mathcal C\)-coordinates. The arrow points toward the output coordinate system.

5.5.4 Example: two bases in \(\mathbb R^2\)

Let

\[ \mathcal B=\left( \begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}-1\\2\end{bmatrix} \right), \qquad \mathcal C=\left( \begin{bmatrix}2\\0\end{bmatrix}, \begin{bmatrix}0\\1\end{bmatrix} \right). \]

Then

\[ P_{\mathcal B}=\begin{bmatrix}1&-1\\1&2\end{bmatrix}, \qquad P_{\mathcal C}=\begin{bmatrix}2&0\\0&1\end{bmatrix}. \]

Thus

\[ P_{\mathcal C\leftarrow \mathcal B}=P_{\mathcal C}^{-1}P_{\mathcal B}. \]

Code
P_B = sp.Matrix([[1, -1], [1, 2]])
P_C = sp.Matrix([[2, 0], [0, 1]])
P_C_from_B = P_C.inv() * P_B
P_C_from_B

\(\displaystyle \left[\begin{matrix}\frac{1}{2} & - \frac{1}{2}\\1 & 2\end{matrix}\right]\)

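If \([x]_{\mathcal B}=(4,2)^T\), then the code below computes \([x]_{\mathcal C}\) and checks that \(P_{\mathcal B}[x]_{\mathcal B}\) and \(P_{\mathcal C}[x]_{\mathcal C}\) reconstruct the same vector \(x\).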

Code
x_B = sp.Matrix([4, 2])
x_C = P_C_from_B * x_B
x_standard = P_B * x_B
x_C, x_standard, P_C * x_C

\(\displaystyle \left( \left[\begin{matrix}1\\8\end{matrix}\right], \ \left[\begin{matrix}2\\8\end{matrix}\right], \ \left[\begin{matrix}2\\8\end{matrix}\right]\right)\)

The coordinate vector changes, but the actual vector does not.
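Since direction matters, it helps to compute the matrix for the opposite direction and confirm that the two undo each other. The sketch below reuses the bases of this example; the variable names are illustrative.

```python
import sympy as sp

P_B = sp.Matrix([[1, -1], [1, 2]])
P_C = sp.Matrix([[2, 0], [0, 1]])

P_C_from_B = P_C.inv() * P_B   # converts B-coordinates into C-coordinates
P_B_from_C = P_B.inv() * P_C   # converts C-coordinates into B-coordinates

x_B = sp.Matrix([4, 2])
x_C = P_C_from_B * x_B         # translate forward
back_to_B = P_B_from_C * x_C   # translate back

print(P_C_from_B * P_B_from_C, back_to_B.T)
```

The product of the two change-of-basis matrices is the identity, so mixing up the direction amounts to using an inverse by mistake.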

5.6 Similarity: one operator, many matrices

Now let \(T:V\to V\) be a linear transformation from a vector space to itself. Such a map is called a linear operator. If we use one basis for both the input and output, the matrix of \(T\) is written

\[ [T]_{\mathcal B}. \]

Changing the basis changes the matrix of the same operator.

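Theorem 5.10: Similarity under change of basis

Let \(T:V\to V\) be a linear operator on a finite-dimensional vector space. Let \(\mathcal B\) and \(\mathcal C\) be bases of \(V\), and let \(P_{\mathcal C\leftarrow \mathcal B}\) denote the matrix that converts \(\mathcal B\)-coordinates into \(\mathcal C\)-coordinates, with \(P_{\mathcal B\leftarrow \mathcal C}\) its inverse. Then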

\[ [T]_{\mathcal C} = P_{\mathcal C\leftarrow \mathcal B}\,[T]_{\mathcal B}\,P_{\mathcal B\leftarrow \mathcal C}. \]

Equivalently, if

\[ S=P_{\mathcal B\leftarrow \mathcal C}, \]

then

\[ [T]_{\mathcal C}=S^{-1}[T]_{\mathcal B}S. \]

Matrices related by

\[ B=S^{-1}AS \]

are called similar matrices.

Similarity is not just an algebraic trick. It means the two matrices represent the same linear operator, written in two different coordinate systems.

Note: Invariants under similarity

Similar matrices have the same determinant, trace, rank, characteristic polynomial, eigenvalues, and nullity. These quantities describe the underlying operator rather than a particular coordinate costume.
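Theorem 5.10 and the list of invariants can be tested together on a small case. The sketch below is an assumption-laden illustration: the operator has standard matrix `A`, the two bases of \(\mathbb R^2\) are chosen here for convenience, and `invariants` is a hypothetical helper.

```python
import sympy as sp

A = sp.Matrix([[3, 1], [0, 2]])      # standard matrix of an operator T
P_B = sp.Matrix([[1, -1], [1, 2]])   # basis B as columns
P_C = sp.Matrix([[1, 1], [1, 0]])    # basis C as columns

T_B = P_B.inv() * A * P_B            # matrix of T relative to B
T_C = P_C.inv() * A * P_C            # matrix of T relative to C

P_C_from_B = P_C.inv() * P_B
P_B_from_C = P_B.inv() * P_C

# Right-hand side of Theorem 5.10.
rhs = P_C_from_B * T_B * P_B_from_C

lam = sp.symbols("lambda")

def invariants(M):
    """Determinant, trace, rank, characteristic polynomial, nullity."""
    return (M.det(), M.trace(), M.rank(),
            M.charpoly(lam).as_expr(), len(M.nullspace()))

print(T_C == rhs)
print(invariants(T_B))
print(invariants(T_C))
```

The matrices `T_B` and `T_C` have different entries, yet every quantity returned by `invariants` agrees.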

5.6.1 Example: same transformation in two bases

Let

\[ A=\begin{bmatrix}3&1\\0&2\end{bmatrix} \]

be the standard matrix of a linear operator \(T:\mathbb R^2\to\mathbb R^2\). Let

\[ \mathcal C=\left( \begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}1\\0\end{bmatrix} \right). \]

Then

\[ P_{\mathcal C}=\begin{bmatrix}1&1\\1&0\end{bmatrix}. \]

The matrix of the same operator relative to \(\mathcal C\) is

\[ [T]_{\mathcal C}=P_{\mathcal C}^{-1}AP_{\mathcal C}. \]

Code
A = sp.Matrix([[3, 1], [0, 2]])
P_C = sp.Matrix([[1, 1], [1, 0]])
A_C = P_C.inv() * A * P_C
A_C

\(\displaystyle \left[\begin{matrix}2 & 0\\2 & 3\end{matrix}\right]\)

The two matrices look different, but they represent the same linear operator.

Code
A.det(), A_C.det(), A.trace(), A_C.trace(), A.eigenvals(), A_C.eigenvals()

\(\displaystyle \left( 6, \ 6, \ 5, \ 5, \ \left\{ 2 : 1, \ 3 : 1\right\}, \ \left\{ 2 : 1, \ 3 : 1\right\}\right)\)

They have the same determinant, trace, and eigenvalues.

5.7 Choosing a basis to simplify a transformation

The purpose of changing basis is not only to translate coordinates. Often we choose a basis to make a transformation easier to understand.

The best possible situation is when the basis vectors are eigenvectors.

Definition 5.11: Eigenbasis

Let \(T:V\to V\) be a linear operator. A basis

\[ \mathcal B=(v_1,\ldots,v_n) \]

is called an eigenbasis for \(T\) if every \(v_i\) is an eigenvector of \(T\).

If

\[ T(v_i)=\lambda_i v_i, \]

then the matrix of \(T\) in the eigenbasis is diagonal:

\[ [T]_{\mathcal B}=\begin{bmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{bmatrix}. \]

Theorem 5.12: Diagonalization by a basis of eigenvectors

Let \(A\in\mathbb F^{n\times n}\). The following are equivalent.

  1. \(A\) is diagonalizable.
  2. There exists a basis of \(\mathbb F^n\) consisting of eigenvectors of \(A\).
  3. There exists an invertible matrix \(P\) and a diagonal matrix \(D\) such that

\[ A=PDP^{-1}. \]

In this case, the columns of \(P\) are eigenvectors of \(A\), and the diagonal entries of \(D\) are the corresponding eigenvalues.
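SymPy can test the equivalent conditions directly. The sketch below uses two matrices chosen here for illustration: a symmetric matrix, which has an eigenbasis, and a shear, which does not. SymPy picks its own ordering and scaling of eigenvectors, so the particular `P` it returns is one valid choice among many.

```python
import sympy as sp

A = sp.Matrix([[2, 1], [1, 2]])       # illustrative diagonalizable matrix
P, D = A.diagonalize()                # A = P * D * P^{-1}

# Column i of P should be an eigenvector for the eigenvalue D[i, i].
column_checks = [A * P[:, i] == D[i, i] * P[:, i] for i in range(2)]

shear = sp.Matrix([[1, 1], [0, 1]])   # only one independent eigenvector

print(P, D, column_checks)
print(A.is_diagonalizable(), shear.is_diagonalizable())
```

For the shear, the single eigenvalue \(1\) has a one-dimensional eigenspace, so no basis of eigenvectors exists and condition 2 fails.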

5.7.1 Example: diagonalizing by choosing the right basis

Let

\[ A=\begin{bmatrix}4&1\\2&3\end{bmatrix}. \]

Code
A = sp.Matrix([[4, 1], [2, 3]])
A.eigenvects()

\(\displaystyle \left[ \left( 2, \ 1, \ \left[ \left[\begin{matrix}- \frac{1}{2}\\1\end{matrix}\right]\right]\right), \ \left( 5, \ 1, \ \left[ \left[\begin{matrix}1\\1\end{matrix}\right]\right]\right)\right]\)

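SymPy reports the eigenvalues \(2\) and \(5\), each with algebraic multiplicity \(1\). Any nonzero scalar multiple of an eigenvector is again an eigenvector, so we may clear the fraction and choose, for the eigenvalues \(5\) and \(2\) respectively,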

\[ v_1=\begin{bmatrix}1\\1\end{bmatrix}, \qquad v_2=\begin{bmatrix}-1\\2\end{bmatrix}. \]

Let

\[ P=\begin{bmatrix}1&-1\\1&2\end{bmatrix}, \qquad D=\begin{bmatrix}5&0\\0&2\end{bmatrix}. \]

Then

\[ P^{-1}AP=D. \]

Code
P = sp.Matrix([[1, -1], [1, 2]])
D = P.inv() * A * P
D

\(\displaystyle \left[\begin{matrix}5 & 0\\0 & 2\end{matrix}\right]\)

Thus, in the eigenbasis, the transformation simply scales the first coordinate by \(5\) and the second coordinate by \(2\).

5.8 Application: dynamical systems

Suppose a state vector changes by the rule

\[ x_{k+1}=Ax_k. \]

Then

\[ x_k=A^k x_0. \]

If \(A=PDP^{-1}\), then

\[ A^k=PD^kP^{-1}. \]

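This makes long-term behavior much easier to compute: the inner factors \(P^{-1}P\) cancel in the product \((PDP^{-1})^k\), and \(D^k\) is found by raising each diagonal entry of \(D\) to the \(k\)th power.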

Code
A = sp.Matrix([[4, 1], [2, 3]])
P = sp.Matrix([[1, -1], [1, 2]])
D = sp.diag(5, 2)
x0 = sp.Matrix([3, 0])
for k in range(5):
    print(k, A**k * x0, P * (D**k) * P.inv() * x0)

0 Matrix([[3], [0]]) Matrix([[3], [0]])
1 Matrix([[12], [6]]) Matrix([[12], [6]])
2 Matrix([[54], [42]]) Matrix([[54], [42]])
3 Matrix([[258], [234]]) Matrix([[258], [234]])
4 Matrix([[1266], [1218]]) Matrix([[1266], [1218]])

The formula \(A^k=PD^kP^{-1}\) says: translate into eigen-coordinates, scale by powers of eigenvalues, then translate back.

Tip: Story interpretation

Diagonalization is a change of language. In the original coordinates, the system mixes directions. In eigen-coordinates, each coordinate evolves independently.
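Because each eigen-coordinate evolves on its own, the state \(x_k\) has a closed form. The sketch below derives it for the matrix and initial state used above; the symbol `k` and the names `c` and `x_k` are illustrative.

```python
import sympy as sp

A = sp.Matrix([[4, 1], [2, 3]])
P = sp.Matrix([[1, -1], [1, 2]])     # eigenvectors for eigenvalues 5 and 2
x0 = sp.Matrix([3, 0])

k = sp.symbols("k", integer=True, nonnegative=True)

# Eigen-coordinates of the initial state: x0 = c[0]*v1 + c[1]*v2.
c = P.inv() * x0

# Each eigen-coordinate is multiplied by its own eigenvalue at every step.
x_k = c[0] * 5**k * P[:, 0] + c[1] * 2**k * P[:, 1]

print(c.T)
print(x_k.T)
```

Here \(x_0=2v_1-v_2\), so \(x_k=2\cdot 5^k v_1-2^k v_2\). For large \(k\) the \(5^k\) term dominates, and the state lines up with the eigenvector \(v_1=(1,1)^T\).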

5.9 Application: data representations

In data science, the same object may be represented in different coordinate systems.

A data point may be written in raw features:

\[ x=\begin{bmatrix}\text{height}\\ \text{weight}\\ \text{age}\end{bmatrix}. \]

Or it may be written in transformed features:

\[ [x]_{\mathcal B}=\begin{bmatrix}\text{overall size}\\ \text{contrast feature}\\ \text{age direction}\end{bmatrix}. \]

A change of basis can separate mixed information into more interpretable coordinates. Later, in spectral decomposition, SVD, PCA, Fourier bases, and wavelets, we will repeatedly use the same idea:

\[ \text{choose a basis that reveals structure.} \]

5.9.1 Small feature example

Suppose two raw features are stored in standard coordinates, and we choose a new basis

\[ b_1=\begin{bmatrix}1\\1\end{bmatrix}, \qquad b_2=\begin{bmatrix}1\\-1\end{bmatrix}. \]

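For a data point \((x_1,x_2)^T\), the coordinate along \(b_1\) is the average \((x_1+x_2)/2\), which measures overall size; the coordinate along \(b_2\) is the half-difference \((x_1-x_2)/2\), which measures contrast.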

Code
P = sp.Matrix([[1, 1], [1, -1]])
data = sp.Matrix([[6, 8, 10, 4],
                  [4, 8, 2, 6]])  # columns are data points
coords = P.inv() * data
coords

\(\displaystyle \left[\begin{matrix}5 & 8 & 6 & 5\\1 & 0 & 4 & -1\end{matrix}\right]\)

Each column has been rewritten in size-contrast coordinates. This is the beginning of the idea behind many feature transformations.
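No information is lost in this rewriting, and the meaning of the new coordinates can be read off symbolically. The sketch below uses hypothetical feature symbols `f1` and `f2` for a generic data point.

```python
import sympy as sp

P = sp.Matrix([[1, 1], [1, -1]])
data = sp.Matrix([[6, 8, 10, 4],
                  [4, 8, 2, 6]])      # columns are data points
coords = P.inv() * data

# Multiplying by P translates back to the raw features.
rebuilt = P * coords

# Coordinates of a generic point (f1, f2) in the size-contrast basis.
f1, f2 = sp.symbols("f1 f2")
general = (P.inv() * sp.Matrix([f1, f2])).applyfunc(sp.simplify)

print(rebuilt == data)
print(general.T)
```

The generic coordinates are the average and the half-difference of the two raw features, which is why the second point \((8,8)^T\) has zero contrast.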

5.10 Application: constraint coordinates

In optimization, variables often satisfy linear constraints. Instead of working in all of \(\mathbb R^n\), we can choose coordinates adapted to the feasible subspace.

For example, suppose

\[ x_1+x_2+x_3=0. \]

The solution space is a plane in \(\mathbb R^3\) with basis

\[ v_1=\begin{bmatrix}1\\-1\\0\end{bmatrix}, \qquad v_2=\begin{bmatrix}1\\0\\-1\end{bmatrix}. \]

Every feasible vector has the form

\[ x=s v_1+t v_2. \]

The coordinates \((s,t)\) describe feasible directions directly.

Code
v1 = sp.Matrix([1, -1, 0])
v2 = sp.Matrix([1, 0, -1])
B = sp.Matrix.hstack(v1, v2)
s, t = sp.symbols('s t')
x = B * sp.Matrix([s, t])
x

\(\displaystyle \left[\begin{matrix}s + t\\- s\\- t\end{matrix}\right]\)

The constraint is automatically satisfied:

Code
sp.Matrix([[1,1,1]]) * x

\(\displaystyle \left[\begin{matrix}0\end{matrix}\right]\)

This is an example of a useful basis: it builds the constraint into the coordinates.
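Going the other way, a feasible vector can be translated into its \((s,t)\) coordinates by solving a small linear system. In the sketch below the test vectors are chosen for illustration, and `nullspace()` is used only to show that SymPy may return a different, equally valid basis of the same plane.

```python
import sympy as sp

s, t = sp.symbols("s t")
B = sp.Matrix([[1, 1],
               [-1, 0],
               [0, -1]])                   # columns v1, v2

x = sp.Matrix([3, -1, -2])                 # feasible: entries sum to 0
solution = sp.linsolve((B, x), [s, t])     # solve s*v1 + t*v2 = x

y = sp.Matrix([1, 1, 1])                   # infeasible: entries sum to 3
no_solution = sp.linsolve((B, y), [s, t])

constraint = sp.Matrix([[1, 1, 1]])
other_basis = constraint.nullspace()       # SymPy's own basis of the plane

print(solution, no_solution)
print([v.T for v in other_basis])
```

A feasible vector has exactly one pair \((s,t)\), while an infeasible vector has none: the coordinates exist only on the constraint plane.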

5.11 Challenge questions

Challenge 1: Same vector, two coordinate systems

Let

\[ \mathcal B=\left(\begin{bmatrix}1\\2\end{bmatrix},\begin{bmatrix}3\\1\end{bmatrix}\right), \qquad \mathcal C=\left(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}1\\1\end{bmatrix}\right). \]

Find \(P_{\mathcal C\leftarrow \mathcal B}\). Then compute \([x]_{\mathcal C}\) when

\[ [x]_{\mathcal B}=\begin{bmatrix}2\\-1\end{bmatrix}. \]

Challenge 2: Matrix of a polynomial transformation

Let

\[ T:P_2(\mathbb R)\to P_2(\mathbb R), \qquad T(p)=p+p'. \]

Use the basis \(\mathcal B=(1,t,t^2)\) for both domain and codomain. Find \([T]_{\mathcal B}\).

Challenge 3: Similarity check

Let

\[ A=\begin{bmatrix}2&1\\0&3\end{bmatrix}, \qquad P=\begin{bmatrix}1&1\\0&1\end{bmatrix}. \]

Compute \(B=P^{-1}AP\). Verify that \(A\) and \(B\) have the same trace, determinant, and eigenvalues.

Challenge 4: Find a simplifying basis

Find a basis in which the linear transformation with standard matrix

\[ A=\begin{bmatrix}1&2\\2&1\end{bmatrix} \]

is diagonal. Explain what the new coordinates mean geometrically.

5.12 Practice problems

5.12.1 Problem 1: Coordinates in \(\mathbb R^2\)

Let

\[ \mathcal B=\left(\begin{bmatrix}2\\1\end{bmatrix},\begin{bmatrix}1\\1\end{bmatrix}\right). \]

Find \([x]_{\mathcal B}\) for

\[ x=\begin{bmatrix}7\\4\end{bmatrix}. \]

5.12.2 Problem 2: Coordinates in a polynomial basis

Let

\[ \mathcal B=(1,\ 1+t,\ t+t^2) \]

be a basis for \(P_2(\mathbb R)\). Find

\[ [p]_{\mathcal B} \]

for

\[ p(t)=4+7t+3t^2. \]

5.12.3 Problem 3: Matrix of a derivative map

Let

\[ D:P_4(\mathbb R)\to P_3(\mathbb R), \qquad D(p)=p'. \]

Use standard bases

\[ \mathcal B=(1,t,t^2,t^3,t^4), \qquad \mathcal C=(1,t,t^2,t^3). \]

Find \([D]_{\mathcal C\leftarrow \mathcal B}\).

5.12.4 Problem 4: Matrix of a linear map between polynomial spaces

Let

\[ T:P_2(\mathbb R)\to P_2(\mathbb R), \qquad T(p)=t p'(t). \]

Use the standard basis \(\mathcal B=(1,t,t^2)\) for both domain and codomain. Find \([T]_{\mathcal B}\).

5.12.5 Problem 5: Change of basis

Let

\[ \mathcal B=\left(\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}1\\-1\end{bmatrix}\right), \qquad \mathcal C=\left(\begin{bmatrix}2\\0\end{bmatrix},\begin{bmatrix}0\\3\end{bmatrix}\right). \]

Find \(P_{\mathcal C\leftarrow \mathcal B}\).

5.12.6 Problem 6: Similar matrices

Let

\[ A=\begin{bmatrix}4&1\\0&2\end{bmatrix}, \qquad P=\begin{bmatrix}1&1\\0&1\end{bmatrix}. \]

Compute

\[ B=P^{-1}AP. \]

Verify that \(A\) and \(B\) have the same determinant and trace.

5.12.7 Problem 7: Diagonalization

Diagonalize the matrix

\[ A=\begin{bmatrix}3&2\\2&3\end{bmatrix}. \]

That is, find \(P\) and \(D\) such that

\[ A=PDP^{-1}. \]

5.12.8 Problem 8: Constraint coordinates

Let

\[ N=\{x\in\mathbb R^3:x_1+x_2+x_3=0\}. \]

Find a basis for \(N\), and write

\[ x=\begin{bmatrix}2\\-5\\3\end{bmatrix} \]

in that basis.

5.13 Solutions to practice problems

5.13.1 Solution 1

We solve

\[ c_1\begin{bmatrix}2\\1\end{bmatrix}+c_2\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}7\\4\end{bmatrix}. \]

This gives

\[ 2c_1+c_2=7,\qquad c_1+c_2=4. \]

Subtracting gives \(c_1=3\), and then \(c_2=1\). Thus

\[ [x]_{\mathcal B}=\begin{bmatrix}3\\1\end{bmatrix}. \]

Code
P = sp.Matrix([[2,1],[1,1]])
x = sp.Matrix([7,4])
P.LUsolve(x)

\(\displaystyle \left[\begin{matrix}3\\1\end{matrix}\right]\)

5.13.2 Solution 2

Write

\[ 4+7t+3t^2=c_1(1)+c_2(1+t)+c_3(t+t^2). \]

Matching coefficients gives

\[ c_1+c_2=4, \qquad c_2+c_3=7, \qquad c_3=3. \]

Thus \(c_3=3\), \(c_2=4\), and \(c_1=0\). Therefore

\[ [p]_{\mathcal B}=\begin{bmatrix}0\\4\\3\end{bmatrix}. \]
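As with the other solutions, the answer can be checked in SymPy. The sketch below rebuilds the polynomial from its coordinates and also solves the coefficient system; the names `basis`, `rebuilt`, and `solved` are illustrative.

```python
import sympy as sp

t = sp.symbols("t")
basis = [sp.Integer(1), 1 + t, t + t**2]
coords = [0, 4, 3]

# Rebuild the polynomial from its claimed coordinates.
rebuilt = sp.expand(sum(c * b for c, b in zip(coords, basis)))

# Same problem as a linear system: columns hold the standard
# coefficients of 1, 1+t, t+t^2.
P = sp.Matrix([[1, 1, 0],    # constant coefficients
               [0, 1, 1],    # t coefficients
               [0, 0, 1]])   # t^2 coefficients
solved = P.LUsolve(sp.Matrix([4, 7, 3]))

print(rebuilt, solved.T)
```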

5.13.3 Solution 3

We compute

\[ D(1)=0, \quad D(t)=1, \quad D(t^2)=2t, \quad D(t^3)=3t^2, \quad D(t^4)=4t^3. \]

Therefore

\[ [D]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} 0&1&0&0&0\\ 0&0&2&0&0\\ 0&0&0&3&0\\ 0&0&0&0&4 \end{bmatrix}. \]

5.13.4 Solution 4

Use \(\mathcal B=(1,t,t^2)\). We compute

\[ T(1)=t\cdot 0=0, \]

\[ T(t)=t\cdot 1=t, \]

and

\[ T(t^2)=t\cdot 2t=2t^2. \]

Thus

\[ [T]_{\mathcal B} = \begin{bmatrix} 0&0&0\\ 0&1&0\\ 0&0&2 \end{bmatrix}. \]
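A short SymPy check, sketched with illustrative names, builds the same matrix column by column from the images of the basis polynomials.

```python
import sympy as sp

t = sp.symbols("t")
basis = [sp.Integer(1), t, t**2]

def T(p):
    """The map T(p) = t * p'(t)."""
    return sp.expand(t * sp.diff(p, t))

# Column j holds the standard coordinates of T applied to the j-th basis polynomial.
columns = [sp.Matrix([sp.Poly(T(b), t).nth(k) for k in range(3)]) for b in basis]
T_B = sp.Matrix.hstack(*columns)

print(T_B)
```

The matrix is already diagonal, so the standard basis \((1,t,t^2)\) is an eigenbasis for \(T\) with eigenvalues \(0\), \(1\), and \(2\).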

5.13.5 Solution 5

We have

\[ P_{\mathcal B}=\begin{bmatrix}1&1\\1&-1\end{bmatrix}, \qquad P_{\mathcal C}=\begin{bmatrix}2&0\\0&3\end{bmatrix}. \]

Therefore

\[ P_{\mathcal C\leftarrow \mathcal B}=P_{\mathcal C}^{-1}P_{\mathcal B} = \begin{bmatrix}1/2&1/2\\1/3&-1/3\end{bmatrix}. \]

Code
P_B = sp.Matrix([[1,1],[1,-1]])
P_C = sp.Matrix([[2,0],[0,3]])
P_C.inv() * P_B

\(\displaystyle \left[\begin{matrix}\frac{1}{2} & \frac{1}{2}\\\frac{1}{3} & - \frac{1}{3}\end{matrix}\right]\)

5.13.6 Solution 6

Compute

\[ B=P^{-1}AP. \]

Code
A = sp.Matrix([[4,1],[0,2]])
P = sp.Matrix([[1,1],[0,1]])
B = P.inv() * A * P
B, A.det(), B.det(), A.trace(), B.trace()

\(\displaystyle \left( \left[\begin{matrix}4 & 3\\0 & 2\end{matrix}\right], \ 8, \ 8, \ 6, \ 6\right)\)

The output shows

\[ B=\begin{bmatrix}4&3\\0&2\end{bmatrix}. \]

Also

\[ \det(A)=\det(B)=8, \qquad \operatorname{tr}(A)=\operatorname{tr}(B)=6. \]

5.13.7 Solution 7

The matrix

\[ A=\begin{bmatrix}3&2\\2&3\end{bmatrix} \]

has eigenvectors

\[ v_1=\begin{bmatrix}1\\1\end{bmatrix} \quad\text{with eigenvalue }5, \qquad v_2=\begin{bmatrix}1\\-1\end{bmatrix} \quad\text{with eigenvalue }1. \]

Thus

\[ P=\begin{bmatrix}1&1\\1&-1\end{bmatrix}, \qquad D=\begin{bmatrix}5&0\\0&1\end{bmatrix}. \]

Then

\[ A=PDP^{-1}. \]

Code
A = sp.Matrix([[3,2],[2,3]])
P = sp.Matrix([[1,1],[1,-1]])
D = P.inv() * A * P
D

\(\displaystyle \left[\begin{matrix}5 & 0\\0 & 1\end{matrix}\right]\)

5.13.8 Solution 8

The equation

\[ x_1+x_2+x_3=0 \]

gives

\[ x_3=-x_1-x_2. \]

Let \(x_1=s\) and \(x_2=t\). Then

\[ x=\begin{bmatrix}s\\t\\-s-t\end{bmatrix} =s\begin{bmatrix}1\\0\\-1\end{bmatrix} +t\begin{bmatrix}0\\1\\-1\end{bmatrix}. \]

So a basis for \(N\) is

\[ \mathcal B=\left(\begin{bmatrix}1\\0\\-1\end{bmatrix}, \begin{bmatrix}0\\1\\-1\end{bmatrix}\right). \]

For

\[ x=\begin{bmatrix}2\\-5\\3\end{bmatrix}, \]

we have \(s=2\) and \(t=-5\). Therefore

\[ [x]_{\mathcal B}=\begin{bmatrix}2\\-5\end{bmatrix}. \]
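The answer depends on the basis chosen for \(N\). The sketch below checks the coordinates found above and, for comparison, computes the coordinates of the same \(x\) relative to the basis \((v_1,v_2)\) of Section 5.10; the matrix names `B1` and `B2` are illustrative.

```python
import sympy as sp

s, t = sp.symbols("s t")
x = sp.Matrix([2, -5, 3])

# Basis from this solution: (1,0,-1), (0,1,-1) as columns.
B1 = sp.Matrix([[1, 0],
                [0, 1],
                [-1, -1]])
check = B1 * sp.Matrix([2, -5])

# Basis from Section 5.10: (1,-1,0), (1,0,-1) as columns.
B2 = sp.Matrix([[1, 1],
                [-1, 0],
                [0, -1]])
other_coords = sp.linsolve((B2, x), [s, t])

print(check.T, other_coords)
```

The vector \(x\) is the same in both cases; only its coordinate column changes, from \((2,-5)^T\) to \((5,-3)^T\).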

5.14 AI companion activities

Activity 1: Coordinate checker

Ask an AI assistant:

I have the basis \(\mathcal B=((1,1)^T,(-1,2)^T)\) and the vector \(x=(2,8)^T\). Find \([x]_{\mathcal B}\) and explain why the answer is not the same object as \(x\).

Then check whether the assistant reconstructs \(x\) by multiplying \(P_{\mathcal B}[x]_{\mathcal B}\).

Activity 2: Matrix representation explainer

Ask:

Explain how to find the matrix of the derivative map \(D:P_3\to P_2\) using the standard polynomial bases. Give the matrix and explain what each column means.

Then compare the answer with Section 5.3.1.

Activity 3: Basis-change debugging

Give an AI assistant two bases \(\mathcal B\) and \(\mathcal C\). Ask it for both matrices

\[ P_{\mathcal C\leftarrow \mathcal B} \quad\text{and}\quad P_{\mathcal B\leftarrow \mathcal C}. \]

Then verify that their product is the identity matrix.

Activity 4: Similarity investigation

Choose a random invertible matrix \(P\) and a matrix \(A\). Compute

\[ B=P^{-1}AP. \]

Ask AI to predict which quantities are the same for \(A\) and \(B\). Check determinant, trace, rank, characteristic polynomial, and eigenvalues in Python.

Activity 5: Create your own simplifying basis

Find a matrix with two distinct real eigenvalues. Ask an AI assistant to diagonalize it. Then ask the assistant to explain the diagonalization in terms of changing coordinate systems, not only as a formula.

5.15 Summary

This chapter introduced the coordinate view of linear algebra.

  • A basis gives each vector a unique coordinate vector.
  • The coordinate map \(v\mapsto [v]_{\mathcal B}\) is an isomorphism.
  • A linear transformation becomes a matrix after choosing bases for the domain and codomain.
  • Change-of-basis matrices translate coordinates between different bases.
  • Similar matrices represent the same linear operator in different bases.
  • Diagonalization is the special case where a basis of eigenvectors makes the operator as simple as possible.

The main lesson is this:

\[ \boxed{\text{Coordinates change. Linear structure remains.}} \]