import sympy as sp
sp.init_printing()
How one linear transformation can wear many coordinate costumes
Guiding question.
A vector can be written in many coordinate systems. A linear transformation can be represented by many matrices. What stays the same, and what changes, when we change the basis?
In Chapter 4 we learned that a basis is a coordinate system for a vector space. Once a basis is chosen, every vector becomes a column of numbers.
Chapter 5 takes the next step. If vectors can be translated into coordinate columns, then linear transformations can be translated into matrices. This is one of the deepest practical ideas in linear algebra:
\[ \text{abstract linear map} \quad \longleftrightarrow \quad \text{matrix after choosing bases}. \]
The matrix is not the transformation itself. It is a coordinate description of the transformation. When the basis changes, the matrix changes. The transformation remains the same.
This chapter is about that translation.
Keep three ideas separate.
This chapter is a good place to use Python or an AI assistant as a checking partner. Ask it to compute coordinate vectors, change-of-basis matrices, and matrix representations. Then ask: Which basis is being used? What is the domain? What is the codomain? Does the matrix represent a vector or a map?
A basis is a dictionary. It gives names to directions. Once the dictionary is chosen, every vector has a unique sentence written in that dictionary.
Let \(V\) be a vector space over a field \(\mathbb F\), and let
\[ \mathcal B=(b_1,b_2,\ldots,b_n) \]
be a basis for \(V\). Then every vector \(v\in V\) can be written uniquely in the form
\[ v=c_1b_1+c_2b_2+\cdots+c_nb_n, \]
where \(c_1,\ldots,c_n\in\mathbb F\).
The word uniquely is essential. A spanning set lets us describe every vector, but possibly in many ways. A linearly independent spanning set describes every vector in exactly one way.
Let \(\mathcal B=(b_1,\ldots,b_n)\) be a basis for \(V\). If
\[ v=c_1b_1+\cdots+c_nb_n, \]
then the coordinate vector of \(v\) relative to \(\mathcal B\) is
\[ [v]_{\mathcal B}=\begin{bmatrix}c_1\\ \vdots\\ c_n\end{bmatrix}\in \mathbb F^n. \]
The vector \(v\) and the coordinate vector \([v]_{\mathcal B}\) are not usually the same object. The vector \(v\) lives in \(V\). The coordinate vector \([v]_{\mathcal B}\) lives in \(\mathbb F^n\). The basis \(\mathcal B\) is the translation rule.
Let
\[ b_1=\begin{bmatrix}1\\1\end{bmatrix},\qquad b_2=\begin{bmatrix}-1\\2\end{bmatrix},\qquad \mathcal B=(b_1,b_2). \]
Find \([x]_{\mathcal B}\) for
\[ x=\begin{bmatrix}2\\8\end{bmatrix}. \]
We solve
\[ c_1\begin{bmatrix}1\\1\end{bmatrix} +c_2\begin{bmatrix}-1\\2\end{bmatrix} = \begin{bmatrix}2\\8\end{bmatrix}. \]
This gives
\[ c_1-c_2=2,\qquad c_1+2c_2=8. \]
Thus \(c_2=2\) and \(c_1=4\). Therefore
\[ [x]_{\mathcal B}=\begin{bmatrix}4\\2\end{bmatrix}. \]
Python verifies the calculation by using the basis vectors as columns.
b1 = sp.Matrix([1, 1])
b2 = sp.Matrix([-1, 2])
P_B = sp.Matrix.hstack(b1, b2)
x = sp.Matrix([2, 8])
coords = P_B.LUsolve(x)
P_B, coords
\(\displaystyle \left( \left[\begin{matrix}1 & -1\\1 & 2\end{matrix}\right], \ \left[\begin{matrix}4\\2\end{matrix}\right]\right)\)
The matrix
\[ P_{\mathcal B}=\begin{bmatrix}1&-1\\1&2\end{bmatrix} \]
reconstructs the vector from its coordinates:
\[ x=P_{\mathcal B}[x]_{\mathcal B}. \]
P_B * coords
\(\displaystyle \left[\begin{matrix}2\\8\end{matrix}\right]\)
Let \(V=P_2(\mathbb R)\) and let
\[ \mathcal B=(1,\ 1+t,\ 1+t+t^2). \]
Find the coordinate vector of
\[ p(t)=3+5t+2t^2 \]
relative to \(\mathcal B\).
We solve
\[ p(t)=c_1(1)+c_2(1+t)+c_3(1+t+t^2). \]
Matching coefficients gives
\[ c_1+c_2+c_3=3,\qquad c_2+c_3=5,\qquad c_3=2. \]
Therefore
\[ c_3=2,\qquad c_2=3,\qquad c_1=-2, \]
and hence
\[ [p]_{\mathcal B}=\begin{bmatrix}-2\\3\\2\end{bmatrix}. \]
# Polynomial coordinates are found by matching coefficients.
# Basis: 1, 1+t, 1+t+t^2.
P = sp.Matrix([[1, 1, 1],   # constant coefficients
               [0, 1, 1],   # t coefficients
               [0, 0, 1]])  # t^2 coefficients
p_std = sp.Matrix([3, 5, 2])
P.LUsolve(p_std)
\(\displaystyle \left[\begin{matrix}-2\\3\\2\end{matrix}\right]\)
Let \(V=M_2(\mathbb R)\) and use the standard matrix basis
\[ E_{11}=\begin{bmatrix}1&0\\0&0\end{bmatrix},\quad E_{12}=\begin{bmatrix}0&1\\0&0\end{bmatrix},\quad E_{21}=\begin{bmatrix}0&0\\1&0\end{bmatrix},\quad E_{22}=\begin{bmatrix}0&0\\0&1\end{bmatrix}. \]
For
\[ A=\begin{bmatrix}2&-1\\5&4\end{bmatrix}, \]
we have
\[ A=2E_{11}-E_{12}+5E_{21}+4E_{22}. \]
Therefore
\[ [A]_{\mathcal B}=\begin{bmatrix}2\\-1\\5\\4\end{bmatrix}. \]
This example shows a key idea: matrices themselves can be vectors once a basis has been chosen.
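The flattening can be checked mechanically. The sketch below is only an illustration: it assumes the basis order \((E_{11},E_{12},E_{21},E_{22})\) used above, and the names `unit_matrix`, `basis`, `coords`, and `rebuilt` are ours rather than part of the text.

```python
import sympy as sp

def unit_matrix(i, j):
    """Return the 2x2 matrix with a single 1 in position (i, j)."""
    E = sp.zeros(2, 2)
    E[i, j] = 1
    return E

# Standard basis of M_2(R) in the order E11, E12, E21, E22.
basis = [unit_matrix(0, 0), unit_matrix(0, 1),
         unit_matrix(1, 0), unit_matrix(1, 1)]

A = sp.Matrix([[2, -1], [5, 4]])

# For this basis order, the coordinates are the entries read row by row.
coords = [A[0, 0], A[0, 1], A[1, 0], A[1, 1]]

# Rebuild A as the linear combination of the basis matrices.
rebuilt = sp.zeros(2, 2)
for c, E in zip(coords, basis):
    rebuilt = rebuilt + c * E

print(coords, rebuilt == A)
```

A different ordering of the same four basis matrices would permute the entries of the coordinate vector, which is why the basis is written as an ordered list.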
A coordinate map converts an abstract vector-space problem into a problem in \(\mathbb F^n\) without losing the linear structure.
Let \(\mathcal B=(b_1,\ldots,b_n)\) be a basis for a vector space \(V\). The map
\[ \Phi_{\mathcal B}:V\to \mathbb F^n, \qquad \Phi_{\mathcal B}(v)=[v]_{\mathcal B} \]
is called the coordinate map determined by \(\mathcal B\).
For every basis \(\mathcal B\) of an \(n\)-dimensional vector space \(V\), the coordinate map
\[ \Phi_{\mathcal B}:V\to \mathbb F^n \]
is a one-to-one and onto linear transformation. Therefore
\[ V\cong \mathbb F^n. \]
Linearity follows from the uniqueness of coordinates:
\[ [v+w]_{\mathcal B}=[v]_{\mathcal B}+[w]_{\mathcal B}, \qquad [cv]_{\mathcal B}=c[v]_{\mathcal B}. \]
Injectivity follows because \([v]_{\mathcal B}=0\) means all coordinates of \(v\) are zero, so \(v=0\). Surjectivity follows because every column \((c_1,\ldots,c_n)^T\) comes from the vector \(c_1b_1+\cdots+c_nb_n\).
Different bases give different coordinate maps. The vector is the same; the coordinate column changes.
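Linearity of the coordinate map can be spot-checked symbolically. The following is a minimal sketch, assuming the basis \(\mathcal B=((1,1)^T,(-1,2)^T)\) from the earlier example; the helper `coord` and the symbols `v1, v2, w1, w2, c` are introduced here only for illustration.

```python
import sympy as sp

# Basis from the earlier R^2 example, stored as columns.
P_B = sp.Matrix([[1, -1], [1, 2]])

def coord(v):
    """Return the B-coordinate column of v by solving P_B * c = v."""
    return P_B.LUsolve(v)

v1, v2, w1, w2, c = sp.symbols('v1 v2 w1 w2 c')
v = sp.Matrix([v1, v2])
w = sp.Matrix([w1, w2])

# Both differences should simplify to the zero column.
additivity = (coord(v + w) - coord(v) - coord(w)).applyfunc(sp.simplify)
homogeneity = (coord(c * v) - c * coord(v)).applyfunc(sp.simplify)

print(additivity.T, homogeneity.T)
```

Because the entries are symbolic, the check covers every pair of vectors in \(\mathbb R^2\) and every scalar at once, not just a few numerical samples.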
A linear transformation is a machine that respects addition and scalar multiplication. To represent it by a matrix, we must choose coordinates for the input and output spaces.
Let \(T:V\to W\) be a linear transformation. Let
\[ \mathcal B=(b_1,\ldots,b_n) \]
be a basis for \(V\), and let
\[ \mathcal C=(w_1,\ldots,w_m) \]
be a basis for \(W\). The \(\mathcal B\)-\(\mathcal C\) matrix of \(T\) is the \(m\times n\) matrix
\[ [T]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} [T(b_1)]_{\mathcal C} & [T(b_2)]_{\mathcal C} & \cdots & [T(b_n)]_{\mathcal C} \end{bmatrix}. \]
The arrow notation reminds us of direction:
\[ [T]_{\mathcal C\leftarrow \mathcal B} \quad\text{takes}\quad \mathcal B\text{-coordinates as input and returns }\mathcal C\text{-coordinates as output.} \]
Let \(T:V\to W\) be linear, and let
\[ M=[T]_{\mathcal C\leftarrow \mathcal B}. \]
Then for every \(v\in V\),
\[ [T(v)]_{\mathcal C}=M[v]_{\mathcal B}. \]
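To build the matrix of \(T\): apply \(T\) to each basis vector \(b_j\) of \(\mathcal B\), write \(T(b_j)\) in \(\mathcal C\)-coordinates, and use \([T(b_j)]_{\mathcal C}\) as the \(j\)-th column.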
Let
\[ D:P_3(\mathbb R)\to P_2(\mathbb R),\qquad D(p)=p'. \]
Use the bases
\[ \mathcal B=(1,t,t^2,t^3),\qquad \mathcal C=(1,t,t^2). \]
Then
\[ D(1)=0, \quad D(t)=1, \quad D(t^2)=2t, \quad D(t^3)=3t^2. \]
Thus
\[ [D]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} 0&1&0&0\\ 0&0&2&0\\ 0&0&0&3 \end{bmatrix}. \]
For
\[ p(t)=4-2t+5t^2+t^3, \qquad [p]_{\mathcal B}=\begin{bmatrix}4\\-2\\5\\1\end{bmatrix}, \]
we get
\[ [D(p)]_{\mathcal C} = \begin{bmatrix} 0&1&0&0\\ 0&0&2&0\\ 0&0&0&3 \end{bmatrix} \begin{bmatrix}4\\-2\\5\\1\end{bmatrix} = \begin{bmatrix}-2\\10\\3\end{bmatrix}. \]
Therefore
\[ D(p)=-2+10t+3t^2. \]
D = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 2, 0],
               [0, 0, 0, 3]])
p = sp.Matrix([4, -2, 5, 1])
D * p
\(\displaystyle \left[\begin{matrix}-2\\10\\3\end{matrix}\right]\)
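The matrix above was typed in by hand. As a cross-check, it can also be assembled column by column from the rule \(D(p)=p'\). This is a sketch under the assumption that both bases are the monomial bases; the helper `coords_in_monomials` and the names `basis_B`, `images`, `D_built` are hypothetical.

```python
import sympy as sp

t = sp.symbols('t')

def coords_in_monomials(poly, degree):
    """Coefficients of poly in the basis (1, t, ..., t^degree), as a column."""
    poly = sp.expand(poly)
    return sp.Matrix([poly.coeff(t, k) for k in range(degree + 1)])

# Input basis B for P_3 and the images of its vectors under D(p) = p'.
basis_B = [sp.Integer(1), t, t**2, t**3]
images = [sp.diff(b, t) for b in basis_B]

# Column j is the C-coordinate vector of D(b_j), with C = (1, t, t^2).
D_built = sp.Matrix.hstack(*[coords_in_monomials(q, 2) for q in images])

print(D_built)
```

Replacing `sp.diff(b, t)` with another linear rule on polynomials should produce the matrix of that map in the same bases.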
Let
\[ T:M_2(\mathbb R)\to \mathbb R, \qquad T(A)=\operatorname{tr}(A). \]
Use the basis
\[ \mathcal B=(E_{11},E_{12},E_{21},E_{22}) \]
for \(M_2(\mathbb R)\) and \(\mathcal C=(1)\) for \(\mathbb R\). Since
\[ T(E_{11})=1, \quad T(E_{12})=0, \quad T(E_{21})=0, \quad T(E_{22})=1, \]
the matrix of \(T\) is
\[ [T]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix}1&0&0&1\end{bmatrix}. \]
Thus, for
\[ A=\begin{bmatrix}a&b\\c&d\end{bmatrix}, \qquad [A]_{\mathcal B}=\begin{bmatrix}a\\b\\c\\d\end{bmatrix}, \]
we obtain
\[ [T(A)]_{\mathcal C} =\begin{bmatrix}1&0&0&1\end{bmatrix} \begin{bmatrix}a\\b\\c\\d\end{bmatrix} =a+d. \]
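The same computation can be confirmed with symbolic entries. This is a small sketch; the variable names `T_row`, `coords_A`, and `result` are ours.

```python
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b], [c, d]])

# Matrix of the trace map relative to (E11, E12, E21, E22) and (1).
T_row = sp.Matrix([[1, 0, 0, 1]])

# Coordinate vector of A in the basis (E11, E12, E21, E22).
coords_A = sp.Matrix([a, b, c, d])

# A 1x4 matrix times a 4x1 column gives a 1x1 matrix; extract its entry.
result = (T_row * coords_A)[0, 0]

print(result, A.trace())
```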
Let
\[ I:P_2(\mathbb R)\to P_3(\mathbb R), \qquad I(p)(t)=\int_0^t p(s)\,ds. \]
Using standard bases
\[ \mathcal B=(1,t,t^2),\qquad \mathcal C=(1,t,t^2,t^3), \]
we have
\[ I(1)=t,\qquad I(t)=\frac{t^2}{2},\qquad I(t^2)=\frac{t^3}{3}. \]
Therefore
\[ [I]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} 0&0&0\\ 1&0&0\\ 0&\frac12&0\\ 0&0&\frac13 \end{bmatrix}. \]
I = sp.Matrix([[0, 0, 0],
               [1, 0, 0],
               [0, sp.Rational(1, 2), 0],
               [0, 0, sp.Rational(1, 3)]])
q = sp.Matrix([6, -2, 3])  # 6 - 2t + 3t^2
I * q
\(\displaystyle \left[\begin{matrix}0\\6\\-1\\1\end{matrix}\right]\)
The output coordinate vector is
\[ \begin{bmatrix}0\\6\\-1\\1\end{bmatrix}, \]
so
\[ I(6-2t+3t^2)=6t-t^2+t^3. \]
Coordinate matrices make the same transformation visible in two languages: abstract vectors and coordinate columns.
For a linear transformation \(T:V\to W\), the matrix
\[ M=[T]_{\mathcal C\leftarrow \mathcal B} \]
makes the following diagram commute:
\[ \begin{array}{ccc} V & \xrightarrow{\ T\ } & W \\ \downarrow {[\cdot]_{\mathcal B}} & & \downarrow {[\cdot]_{\mathcal C}} \\ \mathbb F^n & \xrightarrow{\ M\ } & \mathbb F^m. \end{array} \]
Commutativity means that the two paths from \(V\) to \(\mathbb F^m\) give the same answer:
\[ [T(v)]_{\mathcal C}=M[v]_{\mathcal B}. \]
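There are two ways to use the machine \(T\). Route one: apply \(T\) to \(v\) in \(V\), then take the \(\mathcal C\)-coordinates of \(T(v)\). Route two: take the \(\mathcal B\)-coordinates of \(v\), then multiply by \(M\).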
The theorem says both routes agree.
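Both routes can be followed explicitly for the integration map from the earlier example. The sketch below assumes the monomial bases used there and a general polynomial \(a_0+a_1t+a_2t^2\); the names `route1`, `route2`, and the symbols `a0, a1, a2` are illustrative.

```python
import sympy as sp

t, s, a0, a1, a2 = sp.symbols('t s a0 a1 a2')

# A general element of P_2 and its coordinates in B = (1, t, t^2).
p = a0 + a1*t + a2*t**2
p_B = sp.Matrix([a0, a1, a2])

# Matrix of the integration map relative to B and C = (1, t, t^2, t^3).
M = sp.Matrix([[0, 0, 0],
               [1, 0, 0],
               [0, sp.Rational(1, 2), 0],
               [0, 0, sp.Rational(1, 3)]])

# Route 1: apply the map to the abstract vector, then take C-coordinates.
image = sp.expand(sp.integrate(p.subs(t, s), (s, 0, t)))
route1 = sp.Matrix([image.coeff(t, k) for k in range(4)])

# Route 2: take B-coordinates first, then multiply by the matrix.
route2 = M * p_B

print(route1.T, route2.T)
```

The two columns agree for arbitrary \(a_0,a_1,a_2\), which is what commutativity of the diagram asserts for this particular map.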
Now suppose there are two coordinate systems for the same vector space. How do we translate from one to the other?
Let
\[ \mathcal B=(b_1,\ldots,b_n) \]
be a basis for \(\mathbb F^n\). The matrix
\[ P_{\mathcal B}=\begin{bmatrix}b_1&b_2&\cdots&b_n\end{bmatrix} \]
is called the basis matrix, or the change-of-coordinates matrix from \(\mathcal B\)-coordinates to standard coordinates.
Let \(\mathcal B\) be a basis for \(\mathbb F^n\). Then
\[ x=P_{\mathcal B}[x]_{\mathcal B} \]
and
\[ [x]_{\mathcal B}=P_{\mathcal B}^{-1}x. \]
The first formula reconstructs the vector. The second formula finds its coordinates.
Let
\[ \mathcal B=\left( \begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}-1\\2\end{bmatrix} \right). \]
Then
\[ P_{\mathcal B}=\begin{bmatrix}1&-1\\1&2\end{bmatrix}. \]
If
\[ [x]_{\mathcal B}=\begin{bmatrix}4\\2\end{bmatrix}, \]
then
\[ x=P_{\mathcal B}[x]_{\mathcal B} = \begin{bmatrix}1&-1\\1&2\end{bmatrix} \begin{bmatrix}4\\2\end{bmatrix} = \begin{bmatrix}2\\8\end{bmatrix}. \]
P_B = sp.Matrix([[1, -1], [1, 2]])
coords_B = sp.Matrix([4, 2])
P_B * coords_B
\(\displaystyle \left[\begin{matrix}2\\8\end{matrix}\right]\)
Let \(\mathcal B\) and \(\mathcal C\) be two bases of \(\mathbb F^n\). Then
\[ x=P_{\mathcal B}[x]_{\mathcal B}=P_{\mathcal C}[x]_{\mathcal C}. \]
Solving for \([x]_{\mathcal C}\) gives
\[ [x]_{\mathcal C}=P_{\mathcal C}^{-1}P_{\mathcal B}[x]_{\mathcal B}. \]
Let \(\mathcal B\) and \(\mathcal C\) be bases for \(\mathbb F^n\). The matrix that converts \(\mathcal B\)-coordinates into \(\mathcal C\)-coordinates is
\[ P_{\mathcal C\leftarrow \mathcal B}=P_{\mathcal C}^{-1}P_{\mathcal B}. \]
That is,
\[ [x]_{\mathcal C}=P_{\mathcal C\leftarrow \mathcal B}[x]_{\mathcal B}. \]
The matrix \(P_{\mathcal C\leftarrow \mathcal B}\) starts with \(\mathcal B\)-coordinates and ends with \(\mathcal C\)-coordinates. The arrow points toward the output coordinate system.
Let
\[ \mathcal B=\left( \begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}-1\\2\end{bmatrix} \right), \qquad \mathcal C=\left( \begin{bmatrix}2\\0\end{bmatrix}, \begin{bmatrix}0\\1\end{bmatrix} \right). \]
Then
\[ P_{\mathcal B}=\begin{bmatrix}1&-1\\1&2\end{bmatrix}, \qquad P_{\mathcal C}=\begin{bmatrix}2&0\\0&1\end{bmatrix}. \]
Thus
\[ P_{\mathcal C\leftarrow \mathcal B}=P_{\mathcal C}^{-1}P_{\mathcal B}. \]
P_B = sp.Matrix([[1, -1], [1, 2]])
P_C = sp.Matrix([[2, 0], [0, 1]])
P_C_from_B = P_C.inv() * P_B
P_C_from_B
\(\displaystyle \left[\begin{matrix}\frac{1}{2} & - \frac{1}{2}\\1 & 2\end{matrix}\right]\)
If \([x]_{\mathcal B}=(4,2)^T\), then the conversion to \(\mathcal C\)-coordinates and the reconstruction of \(x\) from either coordinate vector can be checked directly:
x_B = sp.Matrix([4, 2])
x_C = P_C_from_B * x_B
x_standard = P_B * x_B
x_C, x_standard, P_C * x_C
\(\displaystyle \left( \left[\begin{matrix}1\\8\end{matrix}\right], \ \left[\begin{matrix}2\\8\end{matrix}\right], \ \left[\begin{matrix}2\\8\end{matrix}\right]\right)\)
The coordinate vector changes, but the actual vector does not.
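Translation also works in the opposite direction. The sketch below, using the same two bases, builds \(P_{\mathcal B\leftarrow\mathcal C}=P_{\mathcal B}^{-1}P_{\mathcal C}\) by the same recipe and checks that it undoes \(P_{\mathcal C\leftarrow\mathcal B}\); the names `P_B_from_C` and `round_trip` are ours.

```python
import sympy as sp

P_B = sp.Matrix([[1, -1], [1, 2]])
P_C = sp.Matrix([[2, 0], [0, 1]])

P_C_from_B = P_C.inv() * P_B   # converts B-coordinates to C-coordinates
P_B_from_C = P_B.inv() * P_C   # converts C-coordinates to B-coordinates

# Translating there and back should change nothing.
round_trip = P_B_from_C * P_C_from_B

# Convert the C-coordinates (1, 8) back into B-coordinates.
x_C = sp.Matrix([1, 8])
x_B = P_B_from_C * x_C

print(round_trip, x_B)
```

The round trip returns the identity matrix, and the \(\mathcal C\)-coordinates \((1,8)^T\) convert back to \((4,2)^T\), the \(\mathcal B\)-coordinates we started with.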
Now let \(T:V\to V\) be a linear transformation from a vector space to itself. Such a map is called a linear operator. If we use one basis for both the input and output, the matrix of \(T\) is written
\[ [T]_{\mathcal B}. \]
Changing the basis changes the matrix of the same operator.
Let \(T:V\to V\) be a linear operator on a finite-dimensional vector space. Let \(\mathcal B\) and \(\mathcal C\) be bases of \(V\). Then
\[ [T]_{\mathcal C} = P_{\mathcal C\leftarrow \mathcal B}\,[T]_{\mathcal B}\,P_{\mathcal B\leftarrow \mathcal C}. \]
Equivalently, if
\[ S=P_{\mathcal B\leftarrow \mathcal C}, \]
then
\[ [T]_{\mathcal C}=S^{-1}[T]_{\mathcal B}S. \]
Matrices related by
\[ B=S^{-1}AS \]
are called similar matrices.
Similarity is not just an algebraic trick. Two matrices are similar exactly when they represent the same linear operator in two different coordinate systems.
Similar matrices have the same determinant, trace, rank, characteristic polynomial, eigenvalues, and nullity. These quantities describe the underlying operator rather than a particular coordinate costume.
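The change-of-basis formula can be tested on two non-standard bases. This is a sketch under stated assumptions: the operator is given by an illustrative standard matrix `A`, and its matrix in a basis with basis matrix \(P\) is taken to be \(P^{-1}AP\), which is the formula above with the standard basis as one of the two bases.

```python
import sympy as sp

A = sp.Matrix([[3, 1], [0, 2]])      # illustrative operator in standard coordinates
P_B = sp.Matrix([[1, -1], [1, 2]])   # basis B as columns
P_C = sp.Matrix([[2, 0], [0, 1]])    # basis C as columns

# Matrices of the same operator relative to B and relative to C.
T_B = P_B.inv() * A * P_B
T_C = P_C.inv() * A * P_C

# S converts C-coordinates into B-coordinates.
S = P_B.inv() * P_C

# The formula [T]_C = S^{-1} [T]_B S, computed from the B-matrix alone.
T_C_from_formula = S.inv() * T_B * S

print(T_C, T_C_from_formula)
```

Read right to left, the product \(S^{-1}[T]_{\mathcal B}S\) converts \(\mathcal C\)-coordinates to \(\mathcal B\)-coordinates, applies the operator there, and converts the result back.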
Let
\[ A=\begin{bmatrix}3&1\\0&2\end{bmatrix} \]
be the standard matrix of a linear operator \(T:\mathbb R^2\to\mathbb R^2\). Let
\[ \mathcal C=\left( \begin{bmatrix}1\\1\end{bmatrix}, \begin{bmatrix}1\\0\end{bmatrix} \right). \]
Then
\[ P_{\mathcal C}=\begin{bmatrix}1&1\\1&0\end{bmatrix}. \]
The matrix of the same operator relative to \(\mathcal C\) is
\[ [T]_{\mathcal C}=P_{\mathcal C}^{-1}AP_{\mathcal C}. \]
A = sp.Matrix([[3, 1], [0, 2]])
P_C = sp.Matrix([[1, 1], [1, 0]])
A_C = P_C.inv() * A * P_C
A_C
\(\displaystyle \left[\begin{matrix}2 & 0\\2 & 3\end{matrix}\right]\)
The two matrices look different, but they represent the same linear operator.
A.det(), A_C.det(), A.trace(), A_C.trace(), A.eigenvals(), A_C.eigenvals()
\(\displaystyle \left( 6, \ 6, \ 5, \ 5, \ \left\{ 2 : 1, \ 3 : 1\right\}, \ \left\{ 2 : 1, \ 3 : 1\right\}\right)\)
They have the same determinant, trace, and eigenvalues.
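The remaining invariants listed earlier (rank, characteristic polynomial, nullity) can be compared in the same way. A minimal sketch, with `lam` as an illustrative symbol for the polynomial variable:

```python
import sympy as sp

A = sp.Matrix([[3, 1], [0, 2]])
P_C = sp.Matrix([[1, 1], [1, 0]])
A_C = P_C.inv() * A * P_C

lam = sp.symbols('lambda')

# Rank and nullity (the number of vectors in a nullspace basis).
same_rank = A.rank() == A_C.rank()
same_nullity = len(A.nullspace()) == len(A_C.nullspace())

# Characteristic polynomials, compared as expanded expressions in lam.
char_A = A.charpoly(lam).as_expr()
char_AC = A_C.charpoly(lam).as_expr()
same_charpoly = sp.expand(char_A - char_AC) == 0

print(same_rank, same_nullity, same_charpoly)
```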
The purpose of changing basis is not only to translate coordinates. Often we choose a basis to make a transformation easier to understand.
The best possible situation is when the basis vectors are eigenvectors.
Let \(T:V\to V\) be a linear operator. A basis
\[ \mathcal B=(v_1,\ldots,v_n) \]
is called an eigenbasis for \(T\) if every \(v_i\) is an eigenvector of \(T\).
If
\[ T(v_i)=\lambda_i v_i, \]
then the matrix of \(T\) in the eigenbasis is diagonal:
\[ [T]_{\mathcal B}=\begin{bmatrix} \lambda_1&0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{bmatrix}. \]
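Let \(A\in\mathbb F^{n\times n}\). The following are equivalent: \(\mathbb F^n\) has a basis consisting of eigenvectors of \(A\); and there exist an invertible matrix \(P\) and a diagonal matrix \(D\) such that
\[ A=PDP^{-1}. \]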
In this case, the columns of \(P\) are eigenvectors of \(A\), and the diagonal entries of \(D\) are the corresponding eigenvalues.
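SymPy can test this condition directly. The sketch below uses two matrices chosen here purely for illustration: a symmetric matrix `A`, and a shear `J` that has only one independent eigenvector. It relies on the `Matrix.is_diagonalizable()` and `Matrix.diagonalize()` methods.

```python
import sympy as sp

A = sp.Matrix([[2, 1], [1, 2]])   # illustrative diagonalizable matrix
J = sp.Matrix([[1, 1], [0, 1]])   # illustrative shear: no eigenbasis exists

A_ok = A.is_diagonalizable()
J_ok = J.is_diagonalizable()

# diagonalize() returns (P, D) with A == P * D * P**-1.
P, D = A.diagonalize()
factorization_holds = P * D * P.inv() == A

# Each column of P should be an eigenvector for the matching diagonal entry of D.
columns_are_eigenvectors = all(
    A * P[:, j] == D[j, j] * P[:, j] for j in range(2)
)

print(A_ok, J_ok, factorization_holds, columns_are_eigenvectors)
```

The order of the eigenvalues in `D` and the scaling of the columns of `P` are choices made by the library, so they may differ from a hand computation while still giving a valid factorization.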
Let
\[ A=\begin{bmatrix}4&1\\2&3\end{bmatrix}. \]
A = sp.Matrix([[4, 1], [2, 3]])
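A.eigenvects()
\(\displaystyle \left[ \left( 2, \ 1, \ \left[ \left[\begin{matrix}- \frac{1}{2}\\1\end{matrix}\right]\right]\right), \ \left( 5, \ 1, \ \left[ \left[\begin{matrix}1\\1\end{matrix}\right]\right]\right)\right]\)
The eigenvalues are \(5\) and \(2\). Any nonzero scalar multiple of an eigenvector is again an eigenvector, so we may rescale the vector \((-\tfrac12,1)^T\) returned for the eigenvalue \(2\) by a factor of \(2\) to clear the fraction. Eigenvectors may therefore be chosen as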
\[ v_1=\begin{bmatrix}1\\1\end{bmatrix}, \qquad v_2=\begin{bmatrix}-1\\2\end{bmatrix}. \]
Let
\[ P=\begin{bmatrix}1&-1\\1&2\end{bmatrix}, \qquad D=\begin{bmatrix}5&0\\0&2\end{bmatrix}. \]
Then
\[ P^{-1}AP=D. \]
P = sp.Matrix([[1, -1], [1, 2]])
D = P.inv() * A * P
D
\(\displaystyle \left[\begin{matrix}5 & 0\\0 & 2\end{matrix}\right]\)
Thus, in the eigenbasis, the transformation simply scales the first coordinate by \(5\) and the second coordinate by \(2\).
Suppose a state vector changes by the rule
\[ x_{k+1}=Ax_k. \]
Then
\[ x_k=A^k x_0. \]
If \(A=PDP^{-1}\), then
\[ A^k=PD^kP^{-1}. \]
This makes long-term behavior much easier to compute.
A = sp.Matrix([[4, 1], [2, 3]])
P = sp.Matrix([[1, -1], [1, 2]])
D = sp.diag(5, 2)
x0 = sp.Matrix([3, 0])
for k in range(5):
    # Compare direct powers of A with the diagonalized formula.
    print(k, A**k * x0, P * (D**k) * P.inv() * x0)
0 Matrix([[3], [0]]) Matrix([[3], [0]])
1 Matrix([[12], [6]]) Matrix([[12], [6]])
2 Matrix([[54], [42]]) Matrix([[54], [42]])
3 Matrix([[258], [234]]) Matrix([[258], [234]])
4 Matrix([[1266], [1218]]) Matrix([[1266], [1218]])
The formula \(A^k=PD^kP^{-1}\) says: translate into eigen-coordinates, scale by powers of eigenvalues, then translate back.
Diagonalization is a change of language. In the original coordinates, the system mixes directions. In eigen-coordinates, each coordinate evolves independently.
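With a symbolic exponent, the same formula gives a closed form for \(x_k\). The sketch below assumes the matrices \(P\) and \(D\) and the initial state \(x_0=(3,0)^T\) from the example above; the symbol `k` and the names `D_k`, `x_k`, `matches` are ours.

```python
import sympy as sp

k = sp.symbols('k', integer=True, nonnegative=True)

A = sp.Matrix([[4, 1], [2, 3]])
P = sp.Matrix([[1, -1], [1, 2]])
x0 = sp.Matrix([3, 0])

# Raising a diagonal matrix to the k-th power raises each diagonal entry.
D_k = sp.diag(5**k, 2**k)

# Closed-form state after k steps: translate, scale, translate back.
x_k = (P * D_k * P.inv() * x0).applyfunc(sp.simplify)

# Compare with direct repeated multiplication for the first few steps.
matches = all(
    (x_k.subs(k, n) - A**n * x0).applyfunc(sp.simplify) == sp.zeros(2, 1)
    for n in range(6)
)

print(x_k, matches)
```

Working the product out by hand gives \(x_k=(2\cdot 5^k+2^k,\ 2\cdot 5^k-2^{k+1})^T\), so for large \(k\) the state is dominated by the eigenvalue \(5\) and points approximately along the eigenvector \((1,1)^T\).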
In data science, the same object may be represented in different coordinate systems.
A data point may be written in raw features:
\[ x=\begin{bmatrix}\text{height}\\ \text{weight}\\ \text{age}\end{bmatrix}. \]
Or it may be written in transformed features:
\[ [x]_{\mathcal B}=\begin{bmatrix}\text{overall size}\\ \text{contrast feature}\\ \text{age direction}\end{bmatrix}. \]
A change of basis can separate mixed information into more interpretable coordinates. Later, in spectral decomposition, SVD, PCA, Fourier bases, and wavelets, we will repeatedly use the same idea:
\[ \text{choose a basis that reveals structure.} \]
Suppose two raw features are stored in standard coordinates, and we choose a new basis
\[ b_1=\begin{bmatrix}1\\1\end{bmatrix}, \qquad b_2=\begin{bmatrix}1\\-1\end{bmatrix}. \]
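The first direction measures total size; the second measures contrast. Concretely, the \(b_1\)-coordinate of a point is the average of its two features, and the \(b_2\)-coordinate is half their difference.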
P = sp.Matrix([[1, 1], [1, -1]])
data = sp.Matrix([[6, 8, 10, 4],
                  [4, 8, 2, 6]])  # columns are data points
coords = P.inv() * data
coords
\(\displaystyle \left[\begin{matrix}5 & 8 & 6 & 5\\1 & 0 & 4 & -1\end{matrix}\right]\)
Each column has been rewritten in size-contrast coordinates. This is the beginning of the idea behind many feature transformations.
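No information is lost in this rewriting. The following sketch converts back to raw features and compares the new coordinates with the row-wise average and half-difference; the names `reconstructed`, `means`, and `half_diffs` are illustrative.

```python
import sympy as sp

P = sp.Matrix([[1, 1], [1, -1]])
data = sp.Matrix([[6, 8, 10, 4],
                  [4, 8, 2, 6]])  # columns are data points

coords = P.inv() * data

# Multiplying by P translates size-contrast coordinates back to raw features.
reconstructed = P * coords

# Average and half-difference of the two raw features, for each data point.
means = (data[0, :] + data[1, :]) / 2
half_diffs = (data[0, :] - data[1, :]) / 2

print(reconstructed == data, coords[0, :] == means, coords[1, :] == half_diffs)
```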
In optimization, variables often satisfy linear constraints. Instead of working in all of \(\mathbb R^n\), we can choose coordinates adapted to the feasible subspace.
For example, suppose
\[ x_1+x_2+x_3=0. \]
The solution space is a plane in \(\mathbb R^3\) with basis
\[ v_1=\begin{bmatrix}1\\-1\\0\end{bmatrix}, \qquad v_2=\begin{bmatrix}1\\0\\-1\end{bmatrix}. \]
Every feasible vector has the form
\[ x=s v_1+t v_2. \]
The coordinates \((s,t)\) describe feasible directions directly.
v1 = sp.Matrix([1, -1, 0])
v2 = sp.Matrix([1, 0, -1])
B = sp.Matrix.hstack(v1, v2)
s, t = sp.symbols('s t')
x = B * sp.Matrix([s, t])
x
\(\displaystyle \left[\begin{matrix}s + t\\- s\\- t\end{matrix}\right]\)
The constraint is automatically satisfied:
sp.Matrix([[1, 1, 1]]) * x
\(\displaystyle \left[\begin{matrix}0\end{matrix}\right]\)
This is an example of a useful basis: it builds the constraint into the coordinates.
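A basis adapted to a linear constraint need not be found by hand. The sketch below uses SymPy's `nullspace()` to compute one and checks that it spans the same plane as the basis \((v_1,v_2)\) chosen above; the names `constraint`, `null_basis`, and `stacked` are ours.

```python
import sympy as sp

# The constraint x1 + x2 + x3 = 0 written as a 1x3 matrix.
constraint = sp.Matrix([[1, 1, 1]])

# nullspace() returns a list of columns spanning {x : constraint * x = 0}.
null_basis = constraint.nullspace()

# The hand-picked basis from the text, stored as columns.
B = sp.Matrix([[1, 1],
               [-1, 0],
               [0, -1]])

# If both bases span the same plane, stacking them keeps the rank at 2.
stacked = sp.Matrix.hstack(B, *null_basis)

print(null_basis, stacked.rank())
```

The library's basis may differ from \((v_1,v_2)\) in sign or order. Any two bases of the same subspace are related by an invertible change-of-coordinates matrix, so either choice describes the feasible directions.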
Let
\[ \mathcal B=\left(\begin{bmatrix}1\\2\end{bmatrix},\begin{bmatrix}3\\1\end{bmatrix}\right), \qquad \mathcal C=\left(\begin{bmatrix}1\\0\end{bmatrix},\begin{bmatrix}1\\1\end{bmatrix}\right). \]
Find \(P_{\mathcal C\leftarrow \mathcal B}\). Then compute \([x]_{\mathcal C}\) when
\[ [x]_{\mathcal B}=\begin{bmatrix}2\\-1\end{bmatrix}. \]
Let
\[ T:P_2(\mathbb R)\to P_2(\mathbb R), \qquad T(p)=p+p'. \]
Use the basis \(\mathcal B=(1,t,t^2)\) for both domain and codomain. Find \([T]_{\mathcal B}\).
Let
\[ A=\begin{bmatrix}2&1\\0&3\end{bmatrix}, \qquad P=\begin{bmatrix}1&1\\0&1\end{bmatrix}. \]
Compute \(B=P^{-1}AP\). Verify that \(A\) and \(B\) have the same trace, determinant, and eigenvalues.
Find a basis in which the linear transformation with standard matrix
\[ A=\begin{bmatrix}1&2\\2&1\end{bmatrix} \]
is diagonal. Explain what the new coordinates mean geometrically.
Let
\[ \mathcal B=\left(\begin{bmatrix}2\\1\end{bmatrix},\begin{bmatrix}1\\1\end{bmatrix}\right). \]
Find \([x]_{\mathcal B}\) for
\[ x=\begin{bmatrix}7\\4\end{bmatrix}. \]
Let
\[ \mathcal B=(1,\ 1+t,\ t+t^2) \]
be a basis for \(P_2(\mathbb R)\). Find
\[ [p]_{\mathcal B} \]
for
\[ p(t)=4+7t+3t^2. \]
Let
\[ D:P_4(\mathbb R)\to P_3(\mathbb R), \qquad D(p)=p'. \]
Use standard bases
\[ \mathcal B=(1,t,t^2,t^3,t^4), \qquad \mathcal C=(1,t,t^2,t^3). \]
Find \([D]_{\mathcal C\leftarrow \mathcal B}\).
Let
\[ T:P_2(\mathbb R)\to P_2(\mathbb R), \qquad T(p)=t p'(t). \]
Use the standard basis \(\mathcal B=(1,t,t^2)\) for both domain and codomain. Find \([T]_{\mathcal B}\).
Let
\[ \mathcal B=\left(\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}1\\-1\end{bmatrix}\right), \qquad \mathcal C=\left(\begin{bmatrix}2\\0\end{bmatrix},\begin{bmatrix}0\\3\end{bmatrix}\right). \]
Find \(P_{\mathcal C\leftarrow \mathcal B}\).
Let
\[ A=\begin{bmatrix}4&1\\0&2\end{bmatrix}, \qquad P=\begin{bmatrix}1&1\\0&1\end{bmatrix}. \]
Compute
\[ B=P^{-1}AP. \]
Verify that \(A\) and \(B\) have the same determinant and trace.
Diagonalize the matrix
\[ A=\begin{bmatrix}3&2\\2&3\end{bmatrix}. \]
That is, find \(P\) and \(D\) such that
\[ A=PDP^{-1}. \]
Let
\[ N=\{x\in\mathbb R^3:x_1+x_2+x_3=0\}. \]
Find a basis for \(N\), and write
\[ x=\begin{bmatrix}2\\-5\\3\end{bmatrix} \]
in that basis.
We solve
\[ c_1\begin{bmatrix}2\\1\end{bmatrix}+c_2\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}7\\4\end{bmatrix}. \]
This gives
\[ 2c_1+c_2=7,\qquad c_1+c_2=4. \]
Subtracting gives \(c_1=3\), and then \(c_2=1\). Thus
\[ [x]_{\mathcal B}=\begin{bmatrix}3\\1\end{bmatrix}. \]
P = sp.Matrix([[2,1],[1,1]])
x = sp.Matrix([7,4])
P.LUsolve(x)
\(\displaystyle \left[\begin{matrix}3\\1\end{matrix}\right]\)
Write
\[ 4+7t+3t^2=c_1(1)+c_2(1+t)+c_3(t+t^2). \]
Matching coefficients gives
\[ c_1+c_2=4, \qquad c_2+c_3=7, \qquad c_3=3. \]
Thus \(c_3=3\), \(c_2=4\), and \(c_1=0\). Therefore
\[ [p]_{\mathcal B}=\begin{bmatrix}0\\4\\3\end{bmatrix}. \]
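As a check, the same system can be solved in SymPy. The sketch assumes, as in the worked example earlier in the chapter, that each basis polynomial is stored as a column of standard coefficients; the names `P`, `p_std`, and `coords` are illustrative.

```python
import sympy as sp

# Columns hold the coefficients of 1, 1+t, t+t^2 in the order (constant, t, t^2).
P = sp.Matrix([[1, 1, 0],
               [0, 1, 1],
               [0, 0, 1]])

# Standard coefficients of p(t) = 4 + 7t + 3t^2.
p_std = sp.Matrix([4, 7, 3])

coords = P.LUsolve(p_std)
print(coords)
```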
We compute
\[ D(1)=0, \quad D(t)=1, \quad D(t^2)=2t, \quad D(t^3)=3t^2, \quad D(t^4)=4t^3. \]
Therefore
\[ [D]_{\mathcal C\leftarrow \mathcal B} = \begin{bmatrix} 0&1&0&0&0\\ 0&0&2&0&0\\ 0&0&0&3&0\\ 0&0&0&0&4 \end{bmatrix}. \]
Use \(\mathcal B=(1,t,t^2)\). We compute
\[ T(1)=t\cdot 0=0, \]
\[ T(t)=t\cdot 1=t, \]
and
\[ T(t^2)=t\cdot 2t=2t^2. \]
Thus
\[ [T]_{\mathcal B} = \begin{bmatrix} 0&0&0\\ 0&1&0\\ 0&0&2 \end{bmatrix}. \]
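The diagonal matrix can be confirmed by building each column from the rule \(T(p)=t\,p'(t)\). This is a sketch; the helper `coords_in_monomials` and the names `basis`, `images`, `T_B` are hypothetical.

```python
import sympy as sp

t = sp.symbols('t')

def coords_in_monomials(poly, degree):
    """Coefficients of poly in the basis (1, t, ..., t^degree), as a column."""
    poly = sp.expand(poly)
    return sp.Matrix([poly.coeff(t, k) for k in range(degree + 1)])

# Standard basis of P_2 and the images of its vectors under T(p) = t * p'.
basis = [sp.Integer(1), t, t**2]
images = [t * sp.diff(b, t) for b in basis]

# Column j is the B-coordinate vector of T(b_j).
T_B = sp.Matrix.hstack(*[coords_in_monomials(q, 2) for q in images])

print(T_B)
```

Since the matrix is diagonal, the standard basis is already an eigenbasis for \(T\): each monomial \(t^k\) is scaled by its own degree \(k\).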
We have
\[ P_{\mathcal B}=\begin{bmatrix}1&1\\1&-1\end{bmatrix}, \qquad P_{\mathcal C}=\begin{bmatrix}2&0\\0&3\end{bmatrix}. \]
Therefore
\[ P_{\mathcal C\leftarrow \mathcal B}=P_{\mathcal C}^{-1}P_{\mathcal B} = \begin{bmatrix}1/2&1/2\\1/3&-1/3\end{bmatrix}. \]
P_B = sp.Matrix([[1,1],[1,-1]])
P_C = sp.Matrix([[2,0],[0,3]])
P_C.inv() * P_B
\(\displaystyle \left[\begin{matrix}\frac{1}{2} & \frac{1}{2}\\\frac{1}{3} & - \frac{1}{3}\end{matrix}\right]\)
Compute
\[ B=P^{-1}AP. \]
A = sp.Matrix([[4,1],[0,2]])
P = sp.Matrix([[1,1],[0,1]])
B = P.inv() * A * P
B, A.det(), B.det(), A.trace(), B.trace()
\(\displaystyle \left( \left[\begin{matrix}4 & 3\\0 & 2\end{matrix}\right], \ 8, \ 8, \ 6, \ 6\right)\)
The output shows
\[ B=\begin{bmatrix}4&3\\0&2\end{bmatrix}. \]
Also
\[ \det(A)=\det(B)=8, \qquad \operatorname{tr}(A)=\operatorname{tr}(B)=6. \]
The matrix
\[ A=\begin{bmatrix}3&2\\2&3\end{bmatrix} \]
has eigenvectors
\[ v_1=\begin{bmatrix}1\\1\end{bmatrix} \quad\text{with eigenvalue }5, \qquad v_2=\begin{bmatrix}1\\-1\end{bmatrix} \quad\text{with eigenvalue }1. \]
Thus
\[ P=\begin{bmatrix}1&1\\1&-1\end{bmatrix}, \qquad D=\begin{bmatrix}5&0\\0&1\end{bmatrix}. \]
Then
\[ A=PDP^{-1}. \]
A = sp.Matrix([[3,2],[2,3]])
P = sp.Matrix([[1,1],[1,-1]])
D = P.inv() * A * P
D
\(\displaystyle \left[\begin{matrix}5 & 0\\0 & 1\end{matrix}\right]\)
The equation
\[ x_1+x_2+x_3=0 \]
gives
\[ x_3=-x_1-x_2. \]
Let \(x_1=s\) and \(x_2=t\). Then
\[ x=\begin{bmatrix}s\\t\\-s-t\end{bmatrix} =s\begin{bmatrix}1\\0\\-1\end{bmatrix} +t\begin{bmatrix}0\\1\\-1\end{bmatrix}. \]
So a basis is
\[ \left(\begin{bmatrix}1\\0\\-1\end{bmatrix}, \begin{bmatrix}0\\1\\-1\end{bmatrix}\right). \]
For
\[ x=\begin{bmatrix}2\\-5\\3\end{bmatrix}, \]
we have \(s=2\) and \(t=-5\). Therefore
\[ [x]_{\mathcal B}=\begin{bmatrix}2\\-5\end{bmatrix}. \]
Ask an AI assistant:
I have the basis \(\mathcal B=((1,1)^T,(-1,2)^T)\) and the vector \(x=(2,8)^T\). Find \([x]_{\mathcal B}\) and explain why the answer is not the same object as \(x\).
Then check whether the assistant reconstructs \(x\) by multiplying \(P_{\mathcal B}[x]_{\mathcal B}\).
Ask:
Explain how to find the matrix of the derivative map \(D:P_3\to P_2\) using the standard polynomial bases. Give the matrix and explain what each column means.
Then compare the answer with Section 5.3.1.
Give an AI assistant two bases \(\mathcal B\) and \(\mathcal C\). Ask it for both matrices
\[ P_{\mathcal C\leftarrow \mathcal B} \quad\text{and}\quad P_{\mathcal B\leftarrow \mathcal C}. \]
Then verify that their product is the identity matrix.
Choose a random invertible matrix \(P\) and a matrix \(A\). Compute
\[ B=P^{-1}AP. \]
Ask an AI assistant to predict which quantities are the same for \(A\) and \(B\). Then check determinant, trace, rank, characteristic polynomial, and eigenvalues in Python.
Find a matrix with two distinct real eigenvalues. Ask an AI assistant to diagonalize it. Then ask the assistant to explain the diagonalization in terms of changing coordinate systems, not only as a formula.
This chapter introduced the coordinate view of linear algebra.
The main lesson is this:
\[ \boxed{\text{Coordinates change. Linear structure remains.}} \]