import sympy as sp
sp.init_printing()
How linear algebra measures information, redundancy, and lost directions
Guiding question.
When a vector space becomes large or abstract, how do we choose coordinates, count degrees of freedom, and measure what a linear transformation preserves or loses?
In Chapter 3 we learned that vectors do not have to be arrows in the plane. They can be columns, polynomials, matrices, functions, or signals. We also learned that linear transformations preserve the operations of addition and scalar multiplication.
Chapter 4 answers the next natural question: once we have a vector space, how do we build a coordinate system for it?
A basis is a coordinate system. It is a list of vectors that is just large enough to describe the whole space and just small enough to avoid redundancy. Dimension is the number of independent directions in the space. Rank–nullity is the accounting law that says every linear transformation divides the input information into two parts: information that survives in the image and information that disappears into the kernel.
This chapter is the bridge from abstract linear spaces to computation. It explains why row reduction finds bases, why rank measures information, why kernels describe lost directions, and why invertibility is the special case where nothing is lost.
This chapter has three layers.
Use Python or an AI assistant to check row reductions, compute ranks, and test conjectures. But always ask for an explanation in mathematical language: What is the space? What is the basis? What is the kernel? What information is lost?
Imagine several people give you directions to the same destination. Some directions may be necessary; some may repeat information already contained in the others. Linear independence is the mathematical test for whether a list of vectors contains redundancy.
Let \(V\) be a vector space over a field \(\mathbb F\). Vectors
\[ \vec v_1,\vec v_2,\ldots,\vec v_p\in V \]
are linearly independent if the equation
\[ x_1\vec v_1+x_2\vec v_2+\cdots+x_p\vec v_p=\vec 0 \]
has only the trivial solution
\[ x_1=x_2=\cdots=x_p=0. \]
They are linearly dependent if there exist scalars \(a_1,\ldots,a_p\in \mathbb F\), not all zero, such that
\[ a_1\vec v_1+a_2\vec v_2+\cdots+a_p\vec v_p=\vec 0. \]
Such an equation is called a nontrivial linear relation.
For an infinite set \(S\subseteq V\), linear independence means that every finite subset of \(S\) is linearly independent. In this chapter most computations use finite lists.
A dependent list contains at least one vector that can be built from the others. An independent list has no redundant vector.
Suppose \(\vec v_i\) is a linear combination of the preceding vectors \(\vec v_1,\ldots,\vec v_{i-1}\). Then removing \(\vec v_i\) does not change the span:
\[ \operatorname{span}\{\vec v_1,\ldots,\vec v_p\} = \operatorname{span}\{\vec v_1,\ldots,\vec v_{i-1},\vec v_{i+1},\ldots,\vec v_p\}. \]
Let \(\vec v_1,\ldots,\vec v_p\in V\).
When the vectors are columns in \(\mathbb F^n\), independence becomes a homogeneous linear system.
Let \(\vec v_1,\ldots,\vec v_p\in \mathbb F^n\) and let
\[ A=\begin{bmatrix}\vec v_1&\vec v_2&\cdots&\vec v_p\end{bmatrix}. \]
Then \(\{\vec v_1,\ldots,\vec v_p\}\) is linearly independent if and only if any, and hence all, of the following equivalent conditions hold:
If \(p>n\), then any set of \(p\) vectors in \(\mathbb F^n\) is linearly dependent.
The converse is false. Having \(p\leq n\) does not guarantee independence. For example, two equal nonzero vectors in \(\mathbb R^n\) are dependent.
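Both statements can be checked by computing null spaces. The sketch below uses illustrative vectors that do not come from the text, and the names `A_wide` and `A_equal` are chosen here only for the example.

```python
import sympy as sp

# Four vectors in R^3 (p = 4 > n = 3), stored as columns; entries are illustrative.
A_wide = sp.Matrix([[1, 0, 2, 1],
                    [0, 1, 1, 3],
                    [1, 1, 0, 2]])
# Three rows allow at most three pivots, so at least one column is free.
relations_wide = A_wide.nullspace()

# Two equal nonzero vectors in R^3 (p = 2 <= n = 3) are still dependent.
A_equal = sp.Matrix([[1, 1],
                     [2, 2],
                     [3, 3]])
relations_equal = A_equal.nullspace()

print(len(relations_wide))   # number of independent nontrivial relations
print(relations_equal)       # one relation, with x1 = -x2
```

Each vector returned by `nullspace()` is a list of coefficients for a nontrivial linear relation among the columns.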
Let
\[ \vec v_1=\begin{bmatrix}1\\-3\\4\end{bmatrix},\qquad \vec v_2=\begin{bmatrix}2\\-2\\5\end{bmatrix},\qquad \vec v_3=\begin{bmatrix}3\\-1\\6\end{bmatrix}. \]
Form the matrix with these vectors as columns:
\[ A=\begin{bmatrix} 1&2&3\\ -3&-2&-1\\ 4&5&6 \end{bmatrix}. \]
A = sp.Matrix([[1, 2, 3],
               [-3, -2, -1],
               [4, 5, 6]])
A_rref, pivots = A.rref()
A, A_rref, pivots
\(\displaystyle \left( \left[\begin{matrix}1 & 2 & 3\\-3 & -2 & -1\\4 & 5 & 6\end{matrix}\right], \ \left[\begin{matrix}1 & 0 & -1\\0 & 1 & 2\\0 & 0 & 0\end{matrix}\right], \ \left( 0, \ 1\right)\right)\)
The reduced row echelon form is
\[ \operatorname{rref}(A)= \begin{bmatrix} 1&0&-1\\ 0&1&2\\ 0&0&0 \end{bmatrix}. \]
Thus \(x_3\) is free, and the equations are
\[ x_1=x_3,\qquad x_2=-2x_3. \]
Taking \(x_3=1\) gives
\[ (x_1,x_2,x_3)=(1,-2,1). \]
Therefore
\[ \vec v_1-2\vec v_2+\vec v_3=\vec 0. \]
The third vector is redundant:
\[ \vec v_3=-\vec v_1+2\vec v_2. \]
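As a sanity check, the relation can be verified directly, and we can confirm that dropping the redundant vector leaves the span unchanged. This is a small sketch; comparing ranks is one convenient way to compare spans when one spanning list contains the other.

```python
import sympy as sp

v1 = sp.Matrix([1, -3, 4])
v2 = sp.Matrix([2, -2, 5])
v3 = sp.Matrix([3, -1, 6])

# The nontrivial relation found from the RREF.
relation = v1 - 2*v2 + v3

# Removing v3 does not shrink the span: both lists span a 2-dimensional space.
rank_all = sp.Matrix.hstack(v1, v2, v3).rank()
rank_without_v3 = sp.Matrix.hstack(v1, v2).rank()

print(relation.T, rank_all, rank_without_v3)
```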
Consider the polynomials in \(P_3(\mathbb R)\):
\[ p_1(t)=1+t,\qquad p_2(t)=1-t,\qquad p_3(t)=t^2+t^3,\qquad p_4(t)=t^2-t^3. \]
Suppose
\[ a_1p_1+a_2p_2+a_3p_3+a_4p_4=0. \]
Comparing coefficients of \(1,t,t^2,t^3\) gives
\[ \begin{aligned} a_1+a_2&=0, & a_1-a_2&=0,\\ a_3+a_4&=0, & a_3-a_4&=0. \end{aligned} \]
The coefficient matrix is
\[ \begin{bmatrix} 1&1&0&0\\ 1&-1&0&0\\ 0&0&1&1\\ 0&0&1&-1 \end{bmatrix}. \]
C = sp.Matrix([[1, 1, 0, 0],
               [1, -1, 0, 0],
               [0, 0, 1, 1],
               [0, 0, 1, -1]])
C.det(), C.rank(), C.rref()[0]
\(\displaystyle \left( 4, \ 4, \ \left[\begin{matrix}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{matrix}\right]\right)\)
Since the determinant is nonzero, the only solution is
\[ a_1=a_2=a_3=a_4=0. \]
So the four polynomials are linearly independent.
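The coefficient matrix does not have to be typed by hand. The following sketch builds it from the polynomials themselves, assuming we work in the standard basis \(1,t,t^2,t^3\); the variable names are chosen here for illustration.

```python
import sympy as sp

t = sp.symbols('t')
polys = [1 + t, 1 - t, t**2 + t**3, t**2 - t**3]
max_degree = 3

# Row k holds the coefficients of t^k; column j belongs to the j-th polynomial.
C = sp.Matrix([[sp.expand(p).coeff(t, k) for p in polys]
               for k in range(max_degree + 1)])

print(C)
print(C.rank() == len(polys))   # full column rank means the list is independent
```

The same pattern works for any finite list of polynomials once a maximum degree is fixed.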
The functions \(e^t,e^{2t},e^{3t}\) are linearly independent in the vector space of all real-valued functions on \(\mathbb R\).
One computational way to test this is to evaluate at three points. If
\[ a e^t+b e^{2t}+c e^{3t}=0 \quad \text{for all }t, \]
then evaluating at \(t=0,1,2\) gives
\[ \begin{bmatrix} 1&1&1\\ e&e^2&e^3\\ e^2&e^4&e^6 \end{bmatrix} \begin{bmatrix}a\\b\\c\end{bmatrix}=0. \]
This is a Vandermonde-type matrix: its columns have the form \((1,x,x^2)^T\) with the distinct nodes \(x=e,e^2,e^3\), so it is invertible. Therefore \(a=b=c=0\).
t = sp.symbols('t')
M = sp.Matrix([[1, 1, 1],
               [sp.E, sp.E**2, sp.E**3],
               [sp.E**2, sp.E**4, sp.E**6]])
sp.factor(M.det())
\(\displaystyle \left(-1 + e\right)^{3} \left(1 + e\right) e^{4}\)
A basis is a list of vectors that works like a dictionary. Every vector in the space can be translated into coordinates using the basis, and that translation is unique.
Let \(V\) be a vector space over \(\mathbb F\). A subset
\[ B=\{\vec b_1,\ldots,\vec b_n\} \]
is a basis for \(V\) if
A basis balances two competing goals. A smaller set has a better chance to be independent; a larger set has a better chance to span. A basis is both:
If \(\{\vec v_1,\ldots,\vec v_n\}\) is linearly independent in \(V\) and
\[ V=\operatorname{span}\{\vec w_1,\ldots,\vec w_m\}, \]
then
\[ n\leq m. \]
Let \(S=\{\vec v_1,\ldots,\vec v_p\}\) span a subspace \(H\) of \(V\).
Let \(V\) be finite-dimensional.
The standard basis of \(\mathbb F^n\) consists of the columns of the identity matrix:
\[ \vec e_1=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix},\quad \vec e_2=\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix},\quad \ldots,\quad \vec e_n=\begin{bmatrix}0\\0\\\vdots\\1\end{bmatrix}. \]
The vector space \(M_2(\mathbb F)\) of \(2\times 2\) matrices has standard basis
\[ E_{11}=\begin{bmatrix}1&0\\0&0\end{bmatrix},\quad E_{12}=\begin{bmatrix}0&1\\0&0\end{bmatrix},\quad E_{21}=\begin{bmatrix}0&0\\1&0\end{bmatrix},\quad E_{22}=\begin{bmatrix}0&0\\0&1\end{bmatrix}. \]
Indeed,
\[ \begin{bmatrix}a&b\\c&d\end{bmatrix} = aE_{11}+bE_{12}+cE_{21}+dE_{22}. \]
The standard basis for \(P_2(\mathbb R)\) is
\[ \{1,t,t^2\}. \]
But this is not the only basis. The set
\[ B=\{1,1+t,1+t+t^2\} \]
is also a basis. To test independence, suppose
\[ a(1)+b(1+t)+c(1+t+t^2)=0. \]
Comparing coefficients gives
\[ a+b+c=0,\qquad b+c=0,\qquad c=0. \]
Thus \(a=b=c=0\).
Let \(B=\{\vec b_1,\ldots,\vec b_n\}\) be a basis for \(V\). If
\[ \vec v=c_1\vec b_1+\cdots+c_n\vec b_n, \]
then the coordinate vector of \(\vec v\) relative to \(B\) is
\[ [\vec v]_B= \begin{bmatrix} c_1\\ \vdots\\ c_n \end{bmatrix}. \]
The basis tells us which directions are allowed. The coordinate vector tells us how much of each direction we need.
Let
\[ B=\left\{ \vec b_1=\begin{bmatrix}1\\1\end{bmatrix}, \vec b_2=\begin{bmatrix}1\\-1\end{bmatrix} \right\}. \]
Find the coordinates of
\[ \vec v=\begin{bmatrix}5\\1\end{bmatrix} \]
relative to \(B\).
We solve
\[ c_1\begin{bmatrix}1\\1\end{bmatrix} +c_2\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}5\\1\end{bmatrix}. \]
This gives
\[ c_1+c_2=5,\qquad c_1-c_2=1. \]
Hence
\[ c_1=3,\qquad c_2=2. \]
Therefore
\[ [\vec v]_B=\begin{bmatrix}3\\2\end{bmatrix}. \]
B = sp.Matrix([[1, 1],
               [1, -1]])
v = sp.Matrix([5, 1])
coords = B.LUsolve(v)
coords
\(\displaystyle \left[\begin{matrix}3\\2\end{matrix}\right]\)
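Coordinates should always survive a round trip: solving for them and then rebuilding the vector must return the original. The helper `coordinates` below is a hypothetical name introduced for this sketch, not a SymPy function.

```python
import sympy as sp

def coordinates(basis_vectors, v):
    """Solve B c = v, where the columns of B are the basis vectors (hypothetical helper)."""
    B = sp.Matrix.hstack(*basis_vectors)
    return B.LUsolve(v)

b1 = sp.Matrix([1, 1])
b2 = sp.Matrix([1, -1])
v = sp.Matrix([5, 1])

c = coordinates([b1, b2], v)
rebuilt = c[0]*b1 + c[1]*b2   # translate the coordinates back into a vector

print(c.T, rebuilt.T)
```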
Dimension is the number of coordinates needed to describe every vector in a space. It is not the number of objects in the space; most vector spaces contain infinitely many vectors. Dimension counts independent directions.
A vector space \(V\) is finite-dimensional if it has a finite basis. If no finite basis exists, then \(V\) is infinite-dimensional.
If \(V\) is finite-dimensional, then every basis of \(V\) has the same number of vectors.
This common number is called the dimension of \(V\) and is denoted \(\dim V\).
Let \(V\) be a vector space with \(\dim V=n\).
This theorem is one of the most useful time-saving tools in the course. In an \(n\)-dimensional space, if you already have exactly \(n\) vectors, then you only need to check one property:
\[ \dim \mathbb F^n=n. \]
\[ \dim P_n(\mathbb F)=n+1, \]
because the standard basis is
\[ \{1,t,t^2,\ldots,t^n\}. \]
\[ \dim M_{m\times n}(\mathbb F)=mn, \]
because there is one independent coordinate for each matrix entry.
For square matrices,
\[ \dim M_n(\mathbb F)=n^2. \]
Let
\[ B=\{1,1+t,1+t+t^2\}\subset P_2(\mathbb R). \]
Since \(\dim P_2(\mathbb R)=3\), once we know that the three vectors in \(B\) are independent, the Basis Theorem tells us that \(B\) is a basis. We do not also need to prove spanning separately.
# Coordinate columns of 1, 1+t, 1+t+t^2 in the standard basis (1,t,t^2)
P = sp.Matrix([[1, 1, 1],
               [0, 1, 1],
               [0, 0, 1]])
P.det(), P.rank()
\(\displaystyle \left( 1, \ 3\right)\)
The determinant is nonzero, so the columns are independent. Therefore the polynomials form a basis.
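Once \(B\) is known to be a basis, the same matrix converts standard coordinates into \(B\)-coordinates. The sketch below does this for an illustrative polynomial \(p(t)=2+3t+4t^2\), which is an assumption made for the example and does not appear elsewhere in the chapter.

```python
import sympy as sp

t = sp.symbols('t')
# Columns: coordinates of 1, 1+t, 1+t+t^2 in the standard basis (1, t, t^2).
P = sp.Matrix([[1, 1, 1],
               [0, 1, 1],
               [0, 0, 1]])

p = 2 + 3*t + 4*t**2                                   # illustrative polynomial
p_std = sp.Matrix([p.coeff(t, k) for k in range(3)])   # standard coordinates
c = P.LUsolve(p_std)                                   # coordinates relative to B

# Rebuild p from its B-coordinates as a check.
rebuilt = sp.expand(c[0]*1 + c[1]*(1 + t) + c[2]*(1 + t + t**2))
print(c.T, rebuilt)
```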
A subspace is a smaller vector space inside a larger one. A complement is a second subspace that fills in the missing directions.
Let \(U\) be a subspace of a vector space \(V\). A subspace \(W\subseteq V\) is called a complement of \(U\) in \(V\) if
\[ V=U\oplus W. \]
Equivalently, every vector \(\vec v\in V\) can be written uniquely as
\[ \vec v=\vec u+\vec w, \qquad \vec u\in U,\quad \vec w\in W. \]
Let \(U\) and \(W\) be finite-dimensional subspaces of a vector space \(V\). Then
\[ \dim(U+W)=\dim U+\dim W-\dim(U\cap W). \]
If \(U\cap W=\{\vec 0\}\), then
\[ \dim(U+W)=\dim U+\dim W. \]
In particular, if \(V=U\oplus W\), then
\[ \dim V=\dim U+\dim W. \]
Let
\[ U=\{(x,y,0):x,y\in\mathbb R\}, \qquad W=\{(0,0,z):z\in\mathbb R\}. \]
Then every vector in \(\mathbb R^3\) decomposes uniquely as
\[ (a,b,c)=(a,b,0)+(0,0,c). \]
Thus
\[ \mathbb R^3=U\oplus W. \]
Here
\[ \dim U=2, \qquad \dim W=1, \qquad \dim \mathbb R^3=3. \]
Let
\[ U=\{(x,y,0):x,y\in\mathbb R\}, \qquad W=\{(0,y,z):y,z\in\mathbb R\}. \]
Then \(U+W=\mathbb R^3\), but
\[ U\cap W=\{(0,y,0):y\in\mathbb R\}. \]
The intersection is one-dimensional, so the sum is not direct.
The dimension formula gives
\[ \dim(U+W)=2+2-1=3. \]
This explains how two planes in \(\mathbb R^3\) can span all of \(\mathbb R^3\) while still overlapping in a line.
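Both examples can be reproduced with ranks. In the sketch below, \(\dim(U+W)\) is the rank of the matrix whose columns are all the generators, and \(\dim(U\cap W)\) is then obtained from the dimension formula. The function name `sum_and_intersection_dims` is hypothetical.

```python
import sympy as sp

def sum_and_intersection_dims(U_gens, W_gens):
    """Return (dim U, dim W, dim(U+W), dim(U∩W)) from generating lists (hypothetical helper)."""
    U = sp.Matrix.hstack(*U_gens)
    W = sp.Matrix.hstack(*W_gens)
    dim_U, dim_W = U.rank(), W.rank()
    dim_sum = sp.Matrix.hstack(U, W).rank()   # U + W is spanned by all generators
    dim_int = dim_U + dim_W - dim_sum         # dimension formula
    return dim_U, dim_W, dim_sum, dim_int

e1, e2, e3 = sp.Matrix([1, 0, 0]), sp.Matrix([0, 1, 0]), sp.Matrix([0, 0, 1])

direct = sum_and_intersection_dims([e1, e2], [e3])         # xy-plane and z-axis
overlap = sum_and_intersection_dims([e1, e2], [e2, e3])    # xy-plane and yz-plane
print(direct, overlap)
```

An intersection dimension of \(0\) signals a direct sum; a positive value signals overlap.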
A linear map \(T:V\to W\) sends vectors in the input space to vectors in the output space. Some input directions may disappear; these form the kernel. The directions that appear in the output form the image.
Let \(T:V\to W\) be a linear transformation.
The kernel of \(T\) is
\[ \ker(T)=\{\vec v\in V:T(\vec v)=\vec 0\}. \]
The image of \(T\) is
\[ \operatorname{im}(T)=\{T(\vec v):\vec v\in V\}. \]
If \(V\) is finite-dimensional, the nullity of \(T\) is
\[ \operatorname{nullity}(T)=\dim\ker(T), \]
and the rank of \(T\) is
\[ \operatorname{rank}(T)=\dim\operatorname{im}(T). \]
Let \(T:V\to W\) be a linear transformation and suppose \(V\) is finite-dimensional. Then
\[ \dim V=\operatorname{rank}(T)+\operatorname{nullity}(T). \]
Equivalently,
\[ \dim V=\dim\operatorname{im}(T)+\dim\ker(T). \]
Rank–nullity is an information balance law. The input dimension splits into
\[ \text{input directions} = \text{visible output directions} + \text{lost kernel directions}. \]
If \(A\in\mathbb F^{m\times n}\) and \(T(\vec x)=A\vec x\), then
\[ \operatorname{rank}(A)+\dim\ker(A)=n. \]
The number \(n\) is the number of input variables, not the number of rows.
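A short experiment makes the role of \(n\) concrete. The matrices below are illustrative choices with different shapes; in each case the rank and the nullity add up to the number of columns.

```python
import sympy as sp

examples = [
    sp.Matrix([[1, 2], [2, 4], [3, 6]]),        # 3 x 2, rank 1
    sp.Matrix([[1, 0, 1, 0], [0, 1, 0, 1]]),    # 2 x 4, rank 2
    sp.eye(3),                                  # 3 x 3, rank 3
]

for A in examples:
    m, n = A.shape
    rank = A.rank()
    nullity = len(A.nullspace())                # number of kernel basis vectors
    print(f"{m} x {n}: rank {rank} + nullity {nullity} = {rank + nullity}, n = {n}")
```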
Let \(T:\mathbb R^4\to\mathbb R^3\) be defined by
\[ T(\vec x)=A\vec x, \qquad A= \begin{bmatrix} 0&0&2&8\\ 1&5&2&-5\\ 2&10&6&-2 \end{bmatrix}. \]
A = sp.Matrix([[0, 0, 2, 8],
               [1, 5, 2, -5],
               [2, 10, 6, -2]])
A_rref, pivots = A.rref()
A_rref, pivots, A.rank(), A.nullspace()
\(\displaystyle \left( \left[\begin{matrix}1 & 5 & 0 & -13\\0 & 0 & 1 & 4\\0 & 0 & 0 & 0\end{matrix}\right], \ \left( 0, \ 2\right), \ 2, \ \left[ \left[\begin{matrix}-5\\1\\0\\0\end{matrix}\right], \ \left[\begin{matrix}13\\0\\-4\\1\end{matrix}\right]\right]\right)\)
Row reduction gives
\[ \operatorname{rref}(A)= \begin{bmatrix} 1&5&0&-13\\ 0&0&1&4\\ 0&0&0&0 \end{bmatrix}. \]
The pivot columns are columns \(1\) and \(3\). Therefore a basis for the image is given by the corresponding original columns:
\[ \operatorname{im}(T)=\operatorname{span}\left\{ \begin{bmatrix}0\\1\\2\end{bmatrix}, \begin{bmatrix}2\\2\\6\end{bmatrix} \right\}. \]
To find the kernel, solve \(A\vec x=\vec 0\). From the RREF,
\[ x_1=-5x_2+13x_4, \qquad x_3=-4x_4. \]
Thus
\[ \vec x=x_2 \begin{bmatrix}-5\\1\\0\\0\end{bmatrix} +x_4 \begin{bmatrix}13\\0\\-4\\1\end{bmatrix}. \]
So
\[ \ker(T)=\operatorname{span}\left\{ \begin{bmatrix}-5\\1\\0\\0\end{bmatrix}, \begin{bmatrix}13\\0\\-4\\1\end{bmatrix} \right\}. \]
The rank is \(2\) and the nullity is \(2\). Rank–nullity says
\[ 4=2+2. \]
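It is worth checking the two bases against the original matrix rather than the RREF. The sketch below confirms that each kernel vector is sent to zero and that the non-pivot columns are combinations of the pivot columns, with coefficients read off from the RREF.

```python
import sympy as sp

A = sp.Matrix([[0, 0, 2, 8],
               [1, 5, 2, -5],
               [2, 10, 6, -2]])

k1 = sp.Matrix([-5, 1, 0, 0])
k2 = sp.Matrix([13, 0, -4, 1])

# Kernel vectors are lost directions: A sends them to the zero vector.
kernel_ok = (A*k1 == sp.zeros(3, 1)) and (A*k2 == sp.zeros(3, 1))

# Non-pivot columns (2 and 4) add nothing new to the image.
col1, col2, col3, col4 = (A[:, j] for j in range(4))
image_ok = (col2 == 5*col1) and (col4 == -13*col1 + 4*col3)

print(kernel_ok, image_ok)
```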
def matrix_summary(A):
    """Collect the RREF, rank, nullity, and bases for the image and kernel of A."""
    A = sp.Matrix(A)
    R, pivots = A.rref()
    # Pivot positions come from the RREF, but the image basis uses the original columns.
    image_basis = [A[:, j] for j in pivots]
    kernel_basis = A.nullspace()
    return {
        "A": A,
        "rref": R,
        "pivot_columns_0_based": pivots,
        "rank": A.rank(),
        "nullity": A.shape[1] - A.rank(),
        "image_basis": image_basis,
        "kernel_basis": kernel_basis,
    }
summary = matrix_summary(A)
summary
{'A': Matrix([
[0, 0, 2, 8],
[1, 5, 2, -5],
[2, 10, 6, -2]]),
'rref': Matrix([
[1, 5, 0, -13],
[0, 0, 1, 4],
[0, 0, 0, 0]]),
'pivot_columns_0_based': (0, 2),
'rank': 2,
'nullity': 2,
'image_basis': [Matrix([
[0],
[1],
[2]]),
Matrix([
[2],
[2],
[6]])],
'kernel_basis': [Matrix([
[-5],
[ 1],
[ 0],
[ 0]]),
Matrix([
[13],
[ 0],
[-4],
[ 1]])]}
Let
\[ D:P_3(\mathbb R)\to P_2(\mathbb R), \qquad D(p)=p'. \]
If
\[ p(t)=a+bt+ct^2+dt^3, \]
then
\[ D(p)=b+2ct+3dt^2. \]
The kernel consists of constant polynomials:
\[ \ker(D)=\operatorname{span}\{1\}. \]
The image is all of \(P_2(\mathbb R)\):
\[ \operatorname{im}(D)=P_2(\mathbb R). \]
Therefore
\[ \operatorname{nullity}(D)=1, \qquad \operatorname{rank}(D)=3. \]
Since \(\dim P_3(\mathbb R)=4\), rank–nullity gives
\[ 4=3+1. \]
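The same conclusion can be reached by writing \(D\) as a matrix. The sketch below assumes the standard bases \(\{1,t,t^2,t^3\}\) for the domain and \(\{1,t,t^2\}\) for the codomain, and builds the matrix column by column from the derivatives of the basis polynomials.

```python
import sympy as sp

t = sp.symbols('t')
domain_basis = [sp.Integer(1), t, t**2, t**3]

# Column j holds the coordinates of D(basis_j) in the basis (1, t, t^2).
M = sp.Matrix([[sp.diff(p, t).coeff(t, k) for p in domain_basis]
               for k in range(3)])

rank = M.rank()
kernel = M.nullspace()   # coordinate vectors of polynomials with zero derivative
print(M, rank, kernel)
```

The single kernel vector is the coordinate vector of the constant polynomial \(1\), matching \(\ker(D)=\operatorname{span}\{1\}\).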
Invertibility means perfect information preservation. A linear map is invertible when every output comes from exactly one input.
Let \(T:V\to W\) be a linear transformation between finite-dimensional vector spaces with
\[ \dim V=\dim W=n. \]
Then the following are equivalent:
For a square matrix \(A\in\mathbb F^{n\times n}\), this becomes the familiar invertible matrix theorem.
Let \(A\in\mathbb F^{n\times n}\). The following are equivalent:
Let
\[ A= \begin{bmatrix} 1&2&0\\ 0&1&3\\ 2&0&1 \end{bmatrix}. \]
Do the columns form a basis of \(\mathbb R^3\)?
A = sp.Matrix([[1, 2, 0],
               [0, 1, 3],
               [2, 0, 1]])
A.det(), A.rank(), A.rref()[0]
\(\displaystyle \left( 13, \ 3, \ \left[\begin{matrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{matrix}\right]\right)\)
Since \(\det(A)=13\ne 0\), the columns form a basis of \(\mathbb R^3\).
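Because the columns form a basis, every vector \(\vec b\in\mathbb R^3\) has exactly one coordinate vector \(\vec x\) with \(A\vec x=\vec b\). The sketch below illustrates this for one right-hand side, \(\vec b=(3,4,3)\), which is chosen here purely as an example.

```python
import sympy as sp

A = sp.Matrix([[1, 2, 0],
               [0, 1, 3],
               [2, 0, 1]])
b = sp.Matrix([3, 4, 3])      # illustrative right-hand side

kernel = A.nullspace()        # empty list: no input direction is lost
x = A.inv() * b               # the unique solution of A x = b

print(A.det(), kernel, x.T)
```

An empty kernel and a nonzero determinant are two views of the same fact: nothing is lost, so the solution is unique.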
Dimension also helps us understand important subspaces that arise in applications.
A hyperplane in \(\mathbb F^n\) is a subspace of dimension \(n-1\).
Equivalently, a hyperplane is the kernel of a nonzero linear functional
\[ \phi:\mathbb F^n\to\mathbb F. \]
For example,
\[ H=\{(x,y,z)\in\mathbb R^3:2x-y+3z=0\} \]
is a plane through the origin. It is the kernel of
\[ \phi(x,y,z)=2x-y+3z. \]
Since \(\phi:\mathbb R^3\to\mathbb R\) has rank \(1\), rank–nullity gives
\[ \dim\ker(\phi)=3-1=2. \]
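Treating \(\phi\) as a \(1\times 3\) matrix lets us compute a basis of the plane directly. This is a minimal sketch; the particular basis returned depends on how SymPy parametrizes the free variables.

```python
import sympy as sp

phi = sp.Matrix([[2, -1, 3]])      # the functional as a 1 x 3 matrix
plane_basis = phi.nullspace()      # basis of H = ker(phi)

print(phi.rank(), len(plane_basis))
for b in plane_basis:
    print(b.T, (phi * b)[0])       # each basis vector satisfies 2x - y + 3z = 0
```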
Let
\[ \operatorname{Sym}_n=\{A\in\mathbb R^{n\times n}:A^T=A\} \]
and
\[ \operatorname{Skew}_n=\{A\in\mathbb R^{n\times n}:A^T=-A\}. \]
Both are subspaces of \(\mathbb R^{n\times n}\).
Every real square matrix \(A\in\mathbb R^{n\times n}\) can be written uniquely as
\[ A=\frac{A+A^T}{2}+\frac{A-A^T}{2}, \]
where \(\frac{A+A^T}{2}\) is symmetric and \(\frac{A-A^T}{2}\) is skew-symmetric. Therefore
\[ \mathbb R^{n\times n}=\operatorname{Sym}_n\oplus \operatorname{Skew}_n. \]
The dimensions are
\[ \dim\operatorname{Sym}_n=\frac{n(n+1)}{2}, \qquad \dim\operatorname{Skew}_n=\frac{n(n-1)}{2}. \]
Their sum is
\[ \frac{n(n+1)}{2}+\frac{n(n-1)}{2}=n^2, \]
which matches
\[ \dim\mathbb R^{n\times n}=n^2. \]
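The dimension counts can be confirmed by building explicit bases. The sketch below uses the matrix units \(E_{ij}\) and flattens each basis matrix into a vector of length \(n^2\) so that ranks can be computed; the helper names `unit`, `sym_basis`, `skew_basis`, and `dim_span` are hypothetical.

```python
import sympy as sp

def unit(n, i, j):
    """Matrix unit E_ij: a 1 in position (i, j), zeros elsewhere."""
    E = sp.zeros(n, n)
    E[i, j] = 1
    return E

def sym_basis(n):
    diag = [unit(n, i, i) for i in range(n)]
    off = [unit(n, i, j) + unit(n, j, i) for i in range(n) for j in range(i + 1, n)]
    return diag + off

def skew_basis(n):
    return [unit(n, i, j) - unit(n, j, i) for i in range(n) for j in range(i + 1, n)]

def dim_span(mats):
    """Flatten each matrix into a row of length n^2 and take the rank."""
    return sp.Matrix([list(M) for M in mats]).rank()

n = 3
dims = (dim_span(sym_basis(n)), dim_span(skew_basis(n)),
        dim_span(sym_basis(n) + skew_basis(n)))
print(dims)   # symmetric, skew-symmetric, and combined dimensions
```

The combined list has rank \(n^2\), which reflects the direct-sum decomposition.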
A = sp.Matrix([[1, 2, 5],
               [0, -1, 4],
               [7, 3, 2]])
S = (A + A.T)/2
K = (A - A.T)/2
A, S, K, S + K, S.T == S, K.T == -K
(Matrix([
[1, 2, 5],
[0, -1, 4],
[7, 3, 2]]),
Matrix([
[1, 1, 6],
[1, -1, 7/2],
[6, 7/2, 2]]),
Matrix([
[ 0, 1, -1],
[-1, 0, 1/2],
[ 1, -1/2, 0]]),
Matrix([
[1, 2, 5],
[0, -1, 4],
[7, 3, 2]]),
True,
True)
These questions are designed to connect computation with interpretation. They are not only about getting an answer; they are about explaining why the answer makes sense.
A student says: “A vector has coordinates, so the coordinates are the vector.” Explain why this is not quite correct. Use a nonstandard basis of \(\mathbb R^2\) to illustrate your answer.
In a data set, suppose one feature vector is a linear combination of other feature vectors. What does this mean mathematically? What might it mean in a data-science model?
Explain why rank can be interpreted as the number of independent output directions of a linear transformation.
Let \(T:V\to W\) be linear. Explain why two vectors \(\vec v_1\) and \(\vec v_2\) have the same output under \(T\) exactly when \(\vec v_1-\vec v_2\in\ker(T)\).
Find two different complements of the \(x\)-axis in \(\mathbb R^2\).
Why is it often enough to check independence but not spanning when you have exactly \(n\) vectors in an \(n\)-dimensional space?
Suppose \(A\in\mathbb R^{m\times n}\) has a large kernel. What does this suggest about solving \(A\vec x=\vec b\)? Why might there be many solutions or no solutions?
Explain why \(\operatorname{Sym}_n\) and \(\operatorname{Skew}_n\) are natural subspaces. Why is their direct-sum decomposition useful?
A vector is an element of a vector space. Coordinates are the numbers used to describe that vector relative to a chosen basis.
For example, in the standard basis of \(\mathbb R^2\),
\[ \vec v=\begin{bmatrix}5\\1\end{bmatrix} \]
has coordinates \((5,1)\). But relative to
\[ B=\left\{\begin{bmatrix}1\\1\end{bmatrix},\begin{bmatrix}1\\-1\end{bmatrix}\right\}, \]
we found
\[ [\vec v]_B=\begin{bmatrix}3\\2\end{bmatrix}. \]
The vector did not change. The coordinate language changed.
If one feature vector is a linear combination of others, then it adds no new direction to the span. Mathematically, it is redundant.
In data science, this may mean that one feature is determined by other features. Such redundancy can cause inefficient computation, unstable regression coefficients, or unnecessary model complexity.
The image of \(T\) is the set of all outputs \(T(\vec v)\). Its dimension is the number of independent directions that actually appear in the output. This is the rank. Thus rank measures the effective output dimension of the transformation.
We have
\[ T(\vec v_1)=T(\vec v_2) \]
if and only if
\[ T(\vec v_1)-T(\vec v_2)=\vec 0. \]
By linearity,
\[ T(\vec v_1-\vec v_2)=\vec 0. \]
Thus
\[ \vec v_1-\vec v_2\in\ker(T). \]
So the kernel tells us which differences are invisible to the transformation.
Let
\[ U=\operatorname{span}\{(1,0)\}. \]
One complement is
\[ W_1=\operatorname{span}\{(0,1)\}. \]
Another complement is
\[ W_2=\operatorname{span}\{(1,1)\}. \]
Both satisfy
\[ \mathbb R^2=U\oplus W_i. \]
Complements are generally not unique.
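Different complements split the same vector differently. The sketch below decomposes the illustrative vector \((3,2)\) along \(U\) and each of the two complements by solving a \(2\times 2\) system.

```python
import sympy as sp

u = sp.Matrix([1, 0])              # spans the x-axis U
v = sp.Matrix([3, 2])              # illustrative vector to decompose

decompositions = []
for w in (sp.Matrix([0, 1]), sp.Matrix([1, 1])):
    M = u.row_join(w)              # columns u and w
    assert M.det() != 0            # u and w together form a basis of R^2
    a, b = M.LUsolve(v)            # v = a*u + b*w
    decompositions.append((a*u, b*w))

for part_U, part_W in decompositions:
    print(part_U.T, part_W.T)
```

The component in \(U\) changes when the complement changes, even though \(U\) itself is fixed.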
In an \(n\)-dimensional space, no independent set can have more than \(n\) vectors. Therefore, an independent set with exactly \(n\) vectors is already as large as possible. It must span the space, so it is a basis.
Similarly, a spanning set with exactly \(n\) vectors cannot contain redundancy. Otherwise, it could be reduced to a spanning set with fewer than \(n\) vectors, which is impossible because every spanning set contains a basis and every basis has exactly \(n\) vectors.
A large kernel means many input directions are sent to zero. Thus different inputs may produce the same output. If the system \(A\vec x=\vec b\) is consistent, then adding any vector in \(\ker(A)\) to a solution gives another solution. Therefore there may be infinitely many solutions.
But a large kernel does not guarantee consistency. If \(\vec b\) is not in \(\operatorname{im}(A)\), then there is no solution.
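Both situations can be seen with the \(3\times 4\) matrix from the worked kernel and image example. The sketch below tests consistency by comparing the rank of \(A\) with the rank of the augmented matrix; the helper `is_consistent` and the right-hand sides are assumptions made for illustration.

```python
import sympy as sp

A = sp.Matrix([[0, 0, 2, 8],
               [1, 5, 2, -5],
               [2, 10, 6, -2]])        # rank 2, nullity 2

def is_consistent(A, b):
    """A x = b is solvable exactly when appending b does not raise the rank."""
    return A.rank() == A.row_join(b).rank()

x0 = sp.Matrix([1, 1, 1, 1])
b_good = A * x0                        # lies in the image by construction
b_bad = sp.Matrix([1, 0, 0])           # not a combination of the columns of A

k = sp.Matrix([-5, 1, 0, 0])           # a kernel vector of A
another_solution = x0 + 7*k            # shifting by the kernel keeps A x = b_good

print(is_consistent(A, b_good), is_consistent(A, b_bad))
print(A * another_solution == b_good)
```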
Symmetric and skew-symmetric matrices are natural because they are defined by simple structural equations:
\[ A^T=A, \qquad A^T=-A. \]
They appear in quadratic forms, geometry, differential equations, mechanics, optimization, and numerical linear algebra.
The decomposition
\[ A=\frac{A+A^T}{2}+\frac{A-A^T}{2} \]
separates a matrix into two meaningful parts. The symmetric part controls many energy and quadratic-form properties, while the skew-symmetric part often represents rotation-like behavior.
Determine whether the following vectors are linearly independent:
\[ \vec v_1=\begin{bmatrix}1\\2\\1\end{bmatrix}, \quad \vec v_2=\begin{bmatrix}2\\4\\2\end{bmatrix}, \quad \vec v_3=\begin{bmatrix}0\\1\\1\end{bmatrix}. \]
If they are dependent, find a nontrivial linear relation.
Show that
\[ B=\{1+t,1-t,t^2\} \]
is a basis for \(P_2(\mathbb R)\). Then find the coordinate vector of
\[ p(t)=3+5t+2t^2 \]
relative to \(B\).
Let
\[ U=\operatorname{span}\{(1,0,1),(0,1,1)\} \]
and
\[ W=\operatorname{span}\{(1,1,2),(1,-1,0)\}. \]
Find \(\dim(U)\), \(\dim(W)\), \(\dim(U+W)\), and \(\dim(U\cap W)\).
Let \(T:\mathbb R^4\to\mathbb R^3\) be defined by
\[ T(\vec x)=A\vec x, \qquad A= \begin{bmatrix} 1&2&0&1\\ 0&1&1&3\\ 1&3&1&4 \end{bmatrix}. \]
Find bases for \(\ker(T)\) and \(\operatorname{im}(T)\). Verify rank–nullity.
Let
\[ D:P_4(\mathbb R)\to P_3(\mathbb R), \qquad D(p)=p'. \]
Find \(\ker(D)\), \(\operatorname{im}(D)\), \(\operatorname{rank}(D)\), and \(\operatorname{nullity}(D)\).
Determine whether the columns of
\[ A= \begin{bmatrix} 1&0&2\\ 0&1&-1\\ 2&1&3 \end{bmatrix} \]
form a basis of \(\mathbb R^3\).
Find dimensions of the following subspaces of \(M_3(\mathbb R)\):
Let
\[ H=\{(x_1,x_2,x_3,x_4)\in\mathbb R^4:x_1-2x_2+x_3+3x_4=0\}. \]
Find a basis for \(H\) and compute \(\dim H\).
Form the matrix with the vectors as columns:
\[ A=\begin{bmatrix} 1&2&0\\ 2&4&1\\ 1&2&1 \end{bmatrix}. \]
A = sp.Matrix([[1, 2, 0],
               [2, 4, 1],
               [1, 2, 1]])
A.rref(), A.nullspace()
\(\displaystyle \left( \left( \left[\begin{matrix}1 & 2 & 0\\0 & 0 & 1\\0 & 0 & 0\end{matrix}\right], \ \left( 0, \ 2\right)\right), \ \left[ \left[\begin{matrix}-2\\1\\0\end{matrix}\right]\right]\right)\)
The first two vectors satisfy
\[ \vec v_2=2\vec v_1. \]
Therefore the vectors are dependent. A nontrivial relation is
\[ -2\vec v_1+\vec v_2+0\vec v_3=\vec 0. \]
Write
\[ a(1+t)+b(1-t)+c t^2=0. \]
Comparing coefficients gives
\[ a+b=0, \qquad a-b=0, \qquad c=0. \]
Thus \(a=b=c=0\), so the set is independent. Since \(\dim P_2(\mathbb R)=3\), the set is a basis.
To find coordinates of \(p(t)=3+5t+2t^2\), solve
\[ a(1+t)+b(1-t)+c t^2=3+5t+2t^2. \]
This gives
\[ a+b=3, \qquad a-b=5, \qquad c=2. \]
So
\[ a=4, \qquad b=-1, \qquad c=2. \]
Therefore
\[ [p]_B=\begin{bmatrix}4\\-1\\2\end{bmatrix}. \]
a, b, c = sp.symbols('a b c')
sol = sp.solve([sp.Eq(a + b, 3), sp.Eq(a - b, 5), sp.Eq(c, 2)], [a, b, c])
sol
\(\displaystyle \left\{ a : 4, \ b : -1, \ c : 2\right\}\)
Let the four generators be columns of a matrix:
\[ A=\begin{bmatrix} 1&0&1&1\\ 0&1&1&-1\\ 1&1&2&0 \end{bmatrix}. \]
A = sp.Matrix([[1, 0, 1, 1],
               [0, 1, 1, -1],
               [1, 1, 2, 0]])
A.rank(), A.rref()
\(\displaystyle \left( 2, \ \left( \left[\begin{matrix}1 & 0 & 1 & 1\\0 & 1 & 1 & -1\\0 & 0 & 0 & 0\end{matrix}\right], \ \left( 0, \ 1\right)\right)\right)\)
For \(U\), the two generators are independent, so \(\dim U=2\). For \(W\), the two generators are independent, so \(\dim W=2\).
The rank of the combined matrix is \(2\), so
\[ \dim(U+W)=2. \]
The dimension formula gives
\[ \dim(U\cap W)=\dim U+\dim W-\dim(U+W)=2+2-2=2. \]
In fact, in this example \(U=W\).
A = sp.Matrix([[1, 2, 0, 1],
               [0, 1, 1, 3],
               [1, 3, 1, 4]])
R, pivots = A.rref()
rank = A.rank()
null_basis = A.nullspace()
image_basis = [A[:, j] for j in pivots]
R, pivots, rank, null_basis, image_basis
\(\displaystyle \left( \left[\begin{matrix}1 & 0 & -2 & -5\\0 & 1 & 1 & 3\\0 & 0 & 0 & 0\end{matrix}\right], \ \left( 0, \ 1\right), \ 2, \ \left[ \left[\begin{matrix}2\\-1\\1\\0\end{matrix}\right], \ \left[\begin{matrix}5\\-3\\0\\1\end{matrix}\right]\right], \ \left[ \left[\begin{matrix}1\\0\\1\end{matrix}\right], \ \left[\begin{matrix}2\\1\\3\end{matrix}\right]\right]\right)\)
The RREF is
\[ \begin{bmatrix} 1&0&-2&-5\\ 0&1&1&3\\ 0&0&0&0 \end{bmatrix}. \]
The pivot columns are \(1\) and \(2\), so a basis for the image is
\[ \left\{ \begin{bmatrix}1\\0\\1\end{bmatrix}, \begin{bmatrix}2\\1\\3\end{bmatrix} \right\}. \]
Solving \(A\vec x=\vec 0\) gives
\[ x_1=2x_3+5x_4, \qquad x_2=-x_3-3x_4. \]
Thus
\[ \ker(T)=\operatorname{span}\left\{ \begin{bmatrix}2\\-1\\1\\0\end{bmatrix}, \begin{bmatrix}5\\-3\\0\\1\end{bmatrix} \right\}. \]
Rank–nullity gives
\[ 4=2+2. \]
If
\[ p(t)=a_0+a_1t+a_2t^2+a_3t^3+a_4t^4, \]
then
\[ D(p)=a_1+2a_2t+3a_3t^2+4a_4t^3. \]
The kernel is the space of constant polynomials:
\[ \ker(D)=\operatorname{span}\{1\}. \]
Every polynomial in \(P_3(\mathbb R)\) occurs as a derivative of some polynomial in \(P_4(\mathbb R)\), so
\[ \operatorname{im}(D)=P_3(\mathbb R). \]
Therefore
\[ \operatorname{nullity}(D)=1, \qquad \operatorname{rank}(D)=4. \]
Since \(\dim P_4(\mathbb R)=5\),
\[ 5=4+1. \]
Compute the determinant:
A = sp.Matrix([[1, 0, 2],
               [0, 1, -1],
               [2, 1, 3]])
A.det(), A.rank(), A.rref()[0]
\(\displaystyle \left( 0, \ 2, \ \left[\begin{matrix}1 & 0 & 2\\0 & 1 & -1\\0 & 0 & 0\end{matrix}\right]\right)\)
The determinant is
\[ \det(A)=0. \]
Therefore the columns do not form a basis of \(\mathbb R^3\).
In \(M_3(\mathbb R)\):
\[ \frac{3(3+1)}{2}=6. \]
\[ \frac{3(3-1)}{2}=3. \]
The equation is
\[ x_1-2x_2+x_3+3x_4=0. \]
Solve for \(x_1\):
\[ x_1=2x_2-x_3-3x_4. \]
Thus
\[ \begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix} = x_2\begin{bmatrix}2\\1\\0\\0\end{bmatrix} +x_3\begin{bmatrix}-1\\0\\1\\0\end{bmatrix} +x_4\begin{bmatrix}-3\\0\\0\\1\end{bmatrix}. \]
A basis is
\[ \left\{ \begin{bmatrix}2\\1\\0\\0\end{bmatrix}, \begin{bmatrix}-1\\0\\1\\0\end{bmatrix}, \begin{bmatrix}-3\\0\\0\\1\end{bmatrix} \right\}. \]
Therefore
\[ \dim H=3. \]
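A quick check confirms the answer: each proposed basis vector must satisfy the defining equation, and the three vectors must be independent. The sketch below encodes the equation as a \(1\times 4\) matrix named `normal`, a name chosen here for illustration.

```python
import sympy as sp

normal = sp.Matrix([[1, -2, 1, 3]])    # coefficients of x1 - 2x2 + x3 + 3x4 = 0
basis = [sp.Matrix([2, 1, 0, 0]),
         sp.Matrix([-1, 0, 1, 0]),
         sp.Matrix([-3, 0, 0, 1])]

in_H = [(normal * b)[0] == 0 for b in basis]   # each vector lies in H
rank = sp.Matrix.hstack(*basis).rank()         # rank 3 means the vectors are independent

print(in_H, rank)
```

Since rank–nullity gives \(\dim H=4-1=3\), three independent vectors in \(H\) form a basis.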
These activities are designed to help students use AI tools responsibly. The goal is not to outsource the solution, but to improve mathematical communication and verification.
Ask an AI assistant:
Explain what a basis is at three levels: for a beginner, for a linear algebra student, and for a data scientist.
Then check whether the explanation includes both independence and spanning. Rewrite the answer in your own words.
Ask an AI assistant:
A student claims: “Three vectors in \(\mathbb R^3\) always form a basis.” Find the mistake and give a counterexample.
Then create your own counterexample and verify it using row reduction.
Ask:
For a matrix \(A\in\mathbb R^{3\times 5}\) with rank \(2\), explain rank–nullity geometrically and computationally.
Your final answer should mention:
Ask an AI assistant to generate:
Then use Python to verify the rank and nullity of each matrix.
# Template for checking AI-generated matrices
A = sp.Matrix([[1, 2, 3],
               [0, 1, 4],
               [5, 6, 0]])
A.rank(), A.det() if A.rows == A.cols else None, A.nullspace()
\(\displaystyle \left( 3, \ 1, \ \left[ \right]\right)\)
Choose one theorem from this chapter, such as rank–nullity or the Basis Theorem. Ask an AI assistant to explain it as a story. Then rewrite the explanation using precise mathematical definitions.
The ideas in this chapter appear throughout applied mathematics, data science, and AI.
A basis gives coordinates. Dimension counts coordinates. Rank counts visible output directions. Nullity counts invisible input directions. Rank–nullity says that every input direction is either seen in the output or lost in the kernel.