20  Chapter 20: Hilbert Spaces and Applications

Infinite-Dimensional Linear Algebra for Signals, Probability, PDEs, and Machine Learning

Author

He Wang

21 The story: when vectors become functions

In the first chapters of this course, a vector was often a column of numbers. Later, a vector became a polynomial, a matrix, a signal, or a state of a system. Chapter 20 makes this viewpoint explicit:

A vector can be any object that supports addition, scalar multiplication, length, angle, projection, and approximation.

The main challenge is that many natural spaces are infinite-dimensional. For example, a signal is a function, and a function may require infinitely many coordinates. To do linear algebra safely in such spaces, we need one extra idea:

Completeness: Cauchy sequences should have limits inside the space.

A Hilbert space is an inner product space that is complete. This one definition is the bridge from finite-dimensional linear algebra to Fourier analysis, least squares, probability, differential equations, quantum mechanics, and kernel methods.

21.1 Learning goals

By the end of this chapter, you should be able to:

  1. distinguish metric, normed, Banach, inner product, and Hilbert spaces;
  2. explain why completeness matters;
  3. compute projections using Gram matrices;
  4. interpret Fourier coefficients as coordinates in a Hilbert space;
  5. understand why \(L^2\) identifies functions that differ only on sets of measure zero;
  6. state and use the Riesz representation theorem;
  7. connect kernels and RKHS ideas to linear algebra.

22 From finite-dimensional geometry to Hilbert spaces

In \(\mathbb{R}^n\), the dot product gives length, angle, orthogonality, projection, and least squares. The guiding question is:

Which infinite-dimensional spaces still allow the same geometry?

The answer is: Hilbert spaces.

Tip: Big idea

Finite-dimensional linear algebra teaches us geometry. Hilbert space theory keeps the same geometry but allows infinitely many coordinates and limiting processes.

23 A hierarchy of spaces

The following hierarchy is useful:

\[ \text{set} \supset \text{metric space} \supset \text{normed vector space} \supset \text{inner product space} \supset \text{Hilbert space}. \]

There is also another branch:

\[ \text{complete normed vector space}=\text{Banach space}. \]

A Hilbert space is both an inner product space and a Banach space, with the norm coming from the inner product.

23.1 Definition: metric space

A metric space is a set \(S\) with a distance function

\[ d:S\times S\to \mathbb{R} \]

such that for all \(x,y,z\in S\),

  1. \(d(x,y)\ge 0\) and \(d(x,y)=0\) if and only if \(x=y\);
  2. \(d(x,y)=d(y,x)\);
  3. \(d(x,z)\le d(x,y)+d(y,z)\).

23.2 Definition: Cauchy sequence and completeness

A sequence \((x_n)\) in a metric space is Cauchy if for every \(\varepsilon>0\), there exists \(N\) such that

\[ d(x_m,x_n)<\varepsilon \qquad\text{whenever }m,n\ge N. \]

A metric space is complete if every Cauchy sequence converges to a point in the space.

A Cauchy sequence is a sequence whose terms eventually become arbitrarily close to each other. It is trying to converge. Completeness says that whenever a sequence is internally convergent in this sense, the limit is not missing from the space.

For example, \(\mathbb{Q}\) is not complete because rational approximations can converge to \(\sqrt{2}\), which is not rational.
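A small sketch of this phenomenon, using exact rational arithmetic. The Newton (Babylonian) iteration \(x_{n+1}=\tfrac12(x_n+2/x_n)\) is an illustrative choice that is not part of the chapter: every term stays in \(\mathbb{Q}\) and the terms cluster together, yet no term squares to exactly 2.

```python
from fractions import Fraction

# Newton (Babylonian) iteration for sqrt(2), carried out entirely in Q.
x = Fraction(1)
terms = [x]
for _ in range(5):
    x = (x + 2 / x) / 2
    terms.append(x)

# Consecutive gaps shrink rapidly: the sequence behaves like a Cauchy sequence in Q.
gaps = [abs(terms[i + 1] - terms[i]) for i in range(len(terms) - 1)]

# Every term is rational, but no term squares to exactly 2.
squares_equal_two = [t * t == 2 for t in terms]

for t, g in zip(terms[1:], gaps):
    print(t, float(g))
```

The limit the terms are approaching exists in \(\mathbb{R}\) but is missing from \(\mathbb{Q}\), which is exactly what a failure of completeness looks like.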

23.3 Definition: normed vector space and Banach space

A norm on a vector space \(V\) is a function

\[ \|\cdot\|:V\to\mathbb{R} \]

such that for all \(u,v\in V\) and scalars \(c\),

  1. \(\|v\|\ge 0\) and \(\|v\|=0\) if and only if \(v=0\);
  2. \(\|cv\|=|c|\|v\|\);
  3. \(\|u+v\|\le \|u\|+\|v\|\).

A complete normed vector space is called a Banach space.

23.4 Definition: inner product space

A real inner product on a vector space \(V\) is a function

\[ \langle\cdot,\cdot\rangle:V\times V\to \mathbb{R} \]

such that for all \(u,v,w\in V\) and \(c\in\mathbb{R}\),

  1. \(\langle u,v\rangle=\langle v,u\rangle\);
  2. \(\langle u+v,w\rangle=\langle u,w\rangle+\langle v,w\rangle\);
  3. \(\langle cu,v\rangle=c\langle u,v\rangle\);
  4. \(\langle v,v\rangle\ge 0\), and \(\langle v,v\rangle=0\) if and only if \(v=0\).

For complex spaces, symmetry is replaced by conjugate symmetry:

\[ \langle u,v\rangle=\overline{\langle v,u\rangle}. \]

The norm induced by an inner product is

\[ \|v\|=\sqrt{\langle v,v\rangle}. \]

23.5 Example: matrix inner product

The space \(\mathbb{R}^{m\times n}\) has the Frobenius inner product

\[ \langle A,B\rangle=\operatorname{tr}(AB^T)=\sum_{i=1}^m\sum_{j=1}^n a_{ij}b_{ij}. \]

The induced norm is

\[ \|A\|_F=\sqrt{\sum_{i,j}a_{ij}^2}. \]
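The two expressions for the Frobenius inner product can be compared numerically. This is a minimal check with NumPy; the matrices `A` and `B` are arbitrary examples, not taken from the chapter.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [3.0, -1.0, 4.0]])
B = np.array([[2.0, 0.0, 1.0],
              [1.0, 5.0, -2.0]])

inner_trace = np.trace(A @ B.T)          # tr(A B^T)
inner_entrywise = np.sum(A * B)          # sum over i, j of a_ij * b_ij

norm_from_inner = np.sqrt(np.sum(A * A)) # sqrt(<A, A>)
norm_builtin = np.linalg.norm(A, "fro")  # NumPy's Frobenius norm

print(inner_trace, inner_entrywise)
print(norm_from_inner, norm_builtin)
```

Both lines should print matching pairs, which reflects that \(\mathbb{R}^{m\times n}\) with this inner product is just \(\mathbb{R}^{mn}\) with the dot product.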

23.6 Example: random variables as vectors

Let \(X\) and \(Y\) be real random variables with finite second moments. Then

\[ \langle X,Y\rangle=\mathbb{E}(XY) \]

is an inner product after identifying random variables that are equal almost surely. The induced norm is

\[ \|X\|_2=\sqrt{\mathbb{E}(X^2)}. \]

This is one reason least squares and regression are naturally Hilbert space ideas.

24 Hilbert spaces

24.1 Definition: Hilbert space

A Hilbert space is a complete inner product space.

In a Hilbert space the familiar finite-dimensional tools (length, angle, orthogonality, projection) still work, and limits of Cauchy sequences stay inside the space.

24.2 Example: finite-dimensional Hilbert spaces

Every finite-dimensional inner product space over \(\mathbb{R}\) or \(\mathbb{C}\) is complete. Therefore \(\mathbb{R}^n\) and \(\mathbb{C}^n\) are Hilbert spaces with their standard inner products.

24.3 Example: the sequence space \(\ell^2\)

The space

\[ \ell^2=\left\{x=(x_1,x_2,\ldots):\sum_{n=1}^{\infty}|x_n|^2<\infty\right\} \]

with inner product

\[ \langle x,y\rangle=\sum_{n=1}^{\infty}x_n\overline{y_n} \]

is a Hilbert space.

Its standard orthonormal basis is

\[ e_1=(1,0,0,\ldots),\quad e_2=(0,1,0,\ldots),\quad \ldots \]

Every \(x\in \ell^2\) has the expansion

\[ x=\sum_{n=1}^{\infty}x_ne_n, \]

where convergence is in the \(\ell^2\) norm.
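Norm convergence of the expansion can be illustrated numerically for \(x_n=1/n\). The sketch below truncates the sequence at a large length `M`, which is an assumption made only so the computation is finite; the tail norm \(\|x-\sum_{n\le N}x_ne_n\|\) should shrink as \(N\) grows.

```python
import numpy as np

M = 100_000                       # truncation length (assumption: stands in for infinity)
n = np.arange(1, M + 1)
x = 1.0 / n                       # x_n = 1/n is square-summable

norm_sq = np.sum(x**2)            # approximates pi^2 / 6

def tail_norm(N):
    """Norm of x minus its N-term partial sum, within the truncation."""
    return np.sqrt(np.sum(x[N:] ** 2))

tails = [tail_norm(N) for N in (10, 100, 1000, 10000)]
print(norm_sq, np.pi**2 / 6)
print(tails)
```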

24.4 Example: \(L^2[-\pi,\pi]\)

The function space

\[ L^2[-\pi,\pi] = \left\{ f:[-\pi,\pi]\to\mathbb{C}\ \text{measurable}\ :\ \int_{-\pi}^{\pi}|f(x)|^2\,dx<\infty \right\} \]

has inner product

\[ \langle f,g\rangle = \int_{-\pi}^{\pi}f(x)\overline{g(x)}\,dx. \]

After identifying functions that agree except on a set of measure zero, \(L^2[-\pi,\pi]\) is a Hilbert space.

Warning: Important subtlety

In \(L^2\), a vector is not exactly one function formula. It is an equivalence class of functions that agree almost everywhere.

25 A non-example: polynomials are not complete

Let

\[ \mathcal{P} = \{\text{all real polynomials on }[0,1]\} \]

with inner product

\[ \langle f,g\rangle=\int_0^1 f(t)g(t)\,dt. \]

This is an inner product space, but it is not a Hilbert space.

25.1 Proposition

The space \(\mathcal{P}\) of all polynomials on \([0,1]\) with the \(L^2\) inner product is not complete.

Consider the partial Taylor polynomials

\[ p_m(x)=\sum_{k=0}^{m}\frac{x^k}{k!}. \]

For \(n>m\),

\[ p_n-p_m=\sum_{k=m+1}^{n}\frac{x^k}{k!}. \]

Using the triangle inequality in the \(L^2\) norm,

\[ \|p_n-p_m\|_{L^2} \le \sum_{k=m+1}^{n} \left\|\frac{x^k}{k!}\right\|_{L^2}. \]

But

\[ \left\|\frac{x^k}{k!}\right\|_{L^2} = \frac{1}{k!} \left(\int_0^1 x^{2k}\,dx\right)^{1/2} = \frac{1}{k!\sqrt{2k+1}}. \]

The series

\[ \sum_{k=0}^{\infty}\frac{1}{k!\sqrt{2k+1}} \]

converges. Hence \((p_m)\) is Cauchy in the \(L^2\) norm.

However, \(p_m\to e^x\) uniformly on \([0,1]\), hence also in \(L^2[0,1]\). The function \(e^x\) is not a polynomial. Therefore the Cauchy sequence \((p_m)\) does not converge to an element of \(\mathcal{P}\). Thus \(\mathcal{P}\) is not complete.
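The argument can be checked numerically. The sketch below approximates \(L^2[0,1]\) norms with Gauss-Legendre quadrature (the 40-node rule and the helper names `taylor` and `l2_norm` are illustrative choices) and prints both \(\|p_{2m}-p_m\|\) and \(\|e^x-p_m\|\) for a few values of \(m\).

```python
import math
import numpy as np

# Gauss-Legendre nodes and weights, mapped from [-1, 1] to [0, 1].
nodes, weights = np.polynomial.legendre.leggauss(40)
t = 0.5 * (nodes + 1.0)
w = 0.5 * weights

def taylor(m, t):
    """Evaluate p_m(t) = sum_{k=0}^m t^k / k!."""
    return sum(t**k / math.factorial(k) for k in range(m + 1))

def l2_norm(values):
    """Approximate L^2[0,1] norm of a function from its values at the nodes."""
    return np.sqrt(np.sum(w * values**2))

for m in (2, 4, 8):
    gap = l2_norm(taylor(2 * m, t) - taylor(m, t))   # ||p_{2m} - p_m||
    dist = l2_norm(np.exp(t) - taylor(m, t))         # ||e^x - p_m||
    print(m, gap, dist)
```

Both columns should decrease quickly: the sequence is Cauchy, and the function it approaches is \(e^x\), which lies outside \(\mathcal{P}\).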

25.2 Remark

For a fixed degree \(d\), the space

\[ \mathcal{P}_d=\{\text{polynomials of degree at most }d\} \]

is finite-dimensional. Hence it is complete and is a Hilbert space with the \(L^2\) inner product.

26 Orthogonal projection and least squares

The most important reason Hilbert spaces are useful is the projection theorem. It is the infinite-dimensional version of least squares.

26.1 Theorem: orthogonal projection theorem

Let \(\mathcal{H}\) be a Hilbert space and let \(\mathcal{M}\subseteq \mathcal{H}\) be a closed linear subspace. For every \(y\in\mathcal{H}\), there exists a unique element \(p\in\mathcal{M}\) such that

\[ \|y-p\|=\min_{w\in\mathcal{M}}\|y-w\|. \]

Moreover, \(p\) is characterized by the orthogonality condition

\[ y-p\perp \mathcal{M}, \]

meaning

\[ \langle y-p,w\rangle=0 \qquad \text{for all }w\in\mathcal{M}. \]

We write

\[ p=\operatorname{Proj}_{\mathcal{M}}y. \]

The finite-dimensional proof says: find the closest point, then show the residual is orthogonal to the subspace.

In an infinite-dimensional Hilbert space, the difficulty is existence of a closest point. Choose a minimizing sequence \((w_n)\) in \(\mathcal{M}\) such that

\[ \|y-w_n\|\to \inf_{w\in \mathcal{M}}\|y-w\|. \]

Using the parallelogram identity, one proves that \((w_n)\) is Cauchy. Since \(\mathcal{H}\) is complete and \(\mathcal{M}\) is closed, \(w_n\) converges to some \(p\in\mathcal{M}\). This \(p\) is the minimizer.

The orthogonality condition follows by minimizing

\[ \phi(t)=\|y-(p+tw)\|^2 \]

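for arbitrary \(w\in\mathcal{M}\). In the real case, \(\phi(t)=\|y-p\|^2-2t\langle y-p,w\rangle+t^2\|w\|^2\). Since \(p\) is closest, \(t=0\) is a minimum, so \(\phi'(0)=-2\langle y-p,w\rangle=0\), which gives \(\langle y-p,w\rangle=0\). In the complex case the same argument gives \(\operatorname{Re}\langle y-p,w\rangle=0\), and replacing \(w\) by \(iw\) handles the imaginary part.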

26.2 Projection onto a finite-dimensional subspace

Let

\[ \mathcal{M}=\operatorname{span}\{u_1,\ldots,u_m\} \]

inside a Hilbert space \(\mathcal{H}\). If

\[ p=c_1u_1+\cdots+c_mu_m, \]

then the condition \(y-p\perp \mathcal{M}\) gives

\[ \langle y-p,u_i\rangle=0, \qquad i=1,\ldots,m. \]

Therefore

\[ \sum_{j=1}^{m}c_j\langle u_j,u_i\rangle=\langle y,u_i\rangle. \]

In matrix form,

\[ G\mathbf{c}=\mathbf{b}, \]

where

\[ G_{ij}=\langle u_j,u_i\rangle, \qquad b_i=\langle y,u_i\rangle. \]

This is the Hilbert-space normal equation.
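The Gram system works for any inner product, not only the dot product. The following is a minimal sketch, assuming a hypothetical weighted inner product \(\langle u,v\rangle=u^TWv\) on \(\mathbb{R}^3\); the helper `project` and the matrix `W` are illustrative names, not part of the chapter.

```python
import numpy as np

def project(y, basis, inner):
    """Project y onto span(basis) by solving the Gram system G c = b.

    `inner(u, v)` is any real inner product; `basis` must be linearly independent.
    """
    m = len(basis)
    G = np.array([[inner(basis[j], basis[i]) for j in range(m)] for i in range(m)])
    b = np.array([inner(y, basis[i]) for i in range(m)])
    c = np.linalg.solve(G, b)
    p = sum(cj * uj for cj, uj in zip(c, basis))
    return c, p

# Hypothetical weighted inner product <u, v> = u^T W v on R^3.
W = np.diag([1.0, 2.0, 3.0])
inner = lambda u, v: u @ W @ v

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, 0.0, 1.0])
y = np.array([2.0, 1.0, 3.0])

c, p = project(y, [u1, u2], inner)
residual_checks = [inner(y - p, u) for u in (u1, u2)]
print(c, p, residual_checks)
```

The residual checks should be zero up to rounding, confirming that \(y-p\) is orthogonal to the subspace in the chosen inner product.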

27 Orthonormal systems and Fourier viewpoint

27.1 Definition: orthonormal set

A collection \(\{e_i\}_{i\in I}\) in an inner product space is orthonormal if

\[ \langle e_i,e_j\rangle = \begin{cases} 1,&i=j,\\ 0,&i\ne j. \end{cases} \]

27.2 Proposition

Every orthonormal set is linearly independent.

Suppose

\[ c_1e_1+\cdots+c_me_m=0. \]

Taking the inner product with \(e_j\) gives

\[ c_j= \langle c_1e_1+\cdots+c_me_m,e_j\rangle = \langle 0,e_j\rangle = 0. \]

Thus all coefficients are zero.

27.3 Theorem: Bessel inequality

Let \(\{e_1,e_2,\ldots\}\) be an orthonormal sequence in a Hilbert space \(\mathcal{H}\). Then for every \(f\in\mathcal{H}\),

\[ \sum_{n=1}^{\infty}|\langle f,e_n\rangle|^2 \le \|f\|^2. \]

For each \(N\), let

\[ p_N=\sum_{n=1}^{N}\langle f,e_n\rangle e_n. \]

Then \(p_N\) is the orthogonal projection of \(f\) onto \(\operatorname{span}\{e_1,\ldots,e_N\}\). Therefore

\[ \|f\|^2=\|p_N\|^2+\|f-p_N\|^2\ge \|p_N\|^2. \]

Since the \(e_n\) are orthonormal,

\[ \|p_N\|^2=\sum_{n=1}^{N}|\langle f,e_n\rangle|^2. \]

Letting \(N\to\infty\) gives the result.
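A finite-dimensional check of the inequality, assuming an orthonormal set obtained from a QR factorization of a random matrix (the dimensions and seed are arbitrary choices). The running sums of \(|\langle f,e_n\rangle|^2\) should never exceed \(\|f\|^2\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal columns from a reduced QR factorization: 4 orthonormal vectors in R^10.
Q, _ = np.linalg.qr(rng.standard_normal((10, 4)))
f = rng.standard_normal(10)

coeffs = Q.T @ f                   # <f, e_n> for n = 1, ..., 4
partial = np.cumsum(coeffs**2)     # running sums of |<f, e_n>|^2
norm_sq = f @ f                    # ||f||^2

p = Q @ coeffs                     # projection p_N with N = 4
print(partial, norm_sq)
```

Because only four of ten directions are used, the inequality is typically strict here; it becomes an equality when the orthonormal set spans the whole space.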

27.4 Definition: Hilbert basis

An orthonormal sequence \(\{e_n\}\) is a Hilbert basis, or complete orthonormal basis, if every \(f\in\mathcal{H}\) can be written as

\[ f=\sum_{n=1}^{\infty}\langle f,e_n\rangle e_n \]

with convergence in the Hilbert space norm.

27.5 Theorem: Parseval identity

If \(\{e_n\}\) is a Hilbert basis for \(\mathcal{H}\), then

\[ \|f\|^2=\sum_{n=1}^{\infty}|\langle f,e_n\rangle|^2. \]

27.6 Example: Fourier basis

In \(L^2[-\pi,\pi]\), the functions

\[ e_k(x)=\frac{1}{\sqrt{2\pi}}e^{ikx}, \qquad k\in\mathbb{Z}, \]

form an orthonormal basis. The Fourier coefficient of \(f\) is

\[ \widehat f(k)= \langle f,e_k\rangle = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi}f(x)e^{-ikx}\,dx. \]

Thus Fourier series are coordinate expansions in a Hilbert space.
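As a sketch of this coordinate viewpoint, the code below approximates the coefficients \(\langle f,e_k\rangle\) of the test function \(f(x)=x\) with a midpoint rule and compares the accumulated energy \(\sum_{|k|\le K}|\widehat f(k)|^2\) with \(\|f\|^2\). The test function, grid size, and helper name `coeff` are assumptions chosen for illustration.

```python
import numpy as np

# Midpoint rule on [-pi, pi]; the grid size is an arbitrary choice.
n_pts = 20000
dx = 2 * np.pi / n_pts
x = -np.pi + dx * (np.arange(n_pts) + 0.5)
f = x                                    # test function f(x) = x

def coeff(k):
    """Approximate <f, e_k> with e_k(x) = exp(i k x) / sqrt(2 pi)."""
    return np.sum(f * np.exp(-1j * k * x)) * dx / np.sqrt(2 * np.pi)

norm_sq = np.sum(np.abs(f) ** 2) * dx    # approximates ||f||^2 = 2 pi^3 / 3

for K in (1, 10, 100):
    energy = sum(abs(coeff(k)) ** 2 for k in range(-K, K + 1))
    print(K, energy, norm_sq)
```

The energy stays below \(\|f\|^2\) (Bessel) and approaches it as \(K\) grows (Parseval).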

28 Riesz representation theorem

The dual space of a finite-dimensional inner product space is naturally identified with the original space. The same remains true in Hilbert spaces if we restrict to continuous linear functionals.

28.1 Definition: bounded linear functional

Let \(\mathcal{H}\) be a Hilbert space. A linear functional

\[ L:\mathcal{H}\to \mathbb{R} \quad\text{or}\quad L:\mathcal{H}\to \mathbb{C} \]

is bounded if there exists \(C>0\) such that

\[ |L(f)|\le C\|f\| \]

for all \(f\in\mathcal{H}\).

28.2 Theorem: Riesz representation theorem

Let \(\mathcal{H}\) be a Hilbert space. For every bounded linear functional \(L\) on \(\mathcal{H}\), there exists a unique vector \(g\in\mathcal{H}\) such that

\[ L(f)=\langle f,g\rangle \]

for all \(f\in\mathcal{H}\).

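If \(L=0\), take \(g=0\). Otherwise, \(\ker L\) is a closed subspace of \(\mathcal{H}\) (because \(L\) is continuous) and is not all of \(\mathcal{H}\), so by the projection theorem \((\ker L)^\perp\ne\{0\}\). Choose a unit vector

\[ u\in(\ker L)^\perp. \]

Then \(L(u)\ne 0\), and for every \(f\in\mathcal{H}\) the vector \(f-\frac{L(f)}{L(u)}u\) lies in \(\ker L\). Hence every \(f\in\mathcal{H}\) decomposes as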

\[ f=w+\alpha u, \qquad w\in\ker L. \]

Then \(L(f)=\alpha L(u)\), and \(\alpha=\langle f,u\rangle\). Therefore

\[ L(f)=\langle f,\overline{L(u)}u\rangle \]

under the convention that the inner product is linear in the first variable. Uniqueness follows by testing the difference of two representing vectors against itself.
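The construction in the proof can be traced in a finite-dimensional example. The sketch below assumes \(\mathcal{H}=\mathbb{R}^3\) with a hypothetical weighted inner product \(\langle f,g\rangle=f^TWg\) and the functional \(L(f)=a^Tf\); the names `W`, `a`, and `inner` are illustrative.

```python
import numpy as np

# Hypothetical setup: H = R^3 with <f, g> = f^T W g, W symmetric positive definite.
W = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
a = np.array([1.0, -2.0, 0.5])

inner = lambda f, g: f @ W @ g
L = lambda f: a @ f                    # a bounded linear functional on H

# (ker L)^perp is spanned by W^{-1} a, since <w, W^{-1} a> = w^T a = L(w).
v = np.linalg.solve(W, a)
u = v / np.sqrt(inner(v, v))           # unit vector in (ker L)^perp
g = L(u) * u                           # representing vector from the proof (real case)

rng = np.random.default_rng(1)
f = rng.standard_normal(3)
print(L(f), inner(f, g))
```

In this example the representing vector works out to \(W^{-1}a\), so it depends on the inner product and not only on the functional.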

28.3 Example: integral functionals

Let \(\mathcal{H}=L^2[0,1]\) and fix \(g\in L^2[0,1]\). Then

\[ L(f)=\int_0^1 f(x)g(x)\,dx \]

is a bounded linear functional. Riesz says that every bounded linear functional on \(L^2[0,1]\) has this form for a unique \(g\in L^2[0,1]\).

29 Why \(L^2\) uses equivalence classes

The natural function Hilbert spaces are built using the Lebesgue integral. The reason is that Cauchy sequences of nice functions often converge to less nice functions.

29.1 Definition: positive and negative parts

For a real-valued function \(f\), define

\[ f_+(x)=\max\{f(x),0\}, \qquad f_-(x)=\max\{-f(x),0\}. \]

Then

\[ f=f_+-f_-, \qquad |f|=f_++f_-. \]

29.2 Definition: \(L^2\) equivalence

Two measurable functions \(f,g\) are identified in \(L^2\) if

\[ \int |f-g|^2=0. \]

Equivalently, they are equal except on a set of measure zero.

29.3 Example: changing one value

Let \(f(x)=0\) on \([0,1]\). Define \(g\) by

\[ g(1/2)=100, \qquad g(x)=0 \text{ otherwise}. \]

Then

\[ \int_0^1 |f(x)-g(x)|^2\,dx=0. \]

Thus \(f\) and \(g\) represent the same vector in \(L^2[0,1]\).

30 Kernel viewpoint and RKHS

A kernel is a function that behaves like an inner product after mapping data into a Hilbert space. This idea connects Hilbert spaces with modern machine learning.

30.1 Definition: positive semidefinite kernel

Let \(X\) be a set. A function

\[ K:X\times X\to\mathbb{R} \]

is a positive semidefinite kernel if it is symmetric, \(K(x,y)=K(y,x)\), and for every \(m\ge 1\) and all \(x_1,\ldots,x_m\in X\), the Gram matrix

\[ G=[K(x_i,x_j)]_{i,j=1}^{m} \]

is positive semidefinite.

30.2 Example: polynomial kernel

For \(x,y\in\mathbb{R}^d\) and a positive integer \(r\), the function

\[ K(x,y)=(1+x^Ty)^r \]

is a positive semidefinite kernel. It corresponds to taking inner products after mapping \(x\) into a larger feature space of monomials up to degree \(r\).
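For \(d=2\) and \(r=2\) the feature map can be written out explicitly. The sketch below uses one common choice of scaling for the monomials (the function `phi` is an illustrative assumption) and checks that the kernel Gram matrix equals the Gram matrix of the features and has no negative eigenvalues beyond rounding.

```python
import numpy as np

def poly_kernel(x, y, r=2):
    """Polynomial kernel K(x, y) = (1 + x^T y)^r."""
    return (1.0 + x @ y) ** r

def phi(x):
    """Explicit feature map for d = 2, r = 2 (scaled monomials up to degree 2)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1**2, x2**2, s * x1 * x2])

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))        # six sample points in R^2

G = np.array([[poly_kernel(xi, xj) for xj in X] for xi in X])
G_features = np.array([[phi(xi) @ phi(xj) for xj in X] for xi in X])

eigenvalues = np.linalg.eigvalsh(G)    # eigenvalues of the symmetric Gram matrix
print(np.allclose(G, G_features), eigenvalues)
```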

30.3 Definition: reproducing kernel Hilbert space

A Hilbert space \(\mathcal{H}\) of functions on a set \(X\) is called a reproducing kernel Hilbert space if for every \(x\in X\), evaluation at \(x\) is a bounded linear functional. By the Riesz theorem, there exists \(K_x\in\mathcal{H}\) such that

\[ f(x)=\langle f,K_x\rangle \qquad \text{for all }f\in\mathcal{H}. \]

The function

\[ K(x,y)=K_y(x) \]

is called the reproducing kernel.

Tip: Linear algebra interpretation

The equation

\[ f(x)=\langle f,K_x\rangle \]

says that evaluating a function is the same as taking an inner product with a special vector \(K_x\).
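The reproducing property is easy to verify when \(X\) is finite. This is a minimal sketch, assuming \(X=\{0,1,2,3\}\), a Gaussian kernel matrix `K`, and the inner product \(\langle f,g\rangle=f^TK^{-1}g\) on functions stored as vectors; all of these are illustrative choices.

```python
import numpy as np

# Hypothetical finite example: X = {0, 1, 2, 3}; functions on X are vectors in R^4.
pts = np.array([0.0, 1.0, 2.0, 3.0])
K = np.exp(-0.5 * (pts[:, None] - pts[None, :]) ** 2)   # Gaussian kernel matrix

def inner(f, g):
    """Inner product <f, g> = f^T K^{-1} g on this finite set."""
    return f @ np.linalg.solve(K, g)

f = np.array([2.0, -1.0, 0.5, 4.0])    # an arbitrary function on X

# K_x is the column of K indexed by x; the reproducing property says <f, K_x> = f(x).
reproduced = np.array([inner(f, K[:, x]) for x in range(4)])
print(reproduced)
```

The printed vector should match `f`, and the same inner product gives \(\langle K_x,K_y\rangle=K(x,y)\).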

31 Applications

31.1 Fourier analysis

In \(L^2[-\pi,\pi]\), the Fourier basis decomposes a signal into orthogonal frequency directions. Projection onto a finite-dimensional subspace

\[ \operatorname{span}\{e^{-iNx},\ldots,e^{iNx}\} \]

gives a low-frequency approximation of the signal.

31.2 Probability and statistics

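Random variables with finite second moment form a Hilbert space under \(\langle X,Y\rangle=\mathbb{E}(XY)\). For square-integrable \(Y\), orthogonal projection becomes conditional expectation:

\[ \mathbb{E}(Y\mid X) \]

is the projection of \(Y\) onto the closed subspace of square-integrable functions of \(X\). This is the Hilbert space foundation of least squares regression, which projects onto the smaller subspace of affine functions of \(X\).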
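On a finite probability space the projection can be computed exactly. The sketch below assumes one roll of a fair die, with \(Y\) the face value and \(X\) its parity (an illustrative setup), and checks that the residual \(Y-\mathbb{E}(Y\mid X)\) is orthogonal to several functions of \(X\).

```python
import numpy as np

# Hypothetical finite probability space: one roll of a fair die.
omega = np.arange(1, 7)
prob = np.full(6, 1 / 6)

Y = omega.astype(float)                # Y = face value
X = omega % 2                          # X = parity (1 for odd, 0 for even)

def expect(Z):
    """Expectation of a random variable given as a vector of values on omega."""
    return np.sum(prob * Z)

# E(Y | X): average Y over each level set of X.
cond = np.zeros(6)
for value in np.unique(X):
    mask = X == value
    cond[mask] = np.sum(prob[mask] * Y[mask]) / np.sum(prob[mask])

# The residual Y - E(Y|X) should be orthogonal to every function of X.
test_functions = (lambda x: np.ones_like(x, dtype=float),
                  lambda x: x,
                  lambda x: np.cos(3.0 * x))
for g in test_functions:
    print(expect((Y - cond) * g(X)))
```

Each printed inner product should be zero up to rounding, which is the orthogonality condition of the projection theorem in probabilistic form.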

31.3 PDEs and weak solutions

Many differential equations are solved not by searching for classically differentiable solutions, but by searching in a Hilbert space such as \(L^2\) or a Sobolev space. The equation is multiplied by a test function and integrated, and the resulting identity is required for all test functions in the space; this is the weak formulation.

31.4 Quantum mechanics

A quantum state is modeled by a unit vector in a complex Hilbert space. Observables are represented by self-adjoint linear operators. Orthogonal projection describes measurement onto a subspace of possible states.

31.5 Kernel methods

Support vector machines, Gaussian processes, and kernel ridge regression use kernels to perform linear algebra in high-dimensional or infinite-dimensional Hilbert spaces without explicitly writing the feature map.
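A minimal sketch of kernel ridge regression, assuming a Gaussian kernel on one-dimensional inputs. The kernel width, the ridge parameter `lam`, and the target function are arbitrary illustrative choices. Only Gram matrices appear in the computation; the feature map is never written down.

```python
import numpy as np

def gaussian_kernel(a, b, width=0.5):
    """Gram matrix K[i, j] = exp(-(a_i - b_j)^2 / (2 width^2)) for 1-D inputs."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * width**2))

x_train = np.linspace(0.0, 2.0 * np.pi, 25)
y_train = np.sin(x_train)

lam = 1e-6                             # ridge parameter (an arbitrary choice)
K = gaussian_kernel(x_train, x_train)

# Solve (K + lam I) alpha = y; the fitted function is sum_i alpha_i K(., x_i).
alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

x_test = np.linspace(0.5, 5.5, 50)
y_pred = gaussian_kernel(x_test, x_train) @ alpha
max_error = np.max(np.abs(y_pred - np.sin(x_test)))
print(max_error)
```

The fitted function is a linear combination of the kernel sections \(K(\cdot,x_i)\), so the whole method reduces to solving one linear system.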

32 Python computation 1: projection in a finite-dimensional Hilbert space

This computation projects a vector onto the span of two non-orthonormal vectors.

Code
import numpy as np

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
y  = np.array([2.0, 1.0, 3.0])

A = np.column_stack([v1, v2])

# Projection onto im(A): A(A^T A)^{-1}A^T y
proj = A @ np.linalg.solve(A.T @ A, A.T @ y)
residual = y - proj

print("Projection:", proj)
print("Residual:", residual)
print("A^T residual:", A.T @ residual)
Output
Projection: [2.66666667 0.33333333 2.33333333]
Residual: [-0.66666667  0.66666667  0.66666667]
A^T residual: [-2.22044605e-16 -4.44089210e-16]

The last line should be approximately zero. This verifies the orthogonality condition.

33 Python computation 2: best quadratic approximation to \(e^x\)

We approximate \(e^x\) on \([0,1]\) by a quadratic polynomial

\[ p(x)=c_0+c_1x+c_2x^2 \]

using the \(L^2\) inner product.

The normal equations are

\[ G\mathbf{c}=\mathbf{b}, \]

where

\[ G_{ij}=\int_0^1 x^{i+j}\,dx, \qquad b_i=\int_0^1 e^x x^i\,dx. \]

Code
import sympy as sp

x = sp.symbols("x")
basis = [1, x, x**2]
f = sp.exp(x)

G = sp.Matrix([[sp.integrate(basis[j]*basis[i], (x, 0, 1))
                for j in range(3)] for i in range(3)])

rhs = sp.Matrix([sp.integrate(f*basis[i], (x, 0, 1)) for i in range(3)])

c = G.LUsolve(rhs)
p = sp.expand(sum(c[i]*basis[i] for i in range(3)))

G, rhs, [sp.simplify(ci) for ci in c], p
Output
(Matrix([
 [  1, 1/2, 1/3],
 [1/2, 1/3, 1/4],
 [1/3, 1/4, 1/5]]),
 Matrix([
 [-1 + E],
 [     1],
 [-2 + E]]),
 [-105 + 39*E, 588 - 216*E, -570 + 210*E],
 -570*x**2 + 210*E*x**2 - 216*E*x + 588*x - 105 + 39*E)
Code
# Numerical values
[float(ci) for ci in c]
Output
[1.012991309902764, 0.8511250528462292, 0.8391839763994994]

34 Python computation 3: Fourier coefficients as coordinates

We approximate a square wave using finitely many sine terms.

Code
import numpy as np
import matplotlib.pyplot as plt

xs = np.linspace(-np.pi, np.pi, 1000)
f = np.sign(np.sin(xs))

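def fourier_square(xs, N):
    # Partial Fourier sum of the square wave sign(sin x):
    # only odd k contribute, with sine coefficient 4/(pi*k).
    s = np.zeros_like(xs)
    for k in range(1, N+1, 2):
        s += (4/np.pi) * np.sin(k*xs)/k
    return s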

for N in [1, 3, 9, 25]:
    plt.plot(xs, fourier_square(xs, N), label=f"N={N}")

plt.plot(xs, f, "--", label="square wave")
plt.legend()
plt.title("Fourier projection onto low-frequency sine modes")
plt.xlabel("x")
plt.ylabel("value")
plt.show()

35 Challenge questions

35.1 Challenge 1: why closed subspaces matter

Explain why the projection theorem can fail if the subspace is not closed. Use the polynomial subspace of \(L^2[0,1]\) as an intuitive example.

Let \(\mathcal{H}=L^2[0,1]\) and let \(\mathcal{M}\) be the space of polynomials. The polynomial space is dense in \(L^2[0,1]\), but it is not closed.

Take a function \(f\in L^2[0,1]\) that is not a polynomial, such as \(f(x)=e^x\). Since polynomials are dense,

\[ \inf_{p\in\mathcal{M}}\|f-p\|_{L^2}=0. \]

But no polynomial equals \(f\) as an \(L^2\) vector. Therefore there is no closest polynomial in \(\mathcal{M}\). The infimum is zero but is not attained.

35.2 Challenge 2: projection from Gram matrices

Let \(\mathcal{M}=\operatorname{span}\{u_1,\ldots,u_m\}\) in a Hilbert space. Derive the system for the coefficients of \(\operatorname{Proj}_{\mathcal{M}}y\).

Write

\[ p=c_1u_1+\cdots+c_mu_m. \]

The projection condition is

\[ y-p\perp \mathcal{M}. \]

It is enough to impose

\[ \langle y-p,u_i\rangle=0,\qquad i=1,\ldots,m. \]

Thus

\[ \langle y,u_i\rangle = \sum_{j=1}^{m}c_j\langle u_j,u_i\rangle. \]

This is the Gram system

\[ G\mathbf{c}=\mathbf{b}. \]

35.3 Challenge 3: Fourier coefficients as coordinates

Let \(\{e_n\}\) be an orthonormal basis for a Hilbert space. Explain why \(\langle f,e_n\rangle\) is the \(n\)-th coordinate of \(f\).

If

\[ f=\sum_{n=1}^{\infty}c_ne_n, \]

then taking the inner product with \(e_j\), and using continuity of the inner product to pass it through the infinite sum, gives

\[ \langle f,e_j\rangle = \sum_{n=1}^{\infty}c_n\langle e_n,e_j\rangle = c_j. \]

Thus \(\langle f,e_j\rangle\) is exactly the \(j\)-th coordinate of \(f\).

36 Practice problems

36.1 Problem 1

Let \(u_1=(1,1,0)\), \(u_2=(1,0,1)\), and \(y=(2,1,3)\) in \(\mathbb{R}^3\). Find the projection of \(y\) onto \(\operatorname{span}\{u_1,u_2\}\).

Let \(A=[u_1\ u_2]\). Then

\[ p=A(A^TA)^{-1}A^Ty. \]

Here

\[ A= \begin{bmatrix} 1&1\\ 1&0\\ 0&1 \end{bmatrix}. \]

Compute

\[ A^TA= \begin{bmatrix} 2&1\\ 1&2 \end{bmatrix}, \qquad A^Ty= \begin{bmatrix} 3\\ 5 \end{bmatrix}. \]

The coefficients solve the normal equations \(A^TA\,\mathbf{c}=A^Ty\):

\[ \begin{bmatrix} 2&1\\ 1&2 \end{bmatrix} \begin{bmatrix} c_1\\c_2 \end{bmatrix} = \begin{bmatrix} 3\\5 \end{bmatrix}. \]

Thus \(c_1=\frac13\) and \(c_2=\frac73\). Therefore

\[ p=\frac13u_1+\frac73u_2 = \left(\frac83,\frac13,\frac73\right). \]

36.2 Problem 2

Show that the sequences \(x=(1,\frac12,\frac13,\ldots)\) and \(y=(1,\frac12,\frac14,\frac18,\ldots)\) are both in \(\ell^2\), but \(z=(1,\frac1{\sqrt2},\frac1{\sqrt3},\ldots)\) is not.

For \(x\),

\[ \sum_{n=1}^{\infty}\left|\frac1n\right|^2 = \sum_{n=1}^{\infty}\frac1{n^2} = \frac{\pi^2}{6} <\infty. \]

So \(x\in\ell^2\).

For \(y\),

\[ \sum_{n=0}^{\infty}\left(\frac1{2^n}\right)^2 = \sum_{n=0}^{\infty}\frac1{4^n} = \frac{1}{1-\frac14} = \frac43. \]

So \(y\in \ell^2\).

For \(z\), the squared entries are \(\left|\frac1{\sqrt n}\right|^2=\frac1n\), and the harmonic series

\[ \sum_{n=1}^{\infty}\frac1n \]

diverges. So \(z\notin\ell^2\).

36.3 Problem 3

Let \(L(f)=\int_0^1 f(x)x^2\,dx\) on \(L^2[0,1]\). Find the Riesz representing vector.

The Riesz representation theorem says that

\[ L(f)=\langle f,g\rangle \]

for a unique \(g\in L^2[0,1]\). Since

\[ L(f)=\int_0^1 f(x)x^2\,dx, \]

we have

\[ g(x)=x^2. \]

37 AI companion activities

Use AI as a study partner, not as a replacement for your own reasoning.

37.1 Activity 1: explain the hierarchy

Ask an AI tool:

Explain the difference between metric spaces, normed spaces, Banach spaces, inner product spaces, and Hilbert spaces using examples from linear algebra.

Then check whether the answer correctly says that every Hilbert space is a Banach space, but not every Banach space is a Hilbert space.

37.2 Activity 2: generate projection examples

Ask:

Give me three examples of orthogonal projection: one in \(\mathbb{R}^3\), one in a polynomial space, and one in \(L^2[0,1]\).

Verify each example by checking the residual is orthogonal to the subspace.

37.3 Activity 3: connect Fourier series and coordinates

Ask:

Explain Fourier coefficients as coordinates in an infinite-dimensional Hilbert space.

Then write your own one-paragraph explanation.

37.4 Activity 4: kernel methods

Ask:

Explain how a positive semidefinite kernel acts like an inner product in a hidden feature space.

Then test the idea numerically by constructing a Gram matrix from a kernel and checking its eigenvalues.

38 Summary

Hilbert spaces are complete inner product spaces, possibly infinite-dimensional, and completeness is what makes limits, projections, and approximations work.

The main ideas are:

  • A Hilbert space is a complete inner product space.
  • Orthogonal projection generalizes least squares.
  • Orthonormal bases generalize coordinate systems.
  • Fourier series are Hilbert space coordinate expansions.
  • Riesz representation identifies continuous linear functionals with inner products.
  • \(L^2\) spaces are central examples in analysis, probability, PDEs, signal processing, and machine learning.