14) Linear Algebra#
Today#
Matrices as linear transformations
Polynomial evaluation and fitting
Orthogonality
using Plots
default(linewidth=4, legendfontsize=12)
1. Matrices as linear transformations#
Tip
Resource: An excellent resource for linear algebra is the MIT OpenCourseWare course by Prof. Gilbert Strang (now retired), in particular the Fall 2011 undergraduate course edition (18.06). Video collections of his other lectures are also available.
Linear algebra is the study of linear transformations on vectors, which represent points in a finite dimensional space. The matrix-vector product \(y = A x\) is a linear combination of the columns of \(A\). The familiar definition,
\[ y_i = \sum_{j} A_{i,j} x_j, \]
can also be viewed as
\[ y = A_{:,1} x_1 + A_{:,2} x_2 + \dotsb + A_{:,n} x_n, \]
that is, a linear combination of the columns of \(A\).
Math and Julia Notation#
The notation \(A_{i,j}\) corresponds to the Julia syntax A[i,j], and the colon : means the entire range (row or column). So \(A_{:,j}\) is the \(j\)th column and \(A_{i,:}\) is the \(i\)th row. The corresponding Julia syntax is A[:,j] and A[i,:].
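For example, here is a small illustration of this indexing (the matrix is ours, not from the notes):
A = [1 2 3; 4 5 6]
@show A[1, 2] # the entry in row 1, column 2
@show A[:, 2] # the 2nd column, a 2-element Vector
@show A[1, :]; # the 1st row; note that rows are also extracted as (column) Vectors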
Julia has syntax for row vectors, column vectors, and arrays.
[1 2 3] # a row vector
1×3 Matrix{Int64}:
1 2 3
[1; 2; 3] # a column vector
3-element Vector{Int64}:
1
2
3
[1. 2 3; 4 5 6] # a 2x3 real matrix
# compare with Python's syntax: np.array([[1, 2, 3], [4, 5, 6]])
2×3 Matrix{Float64}:
1.0 2.0 3.0
4.0 5.0 6.0
[1 2; 4 3]
2×2 Matrix{Int64}:
1 2
4 3
[1 0; 0 2; 10 3]
3×2 Matrix{Int64}:
1 0
0 2
10 3
[1; 2 + 1im; 3]' # ' is transpose, and for complex-valued matrices is the conjugate transpose ("adjoint")
1×3 adjoint(::Vector{Complex{Int64}}) with eltype Complex{Int64}:
1+0im 2-1im 3+0im
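For comparison (our aside), transpose performs a plain transpose without complex conjugation:
transpose([1; 2 + 1im; 3]) # compare with ' above: 2 + 1im stays 2 + 1im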
Implementing multiplication by row#
function matmult1(A, x)
m, n = size(A)
y = zeros(m)
for i in 1:m # row index first, i.e., for each row
for j in 1:n # we iterate over the columns
y[i] += A[i,j] * x[j] # we apply the familiar definition
end
end
y
end
matmult1 (generic function with 1 method)
A = reshape(1.:12, 3, 4) # a 3x4 matrix with the numbers 1:12
3×4 reshape(::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, 3, 4) with eltype Float64:
1.0 4.0 7.0 10.0
2.0 5.0 8.0 11.0
3.0 6.0 9.0 12.0
x = [10., 0, 0, 0]
4-element Vector{Float64}:
10.0
0.0
0.0
0.0
matmult1(A, x)
3-element Vector{Float64}:
10.0
20.0
30.0
# Dot product
A[2, :]' * x
20.0
function matmult2(A, x)
m, n = size(A)
y = zeros(m)
for i in 1:m # iterate over rows of A
y[i] = A[i,:]' * x # this way we use the dot product between the transposed row of A and the whole x
end
y
end
matmult2(A, x)
3-element Vector{Float64}:
10.0
20.0
30.0
Curiosity: which one is faster? Which one takes more allocations?
@time matmult1(A,x)
@time matmult2(A,x)
0.000006 seconds (1 allocation: 80 bytes)
0.000007 seconds (4 allocations: 368 bytes)
3-element Vector{Float64}:
10.0
20.0
30.0
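@time measures a single run and is sensitive to noise (and, on a first call, compilation). For more reliable numbers one could use the BenchmarkTools package; a sketch, assuming it is installed:
using BenchmarkTools
@btime matmult1($A, $x) # $ interpolates the arguments so that setup is not timed
@btime matmult2($A, $x)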
Implementing multiplication by column#
function matmult3(A, x)
m, n = size(A)
y = zeros(m)
for j in 1:n # iterate over columns of A
y += A[:, j] * x[j]
end
y
end
matmult3(A, x)
3-element Vector{Float64}:
10.0
20.0
30.0
@time matmult3(A, x)
0.000007 seconds (13 allocations: 1.016 KiB)
3-element Vector{Float64}:
10.0
20.0
30.0
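The extra allocations come from the slice A[:, j], which copies the column, and from y += ..., which builds a new vector on every iteration. A sketch of a version avoiding both (the name matmult3_views is ours):
function matmult3_views(A, x)
    m, n = size(A)
    y = zeros(m)
    for j in 1:n
        @views y .+= A[:, j] .* x[j] # @views indexes without copying; .+= updates y in place
    end
    y
end
@time matmult3_views(A, x)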
A * x # built-in matrix-vector multiply
3-element Vector{Float64}:
10.0
20.0
30.0
@time A * x # We'll use this version
0.000005 seconds (1 allocation: 80 bytes)
3-element Vector{Float64}:
10.0
20.0
30.0
Check the standard operations documentation page.
2. Polynomial evaluation#
Polynomial evaluation is (continuous) linear algebra#
Mapping a coefficient vector \(c\) to the polynomial \(p(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3\) is a linear transformation: the polynomial is a linear combination of the monomial basis functions \(1, x, x^2, x^3\). For example,
using Pkg
Pkg.add("Polynomials")
using Polynomials
P(c) = Polynomial(c) # build a polynomial from its coefficient vector
p = [0, -3, 0, 5] # coefficients of -3x + 5x^3 in the canonical (monomial) basis
q = [1, 2, 3, 4]
@show f = P(p)
@show g = P(q)
h = f + g
@show h
@show P(p+q) # equals f + g: mapping coefficients to polynomials is a linear transformation!
x = [0., 1, 2]
h.(x) # can be applied to each element of a vector with the dot operator
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
f = P(p) = Polynomial(-3*x + 5*x^3)
g = P(q) = Polynomial(1 + 2*x + 3*x^2 + 4*x^3)
h = Polynomial(1 - x + 3*x^2 + 9*x^3)
P(p + q) = Polynomial(1 - x + 3*x^2 + 9*x^3)
3-element Vector{Float64}:
1.0
12.0
83.0
plot(h, legend=:bottomright, xlim=(-2, 2))
Polynomial evaluation is (discrete) linear algebra#
V = [one.(x) x x.^2 x.^3] # Vandermonde matrix
3×4 Matrix{Float64}:
1.0 0.0 0.0 0.0
1.0 1.0 1.0 1.0
1.0 2.0 4.0 8.0
V * p + V * q # same as h.(x)
3-element Vector{Float64}:
1.0
12.0
83.0
V * (p + q) # again, matrix multiplication is a linear transformation
3-element Vector{Float64}:
1.0
12.0
83.0
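As a cross-check (our addition), Julia's built-in evalpoly evaluates a polynomial from its ascending-order coefficients and should reproduce the same three values:
evalpoly.(x, Ref(p + q)) # Ref keeps the coefficient vector from being broadcast over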
Vandermonde matrices#
A Vandermonde matrix is one whose columns are the monomials \(1, x, x^2, \dots\) evaluated at a set of discrete points.
function vander(x, k=nothing)
if isnothing(k)
k = length(x)
end
m = length(x)
V = ones(m, k)
for j in 2:k
V[:, j] = V[:, j-1] .* x
end
V
end
vander (generic function with 2 methods)
@show x = LinRange(-1, 1, 50)
V = vander(x, 4)
scatter(x, V, legend=:bottomright, label = ["V_1" "V_2" "V_3" "V_4"])
x = LinRange(-1, 1, 50) = LinRange{Float64}(-1.0, 1.0, 50)
Fitting (polynomial interpolation) is linear algebra#
x1 = [-.9, 0.1, .5, .8]
y1 = [1, 2.4, -.2, 1.3]
scatter(x1, y1, markersize=8)
V = vander(x1)
@show size(V)
p = V \ y1 # write y1 in the polynomial basis; left-division, read V^{-1} * y1 (i.e., solve V p = y1 for the coefficients p)
scatter(x1, y1, markersize=8, xlims=(-1, 1))
# plot!(P(p), label="P(p)")
plot!(x, vander(x, 4) * p, label="\$ V(x) p\$", linestyle=:dash)
size(V) = (4, 4)
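Since V is square and nonsingular, the fit interpolates the data exactly; a quick check (our addition):
vander(x1) * p ≈ y1 # true: the fitted polynomial reproduces the data points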
Some common terminology#
The range of \(A\) is the space spanned by its columns. This definition coincides with the range of a function \(f(x)\) when \(f(x) = A x\).
The (right) nullspace of \(A\) is the space of vectors \(x\) such that \(A x = 0\).
The rank of \(A\) is the dimension of its range.
A matrix has full rank if the nullspace of either \(A\) or \(A^T\) is trivial (contains only the 0 vector). Equivalently, all the columns of \(A\) (or \(A^T\)) are linearly independent; see the quick check after this list.
A nonsingular (or invertible) matrix is a square matrix of full rank. We call the inverse \(A^{-1}\) and it satisfies \(A^{-1} A = A A^{-1} = I\).
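LinearAlgebra provides rank and nullspace for checking these definitions numerically; a small sketch (the matrix C is ours):
using LinearAlgebra
C = [1 2; 2 4] # rank deficient: the second column is twice the first
@show rank(C) # 1, not full rank
@show nullspace(C); # an orthonormal basis for the nullspace, proportional to [-2, 1]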
\(\DeclareMathOperator{\rank}{rank} \DeclareMathOperator{\null}{null} \)
Poll 15.1: If \(A \in \mathbb{R}^{m\times m}\), which of these doesn’t belong?#
\(A\) has an inverse, \(A^{-1}\)
\(\rank (A) = m\)
\(\null(A) = \{0\}\)
\(A A^T = A^T A\)
\(\det(A) \ne 0\)
\(A x = 0\) implies that \(x = 0\)
Pkg.add("LinearAlgebra")
using LinearAlgebra
A = rand(4, 4)
B = A' * A - A * A' # answer: no. 4 doesn't belong! A A^T = A^T A holds only for normal matrices (e.g., symmetric or orthogonal ones - more later), not for every invertible matrix
@show B
det(A)
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
B = [0.4604803467837072 -0.39735175587097926 0.4235515952872667 0.4777312134489571; -0.39735175587097926 -1.2686917015919867 -0.6586784735378189 -0.459195328124272; 0.4235515952872667 -0.6586784735378189 -0.48731906585032037 0.04901441435627363; 0.4777312134489571 -0.459195328124272 0.04901441435627363 1.2955304206586]
-0.1193511533879711
What is an inverse?#
When we write \(x = A^{-1} y\), we mean that \(x\) is the unique vector such that \(A x = y\). (It is rare that we explicitly compute a matrix \(A^{-1}\), though it’s not as “bad” as people may have told you.) A vector \(y\) is equivalent to \(\sum_i e_i y_i\) where \(e_i\) are columns of the identity. Meanwhile, \(x = A^{-1} y\) means that we are expressing that same vector \(y\) in the basis of the columns of \(A\), i.e., \(\sum_i A_{:,i} x_i\).
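A small numerical illustration of this reading of the inverse (the names A0, y0, x0 are ours):
using LinearAlgebra
A0 = rand(4, 4); y0 = rand(4)
x0 = A0 \ y0 # the coefficients of y0 in the basis of the columns of A0
@show norm(A0 * x0 - y0); # ~1e-16: the columns of A0, weighted by x0, reconstruct y0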
using LinearAlgebra
A = rand(4, 4)
4×4 Matrix{Float64}:
0.326576 0.635067 0.37926 0.328146
0.506579 0.334969 0.982688 0.731816
0.203791 0.0209698 0.123718 0.0392216
0.679442 0.759598 0.191247 0.063243
A \ A # left-division, read A^{-1} * A; the result is the identity: 1's on the diagonal and numerically-zero entries elsewhere
4×4 Matrix{Float64}:
1.0 0.0 0.0 -1.89805e-14
-1.97811e-16 1.0 0.0 9.72807e-15
-1.03241e-15 0.0 1.0 4.64995e-14
1.36209e-15 0.0 0.0 1.0
inv(A) * A
4×4 Matrix{Float64}:
1.0 5.27e-15 1.37572e-14 2.84297e-14
1.50684e-15 1.0 4.53645e-15 -2.77006e-15
-2.14199e-14 -1.40743e-13 1.0 -6.19817e-14
4.90319e-14 8.017e-14 -8.61476e-15 1.0
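In practice we solve systems with left-division rather than by forming inv(A): both give the same answer here, but \ factorizes A, which is generally cheaper and more accurate. A quick comparison (our addition):
y0 = rand(4)
norm(A \ y0 - inv(A) * y0) # tiny: the two approaches agree to rounding error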
3. Orthogonality#
The inner product of two vectors,
\[ x^T y = \sum_i x_i y_i, \]
measures how strongly they point in the same direction: the angle \(\theta\) between \(x\) and \(y\) satisfies \(\cos \theta = \frac{x^T y}{\lVert x \rVert \, \lVert y \rVert}\), and the vectors are orthogonal when \(x^T y = 0\).
Examples with inner products#
x = [0, 1]
y = [1, 1]
@show x' * y
@show y' * x;
x' * y = 1
y' * x = 1
ϕ = pi/6
y = [cos(ϕ), sin(ϕ)] # a unit vector at angle ϕ from the x-axis
cos_θ = x'*y / (norm(x) * norm(y))
@show cos_θ
@show cos(ϕ-pi/2); # x = [0, 1] points along the y-axis, so the angle between x and y is θ = π/2 - ϕ
cos_θ = 0.49999999999999994
cos(ϕ - pi / 2) = 0.4999999999999999
Polynomials can be orthogonal too!#
x = LinRange(-1, 1, 50)
A = vander(x, 4)
M = A * [.5 0 0 0; # 0.5
0 1 0 0; # x
0 0 1 0]' # x^2
# that is, M = [0.5 | x | x^2]
scatter(x, M, label = ["M_1" "M_2" "M_3"])
plot!(x, 0*x, label=:none, color=:black)
Which inner products will be zero?
Which of these functions are even and which are odd? (Over a symmetric interval, the inner product of an even function with an odd one vanishes.)
Polynomial inner products#
M[:,1]' * M[:,2]
-2.220446049250313e-16
M[:,1]' * M[:,3]
8.673469387755102
M[:,2]' * M[:,3]
-4.440892098500626e-16
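The three pairwise products above are off-diagonal entries of the Gram matrix \(M^T M\) (our addition): the even-odd pairs (1,2) and (2,3) vanish to rounding error, while the even-even pair (1,3) does not.
M' * M # 3×3 Gram matrix of all pairwise inner products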