19) Matrix condition number and SVD#
Last time#
- Solving Systems
- Review
Today#
- Condition Number of a Matrix
- Least squares and normal equations
- Intro to SVD
- Geometry of the Singular Value Decomposition
using LinearAlgebra
using Plots
default(linewidth=4, legendfontsize=12)
1. Condition Number of a Matrix#
We may have informally referred to a matrix as “ill-conditioned” when the columns are nearly linearly dependent, but let’s make this concept more precise. Recall the definition of (relative) condition number:

$$ \kappa = \max_{\delta x} \frac{|\delta f| / |f|}{|\delta x| / |x|} . $$

We understood this definition for scalar problems, but it also makes sense when the inputs and/or outputs are vectors (or matrices, etc.) and absolute value is replaced by vector (or matrix) norms. Consider matrix-vector multiplication, for which $f(x) = A x$:

$$ \kappa(A) = \max_{\delta x} \frac{\lVert A(x + \delta x) - A x \rVert / \lVert A x \rVert}{\lVert \delta x \rVert / \lVert x \rVert} = \max_{\delta x} \frac{\lVert A \delta x \rVert}{\lVert \delta x \rVert} \, \frac{\lVert x \rVert}{\lVert A x \rVert} = \lVert A \rVert \frac{\lVert x \rVert}{\lVert A x \rVert} . $$
There are two problems here:

- I wrote $\kappa(A)$, but my formula depends on $x$.
- What is that $\lVert A \rVert$ beastie?
Matrix norms induced by vector norms#
Vector norms are built into the linear space (and defined in terms of the inner product). Matrix norms are *induced* by vector norms, according to

$$ \lVert A \rVert = \max_{x \ne 0} \frac{\lVert A x \rVert}{\lVert x \rVert} . $$
This equation makes sense for non-square matrices – the vector norms of the input and output spaces may differ.
Due to linearity, all that matters is the direction of $x$, so it could equivalently be written as

$$ \lVert A \rVert = \max_{\lVert x \rVert = 1} \lVert A x \rVert . $$
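In Julia, `opnorm` computes this induced norm (plain `norm` applied to a matrix gives the Frobenius norm instead). As a quick sanity check with an illustrative random matrix, a crude maximization over random unit vectors approaches `opnorm` from below:

B = randn(3, 2)
@show opnorm(B)   # induced 2-norm: the max over unit vectors x of norm(B*x)
@show norm(B)     # Frobenius norm; not induced by the vector 2-norm
@show maximum(norm(B * normalize(randn(2))) for _ in 1:10_000)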
The formula makes sense, but still depends on $x$.#
Consider the matrix

$$ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} . $$

- What is the norm of this matrix?
- What is the condition number when $x = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$?
- What is the condition number when $x = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$?
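Here is a sketch of the computation for this example; `kappa` is a helper defined here just for illustration:

A = [1 1; 0 1]
kappa(A, x) = opnorm(A) * norm(x) / norm(A * x)   # condition number of A*x at this particular x
@show opnorm(A)
@show kappa(A, [1, 0])
@show kappa(A, [0, 1])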
Condition number of the matrix#
The condition number of matrix-vector multiplication depends on the vector. The condition number of the matrix is the worst case (maximum) of the condition number for any vector, i.e.,

$$ \kappa(A) = \max_{x \ne 0} \lVert A \rVert \frac{\lVert x \rVert}{\lVert A x \rVert} . $$

If $A$ is invertible, we can substitute $y = A x$ and rephrase as

$$ \kappa(A) = \max_{y \ne 0} \lVert A \rVert \frac{\lVert A^{-1} y \rVert}{\lVert y \rVert} = \lVert A \rVert \, \lVert A^{-1} \rVert . $$
Evidently multiplying by a matrix is just as ill-conditioned an operation as solving a linear system using that matrix.
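Julia's `cond` computes exactly this quantity; a quick check against the definition, using the example matrix above:

@show cond(A)
@show opnorm(A) * opnorm(inv(A))   # kappa(A) = norm(A) * norm(inv(A))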
2. Least squares and the normal equations#
A least squares problem takes the form: given an $m \times n$ matrix $A$ (with $m \ge n$) and a vector $b$, find $x$ minimizing

$$ \lVert A x - b \rVert . $$

At the minimizer, the residual $A x - b$ must be orthogonal to the range of $A$. So if $b$ is in the range of $A$, we can solve $A x = b$ exactly; if not, the best we can do is match the part of $b$ that lies in the range of $A$ (the orthogonal projection of $b$ onto that range).
Solution by QR (Householder triangulation)#
Factor $A = Q R$, then solve the triangular system

$$ R x = Q^T b . $$

This method is stable and accurate.
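A minimal sketch, with a random $A$ and $b$ standing in for a real data-fitting problem (Julia's backslash `A \ b` solves rectangular systems in this least squares sense via QR):

m, n = 20, 4
A = randn(m, n)
b = randn(m)
F = qr(A)                        # Householder QR
x_qr = F.R \ (F.Q' * b)[1:n]     # solve R x = Q' b
@show norm(A' * (A * x_qr - b))  # near zero: residual is orthogonal to range(A)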
Solution by Cholesky#
The mathematically equivalent form

$$ (A^T A) x = A^T b, $$

known as the normal equations, can be solved by Cholesky factorization of the symmetric positive definite $n \times n$ matrix $A^T A$.

The product $A^T A$ squares the condition number, $\kappa(A^T A) = \kappa(A)^2$, so this method can lose accuracy relative to QR when $A$ is ill-conditioned.
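A sketch of this route, reusing the illustrative $A$ and $b$ from the QR example above:

x_chol = cholesky(Symmetric(A' * A)) \ (A' * b)
@show norm(x_chol - x_qr)        # agrees with QR here because A is well-conditioned
@show cond(A' * A), cond(A)^2    # the conditioning squares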
Solution by Singular Value Decomposition#
Next, we will discuss a factorization

$$ A = U \Sigma V^T $$

in which $U$ and $V$ have orthonormal columns and $\Sigma$ is diagonal with nonnegative entries, the singular values of the linear operator (matrix) $A$. In terms of this factorization, the least squares solution is $x = V \Sigma^{-1} U^T b$.

3. Intro to SVD#

It may remind you of an eigenvalue decomposition $A = X \Lambda X^{-1}$, but

- the SVD exists for all matrices (including non-square and deficient matrices)
- $U$ and $V$ have orthogonal columns (while $X$ can be arbitrarily ill-conditioned).

Indeed, if a matrix is symmetric and positive definite (all positive eigenvalues), then $U = V = X$ and $\Sigma = \Lambda$. Computing an SVD requires a somewhat complicated iterative algorithm, but a crude estimate of the cost is $2 m n^2 + 11 n^3$. Note that this is similar to the cost of $QR$ when $m \gg n$, but much more expensive for square matrices.
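A sketch of the SVD route on the same illustrative problem as above:

U, S, V = svd(A)
x_svd = V * ((U' * b) ./ S)   # x = V * inv(Sigma) * U' * b
@show norm(x_svd - x_qr)      # matches the QR solution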
4. Geometry of the Singular Value Decomposition#
default(aspect_ratio=:equal)
function peanut()
    theta = LinRange(0, 2*pi, 50)
    r = 1 .+ .4*sin.(3*theta) + .6*sin.(2*theta)
    r' .* [cos.(theta) sin.(theta)]'
end

function circle()
    theta = LinRange(0, 2*pi, 50)
    [cos.(theta) sin.(theta)]'
end

function Aplot(A)
    "Plot a transformation from X to Y, first applied to a peanut shape and then to a circle"
    X = peanut()
    Y = A * X
    p = scatter(X[1,:], X[2,:], label="in")
    scatter!(p, Y[1,:], Y[2,:], label="out")
    X = circle()
    Y = A * X
    q = scatter(X[1,:], X[2,:], label="in")
    scatter!(q, Y[1,:], Y[2,:], label="out")
    plot(p, q, layout=2)
end
Aplot (generic function with 1 method)
Multiplication by a scalar: $\alpha I$#
Perhaps the simplest transformation is a scalar multiple of the identity.
Aplot(1.1*I)
Diagonal matrices#
It is the same thing if we apply any diagonal matrix with equal entries, $D = \alpha I$.
Aplot([2 0; 0 2]) # In this case D = 2 I
The diagonal entries can be different sizes. Example:
Aplot([2 0; 0 .5])
In this case, the circle becomes an ellipse that is aligned with the coordinate axes.
Givens Rotation (as an example of an orthogonal matrix)#
We can rotate the input using a $2 \times 2$ rotation matrix, parametrized by the angle $\theta$:

$$ G(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} $$
function givens(theta)
    s = sin(theta)
    c = cos(theta)
    [c -s; s c]
end
G = givens(0.3)
Aplot(G)
Let’s look at the effect of its transpose; since $G$ is orthogonal, $G^T = G^{-1}$ rotates by $-\theta$:
Aplot(G')
Reflections#
We’ve previously seen that reflectors have the form $F = I - 2 v v^T$, where $v$ is a unit vector.
function reflect(theta)
    v = [cos(theta), sin(theta)]   # unit vector at angle theta
    I - 2 * v * v'
end
Aplot(reflect(0.3))
Singular Value Decomposition#
The SVD is

$$ A = U \Sigma V^T $$

where $U$ and $V$ are orthogonal and $\Sigma$ is diagonal with nonnegative entries.
Let’s try a random symmetric matrix.
A = randn(2, 2)
A += A' # make symmetric
@show det(A) # Positive means orientation is preserved
Aplot(A)
det(A) = -7.387133825214749
U, S, V = svd(A) # using the Julia built-in
@show norm(U * diagm(S) * V' - A) # Should be zero
Aplot(V') # Rotate/reflect in preparation for scaling
norm(U * diagm(S) * V' - A) = 7.691850745534255e-16
- What visual features indicate that this is a symmetric matrix?
- Is the orthogonal matrix a reflection or a rotation?
- Does this change when the determinant is positive versus negative? (Rerun the cell above as needed.)
Let’s look at the parts of the SVD#
Aplot(diagm(S)) # scale along axes
Aplot(U) # rotate/reflect back
Putting it together#
Aplot(U * diagm(S) * V') # SVD
Observations#
- The circle always maps to an ellipse.
- The $U$ and $V$ factors may reflect even when $\det A > 0$.
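A quick numerical check of that last point, assuming `U`, `S`, `V` from the `svd(A)` cell above are still in scope (a determinant of $-1$ indicates a reflection, $+1$ a rotation):

@show det(U), det(V)
@show det(U) * det(V) * prod(S) ≈ det(A)   # determinants multiply through the SVD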