25) Interpolation#
Last time:#
Introduction to Fortran
Today:#
Interpolation using polynomials
Lagrange Interpolating Polynomials
Newton polynomials
Choice of points and choice of bases
using LinearAlgebra
using Plots
using LaTeXStrings
default(linewidth=4, legendfontsize=12)
function vander(x, k=nothing)
    # Vandermonde matrix with the monomial basis: V[i, j] = x[i]^(j-1)
    if isnothing(k)
        k = length(x)  # square by default
    end
    m = length(x)
    V = ones(m, k)
    for j in 2:k
        V[:, j] = V[:, j-1] .* x  # build each column from the previous one
    end
    V
end
function peanut()
theta = LinRange(0, 2*pi, 50)
r = 1 .+ .4*sin.(3*theta) + .6*sin.(2*theta)
r' .* [cos.(theta) sin.(theta)]'
end
function circle()
theta = LinRange(0, 2*pi, 50)
[cos.(theta) sin.(theta)]'
end
function Aplot(A)
"Plot a transformation from X to Y"
X = peanut()
Y = A * X
p = scatter(X[1,:], X[2,:], label="in")
scatter!(p, Y[1,:], Y[2,:], label="out")
X = circle()
Y = A * X
q = scatter(X[1,:], X[2,:], label="in")
scatter!(q, Y[1,:], Y[2,:], label="out")
plot(p, q, layout=2, aspect_ratio=:equal)
end
Aplot (generic function with 1 method)
1. Interpolation using polynomials#
What is interpolation?#
Given data \((x_i, y_i)\), find a (smooth?) function \(f(x)\) such that \(f(x_i) = y_i\).
Data in#
direct field observations/measurement of a physical or social system
numerically processed observations, perhaps by applying physical principles
output from an expensive “exact” numerical computation
output from an approximate numerical computation
Functions to use as interpolants#
Polynomials
Piecewise polynomials (includes nearest-neighbor)
Powers and exponentials
Trigonometric functions (sine and cosine)
Neural networks
Interpolation fits the data exactly!
Polynomial interpolation#
Recall from the Linear Algebra lecture that we can fit a polynomial \(p(x) = \sum_{j=0}^n p_j x^j\) using a Vandermonde matrix, with one column per basis function and one row per observation:

\[ \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^n \\ 1 & x_2 & x_2^2 & \cdots & x_2^n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^n \end{bmatrix} \begin{bmatrix} p_0 \\ p_1 \\ \vdots \\ p_n \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}, \]

which is the matrix form of the system of equations \(p(x_i) = y_i\) for \(i = 1, \dotsc, m\).
Poll 25.1: Given \(m\) data points with distinct \(x_i\), it’s possible to find a unique interpolating polynomial \(\mathbf p \in P_n\) when which of the following are true?#
\(m \le n+1\)
\(m = n+1\)
\(m \ge n+1\)
Polynomial interpolation with a Vandermonde matrix#
x = LinRange(-1.5, 2, 4)
y = sin.(x)
A = vander(x)
p = A \ y # p = A^{-1} y
scatter(x, y)
s = LinRange(-3, 3, 50)
plot!(s, [sin.(s) vander(s, length(p)) * p])
Vandermonde matrices can be ill-conditioned#
A = vander(LinRange(-1, 1, 20))
cond(A)
2.7224082312417406e8
Reflection:
Is it because of the points \(x\)?
Is it because of the basis functions \(\{ 1, x, x^2, x^3, \dotsc \}\)?
2. Lagrange Interpolating Polynomials#
Suppose we are given function values \(y_0, \dotsc, y_m\) at the distinct points \(x_0, \dotsc, x_m\), and we would like to build a polynomial of degree \(m\) that goes through all these points. This explicit construction is attributed to Lagrange (though it is said that he was not first):

\[ l_i(x) = \frac{\prod_{j \neq i} (x - x_j)}{\prod_{j \neq i} (x_i - x_j)} . \]

By construction, this is a degree-\(m\) polynomial with \(l_i(x_j) = \delta_{ij}\) (where \(\delta_{ij}\) denotes the Kronecker delta). This just says that \(l_i(x)\) has zeros at all the \(x_j\) except \(x_i\), where it is identically equal to \(1\). Hence the Lagrange basis for polynomials of degree \(\le m\) on these nodes is the set \(\{l_0(x), l_1(x), \ldots, l_m(x)\}\), whose members take values \(l_j(x_k) = 0\) if \(k \neq j\) and \(l_j(x_j) = 1\).
Lagrange’s interpolating formula is

\[ p(x) = \sum_{i=0}^m y_i l_i(x) . \]
What is the degree of this polynomial?
Why is \(p(x_i) = y_i\)?
How expensive (in terms of \(m\)) is it to evaluate \(p(x)\)?
How expensive (in terms of \(m\)) is it to convert to standard form \(p(x) = \sum_{i=0}^m a_i x^i\)?
Can we easily evaluate the derivative \(p'(x)\)?
What can go wrong? Is this formulation numerically stable?
Let’s look at some concrete examples:
Suppose we are given the three points \((x_0, y_0), (x_1, y_1), (x_2, y_2)\). Then the polynomial

\[ p_2(x) = y_0 \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + y_1 \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + y_2 \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} \]

is Lagrange’s interpolating polynomial for these points. First, notice that the polynomial exactly interpolates the data points: when \(x_0\) is substituted for \(x\), the terms evaluate to \(y_0 + 0 + 0 = y_0\), and similarly for \(x_1\) and \(x_2\). Second, notice that the polynomial has degree 2 in the variable \(x\).
Example:#
Find an interpolating polynomial for the data points \((0, 1)\), \((2, 2)\), and \((3, 4)\). Substituting into Lagrange’s formula yields:

\[ p_2(x) = 1 \cdot \frac{(x - 2)(x - 3)}{(0 - 2)(0 - 3)} + 2 \cdot \frac{(x - 0)(x - 3)}{(2 - 0)(2 - 3)} + 4 \cdot \frac{(x - 0)(x - 2)}{(3 - 0)(3 - 2)} = \frac{x^2 - x + 2}{2} . \]

Check that \(p_2(0) = 1\), \(p_2(2) = 2\), and \(p_2(3) = 4\).
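As a quick check in code (our own verification, with p2 taken from the simplified form above):
# Evaluate the simplified interpolant p2(x) = (x^2 - x + 2)/2 at the data points
p2(t) = (t^2 - t + 2) / 2
p2.([0, 2, 3])  # expect [1.0, 2.0, 4.0]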
x = [0,2,3]
y = [1,2,4]
scatter(x, y, label = L"(x_i, y_i)")
Lagrange interpolation in code#
function lagrange(x, y)
    # Return an evaluator p(t) for the Lagrange interpolant sum_i y_i l_i(t),
    # computed directly from the product formula for l_i
    function p(t)
        w = 0
        for (i, yi) in enumerate(y)
            # numerator prod_{j≠i}(t - x_j) over denominator prod_{j≠i}(x_i - x_j)
            w += yi * (prod(t .- x[1:i-1]) * prod(t .- x[i+1:end])
                / (prod(x[i] .- x[1:i-1]) * prod(x[i] .- x[i+1:end])))
        end
        w
    end
    return p
end
x = LinRange(-1.5, 2, 4)
y = sin.(x)
p = lagrange(x, y)
scatter(x, y, label = L"(x_i, y_i)")
s = LinRange(-3, 3, 50)
plot!(s, [sin.(s) p.(s)], label = ["sin(x)" "p(x)"])
# Notice how important this is: the empty product is 1, so the boundary
# terms (i = 1 and i = m) in lagrange above are handled correctly
prod(Float64[])
1.0
We don’t have cond(lagrange(x, y))#
Recall that the interpolant \(p\) is just a function, and we know the relative condition number of evaluating a function:

\[ \kappa(p, x) = \left\lvert \frac{p'(x) \, x}{p(x)} \right\rvert . \]

We also don’t have an easy way to evaluate the derivative \(p'(x)\) in this form.
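One rough workaround is to approximate \(p'\) with a centered difference; a minimal sketch (the helper name interp_cond and the step h are our own ad hoc choices):
# Estimate the relative condition number of evaluating the interpolant p at t,
# using a centered difference in place of the unavailable derivative p'(t)
function interp_cond(p, t; h=1e-6)  # h is an ad hoc step size
    dp = (p(t + h) - p(t - h)) / (2h)
    abs(dp * t / p(t))
end
interp_cond(p, 0.5)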
k = 5; xx = LinRange(-1, 1, k)
B = vander(s, k) / vander(xx) # 50×5 matrix; columns are the cardinal functions: 1 at one node, 0 at the others
plot(s, B) # plot the columns of B
scatter!(xx, [zero.(xx) one.(xx)], color=:black, legend=:none, ylims=(-2, 2)) # set xlims=(-1, 1) for a closer look
3. Newton polynomials#
Given the set of data points \((x_0, y_0), (x_1, y_1), \ldots, (x_m, y_m)\), the Newton interpolation polynomial is a linear combination of Newton basis polynomials,

\[ p(x) = \sum_{k=0}^m a_k n_k(x), \]

with the Newton basis polynomials defined as

\[ n_k(x) = \prod_{j=0}^{k-1} (x - x_j), \qquad n_0(x) \equiv 1, \]

where no two \(x_j\) are the same. The idea is to build up the polynomial inductively, adding one point at a time: the coefficient \(a_k\) depends only on the points up to \(x_k\), so the added term will not “spoil” the previous work.
This gives the Vandermonde interpolation problem as

\[ \begin{bmatrix} n_0(x_0) & n_1(x_0) & \cdots & n_m(x_0) \\ n_0(x_1) & n_1(x_1) & \cdots & n_m(x_1) \\ \vdots & \vdots & & \vdots \\ n_0(x_m) & n_1(x_m) & \cdots & n_m(x_m) \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_m \end{bmatrix} . \]
How does the Vandermonde procedure change if we replace the monomial \(x^k\) with \(n_k(x)\)?
Does the matrix have recognizable structure?
By construction, adding a new point to the interpolation is easy because the coefficients already computed do not change.
The Newton coefficients \(a_k\) are defined inductively via divided differences (commonly presented in tabulated form); a sketch follows.
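As a concrete sketch (newton_coeffs is our own helper, not part of the lecture code), the divided-difference table can be built in place in \(O(m^2)\) operations:
# Compute the Newton coefficients a_k by divided differences, updating in place
function newton_coeffs(x, y)
    a = float.(y)                # fresh Float64 copy of the data
    for k in 2:length(x)         # order of the divided difference
        for i in length(x):-1:k  # update from the bottom up
            a[i] = (a[i] - a[i-1]) / (x[i] - x[i-k+1])
        end
    end
    a
end
The result should agree (up to rounding) with solving the Newton Vandermonde system vander_newton(x, x) \ y below.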
function vander_newton(x, abscissa=nothing)
if isnothing(abscissa)
abscissa = x
end
n = length(abscissa)
A = ones(length(x), n)
for i in 2:n
A[:,i] = A[:,i-1] .* (x .- abscissa[i-1])
end
A
end
A = vander_newton(s, x)
plot(s, A, ylims=(-3, 3), label = ["A_1" "A_2" "A_3" "A_4"])
scatter!(x, [zero.(x)], color=:black, label=nothing)
p = vander_newton(x, x) \ y
scatter(x, y)
plot!(s, [sin.(s), vander_newton(s, x) * p], label = ["sin(x)" "vander_newton p(x)"])
Newton Vandermonde matrix structure#
How much does it cost to solve with a general \(n\times n\) dense matrix?
\(O(n \log n)\)
\(O(n^2)\)
\(O(n^3)\)
vander_newton(LinRange(-1, 1, 5))
5×5 Matrix{Float64}:
1.0 0.0 -0.0 0.0 -0.0
1.0 0.5 0.0 -0.0 0.0
1.0 1.0 0.5 0.0 -0.0
1.0 1.5 1.5 0.75 0.0
1.0 2.0 3.0 3.0 1.5
How much does it cost to solve with a Newton Vandermonde matrix?
The Newton Vandermonde matrix is lower triangular, so the coefficients can be computed by forward substitution in \(O(n^2)\). The resulting Newton form can then be evaluated with nested multiplications, \(p(x) = a_0 + (x - x_0)\bigl(a_1 + (x - x_1)(a_2 + \cdots)\bigr)\), in \(O(n)\) operations per point, as sketched below.
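A minimal sketch of both steps, reusing vander_newton and the nodes x and data y from above (horner_newton is our own name; LinearAlgebra is already loaded):
N = LowerTriangular(vander_newton(x, x))  # make the triangular structure explicit
a = N \ y                                 # forward substitution, O(n^2)
# Evaluate the Newton form by nested multiplication: O(n) per point
function horner_newton(a, xnodes, t)
    p = a[end]
    for k in length(a)-1:-1:1
        p = a[k] + (t - xnodes[k]) * p
    end
    p
end
horner_newton(a, x, 0.5) ≈ (vander_newton([0.5], x) * a)[1]  # same value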
How is the conditioning of these matrices?#
vcond(mat, points, nmax) = [cond(mat(points(-1, 1, n))) for n in 2:nmax]
plot([vcond(vander, LinRange, 20)], yscale=:log10, legend=:none)
A well-conditioned basis:#
Let’s construct the Vandermonde matrix, but instead of using the canonical basis for polynomials, we’ll use the Legendre polynomials.
function vander_legendre(x, k=nothing)
if isnothing(k)
k = length(x) # Square by default
end
m = length(x)
Q = ones(m, k)
Q[:, 2] = x
for n in 1:k-2
Q[:, n+2] = ((2*n + 1) * x .* Q[:, n+1] - n * Q[:, n]) / (n + 1) # Bonnet's three-term recurrence: (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}
end
Q
end
vander_legendre (generic function with 2 methods)
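A quick sanity check of the recurrence (our own addition), using the known value \(P_2(1/2) = (3/4 - 1)/2 = -1/8\):
vander_legendre([0.5], 3)  # expect [1.0  0.5  -0.125]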
plot([vcond(vander_legendre, LinRange, 20)], yscale=:log10)
4. Choice of points and choice of bases#
Let’s look at different sets of points and bases, starting with the Chebyshev nodes:
# The following are called Chebyshev nodes
CosRange(a, b, n) = (a + b)/2 .+ (b - a)/2 * cos.(LinRange(-pi, 0, n))
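In formulas, CosRange produces the Chebyshev extreme points mapped to \([a, b]\),

\[ x_k = \frac{a+b}{2} - \frac{b-a}{2} \cos\left( \frac{k\pi}{n-1} \right), \qquad k = 0, 1, \dotsc, n-1, \]

which cluster near the endpoints of the interval.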
plot([vcond(vander, LinRange, 20)], yscale=:log10, legend=:topleft, label= "vander LinRange")
plot!([vcond(vander, CosRange, 20)], yscale=:log10, label = "vander CosRange")
plot!([vcond(vander_legendre, LinRange, 20)], yscale=:log10, label = "vander_legendre LinRange")
plot!([vcond(vander_legendre, CosRange, 20)], yscale=:log10, label = "vander_legendre CosRange")
What’s wrong with ill-conditioning?#
runge1(x) = 1 / (1 + 10*x^2)
x_lin = LinRange(-1, 1, 20)
x_cos = CosRange(-1, 1, 20)
y_lin = runge1.(x_lin)
y_cos = runge1.(x_cos)
plot(runge1, xlims=(-1, 1))
scatter!(x_lin, y_lin)
plot(runge1, xlims=(-1, 1))
scatter!(x_cos, y_cos)
ourvander = vander_legendre
p = ourvander(x_lin) \ y_lin
scatter(x_lin, y_lin, label = L"(x_i,y_i)")
s = LinRange(-1, 1, 100)
plot!(s, runge1.(s), label = "runge")
plot!(s, ourvander(s, length(x_lin)) * p, label = "vander_legendre", title = "equi-spaced points")
# let's try again with the Chebyshev nodes
ourvander = vander_legendre
p = ourvander(x_cos) \ y_cos
scatter(x_cos, y_cos, label = L"(x_i,y_i)")
s = LinRange(-1, 1, 100)
plot!(s, runge1.(s), label = "runge")
plot!(s, ourvander(s, length(x_cos)) * p, label = "vander_legendre", title = "Chebyshev nodes")
# let's look at what happened with the equi-spaced points
ourvander(s, length(x_lin)) # 100×20 matrix
ourvander(x_lin) # 20x20 matrix
ourvander(s, length(x_lin)) / ourvander(x_lin) # 100×20 matrix
cond(ourvander(s, length(x_lin)) / ourvander(x_lin))
4115.285306798416
cond(ourvander(x_lin))
7274.598185488539
Important
The take-home message here is that equally spaced interpolants of high degree can be disastrous (and should be avoided if possible). However, with the right choice of points, high degree interpolants can work.
The worst vector#
A = ourvander(s, length(x_lin)) / ourvander(x_lin)
U, S, V = svd(A) # Vt factor is 20×20
V[:, 1:1] # extract first column
plot(x_lin, V[:, 1:1])
plot(s, U[:,1])
@show S[1]
S[1] = 4141.821782071815
4141.821782071815
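As a check on this interpretation (our own addition): the largest singular value is exactly the matrix 2-norm, i.e., the factor by which the worst input vector plotted above is amplified.
opnorm(A) ≈ S[1]  # σ₁ equals the 2-norm of A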