24) Interpolation#

Last time:#

  • Introduction to Fortran

Today:#

  1. Interpolation using polynomials

  2. Lagrange Interpolating Polynomials

  3. Newton polynomials

  4. Choice of points and choice of bases

using LinearAlgebra
using Plots
using LaTeXStrings
default(linewidth=4, legendfontsize=12)

function vander(x, k=nothing)
    # Vandermonde matrix: one row per point in x, columns are 1, x, x^2, ..., x^(k-1)
    if isnothing(k)
        k = length(x)
    end
    m = length(x)
    V = ones(m, k)
    for j in 2:k
        V[:, j] = V[:, j-1] .* x
    end
    V
end

function peanut()
    theta = LinRange(0, 2*pi, 50)
    r = 1 .+ .4*sin.(3*theta) + .6*sin.(2*theta)
    r' .* [cos.(theta) sin.(theta)]'
end

function circle()
    theta = LinRange(0, 2*pi, 50)
    [cos.(theta) sin.(theta)]'
end

function Aplot(A)
    "Plot a transformation from X to Y"
    X = peanut()
    Y = A * X
    p = scatter(X[1,:], X[2,:], label="in")
    scatter!(p, Y[1,:], Y[2,:], label="out")
    X = circle()
    Y = A * X
    q = scatter(X[1,:], X[2,:], label="in")
    scatter!(q, Y[1,:], Y[2,:], label="out")
    plot(p, q, layout=2, aspect_ratio=:equal)
end

1. Interpolation using polynomials#

What is interpolation?#

Given data $(x_i, y_i)$, find a (smooth?) function $f(x)$ such that $f(x_i) = y_i$.

Data in#

  • direct field observations/measurement of a physical or social system

  • numerically processed observations, perhaps by applying physical principles

  • output from an expensive “exact” numerical computation

  • output from an approximate numerical computation

Functions to use as interpolants#

  • Polynomials

  • Piecewise polynomials (includes nearest-neighbor)

  • Powers and exponentials

  • Trigonometric functions (sine and cosine)

  • Neural networks

Interpolation fits the data exactly!

Polynomial interpolation#

Recall from the Linear Algebra lecture that we can fit a polynomial using a Vandermonde matrix, with one column per basis function and one row per observation:

$$
\underbrace{\begin{bmatrix} 1 & x_0 & x_0^2 & x_0^3 & \cdots \\ 1 & x_1 & x_1^2 & x_1^3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \\ 1 & x_n & x_n^2 & x_n^3 & \cdots \end{bmatrix}}_{A \,\in\, \mathbb{R}^{(n+1)\times m}}
\begin{bmatrix} p_0 \\ p_1 \\ \vdots \\ p_{m-1} \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}
$$

which is the matrix form of the system of equations:

$$
\begin{aligned}
p(x_0) &= a_0 + a_1 x_0 + a_2 x_0^2 + \cdots + a_{m-1} x_0^{m-1} = y_0 \\
p(x_1) &= a_0 + a_1 x_1 + a_2 x_1^2 + \cdots + a_{m-1} x_1^{m-1} = y_1 \\
p(x_2) &= a_0 + a_1 x_2 + a_2 x_2^2 + \cdots + a_{m-1} x_2^{m-1} = y_2 \\
&\;\;\vdots \\
p(x_n) &= a_0 + a_1 x_n + a_2 x_n^2 + \cdots + a_{m-1} x_n^{m-1} = y_n
\end{aligned}
\tag{9}
$$

Poll 25.1: It's possible to find a unique interpolating polynomial $p \in \mathcal{P}_n$ when which of the following are true?#

  1. $m \le n+1$

  2. $m = n+1$

  3. $m \ge n+1$

Hint: Fundamental Theorem of Algebra.

Polynomial interpolation with a Vandermonde matrix#

x = LinRange(-1.5, 2, 4)
y = sin.(x)
A = vander(x)
p = A \ y # p = A^{-1} y

scatter(x, y)
s = LinRange(-3, 3, 50)
plot!(s, [sin.(s) vander(s, length(p)) * p])
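
As a quick sanity check (a minimal sketch reusing A, p, and y from above), the interpolant reproduces the data exactly at the nodes, up to rounding:

@show A * p ≈ y        # the interpolation conditions hold
@show norm(A * p - y)  # residual is at roundoff level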

Vandermonde matrices can be ill-conditioned#

A = vander(LinRange(-1, 1, 20))
cond(A)

Reflection:

  1. Is it because of the points $x$?

  2. Is it because of the basis functions $\{1, x, x^2, x^3, \dots\}$?

2. Lagrange Interpolating Polynomials#

Suppose we are given function values $y_0, \ldots, y_m$ at the distinct points $x_0, \ldots, x_m$, and we would like to build a polynomial of degree $m$ that goes through all these points. This explicit construction is attributed to Lagrange (though it is said that he was not the first):

$$
l_i(x) = \frac{(x-x_0)(x-x_1)\cdots(x-x_{i-1})(x-x_{i+1})\cdots(x-x_m)}{(x_i-x_0)(x_i-x_1)\cdots(x_i-x_{i-1})(x_i-x_{i+1})\cdots(x_i-x_m)}, \qquad i = 0, 1, \ldots, m .
$$

By construction, this is a degree $m$ polynomial with $l_i(x_j) = \delta_{ij}$ (where $\delta_{ij}$ denotes the Kronecker delta). This just says that $l_i(x)$ has zeros at all the $x_j$'s except $x_i$, where it is identically equal to 1. Hence, the Lagrange basis for polynomials of degree $m$ on these nodes is the set of polynomials $\{l_0(x), l_1(x), \ldots, l_m(x)\}$, which take values $l_j(x_k) = 0$ if $k \neq j$ and $l_j(x_j) = 1$.

Lagrange’s interpolating formula

$$
p(x) = \sum_{i=0}^{m} y_i\, l_i(x) = \sum_{i=0}^{m} y_i \prod_{\substack{j=0 \\ j \neq i}}^{m} \frac{x - x_j}{x_i - x_j}
$$

  • What is the degree of this polynomial?

  • Why is $p(x_i) = y_i$?

  • How expensive (in terms of $m$) is it to evaluate $p(x)$?

  • How expensive (in terms of $m$) is it to convert to standard form $p(x) = \sum_{i=0}^m a_i x^i$?

  • Can we easily evaluate the derivative $p'(x)$?

  • What can go wrong? Is this formulation numerically stable?

Let’s look at some concrete examples:

  • Suppose we are given the three points $(x_0, y_0), (x_1, y_1), (x_2, y_2)$. Then the polynomial

$$
p_2(x) = y_0 \frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)} + y_1 \frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)} + y_2 \frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)}
$$

is the Lagrange interpolating polynomial for these points. First, notice that the polynomial interpolates the data points exactly: when $x_0$ is substituted for $x$, the terms evaluate to $y_0 + 0 + 0 = y_0$, and similarly for $x_1$ and $x_2$. Second, notice that the polynomial is of degree 2 in the variable $x$.

Example:#

Find an interpolating polynomial for the data points (0,1), (2,2), and (3,4). Substituting into Lagrange’s formula yields:

$$
\begin{aligned}
p_2(x) &= 1\cdot\frac{(x-2)(x-3)}{(0-2)(0-3)} + 2\cdot\frac{(x-0)(x-3)}{(2-0)(2-3)} + 4\cdot\frac{(x-0)(x-2)}{(3-0)(3-2)} \\
&= \frac{1}{6}\left(x^2-5x+6\right) + 2\left(-\frac{1}{2}\right)\left(x^2-3x\right) + 4\left(\frac{1}{3}\right)\left(x^2-2x\right) \\
&= \frac{x^2}{2} - \frac{x}{2} + 1 .
\end{aligned}
\tag{10}
$$

Check that $p_2(0) = 1$, $p_2(2) = 2$, and $p_2(3) = 4$.
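
As a quick check in code (a minimal sketch; the name p2 is introduced here just for this example):

p2(x) = x^2/2 - x/2 + 1
p2(0), p2(2), p2(3) # (1.0, 2.0, 4.0)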

x = [0,2,3]
y = [1,2,4]
scatter(x, y, label = L"(x_i, y_i)")

Lagrange interpolation in code#

function lagrange(x, y)
    # Return a function p(t) that evaluates the Lagrange interpolant of the data (x, y) at t.
    function p(t)
        w = 0
        for (i, yi) in enumerate(y)
            # i-th Lagrange basis function evaluated at t, scaled by y_i
            w += yi * (prod(t .- x[1:i-1]) * prod(t .- x[i+1:end])
                / (prod(x[i] .- x[1:i-1]) * prod(x[i] .- x[i+1:end])))
        end
        w
    end
    return p
end


x = LinRange(-1.5, 2, 4)
y = sin.(x)
p = lagrange(x, y)

scatter(x, y, label = L"(x_i, y_i)")
s = LinRange(-3, 3, 50)

plot!(s, [sin.(s) p.(s)], label = ["sin(x)" "p(x)"])
# Note: the product of an empty collection is 1 (the multiplicative identity),
# which is what makes the first and last terms in `lagrange` (empty products) work.
prod(Float64[])

We don’t have cond(lagrange(x, y))#

Recall that `lagrange(x, y)` returns a function, and for a function the relative condition number is

$$
\kappa(f) = \frac{|f'(x)|\,|x|}{|f(x)|} ,
$$

but this definition depends on the input $x$, and it is difficult to explore that whole space. We can still probe the sensitivity empirically, as in the sketch below.

  • We also don’t have an easy way to evaluate derivatives.
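
One crude empirical probe (a minimal sketch reusing lagrange, x, y, and s from above; it samples a single perturbation, so it is not a true condition number):

p = lagrange(x, y)
dy = 1e-8 * randn(length(y))      # small random perturbation of the data
p_perturbed = lagrange(x, y + dy)
maximum(abs.(p_perturbed.(s) - p.(s))) / maximum(abs.(dy)) # amplification of this perturbation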

We can also visualize the cardinal (Lagrange-style) basis numerically: the matrix below maps values at the nodes xx to values of the interpolant on the fine grid s, so its columns are the cardinal basis functions.

k = 5; xx = LinRange(-1, 1, k)
B = vander(s, k) / vander(xx) # 50×5 matrix: column j is the cardinal function of node xx[j] evaluated on s
plot(s, B) # plot the columns of B
scatter!(xx, [zero.(xx) one.(xx)], color=:black, legend=:none, ylims=(-2, 2)) # zoom in with xlims=(-1, 1) for a closer look

3. Newton polynomials#

Given the set of data points $(x_0, y_0), (x_1, y_1), \ldots, (x_m, y_m)$, the Newton interpolation polynomial is a linear combination of Newton basis polynomials:

$$
N(x) = \sum_{k=0}^{m} a_k\, n_k(x)
$$

with the Newton basis polynomials defined as

$$
n_k(x) = \prod_{i=0}^{k-1} (x - x_i)
$$

where no two $x_j$ are the same. This constructs the polynomial inductively: the coefficient $a_k$ only depends on the data points up to $(x_k, y_k)$.

The idea here is to build up the polynomial inductively, adding one point at a time. The added term will not “spoil” the previous work.
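
For example, with three data points the Newton basis is

$$
n_0(x) = 1, \qquad n_1(x) = (x - x_0), \qquad n_2(x) = (x - x_0)(x - x_1),
$$

so the interpolant takes the nested form $N(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)(x - x_1)$.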

This gives the Vandermonde interpolation problem as

$$
\Bigl[\, \mathbf{1} \;\Big|\; (x - x_0) \;\Big|\; (x - x_0)(x - x_1) \;\Big|\; \cdots \,\Bigr] \begin{bmatrix} p \end{bmatrix} = \begin{bmatrix} y \end{bmatrix}
$$

where each column contains the corresponding Newton basis polynomial evaluated at all of the points $x$.

  • How does the Vandermonde procedure change if we replace the monomial xk with nk(x)?

  • Does the matrix have recognizable structure?

  • By construction, adding a new point to the interpolation is easy because the coefficients already computed do not change (see the small check after the vander_newton code below).

  • The Newton coefficients $a_k$ are defined inductively via divided differences, which are usually organized in a triangular table.

function vander_newton(x, abscissa=nothing)
    # Newton "Vandermonde" matrix: column j holds the Newton basis polynomial
    # n_{j-1}(x) = (x - abscissa_1)⋯(x - abscissa_{j-1}) evaluated at the points x.
    if isnothing(abscissa)
        abscissa = x
    end
    n = length(abscissa)
    A = ones(length(x), n)
    for i in 2:n
        A[:,i] = A[:,i-1] .* (x .- abscissa[i-1])
    end
    A
end
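
As a small check of the bullet above (a minimal sketch; the names x4, y4, x5, y5, a4, a5 are introduced here just for illustration), adding a fifth point leaves the first four Newton coefficients unchanged:

x4 = [0.0, 0.5, 1.0, 1.5]
y4 = sin.(x4)
a4 = vander_newton(x4, x4) \ y4 # coefficients from 4 points
x5 = [x4; 2.0]; y5 = sin.(x5)   # append one more point
a5 = vander_newton(x5, x5) \ y5 # coefficients from 5 points
a5[1:4] ≈ a4                    # true: the earlier coefficients are unchanged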

A = vander_newton(s, x)
plot(s, A, ylims=(-3, 3), label = ["A_1" "A_2" "A_3" "A_4"])
scatter!(x, [zero.(x)], color=:black, label=nothing)
p = vander_newton(x, x) \ y
scatter(x, y)
plot!(s, [sin.(s), vander_newton(s, x) * p], label = ["sin(x)" "vander_newton p(x)"])

Newton Vandermonde matrix structure#

  • How much does it cost to solve with a general $n \times n$ dense matrix?

    1. $O(n \log n)$

    2. $O(n^2)$

    3. $O(n^3)$

vander_newton(LinRange(-1, 1, 5))
  • How much does it cost to solve with a Newton Vandermonde matrix?

You can rearrange the Newton Vandermonde interpolation problem using nested multiplications, which reduces the cost to $O(n^2)$. In fact, the matrix is lower triangular (since $n_k(x_j) = 0$ whenever $j < k$), so it can be solved by forward substitution; equivalently, the coefficients $a_k$ are divided differences, which can be computed directly in $O(n^2)$ operations, as in the sketch below.
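
A minimal sketch (not the lecture's code; the name newton_coefficients is introduced here) computing the Newton coefficients as divided differences in $O(n^2)$ operations, reusing the x, y, and vander_newton defined above:

function newton_coefficients(x, y)
    # Divided differences computed in place; a[k] ends up as f[x_1, ..., x_k].
    a = float.(copy(y))
    n = length(x)
    for k in 2:n
        for i in n:-1:k
            a[i] = (a[i] - a[i-1]) / (x[i] - x[i-k+1])
        end
    end
    a
end

newton_coefficients(x, y) ≈ vander_newton(x, x) \ y # true: same coefficients as the triangular solve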

How is the conditioning of these matrices?#

vcond(mat, points, nmax) = [cond(mat(points(-1, 1, n))) for n in 2:nmax]

plot([vcond(vander, LinRange, 20)], yscale=:log10, legend=:none)

A well-conditioned basis:#

Let’s construct the Vandermonde matrix, but instead of using the canonical basis for polynomials, we’ll use the Legendre polynomials.

function vander_legendre(x, k=nothing)
    if isnothing(k)
        k = length(x) # Square by default
    end
    m = length(x)
    Q = ones(m, k)
    Q[:, 2] = x
    for n in 1:k-2
        Q[:, n+2] = ((2*n + 1) * x .* Q[:, n+1] - n * Q[:, n]) / (n + 1) # three-term recurrence (Bonnet's formula) for the Legendre polynomials
    end
    Q
end
plot([vcond(vander_legendre, LinRange, 20)], yscale=:log10)

4. Choice of points and choice of bases#

Let's look at different sets of points and bases, starting with the Chebyshev nodes:

# The following are called Chebyshev nodes
CosRange(a, b, n) = (a + b)/2 .+ (b - a)/2 * cos.(LinRange(-pi, 0, n))
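
In formula terms (just restating the code above), these nodes are cosines of equally spaced angles,

$$
x_j = \frac{a+b}{2} + \frac{b-a}{2} \cos\theta_j, \qquad \theta_j \in [-\pi, 0] \text{ equally spaced},
$$

which clusters the points toward the endpoints of $[a, b]$.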

plot([vcond(vander, LinRange, 20)], yscale=:log10, legend=:topleft, label= "vander LinRange")
plot!([vcond(vander, CosRange, 20)], yscale=:log10, label = "vander CosRange")
plot!([vcond(vander_legendre, LinRange, 20)], yscale=:log10, label = "vander_legendre LinRange")
plot!([vcond(vander_legendre, CosRange, 20)], yscale=:log10, label =  "vander_legendre CosRange")

What’s wrong with ill-conditioning?#

runge1(x) = 1 / (1 + 10*x^2)

x_lin = LinRange(-1, 1, 20)
x_cos = CosRange(-1, 1, 20)
y_lin = runge1.(x_lin)
y_cos = runge1.(x_cos)
plot(runge1, xlims=(-1, 1))
scatter!(x_lin, y_lin)
plot(runge1, xlims=(-1, 1))
scatter!(x_cos, y_cos)
ourvander = vander_legendre
p = ourvander(x_lin) \ y_lin
scatter(x_lin, y_lin, label = L"(x_i,y_i)")
s = LinRange(-1, 1, 100)
plot!(s, runge1.(s), label = "runge")
plot!(s, ourvander(s, length(x_lin)) * p, label = "vander_legendre", title = "equi-spaced points")
# let's try again with the Chebyshev nodes
ourvander = vander_legendre
p = ourvander(x_cos) \ y_cos
scatter(x_cos, y_cos, label = L"(x_i,y_i)")
s = LinRange(-1, 1, 100)
plot!(s, runge1.(s), label = "runge")
plot!(s, ourvander(s, length(x_cos)) * p, label = "vander_legendre", title = "Chebyshev nodes")
# let's look at what happened with the equi-spaced points
ourvander(s, length(x_lin)) # 100×20 matrix
ourvander(x_lin) # 20×20 matrix
ourvander(s, length(x_lin)) / ourvander(x_lin) # 100×20 matrix that maps nodal values on x_lin to values on s
cond(ourvander(s, length(x_lin)) / ourvander(x_lin))
cond(ourvander(x_lin))
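
To quantify what the plots show (a minimal sketch reusing runge1, ourvander, s, and the two node sets from above; p_lin, p_cos, err_lin, err_cos are names introduced here), compare the maximum error of the two interpolants on s:

p_lin = ourvander(x_lin) \ y_lin
p_cos = ourvander(x_cos) \ y_cos
err_lin = maximum(abs.(ourvander(s, length(x_lin)) * p_lin - runge1.(s)))
err_cos = maximum(abs.(ourvander(s, length(x_cos)) * p_cos - runge1.(s)))
@show err_lin err_cos # expect the equi-spaced error to be far larger (Runge phenomenon)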

Important

The take-home message here is that equally spaced interpolants of high degree can be disastrous (and should be avoided if possible). However, with the right choice of points, high degree interpolants can work.

The worst vector#

A = ourvander(s, length(x_lin)) / ourvander(x_lin)

U, S, V = svd(A) # thin SVD: U is 100×20, S holds the 20 singular values, V is 20×20
V[:, 1:1] # first right singular vector: the combination of nodal values amplified the most
plot(x_lin, V[:, 1:1])
plot(s, U[:,1]) # its (normalized) image on the fine grid s
@show S[1] # largest singular value: the worst-case amplification factor