10) Rootfinding#

Last time#

  • More on floating point

  • Discuss Taylor Series activity

  • Condition numbers

Today#

  1. Intro to forward and backward error

  2. Computing volume of a polygon

  3. Rootfinding

  4. Bisection Method

using Plots
default(linewidth=4)

What we’re hoping for is that our problems (functions) and algorithms are reliable, i.e., that we can trust the result our computer spits out. Reliable = well-conditioned and stable. But unfortunately:

Mathematical functions \(f(x)\) can be ill-conditioned (big \(\kappa\))

  • Modeling is how we turn an abstract question into a mathematical function

  • We want well-conditioned models (small \(\kappa\))

  • Some systems are intrinsically sensitive: fracture, chaotic systems, combustion

Algorithms f(x) can be unstable

  • Unreliable, though sometimes practical

  • Unstable algorithms are constructed from ill-conditioned parts
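For instance (a sketch, not from the original notes): \(f(x) = 1 - \cos x\) is well-conditioned near zero (\(\kappa \to 2\) as \(x \to 0\)), yet the obvious algorithm subtracts two nearly equal numbers, an ill-conditioned part, and loses every significant digit:

x = 1e-8
naive = 1 - cos(x)        # catastrophic cancellation: cos(x) rounds to 1
stable = 2 * sin(x / 2)^2 # algebraically identical, but no subtraction
@show naive stable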

1. Introduction to forward and backward error#

  • Backward error measures what change to the original data would reproduce the result found by an algorithm.

  • Let \(\tilde{y} = \tilde{f}(x)\) be a computed result for the original data \(x\). If there is a value \(\tilde{x}\) such that

\[ f(\tilde{x})=\tilde{y} =\tilde{f}(x), \]

then we call \(\lvert \tilde{x} - x \rvert / \lvert x \rvert\) the (relative) backward error of the result.

  • Instead of asking, “How close to the true answer is your answer?”, backward error asks, “How close to the true question is the question you answered?”
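To make this concrete, here is a minimal sketch (the input \(x = 2\) and the use of single precision are illustrative choices): for \(y = \sqrt{x}\), any computed \(\tilde y\) is the exact square root of \(\tilde x = \tilde y^2\), so the backward error can be evaluated directly.

x = 2.0
y_tilde = Float64(sqrt(Float32(x)))  # computed result: sqrt evaluated in single precision
forward_error = abs(y_tilde - sqrt(big(x))) / sqrt(big(x))  # how close to the true answer
x_tilde = y_tilde^2                  # the question that y_tilde answers exactly
backward_error = abs(x_tilde - x) / abs(x)                  # how close to the true question
@show forward_error
@show backward_error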

Aside: Verification and Validation (V&V)

This is similar to the difference between Verification and Validation (V&V) of your results.

  • In numerical analysis, the concepts of Verification and Validation can be summarized succinctly: “verification is checking that you are solving the equations right” (verifying the numerics), while “validation is checking that you are solving the right equations” (for physical or engineering problems this involves checking the physics, often by comparing model results with observational data).

  • Similarly, in software engineering, verification asks “are we building the product right (i.e., correctly)?”, while validation asks “are we building the right product (i.e., does it match the original plan/design)?”

  • From a modeling and simulation point of view in scientific computing, best practices such as checking the convergence or order of accuracy of a numerical scheme (more on this later) are part of verification. If we are modeling a physical system using conservation laws, then checking that the quantities of interest are really conserved, and that our assumptions are not violated, is part of validation.

  • For ill-conditioned functions, the best we can hope for is small backward error.

  • If an algorithm always produces small backward errors, then it is stable. But the converse is not always true: some stable algorithms may produce a large backward error.
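These two notions are linked by the condition number: if \(f(\tilde{x}) = \tilde{f}(x)\), then to first order

\[ \frac{\lvert \tilde{f}(x) - f(x) \rvert}{\lvert f(x) \rvert} \approx \kappa \, \frac{\lvert \tilde{x} - x \rvert}{\lvert x \rvert}, \]

i.e., forward error \(\approx\) condition number \(\times\) backward error. This is why a small backward error is the right goal for ill-conditioned problems.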

Exercise 10.1#

  • Find a function \(f(x)\) that models something you’re interested in

  • Plot its condition number \(\kappa\) as a function of \(x\)

  • Plot the relative error (using single or double precision; compare using Julia’s big)

  • Is the relative error ever much bigger than \(\kappa \epsilon_{\text{machine}}\)?

  • If so, can you find what caused the instability in your algorithm?

Further reading: FNC Introduction#

2. Stability demo: Volume of a polygon#

  • Activity: run the following code and comment on the result

X = [1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0] # vertices; the first point is repeated to close the polygon
R(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)] # rotation matrix
Y = X * R(deg2rad(30))' .+ [0 0] # rotate each vertex by 30 degrees, translate by (0, 0)
plot(Y[:,1], Y[:,2], seriestype=:shape, aspect_ratio=:equal, xlims=(-3, 3), ylims=(-3, 3))
using LinearAlgebra
function pvolume(X)
    n = size(X, 1)
    # shoelace formula: sum the signed areas det([x_i; x_{i+1}]) / 2 over
    # consecutive vertex pairs (the vertex list closes by repeating the first point)
    vol = sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)
end

@show ref_vol = pvolume(X)
@show rotated_vol = pvolume(Y)
ref_vol = pvolume(X) = 9.25
rotated_vol = pvolume(Y) = 9.25
9.25

Tip

If you are not familiar with the determinant-based (shoelace) area formula used above, see this.

  • What happens if this polygon is translated, perhaps far away? (We’ll see more next time)
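As a quick experiment (a sketch; the offset \(10^8\) is an arbitrary choice, reusing X and pvolume from above), shift every vertex by a large amount and recompute:

Y_far = X .+ [1e8 1e8] # translate the polygon far from the origin
@show pvolume(Y_far)   # the area should still be 9.25... is it?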

3. Rootfinding#

Given \(f(x)\), find \(x_{\star}\) such that \(f(x_{\star}) = 0\).

We’ll work with scalars (\(f\) and \(x\) are just numbers) for now, but keep in mind that they can be vector-valued.

Exercise 10.2#

Solve for \(x\)

  • \(f(x; b) = x^2 - b\)

    • \(x(b) = \sqrt{b}\)

  • \(f(x; b) = \tan x - b\)

    • \(x(b) = \arctan b\)

  • \(f(x) = \cos x + x - b\)

    • \(x(b) = ?\) (no closed form; we’ll solve such equations numerically with rootfinding below)

We aren’t given \(f(x)\), but rather an algorithm f(x) that approximates it.

  • Sometimes we get extra information, like fprime(x) that approximates \(f'(x)\)

  • If we have source code for f(x), maybe it can be transformed “automatically” to obtain fprime(x)
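One such transformation is automatic differentiation. For example (a sketch; ForwardDiff is one of several AD packages, not one prescribed by these notes):

import Pkg; Pkg.add("ForwardDiff")
using ForwardDiff
g(x) = cos(x) + x
gprime(x) = ForwardDiff.derivative(g, x) # derivative computed from the source of g
@show gprime(0.5)
@show 1 - sin(0.5) # the hand-written derivative, for comparison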

Example: Queueing#

In a simple queueing model, there is an arrival rate and a departure (serving) rate. While waiting in the queue, there is a probability of “dropping out”. The length of the queue in this model is

\[ \text{length} = \frac{\text{arrival} - \text{drop}}{\text{departure}} \]

One model for the waiting time (where these rates are taken from exponential distributions) is

wait(arrival, departure) = log(arrival / departure) / departure

plot(d -> wait(1, d), xlims=(.1, 1), xlabel="departure", ylabel="wait", label = "wait")

Departure rate given wait#

  • Easy to measure wait

  • I have a limited tolerance for waiting

my_wait = 0.8
plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait")
import Pkg; Pkg.add("Roots")
using Roots
d0 = find_zero(d -> wait(1, d) - my_wait, 1) # solve wait(1, d) = my_wait for d, starting from the guess d = 1
plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait")
scatter!([d0], [0], marker=:circle, color=:black, title="d0 = $d0")

4. Bisection Method#

Bisection is a rootfinding technique that starts with an interval \([a,b]\) containing a root and does not require derivatives. Suppose \(f\) is continuous. What is a sufficient condition for \(f\) to have a root on \([a,b]\)?

Hint: Intermediate value theorem and its corollary, Bolzano’s theorem.

hasroot(f, a, b) = f(a) * f(b) < 0

f(x) = cos(x) - x
plot(f, label = "cos(x) - x")

Bisection Algorithm: a recursive algorithm#

function bisect(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return mid                    # interval short enough: accept the midpoint
    elseif hasroot(f, a, mid)
        return bisect(f, a, mid, tol) # sign change in the left half
    else
        return bisect(f, mid, b, tol) # otherwise search the right half
    end
end

x0 = bisect(f, -1, 3, 1e-5)
x0
0.7390861511230469
# is it really a root? Let's check:
f(x0)
-1.7035832658995886e-6
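Tying back to Exercise 10.2: \(\cos x + x = b\) has no closed-form solution, but bisect solves it directly (a sketch; the value \(b = 2\) is an arbitrary choice):

b = 2
xb = bisect(x -> cos(x) + x - b, 0, 4, 1e-10) # sign change: -1 at x = 0, ≈ 1.35 at x = 4
@show xb
@show cos(xb) + xb # recovers b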

How fast does it converge?#

Let’s look at how many iterations it took to find the root.

function bisect_hist(f, a, b, tol)
    mid = (a + b) / 2
    if abs(b - a) < tol
        return [mid]
    elseif hasroot(f, a, mid)
        return prepend!(bisect_hist(f, a, mid, tol), [mid])
    else
        return prepend!(bisect_hist(f, mid, b, tol), [mid])
    end
end
bisect_hist (generic function with 1 method)
bisect_hist(f, -1, 3, 1e-4)
17-element Vector{Float64}:
 1.0
 0.0
 0.5
 0.75
 0.625
 0.6875
 0.71875
 0.734375
 0.7421875
 0.73828125
 0.740234375
 0.7392578125
 0.73876953125
 0.739013671875
 0.7391357421875
 0.73907470703125
 0.739105224609375

Iterative bisection: an optimized algorithm#

  • Data structures are often optimized for appending rather than prepending.

  • Saves stack space

function bisect_iter(f, a, b, tol)
    hist = Float64[]
    while abs(b - a) > tol
        mid = (a + b) / 2
        push!(hist, mid)
        if hasroot(f, a, mid)
            b = mid
        else
            a = mid
        end
    end
    hist
end
bisect_iter (generic function with 1 method)
bisect_iter(f, -1, 3, 1e-4)
16-element Vector{Float64}:
 1.0
 0.0
 0.5
 0.75
 0.625
 0.6875
 0.71875
 0.734375
 0.7421875
 0.73828125
 0.740234375
 0.7392578125
 0.73876953125
 0.739013671875
 0.7391357421875
 0.73907470703125

Let’s plot the error#

\[ \lvert \texttt{bisect}^k(f, a, b) - r \rvert, \quad k = 1, 2, \dotsc \]

where \(r\) is the true root, \(f(r) = 0\).

r = bisect(f, -1, 3, 1e-14) # what are we trusting?
hist = bisect_hist(f, -1, 3, 1e-10)
scatter( abs.(hist .- r), yscale=:log10)
ks = 1:length(hist)
plot!(ks, 4 * (.5 .^ ks))

Evidently the error \(e_k = x_k - x_\star\) after \(k\) bisection iterations satisfies the bound

\[ |e_k| \le c \, 2^{-k} . \]

  • The absolute error is halved at each step, so the method converges linearly. But is this all there is to it? We’ll see more next time.
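The bound also predicts the iteration count: halving an interval of length \(b - a\) until it is shorter than tol takes about \(\log_2 \frac{b-a}{\text{tol}}\) steps. A quick check (a sketch, reusing f and bisect_iter from above):

a0, b0, tol = -1, 3, 1e-4
k_pred = ceil(Int, log2((b0 - a0) / tol)) # predicted number of halvings
@show k_pred
@show length(bisect_iter(f, a0, b0, tol)) # observed number of iterations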

Exercise 10.3#

  1. Find an interval of length \(1\) with integer endpoints on which a solution to \(x^3=9\) lies. Then use the bisection code starting with this interval to find the solution within a \(10^{-4}\) tolerance.

  2. Find an interval of length \(1\) with integer endpoints on which a solution to \(6 + \cos^2{x} = x\) lies. Then use the bisection code starting with this interval to find the solution within a \(10^{-4}\) tolerance.

  3. Suppose we want to know what interest rate we would need to achieve (assuming constant interest rates) such that the initial deposit of \(\$50,000\) and annual contribution of \(\$10,000\) will grow to \(\$1,000,000\) in \(20\) years. Thus we need to solve \(1000000 = \left( 50000 + \frac{10000}{r} \right) e^{20r} - \frac{10000}{r}\) for the interest rate, \(r\). Find a reasonable interval on which the solution ought to lie and use the bisection code to find \(r\) to within \(10^{-6}\).

  4. Find appropriate intervals and use the bisection code to find ALL the root(s) of \(|x|e^{x}= 0.25\).

  5. Use the bisection code to find the root of \(27 x^3 - 27 x^2 + 9x = 1\) on \([0, 1]\) to within \(10^{-6}\).

  6. Use the bisection code to find the root of \(64 x^3 - 48x^2 + 12x = 1\) on \([0, 1]\) to within \(10^{-6}\).