10) Rootfinding#
Last time#
More on floating point
Discuss Taylor Series activity
Condition numbers
Today#
Intro to forward and backward error
Computing volume of a polygon
Rootfinding
Bisection Method
using Plots
default(linewidth=4)
What we’re hoping for is for our problems (functions) and algorithms to be reliable (i.e., we can trust the result that our computer spits out). Reliable = well-conditioned and stable. But, unfortunately:
Mathematical functions \(f(x)\) can be ill-conditioned (big \(\kappa\))
Modeling is how we turn an abstract question into a mathematical function
We want well-conditioned models (small \(\kappa\))
Some systems are intrinsically sensitive: fracture, chaotic systems, combustion
Algorithms f(x) can be unstable
Unreliable, though sometimes practical
Unstable algorithms are constructed from ill-conditioned parts
1. Introduction to forward and backward error#
Activity: Read Backward error
Backward error measures what change to the original data would reproduce the result found by an algorithm.
Let \(\tilde{y} = \tilde{f}(x)\) be a computed result for the original data \(x\). If there is a value \(\tilde{x}\) such that
\[ \tilde{f}(x) = f(\tilde{x}), \]
then we call \(|\tilde{x} - x|/|x|\) the (relative) backward error of the result.
Instead of asking, “How close to the true answer is your answer?”, backward error asks, “How close to the true question is the question you answered?”
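As a concrete illustration (my own example, not from the reading): for a computed square root \(\tilde{y} \approx \sqrt{x}\), the exact function maps \(\tilde{x} = \tilde{y}^2\) back to \(\tilde{y}\), so the backward error is simply \(|\tilde{y}^2 - x|/|x|\) and, unlike the forward error, needs no accurate reference value.

```julia
x = 2.0
y = sqrt(x)  # computed result ỹ

# Forward error needs an accurate reference value; backward error does not.
forward_err  = Float64(abs(y - sqrt(big(x))) / sqrt(big(x)))
backward_err = Float64(abs(big(y)^2 - x) / x)

@show forward_err backward_err  # both at the level of eps(): sqrt is backward stable
```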
Aside: Verification and Validation (V&V)
This is similar to the difference between Verification and Validation (V&V) of your results.
In Numerical Analysis, the concepts of Verification and Validation can be oversimplified in a succinct manner by saying that “verification is checking that you are solving the equations right” (verifying the numerics) and “validation is checking that you are solving the right equations” (for physical or engineering problems this involves verifying the physics - often done by comparing model results with actual data given by observations).
Similarly, in Software Engineering, we can say that verification is asking the question: “are we building the product right (i.e., correctly)?” and validation: “are we building the right product (i.e., does it match the original plan/design)?”
From a modeling and simulation point of view in scientific computing, best practices such as checking the convergence or order of accuracy of a numerical scheme (more on this later) are part of verification. If we are modeling a physical system using conservation laws, then checking that the quantities of interest are really conserved is part of validation (in fact, we would also check that our assumptions are not violated).
For ill-conditioned functions, the best we can hope for is small backward error.
If an algorithm always produces small backward errors, then it is stable. But the converse is not always true: some stable algorithms may produce a large backward error.
Exercise 10.1#
Find a function \(f(x)\) that models something you’re interested in
Plot its condition number \(\kappa\) as a function of \(x\)
Plot the relative error (using single or double precision; compare against an accurate reference computed using Julia’s big)
Is the relative error ever much bigger than \(\kappa \epsilon_{\text{machine}}\)?
If so, can you find what caused the instability in your algorithm?
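A minimal sketch of the workflow this exercise asks for, using \(f(x) = \log x\) as a hypothetical choice of model; its condition number is \(\kappa(x) = |x f'(x)/f(x)| = 1/|\log x|\), which blows up near \(x = 1\).

```julia
using Plots
default(linewidth=4)

kappa(x) = 1 / abs(log(x))  # condition number of log

# Relative error of single-precision log against a BigFloat reference
relerr(x) = Float64(abs(log(Float32(x)) - log(big(x))) / abs(log(big(x))))

xs = range(0.505, 1.495, length=100)  # avoid x = 1, where log(x) = 0
plot(xs, [relerr.(xs) kappa.(xs) .* eps(Float32)], yscale=:log10,
     xlabel="x", label=["relative error" "κ ϵ_machine"])
```

Near \(x = 1\) the measured error tracks \(\kappa \epsilon_{\text{machine}}\): the algorithm is stable, but the function itself is ill-conditioned there.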
Further reading: FNC Introduction#
2. Stability demo: Volume of a polygon#
Activity: run the following code and comment on the result
X = [1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0]
R(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)] # rotation matrix
Y = X * R(deg2rad(30))' .+ [0 0]
plot(Y[:,1], Y[:,2], seriestype=:shape, aspect_ratio=:equal, xlims=(-3, 3), ylims=(-3, 3))
using LinearAlgebra
function pvolume(X)
n = size(X, 1)
vol = sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)
end
@show ref_vol = pvolume(X)
@show rotated_vol = pvolume(Y)
ref_vol = pvolume(X) = 9.25
rotated_vol = pvolume(Y) = 9.25
9.25
Tip
If you are not familiar with this formula for the (signed) area of a polygon, see the shoelace formula.
What happens if this polygon is translated, perhaps far away? (We’ll see more next time)
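A quick preview (my own experiment, repeating the polygon and pvolume definitions so this cell runs on its own): translating far from the origin makes each determinant huge while the area stays 9.25, so cancellation destroys most of the accuracy.

```julia
using LinearAlgebra

X = [1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0]
function pvolume(X)
    n = size(X, 1)
    sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)
end

@show pvolume(X)          # 9.25, as before
@show pvolume(X .+ 1e8)   # same shape, but each determinant is now ~1e16
```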
3. Rootfinding#
Given \(f(x)\), find \(x_{\star}\) such that \(f(x_{\star}) = 0\).
We’ll work with scalars (\(f\) and \(x\) are just numbers) for now, but keep in mind that they can be vector-valued.
Exercise 10.2#
Solve for \(x\)
\(f(x; b) = x^2 - b\)
\(x(b) = \sqrt{b}\)
\(f(x; b) = \tan x - b\)
\(x(b) = \arctan b\)
\(f(x; b) = \cos x + x - b\)
\(x(b) = ?\)
We aren’t given \(f(x)\), but rather an algorithm f(x) that approximates it.
Sometimes we get extra information, like fprime(x) that approximates \(f'(x)\).
If we have source code for f(x), maybe it can be transformed “automatically” to obtain fprime(x).
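One way to get such a transformation is automatic differentiation; a sketch assuming the ForwardDiff package (an assumption — it is not used elsewhere in these notes):

```julia
import Pkg; Pkg.add("ForwardDiff")
using ForwardDiff

f(x) = cos(x) + x
fprime(x) = ForwardDiff.derivative(f, x)  # derivative obtained from the code for f

@show fprime(0.5)  # compare with the hand derivative: 1 - sin(0.5)
```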
Example: Queueing#
In a simple queueing model, there is an arrival rate and a departure (serving) rate. While waiting in the queue, there is a probability of “dropping out”. The length of the queue in this model is
One model for the waiting time (where these rates are taken from exponential distributions) is
wait(arrival, departure) = log(arrival / departure) / departure
plot(d -> wait(1, d), xlims=(.1, 1), xlabel="departure", ylabel="wait", label = "wait")
Departure rate given wait#
Easy to measure wait
I have a limited tolerance for waiting
my_wait = 0.8
plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait")
import Pkg; Pkg.add("Roots")
using Roots
d0 = find_zero(d -> wait(1, d) - my_wait, 1)
plot([d -> wait(1, d) - my_wait, d -> 0], xlims=(.1, 1), xlabel="departure", ylabel="wait")
scatter!([d0], [0], marker=:circle, color=:black, title="d0 = $d0")
Resolving package versions...
No Changes to `~/.julia/environments/v1.10/Project.toml`
No Changes to `~/.julia/environments/v1.10/Manifest.toml`
4. Bisection Method#
Bisection is a rootfinding technique that starts with an interval \([a,b]\) containing a root and does not require derivatives. Suppose \(f\) is continuous. What is a sufficient condition for \(f\) to have a root on \([a,b]\)?
Hint: Intermediate value theorem and its corollary, Bolzano’s theorem.
hasroot(f, a, b) = f(a) * f(b) < 0
f(x) = cos(x) - x
plot(f, label = "cos(x) - x")
Bisection Algorithm: a recursive algorithm#
function bisect(f, a, b, tol)
mid = (a + b) / 2
if abs(b - a) < tol
return mid
elseif hasroot(f, a, mid)
return bisect(f, a, mid, tol)
else
return bisect(f, mid, b, tol)
end
end
x0 = bisect(f, -1, 3, 1e-5)
x0
0.7390861511230469
# is it really a root? Let's check:
f(x0)
-1.7035832658995886e-6
How fast does it converge?#
Let’s look at how many iterations it took to find the root.
function bisect_hist(f, a, b, tol)
mid = (a + b) / 2
if abs(b - a) < tol
return [mid]
elseif hasroot(f, a, mid)
return prepend!(bisect_hist(f, a, mid, tol), [mid])
else
return prepend!(bisect_hist(f, mid, b, tol), [mid])
end
end
bisect_hist (generic function with 1 method)
bisect_hist(f, -1, 3, 1e-4)
17-element Vector{Float64}:
1.0
0.0
0.5
0.75
0.625
0.6875
0.71875
0.734375
0.7421875
0.73828125
0.740234375
0.7392578125
0.73876953125
0.739013671875
0.7391357421875
0.73907470703125
0.739105224609375
Iterative bisection: an optimized algorithm#
Data structures are often optimized for appending rather than prepending.
Saves stack space
function bisect_iter(f, a, b, tol)
hist = Float64[]
while abs(b - a) > tol
mid = (a + b) / 2
push!(hist, mid)
if hasroot(f, a, mid)
b = mid
else
a = mid
end
end
hist
end
bisect_iter (generic function with 1 method)
bisect_iter(f, -1, 3, 1e-4)
16-element Vector{Float64}:
1.0
0.0
0.5
0.75
0.625
0.6875
0.71875
0.734375
0.7421875
0.73828125
0.740234375
0.7392578125
0.73876953125
0.739013671875
0.7391357421875
0.73907470703125
Let’s plot the error#
We plot \(|x_k - r|\), where \(r\) is the true root, \(f(r) = 0\).
r = bisect(f, -1, 3, 1e-14) # what are we trusting?
hist = bisect_hist(f, -1, 3, 1e-10)
scatter( abs.(hist .- r), yscale=:log10)
ks = 1:length(hist)
plot!(ks, 4 * (.5 .^ ks))
Evidently the error \(e_k = x_k - x_*\) after \(k\) bisection iterations satisfies the bound
\[ |e_k| \le (b - a)\, 2^{-k} . \]
The absolute error is halved at each step, so the method converges linearly. But is this all there is to it? We’ll see more next time.
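The halving also predicts the iteration count: the interval shrinks from \(b - a\) to \((b - a)2^{-k}\), so reaching tolerance tol takes about \(\lceil \log_2((b-a)/\mathrm{tol}) \rceil\) steps. Checking against the run above:

```julia
# Predicted number of bisection steps to shrink [a, b] = [-1, 3] below tol
a, b, tol = -1, 3, 1e-4
predicted = ceil(Int, log2((b - a) / tol))
@show predicted  # 16, matching the 16-element history from bisect_iter
```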
Exercise 10.3#
Find an interval of length \(1\) with integer endpoints on which a solution to \(x^3=9\) lies. Then use the bisection code starting with this interval to find the solution within a \(10^{-4}\) tolerance.
Find an interval of length \(1\) with integer endpoints on which a solution to \(6 + \cos^2{x} = x\) lies. Then use the bisection code starting with this interval to find the solution within a \(10^{-4}\) tolerance.
Suppose we want to know what interest rate we would need to achieve (assuming constant interest rates) such that the initial deposit of \(\$50,000\) and annual contribution of \(\$10,000\) will grow to \(\$1,000,000\) in \(20\) years. Thus we need to solve \(1000000 = \left( 50000 + \frac{10000}{r} \right) e^{20r} - \frac{10000}{r}\) for the interest rate, \(r\). Find a reasonable interval on which the solution ought to lie and use the bisection code to find \(r\) to within \(10^{-6}\).
Find appropriate intervals and use the bisection code to find ALL the root(s) of \(|x|e^{x}= 0.25\).
Use the bisection code to find the root of \(27 x^3 - 27 x^2 + 9x = 1\) on \([0, 1]\) to within \(10^{-6}\).
Use the bisection code to find the root of \(64 x^3 - 48x^2 + 12x = 1\) on \([0, 1]\) to within \(10^{-6}\).
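A possible starting point for the interest-rate exercise (the bracket \([0.05, 0.10]\) is my own, found by checking signs; bisect and hasroot are repeated from above so this cell is self-contained):

```julia
hasroot(f, a, b) = f(a) * f(b) < 0
function bisect(f, a, b, tol)
    mid = (a + b) / 2
    abs(b - a) < tol && return mid
    hasroot(f, a, mid) ? bisect(f, a, mid, tol) : bisect(f, mid, b, tol)
end

# Balance after 20 years minus the $1,000,000 target
g(r) = (50000 + 10000 / r) * exp(20r) - 10000 / r - 1000000

@show g(0.05) < 0  # 5% is too slow: the balance ends below the target
@show g(0.10) > 0  # 10% overshoots, so a root lies in [0.05, 0.10]
r = bisect(g, 0.05, 0.10, 1e-6)
```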