9) Conditioning#

Today:

  1. Recap from last time

  2. Condition numbers

using Plots
default(linewidth=3, legendfontsize=12, xtickfontsize=12, ytickfontsize=12)

1. Recap from last time#

Last time we saw:

Exact arithmetic, correctly rounded#

Take an elementary math operation \(*\) (addition, subtraction, multiplication, division), and the discrete operation that our computers perform, \(\circledast\). Then

\[x \circledast y := \operatorname{fl}(x * y)\]

with a relative accuracy \(\epsilon_{\text{machine}}\),

\[ \frac{|(x \circledast y) - (x * y)|}{|x * y|} \le \epsilon_{\text{machine}} . \]
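As a quick sanity check (mine, not from the lecture), we can verify this bound for a single operation by comparing double precision against extended precision:

```julia
# One correctly rounded operation: the relative error of Float64 addition
# is bounded by machine epsilon (compared here against BigFloat).
x, y = 0.1, 0.2
exact = big(x) + big(y)          # exact sum of the two Float64 inputs
relerr = abs((x + y) - exact) / abs(exact)
@show relerr <= eps(Float64)     # true
```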

How do operations compose?#

Is this true?

\[ \frac{\Big\lvert \big((x \circledast y) \circledast z\big) - \big((x * y) * z\big) \Big\rvert}{|(x * y) * z|} \le^? \epsilon_{\text{machine}} \]
f(x; y=1, z=-1) = (x+y)+z # The best arbitrary numbers are 0, 1, and -1
plot(x -> abs(f(x) - x)/abs(x), xlims=(-1e-15, 1e-15), label = "rel error")
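It is not true in general. A minimal counterexample (my own, not from the lecture): for \(x\) below half of machine epsilon, \(x + 1\) rounds to exactly 1, so the composed result is 0 and the relative error is 100%.

```julia
# Two correctly rounded operations composed: each step is accurate,
# but the composition loses all the digits for tiny x.
x = 1e-17
result = (x + 1) - 1              # exact answer is x = 1e-17
@show result                      # 0.0, because x + 1 rounded to exactly 1
@show abs(result - x) / abs(x)    # 1.0, i.e. 100% relative error
```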

Activity#

  • Use Jupyter and Julia

  • Look at how fast series converge when taking only finitely many terms

  • Explore the instability that occurs for large negative \(x\) above, but not for the standard library expm1

function myexp(x, k)
    sum = 0
    term = 1
    n = 1
    # add Taylor-series terms until they no longer change the sum,
    # but stop after at most k terms
    while sum + term != sum
        sum += term
        term *= x / n
        if n == k
            break
        end
        n += 1
    end
    sum
end

myexp(-30, 1000), exp(-30) # stopping at the 1000th term vs the Julia built-in
(6.1030424788918156e-6, 9.357622968840175e-14)
rel_error(x, k) = abs(myexp(x, k) - exp(x)) / exp(x)
ks = 2 .^ (0:10)  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
plot(ks, k -> rel_error(-10, k), xscale=:log10, yscale=:log10, label = "rel_error", markershape = :circle, markercolor = "brown")
plot(x -> rel_error(x, 1000) + 1e-17, xlims=(-20, 20), yscale=:log10) # the 1e-17 offset keeps zero errors visible on the log scale

What happened? Let’s look at the terms for positive and negative \(x\)

function expterms(x, k=50)
    term = 1.
    terms = [term]
    for n in 1:k
        term *= x / n
        push!(terms, term)
    end
    terms
end
x = -10
@show sum(expterms(x)) - exp(x)
@show (sum(expterms(x)) - exp(x)) / exp(x)
expterms(x)
sum(expterms(x)) - exp(x) = 1.2092180894291739e-12
(sum(expterms(x)) - exp(x)) / exp(x) = 2.6634800885273225e-8
51-element Vector{Float64}:
     1.0
   -10.0
    50.0
  -166.66666666666669
   416.66666666666674
  -833.3333333333335
  1388.8888888888891
 -1984.1269841269846
  2480.1587301587306
 -2755.7319223985896
  2755.7319223985896
 -2505.2108385441725
  2087.6756987868107
     ⋮
    -4.902469756513546e-8
     1.2256174391283866e-8
    -2.9893108271424062e-9
     7.117406731291443e-10
    -1.655210867742196e-10
     3.761842881232264e-11
    -8.359650847182808e-12
     1.81731540156148e-12
    -3.866628513960596e-13
     8.055476070751242e-14
    -1.64397470831658e-14
     3.28794941663316e-15
x = -10
@show exp(x)
bar(expterms(x), label = "expterms")
exp(x) = 4.5399929762484854e-5
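The bar plot explains the loss of accuracy: the terms alternate in sign and grow to about \(2.8 \times 10^3\) before decaying, while the sum is only about \(4.5 \times 10^{-5}\). A back-of-envelope estimate (mine, not from the lecture): rounding errors of size roughly \(\epsilon_{\text{machine}}\) times the largest term survive the cancellation, which predicts a relative error of the observed order.

```julia
# Estimate the best relative accuracy the alternating series can achieve:
# rounding errors ~ eps * (largest term) do not cancel, and are then
# divided by the tiny true value exp(-10).
x = -10
terms = accumulate(*, x ./ (1:50), init=1.0)   # term_n = x^n / n!
biggest = max(1.0, maximum(abs, terms))
estimate = biggest * eps(Float64) / exp(x)
@show biggest, estimate   # largest term ≈ 2.8e3, estimate of order 1e-8
```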

Condition number#

What sort of functions cause small errors to become big?

Consider a function \(f: X \to Y\) and define the absolute condition number

\[ \hat\kappa = \lim_{\delta \to 0} \max_{|\delta x| < \delta} \frac{|f(x + \delta x) - f(x)|}{|\delta x|} = \max_{\delta x} \frac{|\delta f|}{|\delta x|}. \]
If \(f\) is differentiable, then \(\hat\kappa = |f'(x)|\).

Floating point offers relative accuracy, so it’s more useful to discuss relative condition number,

\[ \kappa = \max_{\delta x} \frac{|\delta f|/|f|}{|\delta x|/|x|} = \max_{\delta x} \Big[ \frac{|\delta f|/|\delta x|}{|f| / |x|} \Big] \]
or, if \(f\) is differentiable,
\[ \kappa = |f'(x)| \frac{|x|}{|f|} . \]

f(x) = x - 1; fprime(x) = 1
plot(x -> abs(fprime(x)) * abs(x) / abs(f(x)), xlims=(0, 2), label = "kappa for x-1")
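As a sanity check (mine, not in the original notes), the differentiable formula agrees with the definition when we use a small finite perturbation:

```julia
# Compare kappa from the definition (finite perturbation) with the
# formula |f'(x)| |x| / |f(x)| for f(x) = x - 1 at x = 1.1.
f(x) = x - 1
x, dx = 1.1, 1e-6
kappa_def = (abs(f(x + dx) - f(x)) / abs(f(x))) / (dx / abs(x))
kappa_formula = abs(x) / abs(f(x))   # f'(x) = 1
@show kappa_def, kappa_formula       # both ≈ 11
```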

Example:#

Back to \(f(x) = e^x - 1\)

f(x) = exp(x) - 1
fprime(x) = exp(x)
plot(x -> abs(fprime(x)) * abs(x) / abs(f(x)), label = "kappa for exp(x) - 1")

What does it mean?

  • The function \(f(x) = e^x - 1\) is well-conditioned

  • The function \(f_1(x) = e^x\) is well-conditioned

  • The function \(f_2(x) = x - 1\) is ill-conditioned for \(x \approx 1\)

The algorithm is unstable#

  • f(x) = exp(x) - 1 is unstable

  • Algorithms are made from elementary operations

  • Unstable algorithms do something ill-conditioned

A stable algorithm#

We used the series expansion previously.

  • accurate for small \(x\)

  • less accurate for negative \(x\) (see activity)

  • we could use symmetry to fix

  • inefficient because we have to sum lots of terms

Standard math libraries define a more efficient stable variant, \(\texttt{expm1}(x) = e^x - 1\).

expm1(-40) + 1 # expm1(-40) is accurate, but it is so close to -1 that adding 1 back gives exactly 0
0.0
exp(-40) # tiny, but not zero
4.248354255291589e-18
plot([x -> exp(x) - 1,
      x -> expm1(x)],
    xlims = (-1e-15, 1e-15), label = ["y = exp(x) - 1" "y = expm1(x)"])
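The plot window above is too narrow to show how large the error becomes; a direct comparison (my own check, not from the lecture) at a moderate \(x\) quantifies it:

```julia
# Near x = 0, exp computes e^x to full relative accuracy, but the
# subtraction cancels most of those digits; expm1 avoids forming e^x.
x = 1e-10
naive  = exp(x) - 1                  # ill-conditioned subtraction near 1
stable = expm1(x)                    # dedicated stable routine
exact  = Float64(expm1(big(x)))      # reference value via BigFloat
@show abs(naive - exact) / exact     # ~1e-7: about half the digits are lost
@show abs(stable - exact) / exact    # close to machine epsilon
```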

Another example:#

\(\log(1 + x)\)

What is the condition number of \(f(x) = \log(1 + x)\) for \(x \approx 0\)?

plot([x -> log(1 + x),
      x -> log1p(x)],
    xlims = (-1e-15, 1e-15), label = ["y = log(1 + x)" "y = log1p(x)"])
condlog1p(x) = abs(1/(1+x) * x / log1p(x))
condlog(x) = abs(1/(1+x) * (1+x) / log(1+x)) # conditioning of the final log, evaluated at 1+x
plot([condlog1p condlog], xlims=(-1, 2), ylims=(0, 100), label = ["cond of log1p" "cond of log"])
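For \(x \approx 0\) the condition number is \(\kappa = \left|\frac{x}{(1+x)\log(1+x)}\right| \to 1\), so the function itself is well-conditioned; the trouble with the naive evaluation is the intermediate \(1 + x\). A quick numeric check (mine, not from the notes):

```julia
# log(1+x) is well-conditioned at x ≈ 0 (kappa → 1), but evaluating it
# as log(1 + x) first rounds 1 + x, discarding the information that
# log1p preserves.
condlog1p(x) = abs(x / ((1 + x) * log1p(x)))
@show condlog1p(1e-8)          # ≈ 1
x = 1e-16
@show log(1 + x), log1p(x)     # naive version returns 0.0; log1p ≈ 1e-16
```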

Reliable = well-conditioned and stable algorithm#

Mathematical functions \(f(x)\) can be ill-conditioned (big \(\kappa\))#

  • Modeling is how we turn an abstract question into a mathematical function

  • We want well-conditioned models (small \(\kappa\))

  • Some systems are intrinsically sensitive: fracture, chaotic dynamics, combustion

Algorithms f(x) can be unstable#

  • Unreliable, though sometimes practical

  • Unstable algorithms are constructed from ill-conditioned parts

Exercise 9.1:#

  • Find a function \(f(x)\) that models something you’re interested in

  • Plot its condition number \(\kappa\) as a function of \(x\)

  • Plot the relative error (using single or double precision; compare using Julia’s big)

  • Is the relative error ever much bigger than \(\kappa \epsilon_{\text{machine}}\)?

  • If so, can you find what caused the instability in your algorithm?

Further reading: FNC Introduction#

Exercise 9.2:#

How big are these condition numbers?

  • \(f(x) = x+c\)

  • \(f(x) = cx\)

  • \(f(x) = x^p\)

  • \(f(x) = e^x\)

  • \(f(x) = \log x\)

  • \(f(x) = \sin x\)