In numerical analysis, the order of convergence and the rate of convergence of a convergent sequence are quantities that represent how quickly the sequence approaches its limit. A sequence $(x_n)$ that converges to $L$ is said to have order of convergence $q \geq 1$ and rate of convergence $\mu$ if

$$\lim_{n \to \infty} \frac{|x_{n+1} - L|}{|x_n - L|^{q}} = \mu.$$

The rate of convergence $\mu$ is also called the asymptotic error constant. Note that this terminology is not standardized, and some authors use "rate" where this article uses "order".
In practice, the rate and order of convergence provide useful insights when using iterative methods for calculating numerical approximations. If the order of convergence is higher, then typically fewer iterations are necessary to yield a useful approximation. Strictly speaking, however, the asymptotic behavior of a sequence does not give conclusive information about any finite part of the sequence.
Similar concepts are used for discretization methods. The solution of the discretized problem converges to the solution of the continuous problem as the grid size goes to zero, and the speed of convergence is one of the factors of the efficiency of the method. However, the terminology in this case differs from the terminology for iterative methods.
Series acceleration is a collection of techniques for improving the rate of convergence of a series. Such acceleration is commonly accomplished with sequence transformations.
Suppose that the sequence $(x_k)$ converges to the number $L$. The sequence is said to converge with order $q$ to $L$, and with a rate of convergence of $\mu$, if

$$\lim_{k \to \infty} \frac{|x_{k+1} - L|}{|x_k - L|^{q}} = \mu \qquad \text{(Definition 1)}$$

for some positive constant $\mu \in (0, \infty)$ if $q > 1$, and $\mu \in (0, 1)$ if $q = 1$. It is not necessary, however, that $q$ be an integer. For example, the secant method, when converging to a regular, simple root, has an order of $\varphi \approx 1.618$.
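The statement that the order need not be an integer can be checked numerically. The sketch below (an illustrative example; the helper name and the choice of test equation are assumptions, not from the article) runs the secant method on $x^2 - 2 = 0$ and estimates $q$ from consecutive errors $|x_k - \sqrt{2}|$; the estimate should land near $\varphi \approx 1.618$.

```python
import math

def secant(f, x0, x1, steps):
    """Secant iteration for f(x) = 0; returns the list of iterates."""
    xs = [x0, x1]
    for _ in range(steps):
        a, b = xs[-2], xs[-1]
        xs.append(b - f(b) * (b - a) / (f(b) - f(a)))
    return xs

# Root of x^2 - 2 at sqrt(2); stop before errors reach machine precision.
xs = secant(lambda x: x * x - 2, 1.0, 2.0, 5)
e = [abs(x - math.sqrt(2)) for x in xs]

# If e_{k+1} ~ mu * e_k^q, then log(e_{k+1}/e_k) / log(e_k/e_{k-1}) -> q.
q = math.log(e[-1] / e[-2]) / math.log(e[-2] / e[-3])
```

The estimate is only asymptotic, so with a handful of iterates it lands near, not exactly at, the golden ratio.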
Convergence with order $q = 1$ (and $\mu \in (0, 1)$) is called linear convergence, convergence with $q = 2$ is called quadratic convergence, and convergence with $q = 3$ is called cubic convergence.
A practical method to calculate the order of convergence for a sequence generated by a fixed-point iteration is to calculate the following sequence, which converges to $q$:

$$q \approx \frac{\log\left|\dfrac{x_{k+1} - x_k}{x_k - x_{k-1}}\right|}{\log\left|\dfrac{x_k - x_{k-1}}{x_{k-1} - x_{k-2}}\right|}.$$

For the numerical approximation of an exact value through a numerical method of order $q$, see the discussion of discretization methods below.
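This estimator needs no knowledge of the limit $L$, only the iterates themselves. As a sketch (the helper name and the test map are illustrative assumptions, not from the article), the fixed-point iteration $x_{k+1} = \cos(x_k)$ converges linearly, so the formula should return a value near $q = 1$:

```python
import math

def order_estimate(x):
    """Estimate q from the last four iterates, without knowing the limit,
    using ratios of successive differences x_{k+1} - x_k."""
    num = math.log(abs((x[-1] - x[-2]) / (x[-2] - x[-3])))
    den = math.log(abs((x[-2] - x[-3]) / (x[-3] - x[-4])))
    return num / den

# Fixed-point iteration x_{k+1} = cos(x_k): linear convergence, since
# 0 < |f'(p)| = sin(p) < 1 at the fixed point p ~ 0.739.
x = [1.0]
for _ in range(20):
    x.append(math.cos(x[-1]))

q = order_estimate(x)
```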
Q-convergence definitions

In addition to the previously defined Q-linear convergence, a few other Q-convergence definitions exist. Given Definition 1 above, the sequence is said to converge Q-superlinearly to $L$ (i.e., faster than linearly) in all the cases where $q > 1$, and also in the case $q = 1$, $\mu = 0$. Given Definition 1, the sequence is said to converge Q-sublinearly to $L$ (i.e., slower than linearly) if $q = 1$, $\mu = 1$. The sequence $(x_k)$ converges logarithmically to $L$ if the sequence converges sublinearly and additionally

$$\lim_{k \to \infty} \frac{|x_{k+1} - x_k|}{|x_k - x_{k-1}|} = 1.$$

Note that unlike the previous definitions, logarithmic convergence is not called "Q-logarithmic." In the definitions above, the "Q-" stands for "quotient," because the terms are defined using the quotient between two successive terms. Often, however, the "Q-" is dropped and a sequence is simply said to have linear convergence, quadratic convergence, etc.
R-convergence definition

The Q-convergence definitions have a shortcoming in that they do not include some sequences, such as the sequence $(b_k)$ below, which converge reasonably fast but whose rate is variable. Therefore, the definition of rate of convergence is extended as follows.

Suppose that $(x_k)$ converges to $L$. The sequence is said to converge R-linearly to $L$ if there exists a sequence $(\varepsilon_k)$ such that

$$|x_k - L| \leq \varepsilon_k \quad \text{for all } k,$$

and $(\varepsilon_k)$ converges Q-linearly to zero. The "R-" prefix stands for "root."

Consider the sequence
$$(a_k) = \left\{1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \frac{1}{32}, \ldots, \frac{1}{2^{k}}, \ldots\right\}.$$

It can be shown that this sequence converges to $L = 0$. To determine the type of convergence, we plug the sequence into the definition of Q-linear convergence:

$$\lim_{k \to \infty} \frac{\left|1/2^{k+1} - 0\right|}{\left|1/2^{k} - 0\right|} = \lim_{k \to \infty} \frac{2^{k}}{2^{k+1}} = \frac{1}{2}.$$

Thus, we find that $(a_k)$ converges Q-linearly with a convergence rate of $\mu = 1/2$. More generally, for any $c \in \mathbb{R}$ and $\mu \in (-1, 1)$, the sequence $(c\mu^{k})$ converges linearly with rate $|\mu|$. The sequence
$$(b_k) = \left\{1, 1, \frac{1}{4}, \frac{1}{4}, \frac{1}{16}, \frac{1}{16}, \ldots, \frac{1}{4^{\lfloor k/2 \rfloor}}, \ldots\right\}$$

also converges linearly to 0 with rate 1/2 under the R-convergence definition, but not under the Q-convergence definition. (Here $\lfloor x \rfloor$ is the floor function, which gives the largest integer that is less than or equal to $x$.) The sequence
$$(c_k) = \left\{\frac{1}{2}, \frac{1}{4}, \frac{1}{16}, \frac{1}{256}, \frac{1}{65{,}536}, \ldots, \frac{1}{2^{2^{k}}}, \ldots\right\}$$

converges superlinearly. In fact, it is quadratically convergent. Finally, the sequence
$$(d_k) = \left\{1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6}, \ldots, \frac{1}{k+1}, \ldots\right\}$$

converges sublinearly and logarithmically.

[Figure: linear, linear, superlinear (quadratic), and sublinear rates of convergence]
A similar situation exists for discretization methods, designed to approximate a function $y = f(x)$, which might be an integral being approximated by numerical quadrature or the solution of an ordinary differential equation (see the example below). The discretization method generates a sequence $y_0, y_1, y_2, y_3, \ldots$, where each successive $y_j$ is a function of $y_{j-1}, y_{j-2}, \ldots$ along with the grid spacing $h$ between successive values of the independent variable $x$. The important parameter here for the convergence speed to $y = f(x)$ is the grid spacing $h$, which is inversely proportional to the number of grid points, i.e., the number of points in the sequence required to reach a given value of $x$.
In this case, the sequence $(y_n)$ is said to converge to the sequence $f(x_n)$ with order $q$ if there exists a constant $C$ such that

$$|y_n - f(x_n)| < C h^{q} \quad \text{for all } n.$$

This is written as $|y_n - f(x_n)| = \mathcal{O}(h^{q})$ using big O notation.
This is the relevant definition when discussing methods for numerical quadrature or the solution of ordinary differential equations (ODEs).
A practical method to estimate the order of convergence for a discretization method is to pick step sizes $h_{\text{new}}$ and $h_{\text{old}}$ and calculate the resulting errors $e_{\text{new}}$ and $e_{\text{old}}$. The order of convergence is then approximated by the following formula:

$$q \approx \frac{\log(e_{\text{new}} / e_{\text{old}})}{\log(h_{\text{new}} / h_{\text{old}})},$$

which comes from writing the truncation error, at the old and new grid spacings, as
$$e = |y_n - f(x_n)| = \mathcal{O}(h^{q}).$$

The error $e$ is, more specifically, a global truncation error (GTE), in that it represents a sum of the errors accumulated over all $n$ iterations, as opposed to a local truncation error (LTE) over just one iteration.
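The two-grid formula can be exercised on any discretization whose exact answer is known. The sketch below (illustrative; the quadrature example and helper name are assumptions, not from the article) applies it to the composite trapezoid rule for $\int_0^1 e^x\,dx = e - 1$, whose error is $\mathcal{O}(h^2)$, so the estimate should land near $q = 2$:

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n subintervals, h = (b-a)/n."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

exact = math.e - 1  # integral of exp over [0, 1]
h_old, h_new = 1 / 16, 1 / 32
e_old = abs(trapezoid(math.exp, 0.0, 1.0, 16) - exact)
e_new = abs(trapezoid(math.exp, 0.0, 1.0, 32) - exact)

# q ~ log(e_new/e_old) / log(h_new/h_old); the trapezoid rule gives q ~ 2.
q = math.log(e_new / e_old) / math.log(h_new / h_old)
```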
Consider the ordinary differential equation
$$\frac{dy}{dx} = -\kappa y$$

with initial condition $y(0) = y_0$. We can solve this equation using the forward Euler scheme for the numerical discretization:
$$\frac{y_{n+1} - y_n}{h} = -\kappa y_n,$$

which generates the sequence
$$y_{n+1} = y_n (1 - h\kappa).$$

In terms of $y(0) = y_0$, this sequence is as follows, by the binomial theorem:
$$y_n = y_0 (1 - h\kappa)^{n} = y_0 \left(1 - nh\kappa + \frac{n(n-1)}{2} h^{2}\kappa^{2} + \cdots\right).$$

The exact solution to this ODE is $y = f(x) = y_0 \exp(-\kappa x)$, corresponding to the following Taylor expansion in $h\kappa$ for $h\kappa \ll 1$:
$$f(x_n) = f(nh) = y_0 \exp(-\kappa n h) = y_0 \left[\exp(-\kappa h)\right]^{n} = y_0 \left(1 - h\kappa + \frac{h^{2}\kappa^{2}}{2} + \cdots\right)^{n} = y_0 \left(1 - nh\kappa + \frac{n^{2}h^{2}\kappa^{2}}{2} + \cdots\right).$$

In this case, the truncation error is
$$e = |y_n - f(x_n)| = \frac{n h^{2} \kappa^{2}}{2},$$

which is $\mathcal{O}(h^{2})$ for fixed $n$. At a fixed endpoint $x = nh$, however, the number of steps $n = x/h$ grows as $h$ shrinks, so the global error is $e = \frac{x h \kappa^{2}}{2} = \mathcal{O}(h)$: the sequence $(y_n)$ converges to $f(x_n)$ with order $q = 1$, consistent with the forward Euler scheme being a first-order method.
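The order can be checked numerically by measuring the error at a fixed endpoint $x = nh$ for two step sizes and applying the two-grid formula above. At fixed $x$, $n$ grows like $1/h$, so the $nh^2\kappa^2/2$ bound behaves like $\mathcal{O}(h)$ there; the sketch below (assuming $\kappa = 1$, $y_0 = 1$, $x = 1$; illustrative, not from the article) should therefore report $q \approx 1$:

```python
import math

def euler_error(kappa, y0, x_end, n):
    """Forward Euler for y' = -kappa*y; returns |y_n - exact| at x_end."""
    h = x_end / n
    y = y0
    for _ in range(n):
        y *= 1 - h * kappa
    return abs(y - y0 * math.exp(-kappa * x_end))

e_old = euler_error(1.0, 1.0, 1.0, 50)   # h_old = 1/50
e_new = euler_error(1.0, 1.0, 1.0, 100)  # h_new = 1/100

# Two-grid estimate of the global order at fixed x = 1.
q = math.log(e_new / e_old) / math.log((1 / 100) / (1 / 50))
```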
The sequence $(d_k)$ with $d_k = 1/(k+1)$ was introduced above. This sequence converges with order 1 according to the convention for discretization methods.
The sequence $(a_k)$ with $a_k = 2^{-k}$, which was also introduced above, converges with order $q$ for every number $q$. It is said to converge exponentially using the convention for discretization methods. However, it only converges linearly (that is, with order 1) using the convention for iterative methods.
The case of recurrent sequences $x_{n+1} := f(x_n)$, which occurs in dynamical systems and in the context of various fixed-point theorems, is of particular interest. Assuming that the relevant derivatives of $f$ are continuous, one can (easily) show that for a fixed point $f(p) = p$ with $|f'(p)| < 1$, one has at least linear convergence for any starting value $x_0$ sufficiently close to $p$. If $f'(p) = 0$, then one has at least quadratic convergence, and so on. If $|f'(p)| > 1$, then the fixed point is repulsive, and no starting value will produce a sequence converging to $p$ (unless one jumps directly to the point $p$ itself).
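The quadratic case can be illustrated with the map $f(x) = (x + 2/x)/2$ (Newton's iteration for $\sqrt{2}$, used here purely as an example; the helper name is an illustrative assumption): its fixed point is $p = \sqrt{2}$ with $f'(p) = 0$, so the number of correct digits roughly doubles each step.

```python
import math

def iterate(f, x0, steps):
    """Run the recurrence x_{k+1} = f(x_k) and return all iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

# f(x) = (x + 2/x)/2 has fixed point p = sqrt(2) with f'(p) = 0, so
# convergence is quadratic: e_{k+1} ~ mu * e_k^2 with mu = 1/(2*sqrt(2)).
xs = iterate(lambda x: (x + 2 / x) / 2, 1.0, 5)
errs = [abs(x - math.sqrt(2)) for x in xs]
```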
Many methods exist to increase the rate of convergence of a given sequence, i.e., to transform a given sequence into one converging faster to the same limit. Such techniques are in general known as "series acceleration". The goal of the transformed sequence is to reduce the computational cost of the calculation. One example of series acceleration is Aitken's delta-squared process. These methods in general (and in particular Aitken's method) do not increase the order of convergence and are useful only if the convergence is not initially faster than linear: if $(x_n)$ converges linearly, one gets a sequence $(a_n)$ that still converges linearly (except for pathologically designed special cases), but faster, in the sense that $\lim_{n \to \infty} (a_n - L)/(x_n - L) = 0$. On the other hand, if the convergence is already of order $\geq 2$, Aitken's method will bring no improvement.
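A minimal sketch of Aitken's delta-squared process (illustrative; the helper name is an assumption, not a reference implementation): applied to the linearly convergent iteration $x_{k+1} = \cos(x_k)$, the transformed sequence approaches the fixed point markedly faster than the original iterates.

```python
import math

def aitken(x):
    """Aitken's delta-squared transform: a_k = x_k - (dx_k)^2 / (d2x_k)."""
    return [
        x[k] - (x[k + 1] - x[k]) ** 2 / (x[k + 2] - 2 * x[k + 1] + x[k])
        for k in range(len(x) - 2)
    ]

# Linearly convergent fixed-point iteration toward the root of x = cos(x).
xs = [0.5]
for _ in range(12):
    xs.append(math.cos(xs[-1]))

ys = aitken(xs)
p = 0.7390851332151607  # fixed point of cos(x)
```

Note that the convergence is still linear, as the text explains; the transform improves the constant, not the order.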