
Statistical Theory of Extremes

Extremes in Random Sequences and Stochastic Processes

José Tiago de Fonseca Oliveira 1

1. Academia das Ciências de Lisboa (Lisbon Academy of Sciences), Lisbon, Portugal.


Abstract

This chapter concerns the development in time of a physical system subject to random influences, observed by means of a discrete or continuously registering instrument. The behaviour of maxima in random sequences and stochastic processes is discussed, the target being to obtain, under convenient conditions, the same limiting distributions as in the classical i.i.d. setting. Results are given for extremes in stochastic sequences, global extremes of Gaussian stationary processes, global extremes in Wiener-Lévy processes and local extremes in Gaussian stationary processes. Some scattered examples of extreme value properties for other kinds of stochastic processes are also discussed.

Keywords

Global extremes, Local extremes, Wiener-Lévy processes, Gaussian stationary processes, Stochastic processes, Random sequences

1 . Introduction

There are various physical applications modelled by a family of random variables \( X(t) \) depending on a discrete or continuous parameter \( t \), usually interpreted as time. This situation arises when we study the development in time of some physical system subject to random influences, by means of a discrete or continuously registering instrument. Our observations can take the form of a set of discrete ordinates, or of a curve in a diagram with the time \( t \) as abscissa and the observed value \( X(t) \) as ordinate. This cloud of points, or curve, can be regarded as a realization, a version, a trajectory or a sample function of the underlying random sequence or stochastic process \( X(t) \). The random variables \( X(t) \) and \( X(u) \), for different time moments \( t \) and \( u \), will, in general, be correlated. We will not deal with random fields.

We will, then, successively consider the behaviour of maxima in random sequences and stochastic processes, having as a target to obtain, under convenient conditions, the same limiting distributions as in the i.i.d. case. It is intuitive that some waning out of the dependence must be imposed. Let us clarify this by a simple example. Consider a sequence of i.i.d. standard normal random variables \( Y_0, Y_1, Y_2, \dots, Y_n \) and consider the (composed) sequence \( X_i = \sqrt{\rho}\, Y_0 + \sqrt{1-\rho}\, Y_i,\; i \geq 1,\; 0 \leq \rho \leq 1 \). It is evident that the sequence \( \{X_i\} \) is a multinormal sequence with standard margins and (constant) correlation \( \rho \) (the converse is also true), and that \( \max_1^n X_i = \sqrt{\rho}\, Y_0 + \sqrt{1-\rho}\, \max_1^n Y_i \). As, by the classical result on normal maxima, \( \max_1^n Y_i - \sqrt{2 \log n} \stackrel{P}{\rightarrow} 0 \), we see that \( \left( \max_1^n X_i - \sqrt{1-\rho}\,\sqrt{2 \log n} \right)/\sqrt{\rho} - Y_0 \stackrel{P}{\rightarrow} 0 \), so that the asymptotic distribution of \( \left( \max_1^n X_i - \sqrt{1-\rho}\,\sqrt{2 \log n} \right)/\sqrt{\rho} \) is standard normal. Thus, if the correlation is constant (and \( > 0 \)) and the process is Gaussian, the asymptotic distribution of the maximum is normal and not extremal. As said, some conditions must be imposed on the dependence to obtain the Extremal Limit Theorem: essentially that, in some sense, the dependence must converge adequately to zero.
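A quick simulation makes this concrete. The minimal sketch below (Python, assuming NumPy is available; the values of \( \rho \), \( n \) and the number of replications are illustrative choices) builds the equicorrelated sequence and checks that the normalized maximum tracks \( Y_0 \), and is therefore approximately standard normal rather than Gumbel:

```python
# Sketch: maxima of an equicorrelated Gaussian sequence are asymptotically
# normal, not extremal. rho, n and reps are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
rho, n, reps = 0.5, 5_000, 1_000

Y0 = rng.standard_normal(reps)                      # shared component
Yi = rng.standard_normal((reps, n))                 # idiosyncratic components
X = np.sqrt(rho) * Y0[:, None] + np.sqrt(1 - rho) * Yi

# Normalization from the text: subtract sqrt(1-rho)*sqrt(2 log n), divide by sqrt(rho)
Z = (X.max(axis=1) - np.sqrt(1 - rho) * np.sqrt(2 * np.log(n))) / np.sqrt(rho)

print("mean ~ 0:", Z.mean(), "  std ~ 1:", Z.std())
# Z tracks Y0; the residual shrinks only like log(log n)/sqrt(log n), hence slowly
print("mean |Z - Y0|:", np.abs(Z - Y0).mean())
```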

2 . Some applications

Consider first the following situation, which may occur in some Reliability studies. Suppose that the output of some electronic device should, ideally, be a given function \( f(t) \) of time, but the observed output is \( X(t) = f(t) + \varepsilon(t) \), where the deviation \( \varepsilon(t) \) may be regarded as noise, produced by various internal and external disturbances. In some cases, when the probability structure of the noise \( \varepsilon(t) \) is known, at least in general terms, it may be possible to find the probability that, during a given observation period \( t = 1, 2, \dots, n \) or \( 0 \leq t \leq T \), \( X(t) \) will always stay within some limit(s), possibly functions of time. We may even be prepared to accept exceedances of some level, provided that they do not occur too often, or perhaps that the excursions of \( X(t) \) outside the limit(s) occupy only a small fraction of the total observation time \( \{1, 2, \dots, n\} \) or \( [0, T] \). We are, then, interested in the occurrence and frequency of certain extreme values of \( X(t) \), particularly the ones falling outside the limit(s). Also it may be important to find the probability that the maximum or minimum of \( X(t) \) during the observation period will over- or underpass some value.

An example from the statistical analysis of random loads: an airplane travelling through gusty winds, or a car driving along a rough road, or a ship in a rough sea, will be exposed to loads fluctuating in a random way. To estimate the service life of such structures, it may be important to study the extreme values of the load curve \(\mathrm{ X ( t ) }\) to avoid ruptures, in particular by studying the frequency and the size of the peaks of the load curve. In some of these applications, it will be important to investigate the random length \(\mathrm{ t_{2}-t_{1} }\) and the random size \(\mathrm{ X \left( t_{2} \right) -X \left( t_{1} \right) }\) of the fluctuations, \(\mathrm{ t_{1} }\) corresponding to a maximum of \(\mathrm{ X ( t ) }\) while \(\mathrm{ t_{2} }\) denotes the following minimum.

Analogous questions about the sizes of random fluctuations arise in the construction of ships, antennas, etc., where the height and length of storm-generated ocean waves, gusts of wind, etc., are important, and may be investigated by probabilistic methods.

Analogous problems of extreme values are encountered in Meteorology, Hydrology and Hydrography, where \(\mathrm{ X ( t ) }\) may signify the wind velocity or the river discharges or the sea level at some given point. Once more it is important, for various practical problems, to have information about the height and the frequency of the peaks of the \(\mathrm{ X ( t ) }\) curve, and even the shape of the curve in the neighbourhood of the peaks.

Finally we can mention some economic applications. If \(\mathrm{ X ( t ) }\) denotes, for instance, a price index, or some measure of business activity at time \(\mathrm{ t }\), analogous extreme value problems may often be studied. A particularly interesting case occurs, as will be seen below, in insurance practice, in particular the ruin problem.

The pioneer writer in this field, from the point of view of Engineering applications, was Rice (1944/45). Although the area is still under intensive study, we can cite, as basic, Cramér and Leadbetter (1967), Leadbetter, Lindgren and Rootzén (1983), and some of the papers in Tiago de Oliveira (1984) and references therein.

We will not give proofs, only sketch some of them, because we are going to refer to limiting results of theoretical type which, in fact, reduce to univariate distributions for applications. The next chapters will deal with actual sequences and processes of extremes.

3 . Extremes in stochastic sequences

Associated with the asymptotic behaviour of maxima (minima) in random sequences is the problem of “high” (“low”) exceedances of increasing (decreasing) levels \(\mathrm{ \{ u_{n} \} }\). We will deal, once more, only with maxima, minima being dealt with by symmetry.

A basic result is the following:

For the random sequence \( \{X_n\} \), if \( F_{1,\dots,n}(x_1,\dots,x_n) = \operatorname{Prob}\{X_1 \leq x_1,\dots,X_n \leq x_n\} \) is such that \( F_{1,\dots,n}(\lambda_n + \delta_n x, \dots, \lambda_n + \delta_n x) \rightarrow L(x) \) and also \( F_{1,\dots,n}(\lambda_n + \delta_n x, \dots, \lambda_n + \delta_n x) - F_{1,\dots,[n/k]}^{k}(\lambda_n + \delta_n x, \dots, \lambda_n + \delta_n x) \rightarrow 0 \), then \( L(x) \) is max-stable, and thus must be one of \( \Psi_\alpha(x) \), \( \Lambda(x) \) or \( \Phi_\alpha(x) \).

This condition, as said before, imposes on \( \{F_{1,\dots,n}(x_1,\dots,x_n)\} \) a sort of independence-like limiting behaviour.
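As a reminder of what max-stability means here, the Gumbel case admits a one-line check, taking for instance the attraction coefficients \( \lambda_k = \log k \), \( \delta_k = 1 \):

\[ \Lambda^k(x + \log k) = \exp\!\left(-k\, e^{-(x+\log k)}\right) = \exp\!\left(-e^{-x}\right) = \Lambda(x). \]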

For stationary sequences, the condition \( D(u_n) \) for the (real) sequence \( u_n \), which reads as

\( \vert F_{i_1,\dots,i_p,j_1,\dots,j_q}(u_n,\dots,u_n) - F_{i_1,\dots,i_p}(u_n,\dots,u_n)\, F_{j_1,\dots,j_q}(u_n,\dots,u_n) \vert \leq \alpha_{n,l} \) for any integers \( 1 \leq i_1 < \dots < i_p < j_1 < \dots < j_q \leq n \) with \( j_1 - i_p \geq l \), where \( \alpha_{n,l} \rightarrow 0 \) as \( n \rightarrow \infty \) and \( l = l_n = o(n) \), is very important. This condition is, evidently, also a kind of asymptotic independence, weaker than the one before owing to the added stationarity. It is equivalent to imposing \( \alpha_{n,[n\lambda]} \rightarrow 0 \) for each \( \lambda > 0 \).

It can be shown that:

If \( \{X_n\} \) is a stationary sequence, \( \max_1^n X_i \) has a limiting distribution, and for all sequences \( \lambda_n + \delta_n x \; (-\infty < x < +\infty) \) the condition \( D(\lambda_n + \delta_n x) \) is valid for each \( x \), then the limiting distribution function is of one of the forms \( \Psi_\alpha(x) \), \( \Lambda(x) \) or \( \Phi_\alpha(x) \).

For stationary normal sequences \(\mathrm{ D \left( u_{n} \right) }\) is valid if  \(\mathrm{ \rho _{n}~log~n \rightarrow 0 }\) (Berman condition).
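To illustrate, the sketch below (Python with NumPy; \( \varphi \), \( n \) and the replication count are illustrative choices) simulates a stationary Gaussian AR(1) sequence, for which \( \rho_n = \varphi^n \) decays geometrically so that Berman's condition holds, and compares the normalized maximum with the Gumbel limit \( \Lambda \); the agreement is only rough at these sample sizes, since convergence of normal maxima is slow.

```python
# Sketch: for a stationary Gaussian AR(1) sequence, rho_n = phi^n gives
# rho_n log n -> 0 (Berman's condition), so the normalized maximum obeys
# the same Gumbel limit as an i.i.d. standard normal sequence.
import numpy as np

rng = np.random.default_rng(1)
phi, n, reps = 0.7, 5_000, 1_000

# Stationary AR(1) with standard normal margins
X = np.empty((reps, n))
X[:, 0] = rng.standard_normal(reps)
for t in range(1, n):
    X[:, t] = phi * X[:, t - 1] + np.sqrt(1 - phi**2) * rng.standard_normal(reps)

# Classical normal attraction coefficients (location b, scale a)
b = np.sqrt(2 * np.log(n)) - (np.log(np.log(n)) + np.log(4 * np.pi)) / (2 * np.sqrt(2 * np.log(n)))
a = 1 / np.sqrt(2 * np.log(n))

M = (X.max(axis=1) - b) / a
for z in (-1.0, 0.0, 1.0, 2.0):                     # empirical vs Lambda(z)
    print(z, (M <= z).mean(), np.exp(-np.exp(-z)))
```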

Concerning stationary sequences, we can go further as regards domains of attraction by comparison with i.i.d. sequences with the same margins \( F = F_1 \). Denoting by \( u_n(\tau) \) a sequence such that \( n(1 - F(u_n(\tau))) \rightarrow \tau \), and supposing that for some \( \tau_0 \) the condition \( D(u_n(\tau_0)) \) is valid and \( \operatorname{Prob}\{\max_1^n X_i \leq u_n(\tau_0)\} \) converges, then for any \( \tau \) we have \( \operatorname{Prob}\{\max_1^n X_i \leq u_n(\tau)\} \rightarrow e^{-\theta\tau} \) for a convenient \( \theta \; (0 \leq \theta \leq 1) \).

If for any \( \tau \), \( \operatorname{Prob}\{\max_1^n X_i \leq u_n(\tau)\} \rightarrow e^{-\theta\tau} \), then \( \theta \) is called the extremal index, and we can compare \( \max_1^n X_i \) with \( \max_1^n \hat{X}_i \), where \( \{\hat{X}_i\} \) is an i.i.d. sequence with margins \( F_1 \). It can be shown that:

If the stationary sequence \( \{X_n\} \) has extremal index \( \theta \) and \( \operatorname{Prob}\{\max_1^n \hat{X}_i \leq \lambda_n + \delta_n x\} \rightarrow L(x) \), then \( \operatorname{Prob}\{\max_1^n X_i \leq \lambda_n + \delta_n x\} \rightarrow L^{\theta}(x) \), which is of the same type as \( L(x) \). The attraction coefficients used are the same.

Note that in many cases we have \( \theta = 1 \); but if we take \( Y_1, \dots, Y_{n+1} \) i.i.d. with distribution function \( F(x) \) and define \( X_i = \max(Y_i, Y_{i+1}) \), we have \( \operatorname{Prob}\{X_i \leq x\} = F^2(x) \) and \( F^{2n}(\lambda_n + \delta_n x) \rightarrow L(x) \) (i.e., \( \max_1^n \hat{X}_i \) has limiting distribution function \( L(x) \)), so that \( \operatorname{Prob}\{\max_1^n X_i \leq \lambda_n + \delta_n x\} = F^{n+1}(\lambda_n + \delta_n x) \rightarrow L^{1/2}(x) \) and \( \theta = 1/2 \).
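A simulation of this example (Python with NumPy and SciPy; \( n \), the replication count and \( \tau \) are illustrative choices) shows the halving effect of the paired exceedances:

```python
# Sketch of the theta = 1/2 example: X_i = max(Y_i, Y_{i+1}), so high-level
# exceedances occur in pairs and the effective number of independent maxima halves.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, reps, tau = 10_000, 1_000, 1.0

Y = rng.standard_normal((reps, n + 1))
X = np.maximum(Y[:, :-1], Y[:, 1:])                 # X_i = max(Y_i, Y_{i+1})

# Margins are F^2 and 1 - F^2(u) ~ 2(1 - F(u)); choose u_n(tau) accordingly
u = norm.ppf(1 - tau / (2 * n))

p = (X.max(axis=1) <= u).mean()
print("empirical:", p, "  e^{-tau/2}:", np.exp(-tau / 2))
```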

For details see Leadbetter, Lindgren and Rootzén (1983).

4 . Global extremes of Gaussian stationary processes

In many applications, such as those mentioned above, the random mechanism generating the stochastic process under observation can be assumed to have its dependence structure waning out, at least when we consider observation periods of reasonable length. A precise probabilistic setting is the following. Consider the random variables \( X(t_1+h), X(t_2+h), \dots, X(t_n+h) \), where the \( t_j \) are any time points, \( h \) is a real number, and \( \{t_j\} \) and \( \{t_j+h\} \) belong to the observation time interval. If the joint distribution of any such group of random variables associated with the \( X(t) \) process is independent of \( h \), we say that the process is (strictly) stationary. If, moreover, all these distributions are multinormal, we are dealing with a normal or Gaussian stationary process. For this class of processes, some of the extreme value problems indicated above can be solved at least in part, and we shall give an account of some results so far obtained in this direction.

From the stationarity it follows immediately that the mean value \(\mathrm{ \mu \left( t \right) }\), the variance \(\mathrm{ \sigma ^{2} \left( t \right) }\), and all other moments of \(\mathrm{ X ( t ) }\), when they exist, are independent of \(\mathrm{ t }\). It also follows that the covariance \(\mathrm{ C ( X( t ) ,X ( u ) ) }\) is a function of the time difference \(\mathrm{ t-u }\) and thus an even function, so that we may write

\(\mathrm{ C \left( X \left( t \right) ,X \left( u \right) \right) =C \left( \vert t-u \vert \right) ,C \left( 0 \right) = \sigma ^{2} }\)

From now on we will suppose, for simplicity, that we are dealing with the standardized Gaussian process \( (X(t) - \mu)/\sigma \), denoted simply by \( X(t) \), with \( \rho(t-u) = C(t-u)/\sigma^2 \) being the correlation function.

The correlation function \(\rho \left( t \right) \) has the spectral representation

\(\mathrm{ \rho \left( t \right) = \int _{0}^{ \infty}cos~ \lambda ~t~d~F( \lambda ) }\),  

where \( F(\lambda) \) is a distribution function. In the most important applications, \( F(\lambda) \) is absolutely continuous \( (f(\lambda) = F'(\lambda) \geq 0) \), so that we have

\(\mathrm{ \rho \left( t \right) = \int _{0}^{ \infty}cos~ \lambda ~t~f \left( \lambda \right) d~ \lambda }\).

The function \(\mathrm{ f \left( \lambda \right) }\) is called the spectral density of the process. The spectral moments of \(\mathrm{ \rho \left( t \right) }\)  

\( \lambda_k = \int_0^\infty \lambda^k\, f(\lambda)\, d\lambda \quad (\lambda_0 = 1) \)

are not necessarily finite. We shall here assume that \( \lambda_k \) is finite for \( k \leq 4 \). This implies that \( \rho(t) \) and its first four derivatives tend to zero as \( t \) tends to infinity; we shall further assume that \( \rho(t) \) itself tends to zero at least as rapidly as some negative power of \( t \). Then for small values of \( t \) we have the expansion

\( \rho(t) = 1 - \frac{1}{2!}\lambda_2 t^2 + \frac{1}{4!}\lambda_4 t^4 + o(t^4) \).

Recall that, \( \rho(t) \) (or \( C(t) \)) being an even function, the odd-order terms of the expansion vanish. We can now discuss some extreme value problems for \( X(t) \) processes under the conditions given above. Suppose that the observation period is \( [0, T] \). As \( X(t) \) has mean value \( 0 \), it can take positive and negative values (supposing non-degeneracy, \( X(t) \not\equiv 0 \)).
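As a small numerical illustration of the spectral quantities just introduced (Python with NumPy/SciPy; the uniform spectral density on \( [0, 2] \) is an arbitrary choice with \( \lambda_0 = 1 \)), one can compute the spectral moments and check the small-\( t \) expansion of \( \rho(t) \):

```python
# Sketch: spectral moments lambda_k and the small-t expansion of rho(t)
# for an illustrative uniform spectral density f = 1/2 on [0, 2].
import numpy as np
from scipy.integrate import quad

lam_k = [quad(lambda l, k=k: l**k * 0.5, 0, 2)[0] for k in range(5)]
print("lambda_0..4:", lam_k)                        # 1, 1, 4/3, 2, 16/5

rho = lambda t: quad(lambda l: np.cos(l * t) * 0.5, 0, 2)[0]
t = 0.1
print(rho(t), 1 - lam_k[2] * t**2 / 2 + lam_k[4] * t**4 / 24)
```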

Let then \( u > 0 \) be some level. We can seek the probability that \( X(t) \) exceeds \( u \) at least once in \( [0, T] \). Or, denoting by \( U(T) \) the (random) number of upcrossings of \( u \) (i.e., crossings of \( u \) upwards), we can seek its properties, and even its distribution. We will give the expression of the mean value of \( U(T) \) and the asymptotic behaviour of \( U(T) \). Other results can be seen in Cramér and Leadbetter (1967) and in Leadbetter, Lindgren and Rootzén (1983).

The mean value of \( U(T) \), for the level \( u \), is given by

\( M(U(T)) = (\sqrt{\lambda_2}/2\pi)\, T\, e^{-u^2/2} \).

From this expression we see that the mean number of upcrossings of the level \( u \) decreases quickly to zero as \( u \) becomes large. For a negative level \( -u \), the same expression gives the mean number of downcrossings of the level \( -u \).

If we denote by \( D(T) \) and \( C(T) \) the numbers of downcrossings and of crossings (\( C(T) \geq U(T) + D(T) \), because of tangencies), we have from the stationarity \( M(D(T)) = M(U(T)) = \frac{1}{2} M(C(T)) \). Every upcrossing of the level \( u \) corresponds to a peak of \( X(t) \) higher than \( u \). The formula above thus gives the mean number of upcrossings or peaks, showing that they become scarce as \( u \) becomes large. Writing

\( \mu = M(U(1)) = (\sqrt{\lambda_2}/2\pi)\, e^{-u^2/2} \)

we have

\(\mathrm{ M \left( U \left( 1/ \mu \right) \right) =1 }\),

and thus the average number of peaks higher than \( u \) in a time interval of length \( 1/\mu \) is precisely one: the return period of these high peaks is \( 1/\mu \). The return period of downcrossings is also \( 1/\mu \) and that of the crossings is \( 2/\mu \).
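A Monte Carlo check of the upcrossing formula is straightforward. The sketch below (Python with NumPy; the spectral density, level, horizon and grid are illustrative choices) synthesizes the process by the random-cosine spectral method, which is only approximately Gaussian for a finite number of cosines, and counts grid upcrossings:

```python
# Sketch: empirical mean number of upcrossings of level u on [0, T] versus
# Rice's formula M(U(T)) = (sqrt(lambda_2)/(2 pi)) T exp(-u^2/2).
import numpy as np

rng = np.random.default_rng(3)
m, T, u, reps = 200, 100.0, 2.0, 100
lam2 = 4.0 / 3.0                                    # second spectral moment of Uniform(0, 2)

t = np.linspace(0.0, T, 10_000)
count = 0
for _ in range(reps):
    lam = rng.uniform(0.0, 2.0, m)                  # frequencies sampled from f
    phi = rng.uniform(0.0, 2 * np.pi, m)
    X = np.sqrt(2.0 / m) * np.cos(np.outer(t, lam) + phi).sum(axis=1)
    count += int(np.sum((X[:-1] < u) & (X[1:] >= u)))

print("empirical mean U(T):", count / reps)
print("Rice formula       :", np.sqrt(lam2) / (2 * np.pi) * T * np.exp(-u**2 / 2))
```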

We can even give a much more specific result concerning the asymptotic time distribution of peaks higher than a level \( u \). As \( u \) tends to infinity, it can be shown that the time distance between two consecutive peaks higher than \( u \) has, asymptotically, an exponential distribution with probability density \( \mu\, e^{-\mu t} \) (the return period \( 1/\mu \) being the scale parameter), and that the successive time intervals between such peaks are asymptotically independent. The occurrences of such high peaks will, asymptotically, constitute a Poisson process and, if \( c > 0 \) is a number independent of \( u \), the probability of exactly \( m \) peaks higher than \( u \) in a time interval of length \( c/\mu \) converges, as \( u \rightarrow \infty \), to the Poisson limit \( c^m e^{-c}/m! \).

From this result, the limiting probability distribution for the largest ordinate of \( X(t) \) in a large observation period \( [0, T] \) is easily deduced. If we denote by \( \tilde{X}(T) = \max_{0 \leq t \leq T} X(t) \) the largest value assumed by \( X(t) \) for \( 0 \leq t \leq T \), we have the following relation:

\( \operatorname{Prob}\left\{ \tilde{X}(T) \leq \sqrt{2 \log T} + \frac{z + \log(\sqrt{\lambda_2}/2\pi)}{\sqrt{2 \log T}} \right\} \rightarrow \Lambda(z) = \exp(-e^{-z}) \)

as \(\mathrm{ T \rightarrow \infty }\). Note that this result is very similar to that given for independent normal sequences, the already classical case.
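To first order, this Gumbel limit also agrees with the Poisson character of the high-level upcrossings, via \( \operatorname{Prob}\{\tilde{X}(T) \leq u\} \approx \exp(-\mu(u)\, T) \) with \( \mu(u) \) the upcrossing rate above. A small check of this correspondence (Python with NumPy; \( \lambda_2 \), \( T \) and \( u \) are illustrative values):

```python
# Sketch: the Gumbel limit versus the Poisson approximation
# Prob{max <= u} ~ exp(-mu(u) T), with mu(u) the upcrossing rate of level u.
import numpy as np

lam2, T, u = 4.0 / 3.0, 1.0e4, 4.5
mu = np.sqrt(lam2) / (2 * np.pi) * np.exp(-u**2 / 2)    # mean upcrossings per unit time

bT = np.sqrt(2 * np.log(T))
z = (u - bT) * bT - np.log(np.sqrt(lam2) / (2 * np.pi)) # z corresponding to level u

print("Poisson:", np.exp(-mu * T), "  Gumbel:", np.exp(-np.exp(-z)))
```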

For the smallest value of \(\mathrm{ X ( t ) }\) in \(\mathrm{ \left[ 0,T \right] }\) there is, of course, a corresponding relation.

When \( u \) is large, any peak of \( X(t) \) higher than \( u \) will have the following appearance: the upcrossing of the level \( u \) will be followed, after an average time asymptotic to \( \sqrt{2\pi/\lambda_2}\,/u \), by a downcrossing of the same level. Half-way between an upcrossing and the next downcrossing there will be a maximum of the curve, and the excess of this maximum over the level \( u \) is small when \( u \) is large.

Let us deal, briefly, with the length and size of the fluctuations of \( X(t) \). Suppose first that it is known that \( X(t) \) has a local maximum at some point \( t_1 \). The probability distribution of the ordinate \( X(t_1) \), relative to this hypothesis, has an explicit analytic expression, with a mean value equal to \( \frac{\lambda_2}{2}\sqrt{2\pi/\lambda_4} \).

Let us now suppose that \(\mathrm{ X ( t ) }\) has two consecutive minima at \(\mathrm{ t= t_1 }\) and \(\mathrm{ t= t_3 }\), and a maximum at \(\mathrm{ t= t_2 }\), where \(\mathrm{t_{1}<t_{2}<t_{3} }\). The wave of \(\mathrm{ X ( t ) }\) in \(\mathrm{ \left[ t_{1},t_{3} \right] }\) has the length \(\mathrm{ t_{3}-t_{1}, }\) its height (or amplitude) being defined as \(\mathrm{ X \left( t_{2} \right) - [ X \left( t_{1} \right) +X \left( t_{3} \right) ] /2 }\). The probability distributions of the length and height of the wave are known and their mean values are:

\( \mathrm{Mean~length} = 2\pi\sqrt{\lambda_2/\lambda_4} \)

\( \mathrm{Mean~height} = \lambda_2\sqrt{2\pi/\lambda_4} \).

As \(\mathrm{ \lambda _{0}=1 }\) we have \(\mathrm{ \lambda _{4} \geq \lambda _{2}^{2} }\). If the spectral density \(\mathrm{ f \left( \lambda \right) }\) differs appreciably from zero for large \(\mathrm{ \lambda }\), the spectrum of the process contains high frequencies, and \(\mathrm{ \lambda _{4} }\) will be large. The mean length and height of the waves will then be small, which means that \(\mathrm{ X ( t ) }\) will show a large number of fluctuations of small duration and size.

In the opposite extreme case, when \(\mathrm{ \lambda _{4} }\) only exceeds \(\mathrm{ \lambda _{2}^{2} }\) by a small amount, the spectrum will be concentrated in a small interval about some central frequency \(\mathrm{ \lambda ^{*} }\), and the length of the \(\mathrm{ X ( t ) }\) waves will then all be approximately equal to \(\mathrm{ 2~ \pi / \lambda ^{*} }\), while the size will be variable, with a mean of approximately \(\mathrm{ \sqrt[]{2~ \pi } }\).

5 . Global extremes in Wiener-Lévy processes

Let \( X(t) \) be the Wiener-Lévy stochastic process, the first approach given to the theory of Brownian motion. We will seek the stochastic behaviour of the global maximum \( \tilde{X}(T) = \max_{0 \leq t \leq T} X(t) \). The result to be given is due to Lévy. We will briefly follow Papoulis (1965).

As is well known, the Brownian motion \(\mathrm{ X ( t ) }\) has \(\mathrm{ X \left( 0 \right) =0 }\) and \(\mathrm{ Prob \{ X ( t ) \leq x \} =N ( x/ \sigma \sqrt[]{t} ) }\) where \(\mathrm{ N \left( x \right) }\), as before, denotes the standard normal distribution function and \(\mathrm{ \sigma }\) is the dispersion parameter of the process (of \(\mathrm{ X \left( 1 \right) }\) ).

It can be shown that the distribution function of \( \tilde{X}(T) \) is the same as that of \( \vert X(T) \vert \), that is,

\( \operatorname{Prob}\{\tilde{X}(T) \leq x\} = 0 \)     if \( x < 0 \)

\( = 2\, N(x/\sigma\sqrt{T}) - 1 \)             if \( x \geq 0 \),

owing to the symmetry \( N(x) + N(-x) = 1 \) of \( N(\cdot) \).

We can roughly sketch the proof of this result. The basic fact is the reflection symmetry principle (of D. André): if a trajectory of \( X(t) \) at any instant \( (<T) \) attains the level \( d\,(>0) \), the probabilities of \( X(t) \) being later, at some instant, at the positions \( d+a \) or \( d-a \) are equal; equivalently, for each trajectory crossing the line \( x = d\,(>0) \) there is a symmetrical (reflected) trajectory about this line. Then

\(\mathrm{ Prob \{{ \tilde{X} \left( T \right) \geq d } \}=Prob \{{ X \left( T \right) \geq d }\} +Prob \{ X \left( T \right) <d,\tilde{X} \left( T \right) \geq d \} }\)

\(\mathrm{ =2~Prob \{{ X \left( T \right) \geq d } \} }\)

because, by André’s principle, we have

\(\mathrm{ Prob \{{ X \left( T \right) <d,\tilde{X} \left( T \right) \geq d }\} =Prob \{{ X \left( T \right) \geq d,\tilde{X} \left( T \right) \geq d }\} =Prob \{ X \left( T \right) \geq d \} }\);

it is evident that for minima the result is symmetrical with respect to the line \( x = 0 \).
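A short simulation confirms the reflection formula (Python with NumPy/SciPy; the step count, horizon and test levels are illustrative choices, and the discrete grid slightly underestimates the true maximum):

```python
# Sketch: Monte Carlo check of Prob{max_{0<=t<=T} X(t) <= x} = 2 N(x/(sigma sqrt(T))) - 1.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
sigma, T, nsteps, reps = 1.0, 1.0, 2_000, 5_000

dX = sigma * np.sqrt(T / nsteps) * rng.standard_normal((reps, nsteps))
M = np.maximum(dX.cumsum(axis=1).max(axis=1), 0.0)   # running max; X(0) = 0

for x in (0.5, 1.0, 2.0):
    print(x, (M <= x).mean(), 2 * norm.cdf(x / (sigma * np.sqrt(T))) - 1)
```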

6 . Local extremes in Gaussian stationary processes

Cartwright and Longuet-Higgins (1956) have shown that if \( X(t) \) is a twice-differentiable Gaussian stationary process and \( p(\xi, \eta, \zeta) \) is the trinormal probability density of the random triple \( (X, X', X'') \), the density of the value of \( X \) at a point known to be a (local) maximum is given by

\( f(\xi) = \dfrac{\int_{-\infty}^{0} d\zeta\; p(\xi, 0, \zeta)\, \zeta}{\int_{-\infty}^{+\infty} d\xi \int_{-\infty}^{0} d\zeta\; p(\xi, 0, \zeta)\, \zeta} \).

The reasoning, based on Rice (1944/45), is roughly as follows. Let \( [t, t+dt] \) be an (infinitesimal) time interval. If there is a maximum in \( [t, t+dt] \), we must have \( X'(t') = 0 \) at some point \( t' \) of that interval, with \( X''(t') < 0 \). As the trajectories of \( X'(\cdot) \) are almost linear there, we have approximately \( \vert X'(t) \vert \leq \vert X''(t) \vert\, dt \) in the interval and, consequently, the straight line close to \( X'(\cdot) \), passing through \( \eta \) and with slope \( \zeta \), has a zero at \( t - \eta/\zeta \), so that \( 0 \leq -\eta/\zeta \leq dt \). The density of \( \xi \), relative to this strip, is then (\( \zeta < 0 \))

\(\mathrm{ \frac{ \int _{- \infty}^{0}d~ \zeta ~ \int _{0}^{- \zeta ~d~t}d~ \eta ~p \left( \xi , \eta , \zeta \right) }{ \int _{- \infty}^{+ \infty}d~ \xi ~ \int _{- \infty}^{0}d~ \zeta ~ \int _{0}^{- \zeta ~d~t}d~ \eta ~p \left( \xi , \eta , \zeta \right) } }\)

whose limit, as \(\mathrm{ d~t \downarrow 0 }\), is given by

\(\mathrm{ \frac{ \int _{- \infty}^{0}d~ \zeta ~p \left( \xi ,0, \zeta \right) ~ \zeta }{ \int _{- \infty}^{+ \infty}d~ \xi ~ \int _{- \infty}^{0}d~ \zeta ~p \left( \xi ,0, \zeta \right) \zeta } }\).

This heuristic derivation, evidently not rigorous, gives a good idea of the behaviour of \( X'(t) \) on a local scale. A correct (and longer!) proof can be given.

For a Gaussian stationary process, with zero means, the covariance matrix of \(\mathrm{ \left( X,X’,X" \right) }\) is, as is well known,

\( \begin{bmatrix} {\sigma^2} & {0} & {-\sigma^2\lambda_2} \\[0.3em] 0 & {\sigma^2\lambda_2} & 0 \\[0.3em] {-\sigma^2\lambda_2} &0& {\sigma^2\lambda_4} \end{bmatrix}\)

where the \( \lambda_j \), as before, denote the \( j \)th moments of the spectral density of the standardized process \( X(t)/\sigma \). For the standard process \( X(t)/\sigma \), as the density \( p(\xi, \eta, \zeta) \) is the trinormal one with zero means and the covariance matrix given above, the density of the value at a maximum of \( X(t)/\sigma \) is

\( p(x) = \dfrac{1}{\sqrt{2\pi}} \left\{ \varepsilon\, e^{-x^2/2\varepsilon^2} + (1-\varepsilon^2)^{1/2}\, x\, e^{-x^2/2} \displaystyle\int_{-\infty}^{x(1-\varepsilon^2)^{1/2}/\varepsilon} e^{-t^2/2}\, dt \right\} \)

where \( \varepsilon^2 = 1 - \lambda_2^2/\lambda_4 \) (\( \geq 0 \) by the Schwarz inequality), and \( X(t) \) evidently has the density

\(\mathrm{ f \left( \xi \right) =p \left( \xi / \sigma \right) / \sigma }\).

The distribution of the observed maximum then depends on only two parameters: a dispersion parameter \( \sigma \) and a quantity \( \varepsilon \; (0 \leq \varepsilon \leq 1) \) which can be interpreted as the relative spectrum width. For \( \varepsilon = 0 \) we have the Rayleigh density

\( p(x) = 0 \) if \( x < 0 \)

\( = x\, e^{-x^2/2} \) if \( x \geq 0 \)

and for \(\mathrm{ \varepsilon = 1 }\) we obtain the normal density

\(\mathrm{ p \left( x \right) =\frac{1}{\sqrt[]{2 \pi }}~e^{-{x^{2}}/{2}}}\);

\(\mathrm{ \varepsilon }\) is thus a measure of non-normality.
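The two limiting cases, and the fact that \( p \) is a proper density for every \( \varepsilon \), are easy to verify numerically; a minimal sketch (Python with NumPy/SciPy):

```python
# Sketch: the Cartwright--Longuet-Higgins density integrates to 1 and
# interpolates between the Rayleigh (eps -> 0) and normal (eps -> 1) densities.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def p_max(x, eps):
    # (1/sqrt(2 pi)) * {eps e^{-x^2/2 eps^2}
    #  + sqrt(1-eps^2) x e^{-x^2/2} int_{-inf}^{x sqrt(1-eps^2)/eps} e^{-t^2/2} dt}
    s = np.sqrt(1.0 - eps**2)
    return (eps * np.exp(-x**2 / (2 * eps**2)) / np.sqrt(2 * np.pi)
            + s * x * np.exp(-x**2 / 2) * norm.cdf(x * s / eps))

for eps in (0.1, 0.5, 0.9):
    total, _ = quad(p_max, -20, 20, args=(eps,), points=[0.0])
    print(eps, total)                                # ~ 1 for every eps
```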

For the reduced minimum \( \underset{\sim}{X} \) (which is easily obtained by reflection about \( x = 0 \)) we have

\( q(x) = \dfrac{1}{\sqrt{2\pi}} \left\{ \varepsilon\, e^{-x^2/2\varepsilon^2} - (1-\varepsilon^2)^{1/2}\, x\, e^{-x^2/2} \displaystyle\int_{-\infty}^{-x(1-\varepsilon^2)^{1/2}/\varepsilon} e^{-t^2/2}\, dt \right\} \).

7 . Some further results

As our main interest has been directed towards Gaussian processes, we will only give some scattered examples of extreme value properties for other kinds of stochastic processes.

Consider the class of processes \(\mathrm{ X \left( t \right) } \) with independent increments such that \(\mathrm{ X \left( 0 \right) =0 } \); for \(\mathrm{ t \geq 0 } \) the characteristic function of \(\mathrm{ X \left( t \right) } \) is

\( M(e^{iy X(t)}) = \exp\left[ -\frac{1}{2}\sigma^2 t y^2 + t \int_{-\infty}^{+\infty} \left( e^{iyv} - 1 - iyv \right) dG(v) \right] \),

where \( \sigma \) is a constant and \( G(v) \) is bounded and non-decreasing. If \( G(v) \) is identically zero, we have the Wiener-Lévy, or Brownian motion, process dealt with previously. On the other hand, if \( \sigma = 0 \) and \( G \) is constant except for a jump of magnitude \( 1 \) at \( v = 1 \), we have a centered Poisson process. For the general class of processes with independent increments with the characteristic function given above, the same formula for \( \tilde{X}(T) \) holds asymptotically for large \( T \), with \( \sigma^2 \) replaced by \( \sigma^2 + \int_{-\infty}^{+\infty} v^2\, dG(v) \). In particular, this result applies to the Poisson process.

Consider now a process with independent increments and a slightly modified form of the characteristic function:

\( M(e^{iy X(t)}) = \exp\left[ -t \int_{0}^{\infty} \left( e^{iyv} - 1 - (1+\lambda)\, iyv \right) dG(v) \right] \)

where \( \lambda > 0 \) is a constant, while \( G(v) \) is a distribution function with mean value \( 1 \). This process occurs in Insurance Risk Theory, where \( G \) is the distribution function of the amount of a claim, while \( \lambda \) represents the “loading” of the risk premium. Here \( X(t) \) is the total amount of loaded risk premiums paid during the time interval \( [0, t] \), minus the total amount of the claims that have occurred during the same period. If we assume that \( X(0) = u \) is the initial capital of the risk business, and consider the business to be ruined if \( X(t) \) ever takes a value \( \leq 0 \), the ruin probability \( \Psi(u) \) is the probability that the minimum value of \( X(t) \), for some positive \( t \), will be \( \leq 0 \). A fundamental result of Risk Theory asserts that

\(\mathrm{ \Psi \left( u \right) \leq e^{-Ru} } \),

where \(\mathrm{ R } \) is the smallest positive root of the equation

\(\mathrm{ \int _{0}^{ \infty}e^{Rv}~d~G \left( v \right) =1+ \left( 1+ \lambda \right) ~R } \).
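For instance, with exponential claims (taking \( G \) exponential with mean \( 1 \), an illustrative choice), the equation has the closed-form positive root \( R = \lambda/(1+\lambda) \), which gives a quick numerical check (Python with NumPy/SciPy):

```python
# Sketch: adjustment coefficient R for exponential claims with mean 1, where
# int_0^inf e^{Rv} dG(v) = 1/(1 - R), so the equation above becomes
# 1/(1 - R) = 1 + (1 + lam) R, with positive root R = lam/(1 + lam).
import numpy as np
from scipy.optimize import brentq

lam = 0.2                                            # premium loading (illustrative)
R = brentq(lambda R: 1.0 / (1.0 - R) - 1.0 - (1.0 + lam) * R, 1e-6, 1 - 1e-6)

print("R numerical:", R, "  closed form:", lam / (1 + lam))
print("ruin bound e^{-R u} for u = 10:", np.exp(-R * 10))
```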

As a final example consider the integral of a stationary process. We write for \(\mathrm{ t>0 } \)  

\(\mathrm{ IX \left( t \right) = \int _{0}^{t}X \left( u \right) d~u } \),

where \( X(u) \) is a Gaussian stationary process satisfying the conditions above. Moreover, we assume here that the spectral density \( f(\lambda) \) is continuous and positive at the point \( \lambda = 0 \). We then have \( M(IX(t)) = 0 \) and \( V(IX(t)) = M(IX^2(t)) \sim \pi\, t\, f(0) \) as \( t \rightarrow \infty \).

For the normalized process \( \overline{IX}(t) = IX(t)/\sqrt{V(IX(t))} \) it can then be shown that, with \( \widetilde{IX}(T) = \max_{0 \leq t \leq T} \overline{IX}(t) \),

\( \operatorname{Prob}\left\{ \sqrt{(1-\varepsilon)\log\log T} < \widetilde{IX}(T) < \sqrt{(1+\varepsilon)\log\log T} \right\} \rightarrow 1 \)

for any \(\mathrm{ \varepsilon >0 }\) , as \(\mathrm{ T \rightarrow \infty }\) .

For other details see the papers by Berman (1962), (1964), (1964b), (1974) and (1982) and Pickands (1967a), (1967b), (1968), (1969a) and (1969b) as well as the references given in the first section.

References

Cramér, H. and Leadbetter, M. R., 1967. Stationary and Related Stochastic Processes, Wiley, New York.

Papoulis, A., 1965. Probability, Random Variables and Stochastic Processes, McGraw Hill, New York.

Pickands, J. III, 1969b. Upcrossing probabilities for stationary Gaussian processes. Trans. Amer. Math. Soc., 145, 51-73.