Extremal Sequences and Processes: Basics and Statistics

Academia das Ciências de Lisboa (Lisbon Academy of Sciences), Lisbon, Portugal.
This chapter presents a definition of extremal (maxima) random sequences and stochastic processes. Extremal sequences are the usual sequences of maxima; extremal processes are obtained as stochastic processes of independent extremes with extremal margins. Extremal sequences and processes (distribution and properties), limiting characterizations, second order properties of extremal processes, jumps of an extremal process, logarithmization of time and the ergodic theorem for extremal processes, parameterization of extremal processes, and statistics for extremal sequences and processes are discussed with explanatory examples.
Extremal sequences, Extremal processes, Maxima, Independent extremes, Limiting characterizations, Parameterization, Ergodic theorem
We present a definition of extremal (maxima) random sequences and stochastic processes [the latter being distinct from the limiting constructions given by Dwass (1964) and Lamperti (1964)] following Tiago de Oliveira (1968); the emphasis will be more on a direct characterization than on a limiting one. Extremal sequences are the usual sequences of maxima, and extremal processes will be obtained as stochastic processes (in continuous time) of independent extremes (maxima) with extremal (maxima) margins. The limiting approaches will be described in a special section.
As the definition is a weak one (in law), as seen essentially in the identifiability problem, the properties described are, chiefly, the second order properties for the case studied, with Gumbel margins.
It will be clear that extremal processes are surely continuous to the right and almost surely have discontinuous trajectories.
We will always assume the marginal distributions of extremal sequences and processes to be the Gumbel distribution (\(\mathrm{ \Lambda( z ) =exp ( -e^{-z} ) }\), for reduced random variables). The other extremal sequences and processes are easily converted to Gumbel margins by logarithmic transformations in the case of maxima, and also using symmetry considerations for minima; their form will, naturally, be given in the limiting characterization section.
The definition of processes by a non-limiting technique is analogous to that of Mann (1953) for Brownian motion, using the known “duality”, for independence, between sums and maxima (characteristic functions versus distribution functions), now extended to stochastic processes with continuous time. We also obtain stochastic processes, connected with maxima, whose one-dimensional behaviour is the one described asymptotically by Newell (1962) for diffusion processes, like the Ornstein-Uhlenbeck Brownian motion. The structure of the jump process is obtained, as well as a natural representation.
Let us recall some results.
A general Gumbel random variable has the distribution function \(\mathrm{ \Lambda \left( \left( x- \lambda \right) / \delta \right) }\) where \(\mathrm{ \Lambda ( z ) =exp ( -e^{-z} ) }\) is the distribution function of a reduced random variable \(\mathrm{ \left( \lambda =0, \delta =1 \right) }\); we also have \(\mathrm{ \Lambda ' ( z ) =e^{-z} \Lambda ( z ) }\). The first moments are \(\mathrm{ \mu = \lambda + \gamma ~ \delta ~( \gamma =0.57722\dots, Euler's ~constant) }\), \(\mathrm{ \sigma ^{2}= ({ \pi ^{2}}/{6})~ \delta ^{2}=1.64493\dots \delta ^{2} }\), \(\mathrm{ \sigma = ( \pi /\sqrt{6})~ \delta =1.28255\dots \delta }\), \(\mathrm{ \beta _{1}=\mu_{3}/ \sigma ^{3}=1.13955 }\), and \(\mathrm{ \beta _{2}=\mu_{4}/ \sigma ^{4}=5.4 }\); see Gumbel (1958).
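As a quick numerical check of these constants, the following sketch (assuming NumPy and SciPy are available; `gumbel_r` is SciPy's reduced Gumbel distribution) recovers the quoted values:

```python
# Check the reduced Gumbel moments quoted above (lambda = 0, delta = 1).
import numpy as np
from scipy import stats

mean, var, skew, exkurt = stats.gumbel_r.stats(moments="mvsk")
print(float(mean))          # ~0.57722  (Euler's constant gamma)
print(float(var))           # ~1.64493  (pi^2/6)
print(float(np.sqrt(var)))  # ~1.28255  (pi/sqrt(6))
print(float(skew))          # ~1.13955  (beta_1)
print(float(exkurt) + 3)    # 5.4       (beta_2; SciPy reports excess kurtosis)
```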
An extreme random pair \(\mathrm{ \left( Z_{1},Z_{2} \right) }\) with reduced Gumbel margins has the distribution function \(\mathrm{ \Lambda \left( x,y \right) =Prob \{ Z_{1} \leq x,Z_{2} \leq y \} =exp \{{ - \left( e^{-x}+e^{-y} \right) k \left( y-x \right) }\} = \{ \Lambda \left( x \right) \Lambda \left( y \right) \} ^{k \left( y-x \right) } }\), where the dependence function \(\mathrm{ k(.) }\) satisfies some conditions for \(\mathrm{ \Lambda \left( .,. \right) }\) to be a distribution function with reduced Gumbel margins; see Tiago de Oliveira (1962/63). In particular we have \(\mathrm{ \frac{max \left( 1,e^{w} \right) }{1+e^{w}} \leq k \left( w \right) \leq 1 }\), the lower bound corresponding to the diagonal case, where we have \(\mathrm{ Prob \{ Z_{2}=Z_{1} \} =1 }\) (for reduced margins), and \(\mathrm{ k \left( w \right) =1 }\) corresponding to independence.
The correlation coefficient has the expression
\(\mathrm{ \rho =-\frac{6}{ \pi ^{2}} \int _{- \infty}^{+ \infty}log~k \left( w \right)d~w \left( \geq 0 \right) }\),
\(\mathrm{ \rho =0 }\) being equivalent to independence and \(\mathrm{ \rho =1 }\) to the diagonal case.
The probability \(\mathrm{ D \left( w \right) =Prob \{ Z_{2}-Z_{1} \leq w \} }\), \(\mathrm{ \left( Z_{1},Z_{2} \right) }\) being reduced Gumbel random variables, has the expression \(\mathrm{ D \left( w \right) =\frac{k' \left( w \right) }{k \left( w \right) }+\frac{e^{w}}{1+e^{w}} }\) a.e., as \(\mathrm{ k' \left( . \right) }\) exists a.e.; in the independence case \(\mathrm{ \left( k \left( w \right) =1 \right) }\) we have the logistic distribution and in the diagonal case we have \(\mathrm{ D \left( w \right) =H \left( w \right) }\), where \(\mathrm{ H \left( w \right) }\) is the Heaviside jump function \(\mathrm{ \left( H \left( w \right) =0~if~w<0~and~H \left( w \right) =1~if~w \geq 0 \right) }\). Note that \(\mathrm{ \partial ^{2} \Lambda / \partial ~x~ \partial ~y }\) does not exist in all cases, as \(\mathrm{ k'' ( . ) }\) does not always exist; this is the case in all stochastic processes (sequences) connected with extremes, as happens in extremal sequences and processes and in EMS and EME sequences.
In particular, for \(\mathrm{ k_{ \theta } \left( w \right) =1-\frac{min \left( \theta ,e^{w} \right) }{1+e^{w}}~ \left( 0 \leq \theta \leq 1 \right) }\), we have \(\mathrm{ D_{ \theta } \left( w \right) =0 }\) if \(\mathrm{ w<log~ \theta ~ \left( <0 \right) }\) (i.e., \(\mathrm{ Prob \{ Z_{2} \geq Z_{1}+log~ \theta \} =1 }\)) and \(\mathrm{ D_{ \theta } \left( w \right) =\frac{1}{1+ \left( 1- \theta \right) e^{-w}} }\) if \(\mathrm{ w \geq log~ \theta }\), with a jump of \(\mathrm{ \theta }\) at \(\mathrm{ w = log~ \theta }\), the jump being the probability of \(\mathrm{ Z_{2}=Z_{1}+log~ \theta }\); the correlation coefficient is
\(\mathrm{ \rho \left( \theta \right) =\frac{6}{ \pi ^{2}}R \left( \theta \right) }\)
where \(\mathrm{ R \left( \theta \right) = \int _{0}^{ \theta }\frac{-log \,t}{1-t}d~t }\), \(\mathrm{ R' \left( \theta \right) \geq 0 }\), with \(\mathrm{ R \left( 0 \right) =0 }\) and \(\mathrm{ R \left( 1 \right) =\frac{ \pi ^{2}}{6} }\), so that for \(\mathrm{ \theta =0 }\) and \(\mathrm{ \theta =1 }\) we have, obviously, the independence and diagonal cases. The expression of \(\mathrm{ k_{ \theta } \left( w \right) }\) corresponds to the biextremal model.
The other coefficients are the difference-sign correlation \(\mathrm{ \tau \left( \theta \right) = \theta }\), the grade correlation \(\mathrm{ \chi \left( \theta \right) =3\, \theta / \left( 2+ \theta \right) }\), and the medial correlation \(\mathrm{ v \left( \theta \right) =2^{ \theta }-1 }\), all increasing from \(\mathrm{0 }\) to \(\mathrm{ 1 }\) as \(\mathrm{ \theta }\) increases from \(\mathrm{ \theta =0 }\) to \(\mathrm{ \theta =1 }\).
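The coefficients of the biextremal model are easy to evaluate numerically; the sketch below (assuming NumPy and SciPy) computes \(R(\theta)\) by quadrature and prints the four coefficients for a few values of \(\theta\):

```python
# rho, tau, grade and medial correlations of the biextremal model.
import numpy as np
from scipy.integrate import quad

def R(theta):
    # R(theta) = int_0^theta (-log t)/(1 - t) dt; the integrand is positive
    return quad(lambda t: -np.log(t) / (1.0 - t), 0.0, theta)[0] if theta > 0 else 0.0

for th in (0.0, 0.25, 0.5, 0.75, 1.0):
    rho = 6.0 / np.pi**2 * R(th)    # correlation coefficient
    tau = th                        # difference-sign correlation
    grade = 3.0 * th / (2.0 + th)   # grade correlation
    medial = 2.0**th - 1.0          # medial correlation
    print(f"theta={th:4.2f}  rho={rho:.4f}  tau={tau:.4f}  "
          f"grade={grade:.4f}  medial={medial:.4f}")
```

For \(\theta=0\) all four coefficients vanish (independence) and for \(\theta=1\) all equal one (diagonal case).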
Let \(\mathrm{ E_{1},E_{2},\dots,E_{n},\dots }\) be a sequence of i.i.d. reduced Gumbel random variables and let \(\mathrm{ Z_{1}=E_{1},Z_{2}=max \left( Z_{1},E_{2} \right) ,\dots,Z_{n}=max \left( Z_{n-1},E_{n} \right) =max \left( E_{1},\dots,E_{n} \right) }\).
Evidently \(\mathrm{ Prob \{ Z_{1} \leq z_{1},\dots,Z_{n} \leq z_{n} \} =Prob \{ Z_{i} \leq z_{i},i=1,\dots,n \} =Prob \{ E_{1} \leq \min_{1}^{n} z_{i},~E_{2} \leq \min_{2}^{n} z_{i},\dots \} =\prod_{i=1}^{n} \Lambda ( min ( z_{i},\dots,z_{n} ) ) }\). This corresponds to putting \(\mathrm{ t_{1}=1,\dots,t_{n}=n }\) in the extremal processes, so we will only sketch some results. We can now consider the parameterized sequence \(\mathrm{ X_{n}= \lambda + \delta ~Z_{n} }\) with real \(\mathrm{ \lambda }\) and \(\mathrm{ \delta \left( >0 \right) }\). We have \(\mathrm{ Prob \{ X_{1} \leq x_{1},\dots,X_{n} \leq x_{n} \} =Prob \{ Z_{1} \leq \left( x_{1}- \lambda \right) / \delta ,\dots,Z_{n} \leq \left( x_{n}- \lambda \right) / \delta \} =\prod_{i=1}^{n} \Lambda \left( \left( min \left( x_{i},\dots,x_{n} \right) - \lambda \right) / \delta \right) }\).
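An extremal sequence is trivial to simulate as a running maximum; the sketch below (assuming NumPy) checks the mean value \(\mathrm{ \gamma+log\,j }\) and the variance \(\mathrm{ \pi^2/6 }\) of \(Z_j\):

```python
# Simulate extremal sequences Z_n = max(E_1,...,E_n) with reduced Gumbel E_i.
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.5772156649
n, reps = 100, 20000

E = rng.gumbel(size=(reps, n))            # i.i.d. reduced Gumbel variables
Z = np.maximum.accumulate(E, axis=1)      # Z_n = max(Z_{n-1}, E_n)

j = 50
print(Z[:, j-1].mean(), gamma + np.log(j))  # M(Z_j) = gamma + log j
print(Z[:, j-1].var(),  np.pi**2 / 6)       # V(Z_j) = pi^2/6
```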
Let us now consider a periodic subsequence \(\mathrm{ X_{j}'=X_{jp} }\), \(\mathrm{ p ( >0 ) }\) integer. We can show, in two ways, that \(\mathrm{ \{ X_{j}' \} }\) and \(\mathrm{ \{ X_{j} \} }\) have the same distribution apart from the parameters, which are \(\mathrm{ \left( \lambda , \delta \right) }\) for the sequence \(\mathrm{ \{ X_{n} \} }\) and \(\mathrm{ \left( \lambda ', \delta ' \right) = \left( \lambda + \delta \, log\,p, \delta \right) }\) for the sequence \(\mathrm{ \{ X_{n}' \} }\). A simple one is to consider the sequence \(\mathrm{ E_{1}'=max \left( E_{1},\dots,E_{p} \right) -log\,p,~E_{2}'=max \left( E_{p+1},\dots,E_{2p} \right) -log\,p }\), etc. It is immediate that the \(\mathrm{ E_{j}' }\) are independent random variables with the reduced Gumbel distribution. If we denote \(\mathrm{ Z_{1}'=E_{1}',Z_{2}'=max \left( E_{1}',E_{2}' \right) ,\dots,Z_{n}'=max \left( E_{1}',\dots,E_{n}' \right) }\), we see that
\(\mathrm{ Z_{j}^{’}=Z_{jp}-log\,p }\) and so \(\mathrm{ X_{j}^{’}= \lambda + \delta ( Z_{j}^{’}+log\,p ) = \lambda ’+ \delta\, Z_{j}^{’} }\),
and then \(\mathrm{ Prob \{ X_{1}' \leq x_{1},\dots,X_{n}' \leq x_{n} \} =\prod_{i=1}^{n} \Lambda ( ( min \left( x_{i},\dots,x_{n} \right) - \lambda ' ) / \delta ) }\), which is the same as that of \(\mathrm{ \{ X_{j} \} }\) with the new parameters.
Thus a change of timescale (or a periodic selection at instants \(\mathrm{ p,2p,\dots }\)) leads to extremal sequences with the same properties. In particular we have \(\mathrm{ M \left( X_{j} \right) = \lambda + \delta \left( \gamma +log\,j \right) }\), \(\mathrm{ C \left( X_{j},X_\mathit{l} \right) = \delta ^{2}R ( \frac{min \left( j,\mathit{l} \right) }{max \left( j ,\mathit{l} \right) } ) }\), \(\mathrm{ M ( X_{j}' ) =M \left( X_{jp} \right) = \lambda + \delta \left( \gamma +log \left( jp \right) \right) = \lambda '+ \delta \left( \gamma + log\,j \right) }\), and \(\mathrm{ C ( X_{j}',X_\mathit{l}' ) = \delta ^{2}R ( \frac{min \left( j,\mathit{l} \right) }{max \left( j,\mathit{l} \right) } ) }\). Periodic sub-sampling does not alter the structure of the \(\mathrm{ \{ X_{j} \} }\) apart from the location effect of the period.
Another proof is analogous to the one given for extremal processes where this effect is integrated in a general change of the time scale.
An ergodic (type) theorem, analogous to the one for extremal processes, can be written. It is, as \(\mathrm{ b/a \rightarrow \infty }\),
\(\mathrm{ \frac{1}{log \left( b/a \right) } \sum _{a+1}^{b}\frac{X_{k}}{k}- \delta\, log\sqrt[]{a\,b}\stackrel {m\,s}\rightarrow \lambda + \gamma \, \delta }\).
The proof is not important, being analogous to the one to follow, and the result is statistically irrelevant.
A stochastic process \(\mathrm{ Z \left( t \right) \left( t \geq 0 \right) }\) is called an extremal process (of maxima) if:
1) \(\mathrm{ Z \left( t+ \theta \right) =max \left( Z \left( t \right) ,E \left( t, \theta \right) \right) }\), where \(\mathrm{ Z \left( t \right) }\) and \(\mathrm{ E \left( t, \theta \right) }\) are independent random variables \(\mathrm{ \left( \theta \geq 0 \right) }\);
2)\(\mathrm{ ~E \left( t, \theta \right) }\) and \(\mathrm{ E \left( t’, \theta ’ \right) }\) are independent random variables if the time intervals \(\mathrm{ ] t,t+ \theta ] }\) and \(\mathrm{ ] t’,t’+ \theta ’] }\) are disjoint;
3) the process is time-homogeneous, that is, the distribution of \(\mathrm{ E \left( t, \theta \right) }\) depends only on \(\mathrm{ \theta }\) and is independent of \(\mathrm{ t }\);
4) \(\mathrm{ Z \left( 0 \right) =- \infty }\).
If \(G \left( x \right) \) is the distribution function of \(Z \left( 1 \right) \) we have
\(Prob \{ Z \left( t \right) \leq x \} =G^{t} \left( x \right) ~and~Prob \{ E \left( t, \theta \right) \leq x \} =G^{ \theta } \left( x \right) \).
Let \(\mathrm{ F \left( x,t \right) }\) be the distribution function of \(\mathrm{ Z \left( t \right) }\) and denote by \(\mathrm{ F_{1} \left( x, \theta \right) }\) the distribution function of \(\mathrm{ E \left( t, \theta \right) }\), independent of \(\mathrm{ t }\) by assumption (3).
Thus we have, by (1) and (2),
\(\mathrm{ F \left( x,t+ \theta + \theta ’ \right) =F \left( x,t \right) F_{1} \left( x, \theta + \theta ’ \right) =F \left( x,t+ \theta \right) F_{1} \left( x, \theta ’ \right) }\)
\(\mathrm{ =F \left( x,t \right) F_{1} \left( x, \theta \right) F_{1} \left( x, \theta ’ \right) }\)
so that
\(\mathrm{ F_{1} \left( x, \theta + \theta ’ \right) =F_{1} \left( x, \theta \right) F_{1} \left( x, \theta ’ \right) }\)
As the solution of this functional equation, for measurable functions, is the exponential function, we have
\(\mathrm{ F_{1} \left( x, \theta \right) =F_{1}^{ \theta } \left( x,1 \right) }\)
As \(\mathrm{ F \left( x,t \right) =F \left( x,0 \right) F_{1} \left( x,t \right) =F \left( x,0 \right) F_{1}^{t} \left( x,1 \right) }\) and, by (4), \(\mathrm{F \left( x,0 \right) =1}\), we obtain
\(\mathrm{ F \left( x,t \right) =F_{1}^{t} \left( x,1 \right) =G^{t} \left( x \right) }\)
as
\(\mathrm{ F \left( x,1 \right) =F_{1} \left( x,1 \right) =G \left( x \right) }\).
This result was obtained asymptotically by Newell (1962) under other conditions.
Let us finally introduce a new assumption, that one-dimensional margins have Gumbel distributions, by:
5) \(\mathrm{ G \left( x \right) = \Lambda \left( x \right) }\) (standardization of the time unit).
We have then
The one-dimensional distribution of \(Z \left( t \right) ~is ~\Lambda ^{t} \left( x \right) = \Lambda \left( x-log~t \right)\);
the two-dimensional distribution of \(\left( Z \left( t_{1} \right) ,Z \left( t_{2} \right) \right) \left( t_{1}<t_{2} \right)\) is
\(\Lambda ( min \left( x_{1},x_{2} \right) -log~t_{1} ) \cdot \Lambda ( x_{2}-log \left( t_{2}-t_{1} \right) ) \);
and, in general, the \( n\)-dimensional distribution of \(( Z \left( t_{1} \right) ,Z \left( t_{2} \right) ,\dots,Z \left( t_{n} \right) ) \)is
\(\prod_{i=1}^{n} \Lambda ( min( x_{i},\dots,x_{n} ) -log \left( t_{i}-t_{i-1} \right) ) ~~( t_{0}=0<t_{1}<\dots<t_{n} )\).
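These finite-dimensional distributions suggest a direct sampling scheme: the increment over \(\mathrm{ ]t_{i-1},t_i] }\) has distribution \(\mathrm{ \Lambda^{t_i-t_{i-1}}(x)=\Lambda(x-log(t_i-t_{i-1})) }\), a Gumbel distribution with location \(\mathrm{ log(t_i-t_{i-1}) }\). A sketch (assuming NumPy; `sample_extremal` is a hypothetical helper, not from the original text):

```python
# Sample (Z(t_1),...,Z(t_n)) of an extremal process via independent increments.
import numpy as np

rng = np.random.default_rng(1)

def sample_extremal(ts, size):
    # ts: increasing positive times t_1 < ... < t_n (with t_0 = 0)
    dts = np.diff(np.concatenate(([0.0], ts)))
    # E over ]t_{i-1}, t_i] ~ Gumbel with location log(t_i - t_{i-1})
    E = rng.gumbel(loc=np.log(dts), size=(size, len(ts)))
    return np.maximum.accumulate(E, axis=1)   # Z(t_i) = max of the increments

ts = np.array([1.0, 2.0, 5.0])
Z = sample_extremal(ts, 200000)
x = 2.0   # one-dimensional check: Prob{Z(t) <= x} = Lambda(x - log t)
print((Z[:, 2] <= x).mean(), np.exp(-np.exp(-(x - np.log(ts[2])))))
```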
It is also immediate that, with \(\mathrm{ H \left( w \right) }\) denoting the Heaviside distribution function of the almost sure random variable equal to zero, then
\(Z \left( t \right) \) is a Markoff process with transition probability function
\(P \left( x_{1},t_{1};x_{2},t_{2} \right) =H \left( x_{2}-x_{1} \right) \Lambda ^{t_{2}-t_{1}} \left( x_{2} \right) \).
From this transition probability function we get, denoting by \(\mathrm{ q \left( x_{1},t_{1} \right) }\) and \(\mathrm{ Q \left( t_{1},x_{1},x_{2} \right) }\) the process intensity and the (relative) transition probability function (using formula (8.8.5) of Fisz (1962)),
\(\mathrm{ P \left( x_{1},t_{1};x_{2},t_{2} \right) = \left[ 1-q \left( x_{1},t_{1} \right) \left( t_{2}-t_{1} \right) \right] H \left( x_{2}-x_{1} \right) }\)
\(\mathrm{ +q \left( x_{1},t_{1} \right) Q \left( t_{1},x_{1},x_{2} \right) \left( t_{2}-t_{1} \right) +o \left( t_{2}-t_{1} \right) }\),
the following result:
An extremal process is a purely discontinuous (step) Markoff process where
\(P \left( x_{1},t_{1};x_{2},t_{2} \right) = \left[ 1-e^{-x_{1}} \left( t_{2}-t_{1} \right) \right] H \left( x_{2}-x_{1} \right) +e^{-x_{1}}H \left( x_{2}-x_{1} \right) \left( 1-e^{x_{1}-x_{2}} \right) \left( t_{2}-t_{1} \right) +o \left( t_{2}-t_{1} \right) \),
so that
\(q \left( x_{1},t_{1} \right) =e^{-x_{1}}\) and \(Q \left( t_{1},x_{1},x_{2} \right) =H \left( x_{2}-x_{1} \right) \left( 1-e^{x_{1}-x_{2}} \right) \).
For this purely discontinuous Markoff process we can immediately write the classical Kolmogoroff-Chapman integral equations.
Let \(T \left( a \right) \) denote the upcrossing time of the level \(a \). As \(\mathrm{ Prob \{{ T \left( \mathit{a} \right) >t }\} =Prob \{ Z \left( t’ \right) <\mathit{a},t’ \leq t \} = \Lambda ^{t} \left( \mathit{a} \right) = \Lambda \left( \mathit{a}-log~t \right) }\), we can say that:
The distribution function of the upcrossing time is
\(Prob \{{ T \left( a \right) \leq t}\} =1- \Lambda \left( a-log~t \right) \).
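Since \(\mathrm{ 1- \Lambda \left( a-log~t \right) =1-exp \left( -t~e^{-a} \right) }\), the upcrossing time \(T(a)\) is exponentially distributed with mean \(e^a\). A small simulation check (a sketch assuming NumPy), using \(\mathrm{ T(a)>t \Leftrightarrow Z(t)<a }\) and \(\mathrm{ Z(t)\sim\Lambda(x-log~t) }\):

```python
# Check Prob{T(a) > t} = exp(-t e^{-a}) by sampling Z(t) directly.
import numpy as np

rng = np.random.default_rng(2)
a, t = 1.5, 3.0
Z_t = rng.gumbel(loc=np.log(t), size=200000)   # Z(t) ~ Lambda(x - log t)
print((Z_t < a).mean(), np.exp(-t * np.exp(-a)))
```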
Some more results can also be given.
A first one, relating the chance experiments between points of time, is that for \(\mathrm{ s \leq u \leq t }\) we have \(\mathrm{ E \left( s,t \right) =max( E \left( s,u \right) ,E \left( u,t \right) ) }\), as follows from the two expressions of \(\mathrm{ Z \left( t \right) }\) in terms of \(\mathrm{ Z \left(s \right) }\), directly and by way of \(\mathrm{ Z \left(u \right) }\).
As \( Z \left( t \right) -log~t=E \left( 0,t \right) -log~t \)is a reduced Gumbel random variable we see that \(\frac{Z \left( t \right) }{log~t}\stackrel{ms} \rightarrow1 \).
Finally, the least squares predictor of \(\mathrm{ Z \left( t \right) }\), knowing \(\mathrm{ Z \left( t_{1} \right) =z_{1},\dots,Z \left( t_{n} \right) =z_{n} \left( 0<t_{1}<t_{2}<\dots<t_{n}<t \right) }\), depends only on \(\mathrm{ z_{n} }\), as \(\mathrm{ Z \left( t \right) }\) is Markovian, and is given by
\(\mathrm{ p \left( t;z_{n},t_{n} \right) =z_{n}+ \int _{z_{n}}^{+ \infty} \left( 1- \Lambda \left( z-log \left( t-t_{n} \right) \right) \right) d~z \left( \geq z_{n} \right) }\)
whose mean-square error is
\(\mathrm{ \int _{- \infty}^{+ \infty} [ \int _{z_{n}}^{+ \infty} \left( 1- \Lambda \left( y-log \left( t-t_{n} \right) \right) \right) d \left( y^{2} \right) - \left( p^{2} \left( t;z_{n},t_{n} \right) -z_{n}^{2} \right) ] ~d~ \Lambda ^{t_{n}} \left( z_{n} \right) }\).
From the practical point of view this is not very rewarding and it is simpler to use the linear predictor \(\mathrm{ z_{n}+log ( t/t_{n} ) }\).
Notice that, as \(\mathrm{ Z \left( t \right) }\) is non-decreasing, we know that \(\mathrm{\tilde{ Z} \left( t \right) =\max_{ 0 \leq s \leq t } Z \left( s \right) =Z \left( t \right) }\).
We will sketch another characterization of extremal processes; for details and enlargements see Dwass (1964, 1966, 1973), Lamperti (1964), Resnick (1973, 1975) and Resnick and Rubinovitch (1973); for other approaches see references in Galambos (1978) and Leadbetter, Lindgren and Rootzén (1983).
Let \(\mathrm{ \{ X_{j} \} ,j=1,2,\dots }\) be a sequence of i.i.d. random variables with distribution function \(\mathrm{ F \left( x \right) }\) and such that there exist \(\mathrm{ \lambda _{n} }\) and \(\mathrm{ \delta _{n} \left( >0 \right) }\) such that \(\mathrm{ Prob \{ \max_{1}^{n} \left( X_{i}- \lambda _{n} \right) / \delta _{n} \leq x \} =F^{n} \left( \lambda _{n}+ \delta _{n}~x \right)\stackrel {w}\rightarrow L \left( x \right) }\), where the proper and non-degenerate limiting distribution is necessarily a Gumbel distribution function \(\mathrm{ \Lambda \left( x \right) =exp \left( -e^{-x} \right) }\), or a Fréchet distribution function \(\mathrm{ \Phi _{ \alpha } \left( x \right) =0~if~x \leq 0,~ \Phi _{ \alpha } \left( x \right) =exp ( -x^{- \alpha } ) ~if~x > 0,~with~ \alpha >0 }\), or a Weibull distribution function \(\mathrm{ \Psi _{ \alpha } \left( x \right) =exp ( - \left( -x \right) ^{ \alpha } ) ~if~x \leq 0,~ \Psi _{ \alpha } \left( x \right) =1~if~x \geq 0,~with~ \alpha >0 }\).
Let us define, for each \(\mathrm{ n=1,2,\dots}\) and \(\mathrm{ t \geq 0 }\), stochastic processes
\(\mathrm{ M_{n} \left( t \right) = {(max ( X_{1},\dots,X_{ \left[ nt \right] } ) - \lambda _{n}})/{ \delta _{n}}~for~t> {1}/{n} }\)
\(\mathrm{ = \left( X_{1}- \lambda _{n} \right) / \delta _{n} }\) for \(\mathrm{ 0 \leq t \leq 1/n }\) ,
\(\mathrm{ \left[ u \right] }\) denoting as usual the largest integer \(\mathrm{ \leq u }\) .
As \(\mathrm{ Prob \{ M_{n} \left( t \right) \leq x \} =F^{ \left[ nt \right] } \left( \lambda _{n}+ \delta _{n}x \right) = \left[ F^{n} \left( \lambda _{n}+ \delta _{n}x \right) \right] ^{ { \left[ nt \right] }/{n}} \rightarrow L^{t} \left( x \right) }\)
and, more generally, for \(\mathrm{ 0<t_{1}<t_{2}<\dots<t_{k} }\),
\(\mathrm{ Prob \{ M_{n} \left( t_{1} \right) \leq x_{1},M_{n} \left( t_{2} \right) \leq x_{2},\dots,M_{n} \left( t_{k} \right) \leq x_{k} \} \rightarrow }\)
\(\mathrm{ L^{t_{1}} ( \min_{1}^{k} x_{i} ) ~L^{t_{2}-t_{1}} ( \min_{2}^{k} x_{i} ) ~L^{t_{3}-t_{2}} ( \min_{3}^{k} x_{i} ) \cdots L^{t_{k}-t_{k-1}} \left( x_{k} \right) }\),
we define an extremal reduced process \(M \left( t \right) \) as a limit of \(M_{n} \left( t \right)\); this extends to Fréchet and Weibull margins the extremal process defined with Gumbel margins.
It is evident that all “margin-free” properties, such as the jump process described below (to be Poisson-distributed) and all non-parametric correlation coefficients, have the same behaviour as in extremal processes with Gumbel margins, which will continue to be considered in what follows. Note, in particular, that second order properties are not always valid for extremal processes with Fréchet or Weibull margins.
The second order properties of extremal processes (with Gumbel margins, to be dealt with in the rest of the chapter) depend on the form of the covariance (or correlation) function, as the margins have mean value and variance.
From the expression of the two-dimensional distribution function of \(\mathrm{ \left( Z \left( t_{1} \right) ,Z \left( t_{2} \right) \right) }\) passing to the reduced margins \(\mathrm{ \xi _{1}=x_{1}-log~t_{1} }\) and \(\mathrm{ \xi _{2}=x_{2}-log~t_{2} }\), we see that the mean value function of \(\mathrm{ Z \left( t \right) }\) is \(\mathrm{ \mu \left( t \right) = \gamma +log~t }\) and the dependence function \(\mathrm{ \left( t_{1} \leq t_{2} \right) }\) is the biextremal one
\(\mathrm{ k \left( w;t_{1}/t_{2} \right) =\frac{ \left( 1-t_{1}/t_{2} \right) +max \left( e^{w},t_{1}/t_{2} \right) }{1+e^{w}}=1-\frac{min \left( t_{1}/t_{2},e^{w} \right) }{1+e^{w}} }\),
with parameter \(\mathrm{ \theta =t_{1}/t_{2} \left( 0 \leq \theta \leq 1 \right) }\). Consequently:
An extremal (Gumbel) location-dispersion free process has the mean value \(\gamma +log~t\), variance \(\pi ^{2}/6 \) and the correlation function
\(\mathrm{ \rho \left( t_{1},t_{2} \right) = \rho \left( {t_{1}}/{t_{2}} \right) =1+\frac{6}{ \pi ^{2}} \int _{{t_{1}}/{t_{2}}}^{1}\frac{log~ \alpha }{1- \alpha }d~ \alpha =-\frac{6}{ \pi ^{2}} \int _{0}^{{t_{1}}/{t_{2}}}\frac{log~ \alpha }{1- \alpha }d~ \alpha =\frac{6}{ \pi ^{2}}R \left( {t_{1}}/{t_{2}} \right) }\)
the process is integrable, continuous and indefinitely differentiable in mean square except at the origin \(\left( 0,0 \right) \), where \(\rho\) is not continuous.
The mean-square continuity is, evidently, consistent with the almost sure discontinuity, the discontinuities being moving (random) discontinuities; the graph of the process is a non-decreasing random step function.
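The correlation function can be checked empirically; the sketch below (assuming NumPy and SciPy) samples the pair \(\mathrm{ (Z(t_1),Z(t_2)) }\) through one increment and compares the sample correlation with \(\mathrm{ (6/\pi^2)R(t_1/t_2) }\):

```python
# Empirical check of rho(t1,t2) = (6/pi^2) R(t1/t2).
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(3)
t1, t2 = 2.0, 5.0

Z1 = rng.gumbel(loc=np.log(t1), size=200000)                        # Z(t1)
Z2 = np.maximum(Z1, rng.gumbel(loc=np.log(t2 - t1), size=Z1.size))  # Z(t2)

R = quad(lambda u: -np.log(u) / (1.0 - u), 0.0, t1 / t2)[0]
print(np.corrcoef(Z1, Z2)[0, 1], 6.0 / np.pi**2 * R)
```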
As we showed, the transition probability function is
\(\mathrm{ P \left( x_{1},t_{1};x_{2},t_{2} \right) =H \left( x_{2}-x_{1} \right) \Lambda ^{t_{2}-t_{1}} \left( x_{2} \right) \left( t_{1}<t_{2} \right) }\).
Consequently the probability of zero jumps in the time interval \(\mathrm{ ] t_{1},t_{2} ] }\) is given by
\(\mathrm{ p_{0} \left( t_{1},t_{2} \right) = \int _{- \infty}^{+ \infty} \Lambda ^{t_{2}-t_{1}} \left( x \right) d~ \Lambda ^{t_{1}} \left( x \right) ={t_{1}}/{t_{2}} }\),
because the probability of zero jumps in \(\mathrm{ ] t_{1},t_{2} ] }\) if \(\mathrm{ Z \left( t_{1} \right) =x_{1} }\) (which implies \(\mathrm{ Z \left( t_{2} \right) =x_{1} }\)) is \(\mathrm{ \Lambda ^{t_{2}-t_{1}} \left( x_{1} \right) }\); the probability that we have at least one jump is then
\(\mathrm{ p_{>0} \left( t_{1},t_{2} \right) =1-{t_{1}}/{t_{2}} }\).
Notice that for every \(\mathrm{ t>0 }\) we have \(\mathrm{ Prob \{ Z \left( t \right) >- \infty \} =p_{>0} \left( 0,t \right) =1 }\) so that:
For \(t>0,Z \left( t \right) \) is finite with probability one.
If \(\mathrm{ P_{N} \left( t_{1},t_{2} \right) }\) denotes the probability of \(\mathrm{ N }\) jumps in the interval \(\mathrm{ ] t_{1},t_{2} ] }\), the probability of zero jumps in the interval \(\mathrm{ ] t_{2},t_{2} +\theta] }\) is given by
\(\mathrm{ p_{0} \left( t_{2},t_{2}+ \theta \right) =\frac{t_{2}}{t_{2}+ \theta }=1-\frac{ \theta }{t_{2}}+o \left( \theta \right) }\),
and as
\(\mathrm{ p_{>0} \left( t_{2},t_{2}+ \theta \right) =\frac{ \theta }{t_{2}}+o \left( \theta \right) }\),
as is natural and will be proved later, then
\(\mathrm{ p_{1} \left( t_{2},t_{2}+ \theta \right) =p_{>0} \left( t_{2},t_{2}+ \theta \right) +o \left( \theta \right) }\).
The basic equation is
\(\mathrm{ P_{N+1} \left( t_{1},t_{2}+ \theta \right) =P_{N+1} \left( t_{1},t_{2} \right) \cdot p_{0} \left( t_{2},t_{2}+ \theta \right) }\)
\(\mathrm{ +P_{N} \left( t_{1},t_{2} \right) \cdot p_{1} \left( t_{2},t_{2}+ \theta \right) +o \left( \theta \right) }\),
which, as derivatives exist, can be written as
\(\mathrm{ P_{N+1} \left( t_{1},t_{2} \right) +\frac{ \partial~ P_{N+1} \left( t_{1},t_{2} \right) }{ \partial ~t_{2}} \theta =P_{N+1} \left( t_{1},t_{2} \right) ( 1-\frac{ \theta }{t_{2}} )+P_{N}(t_{1},t_{2})\frac{ \theta }{t_{2}}+o({\theta}) }\)
and gives, for \(\mathrm{ \theta \rightarrow 0 }\),
\(\mathrm{ \frac{ \partial ~P_{N+1} \left( t_{1},t_{2} \right) }{ \partial ~t_{2}}=\frac{1}{t_{2}} \left( P_{N} \left( t_{1},t_{2} \right) -P_{N+1} \left( t_{1},t_{2} \right) \right) }\).
If we denote by \(\mathrm{ \Delta \left( z;t_{1},t_{2} \right) = \sum _{0}^{ \infty}P_{N} \left( t_{1},t_{2} \right) z^{N} }\) the generating function of the probabilities, then \(\mathrm{ \Delta \left( 1;t_{1},t_{2} \right) =1 }\) and \(\mathrm{ \Delta \left( z;t_{1},t_{1} \right) =1 }\).
The generating function \(\mathrm{ \Delta }\), from the difference-differential equation for the probabilities, satisfies the partial differential equation
\(\mathrm{ \frac{1}{z} ( \frac{ \partial ~ \Delta }{ \partial ~t_{2}}-\frac{ \partial ~p_{0} \left( t_{1},t_{2} \right) }{ \partial ~t_{2}} ) =\frac{1}{t_{2}} ( \Delta -\frac{ \Delta -p_{0} \left( t_{1},t_{2} \right) }{z}) }\).
From \(\mathrm{ p_{0} \left( t_{1},t_{2} \right) =t_{1}/t_{2} }\) we obtain
\(\mathrm{ \frac{1}{ \Delta }\frac{ \partial ~ \Delta }{ \partial ~t_{2}}=\frac{z-1}{t_{2}} }\),
whose solution is evidently
\(\mathrm{ \Delta \left( z;t_{1},t_{2} \right) = ( \frac{t_{2}}{t_{1}} ) ^{z-1}=e^{ \left( z-1 \right) log \left( t_{2}/t_{1} \right) } }\)
owing to the known values of \(\mathrm{ \Delta \left( 1;t_{1},t_{2} \right) =1 }\) and \(\mathrm{ \Delta \left( z;t_{1},t_{1} \right) =1 }\). But \(\mathrm{ \Delta }\) is the generating function of the Poisson distribution with parameter \(\mathrm{ v=log \left( t_{2}/t_{1} \right) }\), and so
The jumps of \(Z(t)\) in the interval \(] t_{1},t_{2} ]\) have a Poisson distribution with parameter \(v=log~ \left( t_{2}/t_{1} \right) \left( >0 \right)\); the mean number and variance of jumps increase logarithmically with time.
Thus in the finite interval \(\mathrm{ ]0,t] }\) we must expect an infinity of jumps; the natural science interpretation is easy: as the process begins at \(\mathrm{ - \infty~ \left( Z \left( 0 \right) =- \infty \right) }\), we should expect an infinite number of jumps to obtain any finite value.
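The Poisson jump count is easy to verify by simulation: from any instant \(t\), the probability of no jump in \(\mathrm{ ]t,s] }\) is \(t/s\), so the next jump instant can be sampled as \(t/U\) with \(U\) uniform on \((0,1)\). A sketch (assuming NumPy):

```python
# Count jumps of Z(t) in ]t1,t2]; the count should be Poisson(log(t2/t1)).
import numpy as np

rng = np.random.default_rng(4)

def n_jumps(t1, t2):
    t, n = t1, 0
    while True:
        t = t / rng.uniform()     # next jump instant: Prob{next > s} = t/s
        if t > t2:
            return n
        n += 1

t1, t2 = 1.0, 8.0
counts = np.array([n_jumps(t1, t2) for _ in range(20000)])
print(counts.mean(), counts.var(), np.log(t2 / t1))   # all three agree
```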
Let us show, finally, that
\(\mathrm{ p_{1} \left( t,t+ \theta \right) =\frac{ \theta }{t}+o \left( \theta \right) }\); as \(\mathrm{ p_{>0} \left( t,t+ \theta \right) =\frac{ \theta }{t}+o \left( \theta \right) }\), it is sufficient to show that \(\mathrm{ p_{>1} \left( t,t+ \theta \right) =o \left( \theta \right) }\). When we have two or more jumps in \(\mathrm{ ]t,t+\theta] }\), let \(\mathrm{ \tilde{\theta} }\) denote the mid-point of the instants of the first and last jump and \(\mathrm{ \pi( \tilde{\theta}) }\) its distribution function. We can clearly write
\(\mathrm{ p_{>1} \left( t,t+ \theta \right) = \int _{0}^{ \theta } p_{>0}~ ( t,t+ \tilde{\theta})~ p_{>0}~( t+ \tilde{\theta} ,t+ \theta )~ d~ \pi ~( \tilde{\theta} ) }\)
\(\mathrm{ = \int _{0}^{ \theta }\frac{ \tilde{\theta} }{t+ \tilde{\theta}} \cdot \frac{ \theta - \tilde{\theta} }{t+ \theta }d~ \pi ( \tilde{\theta} ) \leq \frac{ \theta ^{2}}{t^{2}} \int _{0}^{ \theta }\frac{ \tilde{\theta} }{ \theta } \cdot \frac{ \theta - \tilde{\theta}}{ \theta }d~ \pi ( \tilde{\theta} ) \leq \frac{ \theta ^{2}}{t^{2}} }\),
with \(\mathrm{ p_{>1} \left( t,t+ \theta \right) =o \left( \theta \right) }\), as we wished to show.
The following representation (i.e., a stochastic process with the same finite-dimensional distributions) can be given:
For \( t \geq t_{0} \) the extremal process \(\mathrm{ Z \left( t \right) }\) is equivalent in law to
\(\mathrm{ Z \left( t \right) =Z \left( t_{0} \right) for~t \in [ t_{0},t_{1} [ }\) ,
\(\mathrm{ Z \left( t \right) =Z \left( t_{1} \right) for~t \in [ t_{1},t_{2} [ }\)
\( \dots\)
where \(t_{1}<t_{2}< \dots \) are the jump times after \(\mathrm{t_{0} }\), having the densities \(\mathrm{ t_{0}/t_{1}^{2} }\) for \(\mathrm{ t_{1} \geq t_{0} }\), \(\mathrm{ t_{1}/t_{2}^{2} }\) for \(\mathrm{ t_{2} \geq t_{1},\dots }\) and \(\mathrm{ Z \left( t_{1} \right) =x_{1},Z \left( t_{2} \right) =x_{2},\dots }\) are the values at the jump times \(t_{1}<t_{2}< \dots \) with the distribution functions
\(\mathrm{ \frac{ \Lambda ^{t_{1}-t_{0}} \left( x_{1} \right) - \Lambda ^{t_{1}-t_{0}} \left( x_{0} \right) }{1- \Lambda ^{t_{1}-t_{0}} \left( x_{0} \right) }~for~x_{1}>x_{0} }\),
\(\mathrm{ \frac{ \Lambda ^{t_{2}-t_{1}} \left( x_{2} \right) - \Lambda ^{t_{2}-t_{1}} \left( x_{1} \right) }{1- \Lambda ^{t_{2}-t_{1}} \left( x_{1} \right) }~for~x_{2}>x_{1} }\),
\( \dots\)
The proof is immediate because the probability of no jumps in the time interval \(\mathrm{ [ t_{0},t_{1} [ }\) is \(\mathrm{ t_{0}/t_{1} }\), and the probability that, given that there is a jump, its value is smaller than \(\mathrm{ x_{1} }\) is given by the transition probability function truncated at \(\mathrm{ x_{0} }\).
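This representation yields a simple path simulator; the sketch below (assuming NumPy; the helper names are ours) draws the jump times as \(\mathrm{ t_{i+1}=t_i/U }\) and the new levels by inverse transform from the truncated distribution above:

```python
# Piecewise-constant paths of the extremal process Z(t) on [t0, T].
import numpy as np

rng = np.random.default_rng(5)

def truncated_level(dt, x0):
    # inverse-transform draw from (Lambda^dt(x) - Lambda^dt(x0))/(1 - Lambda^dt(x0)), x > x0
    v0 = np.exp(-dt * np.exp(-x0))            # Lambda^dt(x0)
    v = v0 + rng.uniform() * (1.0 - v0)
    return -np.log(-np.log(v) / dt)

def path(t0, T):
    times, values = [t0], [rng.gumbel(loc=np.log(t0))]   # Z(t0) ~ Lambda^{t0}
    while True:
        t_next = times[-1] / rng.uniform()               # next jump time
        if t_next > T:
            return np.array(times), np.array(values)
        values.append(truncated_level(t_next - times[-1], values[-1]))
        times.append(t_next)

ts, zs = path(1.0, 10.0)
print(len(ts) - 1, "jumps; non-decreasing:", bool(np.all(np.diff(zs) > 0)))
```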
The study of the times of the jumps, as will be seen, is not helpful for statistical decision making.
As \(\mathrm{ log\, { t}}\) appears frequently in the expressions connected with extremal processes, we can consider the use of a new time variable \(\mathrm{ \tilde{t}= log~t }\) (logarithmic time) defining a new process \(\mathrm{ \tilde{Z}( \tilde{ t} ) =Z( e^{ \tilde{t}} ) }\), whose characterization could be transposed from that of \(\mathrm{ Z \left( t \right) }\). We will give a different one, equivalent but more natural for the case. Let us just remark that \(\mathrm{ Prob \{ \tilde{Z} ( \tilde{{t}} ) \leq x \} = \Lambda ( x-\tilde{t} ) }\).
Consider then the (transposed) extremal process:
1) \(\mathrm{ \tilde{Z} \left( \tilde{{t}} \right) }\) is defined over all the real (time) line;
2) \(\mathrm{ \tilde{Z} ( \tilde{{t_2}} ) =max ( \tilde{Z}( \tilde{{t_1}} ) , \tilde{E} ( \tilde{t_1},\tilde{{t_2}}) ) }\) where \(\mathrm{ \tilde{Z} \left( \tilde{{t_1}} \right) }\) and \(\mathrm{ \tilde{E} \left( \tilde{t_1},\tilde{{t_2}} \right) }\) are independent random variables;
3) \(\mathrm{ \tilde{Z} \left( \tilde{{t}} \right) }\) has the Gumbel distribution \(\mathrm{ \Lambda \left( x-\tilde{t} \right) }\)for convenient standardization of the time (or change of origin) and change of scale (i.e., a linear change of the clock).
Note immediately that, from (3), we have \(\mathrm{ \tilde{Z} \left( - \infty \right) =- \infty }\) with probability one.
Let us denote by \(\mathrm{ \Psi ( x_{1},\tilde{t}_{1};x_{2},\tilde{t}_{2} ) =Prob \{ \tilde{Z} \left( \tilde{t}_{1} \right) \leq x_{1} , \tilde{Z} \left( \tilde{t}_{2} \right) \leq x_{2} \} }\); we have \(\mathrm{ \Psi ( x_{1},\tilde{t}_{1};x_{2},\tilde{t}_{2} ) =Prob \{ \tilde{Z} \left( \tilde{t}_{1} \right) \leq min \left( x_{1},x_{2} \right) ,\tilde{E} \left( \tilde{t}_{1},\tilde{t}_{2} \right) \leq x_{2} \} = \Lambda( min \left( x_{1},x_{2} \right) -\tilde{t}_{1} ) \cdot G ( x_{2};\tilde{t}_{1},\tilde{t}_{2} ) }\), using hypothesis (3) for the distribution of \(\mathrm{ \tilde{Z} \left( \tilde{t}_{1} \right) }\), where \(\mathrm{ G ( x_{2};\tilde{t}_{1},\tilde{t}_{2} ) }\) denotes the distribution function of \(\mathrm{ \tilde{E} \left( \tilde{t}_{1},\tilde{t}_{2} \right) }\).
With hypothesis (3) we have also, putting \(\mathrm{ x_{1}=+ \infty, }\)
\(\mathrm{ \Lambda ( x_{2}-\tilde{t}_{2} ) = \Lambda ( x_{2}-\tilde{t}_{1} ) \cdot G( x_{2};\tilde{t}_{1},\tilde{t}_{2} ) }\)
and so
\(\mathrm{ \Psi ( x_{1}, \tilde{t}_{1};x_{2}, \tilde{t}_{2} ) = \Lambda ( min( x_{1},x_{2} ) - \tilde{t}_{1} ) ~\Lambda ( x_{2}-log ( e^{ \tilde{t}_{2}}-e^{ \tilde{t}_{1}} ) ) }\), as could be expected from the relation \(\mathrm{ t= e^{ \tilde{t}} }\).
Thus the mean value function is \(\mathrm{ \gamma + \tilde{t} }\), the variance \(\mathrm{ \frac{ \pi ^{2}}{6} }\) and the correlation function
\(\mathrm{ \tilde{ \rho} ( \tilde{t}_{2}-\tilde{t}_{1} ) = \rho ( {e^{\tilde{t}_{2}}}/{e^{\tilde{t}_{1}}} ) =\frac{6}{ \pi ^{2}}~R ( e^{- \vert \tilde{t}_{2}-\tilde{t}_{1} \vert } ) }\),
so \(\mathrm{ \tilde{Y}( \tilde{t} ) =\tilde{Z} ( \tilde{t} ) - \gamma -\tilde{t} }\) is a weakly stationary process.
\( \tilde{Y} \left( \tilde{t} \right) \) is thus an ergodic process (of second order) and we know that \( \frac{1}{\tilde{b}-\tilde{a}} \int _{\tilde{a}}^{\tilde{b}} \tilde{Y} ( \tilde{t} ) ~d~\tilde{t}\stackrel {ms}\rightarrow0~as~\tilde{b}-\tilde{a} \rightarrow + \infty \).
Passing to the original clock \(\mathrm{ t=e^{\tilde{ t}},b=e^{\tilde{ b}},a=e^{\tilde{ a}} } \), we get
\(\mathrm{ \frac{1}{log \left( b/a \right) } \int _{a}^{b}\frac{Z \left( t \right) - \gamma -log~t}{t}d~t\stackrel{ms}\rightarrow0~as\,{b}/{a} \rightarrow + \infty }\)
or, in a simpler form,
\(\mathrm{ \frac{1}{log \left( {b}/{a} \right) } \int _{a}^{b}\frac{Z \left( t \right) }{t}d~t-log\sqrt[]{a~b}\stackrel{m\,s}\rightarrow\gamma }\)
when \(\mathrm{ {b}/{a} \rightarrow + \infty }\). We have for the variance of \(\mathrm{ \frac{1}{log \left( {b}/{a} \right) } \int _{a}^{b}\frac{Z \left( t \right) }{t}d~t }\) the expression
\(\mathrm{ R \left( {a}/{b} \right) +\frac{2}{ \left( log \left( {b}/{a} \right) \right) ^{2}} \int _{a/b}^{1} ( log\frac{b}{a}+\frac{log~u}{2} ) \frac{ \left( log~u \right) ^{2}}{1-u}d~u }\).
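The ergodic limit can be illustrated numerically: between jumps \(Z\) is constant, so \(\mathrm{ \int_a^b Z(t)/t~d~t }\) is a finite sum of terms \(\mathrm{ Z_i~log(t_{i+1}/t_i) }\). A sketch (assuming NumPy; the path simulator mirrors the jump representation given earlier):

```python
# Check (1/log(b/a)) int_a^b Z(t)/t dt - log sqrt(ab) -> gamma in mean square.
import numpy as np

rng = np.random.default_rng(6)
gamma = 0.5772156649

def ergodic_average(a, b):
    t, z, acc = a, rng.gumbel(loc=np.log(a)), 0.0
    while True:
        t_next = min(t / rng.uniform(), b)    # next jump instant, capped at b
        acc += z * np.log(t_next / t)         # Z constant between jumps
        if t_next >= b:
            break
        dt = t_next - t                       # new level: truncated draw above z
        v0 = np.exp(-dt * np.exp(-z))
        z = -np.log(-np.log(v0 + rng.uniform() * (1.0 - v0)) / dt)
        t = t_next
    return acc / np.log(b / a) - np.log(np.sqrt(a * b))

vals = [ergodic_average(1.0, 1e6) for _ in range(200)]
print(np.mean(vals), gamma)    # close to Euler's constant for large b/a
```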
Notice that if the margins are Fréchet or Weibull, the correlation coefficient between \(\mathrm{ Z \left( t_{1} \right) }\) and \(\mathrm{ Z \left( t_{2} \right) \left( 0<t_{1}<t_{2} \right) }\) also depends only on \(\mathrm{ t_{1}/t_{2} }\) (and \(\mathrm{ \alpha }\) ), when the second moments exist.
Let \(\mathrm{ Z \left( t \right) ,t \geq 0 }\) be a (reduced) extremal process and consider the two (general) extremal processes
\(\mathrm{ X_{1} \left( t \right) = \lambda + \delta\, Z \left( t \right) \left( \lambda ~real, \delta >0 \right) }\)and
\(\mathrm{ X_{2} \left( t \right) = \delta ~Z \left( \beta ~t \right) \left( \delta , \beta >0 \right) }\).
Their mean values \(\mathrm{ \mu _{1} \left( t \right) = \lambda + \delta \left( \gamma +log~t \right) }\) and \(\mathrm{ \mu _{2} \left( t \right) = \delta \left( \gamma +log~ \left( \beta ~t \right) \right) }\)are equal if \(\mathrm{ \lambda = \delta ~log~ \beta }\); also the covariance function in both processes is \(\mathrm{ R ( \frac{min \left( s,t \right) }{max \left( s,t \right) } ) \delta ^{2} }\). This suggests that \(\mathrm{ X_{1} \left( t \right) }\) and \(\mathrm{ X_{2} \left( t \right) }\) can have identical distribution. In fact we have (with \(\mathrm{ t_{0}=0 }\) )
\(\mathrm{ Prob \{ { X_{1} \left( t_{i} \right) \leq x_{i} } \} =Prob \{ { Z \left( t_{i} \right) \leq ( {x_{i} - \lambda ) }/{ \delta } } \} }\)
\(\mathrm{ =\prod_{i=1}^{n} \Lambda ( ( \min_{j=i}^{n} x_{j}- \lambda )/{ \delta }-log ( t_{i}-t_{i-1}) ) }\)
and also
\(\mathrm{ Prob \{ X_{2} \left( t_{i} \right) \leq x_{i} \} =Prob \{ Z \left( \beta ~t_{i} \right) \leq {x_{i}}/{ \delta } \} } \)
\(\mathrm{ =\prod_{i=1}^{n} \Lambda ( \min_{j=i}^{n}\frac{x_{j}}{ \delta }-log \left( \beta \left( t_{i}-t_{i-1} \right) \right) ) }\)
\(\mathrm{ =\prod_{i=1}^{n} \Lambda ( ( \min_{j=i}^{n}x_{j}- \delta ~log~ \beta ) / \delta -log \left( t_{i}-t_{i-1} \right) ) }\)
which are equal if \(\mathrm{ \lambda = \delta ~log~ \beta } \) as said before.
Evidently any splitting \(\mathrm{ \left( \lambda _{1}, \beta _{1} \right) with~ \lambda = \delta ~log~ \beta = \lambda _{1}+ \delta ~log~ \beta _{1} } \)is admissible.
Owing to linearity we will prefer the parameterization in \(\mathrm{ \left( \lambda , \delta \right) } \); any change of time unit (scale) is given by \(\mathrm{ \lambda = \delta ~log~ \beta ~ } \).
Let \(\mathrm{ \{ Z_{n},n \geq 1 \} } \) be an extremal sequence, as described before, with mean values \(\mathrm{ \mu _{i}=M \left( Z_{i} \right) = \gamma +log~i } \) and covariance function \(\mathrm{ \sigma _{ij}=C \left( Z_{i},Z_{j} \right) =R ( \frac{min ( i,j ) }{max ( i,j ) }) } \) with \(\mathrm{ R \left( u \right) =- \int _{0}^{u}\frac{log~t}{1-t}d~t~ ( R \left( 0 \right) =0,R' \left( u \right) >0,R \left( 1 \right) = \pi ^{2}/6 ) } \), and consider the sequence \(\mathrm{ \{ X_{n},n \geq 1 \} }\), with \(\mathrm{ X_{n}= \lambda + \delta ~Z_{n} } \), \(\mathrm{ \lambda } \) real and \(\mathrm{ \delta ~>0 } \); as said before we can have a subsequence \(\mathrm{ \{ X_{np} \} } \) of \(\mathrm{ \{ X_{n } \} } \) with \(\mathrm{ p \left( >0 \right) } \) an integer, obtained from the original by periodic sampling, with the convenient change of parameters (the location one, in this case). We know then that \(\mathrm{ M \left( X_{i} \right) = \lambda + \delta ~ \mu _{i}~and~C ( X_{i},X_{j} ) = \delta ^{2} \sigma _{ij}= \delta ^{2}~R ( \frac{min ( i,j ) }{max ( i,j ) } ) } \).
As \(\mathrm{ X_{i}= \lambda + \delta ~Z_{i}= \lambda + \delta ~ \mu _{i}+ \varepsilon _{i} } \)with \(\mathrm{ \varepsilon _{i}= \delta \left( Z_{i}- \mu _{i} \right) } \) where \(\mathrm{ M \left( \varepsilon _{i} \right) =0 } \) and \(\mathrm{ C \left( \varepsilon _{i}, \varepsilon _{j} \right) = \delta ^{2} \sigma _{ij}= \delta ^{2}~R ( \frac{min \left( i,j \right) }{max \left( i,j \right) } ) } \), we know that the least-squares (unbiased) estimators of \(\mathrm{ \left( \lambda , \delta \right) } \) are given by
\(\mathrm{ \begin{bmatrix} \lambda^* \\[0.3em] \delta^* \end{bmatrix} = ( A^{T} \Sigma ^{-1}A ) ^{-1}A^{T} \Sigma ^{-1} [ X ] } \)
where \(\mathrm{ A^{T}= \begin{bmatrix} 1\dots1 \\[0.3em] \mu_1 \dots\mathrm{\mu_n} \end{bmatrix} }\), \(\mathrm{ \Sigma = \delta ^{2} [ \sigma _{ij} ] }\) (the variance-covariance matrix), \(\mathrm{ \left[ X \right] ^{T}= [ X_{1}, \dots ,X_{n} ] }\) (the transposed sample vector) and \(\mathrm{ det ( A^{T} \Sigma ^{-1}A ) }\) assumed to be \( \neq0\); see Silvey (1975). The least squares estimator of the quantile \(\mathrm{ \varphi= \lambda + \chi ~ \delta }\) is, as known, \(\mathrm{ \varphi^*= \lambda^* + \chi ~ \delta^* }\), which is unbiased \(\mathrm{ \left( M \left( \varphi ^{*} \right) =\varphi \right) }\) and has the variance \(\mathrm{ V \left( \varphi^{*} \right) = \left[ 1~ \chi \right] ( A^{T} \Sigma ^{-1}A) ^{-1} \begin{bmatrix} 1 \\[0.3em] \chi \end{bmatrix} =V \left( \lambda ^{*} \right) +2~ \chi ~C \left( \lambda ^{*}, \delta ^{*} \right) + \chi ^{2}V \left( \delta ^{*} \right) }\) with \(\mathrm{ ( A^{T} \Sigma ^{-1}A ) ^{-1}= \begin{bmatrix} V(\lambda^*)& C(\lambda^*,\delta^*) \\[0.3em] C(\lambda^*,\delta^*)& V(\delta^*) \end{bmatrix} }\).
From the form of \(\mathrm{ \Sigma = \delta ^{2} [ R \left( min \left( i,j \right) /max \left( i,j \right) \right)] }\), the least squares estimators \(\mathrm{ \left( \lambda ^{*},~ \delta ^{*} \right) }\) do not seem very easy to manage.
Another pair of (unbiased) estimators is given by minimizing
\(\mathrm{ \sum _{1}^{n} \varepsilon _{i}^{2} = \left[ X- \lambda - \delta ~ \mu \right] ^{T} \left[ X- \lambda - \delta ~ \mu \right] }\) as
\(\mathrm{ \mathrm{ \begin{bmatrix} \lambda^{**} \\[0.3em] \delta^{**} \end{bmatrix} }= \left( A^{T}A \right) ^{-1}~A^{T} \left[ X \right] =\mathrm{ \begin{bmatrix}n&\mathrm{ \sum_i \mu_i } \\[0.3em] \mathrm{ \sum_i \mu_i } &\mathrm{ \sum_{i} \mu_{i}^2 } \end{bmatrix} }^{-1} \mathrm{ \begin{bmatrix} 1\dots1 \\[0.3em] \mu_1 \dots\mathrm{\mu_n} \end{bmatrix} } \left[ \begin{matrix}X_{1}\\\dots\\\mathrm{X_{n}}\\\end{matrix} \right] }\)
\(=\mathrm{ \begin{bmatrix}n&\mathrm{ \sum_i \mu_i } \\[0.3em] \mathrm{ \sum_i \mu_i } &\mathrm{ \sum_{i} \mu_{i}^2 } \end{bmatrix} }^{-1} \mathrm{ \begin{bmatrix}\mathrm{ \sum_i X_i } \\[0.3em] \mathrm{ \sum_i \mu_i X_i } \end{bmatrix} } \)
\(\mathrm{ =\frac{1}{n \sum _{i}^{} \mu _{i}^{2}- \left( \sum _{i}^{} \mu _{i} \right) ^{2}} \mathrm{ \begin{bmatrix}\mathrm{ \sum_i \mu_{i}^{2}\cdot \sum_i X_i -\sum_i \mu_i \cdot\sum_i \mu_i X_i} \\[0.3em] \mathrm{- \sum_i \mu_i \cdot \sum_i X_i+n \sum_i\mu_i X_i } \end{bmatrix} } }\).
The estimator of \(\varphi\) is, then, \(\varphi^{**}=\lambda^{**}+\chi\, \delta^{**}\) and
\(\mathrm{ V \left( \varphi^{**} \right) = \left[ 1~ \chi \right] \left( A^{T}A \right) ^{-1} \left( A^{T} \Sigma A \right) \left( A^{T}A \right) ^{-1} \begin{bmatrix} 1 \\[0.3em] \chi \end{bmatrix} }\).
The efficiency of \(\varphi^{**}\) with respect to \(\varphi^{*}\) is given by
\(\mathrm{ ~\frac{V \left( \lambda ^{*} \right) +2~ \chi ~C \left( \lambda ^{*},~ \delta ^{*} \right) + \chi ^{2}~V \left( \delta ^{*} \right) }{V \left( \lambda ^{**} \right) +2~ \chi ~C \left( \lambda ^{**},~ \delta ^{**} \right) + \chi ^{2}~V \left( \delta ^{**} \right) } }\)
where
\(\mathrm{ \begin{bmatrix} V(\lambda^{**})& C(\lambda^{**},\delta^{**}) \\[0.3em] C(\lambda^{**},\delta^{**})& V(\delta^{**}) \end{bmatrix} = ( A^{T}A) ^{-1} ( A^{T} \Sigma A ) ( A^{T}A ) ^{-1} }\).
The general efficiency of the system \(\mathrm{ \left( \lambda ^{**}, \delta ^{**} \right) }\) with respect to \(\mathrm{ \left( \lambda ^{*},~ \delta ^{*} \right) }\) can be defined in two ways:
a) as the value of
\(\mathrm{ \frac{det ( A^{T} \Sigma ^{-1}A ) ^{-1}}{det \{ ( A^{T}A ) ^{-1} ( A^{T} \Sigma A ) ( A^{T}A ) ^{-1} \} }=\frac{det^{2}( A^{T}A ) }{det ( A^{T} \Sigma A) ~det ( A^{T} \Sigma ^{-1}A ) } }\);
see Cramér (1946);
b) as the smallest root of
\(\mathrm{ det \left( \begin{bmatrix} V(\lambda^*)& C(\lambda^*,\delta^*) \\[0.3em] C(\lambda^*,\delta^*)& V(\delta^*) \end{bmatrix} - \xi \begin{bmatrix} V(\lambda^{**})& C(\lambda^{**},\delta^{**}) \\[0.3em] C(\lambda^{**},\delta^{**})& V(\delta^{**}) \end{bmatrix} \right) =0 }\);
see Tiago de Oliveira (1972) and (1982).
Both definitions impose a lot of numerical calculation, beginning with the inversion of \(\mathrm{ \Sigma = \delta ^{2} [ R \left( min \left( i,j \right) /max \left( i,j \right) \right)] }\), and do not seem easy to manipulate. If \(\mathrm{ n=2 }\) we get for both estimators
\(\mathrm{ \mathrm{ \begin{bmatrix} \lambda^{*} \\[0.3em] \delta^{*} \end{bmatrix} } = \mathrm{ \begin{bmatrix} \lambda^{**} \\[0.3em] \delta^{**} \end{bmatrix} } =\frac{1}{log~2} \mathrm{ \begin{bmatrix} (\gamma+log\,2)\,X_1-\gamma\,X_2 \\[0.3em] \mathrm{X_2-X_1} \end{bmatrix} } }\)
and both efficiencies are equal to 1.
In general we use the estimators:
\(\mathrm{ \lambda ^{**}=\frac{ ( \gamma \sum _{i}^{}log~i+ \sum _{i}^{} \left( log~i \right) ^{2}) \cdot \sum _{i}^{}X_{i}- \left( n~ \gamma + \sum _{i}^{}log~i \right) \cdot \sum _{i}^{} \left( log~i \right) X_{i}}{n \sum _{i}^{} \left( log~i \right) ^{2}- \left( \sum _{i}^{}log~i \right) ^{2}} }\)
\(\mathrm{ \delta ^{**}=\frac{n \sum _{i}^{} \left( log~i \right) X_{i}- \sum _{i}^{}log~i \cdot \sum _{i}^{}X_{i}}{n \sum _{i}^{} \left( log~i \right) ^{2}- \left( \sum _{i}^{}log~i \right) ^{2}} }\),
whose variance-covariance matrix was given above.
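The sketch below (assuming NumPy; the values of \(\lambda\) and \(\delta\) are made up for illustration) implements these explicit estimators on a simulated extremal sequence:

```python
# Least-squares estimators (lambda**, delta**) for an extremal sequence.
import numpy as np

rng = np.random.default_rng(7)
gamma = 0.5772156649

lam, delta, n = 3.0, 2.0, 200                      # made-up true parameters
Z = np.maximum.accumulate(rng.gumbel(size=n))      # reduced extremal sequence
X = lam + delta * Z
l = np.log(np.arange(1, n + 1))                    # log i, i = 1..n

den = n * (l**2).sum() - l.sum()**2
delta_hat = (n * (l * X).sum() - l.sum() * X.sum()) / den
lam_hat = ((gamma * l.sum() + (l**2).sum()) * X.sum()
           - (n * gamma + l.sum()) * (l * X).sum()) / den
print(lam_hat, delta_hat)   # unbiased, but noisy on a single correlated path
```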
Consider now the extremal process \(\mathrm{ Z \left( t \right) \left( t \geq 0 \right) }\) (Tiago de Oliveira, 1968) whose mean value function is \(\mathrm{ \mu \left( t \right) = \gamma +log~t }\) and covariance function \(\mathrm{ \sigma \left( s,t \right) =C \left( Z \left( s \right) ,Z \left( t \right) \right) =R \left( min \left( s,t \right) /max \left( s,t \right) \right) }\) where, as before, \(\mathrm{ R \left( u \right) =- \int _{0}^{u}\frac{log~t}{1-t}d~t,~R \left( 0 \right) =0,~R' \left( u \right) >0,~R \left( 1 \right) = \pi ^{2}/6 }\); the correlation function is \(\mathrm{ \rho \left( s,t \right) =\frac{6}{ \pi ^{2}}\,R \left( min \left( s,t \right) /max \left( s,t \right) \right) }\).
Let us consider the extremal process \(\mathrm{ X \left( t \right) = \lambda + \delta\, Z \left( t \right) ,t \geq 0 }\). It is well known (see above) that there is an essential unidentifiability problem between the use of \(\mathrm{ X \left( t \right) ,X_{1} \left( t \right) = \lambda _{1}+ \delta ~Z \left( \beta _{1}\,t \right) }\)and \(\mathrm{ X_{2} \left( t \right) = \delta ~Z \left( \beta _{2}\,t \right) }\) with the equalities \(\mathrm{ \lambda = \lambda _{1}+ \delta ~log~ \beta _{1}= \delta ~log~ \beta _{2} }\). We have chosen the present formulation because then the parameters are linear. We will suppose that \(\mathrm{ X \left( t \right) }\) is known (observed) for the interval \(\mathrm{ \left[ a,b \right] ~or~ ] a,b [ }\), which is irrelevant, with \(\mathrm{ 0<a<b }\) .
We could approach the problem by a procedure analogous to least squares, substituting the sums by integrals, i.e., as \(\mathrm{ X \left( t \right) = \lambda + \delta ~Z \left( t \right) = \lambda + \delta \left( \gamma +log~t \right) + \varepsilon \left( t \right) }\) (\(\mathrm{ \varepsilon \left( t \right) = \delta \left( Z \left( t \right) - \gamma -log~t \right) }\), with mean value zero), by minimizing \(\mathrm{ \int _{a}^{b} \int _{a}^{b} \left( X \left( s \right) - \lambda - \delta \left( \gamma +log~s \right) \right) \bar{\sigma} \left( s,t \right) \left( X \left( t \right) - \lambda - \delta \left( \gamma +log~t \right) \right) d~s~d~t }\) where \(\mathrm{ \bar{\sigma} \left( s,t \right) }\) is the “inverse” of the covariance function \(\mathrm{ \sigma \left( s,t \right) }\), analogous to the relation between \(\mathrm{ \Sigma ^{-1} }\) and \(\mathrm{ \Sigma }\). The solution is
\(\mathrm{ \begin{bmatrix} \lambda^{*} \\[0.3em] \delta^{*} \end{bmatrix} =\frac{1}{ \bar{\sigma} _{0}~ \bar{\sigma} _{2}- \bar{\sigma} _{1}^{2}} \begin{bmatrix} \mathrm{ \int _{a}^{b} \int (\bar{\sigma}_2- \bar{\sigma}_1\,\,\mu(t)) ~ \bar{\sigma} \left( s,t \right) \,X(s)~d~s~~d~t }\\[0.3em] \mathrm{ \int _{a}^{b} \int (\bar{\sigma}_0\,\,\mu(t)- \bar{\sigma}_1) ~ \bar{\sigma} \left( s,t \right) \,X(s)~d~s~~d~t }\end{bmatrix} }\)
where \(\mathrm{ \bar{\sigma} _{0}= \int _{a}^{b} \int \bar{\sigma} \left( s,t \right) d~s~~d~t, \bar{\sigma}_{1}= \int _{a}^{b} \int \bar{\sigma} \left( s,t \right) \mu \left( s \right) d~s~~d~t }\) and \(\mathrm{ \bar{\sigma} _{2}= \int _{a}^{b} \int \bar{\sigma} \left( s,t \right) \mu \left( s \right) \mu \left( t \right) d~s~~d~t }\); \(\mathrm{ \left( \lambda ^{*}, \delta ^{*} \right) }\) are unbiased estimators of \(\mathrm{ \left( \lambda , \delta \right) }\). The same result can be obtained by seeking the function \(\mathrm{ f \left( t \right) }\) for which \(\mathrm{ \int _{a}^{b}f \left( t \right) X \left( t \right) d~t }\) is the least-squares (unbiased) estimator of \(\mathrm{ \varphi= \lambda + \chi ~ \delta }\), which can be shown, by a variational method, to be \(\mathrm{ \varphi^{*}= \lambda ^{*}+ \chi ~ \delta ^{*} }\). But the result is unhelpful because the determination of \(\mathrm{ \bar{\sigma} \left( s,t \right) }\) is practically impossible.
As before, for extremal sequences, it is natural to seek the estimators \(\mathrm{ \left( \lambda ^{**}, \delta ^{**} \right) }\) such that the (empirical) mean-square distance
\(\mathrm{ \int _{a}^{b} \left( X \left( t \right) - \lambda - \delta ~ \mu \left( t \right) \right) ^{2} d~t = \int _{a}^{b} \left( X \left( t \right) - \lambda - \delta \left( \gamma +log\,t \right) \right) ^{2}d~t }\)
is a minimum. We get the also unbiased estimators
\(\mathrm{ \mathrm{ \begin{bmatrix} \lambda^{**} \\[0.3em] \delta^{**} \end{bmatrix} } =\frac{1}{ \int _{a}^{b}d~t \cdot \int _{a}^{b} \mu ^{2} \left( t \right) d~t- ( \int _{a}^{b} \mu \left( t \right) d~t ) ^{2}} }\)
\( \mathrm{ \begin{bmatrix} { \int _{a}^{b} \mu ^{2} \left( t \right) d~t \cdot \, \int _{a}^{b}\,X(t)\,d~t- \int _{a}^{b} \mu \left( t \right) d~t\cdot \int _{a}^{b}\mu (t)\,X(t)d~t } \\[0.3em] \mathrm{ \int _{a}^{b} d~t \cdot \int _{a}^{b}\mu (t)X(t)d~t - \int _{a}^{b} \mu \left( t \right) d~t \cdot \int _{a}^{b}X(t)d~t } \end{bmatrix} }\)
Evidently we could choose two functions \( l\mathrm{ \left( t \right) }\) and \( \mathrm{ d \left( t \right) }\) and take \( \mathrm{( \int _{a}^{b}\mathit{l} \left( t \right) X \left( t \right) d~t, \int _{a}^{b}d \left( t \right) X \left( t \right) d~t ) }\) as estimators of \(\mathrm{ \left( \lambda , \delta \right) }\) with the condition of quasi-linearity or of unbiasedness; \(\mathrm{ \left( \lambda ^{**}, \delta ^{**} \right) }\) is one of these choices, optimization being practically impossible.
The complex structure of these estimators suggests the use of the ergodic theorem, when possible. As shown before,
\(\mathrm{ \frac{1}{log \left( b/a \right) } \int _{a}^{b}\frac{Z \left( t \right) }{t}d~t-log\sqrt[]{a~b}\stackrel{m~s}\rightarrow \gamma ~as\,\frac{b}{a} \rightarrow + \infty }\)
or, under a more general form,
\(\mathrm{ \frac{1}{log \left( b/a \right) } \int _{a}^{b}\frac{X \left( t \right) }{t}d~t- \delta \,log\sqrt[]{a~b}\stackrel{m~s}\rightarrow \lambda + \gamma ~ \delta }\).
If we take \(\mathrm{ log\sqrt[]{a\,b}= \chi - \gamma ~with~{b}/{a} \rightarrow + \infty,~i.e.,~a=e^{2 \left( \chi - \gamma \right) }/b,~b \rightarrow + \infty }\), we see that
\(\mathrm{ \frac{1}{2 \left( log\,b- \left( \chi - \gamma \right) \right) } \int _{e^{2 \left( \chi - \gamma \right) }/b}^{b}\frac{X \left( t \right) }{t}d~t\stackrel{m~s}\rightarrow \lambda + \chi ~ \delta ~when~b \rightarrow + \infty }\).
As, then, \(\mathrm{ {b}/{a}=b^{2}/e^{2 \left( \chi - \gamma \right) }}\) increases with \(\mathrm{ {b} }\), if we know \(\mathrm{ X \left( t \right) }\) in an interval \(\mathrm{ \left[ A,B \right] }\) we can estimate \(\mathrm{ \varphi= \lambda + \chi ~ \delta }\) taking \(\mathrm{ b=B }\) and \(\mathrm{ a=e^{2 \left( \chi - \gamma \right)} /B~if~a>A }\). Thus the estimation is obtained by taking \(\mathrm{ \varphi ^{*}=\frac{1}{2 \left( log\,B- \left( \chi - \gamma \right) \right) } \int _{e^{2 \left( \chi - \gamma \right) }/B}^{B}\frac{X \left( t \right) }{t}d~t }\) when \(\mathrm{ AB<e^{2 \left( \chi - \gamma \right)},~i.e.,~ \chi > \gamma +log\sqrt[]{AB} }\), thus allowing the estimation of large quantiles, which is important in the applications.
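A sketch of this quantile estimator on one simulated path (assuming NumPy; the path simulator mirrors the jump representation given earlier, and the values of \(\lambda\), \(\delta\), \(\chi\) are made up):

```python
# phi* = (1/(2(log B - (chi - gamma)))) int_a^B X(t)/t dt, a = e^{2(chi-gamma)}/B.
import numpy as np

rng = np.random.default_rng(8)
gamma = 0.5772156649

def z_path(a, b):                       # jump times and levels of Z on [a, b]
    ts, zs = [a], [rng.gumbel(loc=np.log(a))]
    while (t := ts[-1] / rng.uniform()) <= b:
        dt = t - ts[-1]
        v0 = np.exp(-dt * np.exp(-zs[-1]))
        zs.append(-np.log(-np.log(v0 + rng.uniform() * (1.0 - v0)) / dt))
        ts.append(t)
    return np.array(ts + [b]), np.array(zs)

lam, delta, chi = 3.0, 2.0, 10.0        # made-up parameters
A, B = 1.0, 1e6
a = np.exp(2 * (chi - gamma)) / B       # valid since chi > gamma + log sqrt(AB)
ts, zs = z_path(a, B)
X = lam + delta * zs                    # X(t) is constant between jumps
integral = (X * np.log(ts[1:] / ts[:-1])).sum()
phi_star = integral / (2 * (np.log(B) - (chi - gamma)))
print(phi_star, lam + chi * delta)      # estimator vs. the true quantile
```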
The variance of this estimator of \(\mathrm{ \lambda + \chi ~ \delta }\) is given by
\(\mathrm{ \frac{V \left( \varphi^{*} \right) }{ \delta ^{2}}=\frac{1}{4 \left( log~B- \left( \chi - \gamma \right) \right) ^{2}} \iint\limits _{[{e^{2 \left( \chi - \gamma \right)} }/{B},~B]^{2}}\frac{ \sigma \left( s,t \right) }{s~t}~d~s~d~t }\)
\(\mathrm{ =\frac{1}{2 \left( log~B- \chi + \gamma \right) ^{2}} \iint\limits _{e^{2 \left( \chi - \gamma \right)} /B \leq s \leq t \leq B}R \left( s/t \right) \frac{d~s~d~t}{s~t}}\)
\(\mathrm{ =\frac{1}{2 \left( log~B- \chi + \gamma \right) ^{2}} \int _{e^{2 \left( \chi - \gamma \right)} /B^{2}}^{1}\frac{R \left( u \right) }{u}~d~u \int _{e^{2 \left( \chi - \gamma \right)} /(B~u)}^{B}\frac{d~t}{t} }\)
\(\mathrm{ =\frac{1}{2 \left( log~B- \chi + \gamma \right) ^{2}} \int _{e^{2 \left( \chi - \gamma \right)} /B^{2}}^{1}\frac{R \left( u \right) }{u} ( 2 \left( log~B- \chi + \gamma \right) +log~u ) ~d~u }\)
and, integrating by parts (with \(\mathrm{ R' \left( u \right) =-log~u/ \left( 1-u \right) }\)),
\(\mathrm{ \frac{V \left( \varphi^{*} \right) }{ \delta ^{2}}=R ( {e^{2 \left( \chi - \gamma \right)} }/{B^{2}} ) +\frac{1}{2 \left( log~B- \chi + \gamma \right) ^{2}} \int _{{e^{2 \left( \chi - \gamma \right)} }/{B^{2}}}^{1} ( 2 \left( log~B- \chi + \gamma \right) +\frac{log~u}{2} ) \frac{log^{2}u}{1-u}~d~u }\),
in agreement with the variance expression given above for the ergodic average, with \(\mathrm{ log \left( b/a \right) =2 \left( log~B- \chi + \gamma \right) }\).
Analogously we could choose a point \(\mathrm{ C \left( 0<A<C<B \right) }\) and study the stochastic integrals
\(\mathrm{ \frac{1}{log \left( C/A \right) } \int _{A}^{C}\frac{X \left( t \right) }{t}d~t }\) and \(\mathrm{ \frac{1}{log \left( B/C \right) } \int _{C}^{B}\frac{X \left( t \right) }{t}d~t }\).
To have \(\mathrm{ min ( log\frac{C}{A},log\frac{B}{C}) }\) the largest possible we must take \(\mathrm{ C=\sqrt[]{AB}~with~\frac{C}{A}=\sqrt[]{B/A}=\frac{B}{C} }\). The convex combination of those two integrals with coefficients \(\mathrm{ \frac{log\sqrt[]{CB}- \left( \chi - \gamma \right) }{log\sqrt[]{B/A}} }\) and \(\mathrm{ \frac{ \left( \chi - \gamma \right) -log\sqrt[]{AC}}{log\sqrt[]{B/A}} }\) estimates, unbiasedly and in mean square, the quantile \(\mathrm{ \lambda + \chi ~ \delta }\); consequently, for \(\mathrm{ \chi = \gamma +log\,t }\), \(\mathrm{ \lambda ^{*}+ \left( \gamma +log~t \right) \delta ^{*} }\) is the natural linear predictor of \(\mathrm{ X \left( t \right) }\).
An important question in the integrals above is the assumed existence of jumps in the interval of observation \(\mathrm{ \left[ A,B \right] }\). Suppose that \(\mathrm{ X \left( t \right) }\) has no jumps in \(\mathrm{ \left[ A,B \right] }\), a fact that has probability \(\mathrm{ A/B }\). Denoting by \(\mathrm{ \bar{X} }\) the (constant) value of \(\mathrm{ X \left( t \right) }\) in the interval, the estimator becomes
\(\mathrm{( \int _{A}^{B}\ell \left( t \right) X \left( t \right) d~t, \int _{A}^{B}d \left( t \right) X \left( t \right) d\,t) = ( \bar{X} \int _{A}^{B} \ell \left( t \right) d\,t ,\bar{X} \int _{A}^{B}d \left( t \right) d~t ) = \left( \bar{X},0 \right) \left( ! \right) }\), as by the quasi-linearity we must have \(\mathrm{ \int _{A}^{B}\ell \left( t \right) d\,t=1~and~ \int _{A}^{B}d \left( t \right) d\,t=0 }\).
The probability of being able to estimate is, then, \(\mathrm{ 1-A/B }\).
In that case it is easy to show that \(\mathrm{ \frac{X \left( B \right) -X \left( A \right) }{log \left( B/A \right) }\stackrel{m\,s}\rightarrow \delta }\) if \(\mathrm{ {B}/{A} \rightarrow + \infty }\), but an analogous estimator of \(\mathrm{ \lambda }\) based on the extreme values \(\mathrm{ X \left( A \right) }\) and \(\mathrm{ X \left( B \right) }\) does not converge in m.s. to \(\mathrm{ \lambda }\).
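A quick check of the mean-square convergence of this estimator of \(\delta\) (a sketch assuming NumPy; parameter values are made up):

```python
# (X(B) - X(A)) / log(B/A) concentrates on delta as B/A grows.
import numpy as np

rng = np.random.default_rng(9)
lam, delta = 3.0, 2.0                   # made-up parameters
A, B = 1.0, 1e8

ZA = rng.gumbel(loc=np.log(A), size=100000)                        # Z(A)
ZB = np.maximum(ZA, rng.gumbel(loc=np.log(B - A), size=ZA.size))   # Z(B)
est = delta * (ZB - ZA) / np.log(B / A)  # = (X(B) - X(A)) / log(B/A)
print(est.mean(), est.std(), delta)
```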
The study of the jump process does not seem to illuminate the structure of the location parameter \(\mathrm{ \left( \lambda \right) }\) or of the change of time unit \(\mathrm{ \left( \beta \right) }\). In fact, the extremal process \(\mathrm{ Z \left( t \right) }\), and the processes \(\mathrm{ \lambda + \delta ~Z \left( t \right) ~or~ \delta ~Z \left( \beta ~t \right) }\), have a random number of jumps \(\mathrm{ N \left( t_{1},t_{2} \right) }\) in the interval \(\mathrm{ \left[ t_{1},t_{2} \right] \left( 0<t_{1} \right) }\) with a Poisson distribution with parameter \(\mathrm{ \left( =mean~value=variance \right) ~log~\frac{t_{2}}{t_{1}} }\); in \(\mathrm{ \left[ 0,t \right] }\) there is an infinity of jumps with probability one. If \(\mathrm{ {t_{1}} }\) is a time instant (be it a jump instant or not) the probability that the next jump instant is \(\mathrm{ <t_{2} }\) is given by \(\mathrm{ 1-t_{1}/t_{2} \left( t_{1}<t_{2} \right) }\). In any case a change of time unit does not alter the structure.
The mean value of inter-jump times is
\(\mathrm{ M \left( t_{i+1}-t_{i} \right) = \int _{t_{i}}^{+ \infty} \left( t_{i+1}-t_{i} \right) ~d \left( 1-t_{i}/t_{i+1} \right) =+ \infty }\),
which shows that the average of inter-jump times is not useful. Also the median of \(\mathrm{ t_{i+1} }\) is \(\mathrm{ 2\, t_{i} }\), independent of the time unit.
As \(\mathrm{ M ( \left( t_{i+1}-t_{i} \right) ^{ \alpha } ) =t_{i}^{ \alpha } \int _{0}^{+ \infty} \xi ^{ \alpha }~d ( \frac{ \xi }{1+ \xi } ) =t_{i}^{ \alpha }~\frac{ \alpha ~ \pi }{sin~ \alpha \pi } }\) if \(\mathrm{ \alpha <1 }\), nothing is gained because the effect of a change of time unit is cancelled; the variance exists if \(\mathrm{ \alpha <1 /2 }\). Also we have
\(\mathrm{ Prob \{ \frac{t_{i+1}-t_{i}}{t_{i}} \leq \xi \} =\frac{ \xi }{ \xi +1} }\).
For other details see Deheuvels (1981, 1983) and references therein.
Fisz, M., 1962. Probability Theory and Mathematical Statistics, Wiley, New York.
Galambos, J., 1978. The Asymptotic Theory of Extreme Order Statistics, Wiley, New York; 2nd ed. (1987), Krieger Publ.
Gumbel, E. J., 1958. Statistics of Extremes, Columbia University Press, New York.
Mann, Henry B., 1953. An Introduction to the Theory of Stochastic Processes with Continuous Parameter. Nat. Bur. Standards, Appl. Math. Series, (24), Washington.
Resnick, S. I., 1973. Extreme processes and record time values. J. Appl. Prob., 10, 863-868.
Resnick, S. I., 1975. Weak convergence to extremal processes. Ann. Prob., 3, 951-960.
Silvey, S. D., 1975. Statistical Inference, Chapman and Hall, London.
Tiago de Oliveira, J., 1972. Statistics for Gumbel and Fréchet distributions. Structural Safety and Reliability, A. M. Freudenthal ed., 91-105.
Tiago de Oliveira, J., 1982. A definition of estimator efficiency in k-parameter case. Ann. Inst. Statist. Math., 34, (3), A, 411-421, Tokyo.
Tiago de Oliveira, J., 1962/63. Structure theory of bivariate extremes: extensions. Estudos Mat., Estatist. Econometria, VII, 165-194, Lisboa.
Tiago de Oliveira, J., 1968. Extremal processes; definition and properties. Publ. Inst. Statist. Univ. Paris, XII, (2), 25-36.