
Statistical Theory of Extremes

Introduction

José Tiago de Fonseca Oliveira 1

1.Academia das Ciências de Lisboa (Lisbon Academy of Sciences), Lisbon, Portugal.



Highlights

  1. The purpose of STATISTICAL THEORY OF EXTREMES is to deal with, analyze and predict (or forecast) aspects of natural phenomena that correspond to the largest or smallest values of sampled data, or to over- or underpassing some level, which sometimes leads to natural disasters.
  2. Floods, large fire claims, heavy rains, gusts of wind, and large waves are examples of maxima or largest values of samples, while droughts, breaking strength of materials, failure of equipment or apparatus, low temperatures, etc. are examples of minima or smallest values of samples.
  3. In fact, in data analysis, we can sometimes proceed through a sequence of models, adapted to the case.
  4. The applications will be based on the use of asymptotic distributions as approximations to the description of extremes, useful for forecasting: so obtaining asymptotic distributions, as well as some reference to the speed of convergence, plays an essential role in the book.
  5. Case studies, as in the last chapter and in the statistical chapters of the book, may be of help to practitioners.

Abstract

Statistical Theory of Extremes deals with the analysis and prediction of aspects of natural phenomena using univariate or multivariate sampled data. Natural phenomena such as floods, large fire claims, heavy rains, gusts of wind, and large waves are analysed using suitable or available samples of random sequences or of stochastic processes. The interplay between order statistics and exceedances is discussed, together with the exact distribution of order statistics and the asymptotic behaviour of the sample quantiles, i.e. of central order statistics, as an approximation tool for the efficient estimation of "extreme" quantiles for the design of structures, the evaluation of extreme conditions, or the forecasting of catastrophes. The book deals with the behaviour of maxima, the conversion to minima results being immediate.

Keywords

Asymptotic behaviours, Asymptotic distribution, Exceedances, Order statistics, Weibull distribution, Stochastic model, Experimental digression

1. The general framework

The purpose of STATISTICAL THEORY OF EXTREMES is to deal with, analyse and predict (or forecast) aspects of natural phenomena that correspond to the largest or smallest values of sampled data, or to over- or underpassing some level, which sometimes leads to natural disasters. Floods, large fire claims, heavy rains, gusts of wind, and large waves are examples of maxima or largest values of samples, while droughts, breaking strength of materials, failure of equipment or apparatus, low temperatures, etc. are examples of minima or smallest values of samples, that is, of extremes of samples or extreme order statistics; many times not only the first but also the second, third, etc. extremes are relevant.

Those problems can be analysed using suitable or available samples — according to the manageability of technique, error allowance for models, convenience and simplicity, etc. — as univariate samples, multivariate samples or excursion(s) of random sequences or of stochastic processes (if time is continuous). Evidently, from the philosophical point of view, as a rule, natural phenomena should be considered as multivariate random sequences or stochastic processes (according to the discreteness or the continuity of the time set). This procedure, however sound it may be, would by its complication not allow for a (practical) description of the observations (the past and/or present), and even less for prediction of the next observations (the future), which is needed to legitimate sound inference and/or decision design.

In fact, in data analysis, we can sometimes proceed through a sequence of models, adapted to the case. Case studies, as in the last chapter and in the statistical chapters of the book, may be of help to practitioners.

We have not tried to prove all results, to keep the book to a manageable size (though not necessarily completely balanced) and to allow for easier use for data analysis and design; essential references are given for the proved (sometimes in a new or different way) or unproved results.

Basically, the applications will be based on the use of asymptotic distributions as approximations (better or worse) to the description of extremes, useful for forecasting: so obtaining asymptotic distributions, as well as some reference to the speed of convergence, plays an essential role in the book.

If the approximation is good we will, with good efficiency, estimate or predict “extreme” quantiles (for large or small probabilities) for the design of structures, evaluation of extreme conditions, forecasting catastrophes, etc.

As a rule, the behaviour of sums and of extremes (maxima and minima) of samples are different and not strongly related, although there are some connexions as regards the convergence of averages and of extremes. But, as a general rule, in the i.i.d. case, characteristic functions for sums vs. distribution functions for maxima (or survival functions for minima), and the linear combination aX+bY vs. the maximal combination max(X+a, Y+b), seem to play similar or dual roles; the duality, however, almost stops here. The Central Limit Theorem leads to the normal distribution (at least when the variance exists) or, more generally, to the infinitely divisible distributions with a central role for the normal, but the Extremal Limit Theorem leads to three asymptotic distributions (which can be integrated in one form, with a shape parameter) where the Gumbel distribution plays a central role: not only are both distributions indefinitely differentiable but, also, sums of Gumbel random variables are asymptotically normal, and maxima (or minima) of normal random variables have asymptotically the Gumbel distribution. It does not seem that we can extend the analogies, and this "experimental" digression will play only a vague role — behind the scenes — in the book. But see Annex 1 of Part 1 for more details that extend to bivariate distributions.
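As a rough numerical illustration of these two links (a sketch only; it assumes Python with NumPy and SciPy, which the book does not use, and the block size, replication counts and probability levels are arbitrary choices), one can compare block maxima of normal variables with a fitted Gumbel distribution, and sums of Gumbel variables with a fitted normal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
probs = [0.5, 0.9, 0.99]                  # arbitrary probability levels for the comparison

# Maxima of n i.i.d. standard normal variables, compared with a fitted Gumbel distribution.
n, reps = 1000, 5000
normal_maxima = rng.standard_normal((reps, n)).max(axis=1)
loc, scale = stats.gumbel_r.fit(normal_maxima)
print(np.quantile(normal_maxima, probs))              # empirical quantiles of the maxima
print(stats.gumbel_r.ppf(probs, loc, scale))          # quantiles of the fitted Gumbel

# Sums of i.i.d. Gumbel variables, compared with a fitted normal distribution.
m = 200
gumbel_sums = stats.gumbel_r.rvs(size=(reps, m), random_state=rng).sum(axis=1)
print(np.quantile(gumbel_sums, probs))                # empirical quantiles of the sums
print(stats.norm.ppf(probs, gumbel_sums.mean(), gumbel_sums.std(ddof=1)))
```

The printed quantile pairs should agree closely, keeping in mind that the convergence of normal maxima to the Gumbel distribution is notoriously slow.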

Let us consider one example to clarify the use of the theory of statistical extremes. Suppose that we are considering the levels (or discharges) of a river at some fixed point and that we are interested in the yearly floods or droughts at that point. As the flood (drought) is the maximum (minimum) of the discharges, we would be led to consider the exact statistical behaviour of the maxima (minima). The difficulties of dealing with them, and the fact that the yearly maximum considered is obtained from a large number of (daily) observations, suggest using the asymptotic theory of maxima, whose distribution is known except for two or three parameters. The validity of this approximate substitution will be considered and referred to. A set of 60 or more years of observations in general gives sufficient data to estimate those unknown parameters with adequate accuracy and allows subsequent use of the asymptotic distribution function of the maxima or minima.
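A hedged sketch of that workflow (synthetic data standing in for 60 annual maxima; the Gumbel location and scale values and the 100-year level are illustrative assumptions, and scipy.stats.gumbel_r is used for the fit):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Stand-in for 60 years of observed annual maximum discharges (synthetic, illustrative values).
annual_maxima = stats.gumbel_r.rvs(loc=500.0, scale=80.0, size=60, random_state=rng)

# Estimate the two unknown parameters of the asymptotic (Gumbel) distribution of the maxima.
loc_hat, scale_hat = stats.gumbel_r.fit(annual_maxima)

# An "extreme" quantile used in design: the level exceeded on average once in 100 years.
q100 = stats.gumbel_r.ppf(1 - 1 / 100, loc=loc_hat, scale=scale_hat)
print(f"location = {loc_hat:.1f}, scale = {scale_hat:.1f}, 100-year level = {q100:.1f}")
```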

For bivariate extremes the situation is similar, although more complicated. Suppose that we are measuring the discharges of a river at two different points every day, say, at noon. It is evident that the measurements are positively correlated, the correlation being stronger and stronger as the points are closer and closer. From this series of daily measurements (pairs) we can obtain a sample sequence of pairs of maxima (the floods). We will see later how to deal with those samples of (yearly) pairs of maxima, considered as random pairs.

As one year is a relatively large sample, we will use the large sample (asymptotic) distribution of pairs of maxima as a substitute for its very complex and unknown actual distribution, but the representation of the large sample distribution of pairs of maxima has some unknown parameters and a dependence function.

The sample of pairs of maxima obtained in 50, 60, ... , l00 years of observations can then be used to solve the statistical decision problems concerning the unknown elements of the representation, specifically those that are most important in engineering applications: estimation of the parameters and testing hypotheses about them, testing for the independence of the maxima, and estimation of the dependence function.

We will not refer specially to the behaviour of minima (droughts, for example) because it is similar to that of maxima; we will recall the technique of conversion of results concerning maxima into results concerning minima.

2. The stochastic model

Let \(\{X_1,\dots,X_n\}\) be a set of (possibly dependent) random variables, i.e. an n-dimensional random vector, with joint distribution function \(F_n(x_1,\dots,x_n) = \mathrm{Prob}\{X_1\le x_1,\dots,X_n\le x_n\}\) and with survival function \(S_n(x_1,\dots,x_n) = \mathrm{Prob}\{X_1>x_1,\dots,X_n>x_n\}\); the relations between \(F_n(x_1,\dots,x_n)\) and \(S_n(x_1,\dots,x_n)\) are known at the continuity points of the margins, being extended afterwards by right continuity for \(F_n\) and \(S_n\).

We have

\(\mathrm{Prob}\{\max_{1\le i\le n} X_i \le x\} = F_n(x,\dots,x) = F_n(x)\) and

\(\mathrm{Prob}\{\min_{1\le i\le n} X_i > x\} = S_n(x,\dots,x) = S_n(x)\) at any point \(x\); we see that

\(\mathrm{Prob}\{\max_{1\le i\le n} X_i > x\} = 1-F_n(x,\dots,x) = 1-F_n(x)\) and

\(\mathrm{Prob}\{\min_{1\le i\le n} X_i \le x\} = 1-S_n(x,\dots,x) = 1-S_n(x).\)

Evidently, as

\(\max_{1\le i\le n+1} X_i \ge \max_{1\le i\le n} X_i\) we have \(F_n(x) \ge F_{n+1}(x)\),

and as

\(\min_{1\le i\le n+1} X_i \le \min_{1\le i\le n} X_i\) we obtain \(S_n(x) \ge S_{n+1}(x)\).

Note that if \(h(\cdot)\) is an increasing continuous function then

\(\max_{1\le i\le n}\{h(X_i)\} = h(\max_{1\le i\le n} X_i)\) and \(\min_{1\le i\le n}\{h(X_i)\} = h(\min_{1\le i\le n} X_i)\);

if \(h(\cdot)\) is a decreasing continuous function we obtain

\(\max_{1\le i\le n}\{h(X_i)\} = h(\min_{1\le i\le n} X_i)\) and \(\min_{1\le i\le n}\{h(X_i)\} = h(\max_{1\le i\le n} X_i)\).

When we take h(u) = - u we obtain the essential equalities

\(\max_{1\le i\le n}\{X_i\} = -\min_{1\le i\le n}\{-X_i\}\) and \(\min_{1\le i\le n}\{X_i\} = -\max_{1\le i\le n}\{-X_i\}\).

These relations translate maxima results into minima results and vice versa. In this book we will almost always deal with maxima results, the conversion to minima results following immediately; the exceptions will be, essentially, the study of the Weibull distribution for minima and the study of the dependence function with (standard) exponential margins in bivariate extremes.

Notice that these simple results on \(h(\cdot)\) — a very general transformation — may later be helpful in dealing with data, through a convenient transformation.

Let us apply the results above to the i.i.d. case; we have \(F_n(x) = F_1^n(x) = (1-S_1(x))^n\) and \(S_n(x) = S_1^n(x) = (1-F_1(x))^n\).
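A quick numerical check of these identities, and of the maxima/minima conversion above, for a standard normal parent (a sketch; the sample size, threshold and replication count are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps, x = 5, 200_000, 1.0                      # arbitrary sample size, replications, threshold

samples = rng.standard_normal((reps, n))          # i.i.d. standard normal rows
sample_max = samples.max(axis=1)
sample_min = samples.min(axis=1)

# F_n(x) = F_1(x)^n for the maximum, S_n(x) = (1 - F_1(x))^n for the minimum.
print((sample_max <= x).mean(), stats.norm.cdf(x) ** n)
print((sample_min > x).mean(), (1 - stats.norm.cdf(x)) ** n)

# Conversion between maxima and minima: min X_i = -max(-X_i).
assert np.allclose(sample_min, -np.max(-samples, axis=1))
```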

 

3. Order statistics and exceedances

Let \((X_1,\dots,X_n)\) be an i.i.d. univariate sample where \(F(x)\) is the distribution function of the \(X_i\). Let us order the sample as \(X'_{1,n}\le X'_{2,n}\le\dots\le X'_{n,n}\) (ascending order statistics) or \(X''_{1,n}\ge X''_{2,n}\ge\dots\ge X''_{n,n}\) (descending order statistics); note that \(X'_{k,n}=X''_{n+1-k,n}\) and that \(X'_{1,n}=X''_{n,n}\) is the minimum of the sample, \(X'_{n,n}=X''_{1,n}\) is the maximum of the sample, \(X'_{2,n}=X''_{n-1,n}\) and \(X'_{n-1,n}=X''_{2,n}\) are the second minimum and maximum (second extremes), and \(X'_{k,n}=X''_{n+1-k,n}\) and \(X'_{n+1-k,n}=X''_{k,n}\) are the k-th minimum and maximum. When possible and clear we will use the simpler notations \(X'_k\) and \(X''_k\).

The distribution function of \(X'_{k,n} = X''_{n+1-k,n}\) is

\(F_{k,n}(x) = \mathrm{Prob}\{X'_{k,n}\le x\} = \mathrm{Prob}\{k\ \mathrm{or\ more}\ X_i\le x\} = \sum_{j=k}^{n}\binom{n}{j}F(x)^{j}\,(1-F(x))^{n-j}\)

by the binomial distribution.

If \(F(x)\) has a probability density \(f(x)=F'(x)\), the probability density of \(X'_{k,n} = X''_{n+1-k,n}\) is

\(F'_{k,n}(x) = k\binom{n}{k}F(x)^{k-1}\,(1-F(x))^{n-k}\,f(x).\)
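For the uniform parent, where \(F(x)=x\) and \(f(x)=1\) on \([0,1]\), the k-th order statistic follows a Beta(k, n+1-k) law, which gives a convenient cross-check of the two formulas above (a sketch; n, k and the evaluation point are arbitrary choices):

```python
from math import comb
from scipy import stats

n, k, x = 10, 3, 0.4                      # arbitrary sample size, rank and evaluation point
F = x                                     # uniform parent on [0, 1]: F(x) = x, f(x) = 1

# Distribution function: sum_{j=k}^{n} C(n, j) F^j (1 - F)^(n - j)
cdf_formula = sum(comb(n, j) * F ** j * (1 - F) ** (n - j) for j in range(k, n + 1))

# Density: k C(n, k) F^(k-1) (1 - F)^(n-k) f(x)
pdf_formula = k * comb(n, k) * F ** (k - 1) * (1 - F) ** (n - k)

# The k-th order statistic of n uniforms is Beta(k, n + 1 - k): compare.
print(cdf_formula, stats.beta.cdf(x, k, n + 1 - k))
print(pdf_formula, stats.beta.pdf(x, k, n + 1 - k))
```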

Analogously, the distribution function of \((X'_{k,n},\,X'_{m,n}) = (X''_{n+1-k,n},\,X''_{n+1-m,n})\), with \(1\le k<m\le n\), can be written down; its probability density, at \(X'_{k,n}=x\), \(X'_{m,n}=y\), is immediately seen to be

\(=0\) if \(y<x\)

\(=\dfrac{n!}{(k-1)!\,(m-k-1)!\,(n-m)!}\,F^{k-1}(x)\,(F(y)-F(x))^{m-k-1}\,(1-F(y))^{n-m}\,f(x)\,f(y)\) if \(y\ge x\).
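As a sanity check, this joint density can be integrated numerically over the region \(x\le y\) and should total 1; the sketch below does this for the uniform parent with arbitrary choices of n, k and m:

```python
from math import factorial
from scipy import integrate

n, k, m = 10, 3, 7                        # arbitrary sample size and ranks, 1 <= k < m <= n
const = factorial(n) / (factorial(k - 1) * factorial(m - k - 1) * factorial(n - m))

# Joint density of (X'_{k,n}, X'_{m,n}) for a uniform parent: F(x) = x and f(x) = 1 on [0, 1].
def joint_pdf(y, x):                      # dblquad integrates func(y, x) over y first
    return const * x ** (k - 1) * (y - x) ** (m - k - 1) * (1 - y) ** (n - m)

total, _ = integrate.dblquad(joint_pdf, 0.0, 1.0, lambda x: x, lambda x: 1.0)
print(total)                              # should be approximately 1
```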

From these distribution functions and their multivariate extensions — where concomitants can intervene, but which will not be dealt with in this book — we can obtain the asymptotic behaviour of the sample quantiles, which are central order statistics \(X'_{k_n,n}\), i.e., in a sequence of samples the \(k_n\)-th ascending order statistic \(X'_{k_n,n}\) is such that \(k_n/n\to p\) with \(0<p<1\).

Let us now suppose that we are thinking of a second sample of m random variables with the same distribution function \(F(x)\). The probability that exactly \(J=j\) of the random variables of the second sample are smaller than or equal to \(X'_{k,n} = X''_{n+1-k,n} = x\) is evidently \(\binom{m}{j}F(x)^{j}\,(1-F(x))^{m-j}\) and, thus, the probability — before the first sample — that exactly j out of m observations are smaller than or equal to the k-th ascending order statistic \(X'_{k,n}\) of the first sample is

\(P(j,m\mid k,n) = \int_{-\infty}^{+\infty}\binom{m}{j}F^{j}(x)\,(1-F(x))^{m-j}\;k\binom{n}{k}F^{k-1}(x)\,(1-F(x))^{n-k}\,f(x)\,dx\)

\(= k\binom{m}{j}\binom{n}{k}\int_{-\infty}^{+\infty}F^{j+k-1}(x)\,(1-F(x))^{m-j+n-k}\,f(x)\,dx = \dfrac{k\binom{n}{k}\binom{m}{j}}{(k+j)\binom{n+m}{k+j}}\)

with

\(\sum_{j=0}^{m}\dfrac{k\binom{n}{k}\binom{m}{j}}{(k+j)\binom{n+m}{k+j}} = 1.\)

Note the symmetry \(P(j,m\mid k,n) = P(m-j,\,m\mid n+1-k,\,n)\) and that this last probability is the probability that exactly j observations of the second sample will be larger than or equal to \(X''_{k,n} = X'_{n+1-k,n}\). These situations, for the second sample, are called exceedances: the underpassing of \(X'_{k,n}\) or the overpassing of \(X''_{k,n}\).
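A small numerical sketch of these exceedance probabilities (arbitrary choices of n, k and m, not tied to any data), verifying that they sum to one and exhibiting the symmetry just noted:

```python
from math import comb

def p_exceed(j, m, k, n):
    """P(j, m | k, n): probability that exactly j of the m future observations are <= X'_{k,n}."""
    return k * comb(n, k) * comb(m, j) / ((k + j) * comb(n + m, k + j))

n, k, m = 20, 5, 10                      # arbitrary choices
probs = [p_exceed(j, m, k, n) for j in range(m + 1)]
print(sum(probs))                        # should equal 1 up to rounding

# Symmetry noted above: P(j, m | k, n) = P(m - j, m | n + 1 - k, n)
print(all(abs(p_exceed(j, m, k, n) - p_exceed(m - j, m, n + 1 - k, n)) < 1e-12
          for j in range(m + 1)))
```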

Let us now compute the mean value and variance of the random number of exceedances J (below \({\mathrm{X}}^{\mathrm{'}}_{\mathrm{k,n}}\mathrm{}\)  or above \({\mathrm{X}}^"_{\mathrm{k,n}}\ \) ). We have

\(\mu(m\mid k,n) = M(J) = \sum_{j=0}^{m} j\,P(j,m\mid k,n) = \dfrac{m\,k}{n+1}\)

and 

\(\sigma^{2}(m\mid k,n) = M(J^{2}) - M^{2}(J) = M(J(J-1)) + M(J) - M^{2}(J)\)

\(=\dfrac{m\,(m+n+1)}{(n+1)^{2}\,(n+2)}\,k\,(n+1-k).\)

The variance is smallest for k = 1 or k = n, and the ratio of the variances \(\sigma^{2}(m\mid 1,n) = \sigma^{2}(m\mid n,n)\) to the variance at \(k=\frac{n+1}{2}\) (the median if n is odd) is \(\frac{4n}{(n+1)^{2}} \approx 4/n\), which shows that, for exceedances, extremes are more reliable than central values, contrary to the usual belief.
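The closed-form mean and variance, and the variance ratio just mentioned, can be checked directly against the distribution \(P(j,m\mid k,n)\); a sketch with arbitrary choices of n and m follows:

```python
from math import comb

def p_exceed(j, m, k, n):
    return k * comb(n, k) * comb(m, j) / ((k + j) * comb(n + m, k + j))

n, m = 21, 10                            # arbitrary; n odd so that the median rank is (n+1)/2

def moments(k):
    mean = sum(j * p_exceed(j, m, k, n) for j in range(m + 1))
    var = sum(j * j * p_exceed(j, m, k, n) for j in range(m + 1)) - mean ** 2
    return mean, var

mean1, var1 = moments(1)                    # extreme rank k = 1
med_mean, med_var = moments((n + 1) // 2)   # median rank k = (n+1)/2

print(mean1, m * 1 / (n + 1))               # closed form: mu = m k / (n + 1)
print(var1 / med_var, 4 * n / (n + 1) ** 2) # variance ratio, approximately 4/n
```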

Let us now consider, more generally, the random time T of the s-th exceedance (the underpassing of \(X'_{k,n}\) or the overpassing of \(X''_{k,n}\)) in a (possibly) infinite second sample (m not fixed as before). Thus the random order T at which the exceedance happens does not relate to a fixed event — as in the (classical) return period situation — but to a random event, defined prior to the observation of the first sample of n random variables. Consequently we will not need knowledge of the distribution of \(X'_{k,n}\) or of \(X''_{k,n}\), as was the case in the return period situation, because the set is defined prior to the observations: briefly speaking, we have an inverse binomial with an underlying random probability. The approach will, naturally, be non-parametric, as for exceedances.

For the s-th underpassing of \(X'_{k,n}\) at the T = j-th observation we have

\(P(j\mid s,k,n) = \mathrm{Prob}\{T=j\mid s,k,n\} = \int_{-\infty}^{+\infty}\Big[\binom{j-1}{s-1}F^{s}(x)\,(1-F(x))^{j-s}\Big]\times\Big[k\binom{n}{k}F^{k-1}(x)\,(1-F(x))^{n-k}\,f(x)\Big]\,dx,\)

where the meaning of the two brackets is clear from the previous study.

It is immediate that

\(P(j\mid s,k,n) = \dfrac{k\binom{n}{k}\binom{j-1}{s-1}}{(k+s)\binom{n+j}{k+s}}\)

with \(\sum_{j=s}^{+\infty}P(j\mid s,k,n) = 1\), obviously. This probability is, clearly, also the probability that \(X''_{k,n}\) is overpassed for the s-th time at the j-th observation of the second sample.
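A brief numerical sketch (arbitrary choices of n, k and s) of this waiting-time distribution, checking that the probabilities accumulate to one over a long truncation:

```python
from math import comb

def p_wait(j, s, k, n):
    """P(j | s, k, n): probability that the s-th underpassing of X'_{k,n} occurs at observation j."""
    return k * comb(n, k) * comb(j - 1, s - 1) / ((k + s) * comb(n + j, k + s))

n, k, s = 20, 5, 2                       # arbitrary choices; larger k makes the tail decay faster
partial = sum(p_wait(j, s, k, n) for j in range(s, 5001))
print(partial)                           # approaches 1 as the truncation point grows
```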

The mean value and variance of T are

\(\mu(s\mid k,n) = M(T) = \sum_{j=s}^{+\infty} j\,P(j\mid s,k,n) = \dfrac{n\,s}{k-1}\)

and

\(\sigma^{2}(s\mid k,n) = M(T^{2}) - M^{2}(T) = M(T(T+1)) - M(T) - M^{2}(T)\)

\(=\dfrac{n\,s\,(n-k+1)\,(k+s-1)}{(k-1)^{2}\,(k-2)}\)

For \(k=1\) we have \(\mu(s\mid 1,n) = +\infty\), which is easily understandable, and for \(k=n\) we get \(\mu(s\mid n,n) = \dfrac{n\,s}{n-1}\), results that could be obtained directly. Also, for \(k=1\) and \(k=2\) we have \(\sigma^{2}(s\mid 1,n) = +\infty\) (as expected) and \(\sigma^{2}(s\mid 2,n) = +\infty\).
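These moments can also be checked by simulation; the sketch below (arbitrary parameters, uniform parent, which is immaterial since the approach is non-parametric) estimates \(M(T)\) by Monte Carlo and compares it with \(ns/(k-1)\):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, s, reps = 20, 5, 2, 20_000                # arbitrary; k >= 3 keeps mean and variance finite

waits = np.empty(reps)
for r in range(reps):
    threshold = np.sort(rng.random(n))[k - 1]   # X'_{k,n} of the first (uniform) sample
    count, t = 0, 0
    while count < s:                            # second sample drawn one observation at a time
        t += 1
        if rng.random() <= threshold:           # an underpassing of X'_{k,n}
            count += 1
    waits[r] = t

print(waits.mean(), n * s / (k - 1))            # Monte Carlo mean vs closed form ns/(k-1)
```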

It is convenient to compare the two different approaches of this section: exceedances and the random return period, both defined for the second sample prior to the knowledge of the first sample, which allows for the use of non-parametric methods and overcomes the necessity of knowing the underlying continuous \(F(x)\). In the first case (exceedances) we study the number of underpassings or overpassings of some order statistic of the first sample in a finite number of observations, the second sample having a finite size. For the random return period we accept, admit, or consider an infinite second sample (thus allowing for infinitely many i.i.d. observations) and we are dealing with a random stopping time. The analogy with the direct and inverse binomial is obvious.

4. A note on the asymptotic behaviour of the sample quantiles

As is well known, the usual estimator of the unique theoretical quantile \(\chi_p\ (0<p<1)\) is the order statistic \(X'_{[np]+1}\), called the empirical or sample quantile \(\chi^*_p\), which we will also denote by \(Q_n(p)\); note that \(X'_1\le Q_n(p)\le X'_n\) if \(0<p<1\). If there exists a density \(f(x)=F'(x)\) we know that \(\chi^*_p \stackrel{P}{\longrightarrow} \chi_p\) and that \(\sqrt{n}\,f(\chi_p)\,\dfrac{\chi^*_p-\chi_p}{\sqrt{p(1-p)}}\) is asymptotically standard normal if \(0<f(\chi_p)<\infty\); the joint distribution of the set of quantiles \((\chi^*_{p_1},\dots,\chi^*_{p_k}) = (Q_n(p_1),\dots,Q_n(p_k))\) is such that \(\Big\{\sqrt{n}\,f(\chi_{p_i})\,\dfrac{\chi^*_{p_i}-\chi_{p_i}}{\sqrt{p_i(1-p_i)}}\Big\}\) is asymptotically multinormal with standard margins and correlation coefficients, for \(p_i<p_j\), \(\rho_{ij}=\sqrt{\dfrac{p_i}{p_j}\,\dfrac{1-p_j}{1-p_i}}\ (>0)\); for details see Cramér (1946).
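A simulation sketch of this asymptotic normality for a standard normal parent (the sample size, probability level and number of replications are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, p, reps = 2000, 0.9, 2000             # arbitrary sample size, probability level, replications

chi_p = stats.norm.ppf(p)                # theoretical quantile of the standard normal parent
samples = rng.standard_normal((reps, n))
q_hat = np.sort(samples, axis=1)[:, int(n * p)]   # sample quantile X'_{[np]+1} (0-based index [np])

# Normalized quantile: sqrt(n) f(chi_p) (chi*_p - chi_p) / sqrt(p(1-p)) should be ~ N(0, 1).
z = np.sqrt(n) * stats.norm.pdf(chi_p) * (q_hat - chi_p) / np.sqrt(p * (1 - p))
print(z.mean(), z.std(ddof=1))           # should be close to 0 and 1 respectively
```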