
Statistical Theory of Extremes

Multivariate Extremes

José Tiago de Fonseca Oliveira 1

1. Academia das Ciências de Lisboa (Lisbon Academy of Sciences), Lisbon, Portugal.

23-06-2017
28-12-2016


Abstract

The approach to multivariate extreme distributions is similar to that outlined previously for bivariate extreme distributions. The probabilistic theory of multivariate extremes and generalizations of bivariate models are discussed, with some notes on linear regression. The models are important as ways to predict future, possibly dangerous events such as floods, high tides, and gusts of wind.

Keywords

Bivariate models, Correlation coefficients, Linear regression, Multivariate extreme distribution, Probabilistic theory, Variance

1. Introduction

The theory of multivariate extremes is currently at a much less advanced stage than that of bivariate extremes with regard to both properties and modelling.

When thinking of floods at different points of the same river (or hydrographic basin), high tides in different ports of the same geographical area, or gusts of wind in different stations of a meteorological system, it is easy to see that modelling is definitely needed but delayed; it must be recalled that there are still many open questions even for bivariate extremes. The importance of models stems not only from their use in description but also from the need for ways to predict future, possibly dangerous, events. As a simple example, think of a flood that can submerge river-bank villages, with the usual loss of lives and property, if warning is not given in time. It seems that we are still very far from this stage of knowledge.

Even so, we will obtain the limiting forms (with Gumbel margins), derive some inequalities, give independence conditions, and advance some models.

2. The probabilistic theory of multivariate extremes

The approach to multivariate extreme distributions is similar to that outlined previously for bivariate extreme distributions.

Let \(\mathrm{ F ( x_{1}, \dots,x_{m} ) }\) be a multivariate distribution function of a random \(\mathrm{ m }\)-dimensional vector \(\mathrm{ \left( X_{1}, \dots,X_{m} \right) }\). The distribution function of the maximum of each coordinate, in a sample of \(\mathrm{ n }\) independent random vectors with the distribution function \(\mathrm{ F ( x_{1}, \dots,x_{m} ) }\), is

\(\mathrm{ F^{n} ( x_{1}, \dots,x_{m} ) }\) .

Evidently the point whose \(\mathrm{ m }\) coordinates are the maxima of the respective coordinates is a virtual point, in general not an observed one.

Suppose now that for the one-dimensional margins \(\mathrm{ F_{i} ( x_{i} ) =F ( + \infty,\dots,x_{i},\dots,+ \infty ) }\) there exist attraction coefficients \(\mathrm{ \lambda _{i}^{ ( n ) } }\) and \(\mathrm{ \delta _{i}^{ ( n ) } ( >0 ) }\) such that

\(\mathrm{ F_{i}^{n} ( \lambda _{i}^{ ( n ) }+ \delta _{i}^{ ( n ) }x_{i} ) \rightarrow \Lambda ( x_{i} ) ~as~n \rightarrow \infty }\)

and that

\(\mathrm{ \Lambda ( x_{1},\dots,x_{m} ) =\lim_{n \rightarrow \infty} F^{n} ( \lambda _{1}^{ ( n ) }+ \delta _{1}^{ ( n ) }x_{1},\dots, \lambda _{m}^{ ( n ) }+ \delta _{m}^{ ( n ) }x_{m} ) }\)

exists.

It was proved in Tiago de Oliveira (1958) and Geffroy (1958/59), in a way similar to the one used above, that:

\(\Lambda ( x_{1},\dots,x_{m} ) \) is a stable distribution function, i.e., a distribution function of the form

\(\Lambda ( x_{1},\dots,x_{m} ) =exp \{{ - \left( e^{-x_{1}}+\ldots+e^{-x_{m}} \right) k \left( x_{2}-x_{1},\dots,x_{m}-x_{1} \right) }\} \)

\(= \{ \Lambda \left( x_{1} \right) \ldots \Lambda \left( x_{m} \right) \} ^{k \left( x_{2}-x_{1},\dots,x_{m}-x_{1} \right) } \).

For other details see Gumbel (1961), Tiago de Oliveira (1962/63), and Deheuvels (1984).

Evidently, if the margins are Weibull or Fréchet for maxima or Weibull, Gumbel or Fréchet for minima, the usual transformations can reduce them to the case above. Special mention can be made of the situation for minima with standard exponential margins. Putting \(\mathrm{ y_{i}=e^{-x_{i}} }\), as before, we have for the survival function  \(\mathrm{ \left( y_{i} \geq 0 \right) }\),

\(\mathrm{ S ( y_{1},\dots,y_{m}) =exp \{ - ( y_{1}+\ldots+y_{m} ) k ( log\frac{y_{1}}{y_{2}},\dots,log\frac{y_{1}}{y_{m}} ) \} }\)

which, by analogy with the \(\mathrm{ A }\) function of the bivariate extremes, can be written as

\(\mathrm{ S \left( y_{1},\dots,y_{m} \right) =exp \{ - \left( y_{1}+\ldots+y_{m} \right) A ( \frac{y_{2}}{y_{1}+y_{2}},\dots,\frac{y_{m}}{y_{1}+y_{m}} ) \} }\).

We can now obtain inequalities connecting \(\mathrm{ \Lambda ( x_{1}, \dots,x_{m} ) }\) with the margins. As the one-dimensional margins are \(\mathrm{ \Lambda \left( x_{i} \right) }\), from the Fréchet (1940) inequality

\(\mathrm{ max ( 0, \sum _{i}^{} \Lambda ( x_{i} ) -m+1 ) \leq \Lambda ( x_{1},\dots,x_{m} ) \leq min ( \Lambda ( x_{1} ) ,\dots, \Lambda ( x_{m} ) ) }\),

using the stability condition

\(\mathrm{ \Lambda ^{k} ( x_{1}+log~k,\dots,x_{m}+log~k ) = \Lambda ( x_{1},\dots,x_{m} ) }\),

we get, with \(\mathrm{ k \rightarrow \infty }\), that

\(\Lambda ( x_{1},\dots,x_{m} ) \) verifies the double inequality

\( \Lambda ( x_{1} ) \dots \Lambda ( x_{m} ) \leq \Lambda ( x_{1},\dots,x_{m} ) \leq min ( \Lambda ( x_{1} ) ,\dots, \Lambda ( x_{m} ) ) \).

To simplify notation we write \(\mathrm{ \sum _{i=1}^{m}, \prod_{i=1}^{m}, \sum _{\substack{ i \neq j \\ i,j=1}}^{m},\dots }\) simply as \(\mathrm{ \sum _{i}^{}, \prod_{i}^{}, \sum _{i \neq j}^{},\dots }\).

This double inequality cannot be improved, because the lower and upper bounds are themselves asymptotic distributions of multivariate extremes: the independence case and the diagonal case, respectively.
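As a numerical illustration (a sketch, not part of the original derivation), both the stability condition and this double inequality can be checked for the symmetric logistic family \( \Lambda ( x_{1},\dots,x_{m} ) =exp \{ - ( \sum _{i}^{}e^{-x_{i}/ ( 1- \theta ) } ) ^{1- \theta } \} \), a standard max-stable model related to the logistic model discussed later; the function name and evaluation points below are arbitrary choices.

```python
import math

def logistic_mev(x, theta):
    """Symmetric logistic multivariate extreme-value d.f. with Gumbel margins
    (an illustrative max-stable family, not the only possible choice)."""
    s = sum(math.exp(-xi / (1.0 - theta)) for xi in x)
    return math.exp(-s ** (1.0 - theta))

theta, x = 0.5, (0.3, -0.7, 1.2)
L = logistic_mev(x, theta)

# Stability: Lambda^k(x_1 + log k, ..., x_m + log k) = Lambda(x_1, ..., x_m)
for k in (2, 10, 100):
    shifted = tuple(xi + math.log(k) for xi in x)
    assert abs(logistic_mev(shifted, theta) ** k - L) < 1e-12

# Double inequality: Lambda(x_1)...Lambda(x_m) <= Lambda(x) <= min_i Lambda(x_i)
margins = [math.exp(-math.exp(-xi)) for xi in x]  # univariate Gumbel margins
assert math.prod(margins) <= L <= min(margins)
```

Any other max-stable family would serve equally well here; the checks hold by the general theory, not by special properties of the logistic model.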

Suppose now that the bivariate margins

\(\mathrm{ \Lambda _{ij}( x_{i},x_{j} ) = [ \Lambda ( x_{i} ) \Lambda ( x_{j} ) ] ^{k_{ij} ( x_{j}-x_{i} ) } } \)          \(\mathrm{ ( i \neq j ) } \)

are known and compatible.

A lower bound is obtained from a Gumbel inequality \(\mathrm{ ( 1-S_{m} ) \binom{m-1}{2} \leq \binom{m}{2} -S_{2} } \) — see Fréchet (1940) — giving

\(\mathrm{ \Lambda ( x_{1},\dots,x_{m} ) \geq 1- \sum _{i \neq j}^{} ( 1- \Lambda _{ij} ( x_{i},x_{j} ) ) / ( 2 ( m-1 ) ) }\)

which, with stability, leads to

\(\mathrm{ \Lambda ( x_{1},\dots,x_{m} ) \geq \{ \prod_{i \neq j} \Lambda _{ij} \left( x_{i},x_{j} \right) \} ^{\frac{1}{2 \left( m-1 \right) }} }\).

As \(\mathrm{ \Lambda \left( x_{1},\dots , x_{m} \right) \leq \Lambda _{ij} \left( x_{i}, x_{j} \right) }\) we obtain also

\(\mathrm{ \Lambda( x_{1},\dots,x_{m} ) \leq \{ \prod_{i \neq j} \Lambda _{ij} ( x_{i},x_{j} ) \} ^{\frac{1}{m ( m-1 ) }} }\),

but a sharper inequality is

\(\mathrm{ \Lambda ( x_{1},\dots,x_{m} ) \leq {min}_{i \neq j} \{ \Lambda _{ij} ( x_{i},x_{j} ) \} }\).

A Bonferroni inequality gives

\(\mathrm{ \Lambda \left( x_{1},\dots,x_{m} \right) \leq \frac{1}{2} \sum _{i \neq j}^{} \Lambda _{ij} \left( x_{i},x_{j} \right) - \left( m-2 \right) \sum _{i}^{} \Lambda \left( x_{i} \right) +\frac{ \left( m-1 \right) \left( m-2 \right) }{2} }\)

and stability leads to

\(\mathrm{ \Lambda \left( x_{1},\dots,x_{m} \right) \leq \frac{ \{ \prod_{i \neq j}^{} \Lambda _{ij} \left( x_{i},x_{j} \right) \} ^{1/2}}{ ( \prod_{i}^{} \Lambda \left( x_{i} \right) ) ^{m-2}} }\).

As a consequence we can state that

\(\Lambda ( x_{1},\dots,x_{m} ) \) verifies the following inequalities, when the bivariate margins \( \Lambda _{ij} \left( x_{i},x_{j} \right) \) are known and compatible:

\(\{\prod_{i \neq j}^{} \Lambda _{ij} ( x_{i},x_{j}) \} ^{\frac{1}{2 ( m-1 ) }} \leq \Lambda ( x_{1},\dots,x_{m} )\);

\(\Lambda ( x_{1},\dots,x_{m} ) \leq {min}_{i \neq j} \{ \Lambda _{ij} ( x_{i},x_{j} ) \} \leq \{ \prod_{i \neq j}^{} \Lambda _{ij} ( x_{i},x_{j} ) \} ^{\frac{1}{m ( m-1 ) }};\)

\(\Lambda \left( x_{1},\dots,x_{m} \right) \leq \frac{ \{ \prod_{i \neq j}^{} \Lambda _{ij} \left( x_{i},x_{j} \right) \} ^{1/2}}{ \left( \prod_{i}^{} \Lambda \left( x_{i} \right) \right) ^{m-2}} \).
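These three inequalities lend themselves to a quick numerical check; the sketch below is hypothetical, again assuming the symmetric logistic family \( exp \{ - ( \sum _{i}^{}e^{-x_{i}/ ( 1- \theta ) } ) ^{1- \theta } \} \) for \( \Lambda \) and its bivariate margins, with an arbitrary \( \theta \) and evaluation point.

```python
import math
from itertools import permutations

def logistic_mev(x, theta):
    """Symmetric logistic multivariate extreme-value d.f. (illustrative choice)."""
    s = sum(math.exp(-xi / (1.0 - theta)) for xi in x)
    return math.exp(-s ** (1.0 - theta))

theta, x = 0.4, (0.5, -0.3, 1.1, 0.0)
m = len(x)
L = logistic_mev(x, theta)

# bivariate margins Lambda_ij over the m(m-1) ordered pairs i != j
pairs = [logistic_mev((x[i], x[j]), theta) for i, j in permutations(range(m), 2)]
univ = [math.exp(-math.exp(-xi)) for xi in x]   # univariate Gumbel margins
prod_pairs = math.prod(pairs)

lower = prod_pairs ** (1.0 / (2 * (m - 1)))                  # first inequality
upper_min = min(pairs)                                       # second inequality
upper_geo = prod_pairs ** (1.0 / (m * (m - 1)))
upper_bonf = prod_pairs ** 0.5 / math.prod(univ) ** (m - 2)  # third inequality

assert lower <= L <= upper_min <= upper_geo
assert L <= upper_bonf
```

Note that the product runs over ordered pairs, so each unordered pair enters twice, matching the exponents \( 1/ ( 2 ( m-1 ) ) \) and \( 1/ ( m ( m-1 ) ) \) above.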

An immediate consequence is that:

If the bivariate margins factor into the product of the univariate margins then \(\Lambda ( x_{1},\dots,x_{m} ) = \Lambda ( x_{1} ) \dots \Lambda ( x_{m} ) \). Thus the necessary and sufficient condition for global independence is the independence of the bivariate margins.

This follows immediately by substituting \(\mathrm{ \Lambda _{ij} ( x_{i},x_{j} ) = \Lambda ( x_{i} ) \Lambda ( x_{j} ) } \) in the first and third inequalities.

As a corollary we may verify Berman’s (1961) statement that Geffroy’s sufficient condition for independence of two margins implies asymptotic independence. Thus we can state that

\( \frac{1-F_{ij} ( x_{i},x_{j} ) }{2-F_{i} ( x_{i} ) -F_{j} ( x_{j} ) } \rightarrow 1~when~x_{i} \rightarrow \bar{w}_{i},x_{j} \rightarrow \bar{w}_{j} \) ( \(\bar{w}_{p}\) being the right-end points of the margins) is a sufficient condition for the asymptotic independence \(\Lambda ( x_{1},\dots,x_{m} ) = \Lambda ( x_{1} ) \dots \Lambda ( x_{m} ) \).

Finally let us suppose that the \(\mathrm{ (m-1) } \)-dimensional margins \(\mathrm{ \Lambda ^{ ( i ) } ( x ) = \Lambda ( x_{1},\dots,x_{i-1},+ \infty,x_{i+1},\dots,x_{m} ) } \) are known and compatible.

From the relation

\(\mathrm{ max ( \sum _{i}^{} \Lambda ^{ ( i ) } ( x ) - ( m-1 ) ,0 ) \leq \Lambda ( x_{1},\dots,x_{m} ) \leq {min}_{i} \{ \Lambda ^{ ( i ) } ( x ) \} } \)

the use of stability leads to

\(\mathrm{ \prod_{i}^{} \Lambda ^{ ( i ) } ( x ) \leq \Lambda ( x_{1},\dots,x_{m} ) \leq {min}_{i} \{ \Lambda ^{ ( i ) } ( x ) \} } \).

Another Gumbel inequality \(\mathrm{ \left( 1-S_{m} \right) \left( m-1 \right) \leq m-S_{m-1} } \)— see also Fréchet (1940) — leads to \(\mathrm{ \sum _{i}^{} \Lambda ^{ \left( i \right) } \left( x \right) -1 \leq \left( m-1 \right) \Lambda \left( x_{1},\dots,x_{m} \right) } \)and stability implies

\(\mathrm{ \prod_{i}^{} \Lambda ^{ \left( i \right) } \left( x \right) \leq \{ \prod_{i}^{} \Lambda ^{ \left( i \right) } \left( x \right) \} ^{\frac{1}{m-1}} \leq \Lambda \left( x_{1},\dots,x_{m} \right) } \) .

Another relation, derived similarly from \(\mathrm{ \Lambda ( x_{1},\dots,x_{m} ) \leq \frac{ \sum _{i}^{} \Lambda ^{ ( i ) } ( x ) }{m} } \), is

\(\mathrm{ \Lambda ( x_{1},\dots,x_{m} ) \leq \{ \prod_{i}^{} \Lambda ^{ ( i ) } ( x ) \} ^{1/m} } \),

this upper bound being greater than the bound \(\mathrm{ {min}_{i} \{ \Lambda ^{ ( i ) } ( x ) \} }\) obtained previously.

Consequently we can state:

If the margins \(\Lambda ^{ ( i ) } ( x ) = \Lambda ( x_{1},\dots,x_{i-1},+ \infty,x_{i+1},\dots,x_{m} ) \) are known and compatible, \(\Lambda ( x ) \) verifies the following sequence of inequalities:

\(\prod_{i}^{} \Lambda ^{ ( i ) } ( x ) \leq \{ \prod_{i}^{} \Lambda ^{ ( i ) } ( x ) \} ^{\frac{1}{m-1}} \leq \Lambda ( x_{1},\dots,x_{m} ) \leq {min}_{i} \{ \Lambda ^{ ( i ) } ( x ) \} \leq \{ \prod_{i}^{} \Lambda ^{ ( i ) } ( x ) \} ^{1/m} \).
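This chain can again be checked numerically for a concrete max-stable family; a minimal sketch, assuming the symmetric logistic model with arbitrary parameters (the \( (m-1) \)-dimensional margins are obtained by dropping one coordinate, i.e., sending it to \( +\infty \)):

```python
import math

def logistic_mev(x, theta):
    """Symmetric logistic multivariate extreme-value d.f. (illustrative choice)."""
    s = sum(math.exp(-xi / (1.0 - theta)) for xi in x)
    return math.exp(-s ** (1.0 - theta))

theta, x = 0.3, (0.2, -0.5, 0.9)
m = len(x)
L = logistic_mev(x, theta)

# (m-1)-dimensional margins: drop the i-th coordinate (x_i -> +infinity)
margins = [logistic_mev(x[:i] + x[i + 1:], theta) for i in range(m)]
prod_m = math.prod(margins)

# prod <= prod^{1/(m-1)} <= Lambda <= min <= prod^{1/m}
assert prod_m <= prod_m ** (1.0 / (m - 1)) <= L <= min(margins) \
       <= prod_m ** (1.0 / m)
```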

3. Generalizations of bivariate models

The previous inequalities introduce some limitations on the formulation of multivariate extreme models.

Evidently, if the bivariate margins \(\mathrm{ \Lambda _{ij} \left( x \right) \left( i \neq j \right) }\) are known, as the upper bound \(\mathrm{ {min}_{i \neq j} \{ \Lambda _{ij} ( x_{i},x_{j}) \} }\) implies the non-existence of a planar density (and such a situation will be considered in the next section), we can try, as a generalization, the lower bound

\(\mathrm{ \{ \prod_{i \neq j}^{} \Lambda _{ij} ( x_{i},x_{j} ) \} ^{1/ ( 2 ( m-1 ) ) } }\),

but the conditions for this function to be a distribution function should be verified in each case.

Here we will give only a few scattered models.

A triextremal model can be defined as follows:

Consider three independent reduced Gumbel random variables \(\mathrm{ ( Z_{1},Z_{2},Z_{3} ) }\) and the triple \(\mathrm{ ( X_{1},X_{2},X_{3} ) }\) defined as \(\mathrm{ X_{1}=Z_{1},X_{2}=max ( Z_{1}-a,Z_{2}-b ) ,X_{3}=max ( X_{2}-c,Z_{3}-d ) }\). To force \(\mathrm{ ( X_{1},X_{2} ) }\) to have a biextremal distribution with reduced Gumbel margins \(\mathrm{ \Lambda ( x_{1},x_{2} \vert \alpha ) }\), we must have \(\mathrm{ e^{-a}+e^{-b}=1 }\) and \(\mathrm{ e^{-a}= \alpha ~ ( 0 \leq \alpha \leq 1 ) }\); to impose that \(\mathrm{ ( X_{2},X_{3} ) }\) also has a biextremal distribution with reduced Gumbel margins \(\mathrm{ \Lambda ( x_{2},x_{3} \vert \beta ) }\), we must have \(\mathrm{ e^{-c}+e^{-d}=1 }\) and \(\mathrm{ e^{-c}= \beta ~ ( 0 \leq \beta \leq 1 ) }\).

Thus the joint distribution of \(\mathrm{ ( X_{1},X_{2},X_{3} ) }\) is

\(\mathrm{ \Lambda ( x_{1}, x_{2},x_{3} \vert \alpha , \beta ) =Prob \{ X_{1} \leq x_{1},X_{2} \leq x_{2},X_{3} \leq x_{3} \} }\)

\(\mathrm{ =Prob \{ Z_{1} \leq x_{1},Z_{1} \leq a+x_{2},Z_{1} \leq a+c+x_{3},Z_{2} \leq b+x_{2},Z_{2} \leq b+c+x_{3},Z_{3} \leq d+x_{3} \} }\)

\(\mathrm{ =exp \{ -max ( e^{-x_{1}},e^{-a}~e^{-x_{2}},e^{-a}~e^{-c}~e^{-x_{3}} ) -max ( e^{-b}~e^{-x_{2}},e^{-b}~e^{-c}~e^{-x_{3}} ) -e^{-d}~e^{-x_{3}} \} }\)

\(\mathrm{ =exp \{ -max ( e^{-x_{1}}, \alpha ~e^{-x_{2}}, \alpha ~ \beta ~e^{-x_{3}} ) - ( 1- \alpha ) max ( e^{-x_{2}}, \beta ~e^{-x_{3}} ) - ( 1- \beta ) e^{-x_{3}} \} }\)

and we obviously get \(\mathrm{ \Lambda ( + \infty,x_{2},x_{3} ) = \Lambda ( x_{2},x_{3} \vert \beta ) , \Lambda ( x_{1},x_{2},+ \infty ) = \Lambda ( x_{1},x_{2} \vert \alpha ) }\) and \(\mathrm{ \Lambda ( x_{1},+ \infty,x_{3} ) = \Lambda ( x_{1},x_{3} \vert \alpha \beta ) }\), so that naturally \(\mathrm{ 0 \leq \alpha \beta \leq 1 }\).

For trivariate minima with standard exponential margins the survival function is ( for \(\mathrm { y_i\geq0}\) ) \(\mathrm{ S \left( y_{1},y_{2},y_{3} \right) =exp⁡ \{ -max \left( y_{1}, \alpha ~y_{2}, \alpha ~ \beta ~y_{3} \right) - \left( 1- \alpha \right) max \left( y_{2}, \beta ~y_{3} \right) - \left( 1- \beta \right) y_{3} \} }\).
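The three margin reductions of the triextremal model can be verified numerically; in the sketch below the values of \( \alpha, \beta \) and the test points are arbitrary, a large cutoff stands in for \( +\infty \), and `tri` and `biext` simply transcribe the formulas above.

```python
import math

def tri(x1, x2, x3, a, b):
    """Triextremal d.f. Lambda(x1, x2, x3 | alpha, beta), transcribed as above."""
    return math.exp(
        -max(math.exp(-x1), a * math.exp(-x2), a * b * math.exp(-x3))
        - (1 - a) * max(math.exp(-x2), b * math.exp(-x3))
        - (1 - b) * math.exp(-x3)
    )

def biext(x1, x2, t):
    """Biextremal d.f. Lambda(x1, x2 | theta) with reduced Gumbel margins."""
    return math.exp(-max(math.exp(-x1), t * math.exp(-x2))
                    - (1 - t) * math.exp(-x2))

a, b, INF = 0.6, 0.3, 50.0  # arbitrary alpha, beta; INF stands in for +infinity
for x, y in [(0.0, 0.5), (-1.0, 2.0), (1.5, -0.4)]:
    assert abs(tri(INF, x, y, a, b) - biext(x, y, b)) < 1e-12
    assert abs(tri(x, y, INF, a, b) - biext(x, y, a)) < 1e-12
    assert abs(tri(x, INF, y, a, b) - biext(x, y, a * b)) < 1e-12
```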

The multiextremal distribution, defined analogously to the biextremal one, which is important in the extremal processes that will follow, is given by

\(\mathrm{ \Lambda \left( x_{1}, \cdots ,x_{m} \right) =exp⁡ \{ - \sum _{1}^{m}a_{i}~max⁡ ( \frac{e^{-x_{i}}}{a_{1}+ \dots +a_{i}}, \cdots ,\frac{e^{-x_{m}}}{a_{1}+ \cdots +a_{m}} ) \} }\)

with \(\mathrm{ a_{i} \geq 0 }\). The biextremal pair \(\mathrm{ ( X_{p},X_{q} ) ~ ( 1 \leq p<q \leq m ) }\) has dependence parameter \(\mathrm{ \theta _{p,q}=\frac{a_{1}+ \ldots +a_{p}}{a_{1}+ \ldots +a_{q}} }\); asymptotic independence holds if all \(\mathrm{ \theta _{p,q} \rightarrow 0 }\).
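As a sketch (with arbitrarily chosen weights), for \( m=2 \) the multiextremal formula reduces to the biextremal distribution with \( \theta =a_{1}/ ( a_{1}+a_{2} ) \), consistent with \( \theta _{1,2} \) above:

```python
import math

def multiext(x, a):
    """Multiextremal d.f. with Gumbel margins and weights a_i >= 0."""
    m = len(x)
    total = 0.0
    for i in range(m):
        # i-th term: a_i * max over j >= i of e^{-x_j} / (a_1 + ... + a_j)
        total += a[i] * max(math.exp(-x[j]) / sum(a[:j + 1])
                            for j in range(i, m))
    return math.exp(-total)

def biext(x1, x2, t):
    """Biextremal d.f. Lambda(x1, x2 | theta)."""
    return math.exp(-max(math.exp(-x1), t * math.exp(-x2))
                    - (1 - t) * math.exp(-x2))

a = (2.0, 3.0)                   # arbitrary weights
theta = a[0] / (a[0] + a[1])     # = theta_{1,2}
for x in [(0.0, 0.5), (-1.2, 0.3), (0.8, -0.6)]:
    assert abs(multiext(x, a) - biext(x[0], x[1], theta)) < 1e-12
```

Note that no normalization of the \( a_{i} \) is needed: the margins come out reduced Gumbel automatically.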

The generalization of the Gumbel model is

\(\mathrm{ \Lambda ( x_{1}, \dots ,x_{m} ) =exp \{ -a_{1} \sum _{i}^{}e^{-x_{i}}-a_{2} \sum _{i<j}^{}max ( e^{-x_{i}},e^{-x_{j}} ) - \ldots -a_{m}\,max ( e^{-x_{1}}, \dots ,e^{-x_{m}} ) \} }\).

These distributions appeared in Tiago de Oliveira (1980).

A generalization of the logistic model, which can be defined by the rule

\(\mathrm{ ( -log~ \Lambda ( x_{1},x_{2} ) ) ^{1/ ( 1- \theta) }= ( -log~ \Lambda ( x_{1} ) ) ^{1/ ( 1- \theta ) } +( -log~ \Lambda ( x_{2} ) ) ^{1/ ( 1- \theta ) } }\),

could be

\(\mathrm{ ( -log~ \Lambda ( x_{1}, \dots ,x_{m} ) ) ^{ {1}/{ ( 1- \theta ) }}= \sum _{i}^{} ( -log\, \Lambda ( x_{i} ) ) ^{ {1}/{ ( 1- \theta ) }}= \sum _{i}^{}e^{-x_{i}/ ( 1- \theta ) } }\);

as the correlation coefficients between all margins would always be equal to \(\mathrm{ \theta ( 2- \theta ) }\), it does not seem useful for fitting data. Another formal characterization of the bivariate logistic model could be extended to multivariate extremes without imposing, as the previous one does, both symmetry and equality of the correlations between the margins, features which do not fit real (oceanographic, meteorological, hydrological, etc.) data.

A full generalization of the (bivariate) natural model, presented in Tiago de Oliveira (1987), is as follows: consider \(\mathrm{ N }\) independent reduced Gumbel random variables \(\mathrm{ Z_{1}, \dots ,Z_{N} }\) and an \(\mathrm{ ( m \times N ) }\) matrix \(\mathrm{ A= [ a_{ij} ] ~ ( i=1, \dots ,m;~j=1, \dots ,N ) }\); now define the random vector \(\mathrm{ ( X_{1}, \dots ,X_{m} ) }\) by

\(\mathrm{ X_{i}= {max}_{j=1}^{N} ( Z_{j}-a_{ij} ) }\) .

The condition   \(\mathrm{ Prob \{{ X_{i} \leq z }\} = \Lambda ( z ) }\) gives

\(\mathrm{ \prod_{j=1}^{N}Prob \{ Z_{j} \leq a_{ij}+z \} = \prod_{j=1}^{N} \Lambda ( a_{ij}+z ) = \Lambda ( z ) ~or~ \sum _{j=1}^{N}e^{-a_{ij}}=1 }\),

which imposes \(\mathrm{ m }\) conditions on the \(\mathrm{ m\,N }\) parameters \(\mathrm{ a_{ij} }\) (apart from their being \(\mathrm{ \geq 0 }\)); we thus have \(\mathrm{ m ( N-1 ) }\) non-negative parameters.

The joint distribution of \(\mathrm{ ( X_{1}, \dots ,X_{m} ) }\)  is

\(\mathrm{ Prob \{ X_{1} \leq x_{1}, \dots ,X_{m} \leq x_{m} \} =Prob \{ Z_{j} \leq {min}_{i=1}^{m} ( a_{ij}+x_{i} ) ,~j=1, \dots ,N \} = \prod_{j=1}^{N} \Lambda ( {min}_{i=1}^{m} ( a_{ij}+x_{i} ) ) }\)

\(\mathrm{ =exp \{ - \sum _{j=1}^{N}e^{-{min}_{i=1}^{m} ( a_{ij}+x_{i} ) } \} =exp \{ - \sum _{j=1}^{N}{max}_{i=1}^{m} ( e^{-a_{ij}}\,e^{-x_{i}} ) \} }\) .

The choice of the matrix \(\mathrm{ A= [ a_{ij} ] }\) is equivalent, putting \(\mathrm{ b_{ij}=e^{-a_{ij}} }\), to the choice of a stochastic matrix \(\mathrm{ B= \left[ b_{ij} \right] }\) because  \(\mathrm{ \sum _{j=1}^{N}b_{ij}=1 \left( b_{ij} \geq 0 \right) }\).
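A minimal numerical sketch (with a hypothetical stochastic matrix \( B \) for \( m=2 \), \( N=3 \); a large cutoff stands in for \( +\infty \)) confirms that the joint formula has reduced Gumbel univariate margins:

```python
import math

# Hypothetical stochastic matrix B: rows sum to 1, b_ij >= 0 (m = 2, N = 3)
B = [[0.5, 0.3, 0.2],
     [0.1, 0.6, 0.3]]

def natural_mev(x, B):
    """Joint d.f. exp{-sum_j max_i(b_ij e^{-x_i})} of the natural model."""
    N = len(B[0])
    total = sum(max(B[i][j] * math.exp(-x[i]) for i in range(len(x)))
                for j in range(N))
    return math.exp(-total)

# Univariate margins: send the other coordinate to +infinity; since each row
# of B sums to 1, the margin is the reduced Gumbel d.f. exp(-e^{-z})
INF = 50.0
for z in (-1.0, 0.0, 1.5):
    assert abs(natural_mev((z, INF), B) - math.exp(-math.exp(-z))) < 1e-12
    assert abs(natural_mev((INF, z), B) - math.exp(-math.exp(-z))) < 1e-12
```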

It is evident that a permutation of the rows of \(\mathrm{ A }\) (or \(\mathrm{ B }\)) leads to a vector \(\mathrm{ ( X_{1}^{’}, \dots ,X_{m}^{’} ) }\) with the same distribution as \(\mathrm{ ( X_{1}, \dots ,X_{m} ) }\), except for the permutation of the indices.

The bivariate distribution of the pair \(\mathrm{ ( X_{p},X_{q} ) }\) is

\(\mathrm{ \Lambda _{p,q}( x_{p},x_{q} ) =exp \{ - \sum _{j=1}^{N}e^{-min ( a_{pj}+x_{p},a_{qj}+x_{q}) } \} }\)

\(\mathrm{ =exp \{ - \sum _{j=1}^{N}max ( b_{pj}~e^{-x_{p}},b_{qj}~e^{-x_{q}} ) \} }\).

It can be shown that   \(\mathrm{ \begin{array}{c}\mathrm{ N} \\ \mathrm{ min} \\ \mathrm{ j=1 } \end{array} \left( a_{qj} \right) \leq X_{p}-X_{q} \leq \begin{array}{c}\mathrm{ N} \\ \mathrm{ min} \\ \mathrm{ j=1 } \end{array} \left( a_{pj} \right) }\).

The statistical problems for multivariate extremes have not yet been analysed in any depth, except as concerns pairwise behaviour (for instance, using correlation coefficients of different types), which would certainly give different estimates of the \(\mathrm{ a_{ij} }\) (or \(\mathrm{ b_{ij} }\)) if the identifiability problem is solved positively. In fact we can expect the sample size to be much larger than the number \(\mathrm{ m ( N-1 ) }\) of parameters.

4. Some notes on linear regression

Consider once more the \(\mathrm{ m }\)-dimensional extreme-value distribution with reduced margins and let us discuss, briefly, the best least-squares linear prediction of one component of \(\mathrm{ ( X_{1},X_{2}, \dots ,X_{m} ) }\) based on some other \(\mathrm{ p ~ ( <m ) }\) components. For notational simplicity we will suppose that the prediction variables are the first \(\mathrm{ p }\) ones \(\mathrm{ ( X_{1}, \dots ,X_{p} ) }\) and the variable to be predicted is the last one \(\mathrm{ ( X_{m} ) }\).

We thus have to seek the coefficients \(\mathrm{ \alpha _{0}, \alpha _{1}, \dots , \alpha _{p} }\) such that the predictor \(\mathrm{ \hat{X}_{m}= \alpha _{0}+ \sum _{1}^{p} \alpha _{i}X_{i} }\) best approximates \(\mathrm{ X_{m} }\) in the mean-square sense, i.e., where

\(\mathrm{ MSE ( \hat{X}_{m} ) =M ( \hat{X}_{m}-X_{m} ) ^{2}=M ( X_{m}- \alpha _{0}- \sum _{1}^{p} \alpha _{i}X_{i}) ^{2}=min }\).

We get           \(\mathrm{ \alpha _{0}= \gamma ( 1- \sum _{1}^{p} \alpha _{i} ) }\), where \(\mathrm{ \gamma =0.57722 \dots }\) is Euler’s constant, the mean of the reduced Gumbel distribution, and

\(\mathrm{ \rho _{mj}= \sum _{i=1}^{p} \alpha _{i}\, \rho _{ij} }\)

where, in the last \(\mathrm{ p }\) equations, the \(\mathrm{ ~ \rho _{ij} }\) denote the correlation coefficients between \(\mathrm{ X_{i} }\) and \(\mathrm{ X_{j} }\).

Thus we have \(\mathrm{ \hat{X}_{m}= \gamma + \sum _{i=1}^{p} \alpha _{i} ( X_{i}- \gamma) }\)with \(\mathrm{ ( \alpha _{1}, \dots , \alpha _{p} ) }\) given by the equations \(\mathrm{ \rho _{mj}= \sum _{i=1}^{p} \alpha _{i}\, \rho _{ij};\hat{X}_{m} }\) is obviously unbiased.

Clearly we are using reduced values; if we have margin parameters \(\mathrm{ \left( \lambda _{i}, \delta _{i} \right) }\) we will substitute \(\mathrm{ X_{i} }\) by \(\mathrm{( X_{i}- \hat{\lambda }_{i} ) / \hat{\delta} _{i} }\).

The MSE, with reduced margins assumed, is

\(\mathrm{ MSE=\frac{ \pi ^{2}}{6} ( 1- \sum _{i=1}^{p} \alpha _{i}~ \rho _{mi}) }\),

and so the relative reduction of variance is \(\mathrm{ \sum _{i=1}^{p} \alpha _{i} ~\rho _{mi} }\).

Using matrix notation, where \(\mathrm{ [ \rho _{m} ] = ( \rho _{m1}, \dots , \rho _{mp} ) ^{T} }\) is the vector of the correlations between the prediction variables and \(\mathrm{ X_{m} }\), \(\mathrm{ [ \alpha ] = ( \alpha _{1},\dots, \alpha _{p} ) ^{T} }\) is the vector of coefficients, and \(\mathrm{ R= [ \rho _{ij} ] ~ ( i,j=1, \dots ,p ) }\) is the correlation matrix between the prediction variables, the equations take the form \(\mathrm{ R [ \alpha ] = [ \rho _{m} ] }\), so that \(\mathrm{ [ \alpha ] =R^{-1} [ \rho _{m} ] }\), \(\mathrm{ MSE=\frac{ \pi ^{2}}{6} ( 1- [ \rho _{m} ] ^{T}~R^{-1} [ \rho _{m} ] ) }\), and the relative reduction of variance is \(\mathrm{ [ \rho _{m} ] ^{T}R^{-1} [ \rho _{m} ] }\).
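Computationally this is a small linear solve; the sketch below uses hypothetical correlation values for \( p=2 \) predictors (the 2×2 system is solved by hand, so no linear-algebra library is needed) and follows the equations above.

```python
import math

# Hypothetical reduced-margin correlations: predict X_3 from (X_1, X_2)
rho = {(1, 2): 0.4, (3, 1): 0.5, (3, 2): 0.35}
gamma = 0.5772156649  # Euler's constant, mean of the reduced Gumbel law

# Solve R [alpha] = [rho_m] for p = 2 (Cramer's rule on the 2 x 2 system)
det = 1.0 - rho[(1, 2)] ** 2
a1 = (rho[(3, 1)] - rho[(1, 2)] * rho[(3, 2)]) / det
a2 = (rho[(3, 2)] - rho[(1, 2)] * rho[(3, 1)]) / det

reduction = a1 * rho[(3, 1)] + a2 * rho[(3, 2)]  # relative variance reduction
mse = (math.pi ** 2 / 6) * (1 - reduction)       # since Var(X) = pi^2 / 6

def predict(x1, x2):
    """Unbiased least-squares linear predictor of the reduced variable X_3."""
    return gamma + a1 * (x1 - gamma) + a2 * (x2 - gamma)

assert 0 < reduction < 1 and mse < math.pi ** 2 / 6
```

For larger \( p \) one would solve \( R [ \alpha ] = [ \rho _{m} ] \) with a standard linear solver instead of Cramer’s rule.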

References

3. Fréchet, M., 1940. Les Probabilités Associées à un Système d'Événements Compatibles et Dépendants. Act. Scient. Ind., (859), Hermann et Cie, Paris.

4. Geffroy, J., 1958/59. Contribution à l'étude de la théorie des valeurs extrêmes. Publ. Inst. Statist. Univ. Paris, 7/8, 37-185 (Thèse de Doctorat d'État).

5. Gumbel, E. J., 1961. Multivariate extremal distributions. Bull. Int. Statist. Inst., 33rd Sess., Paris.

6. Tiago de Oliveira, J., 1958. Extremal distributions. Rev. Fac. Ciências de Lisboa, 2.ª sér., A, Mat., VII, 215-227.

7. Tiago de Oliveira, J., 1962/63. Structure theory of bivariate extremes; extensions. Estudos Mat., Estatíst. e Econometria, Lisboa, VII, 165-194.

8. Tiago de Oliveira, J., 1980. Bivariate extremes: foundations and statistics. In Multivariate Analysis V, P. R. Krishnaiah, ed., 349-368, North-Holland, Amsterdam.

9. Tiago de Oliveira, J., 1987. Comparaison entre les modèles bivariés logistique et naturel pour les maxima et extensions. C. R. Acad. Sc. Paris, 305, Sér. I, 481-484.