DUAL THEORY OF CHOICE UNDER MULTIVARIATE RISKS

ALFRED GALICHON†

MARC HENRY§

Abstract. We propose a multivariate extension of Yaari’s dual theory of choice under risk. We show that a decision maker with a preference relation on multidimensional prospects that preserves first order stochastic dominance and satisfies comonotonic independence behaves as if evaluating prospects with a weighted sum of quantiles. Both the notions of quantiles and of comonotonicity are extended to the multivariate framework using optimal transportation maps. Finally risk averse decision makers are characterized within this framework and their local utility function is derived.

Keywords: risk, non-expected utility theory, comonotonicity, optimal transportation.

JEL subject classification: D81, C61

Date: First version is March 3, 2008. The present version is of February 22, 2010. Correspondence address: Département d’Economie, Ecole polytechnique, 91128 Palaiseau, France. Both authors gratefully acknowledge support from Chaire Axa “Assurance des Risques Majeurs” and Chaire Société Générale “Risques Financiers”, and Galichon’s research is partly supported by Chaire EDF-Calyon “Finance and Développement Durable” and FiME, Laboratoire de Finance des Marchés de l’Energie (www.fime-lab.org). The authors thank Guillaume Carlier, Rose-Anne Dana, Nizar Touzi, Ivar Ekeland, and participants at the RUD 2008 conference, and the seminar in Economic theory in the Collegio Carlo Alberto, Torino for helpful discussions and comments.

Introduction

In his seminal paper [18], Menahem Yaari proposed a theory of choice under risk, which he called “dual theory of choice,” where risky prospects are evaluated with a weighted sum of quantiles. The resulting utility is less vulnerable to paradoxes such as Allais’ celebrated paradox [1]. The main ingredients in Yaari’s representation are the preservation of first order stochastic dominance, and insensitivity to hedging of comonotonic prospects. Both properties have strong normative and behavioral appeal once it is accepted that decision


makers care only about the distribution of risky prospects. The preservation of stochastic dominance is justified by the fact that decision makers prefer risky prospects that yield higher values in all states of the world, whereas comonotonicity captures the decision maker’s insensitivity to hedging comonotonic prospects. The dual theory has been used extensively as an alternative to expected utility in a large number of contexts.

The main drawback of the dual theory is that it does not properly handle the case when the prospects of consumptions of different natures are not perfect substitutes. This situation is commonly met in Economics and Finance. In a two-country economy with floating exchange rates, the fact that Arrow-Debreu assets yielding consumption in different floating currencies are not perfectly substitutable is known as the Siegel paradox; in the study of the term structure of interest rates, the fact that various maturities are (not) perfect substitutes is called the (failure of the) pure expectation hypothesis. To handle these situations, we need to be able to express utility derived from monetary consumption in different numéraires, which is easily done with Expected Utility theory, but so far not covered by Yaari’s dual theory. Indeed, the latter applies only to risky prospects defined as univariate random variables, thereby ruling out choice among multidimensional prospects which are not perfect substitutes for each other, such as risks involving both a liquidity and a price risk, collections of payments in different currencies, payments at different dates, prospects involving different goods of different natures such as consumption and environmental quality, etc. We propose to lift this constraint with a multivariate extension of the dual theory to risky prospects defined as random vectors, and applicable as such to the examples listed above.
The main challenge in this generalization is the definition of quantile functions and comonotonicity in the multivariate setting. Another challenge is to preserve the simplicity of the functional representing preferences, so that they can be parameterized and as efficiently computed as in the univariate case. Both challenges are met with an appeal to optimal transportation maps that allow the definition of “generalized quantiles,” their efficient computation, and the extension of comonotonicity as a notion of distribution free perfect correlation. With these notions of quantiles and comonotonicity at hand, we give a representation of a comonotonic independent preference relation as a weighted sum of


generalized quantiles. The main difference between the univariate case and the multivariate case is that comonotonicity and generalized quantiles are defined with respect to an objective baseline distribution, which features in the representation. This baseline distribution may be interpreted as a framing reference in the decision problem. We then turn to the representation of risk averse decision makers within this theory. Risk aversion is defined in the usual way as the preference for less risky prospects, where the notion of increasing risk is suitably generalized to multivariate risky prospects. We show, again in a direct generalization of the univariate case, that risk aversion is characterized by a special form of the quantile weights defined above: risk averse decision makers give more weight to low outcomes (low quantiles), and low weights to high outcomes (high quantiles). As a result, given the baseline distribution with respect to which comonotonicity is defined, risk averse decision makers are characterized by simple further restrictions on their utility functional, which makes this model as simple and as tractable as expected utility. Although we do not deal with computational issues in this paper, let us mention that these risk measures are conveniently numerically computable. Computational implementation is described in the companion paper to this one [4]. The paper is organized as follows. The next section gives a short exposition of the dual theory. The following section develops the generalized notions of quantiles and comonotonicity that are necessary to the multivariate extension, which is given in section 3. Finally, risk aversion is characterized in section 4, and the last section concludes. Notations and basic definitions. Let (S, F, P) be a non atomic probability space. 
Let $X : S \to \mathbb{R}^d$ be a random vector; we denote the probability distribution of $X$ by $L_X$, hence $L_X = X\#P$, where $X\#P := PX^{-1}$ denotes the push-forward of the probability measure $P$ by $X$. The equidistribution class of $X \sim P$, denoted indifferently equi($P$) or equi($X$), is the set of random vectors with distribution with respect to $P$ equal to $L_X$ (reference to $P$ will be implicit unless stated otherwise). $F_X$ denotes the cumulative distribution function of distribution $L_X$. $Q_X$ denotes its quantile function. In dimension 1, this is defined for all $t \in [0, 1]$ by $Q_X(t) = \inf\{x \in \mathbb{R} : \Pr(X \le x) > t\}$. In larger dimensions, it is defined in definition 3 of section 2 below. We call $L^2_d$ the set of square integrable F-measurable


functions $S \to \mathbb{R}^d$. We denote by $D$ the subset of $L^2_d$ containing random vectors with a density relative to Lebesgue measure. Finally, for two elements $X$, $Y$ of $L^2_d$, we write $X =_d Y$ to indicate equality in distribution, that is $L_X = L_Y$. We also write $X =_d L_X$. A measure $\mu$ on $\mathbb{R}^d$ is said to give no mass to small sets if $\mu(A) = 0$ whenever $A$ has Hausdorff dimension at most $d - 1$. Note that small sets have Lebesgue measure zero, so that absolutely continuous measures give no mass to small sets (but the converse does not always hold). A location-scale map is a function $\phi$ defined for all $x \in \mathbb{R}^d$ by $\phi(x) = \alpha x + x_0$, for some real $\alpha > 0$ and $x_0 \in \mathbb{R}^d$. For a convex lower semi-continuous function $V : \mathbb{R}^d \to \mathbb{R}$, we denote by $\nabla V$ its gradient.

1. Dual theory of decision under risk

In this section, we first revisit Yaari’s dual theory of decision under risk ([18]). As in the latter, we consider a problem of choice among risky prospects as modeled by random variables defined on an underlying probability space. The risky prospect $X$ is interpreted as a gamble or a lottery that a decision maker might consider holding and the realizations of $X$ are interpreted as payments.

1.1. Representation. We suppose the decision maker is characterized by a preference relation $\succsim$ on the set of risky prospects. $X \succsim Y$ indicates that the decision maker prefers prospect $X$ to prospect $Y$, $X \succ Y$ stands for $X \succsim Y$ and not $Y \succsim X$, whereas $X \sim Y$ stands for $X \succsim Y$ and $Y \succsim X$. We first introduce a set of axioms satisfied by the preference relation which are those proposed by Yaari in [18].

Axiom 1. The preference relation $\succsim$ is reflexive, transitive, connected and continuous relative to the topology of weak convergence.

Axiom 2. The preference $\succsim$ preserves first order stochastic dominance in the sense that if prospect $X$ first order stochastically dominates prospect $Y$, then $X \succsim Y$, and if $X$ strongly first order stochastically dominates prospect $Y$, then $X \succ Y$.

Axiom 3. If $X, Y, Z$ are pairwise comonotonic prospects, then for any $\alpha \in [0, 1]$, $X \succsim Y$ implies $\alpha X + (1 - \alpha)Z \succsim \alpha Y + (1 - \alpha)Z$.


With the first axiom (which corresponds to Axioms A2 and A3 in Yaari), we take the standard notion of preference as a continuous pre-order (reflexive and transitive binary relation) which is connected. Then, $\succsim$ can be represented by a continuous real valued function $\gamma$, in the sense that $X \succsim Y$ if and only if $\gamma(X) \ge \gamma(Y)$. The second axiom is generally considered uncontroversial in a law invariant setting. Note that since two equidistributed prospects first order stochastically dominate each other, it follows that $X =_d Y$ implies $X \sim Y$, thus this axiom implies law invariance of the preference relation, or what [18] calls neutrality. Neutrality can be interpreted as the fact that the decision maker is indifferent to relabelings of the states of the world. Once neutrality is accepted, axiom 2 is uncontroversial as it is equivalent to requiring that the decision maker prefer prospects that yield higher value in every state of the world. We shall see below that with suitable extensions of the concepts of monotonicity and stochastic dominance, this axiom remains uncontroversial in the multivariate extension of Yaari’s representation theorem. Finally, the third axiom is the crucial one in this framework, as it replaces independence by comonotonic independence. Recall that $X$ and $Y$ are comonotonic if $(X(s) - X(s'))(Y(s) - Y(s')) \ge 0$ for all $s, s' \in S$, and the absence of hedging opportunity between comonotonic prospects justifies the requirement of stability with respect to comonotonic mixtures embodied in this axiom. We can now state Yaari’s representation result.

Proposition 1 (Yaari). A preference $\succsim$ on $[0, 1]$-valued prospects satisfies axioms 1-3 if and only if there exists a non decreasing function $f$ defined on $[0, 1]$, such that $X \succsim Y$ if and only if $\gamma(X) \ge \gamma(Y)$, where $\gamma$ is defined for all $X$ as $\gamma(X) = \int_0^1 f(1 - F_X(t))\,dt$.

This result is interpretable in terms of weighting of outcomes (through the weighting of quantiles). Assume all necessary invertibility and regularity and write, by integration by parts,

$$\int_0^1 f(1 - F_X(t))\,dt = \int_0^1 f(1 - u)\,dQ_X(u) = \int_0^1 f(1 - u)\,\frac{d}{du}Q_X(u)\,du = \int_0^1 f'(1 - u)\,Q_X(u)\,du.$$

Hence, calling $\phi(u) = f'(1 - u)$, we have the representation of $\succsim$ with the functional $\int_0^1 \phi(t)Q_X(t)\,dt$. Hence increasing $f$ corresponds to positive $\phi$, which can be interpreted as a weighting of quantiles of the prospect $X$.
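The equivalence between the distribution-function form and the weighted-quantile form of the functional can be checked on an empirical distribution. The following is only a minimal sketch: the sample, the choice f(p) = p^2 and the function names are hypothetical, and the two forms are expected to agree only when f(0) = 0, which kills the boundary term of the integration by parts.

```python
def yaari_cdf_form(sample, f):
    """Integral over [0, 1] of f(1 - F_X(t)) dt for the empirical law of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    total, prev = 0.0, 0.0
    for i, x in enumerate(xs):
        # For t in [prev, x), exactly i sample points are <= t, so 1 - F_X(t) = (n - i) / n.
        total += f((n - i) / n) * (x - prev)
        prev = x
    # On [max(sample), 1] the survival function 1 - F_X is 0.
    return total + f(0.0) * (1.0 - prev)


def yaari_quantile_form(sample, f):
    """Weighted sum of empirical quantiles; the i-th weight is f(1 - i/n) - f(1 - (i+1)/n)."""
    xs = sorted(sample)
    n = len(xs)
    return sum(x * (f(1 - i / n) - f(1 - (i + 1) / n)) for i, x in enumerate(xs))


sample = [0.2, 0.9, 0.5, 0.5]   # hypothetical [0, 1]-valued prospect
f = lambda p: p ** 2            # hypothetical non decreasing f with f(0) = 0
print(yaari_cdf_form(sample, f), yaari_quantile_form(sample, f))
```

With f the identity, both forms reduce to the sample mean, which corresponds to a risk neutral decision maker.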


As noted in [18], the functional $\gamma$ is positively homogeneous and constant additive, which implies linearity in payments and the property that $\gamma(\gamma(X)) = \gamma(X)$, so that $\gamma(X)$ is the certainty equivalent of $X$ for the decision maker characterized by $\succsim$.

1.2. Risk aversion. We now turn to the characterization of risk averse decision makers within those satisfying axioms 1-3. We define increasing risk as Rothschild and Stiglitz ([12]). The formulations in the first part of the definition below are equivalent by the Blackwell-Sherman-Stein theorem (see for instance chapter 7 of [15]).

Definition 1 (Concave ordering, risk aversion).
a) A prospect $Y$ is dominated by $X$ in the concave ordering, denoted $Y \le_{cv} X$, when the equivalent statements (i) or (ii) hold:
(i) for all continuous concave functions $f$, $Ef(Y) \le Ef(X)$.
(ii) $Y$ has the same distribution as $\hat{Y}$ such that $(X, \hat{Y})$ is a martingale, i.e. $E(\hat{Y}|X) = X$ ($\hat{Y}$ is sometimes called a mean preserving spread of $X$).
b) The preference relation $\succsim$ is called risk averse if $X \succsim Y$ whenever $X \ge_{cv} Y$.

Notice that $x \to x$ and $x \to -x$ are both continuous and concave functions, therefore condition (i) in the first part of the definition implies that $E[X] = E[Y]$ is necessary for a concave ordering relationship between $X$ and $Y$ to exist. With this definition, we can recall the characterization of risk averse preferences satisfying axioms 1-3 as those with convex $f$ (see section 5 of [18], or theorem 3.A.7 of [15]).

Proposition 2. A preference relation $\succsim$ satisfying axioms 1-3 is risk averse if and only if the function $f$ in the representation is convex.

This characterization has the natural interpretation that risk averse decision makers evaluate prospects by giving high weights to low quantiles (corresponding to low values of the prospect) and low weights to high quantiles. Indeed, with the formulation $\int_0^1 \phi(u)Q_X(u)\,du$ and the identification $\phi(u) = f'(1 - u)$, increasing convex $f$ corresponds to positive decreasing $\phi$, and therefore to a weighting scheme where low quantiles (corresponding to unfavourable outcomes) receive high weights and high quantiles (corresponding to favourable outcomes) receive low weights.
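As a small numerical illustration of Proposition 2, the sketch below evaluates a sure payment and a mean preserving spread of it under a convex and a concave choice of f. The two-point prospects, the functions f and the helper name `yaari_value` are assumptions made for the example only.

```python
import math


def yaari_value(sample, f):
    """Empirical weighted sum of quantiles; the i-th weight is f(1 - i/n) - f(1 - (i+1)/n)."""
    xs = sorted(sample)
    n = len(xs)
    return sum(x * (f(1 - i / n) - f(1 - (i + 1) / n)) for i, x in enumerate(xs))


sure = [0.5, 0.5]      # X pays 0.5 in both equally likely states
spread = [0.2, 0.8]    # Y is a mean preserving spread of X

convex_f = lambda p: p ** 2   # weights (0.75, 0.25): the low quantile counts more
concave_f = math.sqrt         # weights roughly (0.29, 0.71): the high quantile counts more

print("convex f :", yaari_value(sure, convex_f), yaari_value(spread, convex_f))
print("concave f:", yaari_value(sure, concave_f), yaari_value(spread, concave_f))
```

Under the convex f the spread is valued below the sure payment, in line with risk aversion, whereas the concave f ranks the spread above it.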


2. Generalized quantiles, stochastic dominance and comonotonicity

2.1. Generalized quantiles. By the rearrangement inequality of Hardy, Littlewood and Pólya, we have the following well known equality:

$$\int_0^1 Q_X(t)\,t\,dt = \max\{E[XU] : U \text{ uniform on } [0, 1]\}, \qquad (2.1)$$

where the quantile function $Q_X$ has been defined above. This variational characterization is crucial when generalizing to the multivariate setting. Indeed, consider now a random vector $X$ on $\mathbb{R}^d$ and a baseline distribution $\mu$ on $\mathbb{R}^d$, with $U$ distributed according to $\mu$. We introduce maximal correlation functionals, to generalize the variational formulation of (2.1).

Definition 2 (Maximal correlation functionals). A functional $\varrho_\mu : L^2_d \to \mathbb{R}$ is called a maximal correlation functional with respect to a baseline distribution $\mu$ if for all $X \in L^2_d$,

$$\varrho_\mu(X) := \sup\{E[X \cdot \tilde{U}] : \tilde{U} =_d \mu\}.$$

This functional has an important geometric interpretation as the support function of the equidistribution class of $\mu$ (cf. figure 1). It follows from the theory of optimal transportation (see theorem 2.12(ii) page 66 of [17]) that if $\mu$ gives no mass to small sets (which will be assumed throughout the rest of the paper), then there exists a convex lower semi-continuous function $V : \mathbb{R}^d \to \mathbb{R}$, and a random vector $U$ distributed according to $\mu$, such that $X = \nabla V(U)$ holds $\mu$-almost surely, and $\varrho_\mu(X) = E[\nabla V(U) \cdot U]$. In that case, the pair $(U, X)$ is said to achieve the optimal quadratic coupling of $\mu$ with the distribution of $X$. The function $V$ is called the Kantorovich potential of $X$ with respect to $\mu$, or transportation potential from $\mu$ to the probability distribution of $X$. Note that $V$ is convex, hence differentiable except on a small set by Rademacher’s theorem, so that the expression $E[\nabla V(U) \cdot U]$ above is well defined. This shows that the gradient of the convex function $V$ thus obtained satisfies the multivariate analogue of equation (2.1). We therefore adopt $\nabla V$ as our notion of generalized quantile.
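The variational definition can be explored on small empirical distributions, where the equidistribution class of the baseline is replaced by the set of permutations of a fixed sample. This is only a brute-force sketch under that discretization assumption; the sample points and the function name `max_correlation` are hypothetical, and the enumeration is practical for a handful of points only.

```python
from itertools import permutations


def max_correlation(xs, us):
    """Largest average of x_i . u_sigma(i) over all permutations sigma (brute force)."""
    n = len(xs)

    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    return max(
        sum(dot(xs[i], us[s[i]]) for i in range(n)) / n
        for s in permutations(range(n))
    )


# d = 1: the maximum is attained by pairing sorted values, as in (2.1).
x1 = [0.3, -1.2, 2.0, 0.7, 0.1]
u1 = [(i + 0.5) / 5 for i in range(5)]   # grid standing in for U uniform on [0, 1]
brute_1d = max_correlation([(x,) for x in x1], [(u,) for u in u1])
sorted_1d = sum(x * u for x, u in zip(sorted(x1), sorted(u1))) / 5

# d = 2: no total order is available, but the same variational definition applies.
x2 = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0), (0.5, 0.5)]
u2 = [(1.0, 1.0), (-1.0, 0.5), (0.0, -1.0), (0.5, -0.5)]
brute_2d = max_correlation(x2, u2)
identity_2d = sum(a * c + b * d for (a, b), (c, d) in zip(x2, u2)) / 4

print(brute_1d, sorted_1d)
print(brute_2d, identity_2d)
```

In dimension one the brute-force value coincides with the sorted pairing, while in dimension two the optimal pairing has to be searched for, which is the role played by the optimal transportation map in the continuous setting.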


Definition 3 (µ-quantile). The µ-quantile function of a random vector $X$ on $\mathbb{R}^d$ with respect to a distribution $\mu$ on $\mathbb{R}^d$ that does not give mass to small sets is defined as $Q_X = \nabla V$, where $V$ is the Kantorovich potential of $X$ with respect to $\mu$.

We now turn to an example in order to illustrate definitions 2 and 3. Recall that for $\Sigma$ a positive definite matrix, $\Sigma^{1/2}$ is the unique solution of the equation $S^2 = \Sigma$ which is also positive definite.

Example 1 (Gaussian case). When the prospect $X =_d L_X = N(0, \Sigma)$ is Gaussian, as well as the baseline scenario $\mu = N(0, I_d)$, then the transportation potential from $\mu$ to $L_X$ is $V_X(u) = \frac{1}{2}\langle \Sigma^{1/2}u, u \rangle$; the µ-quantile function is $Q_X(u) = \nabla V_X(u) = \Sigma^{1/2}u$, and the maximal correlation functional is $\varrho_\mu(X) = \mathrm{Tr}(\Sigma^{1/2})$.
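The Gaussian formulas can be checked numerically in dimension two, where the positive definite square root has a closed form. The sketch below is illustrative only: the covariance matrix, the seed and the sample size are arbitrary assumptions, and `sqrt_spd_2x2` is a hypothetical helper relying on the standard 2x2 identity $\Sigma^{1/2} = (\Sigma + \sqrt{\det \Sigma}\, I)/\sqrt{\mathrm{Tr}\,\Sigma + 2\sqrt{\det \Sigma}}$.

```python
import math
import random


def sqrt_spd_2x2(m):
    """Positive definite square root of a 2x2 symmetric positive definite matrix."""
    (a, b), (_, d) = m
    s = math.sqrt(a * d - b * b)      # square root of the determinant
    t = math.sqrt(a + d + 2.0 * s)    # equals the trace of the square root
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]


sigma = [[2.0, 1.0], [1.0, 2.0]]      # hypothetical covariance matrix
root = sqrt_spd_2x2(sigma)
trace_root = root[0][0] + root[1][1]

# Monte Carlo estimate of E[Q_X(U) . U] with U ~ N(0, I_2) and Q_X(u) = root u.
random.seed(0)
n = 20000
acc = 0.0
for _ in range(n):
    u = (random.gauss(0.0, 1.0), random.gauss(0.0, 1.0))
    q = (root[0][0] * u[0] + root[0][1] * u[1], root[1][0] * u[0] + root[1][1] * u[1])
    acc += q[0] * u[0] + q[1] * u[1]

print(trace_root, acc / n)
```

The simulated average should be close to the trace of the square root, up to Monte Carlo error.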

Figure 1. Representation of the support functional. The equidistribution class of $U$ is the circle, and $\tilde{U}$ is the $L^2$ projection of $X$ on the equidistribution class of $U$. The support function of the equidistribution class of $U$ at $X$ is $E[X \cdot \tilde{U}] = [EX^2 + EU^2 - E[(X - \tilde{U})^2]]/2$.

2.2. Generalized notion of comonotonicity. We now turn to an extension of the concept of comonotonicity. Two prospects X and Y are comonotonic if there is a prospect U and non decreasing maps TX and TY such that Y = TY (U ) and X = TX (U ) almost surely,


or equivalently $E[UX] = \max\{E[X\tilde{U}] : \tilde{U} =_d U\}$ and $E[UY] = \max\{E[Y\tilde{U}] : \tilde{U} =_d U\}$. Comonotonicity is hence characterized by maximal correlation between the prospects over the equidistribution class. This variational characterization will be the basis for our generalized notion of comonotonicity.

Definition 4 (µ-comonotonicity). Let $\mu$ be a probability measure on $\mathbb{R}^d$ with finite second moments. A collection of random vectors $X_i \in L^2_d$, $i \in I$, is called µ-comonotonic if one has

$$\varrho_\mu\Big(\sum_{i \in I} X_i\Big) = \sum_{i \in I} \varrho_\mu(X_i).$$
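The additivity in this definition can be illustrated on a small sample, again replacing the equidistribution class of the baseline by permutations of a fixed set of points. This is a sketch under that assumption: the baseline points and the matrices A and B are hypothetical, and the prospects are built as linear images of the same baseline sample by symmetric positive definite matrices, that is, as gradients of convex quadratic functions.

```python
from itertools import permutations


def max_correlation(xs, us):
    """Largest average of x_i . u_sigma(i) over permutations sigma, for points in R^2."""
    n = len(xs)
    return max(
        sum(x[0] * us[j][0] + x[1] * us[j][1] for x, j in zip(xs, s)) / n
        for s in permutations(range(n))
    )


def apply(m, u):
    return (m[0][0] * u[0] + m[0][1] * u[1], m[1][0] * u[0] + m[1][1] * u[1])


def add(xs, ys):
    return [(x[0] + y[0], x[1] + y[1]) for x, y in zip(xs, ys)]


us = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5), (0.5, -1.0), (-0.5, -0.5)]  # baseline sample
A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric positive definite
B = [[1.0, 0.0], [0.0, 3.0]]   # symmetric positive definite

xs = [apply(A, u) for u in us]   # x_i = A u_i, gradient of u -> <Au, u>/2
ys = [apply(B, u) for u in us]   # y_i = B u_i, built from the same u_i
zs = ys[1:] + ys[:1]             # same values as ys, attached to different states

rho = lambda v: max_correlation(v, us)
gap_comonotonic = rho(xs) + rho(ys) - rho(add(xs, ys))
gap_shuffled = rho(xs) + rho(zs) - rho(add(xs, zs))
print(gap_comonotonic, gap_shuffled)
```

The first gap vanishes up to rounding because the same pairing is optimal for both prospects. The second gap is non-negative by subadditivity of the functional, and is positive unless a single pairing happens to be optimal for both prospects.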

When $\mu$ does not give mass to small sets, it follows from the representation of $\varrho_\mu$ that the family $X_i$ is µ-comonotonic if and only if there exists a vector $U$ distributed according to $\mu$ such that $U \in \mathrm{argmax}_{\tilde{U}}\{E[X_i \cdot \tilde{U}] : \tilde{U} =_d \mu\}$ for all $i \in I$. In other words, the $X_i$’s can be rearranged simultaneously so that they achieve maximal correlation with $U$. It is useful to see how our definition of µ-comonotonicity translates in the case of Gaussian risks.

Example 2 (Gaussian case, continued). When the baseline scenario $\mu = N(0, I_d)$ is Gaussian, then two prospects $X_i =_d N(0, \Sigma_i)$, $i = 1, 2$, are µ-comonotonic if and only if there exists a random vector $U =_d N(0, I_d)$ such that $X_1 = \Sigma_1^{1/2}U$ and $X_2 = \Sigma_2^{1/2}U$.

Note that when the distributions of the random vectors are absolutely continuous, the concept of µ-comonotonicity is transitive.

Proposition 3. Suppose that $X$ and $Y$ are µ-comonotonic, and that $Y$ and $Z$ are µ-comonotonic, with the distribution of $Y$ assumed to be absolutely continuous. Then $X$ and $Z$ are µ-comonotonic.

In dimension one, it is known since a seminal paper of Landsberger and Meilijson [5] that a risk-sharing allocation is Pareto efficient with respect to the concave order if and only if it is comonotonic. It turns out that, as recently shown by Carlier, Dana and Galichon [3], the essence of this result can be extended to the multivariate case, when comonotonicity is replaced by multivariate comonotonicity, if one defines an allocation to be comonotonic in the


multivariate sense if and only if it is µ-comonotonic for some measure $\mu$ with enough regularity. In our view, this result strongly supports the claim that our notion of comonotonicity is in some sense the “natural” multivariate extension of comonotonicity.

In their paper [8], Puccetti and Scarsini also applied the theory of optimal transportation to generalize the notion of comonotonicity to the multivariate setting. They also review other possible multivariate extensions of comonotonicity, including ours. But the notion they favor differs from ours in the sense that according to their notion of multivariate comonotonicity (which they call c-comonotonicity), two vectors $X$ and $Y$ are c-comonotonic if and only if $X$ and $Y$ are in optimal quadratic coupling. In other words, c-comonotonic vectors $X$ and $Y$ are µ-comonotonic for $\mu = L_X$ (or equivalently, for $\mu = L_Y$). Note that the notion of c-comonotonicity is not transitive: it is false in general that $X$ and $Y$ c-comonotonic and $Y$ and $Z$ c-comonotonic imply that $X$ and $Z$ are c-comonotonic. It also seems that c-comonotonicity is not related to efficient risk-sharing allocations.

One can interpret geometrically the notion of µ-comonotonicity by seeing that two or more prospects are µ-comonotonic if they have the same $L^2$ projection on the equidistribution class of $\mu$, as shown in figure 2. But there is an important nuance between dimension one and above here. Indeed, in dimension one, one recovers the classical notion of comonotonicity regardless of the choice of $\mu$. However, in dimension greater than one, the comonotonicity relation crucially depends on the baseline distribution $\mu$. The following lemma (Lemma 10 in [4]) makes this precise:

Lemma 1. Let $\mu$ and $\nu$ be probability measures on $\mathbb{R}^d$ that give no mass to small sets. Then:
- In dimension $d = 1$, µ-comonotonicity always implies ν-comonotonicity.
- In dimension $d \ge 2$, µ-comonotonicity implies ν-comonotonicity if and only if $\nu = T\#\mu$ for some location-scale transform $T(u) = \lambda u + u_0$ where $\lambda > 0$ and $u_0 \in \mathbb{R}^d$. In other words, comonotonicity is an invariant of the location-scale family classes.

2.3. Stochastic orders. The remaining notions needed to extend Yaari’s representation to multivariate prospects are first order stochastic dominance (for axiom 2 and monotonicity) and mean preserving spreads (for risk aversion).


Figure 2. The equidistribution class of $U$ is the circle, and two µ-comonotone random vectors $X$ and $Y$ have the same $L^2$ projection $\tilde{U}$ on the equidistribution class of $U$ with distribution $\mu$.

Given sufficient regularity, first order stochastic dominance can be characterized equivalently by pointwise dominance of cumulative distribution functions or pointwise dominance of quantile functions. It is the latter that we adopt for our multivariate definition.

Definition 5 (µ-first order stochastic dominance). A prospect $X$ µ-first order stochastically dominates prospect $Y$ relative to the componentwise partial order $\ge$ on $\mathbb{R}^d$ if $Q_X(t) \ge Q_Y(t)$ for almost all $t \in \mathbb{R}^d$, where $Q_X$ and $Q_Y$ are the generalized quantiles of $X$ and $Y$ with respect to a distribution $\mu$ on $\mathbb{R}^d$.

Note that for any $U =_d \mu$, we have $Q_X(U) =_d X$ and $Q_Y(U) =_d Y$, and µ-first order stochastic dominance implies $Q_X(U) \ge Q_Y(U)$ almost surely, hence $\hat{X} \ge \hat{Y}$ almost surely for some $\hat{X} =_d X$ and $\hat{Y} =_d Y$, which is the “usual multivariate stochastic order” (see [15] page 266). The converse does not hold in general.

As for risk aversion, following [12], we generalize the concave ordering to the multivariate setting.

Proposition 4 (Concave ordering). For any prospects $X$ and $Y$ whose respective distributions do not give mass to small sets, the following properties are equivalent.
(a) For every bounded concave function $f$ on $\mathbb{R}^d$, $Ef(X) \ge Ef(Y)$.


(b) $Y =_d \hat{Y}$, with $E[\hat{Y}|X] = X$.
(c) $\varrho_\mu(X) \le \varrho_\mu(Y)$ for every probability measure $\mu$.
(d) $X$ belongs to the closure of the convex hull of the equidistribution class of $Y$.
(e) For every upper semi-continuous law-invariant concave functional $\Phi : L^2_d \to \mathbb{R}$, one has $\Phi(X) \ge \Phi(Y)$.

When any of the properties above holds, one says that $Y$ is dominated by $X$ in the concave ordering, denoted $Y \le_{cv} X$. Note that item (d) of this proposition provides an important geometric interpretation of the concave ordering, which is illustrated in figure 3.

Figure 3. $Y$ is a mean preserving spread of $X$ when $X$ belongs to the closure of the convex hull (illustrated by the shaded disk) of the equidistribution class of $Y$ (illustrated by the circle).
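Properties (a) and (c) of Proposition 4 can be illustrated in dimension one on a discrete example, keeping in mind that the proposition is stated for distributions that give no mass to small sets, so the discrete setting is only an approximation. In this sketch the samples, the grid standing in for a uniform baseline and the concave test functions are all hypothetical choices.

```python
import math

x = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]   # sample of X, each value listed twice
y = [0.5, 1.5, 1.5, 2.5, 2.5, 3.5, 3.5, 4.5]   # each value of X moved to x - 0.5 or x + 0.5
u = [(k + 0.5) / 8 for k in range(8)]          # grid standing in for a uniform baseline


def mean(values):
    return sum(values) / len(values)


def rho(sample):
    """Univariate maximal correlation with the grid u, obtained by pairing sorted values."""
    return mean([s * t for s, t in zip(sorted(sample), u)])


# Property (a): E f(X) - E f(Y) for a few functions that are concave on the range used.
concave_tests = [math.sqrt, lambda t: -(t - 2.0) ** 2, lambda t: min(t, 2.0)]
expected_utility_gaps = [
    mean([f(v) for v in x]) - mean([f(v) for v in y]) for f in concave_tests
]

# Property (c): the spread Y has the larger maximal correlation with the baseline.
rho_gap = rho(y) - rho(x)

print(mean(x), mean(y))
print(expected_utility_gaps, rho_gap)
```

Both samples have the same mean, every concave test function ranks X weakly above Y, and the maximal correlation functional ranks Y above X, consistent with Y being a mean preserving spread of X.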

3. Multivariate Representation Theorem

Now that we have given a formalization of the notion of maximal correlation in a law invariant sense that is suitable for multivariate extension, we can proceed to generalize Yaari’s representation result to the case of a preference relation among multivariate prospects. We consider prospects which are elements of $L^2_d$.

Axiom 1’. The preference $\succsim$ is represented by a continuous functional $\gamma$ on $L^2_d$, which is Fréchet differentiable in at least one point, with non zero derivative.


Let $\mu$ be a probability distribution on $\mathbb{R}^d$ that gives no mass to small sets.

Axiom 2’. The preference $\succsim$ preserves µ-first order stochastic dominance in the sense that if prospect $X$ µ-first order stochastically dominates prospect $Y$, then $X \succsim Y$, and if $X$ strictly µ-first order stochastically dominates prospect $Y$, then $X \succ Y$.

Axiom 3’. If $X, Y, Z$ are µ-comonotonic prospects, then for any $\alpha \in [0, 1]$, $X \succsim Y$ implies $\alpha X + (1 - \alpha)Z \succsim \alpha Y + (1 - \alpha)Z$.

Note first that although axiom 1’ is stated directly in terms of the representing functional for simplicity of exposition, it is a rather mild requirement, as it is satisfied when $\succsim$ is a non degenerate Lipschitz pre-order (see [6]). Indeed, a Lipschitz pre-order is represented by a Lipschitz functional, hence satisfies axiom 1’ by Rademacher’s theorem (theorem 2.4 in [9]). The key to the generalization of the dual theory to multivariate prospects is the extension of the comonotonicity axiom. The statement of axiom 3 is unchanged, but the notion of comonotonicity is now dependent on a baseline distribution $\mu$, and $X, Y, Z$ are comonotonic, or more precisely µ-comonotonic, if they are all maximally correlated in the law invariant sense of definition 4 with a baseline $U$ (where $U$ has distribution $\mu$).

3.1. Representation theorem. We are now in a position to give the multivariate extension of Yaari’s representation theorem.

Theorem 1 (Multivariate Representation). A preference relation on square integrable multivariate prospects satisfies axioms 1’, 2’ and 3’ relative to a baseline probability measure $\mu$ if and only if there exists a function $\phi$ such that for $U =_d \mu$, $\phi(U)$ is square integrable, $\phi(U) \in (\mathbb{R}_-)^d$ almost surely and such that for all pairs $X, Y$, $X \succsim Y$ if and only if $\gamma(X) \ge \gamma(Y)$, where $\gamma$ is defined for all $X$ by $\gamma(X) = E[Q_X(U) \cdot \phi(U)]$, where $Q_X$ is the µ-quantile of $X$.

Since $\gamma$ in theorem 1 is positively homogeneous, $\gamma(X)$ is the certainty equivalent of $X$, and since $\gamma$ remains constant additive, $\succsim$ still satisfies linearity in payments, i.e. for any positive real number $a$, and any $b \in \mathbb{R}^d$ (identified with a constant multivariate prospect), $\gamma(aX + b) = a\gamma(X) + b$.


4. Risk aversion, local utility function

In this section we consider the question of representing those decision makers satisfying axioms 1’, 2’ and 3’ that are risk averse in the sense of definition 1; we then show that the local utility function in the sense of Machina (1982) is easily computable and provides an interpretation of the baseline distribution $\mu$.

4.1. Risk aversion. As it turns out, imposing risk aversion on a preference relation that satisfies these axioms is equivalent to requiring the following property, sometimes called preference for diversification.

Axiom 4. For any two preference equivalent prospects $X$ and $Y$, i.e. such that $X \sim Y$, convex combinations are preferred to either of the prospects, i.e. for any $\alpha \in [0, 1]$, $\alpha X + (1 - \alpha)Y \succsim X$.

This is formalized in the following theorem, which gives a representation for risk averse Yaari decision makers.

Theorem 2. In dimension $d \ge 2$, for a preference relation satisfying axioms 1’, 2’ and 3’, the following statements are equivalent:
(a) $\succsim$ is risk averse, namely $X \succsim Y$ whenever $X \ge_{cv} Y$.
(b) $\succsim$ satisfies axiom 4.
(c) The function $\phi$ involved in the representation of the preference relation in Theorem 1 satisfies $-\phi(u) = \alpha u + u_0$ for $\alpha > 0$ and $u_0 \in \mathbb{R}^d$.

So, in the multivariate setting the functional $\gamma$ is convex if and only if $-\phi$ is a location-scale transformation: this is a major difference with dimension one, where the functional is convex if and only if $-\phi$ is a non-decreasing map. This implies that a multivariate Yaari risk averse decision maker is entirely characterized by the baseline distribution $\mu$.

4.2. Local Utility Function. Throughout the rest of the paper, we shall assume the conditions in theorem 2 are met. By law-invariance, we denote $\gamma(P) := \gamma(X)$, where $X \sim P$. Without loss of generality, we shall also assume $\phi(u) = -u$, thus $\gamma(P) := -E[\langle \nabla V_P(U), U \rangle]$,


where $V_P = V_X$ is the Kantorovich transportation potential from the baseline probability distribution $\mu$ of $U$ to the probability distribution $P$ of $X$. As we have seen, the gradient $\nabla V_P$ of this transportation potential is the µ-quantile function of distribution $P$. As shown in [7], when smoothness requirements are met, a local analysis can be carried out where a (risk-averse) non-Expected Utility function behaves for small perturbations around a fixed risk as a (concave) utility function. Formally, the local utility function is defined as $u(x|P) = D_P\gamma(x)$, where $D_P\gamma$ is the Fréchet derivative of $\gamma$ at $P$. With the sign convention $\phi(u) = -u$, the multivariate Yaari decision functional $\gamma$ is the negative of the value of the optimal transportation problem:

$$\gamma(P) = -\max\{E[X \cdot U] : X =_d P,\ U =_d \mu\}.$$

Denoting by $V^*(x) = \sup_u [u \cdot x - V(u)]$ the Legendre-Fenchel transform of a convex lower semi-continuous function $V$, we have:

$$\begin{aligned} E[\nabla V_P(U) \cdot U] &= \max\{E[X \cdot U] : X =_d P,\ U =_d \mu\} \\ &= \min\Big\{\int V(u)\,d\mu(u) + \int V^*(x)\,dP(x) : V \text{ convex and l.s.c.}\Big\} \\ &= \int V_P(u)\,d\mu(u) + \int V_P^*(x)\,dP(x) \end{aligned}$$

by the duality of optimal transportation (see for instance theorem 2.9 page 60 of [17]). Defining $f(V, Q) = \int V\,d\mu + \int V^*\,dQ$, we have $\gamma(P) = -\inf_V f(V, P)$, hence an envelope theorem argument formally yields $u(x|P) = D_P\gamma(x) = -V_P^*(x)$. Hence the local utility function is $-V_P^*$, the (negative of the) Legendre-Fenchel transform of the Kantorovich transportation potential $V_P$. This point sheds light on the economic interpretation of the latter, in the spirit of Machina’s theory of local utility. The function $-V_P^*$ is concave, which is consistent with the risk aversion of a Yaari decision maker under the assumptions of theorem 2. Note that for univariate prospects, $u(x|P) = -V_P^*(x) = -\int_{-\infty}^x F_X(z)\,dz$, so that we recover the fact that when $X =_d P$ is a mean preserving spread of $Y =_d Q$, $u(z|P) \le u(z|Q)$ for all $z$.
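The univariate formula for the local utility can be checked numerically in a simple case. The sketch below assumes, for illustration only, that both the baseline µ and the prospect X are uniform on [0, 1], so that Q_X(t) = t and V_P(u) = u^2/2 on [0, 1]; the grid size and the function names are hypothetical, and the Legendre-Fenchel transform is approximated by a maximum over the grid.

```python
def legendre_transform(v, grid, x):
    """Numerical sup over the grid of u * x - v(u)."""
    return max(u * x - v(u) for u in grid)


def integrated_cdf_uniform(x):
    """Integral from -infinity to x of F_X(z) dz for X uniform on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x <= 1.0:
        return 0.5 * x * x
    return x - 0.5


grid = [k / 2000 for k in range(2001)]   # discretization of [0, 1], the support of mu
potential = lambda u: 0.5 * u * u        # V_P(u) = integral of Q_X(t) = t from 0 to u

points = [-0.5, 0.0, 0.3, 0.8, 1.0, 1.7]
local_utility = [-legendre_transform(potential, grid, x) for x in points]
check = [-integrated_cdf_uniform(x) for x in points]

for x, a, b in zip(points, local_utility, check):
    print(x, a, b)
```

The two columns agree up to discretization error, and the resulting local utility is concave and non-increasing in x, which reflects the sign convention φ(u) = −u adopted in this section.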


Conclusion

We have developed notions of quantiles and comonotonicity for multivariate prospects, thus allowing for the consideration of choice among vectors of payments in different currencies, at different times, in different categories of goods, etc. The multivariate notions of quantiles and comonotonicity were used to generalize Yaari’s dual theory of choice under risk, where decision makers that are insensitive to hedging of comonotonic risks are shown to evaluate prospects with a weighted sum of quantiles. Risk averse decision makers were shown to be characterized within this framework by a single location-scale map, making the dual theory as readily applicable as expected utility.

† Département d’économie, École polytechnique. E-mail: [email protected]

§ Département de sciences économiques, Université de Montréal, CIRANO, CIREQ. E-mail: [email protected]

Appendix A. Proof of results in the main text

A.1. Proof of proposition 3. By definition, there are two convex lower semi-continuous functions $V_1$ and $V_2$ and a random vector $U =_d \mu$ such that $X = \nabla V_1(U)$ and $Y = \nabla V_2(U)$ almost surely. Similarly there are convex functions $V_3$ and $V_4$ and a random vector $\tilde{U}$ such that $Y = \nabla V_3(\tilde{U})$ and $Z = \nabla V_4(\tilde{U})$. Now the assumption on the absolute continuity of $\mu$ and the distribution of $Y$ implies that $\nabla V_2$ is essentially unique hence $\nabla V_2 = \nabla V_3$, and therefore $U = \tilde{U}$ holds almost surely. It follows that $X$ and $Z$ are µ-comonotonic. □

A.2. Proof of lemma 1. The proof is given in [4], Lemma 10, but we recall it here for completeness. Let $d \ge 2$, and suppose that µ-comonotonicity implies ν-comonotonicity. Consider $U =_d \mu$, and let $\phi$ be the convex potential (defined up to an additive constant) such that $\nabla\phi\#\nu = \mu$. Then there exists a random vector $V =_d \nu$ such that $U = \nabla\phi(V)$ almost surely. Consider some arbitrary choice $\Sigma$ of a symmetric positive matrix of size $d$. Then the map $u \to \Sigma u$ is the gradient of a convex function (namely the associated quadratic form), therefore $U$ and $\Sigma U$ are µ-comonotonic. By assumption, it follows that $U$ and $\Sigma U$ are also ν-comonotonic, hence there exists a convex potential $\zeta$ such that $\Sigma U = \nabla\zeta(V)$. Therefore, the


equality $\Sigma\nabla\phi = \nabla\zeta$ holds almost everywhere. By differentiating twice (which can be done almost everywhere, by Alexandrov’s second differentiability theorem), we get that $\Sigma D^2\phi = D^2\zeta$, hence $\Sigma D^2\phi$ is almost everywhere a symmetric matrix. This being true regardless of the choice of $\Sigma$, it follows that $D^2\phi$ is almost everywhere a diagonal matrix, hence there exists a real valued map $\lambda(u)$ such that $D^2\phi(u) = \lambda(u) I_d$, with $\lambda(u) > 0$. But this implies $\partial_{u_i}\partial_{u_j}\phi(u) = 0$ for $i \ne j$ and $\partial^2_{u_i}\phi(u) = \lambda(u)$ for all $i$. Therefore, taking $i \ne j$, $\partial_{u_j}\lambda(u) = \partial_{u_j}\partial^2_{u_i}\phi(u) = 0$. Hence $\lambda(u) = \lambda$, a strictly positive constant. It follows that $\nabla\phi(u) = \lambda u + u_0$. The converse holds trivially. □

A.3. Proof of proposition 4. The equivalence between (a) and (b) is a famous result stated and improved by many authors, notably Hardy, Littlewood, Polya, Blackwell, Stein, Sherman, Cartier, Fell, Meyer and Strassen. See theorem 2 of [16] for an elegant proof; see also [11], theorem 6.

We now show that (b) implies (c). Suppose (b) holds. As explained in Section 2.1, there exists a map $\zeta$ such that $\varrho_\mu(X) = E[\zeta(X)\cdot X]$ and $\zeta(X) =_d \mu$. Now, $E[\zeta(X)\cdot X] = E[\zeta(X)\cdot E[\hat Y|X]] = E[\zeta(X)\cdot\hat Y] \le \varrho_\mu(Y)$.

(c) implies (d): indeed, the convex closure $\mathrm{co}(\mathrm{equi}(Y))$ of the equidistribution class of $Y$ is a closed convex set and hence characterized by its support functional $\varrho_\mu(Y)$. Hence $X \in \mathrm{co}(\mathrm{equi}(Y))$ is equivalent to $E[Z\cdot X] \le \varrho_\mu(Y)$ for all $Z$, which is equivalent to $\varrho_\mu(X) \le \varrho_\mu(Y)$.

(d) implies (e): indeed, if $X \in \mathrm{co}(\mathrm{equi}(Y))$, there is a sequence $(Y_k^n)_{k\le n}$ of random vectors each distributed as $Y$ and positive weights $\alpha_k^n$ such that $\sum_{k=1}^n \alpha_k^n = 1$, and $X = \lim_{n\to\infty}\sum_{k=1}^n \alpha_k^n Y_k^n$. Then for any law invariant concave functional $\Phi$, we have

$$\Phi\left(\sum_{k=1}^n \alpha_k^n Y_k^n\right) \ge \sum_{k=1}^n \alpha_k^n \Phi(Y_k^n) = \Phi(Y),$$

and the conclusion follows by upper semi-continuity.

Finally (e) implies (a) since when $\mathcal{L}_X$ gives no mass to small sets, for all bounded concave functions $f$, $X \mapsto E f(X)$ is a law invariant concave upper semi-continuous functional. □
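The step from (b) to (c) can be checked numerically on a toy sample. The following Python sketch is only an illustration under stated assumptions: the reference distribution µ is replaced by the uniform distribution on five hypothetical points of the plane, the supremum defining the maximal correlation functional is computed by brute force over reorderings of those points, and the helper names `max_correlation` and `group_average` are ours, not the paper’s. The sample X is obtained from Y by averaging within groups, a discrete stand-in for the conditional expectation appearing in (b).

```python
from itertools import permutations


def dot(a, b):
    """Euclidean inner product of two equal-length tuples."""
    return sum(s * t for s, t in zip(a, b))


def max_correlation(xs, us):
    """Empirical analogue of sup E[X . U~] over U~ distributed as mu:
    the best average inner product over all reorderings of the
    reference sample `us` (brute force, so only for tiny samples)."""
    n = len(xs)
    return max(
        sum(dot(x, us[j]) for x, j in zip(xs, perm)) / n
        for perm in permutations(range(n))
    )


def group_average(ys, groups):
    """Replace each point by the mean of its group: a discrete
    conditional expectation, hence a mean-preserving contraction."""
    dim = len(ys[0])
    xs = [None] * len(ys)
    for group in groups:
        mean = tuple(sum(ys[i][k] for i in group) / len(group) for k in range(dim))
        for i in group:
            xs[i] = mean
    return xs


# Hypothetical data: reference sample (stand-in for mu) and a prospect Y.
us = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 3)]
ys = [(1, 2), (3, 0), (0, 4), (2, 2), (4, 1)]
xs = group_average(ys, groups=[[0, 1], [2, 3, 4]])

rho_x = max_correlation(xs, us)
rho_y = max_correlation(ys, us)
print(f"rho(X) = {rho_x:.3f}, rho(Y) = {rho_y:.3f}")
```

In this discrete setting the inequality rho(X) ≤ rho(Y) holds for any grouping, because every reordering evaluated for X is an average of reorderings evaluated for Y. The brute-force search is factorial in the sample size, so the sketch is meant for intuition rather than computation.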


A.4. Proof of theorem 1. Note first that $\gamma$ defined for all prospects $X$ by $\gamma(X) = E[Q_X(U)\cdot\phi(U)]$, for a function $\phi$ such that $\phi(U) \in (\mathbb{R}_-)^d$ almost surely, is Lipschitz and monotonic, so that axioms 1’ and 2’ are satisfied for a preference relation represented by $\gamma$. Finally, comonotonic independence follows directly from the fact that for any two $\mu$-comonotonic prospects $X$ and $Y$, the generalized quantile functions $Q_X$, $Q_Y$ and $Q_{X+Y}$ satisfy $Q_{X+Y}(U) = Q_X(U) + Q_Y(U)$. We now show this fact.

By definition of the generalized quantile functions, we have

$$E[Q_{X+Y}(U)\cdot U] = \sup_{\tilde U =_d U} E[(X+Y)\cdot\tilde U] \le \sup_{\tilde U =_d U} E[X\cdot\tilde U] + \sup_{\tilde U =_d U} E[Y\cdot\tilde U] = E[Q_X(U)\cdot U] + E[Q_Y(U)\cdot U].$$
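As a numerical aside, the subadditivity just displayed can be observed on a small sample. The Python sketch below makes several assumptions that are not in the paper: µ is replaced by the uniform distribution on five hypothetical points of the plane, the supremum over couplings is computed by brute force over reorderings, and all function and variable names are illustrative. It compares a comonotonic pair (X = U and Y = ΣU with Σ symmetric positive, both gradients of convex functions of U) with a pair that is not comonotonic (X = U and Z = −U).

```python
from itertools import permutations


def dot(a, b):
    """Euclidean inner product of two equal-length tuples."""
    return sum(s * t for s, t in zip(a, b))


def max_correlation(xs, us):
    """Empirical analogue of sup E[X . U~] over U~ distributed as U:
    maximize the average inner product over reorderings of `us`."""
    n = len(xs)
    return max(
        sum(dot(x, us[j]) for x, j in zip(xs, perm)) / n
        for perm in permutations(range(n))
    )


def add(xs, ys):
    """Pointwise sum of two samples defined on the same states."""
    return [tuple(a + b for a, b in zip(x, y)) for x, y in zip(xs, ys)]


def apply_matrix(m, u):
    """Matrix-vector product for a small matrix stored as nested tuples."""
    return tuple(dot(row, u) for row in m)


# Hypothetical reference sample standing in for U.
us = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 3)]
sigma = ((2, 1), (1, 2))                      # symmetric positive matrix

xs = list(us)                                 # X = U
ys = [apply_matrix(sigma, u) for u in us]     # Y = Sigma U, comonotonic with X
zs = [(-a, -b) for a, b in us]                # Z = -U, not comonotonic with X


def gap(first, second):
    """rho(first) + rho(second) - rho(first + second), nonnegative by subadditivity."""
    return (max_correlation(first, us) + max_correlation(second, us)
            - max_correlation(add(first, second), us))


gap_comonotonic = gap(xs, ys)
gap_other = gap(xs, zs)
print(f"comonotonic pair gap: {gap_comonotonic:.6f}")
print(f"non-comonotonic pair gap: {gap_other:.6f}")
```

On this sample the gap vanishes (up to rounding) for the comonotonic pair and is strictly positive for the other pair, which suggests that the additivity of quantile maps used for comonotonic independence should be read as a statement about comonotonic prospects.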

On the other hand, we also have

$$E[Q_{X+Y}(U)\cdot U] = \sup_{\tilde Z =_d X+Y} E[\tilde Z\cdot U] \ge E[(Q_X(U) + Q_Y(U))\cdot U],$$

since by construction $Q_X(U) =_d X$ and $Q_Y(U) =_d Y$ (so that $Q_X(U) + Q_Y(U) =_d X + Y$ when $X$ and $Y$ are $\mu$-comonotonic), and the desired equality follows.

Conversely, we now prove that a preference relation $\succsim$ satisfying axioms 1’, 2’ and 3’ is represented by a functional $\gamma$ defined for all prospects $X$ by $\gamma(X) = E[Q_X(U)\cdot\phi(U)]$, for a function $\phi$ such that $\phi(U) \in (\mathbb{R}_-)^d$ almost surely. By axiom 1’, there exists a functional $\gamma$ representing $\succsim$ and there is a point $Z \in L^2_d$ where $\gamma$ is Fréchet differentiable with nonzero gradient $D$. Let $Q_Z$ be the generalized quantile of $Z$ relative to $\mu$. There exists $U \in L^2_d$ with distribution $\mu$ such that $Z = Q_Z(U)$ almost surely. Let $X$ and $Y$ be two prospects in $L^2_d$ with $\mu$-quantile functions $Q_X$ and $Q_Y$ respectively. By definition of $\mu$-comonotonicity, $Q_X(U)$, $Q_Y(U)$ and $Z = Q_Z(U)$ are $\mu$-comonotonic. By axiom 2’, $\gamma$ is law invariant, so that $\gamma(X) \ge \gamma(Y)$ is equivalent to $\gamma(Q_X(U)) \ge \gamma(Q_Y(U))$. Hence, by axiom 3’, $\gamma(X) \ge \gamma(Y)$ implies that for any $0 < \epsilon \le 1$, we have $\gamma(\epsilon Q_X(U) + (1-\epsilon)Z) \ge \gamma(\epsilon Q_Y(U) + (1-\epsilon)Z)$, hence $\gamma(Z + \epsilon(Q_X(U) - Z)) \ge \gamma(Z + \epsilon(Q_Y(U) - Z))$ and therefore $\gamma(Z) + E[D\cdot\epsilon(Q_X(U) - Z)] \ge \gamma(Z) + E[D\cdot\epsilon(Q_Y(U) - Z)] - o(\epsilon)$, or finally $E[D\cdot Q_X(U)] \ge E[D\cdot Q_Y(U)]$.

Suppose now that $X$ and $Y$ are two prospects such that $E[D\cdot Q_X(U)] = E[D\cdot Q_Y(U)]$. We shall show that $\gamma(Q_X(U)) = \gamma(Q_Y(U))$, hence $\gamma(X) = \gamma(Y)$, and thereby conclude that the functional $X \mapsto E[D\cdot Q_X(U)]$ represents $\succsim$. Indeed, suppose that $E[D\cdot Q_X(U)] = E[D\cdot Q_Y(U)]$. We will show shortly that there exists a function $\phi$ such that $E[D\cdot\nabla\phi(U)] > 0$, hence $E[D\cdot(Q_X(U) + \epsilon\nabla\phi(U))] > E[D\cdot Q_Y(U)]$ and $E[D\cdot(Q_X(U) - \epsilon\nabla\phi(U))] < E[D\cdot Q_Y(U)]$ for any $\epsilon > 0$. Using the result above yields $\gamma(Q_X(U) + \epsilon\nabla\phi(U)) \ge \gamma(Q_Y(U))$ and $\gamma(Q_X(U) - \epsilon\nabla\phi(U)) \le \gamma(Q_Y(U))$, hence $\gamma(Q_X(U)) = \gamma(Q_Y(U))$ by continuity of $\gamma$.

Let us then show that $E[D\cdot\nabla\phi(U)] = 0$ for all gradient functions $\nabla\phi$ yields a contradiction. Calling $V_Z$ the convex function such that $Z = Q_Z(U) = \nabla V_Z(U)$ almost surely, $D$ is the Fréchet derivative of $\gamma$ at $Z = \nabla V_Z(U)$. Hence $E[D\cdot\nabla\phi(U)] = 0$ implies that $\gamma(\nabla(V_Z(U) + \epsilon\phi(U))) = \gamma(\nabla V_Z(U)) + o(\epsilon)$. This is true for all gradient functions $\nabla\phi$, hence in particular for $\phi = (V_\epsilon - V_Z)/\epsilon$, where $V_\epsilon$ is such that $Z_\epsilon = \nabla V_\epsilon(U)$ converges to $Z$ in $L^2$. We then have $\gamma(Z_\epsilon) - \gamma(Z) = o(\epsilon)$, and hence $D = 0$, which contradicts axiom 1’. We have shown that $\succsim$ is represented by the functional $X \mapsto E[D\cdot Q_X(U)]$. As $\mu$ gives no mass to small sets, $D$ can be written $\phi(U)$, for some function $\phi$ which takes values in $(\mathbb{R}_-)^d$ by axiom 2’. □

A.5. Proof of theorem 2. (a) implies (b) follows from proposition 4.

(b) implies (c): axiom 4 implies that $\gamma(Q_X + \alpha(\tilde X - Q_X)) \ge \gamma(X)$ for all $\alpha \in (0, 1]$ and all $\tilde X$ in the equidistribution class of $X$. The representation of theorem 1 implies the differentiability of $\gamma$ at $Q_X(U)$ for any $X$; call $D_X$ its gradient. This implies $E[\tilde X\cdot D_X] \le E[Q_X(U)\cdot D_X]$ for all $\tilde X =_d X$. Hence $\gamma(X) = -\varrho_{\mathcal{L}_{\phi(U)}}(X)$. Thus, by axiom 3’, comonotonicity with respect to $\mu$ implies comonotonicity with respect to $\mathcal{L}_{\phi(U)}$. By lemma 1, this implies that $\phi(U)$ is a location-scale transform of $U$, and the result follows.

Finally, we show (c) implies (a). Assume (c), in which case for all $X \in L^2_d$, $-\gamma(X) = \alpha E[Q_X(U)\cdot U] + u_0\cdot E[X]$, thus $-\gamma(X) = \alpha\varrho_\mu(X) + u_0\cdot E[X]$. Therefore, by proposition 4, $X \ge_{cv} Y$ implies $\varrho_\mu(X) \le \varrho_\mu(Y)$, so $\gamma(X) \ge \gamma(Y)$. □

References

[1] Allais, M., “Le comportement de l’homme rationnel devant le risque: Critiques des postulats et axiomes de l’école américaine,” Econometrica 21, pp. 503–546, 1953.
[2] Brenier, Y., “Polar factorization and monotone rearrangement of vector-valued functions,” Communications on Pure and Applied Mathematics 44, pp. 375–417, 1991.
[3] Carlier, G., Dana, R.-A., and Galichon, A., “Pareto efficiency for the concave order and multivariate comonotonicity,” working paper, 2009.
[4] Ekeland, I., Galichon, A., and Henry, M., “Comonotone measures of multivariate risks,” Mathematical Finance, forthcoming.


[5] Landsberger, M., and Meilijson, I. I., “Comonotone allocations, Bickel-Lehmann dispersion and the Arrow-Pratt measure of risk aversion,” Annals of Operations Research 52, pp. 97–106, 1994.
[6] Levin, V. L., “Lipschitz pre-orders and Lipschitz utility functions,” Russian Mathematical Surveys 39, pp. 217–218, 1984.
[7] Machina, M., “‘Expected Utility’ Analysis without the Independence Axiom,” Econometrica 50(2), pp. 277–323, 1982.
[8] Puccetti, G., and Scarsini, M., “Multivariate comonotonicity,” Journal of Multivariate Analysis, forthcoming.
[9] Preiss, D., “Differentiability of Lipschitz functions on Banach spaces,” Journal of Functional Analysis 91, pp. 312–345, 1990.
[10] Rachev, S., and Rüschendorf, L., Mass Transportation Problems. Volume I: Theory, and Volume II: Applications, New York: Springer, 1998.
[11] Rüschendorf, L., “Ordering of distributions and rearrangement of functions,” Annals of Probability 9(2), pp. 276–283, 1981.
[12] Rothschild, M., and Stiglitz, J., “Increasing risk: I. A definition,” Journal of Economic Theory 2, pp. 225–243, 1970.
[13] Rüschendorf, L., “Law invariant convex risk measures for portfolio vectors,” Statistics and Decisions 24, pp. 97–108, 2006.
[14] Schmeidler, D., “Subjective probability and expected utility without additivity,” Econometrica 57, pp. 571–587, 1989.
[15] Shaked, M., and Shanthikumar, J., Stochastic Orders, New York: Springer, 2007.
[16] Strassen, V., “The existence of probability measures with given marginals,” Annals of Mathematical Statistics 36, pp. 423–439, 1965.
[17] Villani, C., Topics in Optimal Transportation, Providence: American Mathematical Society, 2003.
[18] Yaari, M., “The dual theory of choice under risk,” Econometrica 55, pp. 95–115, 1987.
