Result: Risk measures and robust optimization problems

Title:
Risk measures and robust optimization problems
Source:
Proceedings of the 8th symposium on probability and stochastic processes, June 20-25, 2004, Puebla, Mexico. Stochastic models. 22(4):753-831
Publisher Information:
Philadelphia, PA: Taylor & Francis, 2006.
Publication Year:
2006
Physical Description:
print, 68 ref
Original Material:
INIST-CNRS
Subject Terms:
Control theory, operational research; Mathematics; Exact sciences and technology; Sciences and techniques of general use; Probability and statistics; Statistics; Decision theory; Applications; Insurance, economics, finance; Applied sciences; Operational research. Management science; Operational research and scientific management; Optimization. Search problems; Risk theory. Actuarial science; Uncertainty; Financial market; Dynamic model; Stochastic model; Optimization method; Optimal strategy; Measure theory; Probability theory; Preference theory; Risk theory; Knightian uncertainty; Optimal investment; Convex measure of risk; Law-invariant risk measure; Model uncertainty; Risk measure; Value at Risk
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Institut für Mathematik, Berlin, Germany
ISSN:
1532-6349
Rights:
Copyright 2007 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Mathematics

Operational research. Management
Accession Number:
edscal.18291172
Database:
PASCAL Archive

Further Information

These lecture notes give a survey on recent developments in the theory of risk measures. The first part outlines the general representation theory of risk measures in a static one-period setting. In particular, it provides structure theorems for law-invariant risk measures. Examples include Value at Risk, Average Value at Risk, distortion risk measures, and risk measures arising from robust preferences. The second part analyzes risk measures and associated robust optimization problems in the framework of dynamic financial market models. The concept of efficient hedging, as introduced by Föllmer and Leukert[32], is discussed in terms of the more general framework of convex risk measures. The last two sections are devoted to the construction of optimal investment strategies under Knightian uncertainty.

Keywords: Convex measure of risk; Knightian uncertainty; Law-invariant risk measure; Model uncertainty; Optimal investment; Risk measure; Value at Risk; Primary 91B30; Secondary 46N10, 60G44, 91B16, 91B28

1. INTRODUCTION

These lecture notes reflect the material presented in a minicourse on risk measures given at the 8th Symposium on Probability and Stochastic Processes in Puebla. They were also used in a Cours Bachelier at the Institut Henri Poincaré, Paris, in 2005. In the first section, we discuss the axiomatic structure theory for monetary measures of risk. This theory was initiated by P. Artzner, F. Delbaen, J. Eber, and D. Heath in their seminal paper[3] and further developed by Carlier and Dana[11], Delbaen[20][21], Föllmer and the author[33], Frittelli and Rosazza Gianin[37][38], Heath[43], Heath and Ku[44], and Kusuoka[53], to mention only a few. The structure of coherent or, more generally, convex risk measures is also closely connected to the numerical representation of risk-averse preferences under Knightian uncertainty, a topic discussed in Section 2.6 along the lines of Schmeidler[65] and Gilboa and Schmeidler[40]. Most of the material presented in this first section is based on the second edition of the author's joint book with Hans Föllmer[35] and the reader is referred to its bibliographic notes for a more detailed historical account.

In Section 3, we discuss risk measures and associated robust optimization problems in the framework of dynamic financial market models. In Section 3.1, we study the effect that hedging has on the acceptability of a financial position. This part is based on Ref.[33]. In Section 3.2, we discuss the concept of efficient hedging, as introduced by Föllmer and Leukert[32], in terms of the more general framework of convex risk measures. In particular, we will give short proofs of recent results due to Sekine[66] and the author[61]. In the last two sections of Section 3, we will get back to the setting of Section 2.6 and discuss the construction of optimal investment strategies under Knightian uncertainty as considered in Ref.[62].

It is a great pleasure to thank the organizers of the 8th SPSP: Mogens Bladt, Antonio González, José Alfredo López-Mimbela, Reyla Navarro, and Juan Ruiz de Chávez. I am also grateful to my co-author Hans Föllmer and our publisher Walter de Gruyter Verlag, Berlin, for the permission to use material from Ref.[35] in the preparation of these notes. I furthermore thank the participants of my lecture series in Paris and Puebla for their comments, which helped to improve the first draft of these notes.

2. MEASURES OF RISK: AXIOMS AND STRUCTURE THEOREMS

In this section, we discuss the problem of quantifying the risk of a financial position, whose profits and losses (P&L) are described by a real-valued function X on some set Ω of possible scenarios ω. The basic asymmetry in the P&L interpretation of X will be taken into account by requiring a property of monotonicity. Convexity of the risk measure will make sure that diversification of portfolios will decrease the overall risk. From the point of view of a supervising agency it is natural to use risk measures in quantifying a capital requirement, i.e., the minimal amount of capital that, if added to the position and invested in a risk-free manner, makes the position acceptable. This monetary interpretation is captured by an axiom of cash invariance. Together with convexity and monotonicity, it singles out the class of convex measures of risk. Under the additional condition of positive homogeneity, we obtain the class of coherent risk measures. This axiomatic approach to monetary risk measures was initiated by Artzner et al.[3].

In this section, we will develop the structure theory of such risk measures in a static situation without reference to the possible elimination of risk via hedging strategies. In the first two sections, we present the general representation theory on L<sups>∞</sups>. In Section 2.3 we discuss some coherent risk measures related to Value at Risk. These risk measures only involve the P&L distribution with respect to a given probability measure. In Section 2.4, we characterize the class of convex risk measures which share this property of law-invariance. In Section 2.5, the resulting risk measures are characterized by a property of comonotonicity.

2.1. Risk Measures and Their Acceptance Sets

Let Ω be a fixed set of scenarios. The P&L of a financial position is described by a mapping X:Ω → ℝ where X(ω) is the discounted net worth of the position at the end of the trading period if the scenario ω ∈ Ω is realized. Our aim is to quantify the risk of X by some number ρ(X), where X belongs to a given class 𝒳 of financial positions. Throughout this section, 𝒳 will be a linear space of bounded functions containing the constants.

Definition 2.1.1

A mapping ρ:𝒳 → ℝ is called a monetary measure of risk if it satisfies the following conditions for all X, Y ∈ 𝒳.

Monotonicity: If X ≤ Y, then ρ(X) ≥ ρ(Y).

Cash invariance: If m ∈ ℝ, then ρ(X + m) = ρ(X) − m.

The financial meaning of monotonicity is clear. Cash invariance is also called translation invariance. It is motivated by the interpretation of ρ(X) as a capital requirement: if the amount m is added to the position and invested in a risk-free manner, the capital requirement is reduced by the same amount. In particular, cash invariance implies

\[ \rho(X + \rho(X)) = 0 \tag{1} \]

and

\[ \rho(m) = \rho(0) - m \quad \text{for all } m \in \mathbb{R}. \]

For most purposes it would be no loss of generality to assume that a given monetary risk measure satisfies the condition of

Normalization: ρ(0) = 0.

In some situations, however, it will be convenient not to insist on normalization.

Lemma 2.1.1

Any monetary measure of risk ρ is Lipschitz continuous with respect to the supremum norm ‖ · ‖:

\[ |\rho(X) - \rho(Y)| \le \|X - Y\|. \]

Proof

Clearly, X ≤ Y + ‖X − Y‖, and so ρ(Y) − ‖X − Y‖ ≤ ρ(X) by monotonicity and cash invariance. Reversing the roles of X and Y yields the assertion.

From now on our main interest will be on monetary measures of risk that have an additional convexity property.

Definition 2.1.2

A monetary risk measure ρ:𝒳 → ℝ is called a convex measure of risk if it satisfies the property of

Convexity: ρ(λ X + (1 − λ)Y) ≤ λ ρ(X) + (1 − λ)ρ(Y), for 0 ≤ λ ≤ 1.

Consider the collection of possible future outcomes that can be generated with the resources available to an investor: One investment strategy leads to X, while a second strategy leads to Y. If one diversifies, spending only the fraction λ of the resources on the first possibility and using the remaining part for the second alternative, one obtains λ X + (1 − λ)Y. Thus, the axiom of convexity gives a precise meaning to the idea that diversification should not increase the risk. If ρ is convex and normalized, then

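\[ \rho(\lambda X) \le \lambda \rho(X) \ \text{ for } 0 \le \lambda \le 1, \qquad \rho(\lambda X) \ge \lambda \rho(X) \ \text{ for } \lambda \ge 1. \]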

Definition 2.1.3

A convex measure of risk ρ is called a coherent measure of risk if it satisfies the condition of

Positive Homogeneity: If λ ≥ 0, then ρ(λ X) = λρ(X).

If a monetary measure of risk ρ is positively homogeneous, then it is normalized, i.e., ρ(0) = 0. Under the assumption of positive homogeneity, convexity is equivalent to

Subadditivity: ρ(X + Y) ≤ ρ(X) + ρ(Y).

This property allows one to decentralize the management of risk arising from a collection of different positions: If separate risk limits are given to different "desks", then the risk of the aggregate position is bounded by the sum of the individual risk limits. In many situations, however, risk may grow in a non-linear way as the size of the position increases. Therefore, we will not insist on positive homogeneity.

A monetary measure of risk ρ induces an acceptance set

\[ \mathcal{A}_\rho := \{ X \in \mathcal{X} \mid \rho(X) \le 0 \} \]

consisting of positions which are acceptable in the sense that they do not require additional capital. The following two propositions summarize the relations between monetary measures of risk and their acceptance sets. The author is grateful to Patrick Cheridito for remarks on the closure condition, which helped to improve the statements of these results.

Proposition 2.1.1

Suppose that ρ is a monetary measure of risk with acceptance set 𝒜 ≔ 𝒜<subs>ρ</subs>.

a. 𝒜 is non-empty, closed with respect to the supremum norm ‖ · ‖, and satisfies the following two conditions:

\[ \inf\{ m \in \mathbb{R} \mid m \in \mathcal{A} \} > -\infty, \tag{2} \]

\[ X \in \mathcal{A},\ Y \in \mathcal{X},\ Y \ge X \ \Longrightarrow\ Y \in \mathcal{A}. \tag{3} \]

b. ρ can be recovered from 𝒜:

\[ \rho(X) = \inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A} \}. \tag{4} \]

c. ρ is a convex risk measure if and only if 𝒜 is convex.

d. ρ is positively homogeneous if and only if 𝒜 is a cone. In particular, ρ is coherent if and only if 𝒜 is a convex cone.

Proof

(a) Closedness follows from Lemma 2.1.1, the remaining properties in (a) are straightforward.

(b) Cash invariance implies that for X ∈ 𝒳,

\[ \inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A}_\rho \} = \inf\{ m \in \mathbb{R} \mid \rho(m + X) \le 0 \} = \inf\{ m \in \mathbb{R} \mid \rho(X) \le m \} = \rho(X). \]

(c) 𝒜 is clearly convex if ρ is a convex measure of risk. The converse will follow from Proposition 2.1.2 together with (6).

(d) Clearly, positive homogeneity of ρ implies that 𝒜 is a cone. The converse follows as in (c).

Conversely, one can take a given set 𝒜 ⊂ 𝒳 of acceptable positions as the initial object. For X ∈ 𝒳, we can then define the capital requirement as the minimal amount m for which m + X becomes acceptable:

\[ \rho_{\mathcal{A}}(X) := \inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A} \}. \]

Note that, with this notation, (4) takes the form

\[ \rho_{\mathcal{A}_\rho} = \rho. \tag{6} \]
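The capital requirement just defined can be illustrated numerically. The following Python sketch approximates inf{m | m + X ∈ 𝒜} by bisection on a finite scenario set. The names `capital_requirement` and `nonnegative`, the bracket [lo, hi], and the tolerance are assumptions made for this illustration only; the bisection relies on the monotonicity property (3) of the acceptance set.

```python
def capital_requirement(x, acceptable, lo=-1e6, hi=1e6, tol=1e-9):
    """Approximate rho_A(X) = inf{m : m + X in A} by bisection.

    x          -- list of P&L values, one per scenario (finite Omega)
    acceptable -- predicate describing the acceptance set A (hypothetical helper)
    Assumes lo + X is not acceptable, hi + X is acceptable, and that
    acceptability of m + X is monotone in m, as guaranteed by property (3).
    """
    def shifted(m):
        return [value + m for value in x]

    if acceptable(shifted(lo)) or not acceptable(shifted(hi)):
        raise ValueError("bracket [lo, hi] does not contain the threshold")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if acceptable(shifted(mid)):
            hi = mid   # mid + X is acceptable: the infimum is at most mid
        else:
            lo = mid   # mid + X is not acceptable: the infimum is at least mid
    return hi


def nonnegative(position):
    """Acceptance set of all non-negative positions (a convex cone)."""
    return min(position) >= 0


x = [3.0, -2.0, 0.5]
rho = capital_requirement(x, nonnegative)   # close to 2.0, the largest loss
```

For the cone of non-negative positions, the computed requirement coincides up to the tolerance with the largest possible loss of X.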

Proposition 2.1.2

Assume that 𝒜 is a non-empty subset of 𝒳 satisfying (2) and (3). Then the functional ρ<subs>𝒜</subs> has the following properties:

a. ρ<subs>𝒜</subs> is a monetary measure of risk.

b. If 𝒜 is a convex set, then ρ<subs>𝒜</subs> is a convex measure of risk.

c. If 𝒜 is a cone, then ρ<subs>𝒜</subs> is positively homogeneous. In particular, ρ<subs>𝒜</subs> is a coherent measure of risk if 𝒜 is a convex cone.

d. 𝒜 is a subset of 𝒜<subs>ρ𝒜</subs>, and 𝒜 = 𝒜<subs>ρ𝒜</subs> holds if and only if 𝒜 is ‖ · ‖-closed.

Proof

(a) It is straightforward to verify that ρ<subs>𝒜</subs> satisfies cash invariance and monotonicity. We show next that ρ<subs>𝒜</subs> takes only finite values. To this end, fix some Y in the non-empty set 𝒜. For X ∈ 𝒳 given, there exists a finite number m with m + X > Y, for X and Y are both bounded. Then

\[ \rho_{\mathcal{A}}(X) - m = \rho_{\mathcal{A}}(m + X) \le \rho_{\mathcal{A}}(Y) \le 0, \]

and hence ρ<subs>𝒜</subs>(X) ≤ m < ∞. Note that (2) is equivalent to ρ<subs>𝒜</subs>(0) > −∞. To show that ρ<subs>𝒜</subs>(X) > −∞ for arbitrary X ∈ 𝒳, we take m′ such that X + m′ ≤ 0 and conclude by monotonicity and cash invariance that ρ<subs>𝒜</subs>(X) ≥ ρ<subs>𝒜</subs>(0) + m′ > −∞.

(b) Suppose that X<subs>1</subs>,X<subs>2</subs> ∈ 𝒳 and that m<subs>1</subs>,m<subs>2</subs> ∈ ℝ are such that m<subs>i</subs> + X<subs>i</subs> ∈ 𝒜. If λ ∈ [0, 1], then the convexity of 𝒜 implies that λ(m<subs>1</subs> + X<subs>1</subs>) + (1 − λ)(m<subs>2</subs> + X<subs>2</subs>) ∈ 𝒜. Thus, by the cash invariance of ρ<subs>𝒜</subs>,

\[ 0 \ge \rho_{\mathcal{A}}\bigl( \lambda(m_1 + X_1) + (1 - \lambda)(m_2 + X_2) \bigr) = \rho_{\mathcal{A}}\bigl( \lambda X_1 + (1 - \lambda) X_2 \bigr) - \bigl( \lambda m_1 + (1 - \lambda) m_2 \bigr), \]

and the convexity of ρ<subs>𝒜</subs> follows.

(c) As in the proof of convexity, we obtain that ρ<subs>𝒜</subs>(λ X) ≤ λρ<subs>𝒜</subs>(X) for λ ≥ 0 if 𝒜 is a cone. To prove the converse inequality, let m < ρ<subs>𝒜</subs>(X). Then m + X ∉ 𝒜 and hence λ m + λ X ∉ 𝒜 for λ ≥ 0. Thus λ m < ρ<subs>𝒜</subs>(λ X), and (c) follows.

(d) The inclusion 𝒜 ⊆ 𝒜<subs>ρ𝒜</subs> is obvious, and Proposition 2.1.1 implies that 𝒜 is ‖ · ‖-closed as soon as 𝒜 = 𝒜<subs>ρ𝒜</subs>. Conversely, assume that 𝒜 is ‖ · ‖-closed. We have to show that X ∉ 𝒜 implies that ρ<subs>𝒜</subs>(X) > 0. To this end, take m > ‖X‖. Since 𝒜 is ‖ · ‖-closed and X ∉ 𝒜, there is some λ ∈ (0, 1) such that λ m + (1 − λ)X ∉ 𝒜. Thus,

\[ 0 \le \rho_{\mathcal{A}}\bigl( \lambda m + (1 - \lambda) X \bigr) = \rho_{\mathcal{A}}\bigl( (1 - \lambda) X \bigr) - \lambda m. \]

Since ρ<subs>𝒜</subs> is a monetary measure of risk, Lemma 2.1.1 shows that

\[ \bigl| \rho_{\mathcal{A}}\bigl( (1 - \lambda) X \bigr) - \rho_{\mathcal{A}}(X) \bigr| \le \lambda \|X\|. \]

Hence,

\[ \rho_{\mathcal{A}}(X) \ge \rho_{\mathcal{A}}\bigl( (1 - \lambda) X \bigr) - \lambda \|X\| \ge \lambda \bigl( m - \|X\| \bigr) > 0. \]

In the following examples, we take 𝒳 as the linear space of all bounded measurable functions on some measurable space (Ω, ℱ), and we denote by ℳ<subs>1</subs> = ℳ<subs>1</subs> (Ω, ℱ) the space of all probability measures on (Ω, ℱ).

Example 2.1.1

Consider the worst-case risk measure ρ<subs>max</subs> defined by

\[ \rho_{\max}(X) := -\inf_{\omega \in \Omega} X(\omega) \quad \text{for all } X \in \mathcal{X}. \]

The value ρ<subs>max</subs>(X) is the least upper bound for the potential loss which can occur in any scenario. The corresponding acceptance set 𝒜 is given by the convex cone of all non-negative functions in 𝒳. Thus, ρ<subs>max</subs> is a coherent measure of risk. It is the most conservative measure of risk in the sense that any normalized monetary risk measure ρ on 𝒳 satisfies

\[ \rho(X) \le \rho\Bigl( \inf_{\omega \in \Omega} X(\omega) \Bigr) = \rho_{\max}(X). \]

Note that ρ<subs>max</subs> can be represented in the form

\[ \rho_{\max}(X) = \sup_{Q \in \mathcal{Q}} E_Q[-X], \quad X \in \mathcal{X}, \]

where 𝒬 is the class ℳ<subs>1</subs> of all probability measures on (Ω, ℱ).
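The properties of the worst-case risk measure can be checked numerically on a finite scenario set. The Python sketch below is only an illustration under that finiteness assumption: the helper names `rho_max` and `sup_norm` and the random test positions are hypothetical. It verifies monotonicity, cash invariance, the Lipschitz bound of Lemma 2.1.1, and the fact that on a finite Ω the supremum of E<subs>Q</subs>[−X] over all probability measures is already reached on point masses.

```python
import random


def rho_max(x):
    """Worst-case risk measure on a finite scenario set: -min over scenarios."""
    return -min(x)


def sup_norm(x, y):
    """Supremum norm of X - Y on a finite scenario set."""
    return max(abs(a - b) for a, b in zip(x, y))


random.seed(0)
x = [random.uniform(-5.0, 5.0) for _ in range(6)]
y = [a + random.uniform(0.0, 2.0) for a in x]   # Y >= X in every scenario
m = 1.5

# Monotonicity: X <= Y implies rho(X) >= rho(Y).
monotone = rho_max(x) >= rho_max(y)
# Cash invariance: rho(X + m) = rho(X) - m.
cash_invariant = abs(rho_max([a + m for a in x]) - (rho_max(x) - m)) < 1e-12
# Lipschitz continuity with respect to the supremum norm.
lipschitz = abs(rho_max(x) - rho_max(y)) <= sup_norm(x, y) + 1e-12
# E_Q[-X] for the point mass Q at scenario i equals -x[i]; the largest such
# value agrees with rho_max(X).
via_point_masses = max(-a for a in x)
```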

Example 2.1.2

Consider a utility function u:ℝ → ℝ, that is, u is concave and strictly increasing. If Q ∈ ℳ<subs>1</subs> is a probability measure, then we can consider the expected utility E<subs>Q</subs>[u(X)] of X ∈ 𝒳 and define the corresponding certainty equivalent as the number u<sups>−1</sups>(E<subs>Q</subs>[u(X)]). This notion gives rise to a convex risk measure as follows: Fix some threshold c ∈ ℝ and let us call a position X acceptable if its certainty equivalent is at least c, i.e., if its expected utility E<subs>Q</subs>[u(X)] is bounded from below by u(c). Clearly, the set

\[ \mathcal{A} := \{ X \in \mathcal{X} \mid E_Q[u(X)] \ge u(c) \} \]

is non-empty, convex, and satisfies (2) and (3). Thus, ρ<subs>𝒜</subs> is a convex measure of risk. As an obvious robust extension, we can define acceptability in terms of a whole class 𝒬 of probability measures on (Ω, ℱ), i.e.,

\[ \mathcal{A} := \bigcap_{Q \in \mathcal{Q}} \{ X \in \mathcal{X} \mid E_Q[u(X)] \ge u(c_Q) \}, \]

with constants c<subs>Q</subs> such that sup<subs>Q∈𝒬</subs>c<subs>Q</subs> < ∞.
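To make the utility-based construction concrete, the following Python sketch computes the capital requirement for the acceptance set {X | E<subs>Q</subs>[u(X)] ≥ u(c)} by bisection on a finite scenario set. The choice of the exponential utility u(x) = 1 − e<sups>−βx</sups>, the numerical values, and the function name `utility_risk` are assumptions made for this example. For this particular u, rearranging the acceptance condition gives the closed form c + (1/β) log E<subs>Q</subs>[e<sups>−βX</sups>], which the sketch uses as a cross-check.

```python
import math


def utility_risk(x, q, u, c, lo=-100.0, hi=100.0, tol=1e-10):
    """Capital requirement for A = {X : E_Q[u(X)] >= u(c)} via bisection.

    x, q -- P&L values and probabilities on a finite scenario set
    u    -- increasing utility function (assumption: the threshold lies in [lo, hi])
    """
    def accepted(m):
        return sum(qi * u(xi + m) for xi, qi in zip(x, q)) >= u(c)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if accepted(mid):
            hi = mid
        else:
            lo = mid
    return hi


beta = 0.5


def u(t):
    """Exponential utility, chosen only for illustration."""
    return 1.0 - math.exp(-beta * t)


x = [2.0, -1.0, 0.0, -3.0]
q = [0.4, 0.3, 0.2, 0.1]
c = 0.0

numeric = utility_risk(x, q, u, c)
closed_form = c + math.log(sum(qi * math.exp(-beta * xi) for xi, qi in zip(x, q))) / beta
```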

Example 2.1.3

Suppose now that we have specified a probability measure P on (Ω, ℱ). The distribution of X ∈ 𝒳 under P is sometimes called the profit-and-loss or P&L distribution. In this context, X can be defined as being acceptable if the probability of a loss is bounded by a given level λ ∈ (0, 1), i.e., if

\[ P[X < 0] \le \lambda. \]

The corresponding monetary risk measure V@R<subs>λ</subs>, defined by

\[ V@R_\lambda(X) := \inf\{ m \in \mathbb{R} \mid P[m + X < 0] \le \lambda \}, \]

is called Value at Risk at level λ. Note that it is well defined on the space ℒ<sups>0</sups>(Ω, ℱ, P) of all random variables which are P -a.s. finite, and that

\[ V@R_\lambda(X) = E[-X] + \Phi^{-1}(1 - \lambda)\, \sigma(X) \]

if X is a Gaussian random variable with variance σ<sups>2</sups>(X) and Φ<sups>−1</sups> denotes the inverse of the distribution function Φ of N(0, 1). Clearly, V@R<subs>λ</subs> is positively homogeneous, but in general it is not convex, as shown by Example 2.3.1 below.
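As a sketch of how Value at Risk can be evaluated in practice, the Python code below computes inf{m | P[m + X < 0] ≤ λ} under the empirical distribution of a finite sample and compares it with the Gaussian expression E[−X] + Φ<sups>−1</sups>(1 − λ)σ(X). The function name `value_at_risk`, the sample sizes, the seed, and the parameters μ, σ are assumptions for illustration; the empirical value only approximates the Gaussian one up to sampling error.

```python
import bisect
import random
from statistics import NormalDist


def value_at_risk(sample, lam):
    """V@R_lam(X) = inf{m : P[m + X < 0] <= lam} for the empirical
    distribution that puts mass 1/n on each sample point."""
    xs = sorted(sample)
    n = len(xs)
    # The loss probability m -> P[X < -m] is a step function that jumps only
    # at m = -x_i, so the infimum is attained at one of these candidates.
    # Visiting v in decreasing order means m = -v increases.
    for v in reversed(xs):
        strictly_below = bisect.bisect_left(xs, v)   # number of x_i with x_i < v
        if strictly_below <= lam * n:
            return -v
    raise ValueError("sample must be non-empty")


small = [-3.0, -1.0, 0.0, 2.0, 5.0]
var_small = value_at_risk(small, 0.25)   # P[X < -1] = 0.2 <= 0.25

random.seed(1)
mu, sigma, lam = 0.5, 2.0, 0.05
draws = [random.gauss(mu, sigma) for _ in range(50000)]
empirical = value_at_risk(draws, lam)
gaussian_formula = -mu + NormalDist().inv_cdf(1.0 - lam) * sigma
```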

2.2. Structure Theorems for Risk Measures on L<sups>∞</sups>

From now on we will fix a probability measure P on (Ω, ℱ) and consider risk measures ρ such that

\[ \rho(X) = \rho(Y) \quad \text{if } X = Y \ P\text{-a.s.}, \tag{9} \]

and we are interested in general structure theorems for such risk measures on 𝒳 ≔ L<sups>∞</sups> ≔ L<sups>∞</sups>(Ω, ℱ, P). This representation theory was developed by Delbaen[20][21] in the coherent setting and extended to the convex case by Föllmer and Schied[33], Frittelli and Rosazza Gianin[37], and others.

To this end, let us introduce the notation ℳ<subs>1</subs>(P) ≔ ℳ<subs>1</subs>(Ω, ℱ, P) for the set of all probability measures Q on (Ω, ℱ), which are absolutely continuous with respect to P. More generally, ℳ<subs>1,f</subs>(P) ≔ ℳ<subs>1,f</subs>(Ω, ℱ, P) will denote the set of all finitely additive set functions Q:ℱ → [0, 1] which are normalized to Q[Ω] = 1 and absolutely continuous with respect to P in the sense that Q[A] = 0 if P[A] = 0. By E<subs>Q</subs>[X] we denote the integral of X ∈ L<sups>∞</sups> with respect to Q ∈ ℳ<subs>1,f</subs>(P). There are two equivalent ways of defining this integral. One can either define

\[ E_Q[X] := \sum_{i=1}^{n} a_i\, Q[A_i] \quad \text{for step functions } X = \sum_{i=1}^{n} a_i \mathbf{1}_{A_i}, \]

and then extend the integral to L<sups>∞</sups> by using the density of the step functions. Alternatively, for any X ∈ 𝒳, the expectation E<subs>Q</subs>[X] can be defined as the Choquet integral

Graph

It is not difficult to check that both integral notions coincide on step functions and hence are equivalent.

Let α:ℳ<subs>1,f</subs>(P) → ℝ ∪ {+∞} be any functional such that

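\[ \inf_{Q \in \mathcal{M}_{1,f}(P)} \alpha(Q) \in \mathbb{R}. \]

For each Q ∈ ℳ<subs>1,f</subs>(P) with α(Q) < ∞, the functional X ↦ E<subs>Q</subs>[−X] − α(Q) is convex, monotone, and cash invariant on L<sups>∞</sups>, and these three properties are preserved when taking the supremum over Q ∈ ℳ<subs>1,f</subs>(P). Hence,

\[ \rho(X) := \sup_{Q \in \mathcal{M}_{1,f}(P)} \bigl( E_Q[-X] - \alpha(Q) \bigr) \tag{10} \]

defines a convex measure of risk on L<sups>∞</sups> such that

\[ \rho(0) = -\inf_{Q \in \mathcal{M}_{1,f}(P)} \alpha(Q). \]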

The functional α will be called a penalty function for ρ on ℳ<subs>1,f</subs>(P), and we will say that ρ is represented by α on ℳ<subs>1,f</subs>(P). The following representation theorem states that any convex risk measure on L<sups>∞</sups> is of this form and it also gives a variational formula for the minimal penalty function α<subs>min</subs>.
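A standard instance of such a penalty representation, not taken from the text above, is the entropic risk measure: with the relative-entropy penalty α(Q) = H(Q|P)/β, the supremum of E<subs>Q</subs>[−X] − α(Q) equals (1/β) log E<subs>P</subs>[e<sups>−βX</sups>] and is attained at the measure with density proportional to e<sups>−βX</sups>. The Python sketch below checks this on a finite scenario set; the function names, the value of β, and the random search over candidate measures are assumptions made for illustration.

```python
import math
import random


def entropic_risk(x, p, beta):
    """Closed form (1/beta) * log E_P[exp(-beta * X)]."""
    return math.log(sum(pi * math.exp(-beta * xi) for xi, pi in zip(x, p))) / beta


def penalized_value(x, p, q, beta):
    """E_Q[-X] - alpha(Q) for the penalty alpha(Q) = H(Q|P) / beta."""
    expected_loss = -sum(qi * xi for xi, qi in zip(x, q))
    rel_entropy = sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)
    return expected_loss - rel_entropy / beta


def random_measure(n):
    """A random probability vector with strictly positive entries."""
    w = [random.random() + 1e-9 for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]


x = [1.0, -2.0, 0.5, -0.5]
p = [0.25, 0.25, 0.25, 0.25]
beta = 2.0

# Candidate maximizer: density with respect to P proportional to exp(-beta * X).
weights = [pi * math.exp(-beta * xi) for xi, pi in zip(x, p)]
q_star = [w / sum(weights) for w in weights]

closed_form = entropic_risk(x, p, beta)
value_at_q_star = penalized_value(x, p, q_star, beta)

random.seed(0)
best_random = max(penalized_value(x, p, random_measure(4), beta) for _ in range(2000))
```

No randomly drawn measure exceeds the closed form, while the candidate maximizer reproduces it up to rounding.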

Theorem 2.2.1

Any convex measure of risk ρ on L<sups>∞</sups> is of the form

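\[ \rho(X) = \max_{Q \in \mathcal{M}_{1,f}(P)} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr), \quad X \in L^\infty, \]

where the penalty function α<subs>min</subs> is given by

\[ \alpha_{\min}(Q) := \sup_{X \in \mathcal{A}_\rho} E_Q[-X] \quad \text{for } Q \in \mathcal{M}_{1,f}(P). \tag{12} \]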

Moreover, α<subs>min</subs> is the minimal penalty function which represents ρ, i.e., any penalty function α, for which (10) holds, satisfies α(Q) ≥ α<subs>min</subs>(Q) for all Q ∈ ℳ<subs>1,f</subs>(P).

Proof

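Recall that X′ ≔ ρ(X) + X ∈ 𝒜<subs>ρ</subs> by (1). Thus, for all Q ∈ ℳ<subs>1,f</subs>(P),

\[ \alpha_{\min}(Q) \ge \sup_{X \in L^\infty} E_Q[-X'] = \sup_{X \in L^\infty} \bigl( E_Q[-X] - \rho(X) \bigr) \ge \sup_{X \in \mathcal{A}_\rho} \bigl( E_Q[-X] - \rho(X) \bigr) \ge \alpha_{\min}(Q), \]

and we get that α<subs>min</subs> defined by (12) is also given by

\[ \alpha_{\min}(Q) = \sup_{X \in L^\infty} \bigl( E_Q[-X] - \rho(X) \bigr). \]

Thus, α<subs>min</subs> corresponds to the Fenchel-Legendre transform of the convex function ρ on the Banach space L<sups>∞</sups>. More precisely,

\[ \alpha_{\min}(Q) = \rho^*(\ell_Q), \]

where ρ*:(L<sups>∞</sups>)′ → ℝ ∪ {+∞} is defined on the dual (L<sups>∞</sups>)′ of L<sups>∞</sups> by

\[ \rho^*(\ell) := \sup_{X \in L^\infty} \bigl( \ell(X) - \rho(X) \bigr), \]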

and where ℓ<subs>Q</subs> ∈ (L<sups>∞</sups>)′ is given by ℓ<subs>Q</subs>(X) = E<subs>Q</subs>[−X].

The functional ρ is lower semicontinuous with respect to the weak topology on L<sups>∞</sups>, since any set {ρ ≤ c} is convex, strongly closed due to Lemma 2.1.1, and hence weakly closed (see, e.g., Theorem V.3.13 in Ref.[24]). Thus, the general biduality theorem for conjugate functions as stated in Ref.[26] or in Ref.[35], Theorem A.61 yields

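\[ \rho^{**} = \rho, \tag{15} \]

where ρ** denotes the conjugate function of ρ*, i.e.,

\[ \rho^{**}(X) = \sup_{\ell \in (L^\infty)'} \bigl( \ell(X) - \rho^*(\ell) \bigr). \]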

Recall next that (L<sups>∞</sups>)′ can be identified with the space ba(P) ≔ ba(Ω, ℱ, P) of finitely additive set functions with finite total variation that are absolutely continuous with respect to P (see, e.g., Ref.[35], Theorem A.50 or Theorem IV.8.16 in Ref.[24]). Moreover, ρ*(ℓ) < ∞ implies that −ℓ can be identified with some Q ∈ ℳ<subs>1,f</subs>(P), as we will show now. First, the cash invariance of ρ implies that

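\[ \infty > \rho^*(\ell) \ge \sup_{m \in \mathbb{R}} \bigl( \ell(m) - \rho(m) \bigr) = \sup_{m \in \mathbb{R}} m \bigl( \ell(1) + 1 \bigr) - \rho(0), \]

and hence that ℓ(1) = −1. Second, we have ℓ(X) ≤ 0 for X ≥ 0, since

\[ \infty > \rho^*(\ell) \ge \ell(cX) - \rho(cX) \ge c\,\ell(X) - \rho(0) \]

for all c ≥ 0.

Thus, we see that (15) reduces to the representation

\[ \rho(X) = \sup_{Q \in \mathcal{M}_{1,f}(P)} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr). \]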

Moreover, the supremum is actually attained: ℳ<subs>1,f</subs>(P) is weak* compact in (L<sups>∞</sups>)′ = ba(P) due to the Banach–Alaoglu theorem, and so the upper semicontinuous functional Q ↦ E<subs>Q</subs>[−X] − α<subs>min</subs>(Q) attains its maximum on ℳ<subs>1,f</subs>(P).

Finally, let α be any penalty function for ρ. Then, for all Q ∈ ℳ<subs>1,f</subs>(P) and X ∈ L<sups>∞</sups>,

\[ \rho(X) \ge E_Q[-X] - \alpha(Q), \]

and hence

\[ \alpha(Q) \ge \sup_{X \in L^\infty} \bigl( E_Q[-X] - \rho(X) \bigr) \ge \sup_{X \in \mathcal{A}_\rho} \bigl( E_Q[-X] - \rho(X) \bigr) \ge \alpha_{\min}(Q). \]

Thus, α dominates α<subs>min</subs>.

Corollary 2.2.1

The minimal penalty function α<subs>min</subs> of a coherent measure of risk ρ takes only the values 0 and +∞. In particular,

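\[ \rho(X) = \max_{Q \in \mathcal{Q}_{\max}} E_Q[-X], \quad X \in L^\infty, \]

for the convex set

\[ \mathcal{Q}_{\max} := \{ Q \in \mathcal{M}_{1,f}(P) \mid \alpha_{\min}(Q) = 0 \}, \]

and 𝒬<subs>max</subs> is the largest set 𝒬 for which a representation of the form ρ(X) = sup<subs>Q∈𝒬</subs>E<subs>Q</subs>[−X] holds.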

Proof

Due to the positive homogeneity of ρ, its minimal penalty function satisfies

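\[ \alpha_{\min}(Q) = \sup_{X \in \mathcal{A}_\rho} E_Q[-X] = \sup_{\lambda X \in \mathcal{A}_\rho} E_Q[-\lambda X] = \lambda\, \alpha_{\min}(Q) \]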

for all Q ∈ ℳ<subs>1,f</subs>(P) and λ > 0. Hence, α<subs>min</subs> can take only the values 0 and +∞.

The penalty function α arising in the representation (10) is not unique, and it is often convenient to represent a convex measure of risk by a penalty function that is not the minimal one. For instance, the minimal penalty function may be finite for certain finitely additive set functions while another α is concentrated only on probability measures as in the case of Example 2.1.1. Another situation of this type occurs for risk measures that are constructed as the supremum of a family of convex measures of risk:

Proposition 2.2.1

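Suppose that for every i in some index set I we are given a convex measure of risk ρ<subs>i</subs> on L<sups>∞</sups> with associated penalty function α<subs>i</subs>. If sup<subs>i∈I</subs>ρ<subs>i</subs>(0) < ∞ then

\[ \rho(X) := \sup_{i \in I} \rho_i(X), \quad X \in L^\infty, \]

is a convex measure of risk that can be represented with the penalty function

\[ \alpha(Q) := \inf_{i \in I} \alpha_i(Q), \quad Q \in \mathcal{M}_{1,f}(P). \]

Proof

The condition ρ(0) = sup<subs>i∈I</subs>ρ<subs>i</subs>(0) < ∞ implies that ρ takes only finite values. Moreover,

\[ \rho(X) = \sup_{i \in I} \sup_{Q \in \mathcal{M}_{1,f}(P)} \bigl( E_Q[-X] - \alpha_i(Q) \bigr) = \sup_{Q \in \mathcal{M}_{1,f}(P)} \Bigl( E_Q[-X] - \inf_{i \in I} \alpha_i(Q) \Bigr), \]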

and the assertion follows.

In the sequel, we are particularly interested in convex measures of risk, which admit a representation in terms of σ -additive probability measures. Such a risk measure ρ can be represented by a penalty function α, which is infinite outside the set ℳ<subs>1</subs>(P) = ℳ<subs>1</subs>(Ω, ℱ, P):

\[ \rho(X) = \sup_{Q \in \mathcal{M}_1(P)} \bigl( E_Q[-X] - \alpha(Q) \bigr). \]

In this case, one can no longer expect that the supremum above is attained. This will also be illustrated by Example 2.2.1 below.

Theorem 2.2.2

Suppose ρ:L<sups>∞</sups> → ℝ is a convex measure of risk. Then the following conditions are equivalent.

a. ρ can be represented by some penalty function on ℳ<subs>1</subs>(P).

b. ρ can be represented by the restriction of the minimal penalty function α<subs>min</subs> to ℳ<subs>1</subs>(P):

\[ \rho(X) = \sup_{Q \in \mathcal{M}_1(P)} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr). \tag{18} \]

c. ρ is continuous from above: If X<subs>n</subs> ↘ X P-a.s., then ρ(X<subs>n</subs>) ↗ ρ(X).

d. ρ has the "Fatou property": For any bounded sequence (X<subs>n</subs>) which converges P -a.s. to some X,

\[ \rho(X) \le \liminf_{n \uparrow \infty} \rho(X_n). \]

e. ρ is lower semicontinuous for the weak* topology σ(L<sups>∞</sups>,L<sups>1</sups>).

f. The acceptance set 𝒜<subs>ρ</subs> of ρ is weak* closed.

Proof

(b) ⇒ (a) is obvious.

(a) ⇒ (d) Dominated convergence implies that E<subs>Q</subs>[X<subs>n</subs>] → E<subs>Q</subs>[X] for each Q ∈ ℳ<subs>1</subs>(P). Hence,

\[ \rho(X) = \sup_{Q \in \mathcal{M}_1(P)} \Bigl( \lim_{n \uparrow \infty} E_Q[-X_n] - \alpha(Q) \Bigr) \le \liminf_{n \uparrow \infty} \sup_{Q \in \mathcal{M}_1(P)} \bigl( E_Q[-X_n] - \alpha(Q) \bigr) = \liminf_{n \uparrow \infty} \rho(X_n). \]

(d) ⇒ (c) By monotonicity, ρ(X<subs>n</subs>) ≤ ρ(X) for each n if X<subs>n</subs> ↘ X, and so ρ(X<subs>n</subs>) ↗ ρ(X) follows.

(c) ⇒ (d) Define Y<subs>m</subs> ≔ sup<subs>n≥m</subs>X<subs>n</subs>. Then Y<subs>m</subs> decreases P -a.s. to X. Since ρ(X<subs>n</subs>) ≥ ρ(Y<subs>n</subs>) by monotonicity, we get (d) from (c).

(d) ⇒ (e) We have to show that 𝒞 ≔ {ρ ≤ c} is weak* closed for c ∈ ℝ. To this end, let 𝒞<subs>r</subs> ≔ 𝒞 ∩ {X ∈ L<sups>∞</sups> | ‖X‖<subs>∞</subs> ≤ r} for r > 0. If (X<subs>n</subs>) is a sequence in 𝒞<subs>r</subs> converging in L<sups>1</sups> to some random variable X, then there is a subsequence that converges P -a.s., and the Fatou property of ρ implies that X ∈ 𝒞<subs>r</subs>. Hence, 𝒞<subs>r</subs> is closed in L<sups>1</sups> and, due to convexity, also weakly closed in L<sups>1</sups>. Since the natural injection

\[ \bigl( L^\infty, \sigma(L^\infty, L^1) \bigr) \longrightarrow \bigl( L^1, \sigma(L^1, L^\infty) \bigr) \]

is continuous, 𝒞<subs>r</subs> is σ(L<sups>∞</sups>,L<sups>1</sups>) -closed in L<sups>∞</sups>. Thus, 𝒞 is weak* closed due to the Krein–Šmulian theorem (see, e.g., Theorem V.5.7 in Ref.[24]).

(e) ⇒ (f) is obvious, and (f) ⇒ (e) follows from cash invariance.

(e) ⇒ (b) Weak* lower semicontinuity allows us to repeat the proof of Theorem 2.2.1 for the weak* topology on L<sups>∞</sups>. The dual of L<sups>∞</sups> for this topology is given by L<sups>1</sups>, so that ρ* is by definition concentrated on σ -additive measures absolutely continuous with respect to P. Only the compactness argument breaks down for ℳ<subs>1</subs>(P), so that we may no longer replace the supremum by a maximum in (18). All remaining arguments carry over with minor modifications.

Remark 2.2.1

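The proof of Theorem 2.2.2 can be modified in a straightforward manner to cover representation theorems for convex risk measures on the Banach spaces L<sups>p</sups>(Ω, ℱ, P) for 1 ≤ p < ∞. More precisely, let q ∈ (1,∞] be such that 1/p + 1/q = 1, and define

\[ \mathcal{M}^q(P) := \Bigl\{ Q \in \mathcal{M}_1(P) \ \Big|\ \frac{dQ}{dP} \in L^q \Bigr\}. \]

A convex risk measure ρ on L<sups>p</sups> is of the form

\[ \rho(X) = \sup_{Q \in \mathcal{M}^q(P)} \bigl( E_Q[-X] - \alpha(Q) \bigr) \]

if and only if it is lower semicontinuous on L<sups>p</sups>, i.e., the Fatou property holds in the form

\[ X_n \to X \ \text{in } L^p \ \Longrightarrow\ \rho(X) \le \liminf_{n \uparrow \infty} \rho(X_n). \tag{19} \]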

In fact, one can show that (19) is automatically satisfied for any convex risk measure on L<sups>p</sups> with 1 ≤ p < ∞; see Cheridito et al.[13], Proposition 3.8.

Definition 2.2.1

A convex measure of risk ρ on L<sups>∞</sups> is called sensitive with respect to P if

\[ \rho(-X) > \rho(0) \]

for all X ∈ L<sups>∞</sups><subs>+</subs> such that E[X] > 0.

Sensitivity is also called relevance. In the context of Theorem 2.2.2, it is equivalent to the condition that E<subs>Q</subs>[X] > α(Q) for some Q ∈ ℳ<subs>1</subs>(P) as soon as X ≥ 0 is such that E[X] > 0.

Theorem 2.2.2 takes the following form for coherent measures of risk; the proof is the same as the one for Corollary 2.2.1.

Corollary 2.2.2

A coherent measure of risk on L<sups>∞</sups> can be represented by a set 𝒬 ⊂ ℳ<subs>1</subs>(P) if and only if the equivalent conditions of Theorem 2.2.2 are satisfied. In this case, the maximal representing subset of ℳ<subs>1</subs>(P) is given by

\[ \mathcal{Q}_{\max} := \{ Q \in \mathcal{M}_1(P) \mid \alpha_{\min}(Q) = 0 \}. \]

Moreover, ρ is sensitive if and only if 𝒬<subs>max</subs> ∼ P in the sense that for any A ∈ ℱ

\[ P[A] = 0 \iff Q[A] = 0 \ \text{for all } Q \in \mathcal{Q}_{\max}. \]

Continuity from above was one of the equivalent properties in Theorem 2.2.2. If one considers continuity from below in the form

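\[ X_n \nearrow X \ P\text{-a.s.} \ \Longrightarrow\ \rho(X_n) \searrow \rho(X), \]

then it turns out that it is a stronger condition than continuity from above. One can show that continuity from below is equivalent to the following Lebesgue property:

\[ (X_n) \ \text{bounded and } X_n \to X \ P\text{-a.s.} \ \Longrightarrow\ \rho(X_n) \to \rho(X); \]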

see Ref.[34], Remark 4.19 or Ref.[35], Remark 4.23. The argument relies on the fact that continuity from below implies that the minimal penalty function is concentrated on ℳ<subs>1</subs>(P), as is proved in the next corollary for coherent risk measures.

Corollary 2.2.3

For a coherent measure of risk ρ on L<sups>∞</sups> the following properties are equivalent:

a. ρ is continuous from below: X<subs>n</subs> ↗ X ⇒ ρ(X<subs>n</subs>) ↘ ρ(X).

b. There exists a set 𝒬 ⊂ ℳ<subs>1</subs>(P) representing ρ such that the supremum is attained:

\[ \rho(X) = \max_{Q \in \mathcal{Q}} E_Q[-X] \quad \text{for all } X \in L^\infty. \]

c. There exists a set 𝒬 ⊂ ℳ<subs>1</subs>(P) representing ρ such that the set of densities

\[ \mathcal{D} := \Bigl\{ \frac{dQ}{dP} \ \Big|\ Q \in \mathcal{Q} \Bigr\} \]

is weakly compact in L<sups>1</sups>(Ω, ℱ, P).

Proof

(c) ⇒ (a) This follows from Dini's lemma.

(a) ⇒ (b) Consider the representation ρ(X) = max<subs>Q∈&#55349;&#56492;max</subs>E<subs>Q</subs>[−X], where &#55349;&#56492;<subs>max</subs> is the maximal representing subset on ℳ<subs>1,f</subs>(P), and let A<subs>1</subs> ⊂ A<subs>2</subs> ⊂ ··· be any decreasing sequence of events such that ∩<subs>n</subs>A<subs>n</subs> = ∅. Continuity from below implies that Q[A<subs>n</subs>] → 0 for each Q ∈ &#55349;&#56492;<subs>max</subs>, so that &#55349;&#56492;<subs>max</subs> ⊂ ℳ<subs>1</subs>(P).

(b) ⇒ (c) Without loss of generality, we can assume that 𝒟 is weakly closed in L<sups>1</sups>. For any X ∈ L<sups>∞</sups>, the continuous linear functional J<subs>X</subs> on L<sups>1</sups> defined by J<subs>X</subs>(Z) ≔ E[XZ] attains its infimum on 𝒟. According to James' theorem (see, for instance, Ref.[28]), this implies weak compactness of 𝒟.

A result similar to the preceding corollary also holds for convex risk measures on L<sups>∞</sups>; see Ref.[34], Proposition 4.17 or Ref.[35], Proposition 4.21, and Jouini et al.[48]. We now give examples of coherent measures of risk that will be studied in more detail in Section 2.3.

Example 2.2.1

In our present context, where we require condition (9), the worst-case risk measure takes the form

Graph

One can easily check that ρ<subs>max</subs> is coherent and satisfies the Fatou property. Moreover, the acceptance set of ρ<subs>max</subs> is equal to the positive cone L<sups>∞</sups><subs>+</subs> in L<sups>∞</sups>, and this implies α<subs>min</subs>(Q) = 0 for any Q ∈ ℳ<subs>1</subs>(P). Thus,

Graph

Note, however, that the supremum on the right cannot be replaced by a maximum in case (Ω, ℱ, P) cannot be reduced to a finite model. Indeed, let X ∈ L<sups>∞</sups> be such that X does not attain its essential infimum. Then there can be no Q ∈ ℳ<subs>1</subs>(P) such that E<subs>Q</subs>[X] = ess inf X = −ρ<subs>max</subs>(X). In this case, the preceding corollary shows that ρ<subs>max</subs> is not continuous from below.

Example 2.2.2

Let 𝒬<subs>λ</subs> be the class of all Q ∈ ℳ<subs>1</subs>(P) whose density dQ/dP is bounded by 1/λ for some fixed parameter λ ∈ (0, 1). The corresponding coherent risk measure

Graph

will be called the Average Value at Risk at level λ. This terminology will become clear in Section 2.3, which contains a detailed study of AV@R<subs>λ</subs>. Note that the set of densities dQ/dP for Q ∈ 𝒬<subs>λ</subs> is weakly closed in L<sups>1</sups>. Moreover, it is weakly compact due to the Dunford–Pettis theorem. Thus, the supremum in (20) is actually attained, and Corollary 2.2.3 applies. An explicit construction of the maximizing measure will be given in the proof of Theorem 2.3.1.

Example 2.2.3

We take for 𝒬 the class of all conditional distributions P[· | A] such that A ∈ ℱ has P[A] > λ for some fixed level λ ∈ (0, 1). The coherent measure of risk induced by 𝒬,

Graph

is called the worst conditional expectation at level λ. We will show in Section 2.3 that it coincides with the Average Value at Risk of Example 2.2.2 if the underlying probability space is rich enough.

2.3. Value at Risk

As seen in Example 2.1.3, a common approach to the problem of measuring the risk of a financial position X consists in specifying a quantile of the distribution of X under the given probability measure P. In the sequel, we will first recall the notion of a quantile function. We will use the generic notation F<subs>X</subs> for the distribution function of a random variable X. When the emphasis is on the law μ of X, we will also write F<subs>μ</subs>.

Definition 2.3.1

A function q<subs>X</subs>:(0, 1) → ℝ is called a quantile function for X if

Graph

The left- and right-continuous inverse functions of F<subs>X</subs>,

Graph

are called the lower and upper quantile functions. The value q<subs>X</subs>(λ) of a quantile function at a given level λ ∈ (0, 1) is called a λ-quantile of X.

The following lemma explains the reason for calling q<sups>−</sups><subs>X</subs> and q<sups>+</sups><subs>X</subs> the lower and upper quantile functions.

Lemma 2.3.1

A function q:(0, 1) → ℝ is a quantile function for X if and only if

Graph

In particular, q<sups>−</sups><subs>X</subs> and q<sups>+</sups><subs>X</subs> are quantile functions. Moreover, q<sups>−</sups><subs>X</subs> is left-continuous, q<sups>+</sups><subs>X</subs> is right-continuous, and every quantile function q<subs>X</subs> is increasing and satisfies q<subs>X</subs>(s −) = q<subs>X</subs><sups>−</sups>(s) and q<subs>X</subs>(s +) = q<subs>X</subs><sups>+</sups>(s) for all s ∈ (0, 1). In particular, any two quantile functions coincide a.e. on (0, 1).

Proof

We have q<subs>X</subs><sups>−</sups> ≤ q<subs>X</subs><sups>+</sups>, and any quantile function q<subs>X</subs> satisfies q<subs>X</subs><sups>−</sups> ≤ q<subs>X</subs> ≤ q<subs>X</subs><sups>+</sups>, due to the definitions of q<subs>X</subs><sups>−</sups> and q<sups>+</sups><subs>X</subs>. Hence, the first part of the assertion follows if we can show that F<subs>X</subs>(q<sups>+</sups><subs>X</subs>(s) −) ≤ s ≤ F<subs>X</subs>(q<subs>X</subs><sups>−</sups>(s)) for all s. But x < q<subs>X</subs><sups>+</sups>(s) implies F<subs>X</subs>(x) ≤ s and y > q<subs>X</subs><sups>−</sups>(s) implies F<subs>X</subs>(y) ≥ s, which gives the result. Next, the set {x | F<subs>X</subs>(x) > s} is the union of the sets {x | F<subs>X</subs>(x) > s + ϵ} for ϵ > 0, and so q<sups>+</sups><subs>X</subs> is right-continuous. An analogous argument shows the left-continuity of q<sups>−</sups><subs>X</subs>. It is clear that both q<subs>X</subs><sups>−</sups> and q<subs>X</subs><sups>+</sups> are increasing, so that the second part of the assertion follows.

Remark 2.3.1

The left- and right-continuous quantile functions can also be represented as

Graph

To see this, note first that q<subs>X</subs><sups>−</sups>(s) is clearly dominated by the infimum. On the other hand, y > q<subs>X</subs><sups>−</sups>(s) implies F<subs>X</subs>(y) ≥ s, and we get q<subs>X</subs><sups>−</sups>(s) ≥ inf{x ∈ ℝ | F<subs>X</subs>(x) ≥ s}. The proof for q<subs>X</subs><sups>+</sups> is analogous.
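For a distribution with finitely many atoms, the two quantile functions can be computed directly from the descriptions used above, q<subs>X</subs><sups>−</sups>(s) = inf{x | F<subs>X</subs>(x) ≥ s} and q<subs>X</subs><sups>+</sups>(s) = inf{x | F<subs>X</subs>(x) > s}. The following Python sketch is only an illustration; the three-point distribution and the function names are hypothetical and not part of the original text.

```python
from fractions import Fraction

def cdf_steps(values, probs):
    """Sorted list of (x, F_X(x)) for a finite distribution with distinct values."""
    steps, total = [], Fraction(0)
    for x, p in sorted(zip(values, probs)):
        total += p
        steps.append((x, total))
    return steps

def lower_quantile(values, probs, s):
    """q^-(s) = inf{x : F_X(x) >= s}, for s in (0, 1)."""
    return next(x for x, F in cdf_steps(values, probs) if F >= s)

def upper_quantile(values, probs, s):
    """q^+(s) = inf{x : F_X(x) > s}, for s in (0, 1)."""
    return next(x for x, F in cdf_steps(values, probs) if F > s)

# Hypothetical example: P[X = -1] = 1/4, P[X = 0] = 1/4, P[X = 2] = 1/2.
values = [-1, 0, 2]
probs = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]

# F_X equals 1/2 on [0, 2), so the two quantiles differ at level s = 1/2.
half = Fraction(1, 2)
q_lo = lower_quantile(values, probs, half)
q_hi = upper_quantile(values, probs, half)
print(q_lo, q_hi)
```

At levels where F<subs>X</subs> has no flat piece, such as s = 1/3 in this example, both functions return the same value, in line with the statement that any two quantile functions coincide a.e.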

The following basic fact is well known. See, e.g., Ref.[35], Appendix A.3 for a proof.

Lemma 2.3.2

Let U be a random variable on a probability space (Ω, ℱ, P) with a uniform distribution on (0, 1), i.e., P[U ≤ s] = s for all s ∈ (0, 1). If q<subs>X</subs> is a quantile function for the random variable X, then

Graph

has the same distribution as X. If, moreover, F<subs>X</subs> is continuous, then U ≔ F<subs>X</subs>(X) is uniformly distributed on (0, 1), and X = q<subs>X</subs>(U) P-almost surely.

The second part of the preceding lemma implies that a probability space supports a random variable with uniform distribution on (0, 1) if and only if it supports any non-constant random variable X with a continuous distribution. See Lemma 2.4.2 below for a more general result.
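The first part of Lemma 2.3.2 is the basis of inverse-transform sampling: applying a quantile function to a uniform random variable reproduces the law of X. The sketch below checks this empirically for a hypothetical three-point distribution, using the lower quantile function and Python's standard pseudo-random generator; the sample size, seed, and tolerance are illustrative choices only.

```python
import random
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

def sample_via_quantile(values, probs, n, seed=0):
    """Draw n samples of q_X(U), U uniform on (0, 1), using the lower quantile q^-."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    cum = list(accumulate(probs[i] for i in order))  # F_X at the sorted values
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        u = rng.random()
        # First index with F_X(x) >= u; min() guards against rounding in cum[-1].
        idx = min(bisect_left(cum, u), len(xs) - 1)
        draws.append(xs[idx])
    return draws

# Hypothetical distribution of X.
values = [-1.0, 0.0, 2.0]
probs = [0.25, 0.25, 0.5]

n = 20000
counts = Counter(sample_via_quantile(values, probs, n))
freqs = {x: counts[x] / n for x in values}
print(freqs)  # empirical frequencies, expected to be close to probs
```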

In this section, we will focus on the properties of q<subs>X</subs><sups>+</sups>(λ), viewed as a functional on a space of financial positions X. Many results presented here were first obtained by Artzner et al.[3], Delbaen[20][21], and Acerbi and Tasche[1]; see the notes in Ref.[35] for details.

Definition 2.3.2

Fix some level λ ∈ (0, 1). For a financial position X, we define its Value at Risk at level λ as

Graph

In financial terms, V@R<subs>λ</subs>(X) is the smallest amount of capital which, if added to X and invested in the risk-free asset, keeps the probability of a negative outcome below the level λ. However, Value at Risk only controls the probability of a loss; it does not capture the size of such a loss if it occurs. Clearly, V@R<subs>λ</subs> is a monetary measure of risk on 𝒳 = L<sups>0</sups> and positively homogeneous. The following example shows that the acceptance set of V@R<subs>λ</subs> is typically not convex, and so V@R<subs>λ</subs> is not a convex measure of risk. In particular, V@R<subs>λ</subs> may penalize diversification instead of encouraging it, and this fact will also be illustrated by the example.

Example 2.3.1

Consider an investment into two defaultable corporate bonds, each with return r̃ > r, where r ≥ 0 is the return on a riskless investment. The discounted net gain, or P&L, of an investment w > 0 in the ith bond is given by

Graph

If a default of the first bond occurs with probability p ≤ λ, then

Graph

Hence,

Graph

This means that the position X<subs>1</subs> is acceptable in the sense that it does not carry a positive Value at Risk, regardless of the possible loss of the entire investment w. In fact, from a V@R point of view, an investment into the defaultable bond is actually "less risky" than a risk-free investment with P&L R ≡ 0 and V@R<subs>λ</subs>(R) = 0.

Diversifying the portfolio by investing the amount w/2 into each of the two bonds leads to the position Y ≔ (X<subs>1</subs> + X<subs>2</subs>)/2. Let us assume that the two bonds default independently of each other, each of them with probability p. For realistic r̃, the probability that Y is negative is equal to the probability that at least one of the two bonds defaults: P[Y < 0] = p(2 − p). If, for instance, p = 0.009 and λ = 0.01, then we have p < λ < p(2 − p), hence

Graph

Typically, this value is close to one half of the invested capital w and thus constitutes a dramatic increase compared to the investment where the entire amount w is invested into one defaultable bond only. Thus, the acceptance set of V@R<subs>λ</subs> is not convex. This example also illustrates that V@R may strongly discourage diversification: It penalizes quite drastically the increase of the probability that something goes wrong, without rewarding the significant reduction of the expected loss conditional on the event of default. Thus, optimizing a portfolio with respect to V@R<subs>λ</subs> may lead to a concentration of the portfolio in one single asset with a sufficiently small default probability, but with an exposure to large losses.
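The numbers in Example 2.3.1 can be reproduced with a few lines of Python. The sketch below uses the description V@R<subs>λ</subs>(X) = inf{m | P[X + m < 0] ≤ λ} and assumes, as an illustration, that each bond pays the discounted gain w(r̃ − r)/(1 + r) if it survives and loses the full amount w on default; the values w = 100, r = 3% and r̃ = 5% are hypothetical, while p = 0.009 and λ = 0.01 are those of the example.

```python
from fractions import Fraction

def value_at_risk(values, probs, lam):
    """V@R_lam(X) = inf{m : P[X + m < 0] <= lam} for a finite distribution."""
    best = None
    for x in sorted(set(values)):
        p_below = sum(p for v, p in zip(values, probs) if v < x)  # P[X < x]
        if p_below <= lam:
            best = x  # ends as the largest x with P[X < x] <= lam, i.e. q^+(lam)
    return -best

# Hypothetical market data (assumptions for illustration only).
w = Fraction(100)                                  # invested capital
r, r_bond = Fraction(3, 100), Fraction(5, 100)     # riskless and bond returns
p, lam = Fraction(9, 1000), Fraction(1, 100)       # default probability, level

gain = w * (r_bond - r) / (1 + r)                  # assumed discounted gain without default

# Entire capital in one bond.
var_single = value_at_risk([-w, gain], [p, 1 - p], lam)

# Half of the capital in each of two independent bonds: Y = (X1 + X2) / 2.
y_values = [-w, (gain - w) / 2, gain]
y_probs = [p * p, 2 * p * (1 - p), (1 - p) ** 2]
var_diversified = value_at_risk(y_values, y_probs, lam)

print(float(var_single), float(var_diversified))
```

Under these assumptions the concentrated position has a negative Value at Risk, whereas the diversified position has a Value at Risk of roughly one half of w.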

In the remainder of this section, we will focus on monetary measures of risk which, in contrast to V@R<subs>λ</subs>, are convex or even coherent on 𝒳 ≔ L<sups>∞</sups>. In particular, we are looking for convex risk measures which come close to V@R<subs>λ</subs>. A first guess might be that one should take the smallest convex measure of risk, continuous from above, which dominates V@R<subs>λ</subs>. However, since V@R<subs>λ</subs> itself is not convex, the following proposition shows that such a smallest V@R<subs>λ</subs>-dominating convex measure of risk does not exist. A proof can be found in Ref.[20].

Proposition 2.3.1

For each X ∈ L<sups>∞</sups> and each λ ∈ (0, 1),

Graph

For the rest of this section, we concentrate on the following risk measure which is defined in terms of Value at Risk, but does satisfy the axioms of a coherent risk measure.

Definition 2.3.3

The Average Value at Risk at level λ ∈ (0,1] of a position X ∈ L<sups>∞</sups> is given by

Graph

Sometimes, the Average Value at Risk is also called the "Conditional Value at Risk" or the "expected shortfall", and one writes CV@R<subs>λ</subs>(X) or ES<subs>λ</subs>(X). These terms are motivated by formulas (26) and (23) below, but they are potentially misleading: "Conditional Value at Risk" might also be used to denote the Value at Risk with respect to a conditional distribution, and "expected shortfall" might be understood as the expectation of the shortfall X<sups>−</sups>. For these reasons, we prefer the term Average Value at Risk. Note that

Graph

by (22). In particular, the definition of AV@R<subs>λ</subs>(X) makes sense for any X ∈ L<sups>1</sups>(Ω, ℱ, P) and we have, in view of Lemma 2.3.2,

Graph

Remark 2.3.2

For X ∈ L<sups>∞</sups>, we have lim<subs>λ ↓ 0</subs>V@R<subs>λ</subs>(X) = −ess inf X = inf{m | P[X + m < 0] ≤ 0}. Hence, it makes sense to define

Graph

which is the worst-case risk measure on L<sups>∞</sups> introduced in Example 2.2.1. Recall that it is continuous from above but in general not from below.

Lemma 2.3.3

For λ ∈ (0, 1) and any λ-quantile q of X,

Graph

Proof

Let q<subs>X</subs> be a quantile function with q<subs>X</subs>(λ) = q. By Lemma 2.3.2,

Graph

This proves the first identity. For the second one, let f denote the convex function f(r) ≔ E[(r − X)<sups>+</sups>] − λ r. Note that the right-hand and left-hand derivatives of f are given by f′<subs>+</subs>(r) = F<subs>X</subs>(r) − λ and f′<subs>−</subs>(r) = F<subs>X</subs>(r −) − λ. A point r is a minimizer of f iff f<subs>+</subs>′(r) ≥ 0 and f<subs>−</subs>′(r) ≤ 0, which is equivalent to r being a λ-quantile. This proves the second identity.
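The convex function f from the proof lends itself to a quick numerical check. The sketch below assumes the usual reading of the two identities, namely that AV@R<subs>λ</subs>(X) equals (1/λ) min<subs>r</subs> f(r) with f(r) = E[(r − X)<sups>+</sups>] − λ r, and that AV@R<subs>λ</subs>(X) is minus the average of the quantile function over (0, λ). The ten-point sample with the uniform law is hypothetical.

```python
from fractions import Fraction

def f(r, xs, lam):
    """f(r) = E[(r - X)^+] - lam * r under the uniform law on the sample xs."""
    return Fraction(sum(max(r - x, 0) for x in xs), len(xs)) - lam * r

def avar_by_minimization(xs, lam):
    # f is convex and piecewise linear with kinks at the sample points,
    # so its minimum over the real line is attained at one of them.
    return min(f(r, xs, lam) for r in xs) / lam

def avar_by_tail_average(xs, lam):
    """-(1/lam) * integral_0^lam q_X(t) dt, assuming lam * len(xs) is an integer."""
    k = lam * len(xs)
    assert k.denominator == 1
    worst = sorted(xs)[: int(k)]
    return -Fraction(sum(worst), int(k))

# Hypothetical sample of ten equally likely outcomes of X.
xs = [-5, -2, -1, 0, 1, 2, 3, 4, 6, 8]
lam = Fraction(1, 5)

avar_min = avar_by_minimization(xs, lam)
avar_tail = avar_by_tail_average(xs, lam)
f_min = min(f(r, xs, lam) for r in xs)
minimizers = [r for r in xs if f(r, xs, lam) == f_min]
print(avar_min, avar_tail, minimizers)
```

In this example the sample points minimizing f are exactly the endpoints of the interval [q<subs>X</subs><sups>−</sups>(λ), q<subs>X</subs><sups>+</sups>(λ)] of λ-quantiles, as the proof predicts.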

Theorem 2.3.1

For λ ∈ (0,1], AV@R<subs>λ</subs> is a coherent measure of risk which is continuous from below. It has the representation

Graph

where 𝒬<subs>λ</subs> is the set of all probability measures Q ≪ P whose density dQ/dP is P-a.s. bounded by 1/λ. Moreover, 𝒬<subs>λ</subs> is equal to the maximal set 𝒬<subs>max</subs> of Corollary 2.2.2.

The proof relies on the following version of the classical Neyman–Pearson lemma, which we recall here for the convenience of the reader. It is concerned with the infinite-dimensional optimization problem

Graph

where λ ∈ [0, 1] and is probability measure such that .

Proposition 2.3.2 (Neyman–Pearson Lemma)

A solution to the problem (25) is given by

Graph

where q is a (1 − λ)-quantile of ϕ with respect to P and κ is defined as

Graph

Moreover, any other solution coincides with ψ<sups>0</sups>, P -a.s. on {ϕ ≠ q}. In particular, ψ<sups>0</sups> is the P -a.s. unique σ(ϕ) -measurable maximizer.

Proof

We first show that ψ<sups>0</sups> satisfies the constraints of the problem. Let F<subs>ϕ</subs> denote the distribution function of ϕ under P. Then P[ϕ > q] = 1 − F<subs>ϕ</subs>(q) ≤ λ and

Graph

Hence 0 ≤ κ ≤ 1 and in turn 0 ≤ ψ<sups>0</sups> ≤ 1. The fact that E[ψ<sups>0</sups>] = λ is obvious.

Next, let ψ be any other measurable function satisfying the constraints. Then (ψ<sups>0</sups> − ψ) (ϕ − q) ≥ 0. Hence

Graph

Thus, ψ<sups>0</sups> solves the optimization problem.

Finally, suppose that ψ* is another solution. Then and also E[ψ*] = E[ψ<sups>0</sups>], due to the already established fact that ψ<sups>0</sups> is a solution. Hence,

Graph

But we have seen above that (ψ<sups>0</sups> − ψ*)(ϕ − q) ≥ 0. Hence, (ψ<sups>0</sups> − ψ*) (ϕ − q) = 0 P-a.s., i.e., ψ* = ψ<sups>0</sups>P-a.s. on {ϕ ≠ q}.

Proof of Theorem 2.3.1

Since 𝒬<subs>1</subs> = {P}, the assertion is obvious for λ = 1. For 0 < λ < 1, we will show that the coherent risk measure ρ<subs>λ</subs>(X) ≔ sup<subs>Q∈𝒬λ</subs>E<subs>Q</subs>[−X] is such that the supremum in its definition is attained and that ρ<subs>λ</subs>(X) = AV@R<subs>λ</subs>(X). Since both ρ<subs>λ</subs> and AV@R<subs>λ</subs> are cash invariant and positively homogeneous, we may assume without loss of generality that X < 0 with E[−X] = 1. We define a measure by . Then

Graph

Due to the Neyman–Pearson lemma, the supremum is attained by

Graph

for a λ -quantile q of X and some κ ∈ [0, 1] for which E[ψ<subs>0</subs>] = λ. Note that a λ -quantile of X is the negative of a (1 − λ) -quantile for −X. We get

Graph

Since dQ<subs>0</subs> = λ<sups>−1</sups>ψ<subs>0</subs> dP defines a probability measure in 𝒬<subs>λ</subs>, we conclude that

Graph

where we have used (23) in the last step. This proves (24).

It remains to prove that 𝒬<subs>λ</subs> is the maximal set of Corollary 2.2.2. To this end, we show that

Graph

We denote ψ ≔ dQ/dP. There exist λ′ ∈ (0,λ) and k > 1/λ′ such that P[ψ∧ k ≥ 1/λ′] > 0. For c > 0 define X<sups>(c)</sups> ∈ L<sups>∞</sups> by

Graph

Since

Graph

we have V@R<subs>λ</subs>(X<sups>(c)</sups>) = 0, and (23) yields that

Graph

On the other hand,

Graph

Thus, the difference between E<subs>Q</subs>[−X<sups>(c)</sups>] and AV@R<subs>λ</subs>(X<sups>(c)</sups>) becomes arbitrarily large as c ↑ ∞.

Remark 2.3.3

The proof shows that for λ ∈ (0, 1) the maximum in (24) is attained by the measure Q<subs>0</subs> ∈ 𝒬<subs>λ</subs>, whose density is given by

Graph

where q is a λ -quantile of X, and where κ is defined as

Graph
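For a finite distribution, the maximizing measure of Remark 2.3.3 can be written down explicitly. The sketch below follows the construction in the proof of Theorem 2.3.1: the density equals 1/λ on {X < q}, κ/λ on {X = q} and 0 elsewhere, with κ chosen so that the density integrates to one. The four-point distribution is hypothetical, and the comparison value is computed as minus the average of the quantile function over (0, λ), which is assumed here to be the defining formula of AV@R<subs>λ</subs>.

```python
from fractions import Fraction

def maximizing_density(values, probs, lam):
    """Density dQ0/dP = (1/lam) * (1_{X<q} + kappa * 1_{X=q}) for a lam-quantile q."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cum = Fraction(0)
    for i in order:
        cum += probs[i]
        if cum >= lam:          # q is the lower lam-quantile of X
            q = values[i]
            break
    p_less = sum(p for v, p in zip(values, probs) if v < q)
    p_eq = sum(p for v, p in zip(values, probs) if v == q)
    kappa = (lam - p_less) / p_eq   # chosen so that the density has P-mean one
    return [(1 if v < q else kappa if v == q else 0) / lam for v in values]

def avar_quantile_integral(values, probs, lam):
    """-(1/lam) * integral_0^lam q_X(t) dt for a finite distribution."""
    remaining, total = lam, Fraction(0)
    for v, p in sorted(zip(values, probs)):
        mass = min(p, remaining)
        total += mass * v
        remaining -= mass
        if remaining == 0:
            break
    return -total / lam

# Hypothetical finite model.
values = [-4, -1, 0, 3]
probs = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10), Fraction(2, 5)]
lam = Fraction(1, 5)

density = maximizing_density(values, probs, lam)
total_mass = sum(d * p for d, p in zip(density, probs))
expected_loss_q0 = sum(d * p * (-v) for d, p, v in zip(density, probs, values))
avar = avar_quantile_integral(values, probs, lam)
print(density, expected_loss_q0, avar)
```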

Corollary 2.3.1

For all X ∈ L<sups>∞</sups>,

Graph

where WCE<subs>λ</subs> is the coherent risk measure defined in (21). Moreover, the first two inequalities are in fact identities if

Graph

which is the case if X has a continuous distribution.

Proof

If P[A] ≥ λ, then the density of P[· | A] with respect to P is bounded by 1/λ. Therefore, Theorem 2.3.1 implies that AV@R<subs>λ</subs> dominates WCE<subs>λ</subs>. Since

Graph

we have

Graph

and the second inequality follows by taking the limit as ϵ ↓ 0. Moreover, (23) shows that

Graph

as soon as (27) holds.
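On a small finite probability space, all four quantities compared in Corollary 2.3.1 can be computed by brute force. The sketch below assumes that the chain (26) reads AV@R<subs>λ</subs>(X) ≥ WCE<subs>λ</subs>(X) ≥ E[−X | −X ≥ V@R<subs>λ</subs>(X)] ≥ V@R<subs>λ</subs>(X); the five-atom model is hypothetical, and WCE<subs>λ</subs> is evaluated by enumerating all events A with P[A] > λ as in Example 2.2.3.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical model with five atoms.
probs = [Fraction(1, 20), Fraction(1, 10), Fraction(1, 4), Fraction(3, 10), Fraction(3, 10)]
X = [-10, -4, -1, 2, 5]
lam = Fraction(1, 10)
atoms = range(len(X))

def prob(A):
    return sum(probs[i] for i in A)

def cond_exp_loss(A):
    """E[-X | A] for an event A given as a collection of atoms."""
    return sum(-X[i] * probs[i] for i in A) / prob(A)

# V@R_lam(X) = -q^+(lam), with q^+(lam) the largest x such that P[X < x] <= lam.
q_plus = max(x for x in X if prob([i for i in atoms if X[i] < x]) <= lam)
var = -q_plus

# Conditional expectation of the loss on the tail event {-X >= V@R}.
tail_cond = cond_exp_loss([i for i in atoms if -X[i] >= var])

# Worst conditional expectation: supremum over all events with P[A] > lam.
events = (A for k in range(1, len(X) + 1) for A in combinations(atoms, k))
wce = max(cond_exp_loss(A) for A in events if prob(A) > lam)

# AV@R as minus the average of the quantile function over (0, lam).
remaining, total = lam, Fraction(0)
for x, p in sorted(zip(X, probs)):
    mass = min(p, remaining)
    total += mass * x
    remaining -= mass
avar = -total / lam

print(avar, wce, tail_cond, var)
```

In this model the first inequality is strict, which is possible because the space is not atomless.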

Remark 2.3.4

We will see in Corollary 2.4.3 that the two coherent risk measures AV@R<subs>λ</subs> and WCE<subs>λ</subs> coincide if the underlying probability space is rich enough. If this is not the case, then the first inequality in (26) may be strict for some X; see Ref.[1]. Moreover, the functional

Graph

does not define a convex measure of risk. Hence, the second inequality in (26) cannot reduce to an identity in general; see Ref.[1].

We have seen in Proposition 2.3.1 that there is no smallest convex risk measure dominating V@R<subs>λ</subs>. But if we restrict our attention to the class of convex risk measures that dominate V@R<subs>λ</subs> and only depend on the distribution of a random variable, then the situation is different. In fact, we will see in Theorem 2.4.4 that AV@R<subs>λ</subs> is the smallest risk measure in this class, provided that the underlying probability space is rich enough. In this sense, Average Value at Risk can be regarded as the best conservative approximation to Value at Risk.

2.4. Law-Invariant Risk Measures

Clearly, V@R<subs>λ</subs> and AV@R<subs>λ</subs> only involve the distribution of a position under the given probability measure P. In this section we study the class of all risk measures which share this property of law-invariance. Such risk measures were first discussed systematically by Kusuoka[53]. The extensions to the convex case were given by Dana[19], Föllmer and Schied[35], and Kunze[52].

Definition 2.4.1

A monetary measure of risk ρ on 𝒳 = L<sups>∞</sups>(Ω, ℱ, P) is called law-invariant if ρ(X) = ρ(Y) whenever X and Y have the same distribution under P.

Throughout this section, we assume that the probability space (Ω, ℱ, P) is rich enough in the sense that it supports a random variable with a continuous distribution. This condition is satisfied if and only if (Ω, ℱ, P) is atomless. We can now formulate our first structure theorem for law-invariant convex risk measures.

Theorem 2.4.1

Let ρ be a convex measure of risk and suppose that ρ is continuous from above. Then ρ is law-invariant if and only if its minimal penalty function α<subs>min</subs>(Q) depends only on the law of ϕ<subs>Q</subs> ≔ dQ/dP under P when Q ∈ ℳ<subs>1</subs>(P). In this case, ρ has the representation

Graph

and the minimal penalty function satisfies

Graph

The condition of continuity from above can in fact be dropped: every law-invariant convex risk measure is automatically continuous from above, as was shown very recently by Jouini et al.[48]. We will see a particular case of this general fact in Theorem 2.5.1 below.

The proof of Theorem 2.4.1 uses the following general results on quantile functions. They will also be useful in the second part of these notes.

Lemma 2.4.1

If X = f(Y) for an increasing function f and q<subs>Y</subs> is a quantile function for Y, then f(q<subs>Y</subs>(t)) is a quantile function for X. In particular,

Graph

for any quantile function q<subs>X</subs> of X.

If f is decreasing, then f(q<subs>Y</subs>(1 − t)) is a quantile function for X. In particular,

Graph

Proof

If f is decreasing, then q(t) ≔ f(q<subs>Y</subs>(1 − t)) satisfies

Graph

Hence q(t) = f(q<subs>Y</subs>(1 − t)) is a quantile function. A similar argument applies to an increasing function f.

The following theorem is a version of the Hardy–Littlewood inequalities. They estimate the expectation E[XY] in terms of quantile functions q<subs>X</subs> and q<subs>Y</subs>.

Theorem 2.4.2

Let X, Y ≥ 0 be two random variables on (Ω, ℱ, P) with quantile functions q<subs>X</subs> and q<subs>Y</subs>. Then,

Graph

Moreover, if X = f(Y) and the lower (upper) bound is finite, then the lower (upper) bound is attained if and only if f can be chosen as a decreasing (increasing) function.

Proof

By Fubini's theorem,

Graph

Since

Graph

and since

Graph

for any random variable Z ≥ 0, another application of Fubini's theorem yields

Graph

In the same way, the upper estimate follows from the inequality

Graph

For X = f(Y),

Graph

due to Lemma 2.3.2, and so Lemma 2.4.1 implies that the upper and lower bounds are attained for increasing and decreasing functions, respectively.

Conversely, assume that X = f(Y), and that the upper bound is attained and finite:

Graph

Our aim is to show that P-a.s. X = f(Y) = f̃(Y), where f̃ is the increasing function on [0, ∞) defined by f̃(x) ≔ q<subs>X</subs>(F<subs>Y</subs>(x)) if x is a continuity point of F<subs>Y</subs>, and by

Graph

otherwise. Note that

Graph

where E<subs>λ</subs>[· | q<subs>Y</subs>] denotes the conditional expectation with respect to q<subs>Y</subs> under the Lebesgue measure λ on (0, 1). Hence, (31) takes the form

Graph

where we have used Lemma 2.3.2. Let ν denote the distribution of Y. By introducing the positive measures dμ = f dν and dμ̃ = f̃ dν, (33) can be written as

Graph

On the other hand, with g denoting the increasing function 𝕀<subs>[y,∞)</subs>, the upper Hardy–Littlewood inequality, Lemma 2.4.1, and (32) yield

Graph

In view of (34), we obtain μ = μ̃, hence f = f̃ ν-a.s. and X = f̃(Y) P-almost surely. An analogous argument applies to the lower bound.
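For two random variables defined on finitely many equally likely outcomes, the Hardy–Littlewood bounds reduce to the classical rearrangement inequality. The sketch below assumes that the bounds of Theorem 2.4.2 are the integrals of q<subs>X</subs>(1 − s) q<subs>Y</subs>(s) and of q<subs>X</subs>(s) q<subs>Y</subs>(s) over (0, 1); under the uniform law on n points these become averages of products of oppositely and similarly sorted samples. The data are hypothetical.

```python
from fractions import Fraction

def hardy_littlewood_bounds(xs, ys):
    """Lower bound, E[XY], and upper bound under the uniform law on paired outcomes."""
    n = len(xs)
    x_up, y_up = sorted(xs), sorted(ys)
    lower = Fraction(sum(a * b for a, b in zip(reversed(x_up), y_up)), n)
    actual = Fraction(sum(a * b for a, b in zip(xs, ys)), n)
    upper = Fraction(sum(a * b for a, b in zip(x_up, y_up)), n)
    return lower, actual, upper

# Hypothetical nonnegative samples; outcome i carries the pair (xs[i], ys[i]).
xs = [3, 1, 4, 1, 5]
ys = [2, 7, 1, 8, 2]
lower, actual, upper = hardy_littlewood_bounds(xs, ys)
print(lower, actual, upper)

# If X = f(Y) for an increasing f, here f(y) = y**2, the upper bound is attained.
xs_comonotone = [y * y for y in ys]
_, actual_inc, upper_inc = hardy_littlewood_bounds(xs_comonotone, ys)
print(actual_inc == upper_inc)
```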

The following lemma generalizes the second part of Lemma 2.3.2.

Lemma 2.4.2

If X is a random variable on an atomless probability space, then there exists a random variable U with a uniform law on (0, 1) such that X = q<subs>X</subs>(U) P-almost surely.

Proof

We follow Ryff[59]. Without loss of generality, we may assume that q<subs>X</subs> = q<sups>+</sups><subs>X</subs>. Then I<subs>x</subs> ≔ {t ∈ (0, 1) | q<subs>X</subs>(t) = x} is a (possibly empty or degenerate) real interval with Lebesgue measure λ(I<subs>x</subs>) = P[X = x] for each x ∈ ℝ. Consider the set D ≔ {x ∈ ℝ | P[X = x] > 0}, which is at most countable. For each x ∈ D, the probability space (Ω, ℱ, P[· | X = x]) is again atomless and hence supports a random variable U<subs>x</subs> with a uniform law on I<subs>x</subs>. That is, P[U<subs>x</subs> ∈ A | X = x] = λ(A ∩ I<subs>x</subs>)/λ(I<subs>x</subs>) or, equivalently,

Graph

On D<sups>c</sups> = (0, 1)\ D, q<subs>X</subs> is one-to-one and hence admits a measurable inverse function F (which can actually be taken as F<subs>X</subs>, but this fact will not be needed here). We let

Graph

which clearly is a measurable random variable. By definition we have q<subs>X</subs>(U(ω)) = X(ω) for all ω. It remains to show that U has a uniform law. To this end, take a measurable subset A of (0, 1). Using (35) we get

Graph

Now let I<sups>c</sups> denote the complement of ∪<subs>x∈D</subs>I<subs>x</subs> in (0, 1). Then {X ∉ D} = {X ∈ q<subs>X</subs>(I<sups>c</sups>)} P-a.s. and hence

Graph

where we have used the fact that . This proves the result.

The preceding lemma and Theorem 2.4.2 imply the following result. Recall that we assume that (Ω, ℱ, P) is atomless.

Lemma 2.4.3

For Y ∈ L<sups>∞</sups> and X ∈ L<sups>1</sups>,

Graph

where Ỹ ∼ Y indicates that Ỹ is a random variable with the same law as Y. Moreover, the maximum is attained for Ỹ = q<subs>Y</subs>(U), where U is as in Lemma 2.4.2, i.e., U has a uniform law on (0, 1) and satisfies X = q<subs>X</subs>(U) P-a.s.

Proof

The upper Hardy–Littlewood inequality in Theorem 2.4.2 yields "≥". To prove the reverse inequality, let U be as in Lemma 2.4.2. According to Lemma 2.3.2, Ỹ ≔ q<subs>Y</subs>(U) then has the same law as Y. Hence,

Graph

Proof of Theorem 2.4.1

Suppose first that ρ is law-invariant. Then X ∈ 𝒜<subs>ρ</subs> implies that for all . Hence,

Graph

by Lemma 2.4.3. It follows that α<subs>min</subs>(Q) depends only on the law of ϕ<subs>Q</subs>. In order to check the second identity in (28), note that belongs to 𝒜<subs>ρ</subs> for any X ∈ L<sups>∞</sups> and that q<subs>−X</subs> − ρ(X) is a quantile function for .

Conversely, let us assume that α<subs>min</subs>(Q) depends only on the law of ϕ<subs>Q</subs>. Let us write to indicate that ϕ<subs>Q</subs> and have the same law. Then Lemma 2.4.3 yields

Graph

Example 2.4.1

Let u:ℝ → ℝ be an increasing concave function, and suppose that a position X ∈ L<sups>∞</sups> is acceptable if E[u(X)] ≥ c, where c is a given constant in the interior of u(ℝ). We have seen in Example 2.1.2 that the corresponding acceptance set induces a convex risk measure ρ. Clearly, ρ is law-invariant, and it is not difficult to show that ρ is continuous from below and, hence, from above; see Ref.[34], Proposition 4.59 or Ref.[35], Proposition 4.104. Moreover, the corresponding minimal penalty function can be computed as

Graph

where

Graph

is the Fenchel–Legendre transform of the convex increasing loss function ℓ(x) ≔ −u(− x); see Ref.[33], Theorem 10, Ref.[34], Theorem 4.61, or Ref.[35], Theorem 4.106.

For a probability density ϕ = dQ/dP, the functional

Graph

appearing in Theorem 2.4.1 is sometimes called the maximal correlation risk measure. The following theorem shows that ρ<subs>ϕ</subs> can be represented as a mixture of the risk measures AV@R<subs>λ</subs> and hence is itself a coherent measure of risk. Recall that we assume that (Ω, ℱ, P) is atomless.

Theorem 2.4.3

Let ϕ = dQ/dP for some Q ∈ ℳ<subs>1</subs>(P). Then there exists a probability measure μ on (0,1] such that

Graph

In particular, a convex measure of risk ρ is law-invariant and continuous from above if and only if

Graph

where

Graph

Proof

Since q<subs>−X</subs>(t)=V@R<subs>1−t</subs>(X) and q<subs>ϕ</subs>(t) = q<sups>+</sups><subs>ϕ</subs>(t) for a.e. t ∈ (0, 1),

Graph

Since q<subs>ϕ</subs><sups>+</sups> is increasing and right-continuous, we can write q<subs>ϕ</subs><sups>+</sups>(t) = ν((1 − t,1]) for some positive locally finite measure ν on (0,1]. Moreover, the measure μ given by μ(dt) = tν(dt) is a probability measure on (0,1]:

Graph

Thus,

Graph

The second assertion in Theorem 2.4.3 takes the following form for coherent measures of risk.

Corollary 2.4.1

A coherent risk measure ρ is continuous from above and law-invariant if and only if

Graph

for some set ℳ ⊂ ℳ<subs>1</subs>((0,1]).

The preceding result is due to Kusuoka[53]. We point out once more that the condition of continuity from above can actually be dropped according to a recent result by Jouini et al.[48].

Law-invariant convex risk measures enjoy the following Jensen-type inequality, which is due to H. Föllmer and taken from Ref.[66]. Here we give a proof based on Lemma 2.3.3 and Theorem 2.4.3.

Corollary 2.4.2

Assume that ρ is a convex risk measure which is continuous from above and law-invariant. Then, for X ∈ L<sups>∞</sups> and any σ-algebra 𝒢 ⊂ ℱ,

Graph

and in particular

Graph

Proof

By Jensen's inequality for conditional expectations,

Graph

for any r ∈ ℝ. Hence, Lemma 2.3.3 implies that the first inequality holds for ρ ≔ AV@R<subs>λ</subs>. But this is enough, due to Theorem 2.4.3. The second inequality follows from the first by taking 𝒢 = {∅,Ω}.

In contrast to Proposition 2.3.1, the following theorem shows that AV@R<subs>λ</subs> is the best conservative approximation to V@R<subs>λ</subs> in the class of all law-invariant convex measures of risk which are continuous from above, given our standing assumption that (Ω, ℱ, P) is atomless. This result is due to Delbaen[20].

Theorem 2.4.4

AV@R<subs>λ</subs> is the smallest law-invariant convex measure of risk, continuous from above, that dominates V@R<subs>λ</subs>.

Proof

That AV@R<subs>λ</subs> dominates V@R<subs>λ</subs> was already stated in (26). Suppose now that ρ is another law-invariant convex risk measure which dominates V@R<subs>λ</subs> and which is continuous from above. We must show that, for a given X ∈ L<sups>∞</sups>,

Graph

Take ϵ > 0, and let A ≔ {−X ≥ V@R<subs>λ</subs>(X) − ϵ} and

Graph

Since Y > q<subs>X</subs><sups>+</sups>(λ) + ϵ ≥ E[X | A] on A<sups>c</sups>, we get P[Y < E[X | A]] = 0. On the other hand, P[Y ≤ E[X | A]] ≥ P[A] > λ, and this implies that V@R<subs>λ</subs>(Y) = E[−X | A]. Since ρ dominates V@R<subs>λ</subs>, we have ρ(Y) ≥ E[−X | A]. Thus,

Graph

by Corollary 2.4.2. Taking ϵ ↓ 0 yields

Graph

If the distribution of X is continuous, Corollary 2.3.1 states that the conditional expectation on the right equals AV@R<subs>λ</subs>(X), and we obtain (38). If the distribution of X is not continuous, we denote by D the set of all points x such that P[X = x] > 0 and take any bounded random variable Z ≥ 0 with a continuous distribution. Such a random variable exists due to our assumption that (Ω, ℱ, P) is atomless. Note that has a continuous distribution. Indeed, for any y,

Graph

Moreover, X<subs>n</subs> decreases to X. The inequality (38) holds for each X<subs>n</subs> and extends to X by continuity from above.

Corollary 2.4.3

AV@R<subs>λ</subs> and WCE<subs>λ</subs> coincide under our standing assumption that the probability space is atomless.

Proof

We know from Corollary 2.3.1 that WCE<subs>λ</subs>(X)=AV@R<subs>λ</subs>(X) if X has a continuous distribution. Repeating the approximation argument at the end of the preceding proof yields WCE<subs>λ</subs>(X)=AV@R<subs>λ</subs>(X) for each X ∈ L<sups>∞</sups>.

Since AV@R<subs>λ</subs> is coherent, continuous from below, and law-invariant, any mixture

Graph

for some probability measure μ on (0,1] has the same properties. According to Remark 2.3.2, we may set AV@R<subs>0</subs>(X) = −ess inf X so that we can extend the definition (39) to probability measures μ on the closed interval [0, 1]. However, ρ<subs>μ</subs> will only be continuous from above and not from below if μ({0}) > 0, because AV@R<subs>0</subs> is not continuous from below. Our next goal is to derive a representation of the risk measure ρ<subs>μ</subs> in terms of the Choquet integral with respect to the set function c<subs>ψ</subs>(A) ≔ ψ(P[A]), where ψ is the nonlinear function constructed in the following lemma.

Lemma 2.4.4

By defining ψ(0) = 0 and

Graph

we get a one-to-one correspondence between probability measures μ on [0, 1] and increasing concave functions ψ:[0, 1] → [0, 1] with ψ(0) = 0 and ψ(1) = 1.

Proof

Suppose first that μ is given and ψ is defined by (40). Then ψ is concave and increasing on (0,1]. Moreover,

Graph

Conversely, if ψ is given, then its right-hand derivative ψ′<subs>+</subs>(t) is a decreasing right-continuous function on (0, 1) and can be written as ψ′<subs>+</subs>(t) = ν((t,1]) for some locally finite positive measure ν on (0,1]. We first define μ on (0,1] by μ(dt) = tν(dt). Then, by Fubini's theorem,

Graph

Hence, setting μ({0}) ≔ ψ(0 +) defines a probability measure μ on [0, 1], for which (40) holds.
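When μ has finitely many atoms, the correspondence of Lemma 2.4.4 can be made explicit. The sketch below assumes that (40) prescribes the derivative of ψ at t as the integral of 1/s over (t,1] with respect to μ(ds); for μ with mass w<subs>i</subs> at points λ<subs>i</subs> > 0 and mass μ({0}) at zero, this gives ψ(t) = μ({0}) + Σ w<subs>i</subs> min(t/λ<subs>i</subs>, 1) for t > 0, together with ψ(0) = 0. The particular measure and the grid used for the checks are hypothetical.

```python
from fractions import Fraction

def make_distortion(mu):
    """Build psi from a finite measure mu given as {point in [0, 1]: weight}."""
    def psi(t):
        if t == 0:
            return Fraction(0)
        total = mu.get(Fraction(0), Fraction(0))   # jump psi(0+) = mu({0})
        for lam, weight in mu.items():
            if lam > 0:
                total += weight * min(t / lam, 1)  # contribution of the atom at lam
        return total
    return psi

# Hypothetical probability measure on [0, 1] with three atoms.
mu = {Fraction(0): Fraction(1, 10),
      Fraction(1, 20): Fraction(3, 10),
      Fraction(1, 2): Fraction(3, 5)}
psi = make_distortion(mu)

grid = [Fraction(k, 20) for k in range(21)]
is_increasing = all(psi(a) <= psi(b) for a, b in zip(grid, grid[1:]))
# Midpoint concavity on the grid, including the left endpoint 0.
is_concave = all(2 * psi((a + b) / 2) >= psi(a) + psi(b) for a in grid for b in grid)
print(psi(Fraction(0)), psi(Fraction(1, 20)), psi(Fraction(1)), is_increasing, is_concave)
```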

Theorem 2.4.5

For a probability measure μ on [0, 1], let ψ be the concave function defined in Lemma 2.4.4. Then, for X ∈ L<sups>∞</sups>,

Graph

Proof

Using the fact that V@R<subs>λ</subs>(− X) = q<subs>X</subs><sups>−</sups>(1 − λ), we get as in (37) that

Graph

Hence, we obtain the first identity. For the second one, we will first assume X ≥ 0. Then, using (29) and Fubini's theorem, we obtain

Graph

since . This proves the second identity for X ≥ 0, since ψ(0 +) = μ({0}) and ess sup X = AV@R<subs>0</subs>(− X). If X ∈ L<sups>∞</sups> is arbitrary, we consider X + C, where C ≔ −ess inf X. The cash invariance of ρ<subs>μ</subs> yields

Graph

Example 2.4.2

Clearly, the risk measure AV@R<subs>λ</subs> is itself of the form ρ<subs>μ</subs> where μ = δ<subs>λ</subs>. For λ > 0, the corresponding concave distortion function is given by

Graph

Thus, we obtain yet another representation of AV@R<subs>λ</subs>:

Graph
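The distortion representation of Example 2.4.2 can be checked on a finite model. The sketch below assumes that, for μ = δ<subs>λ</subs>, the distortion function is ψ(t) = min(t/λ, 1) and that, for X ≥ 0, the second identity of Theorem 2.4.5 reads AV@R<subs>λ</subs>(−X) = ∫<subs>0</subs><sups>∞</sups> ψ(P[X > x]) dx. The loss distribution is hypothetical, and the comparison value is minus the average of the quantile function of −X over (0, λ).

```python
from fractions import Fraction

def choquet_integral(values, probs, psi):
    """integral_0^inf psi(P[X > x]) dx for a finite distribution with values >= 0."""
    total, prev = Fraction(0), Fraction(0)
    for x in sorted(set(values)):
        # On [prev, x) the survival probability P[X > y] is constant and equals P[X >= x].
        p_tail = sum(p for v, p in zip(values, probs) if v >= x)
        total += (x - prev) * psi(p_tail)
        prev = x
    return total

def avar_quantile_integral(values, probs, lam):
    """-(1/lam) * integral_0^lam q(t) dt for the finite distribution of a position."""
    remaining, total = lam, Fraction(0)
    for v, p in sorted(zip(values, probs)):
        mass = min(p, remaining)
        total += mass * v
        remaining -= mass
        if remaining == 0:
            break
    return -total / lam

# Hypothetical nonnegative loss X; the financial position is -X.
loss_values = [0, 1, 4, 10]
loss_probs = [Fraction(1, 2), Fraction(3, 10), Fraction(3, 20), Fraction(1, 20)]
lam = Fraction(1, 10)

def psi(t):
    return min(t / lam, 1)   # assumed distortion function for mu = delta_lam

avar_choquet = choquet_integral(loss_values, loss_probs, psi)
avar_direct = avar_quantile_integral([-v for v in loss_values], loss_probs, lam)
print(avar_choquet, avar_direct)
```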

As another consequence of Theorem 2.4.5, we obtain an explicit description of the maximal representing set 𝒬<subs>μ</subs> ⊂ ℳ<subs>1</subs>(P) for the coherent risk measure ρ<subs>μ</subs>, which was first obtained by Carlier and Dana[11] in the case of a sufficiently regular distortion function ψ.

Theorem 2.4.6

Let μ be a probability measure on [0, 1], and let ψ be the corresponding concave function defined in Lemma 2.4.4. Then ρ<subs>μ</subs> can be represented as

Graph

where the set 𝒬<subs>μ</subs> is given by

Graph

Moreover, 𝒬<subs>μ</subs> is the maximal subset of ℳ<subs>1</subs>(P) that represents ρ<subs>μ</subs>.

Proof

The risk measure ρ<subs>μ</subs> is coherent and continuous from above. By Corollary 2.2.2, it can be represented by taking the supremum of expectations over the set 𝒬<subs>max</subs> = {Q ∈ ℳ<subs>1</subs>(P) | α<subs>min</subs>(Q) = 0}. It follows from (13) that Q ∈ 𝒬<subs>max</subs> if and only if E<subs>Q</subs>[−X] ≤ ρ<subs>μ</subs>(X) for all X ∈ L<sups>∞</sups>. By the second identity in Theorem 2.4.5, this condition is equivalent to Q[A] ≤ ψ(P[A]) for all A ∈ ℱ.

In order to get the second representation of 𝒬<subs>μ</subs>, we use (28) and the first identity in Theorem 2.4.5 to see that a measure Q ∈ ℳ<subs>1</subs>(P) with density ϕ = dQ/dP belongs to 𝒬<subs>max</subs> if and only if

Graph

for all X ∈ L<sups>∞</sups>. For constant random variables X ≡ t, we have q<subs>X</subs> = 𝕀<subs>[t,1]</subs> a.e., and so we obtain

Graph

for all t ∈ (0, 1). Hence 𝒬<subs>max</subs> ⊂ 𝒬<subs>μ</subs>. For the proof of the converse inclusion, we show that the density ϕ of a fixed measure Q ∈ 𝒬<subs>μ</subs> satisfies (41) for any given X ∈ L<sups>∞</sups>. To this end, let ν be the positive finite measure on [0, 1] such that q<subs>X</subs><sups>+</sups>(s) = ν([0,s]). Using Fubini's theorem and the definition of 𝒬<subs>μ</subs>, we get

Graph

which coincides with the right-hand side of (41).

2.5. Comonotonic Law-Invariant Risk Measures

In many situations, the risk ρ(X + Y) of a combined position will be strictly lower than the sum of the individual risks ρ(X) and ρ(Y) because one position serves as a hedge against adverse changes in the other position. If, on the other hand, there is no way for X to work as a hedge for Y then we may want the risk simply to add up. In order to make this idea precise, we introduce the notion of comonotonicity. Our main goal in this section is to show that a law-invariant convex risk measure ρ is comonotonic if and only if it is of the form

Graph

for some probability measure μ on [0, 1]. In other words, comonotonicity characterizes those law-invariant convex risk measures which quantify the risk of a position as the expected loss with respect to a concave distortion of the underlying probability measure P. Law-invariant comonotonic coherent risk measures were first characterized by Kusuoka[53]. The reader can find further results on comonotonic risk measures in Refs.[20][21][35].

Definition 2.5.1

Two measurable functions X and Y on (Ω, ℱ) are called comonotone if

(X(ω) − X(ω′))(Y(ω) − Y(ω′)) ≥ 0 for all (ω, ω′) ∈ Ω × Ω.

A monetary measure of risk ρ is called comonotonic if

ρ(X + Y) = ρ(X) + ρ(Y)

whenever X and Y are comonotone.

Lemma 2.5.1

If ρ is a monetary measure of risk defined on the space of bounded measurable functions and if ρ is comonotonic, then ρ is positively homogeneous.

Proof

Note that (X, X) is a comonotone pair. Hence ρ(2X) = 2ρ(X). An iteration of this argument yields ρ(rX) = rρ(X) for all rational numbers r ≥ 0. Positive homogeneity now follows from the Lipschitz continuity of ρ; see Lemma 2.1.1.
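The iteration can be spelled out as follows. This is a sketch of the omitted step; it uses only that the pairs (0, 0) and ((n − 1)X, X) are comonotone.

```latex
\begin{align*}
\rho(0) &= \rho(0+0) = 2\rho(0) \;\Longrightarrow\; \rho(0)=0,\\
\rho(nX) &= \rho\bigl((n-1)X + X\bigr) = \rho\bigl((n-1)X\bigr) + \rho(X) = \dots = n\,\rho(X),
  \qquad n \in \mathbb{N},\\
n\,\rho\Bigl(\tfrac{m}{n}X\Bigr) &= \rho(mX) = m\,\rho(X)
  \;\Longrightarrow\; \rho\Bigl(\tfrac{m}{n}X\Bigr) = \tfrac{m}{n}\,\rho(X),
  \qquad m,n \in \mathbb{N}.
\end{align*}
```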

The following lemma is taken from Denneberg[23].

Lemma 2.5.2

Two measurable functions X and Y on (Ω, ℱ) are comonotone if and only if there exists a third measurable function Z on (Ω, ℱ) and increasing functions f and g on ℝ such that X = f(Z) and Y = g(Z).

Proof

Clearly, X ≔ f(Z) and Y ≔ g(Z) are comonotone for given Z, f, and g. Conversely, suppose that X and Y are comonotone and define Z by Z ≔ X + Y. We show that z ≔ Z(ω) has a unique decomposition as z = x + y, where (x, y) = (X(ω′),Y(ω′)) for some ω′ ∈ Ω. Having established this, we can put f(z) ≔ x and g(z) ≔ y. The existence of the decomposition as z = x + y follows by taking x ≔ X(ω) and y ≔ Y(ω), so it remains to show that these are the only possible values x and y. To this end, let us suppose that X(ω) + Y(ω) = z = X(ω′) + Y(ω′) for some ω′ ∈ Ω. Then

Graph

and comonotonicity implies that this expression vanishes. Hence x = X(ω′) and y = Y(ω′). Next, we check that both f and g are increasing functions on Z(Ω). So let us suppose that

Graph

This implies

Graph

Comonotonicity thus yields that X(ω<subs>1</subs>) − X(ω<subs>2</subs>) ≤ 0 and Y(ω<subs>1</subs>) − Y(ω<subs>2</subs>) ≤ 0, whence f(z<subs>1</subs>) ≤ f(z<subs>2</subs>) and g(z<subs>1</subs>) ≤ g(z<subs>2</subs>). Thus, f and g are increasing on Z(Ω), and it is straightforward to extend them to increasing functions defined on ℝ.

The following lemma implies in particular that V@R<subs>λ</subs> and AV@R<subs>λ</subs> are comonotonic.

Lemma 2.5.3

If X and Y are comonotone random variables on (Ω, ℱ, P), then q<subs>X</subs> + q<subs>Y</subs> is a quantile function for X + Y. In particular,

Graph

Proof

By Lemma 2.5.2, X = f(Z) and Y = g(Z) for some random variable Z and increasing functions f and g. Applying Lemma 2.4.1 to the increasing function h ≔ f + g shows that h(q<subs>Z</subs>) = f(q<subs>Z</subs>) + g(q<subs>Z</subs>) is a quantile function for X + Y. Another application of the same lemma yields that q<subs>X</subs> + q<subs>Y</subs> is a quantile function for X + Y.
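The additivity of AV@R<subs>λ</subs> on comonotone pairs, and its failure when one position hedges the other, can be observed numerically. The following Python sketch assumes finitely many equally likely scenarios; the helper `avar`, the factor Z, and the functions f and g are hypothetical choices made only for illustration.

```python
def avar(x, lam):
    """AV@R_lam for equally likely outcomes: minus the average of q_X over (0, lam)."""
    xs = sorted(x)
    n = len(xs)
    integral = 0.0
    for k, value in enumerate(xs):
        # length of the part of (k/n, (k+1)/n) that lies inside (0, lam)
        overlap = min((k + 1) / n, lam) - k / n
        if overlap > 0:
            integral += value * overlap
    return -integral / lam


lam = 0.2
z = list(range(10))                 # hypothetical common factor Z, uniform on 0..9
x = [float(v) for v in z]           # X = f(Z) with f(z) = z
y = [float(v * v) for v in z]       # Y = g(Z) with g(z) = z^2, increasing on z >= 0
hedge = [-float(v) for v in z]      # H = -Z moves against X

comonotone_gap = avar([a + b for a, b in zip(x, y)], lam) - (avar(x, lam) + avar(y, lam))
hedged_gap = avar([a + b for a, b in zip(x, hedge)], lam) - (avar(x, lam) + avar(hedge, lam))
print(comonotone_gap, hedged_gap)
```

The first gap vanishes up to rounding, in line with the lemma, while the second is strictly negative because X + H is constant.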

The following theorem shows on the one hand that the risk measures AV@R<subs>λ</subs> may be viewed as the extreme points in the convex class of all comonotonic law-invariant convex risk measures on L<sups>∞</sups> that are continuous from above. This part of the result was first proved by Kusuoka[53]. The theorem also shows that every comonotonic law-invariant convex risk measure is automatically continuous from above, and this fact was first observed by Kunze[52].

Theorem 2.5.1

On an atomless probability space, the class of risk measures

Graph

is precisely the class of all law-invariant convex risk measures on L<sups>∞</sups> that are comonotonic. In particular, any convex measure of risk that is law-invariant and comonotonic is also coherent and continuous from above.

Proof

Comonotonic additivity of ρ<subs>μ</subs> follows from Lemma 2.5.3 and the first representation in Theorem 2.4.5.

Now assume that ρ is a law-invariant convex measure of risk that is also comonotonic. Then ρ is a coherent risk measure by Lemma 2.5.1 and hence subadditive. Consider the set function A ↦ ρ(− 𝕀<subs>A</subs>). Since ρ is law-invariant and coherent, there exists an increasing function ψ on [0, 1] such that ψ(0) = 0, ψ(1) = 1, and ρ(− 𝕀<subs>A</subs>) = ψ(P[A]) =: c<subs>ψ</subs>(A). Note that 𝕀<subs>A∪B</subs> and 𝕀<subs>A∩B</subs> are a pair of comonotone functions for all A, B ∈ ℱ. Hence, comonotonicity and subadditivity of ρ imply

Graph

To verify the concavity of ψ, we shall show that ψ(y) ≥ (ψ(x) + ψ(z))/2 whenever 0 ≤ x ≤ z ≤ 1 and y = (x + z)/2. To this end, we will construct two sets A, B ∈ ℱ such that P[A] = P[B] = y, P[A ∩ B] = x, and P[A ∪ B] = z. We then get ψ(x) + ψ(z) ≤ 2ψ(y) from (43) and in turn the concavity of ψ. In order to construct the two sets A and B, take a random variable U with a uniform distribution on [0, 1], which exists due to our assumption that (Ω, ℱ, P) is atomless. Then

Graph

are as desired.

Theorem 2.4.5 shows that the Choquet integral with respect to c<subs>ψ</subs> can be identified with a risk measure ρ<subs>μ</subs>, where μ is obtained from ψ via Lemma 2.4.4. Let us now show that ρ and ρ<subs>μ</subs> coincide on simple random variables of the form

Graph

Since these random variables are dense in L<sups>∞</sups>, Lemma 2.1.1 will then imply that ρ = ρ<subs>μ</subs>. In order to show that ρ<subs>μ</subs>(X) = ρ(X) for X as above, we may assume without loss of generality that a<subs>1</subs> ≥ a<subs>2</subs> ≥ ··· ≥ a<subs>n</subs> and that the sets A<subs>i</subs> are disjoint. Thus, we can write , where b<subs>i</subs> ≔ a<subs>i</subs> − a<subs>i+1</subs>, a<subs>n+1</subs> ≔ 0, and . Since and b<subs>k</subs>𝕀<subs>Bk</subs> are comonotone and ρ(− 𝕀<subs>A</subs>) = c<subs>ψ</subs>(A) = ρ<subs>μ</subs>(− 𝕀<subs>A</subs>),

Graph

Remark 2.5.1

Let ψ:[0, 1] → [0, 1] be an increasing function with ψ(0) = 0 and ψ(1) = 1. The preceding proof shows that the concavity of ψ is equivalent to the fact that the set function c<subs>ψ</subs>(A) ≔ ψ(P[A]) is submodular or 2-alternating in the sense of Choquet:

Graph

This property of submodularity will play an important role in Sections 3.3 and 3.4.
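The submodularity inequality c<subs>ψ</subs>(A ∪ B) + c<subs>ψ</subs>(A ∩ B) ≤ c<subs>ψ</subs>(A) + c<subs>ψ</subs>(B) can be checked by brute force on a small finite space. The sketch below assumes n equally likely points, so that c<subs>ψ</subs>(A) = ψ(|A|/n); such a space is not atomless, hence it only illustrates the direction "ψ concave implies c<subs>ψ</subs> submodular" and shows a non-concave counterexample. The function name and the two distortions are illustrative choices.

```python
from itertools import combinations


def is_submodular(psi, n=4, tol=1e-12):
    """Check c(A | B) + c(A & B) <= c(A) + c(B) for c(A) = psi(|A|/n) on n equally likely points."""
    points = range(n)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(points, r)]
    c = lambda a: psi(len(a) / n)
    return all(
        c(a | b) + c(a & b) <= c(a) + c(b) + tol
        for a in subsets
        for b in subsets
    )


concave_ok = is_submodular(lambda t: t ** 0.5)   # concave distortion
convex_ok = is_submodular(lambda t: t ** 2)      # convex distortion, not concave
print(concave_ok, convex_ok)
```

For ψ(t) = t², two disjoint singletons already violate the inequality, which is why the second check fails.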

2.6. Risk Measures Arising from Robust Preferences

In this section, we will see how risk measures arise in a natural way from numerical representations of the preferences of an investor. As a motivation, let us first consider the following simple thought experiment. Suppose the investor is offered a bet β<subs>1</subs> that pays off +1000€ or −1000€, both with known probability p = 0.5. The alternative would be to reject the offered bet, and this could be regarded as accepting the "bet" β<subs>2</subs> with the certain payoff 0€, which is also identical to the expected payoff of the risky bet β<subs>1</subs>. An investor who is risk-averse will thus prefer β<subs>2</subs> over the risky bet β<subs>1</subs>. Now consider a third bet β<subs>3</subs> that also yields either +1000€ or −1000€, but this time we assume that no information on the success probability is provided. That is, the investor is facing model uncertainty or ambiguity, which is sometimes also called Knightian uncertainty. Although the possible payoffs of β<subs>3</subs> and β<subs>1</subs> are identical, it is reasonable to assign some value to the information on the success probability given for β<subs>1</subs>. Hence, β<subs>1</subs> should be preferred over β<subs>3</subs>. That is, the underlying decision rule should exhibit a feature one might call uncertainty aversion. In this section, our aim is to outline a corresponding theory of choice that was developed by Schmeidler[65] and Gilboa and Schmeidler[40]. In particular, we wish to highlight its connections to coherent risk measures.

The general aim of a theory of choice is to give an axiomatic foundation and corresponding mathematical representation theory for a normative decision rule by means of which one can reach decisions when presented with several alternatives. Our starting point is the classical theory of expected utility as developed by John von Neumann and Oskar Morgenstern; see, e.g., Kreps[51] or Refs.[34][35], Chapter 2 for introductions. It deals with monetary bets whose outcome probabilities are known. Such a bet can be regarded as a Borel probability measure μ on ℝ. More precisely, we will consider here the space

Graph

of boundedly supported Borel probability measures. The decision rule is usually taken as a preference relation or preference order ≻ on ℳ<subs>b</subs>, i.e., ≻ is a binary relation on ℳ<subs>b</subs> that is asymmetric

μ ≻ ν ⇒ ν ⊁ μ

and negatively transitive

μ ≻ ν and λ ∈ ℳ<subs>b</subs> ⇒ μ ≻ λ or λ ≻ ν;

see, e.g., Ref.[51] or Refs.[34][35], Chapter 2 for details. The corresponding weak preference order μ ⪰ ν is defined via μ ⪰ ν ⇔ ν ⊁ μ. If both μ ⪰ ν and ν ⪰ μ hold, we will write μ ∼ ν. Dealing with a preference order is greatly facilitated if one has a numerical representation, namely a function R:ℳ<subs>b</subs> → ℝ such that

μ ≻ ν ⇔ R(μ) > R(ν).

John von Neumann and Oskar Morgenstern formulated a set of axioms that are necessary and sufficient for the existence of a numerical representation R of von Neumann–Morgenstern form, that is,

R(μ) = ∫ U(x) μ(dx)   (44)

for a function U:ℝ → ℝ. The two main axioms are:

the Archimedean axiom: for any triple μ ≻ λ ≻ ν there are α, β ∈ (0, 1) such that αμ + (1 − α)ν ≻ λ ≻ βμ + (1 − β)ν;

the independence axiom: for all μ, ν ∈ ℳ<subs>b</subs>, the relation μ ≻ ν implies αμ + (1 − α)λ ≻ αν + (1 − α)λ for all λ ∈ ℳ<subs>b</subs> and all α ∈ (0,1].

These two axioms are equivalent to the existence of an affine numerical representation R. To obtain an integral representation (44) for this affine functional on ℳ<subs>b</subs> one needs some additional regularity condition such as topological assumptions on the level sets of ≻; see Ref.[51] and Refs.[34][35], Chapter 2.

The preference order is called monotone if

Graph

Clearly, monotonicity holds if and only if the function U in (44) is strictly increasing. One says that the preference order exhibits risk aversion if the certain amount m(μ) ≔ ∫ x μ(dx) is preferred over the risky lottery μ, i.e.,

Graph

Risk aversion is equivalent to the strict concavity of U, and if U is both increasing and strictly concave, it is called a utility function.

Now we wish to extend this setting to the case in which the probabilities of outcomes may be subject to uncertainty. This is achieved by randomizing lotteries via an exterior probability space (Ω, ℱ). More precisely, we will consider a space defined as the set of all stochastic kernels from (Ω, ℱ) to ℝ for which there exists a constant c ≥ 0 such that

Graph

In mathematical economics, the elements of are sometimes called horse race lotteries. The space of standard lotteries ℳ<subs>b</subs> has a natural embedding into by identifying μ ∈ ℳ<subs>b</subs> with the constant map for all ω.

Now consider a given preference order ≻ on . We will assume that ≻ is compatible with the embedding in the sense that

Graph

We will furthermore assume the following extension of the Archimedean axiom of classical von Neumann–Morgenstern theory.

Archimedean axiom: if are such that , then there are α,β ∈ (0, 1) with

Graph

We also need an extended version of the independence axiom:

certainty independence: for , and α ∈ (0,1] we have

Graph

These two axioms immediately imply that the restriction of ≻ to ℳ<subs>b</subs> satisfies both the classical Archimedean axiom and the independence axiom. Hence, it admits an affine numerical representation R:ℳ<subs>b</subs> → ℝ, and to simplify things, we will assume henceforth that R is of von Neumann–Morgenstern form (44) for some function U:ℝ → ℝ.

The main new axiom is:

uncertainty aversion: if are such that , then

Graph

In order to motivate the term "uncertainty aversion", consider the following simple example. For Ω ≔ {0,1} define

Graph

Suppose that an agent is indifferent between the choices and , which both involve the same kind of uncertainty. In the case of uncertainty aversion, the convex combination is weakly preferred to both and . It takes the form

Graph

This convex combination now allows for upper and lower probability bounds in terms of α, and this means that model uncertainty is reduced in favor of risk. For α = 1/2, the resulting lottery is independent of the scenario ω, i.e., model uncertainty is completely eliminated.

Theorem 2.6.1

Under the above conditions, there exists a unique extension of R to a numerical representation , and is of the form

Graph

for a convex set 𝒬 ⊂ ℳ<subs>1,f</subs>(Ω, ℱ).

A proof of this theorem can be found in Ref.[40] or in Refs.[34][35], Section 2.5.

Remark 2.6.1

Let us comment on the axiom of certainty independence. It extends the independence axiom for preferences on ℳ<subs>b</subs> to our present setting, but only under the restriction that one of the two contingent lotteries and is certain, i.e., does not depend on the scenario ω ∈ Ω. Without this restriction, the extended independence axiom would lead to a so-called Savage representation

Graph

in terms of a subjective measure Q. But there are good reasons for not requiring full independence for all . As an example, take Ω = {0,1} and define , , and . An agent may prefer over , thus expressing the implicit view that scenario 1 is somewhat more likely than scenario 0. At the same time, the agent may like the idea of hedging against the occurrence of scenario 0, and this could mean that the certain lottery

Graph

is preferred over the contingent lottery

Graph

thus violating the independence assumption in its unrestricted form. In general, the role of as a hedge against scenarios unfavorable for requires that and are not comonotone, where comonotonicity means:

Graph

Thus, the wish to hedge would still be compatible with the following stronger version of certainty independence, called

comonotonic independence: For , , and α ∈ (0,1]

Graph

whenever and are comonotone.

This stronger requirement holds iff the set 𝒬 in the preceding theorem is such that γ(A) ≔ sup<subs>Q∈𝒬</subs> Q[A] is submodular:

Graph

see Schmeidler[65], p. 582 or Ref.[35], Section 4.7. Compare also with Remark 2.5.1.

The space 𝒳 of all bounded measurable functions on (Ω, ℱ) can be embedded into by virtue of the mapping

Graph

In this way, 𝒳 can be identified with the set of all uncertain payoffs. The preceding theorem implies that the restriction of ≻ to 𝒳 admits the numerical representation

Graph

and it is this representation in which we are really interested. Note, however, that it is necessary to formulate the axiom of uncertainty aversion on the larger space of uncertain lotteries. But even without its axiomatic foundation, such a representation of preferences in the face of model uncertainty by a subjective utility assessment R<subs>𝒳</subs>(X) is highly plausible as it stands. It may be viewed as a robust approach to the problem of model uncertainty: The agent has in mind a whole collection of possible probabilistic views of the given set of scenarios and takes a worst-case approach in evaluating the payoff of a given financial position.
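The worst-case evaluation can be made concrete with the three bets from the beginning of this section. The following Python sketch computes the infimum of expected utilities over a finite family of priors on two scenarios. The exponential utility, the interval [0.3, 0.7] of success probabilities, and the function name are assumptions introduced only for illustration; since the expectation is linear in Q, the minimum over the convex hull of the listed priors is attained on the list itself.

```python
import math


def robust_utility(payoff, priors, utility):
    """Worst-case expected utility: min over Q in priors of E_Q[U(X)] on a finite scenario set."""
    return min(
        sum(q_w * utility(x_w) for q_w, x_w in zip(q, payoff))
        for q in priors
    )


utility = lambda x: 1.0 - math.exp(-x / 1000.0)   # hypothetical increasing concave utility
bet = [1000.0, -1000.0]                            # payoff in the scenarios "win" and "lose"

known = [(0.5, 0.5)]                               # beta_1: success probability is known
ambiguous = [(p, 1.0 - p) for p in (0.3, 0.4, 0.5, 0.6, 0.7)]  # beta_3: assumed set of priors

r_beta1 = robust_utility(bet, known, utility)
r_beta2 = utility(0.0)                             # beta_2: certain payoff 0
r_beta3 = robust_utility(bet, ambiguous, utility)
print(r_beta2, r_beta1, r_beta3)
```

With these choices the values are ordered as β<subs>2</subs> over β<subs>1</subs> over β<subs>3</subs>, reflecting risk aversion in the first comparison and uncertainty aversion in the second.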

Let us now emphasize the downside rather than the upside by switching from a utility functional R<subs>𝒳</subs> to the associated loss functional L ≔ −R<subs>𝒳</subs>, and let us assume that U is a utility function. Then our representation takes the form

Graph

where ℓ denotes the convex increasing loss function on ℝ defined by ℓ(x) = −U(− x). Now suppose that an agent finds a position X acceptable if L(X) does not exceed a given bound x<subs>0</subs>. How do we determine the amount of capital which is needed to turn a given position X ∈ 𝒳 into an acceptable position by adding this amount? In order to answer this question, consider the convex acceptance set

Graph

where x<subs>0</subs> is an interior point in the range of ℓ. Recalling Proposition 2.1.1, we see that 𝒜 induces a convex measure of risk ρ<subs>L</subs>. Applying the results of the preceding section, we can conclude that ρ<subs>L</subs> admits a representation of the form

Graph

Thus, the problem is reduced to the computation of a suitable penalty function α<subs>L</subs>. To this end, let us introduce the Fenchel–Legendre transform ℓ* of the loss function ℓ defined by

Graph

By combining the formula stated in Example 2.4.1 with Proposition 2.2.1, we get the following result.

Theorem 2.6.2

Suppose 𝒬 is a set of equivalent probability measures. Then the convex risk measure corresponding to the acceptance set 𝒜 can be represented in terms of the penalty function

Graph

Thus, α<subs>L</subs>(P) < ∞ only if P ≪ Q for at least some Q ∈ 𝒬.

Example 2.6.1

For the exponential loss function ℓ(x) = e<sups>x</sups> and x<subs>0</subs> = 1, the penalty function in Theorem 2.6.2 takes the form

Graph

where

Graph

denotes the relative entropy of P with respect to Q; see Ref.[35].
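For a finite scenario set, the relative entropy and the resulting penalized expected loss are easy to evaluate. The sketch below assumes a single reference measure Q, in which case the risk measure induced by ℓ(x) = e<sups>x</sups> and x<subs>0</subs> = 1 is the entropic risk measure log E<subs>Q</subs>[e<sups>−X</sups>]. It checks that E<subs>P</subs>[−X] − H(P|Q) attains this value at the measure with density proportional to e<sups>−X</sups> and stays below it for another choice of P. All numbers and function names are hypothetical.

```python
import math


def relative_entropy(p, q):
    """H(P|Q) = sum_i p_i * log(p_i / q_i), with the convention 0 * log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def penalized_loss(x, p, q):
    """E_P[-X] - H(P|Q), the quantity maximized over P in the dual representation."""
    return sum(-xi * pi for xi, pi in zip(x, p)) - relative_entropy(p, q)


q = [0.2, 0.5, 0.3]          # hypothetical reference measure Q
x = [-1.0, 0.5, 2.0]         # hypothetical position X

entropic_risk = math.log(sum(qi * math.exp(-xi) for xi, qi in zip(x, q)))

# Candidate maximizer: density proportional to exp(-X) with respect to Q.
weights = [qi * math.exp(-xi) for xi, qi in zip(x, q)]
p_star = [w / sum(weights) for w in weights]

other = [1 / 3, 1 / 3, 1 / 3]
print(entropic_risk, penalized_loss(x, p_star, q), penalized_loss(x, other, q))
```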

3. RISK MEASURES AND ROBUST OPTIMIZATION IN FINANCIAL MARKETS

In this section, we will extend our setting by assuming that financial positions arise in a multivariate market model. In such a model, it may be possible to eliminate risk by using appropriate hedging strategies, and this idea should be considered when measuring the risk of positions. In Section 3.1, we show how (super-)hedgeability can be used as a criterion for acceptability of positions, and we identify the corresponding risk measures. We also discuss how hedging can be used in order to relax a given acceptability criterion. In Section 3.2, we address the problem of risk-minimal hedging, when the risk criterion is defined in terms of a risk measure. In Section 3.3, we consider an optimal investment problem under uncertainty aversion. Thus, we are back in the setting of Section 2.6, and it is more natural to consider utility rather than risk, i.e., we are looking for investment strategies that maximize a robust utility functional. In particular, we give a general criterion under which the robust problem can be reduced to a standard one. In Section 3.4, we will then discuss a number of examples.

3.1. Measures of Risk in a Financial Market with Convex Constraints

In this section, we will assume that positions are, at least to some extent, contingent on a financial market, and we will show how one can combine a given risk measure with the idea of risk reduction by hedging. Moreover, we will see how various arbitrage valuation principles correspond to certain convex risk measures. Thus, the concept of a risk measure can unify and combine many common static and dynamic approaches to risk.

The most popular market models in mathematical finance are based on time-continuous price processes. However, insofar as arbitrage theory is concerned, these models exhibit certain pathologies that stem from the idealization of trading in continuous time. For instance, it is well known that even the standard Black–Scholes model admits arbitrage opportunities unless one excludes certain trading strategies such as doubling strategies embedded into a finite time interval. Therefore, we will restrict ourselves to discrete-time market models, for which the arbitrage theory is much simpler and easier to handle. These market models are typically incomplete, i.e., they involve intrinsic risks which cannot be hedged away completely. Hence, the need for combining a static risk measure with dynamic hedging arises in a natural manner. This section is based on joint work of Hans Föllmer and Schied[33].

We consider a filtered probability space (Ω, ℱ,(ℱ<subs>t</subs>)<subs>t=0,...,T</subs>,P) and a market where one bond and d risky assets are traded. The price of the bond will be assumed to be normalized to 1, and the (correspondingly discounted) price process of the risky assets is denoted by . We will assume that

Graph

Any d-dimensional predictable process ξ gives rise to a self-financing trading strategy; ξ<subs>t</subs><sups>i</sups> is the number of shares held of the ith asset during the trading period t − 1 ↝ t, and

Graph

is the associated value process for an initial investment V<subs>0</subs>. Here, · is the Euclidean scalar product. Recall that an arbitrage opportunity is a self-financing trading strategy such that V<subs>T</subs> ≥ V<subs>0</subs> and P[V<subs>T</subs> > V<subs>0</subs>] > 0. The existence of arbitrage opportunities can be regarded as a market inefficiency, and one usually insists on arbitrage-free market models. Due to the fundamental theorem of asset pricing, the market model does not admit arbitrage opportunities if and only if there exists a measure P* ∼ P under which S is a martingale; see, e.g., Ref.[34], Theorem 5.17 or Ref.[35], Theorem 5.17. The set of all these equivalent martingale measures will be denoted by 𝒫.
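The value process of a self-financing strategy can be computed along a single price path as the initial investment plus the accumulated gains ξ<subs>k</subs> · (S<subs>k</subs> − S<subs>k−1</subs>). The following Python sketch assumes this standard form of the value process; the function name, the price path, and the holdings are hypothetical, and predictability of ξ cannot be expressed on a single path.

```python
def value_process(v0, strategy, prices):
    """V_t = V_0 + sum_{k<=t} xi_k . (S_k - S_{k-1}) along one price path.

    prices[t] is the vector S_t (t = 0..T); strategy[t-1] is xi_t, the holdings over (t-1, t].
    """
    values = [v0]
    for t in range(1, len(prices)):
        gain = sum(
            xi * (s_now - s_prev)
            for xi, s_now, s_prev in zip(strategy[t - 1], prices[t], prices[t - 1])
        )
        values.append(values[-1] + gain)
    return values


# Hypothetical path of d = 2 discounted asset prices over T = 3 periods.
prices = [(100.0, 50.0), (110.0, 45.0), (105.0, 48.0), (120.0, 40.0)]
strategy = [(1.0, 2.0), (0.0, 1.0), (2.0, 0.0)]   # xi_1, xi_2, xi_3

print(value_process(10.0, strategy, prices))
```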

Now consider a financial position X ∈ L<sups>∞</sups>(P). X can be interpreted as "riskless" if X ≥ 0 or, more generally, if the "risky part" of X can be hedged at no additional cost. The latter means that we can find a suitable hedging portfolio ξ such that

Graph

Due to the fact that X is bounded, (49) is only possible if ξ is admissible in the sense that there is a constant c = c(ξ) such that the associated gains process satisfies

Graph

Thus, we define the following set of acceptable positions in L<sups>∞</sups>:

Graph

Theorem 3.1.1

Suppose that inf{m ∈ ℝ | m ∈ 𝒜<subs>0</subs>} > −∞. Then ρ<subs>0</subs> ≔ ρ<subs>𝒜0</subs> is a coherent measure of risk. Moreover, ρ<subs>0</subs> is sensitive in the sense of Definition 2.2.1 if and only if the market model is arbitrage-free, i.e., G<subs>T</subs>(ξ) ≥ 0 P-a.s. implies G<subs>T</subs>(ξ) = 0 P-almost surely. In this case, ρ<subs>0</subs> is continuous from above and can be represented in terms of the set 𝒫 of equivalent martingale measures for the price process S:

Graph

Proof

The fact that ρ<subs>0</subs> is a coherent measure of risk follows from Proposition 2.1.2. If the model is arbitrage-free, then the superhedging duality theorem (see, e.g., Ref.[35], Corollary 7.9) yields the representation (51), and it follows that ρ<subs>0</subs> is sensitive and continuous from above.

Conversely, suppose that ρ<subs>0</subs> is sensitive, but the market model admits an arbitrage opportunity ξ. Then there is some k > 0 such that 0 ≤ G<subs>T</subs>(ξ) P-a.s. and P[G<subs>T</subs>(ξ) ∧ k > 0] > 0. It follows that X ≔ −(G<subs>T</subs>(ξ) ∧ k) ∈ 𝒜<subs>0</subs>. However, the sensitivity of ρ<subs>0</subs> implies that

Graph

Thus, we arrive at a contradiction.
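The superhedging duality behind the representation (51) can be observed in a toy example. The Python sketch below assumes a one-period model with a single risky asset and three scenarios, so that the equivalent martingale measures form the one-parameter family (p, 1 − 2p, p) with 0 < p < 1/2. It compares the smallest capital that superhedges the liability −X with the supremum of E*[−X] over this family. The model, the claim, the grids, and the function names are illustrative assumptions. Restricting the hedge ratio to [0, 1] loses nothing here, because the worst-case cost is convex in ξ with its minimum inside that interval.

```python
def superhedging_price(claim, increments, grid):
    """min over xi in grid of max over scenarios of (claim - xi * increment)."""
    return min(
        max(c - xi * d for c, d in zip(claim, increments))
        for xi in grid
    )


def sup_over_martingale_measures(claim, steps=1000):
    """sup of E*[claim] over the measures (p, 1 - 2p, p), 0 < p < 1/2, of the toy model."""
    best = float("-inf")
    for i in range(1, steps):
        p = 0.5 * i / steps
        weights = (p, 1.0 - 2.0 * p, p)
        best = max(best, sum(w * c for w, c in zip(weights, claim)))
    return best


# Hypothetical one-period model: S_0 = 100 and S_1 in {80, 100, 120}.
increments = [-20.0, 0.0, 20.0]                 # S_1 - S_0 per scenario
claim = [0.0, 0.0, 20.0]                        # -X: a call with strike 100
grid = [i / 100 for i in range(0, 101)]         # hedge ratios xi in [0, 1]

price = superhedging_price(claim, increments, grid)
dual = sup_over_martingale_measures(claim)
print(price, dual)
```

The supremum is approached as p tends to 1/2 but is not attained by an equivalent measure, which is why the grid value stays slightly below the superhedging price.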

There are several reasons why it may make sense to allow in (49) only strategies ξ that belong to a proper subset 𝒮 of the class of all strategies. For instance, if the resources available to an investor are limited, only those strategies should be considered for which the initial investment in risky assets is below a certain amount. Such a restriction corresponds to an upper bound on V<subs>0</subs>. There may be other constraints. For instance, short sales constraints are lower bounds on the number of shares in the portfolio. In view of market illiquidity, the investor may also wish to avoid holding too many shares of one single asset, since the market capacity may not suffice to resell the shares. Such constraints will be taken into account by assuming throughout the remainder of this section that 𝒮 has the following properties:

a. 0 ∈ 𝒮.

b. 𝒮 is predictably convex: If ξ, η ∈ 𝒮, and h is a predictable process with 0 ≤ h ≤ 1, then the process

Graph

belongs to 𝒮.

c. For each t ∈ {1,...,T}, the set

Graph

is closed in L<sups>0</sups>(Ω, ℱ<subs>t−1</subs>,P;ℝ<sups>d</sups>).

d. Each ξ ∈ 𝒮 is admissible.

Moreover, we will assume that the price increments satisfy the following non-redundancy condition: For all t ∈ {1,...,T} and ξ<subs>t</subs> ∈ 𝒮<subs>t</subs>,

Graph

In a first step, we define the non-empty set

Graph

of acceptable positions which can be hedged with strategies in 𝒮 at no cost. By Proposition 2.1.2, 𝒜<sups>𝒮</sups> induces the convex measure of risk

Graph

provided that

Graph

Note that (55) holds, in particular, if 𝒮 does not contain arbitrage opportunities. We will assume (55) throughout this section.

The following problems arise:

Compute the minimal penalty function of ρ<sups>𝒮</sups>.

Give criteria under which ρ<sups>𝒮</sups> is continuous from above.

Let us consider the first problem.

Proposition 3.1.1

The minimal penalty function of ρ<sups>𝒮</sups> is given by

Graph

Here A<sups>Q</sups> is the predictable increasing process defined by

Graph

In particular, ρ<sups>𝒮</sups> has the representation

Graph

if ρ<sups>𝒮</sups> is continuous from above.

Proof

First note that E<subs>Q</subs>[ξ<subs>t</subs> · (S<subs>t</subs> − S<subs>t−1</subs>) | ℱ<subs>t−1</subs>] is well-defined and satisfies

Graph

for every ξ ∈ 𝒮. To see this, observe first that, by predictable convexity, also ξ<sups>(t)</sups> ∈ 𝒮, where

Graph

By assumption every element in 𝒮 is admissible in the sense of (50), and thus there is some constant c with ξ<subs>t</subs> · (S<subs>t</subs> − S<subs>t−1</subs>) = G<subs>T</subs>(ξ<sups>(t)</sups>) ≥ −c P-a.s. Using our assumption (48) that prices are non-negative, we see that the conditional expectation is well defined, and we obtain (59) by adding the ℱ<subs>t−1</subs>-measurable term ξ<subs>t</subs> · S<subs>t−1</subs> to both sides of the equation.

Next, if X ∈ 𝒜<sups>𝒮</sups> there exists ξ ∈ 𝒮 with −X ≤ G(ξ) P-a.s. By using (59), we obtain that for Q ≪ P

Graph

Hence, we conclude that

Graph

Now we turn to the proof of the converse inequality. To this end, we show first that

Graph

is directed upwards in the sense that for ψ<subs>1</subs>,ψ<subs>2</subs> ∈ Ψ there is ψ<subs>3</subs> ∈ Ψ with ψ<subs>3</subs> ≥ ψ<subs>1</subs>∨ψ<subs>2</subs>. For let

Graph

and define ξ′ ∈ 𝒮 by

Graph

Then clearly

Graph

and therefore Ψ is directed upwards. It follows that is the limit of an increasing sequence in Ψ; see, e.g., Ref.[35], Appendix A.5. Hence, by monotone convergence,

Graph

Admissibility yields that −(G<subs>T</subs>(ξ) ∧ k) ∈ 𝒜<sups>𝒮</sups> ⊆ 𝒜<subs>ρ𝒮</subs>, and thus

Graph

This concludes the proof.

Let us now turn to the second problem above. That is, we are looking for criteria that guarantee that ρ<sups>𝒮</sups> is continuous from above. It will turn out that the absence of arbitrage opportunities in 𝒮 is a sufficient condition under a certain regularity requirement. It can in turn be characterized by the following class of local supermartingale measures.

Definition 3.1.1

By 𝒫<subs>𝒮</subs> we denote the class of all probability measures such that

Graph

and such that the gains process of any trading strategy in 𝒮 is a local -supermartingale.

The following result follows from Theorem 9.29 of Ref.[35].

Theorem 3.1.2

If 𝒫<subs>𝒮</subs> is nonempty, then ρ<sups>𝒮</sups> is continuous from above and admits the representations

Graph

where 𝒬<subs>𝒮</subs> denotes the set of all Q ∼ P such that and such that E<subs>Q</subs>[|X<subs>t+1</subs> − X<subs>t</subs>| | ℱ<subs>t</subs>] < ∞ P-a.s. for all t.

We now give a sufficient condition on 𝒮 to guarantee that 𝒫<subs>𝒮</subs> is nonempty. To this end, denote the cone generated by 𝒮 by , and let ℛ<subs>t</subs> be its L<sups>0</sups>-closure.

Theorem 3.1.3

In addition to the above assumptions, suppose that . Then the following conditions are equivalent:

a. ρ<sups>𝒮</sups> is sensitive.

b. 𝒮 contains no arbitrage opportunities, i.e., for ξ ∈ 𝒮, G<subs>T</subs>(ξ) ≥ 0 P-a.s. implies G<subs>T</subs>(ξ) = 0 P-a.s.

c. 𝒫<subs>𝒮</subs> ≠ ∅.

Proof

(a) ⇒ (b) Suppose that ρ<sups>𝒮</sups> is sensitive, but 𝒮 contains an arbitrage opportunity ξ. Then there exists k > 0 such that 0 ≤ G<subs>T</subs>(ξ) P-a.s. and P[G<subs>T</subs>(ξ) ∧ k > 0] > 0. Then X ≔ −(G<subs>T</subs>(ξ) ∧ k) is bounded and satisfies X + G(ξ) ≥ 0, i.e., X ∈ 𝒜<sups>𝒮</sups>. It follows that ρ<sups>𝒮</sups>(X) ≤ 0. But this contradicts the sensitivity of ρ<sups>𝒮</sups> and the facts that 0 ≥ X and P[X < 0] > 0.

(b) ⇒ (c) By standard arguments, the existence of arbitrage opportunities in 𝒮 is equivalent to the existence of some t and ξ<subs>t</subs> ∈ 𝒮<subs>t</subs> ∩ (L<sups>∞</sups>)<sups>d</sups> such that ξ<subs>t</subs> · (S<subs>t+1</subs> − S<subs>t</subs>) ≥ 0 P-a.s. and P[ξ<subs>t</subs> · (S<subs>t+1</subs> − S<subs>t</subs>) > 0] > 0; see Lemma 9.11 in Ref.[35]. The condition hence guarantees that we may replace 𝒮<subs>t</subs> by ℛ<subs>t</subs>. Now we can apply Theorem 9.9 of Ref.[35]; the condition is missing in the statements of Theorem 9.9 and Lemma 9.13 of Ref.[35], as was kindly pointed out to us by Konstantinos Kardaras and Sven Lickfeld.

(c) ⇒ (a) Due to (c) and (62), we have , and this implies the sensitivity of ρ<sups>𝒮</sups>.

Example 3.1.1

Let C<subs>t</subs> be a closed convex subset of ℝ<sups>d</sups> and suppose that 0 ∈ C<subs>t</subs>. Take &#55349;&#56494; as the class of all predictable processes ξ such that ξ<subs>t</subs> ∈ C<subs>t</subs> P-a.s. for all t. Then &#55349;&#56494; satisfies our conditions (a) through (d). If, in addition, the cones generated by the convex sets C<subs>t</subs> are closed in ℝ<sups>d</sups>, then the condition is also satisfied. This case includes short sales constraints and constraints on the size of a long position. These types of constraints are modeled by taking for certain numbers such that .

Example 3.1.2

Let &#55349;&#56494; denote the set of all predictable ξ such that a ≤ ξ<subs>t</subs> · S<subs>t−1</subs> ≤ b for certain numbers a, b such that −∞ ≤ a < 0 < b ≤ ∞. This class &#55349;&#56494; corresponds to constraints on the capital invested into risky assets. It satisfies conditions (a) through (d). We claim that &#55349;&#56494; does not contain arbitrage opportunities if and only if the unconstrained market is arbitrage-free, so that we have &#55349;&#56491;<subs>&#55349;&#56494;</subs> = &#55349;&#56491;. To prove this, note that the existence of an arbitrage opportunity in the unconstrained market is equivalent to the existence of some t and some ℱ<subs>t−1</subs>-measurable ξ<subs>t</subs> such that ξ<subs>t</subs> · (S<subs>t+1</subs> − S<subs>t</subs>) ≥ 0 P-a.s. and P[ξ<subs>t</subs> · (S<subs>t+1</subs> − S<subs>t</subs>) > 0] > 0 (see Proposition 5.11 in Ref.[35]). Next, there exists a constant c > 0 such that these properties are shared by <subs>t</subs> ≔ ξ<subs>t</subs>&#55349;&#56640;<subs>{|ξt · St−1|≤c}</subs> and in turn by ϵ<subs>t</subs>, where ϵ > 0. But ϵ<subs>t</subs> ∈ &#55349;&#56494;<subs>t</subs> if ϵ is small enough.

Example 3.1.3

Suppose that &#55349;&#56494; consists of all bounded predictable processes with non-negative components, and that &#55349;&#56491;<subs>&#55349;&#56494;</subs> ≠ ∅. Then

Graph

To see this, note first that can only take the values 0 and +∞, due to the fact that &#55349;&#56494; is a cone. Hence, for all Q ∈ &#55349;&#56492;<subs>&#55349;&#56494;</subs>. Moreover, G ≔ G(ξ) is a local Q -supermartingale for Q ∈ &#55349;&#56492;<subs>&#55349;&#56494;</subs> and ξ ∈ &#55349;&#56494;. To prove this, denote by τ<subs>n</subs>(ω) the first time t at which

Graph

If such a t does not exist, let τ<subs>n</subs>(ω) ≔ T. Then τ<subs>n</subs> is a stopping time. Since

Graph

G<subs>τn∧t</subs> belongs to ℒ<sups>1</sups>(Q), and

Graph

This proves that G<subs>τn</subs> is a Q -supermartingale, i.e., Q ∈ &#55349;&#56491;<subs>&#55349;&#56494;</subs>.
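In a one-period model with a single risky asset and short sales constraints (ξ ≥ 0), the equivalence of (b) and (c) in Theorem 3.1.3 can be checked by hand. The following Python sketch is only an illustration under assumptions that are not part of the text: the sample space is finite, P charges every state, and membership of Q in &#55349;&#56491;<subs>&#55349;&#56494;</subs> is taken to mean that the gains are a Q-supermartingale, which here reduces to E<subs>Q</subs>[S<subs>1</subs> − S<subs>0</subs>] ≤ 0. The function names are hypothetical.

```python
from fractions import Fraction as F

def has_long_only_arbitrage(dS):
    """True if some xi >= 0 yields xi * dS >= 0 in all states and > 0 in one."""
    # With a single asset, xi * dS has the sign pattern of dS for every xi > 0.
    return all(d >= 0 for d in dS) and any(d > 0 for d in dS)

def supermartingale_weights(dS):
    """Strictly positive probabilities Q with E_Q[dS] <= 0, or None if none exist."""
    neg = -sum((F(d) for d in dS if d < 0), F(0))   # total size of the losses
    pos = sum((F(d) for d in dS if d > 0), F(0))    # total size of the gains
    if pos > 0 and neg == 0:
        return None                                  # dS >= 0 and not identically 0
    # Down-weight the states with dS > 0 until the mean increment is <= 0.
    eps = min(F(1), neg / pos) if pos > 0 else F(1)
    raw = [eps if d > 0 else F(1) for d in dS]
    total = sum(raw)
    return [w / total for w in raw]

# Price increments S_1 - S_0 in three states (illustrative numbers).
dS_fair = [F(1), F(0), F(-1, 2)]
dS_free_lunch = [F(1), F(0), F(2)]
Q_fair = supermartingale_weights(dS_fair)
```

In this toy setting, a supermartingale measure exists exactly when the long-only investor has no arbitrage opportunity, in line with the theorem.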

Remark 3.1.1

In a continuous-time financial market model where the price process S is a special semimartingale under P, one can similarly define a predictably convex set &#55349;&#56494; of admissible integrands and a corresponding convex measure of risk ρ. If one assumes in addition that the set {∫ ξ dS | ξ ∈ &#55349;&#56494;} is closed in the semimartingale or Émery topology, the optional decomposition theorem of Ref.[31] will imply a representation of the form (62) for ρ. The penalty function α(Q) can be described as provided that Q satisfies the following three conditions: Q is equivalent to P, every process ∫ ξ dS with ξ ∈ &#55349;&#56494; is a special semimartingale under Q, and Q admits the upper variation process A<sups>Q</sups> for the set {∫ ξ dS | ξ ∈ &#55349;&#56494;}. One can take α(Q) = ∞ for measures Q that violate one of these conditions.

Let us now relax the condition of acceptability in (53). We no longer insist that the final outcome of an acceptable position, suitably hedged, should always be non-negative. Instead, we only require that the hedged position is acceptable in terms of a given convex risk measure ρ<subs>&#55349;&#56476;</subs> with acceptance set &#55349;&#56476;. Since we cannot be sure that the hedged position is bounded from above, we define

Graph

If ρ<subs>&#55349;&#56476;</subs> is normalized, then and hence

Graph

From now on, we assume that

Graph

which implies our assumption (55) for ρ<sups>&#55349;&#56494;</sups>. Note also that we have if ρ<subs>&#55349;&#56476;</subs> is the worst-case measure.

Proposition 3.1.2

The minimal penalty function <subs>min</subs> for is given by

Graph

where is the minimal penalty function for ρ<sups>&#55349;&#56494;</sups> and α<subs>min</subs> is the minimal penalty function for ρ<subs>&#55349;&#56476;</subs>.

Proof

We claim that

Graph

If , then there exists A ∈ &#55349;&#56476; and ξ ∈ &#55349;&#56494; such that X + G<subs>T</subs>(ξ) ≥ A. Therefore X<sups>&#55349;&#56494;</sups> ≔ X − A ∈ &#55349;&#56476;<sups>&#55349;&#56494;</sups>. Conversely, if X<sups>&#55349;&#56494;</sups> ∈ &#55349;&#56476;<sups>&#55349;&#56494;</sups> then X<sups>&#55349;&#56494;</sups> + G<subs>T</subs>(ξ) ≥ 0 for some ξ ∈ &#55349;&#56494;. Hence, for any A ∈ &#55349;&#56476;, we get X<sups>&#55349;&#56494;</sups> + A + G<subs>T</subs>(ξ) ≥ A ∈ &#55349;&#56476;, i.e., .

In view of (65), we have

Graph

This proves the result.

Barrieu and El Karoui[7] (see also the references therein) study the risk measure in a more general situation and relate it to the inf-convolution of the risk measures ρ<subs>&#55349;&#56476;</subs> and ρ<sups>&#55349;&#56494;</sups>. By this operation they can characterize optimal risk transfers in financial markets. The assumption (64) is not as harmless as it might seem. Consider, for instance, the situation of Example 3.1.3, suppose that the price process S is bounded, and take AV@R<subs>λ</subs> as ρ<subs>&#55349;&#56476;</subs>. The preceding proposition implies that the minimal penalty function of is given by

Graph

There may, however, be no measures in &#55349;&#56491;<subs>&#55349;&#56494;</subs> whose density is bounded by 1/λ, in which case (64) is not satisfied. For this reason, the alternative approach to combining hedging with subjective risk measurement as presented in the next section may be more appropriate. It will also have the additional advantage that it yields optimal hedging strategies.

It should also be mentioned that there are some recent approaches to dynamic risk measures in financial markets; see Artzner et al.[4], Delbaen[22], Föllmer and Schied[34][35], Frittelli and Scandolo[39], Riedel[57], and Weber[67], to mention only a few.

3.2. Efficient Hedging with AV@R<subs>λ</subs> and Other Convex Risk Measures

Let us consider the discrete-time market model of the preceding section. We assume that the model is arbitrage-free, which is equivalent to the assumption that the set &#55349;&#56491; of equivalent martingale measures for S is nonempty. Let H ≥ 0 be the discounted payoff of a European claim, and consider an investor who is short in H, i.e., at time T, the investor must deliver the random amount H(ω). In the situation of Theorem 3.1.1, the risk of the short position −H is given by

Graph

where we assume that the right-hand side is finite. In fact, the theory of superhedging tells us that π<subs>sup</subs>(H) is equal to the cost of superreplicating H, i.e., there exists a trading strategy ξ such that

Graph

see, e.g., Ref.[35], Corollary 7.9. By using such a superhedging strategy, the seller of H can cover almost any possible obligation which may arise from the sale of H and thus completely eliminate the corresponding risk. In many cases, however, the cost π<subs>sup</subs>(H) will be much too high from a practical point of view. Moreover, even if H is attainable, a complete elimination of risk by using a replicating strategy for H would consume the entire proceeds from the sale of H, and any opportunity of making a profit would be lost along with the risk.
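To make the superhedging cost concrete, the following Python sketch computes π<subs>sup</subs>(H) in a small one-period trinomial model and verifies a superhedge. The model (S<subs>0</subs> = 1, terminal prices 2, 1, 1/2), the call payoff, and all function names are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction as F

# Hypothetical one-period model: S_0 = 1, S_1 takes the values 2, 1, 1/2.
S0 = F(1)
S1 = [F(2), F(1), F(1, 2)]
H = [max(s - S0, F(0)) for s in S1]   # call option with strike 1

def martingale_measure(p_up):
    """Martingale measure with weight p_up on the up state (0 <= p_up <= 1/3)."""
    # p_u + p_m + p_d = 1 and 2 p_u + p_m + p_d / 2 = 1 force p_d = 2 p_u.
    return [p_up, 1 - 3 * p_up, 2 * p_up]

def expectation(p, X):
    return sum(pi * xi for pi, xi in zip(p, X))

# E*[H] is linear in p_up, so its supremum over the equivalent martingale
# measures (0 < p_up < 1/3) equals the larger of the two endpoint values.
endpoints = [martingale_measure(F(0)), martingale_measure(F(1, 3))]
pi_sup = max(expectation(p, H) for p in endpoints)

# Candidate superhedge: initial capital pi_sup and xi shares of the risky asset.
xi = F(2, 3)
V_T = [pi_sup + xi * (s - S0) for s in S1]
is_superhedge = all(v >= h for v, h in zip(V_T, H))
```

Here the supremum is approached but not attained by an equivalent measure, which is typical in incomplete models.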

Let us therefore suppose that the investor is unwilling to put up the initial amount of capital required by a superhedge and is ready to accept some risk. What is the optimal partial hedge which can be achieved with a given smaller amount of capital? In order to make this question precise, we need a risk measure ρ expressing the seller's attitude towards risk, and suppose the investor is only willing to put up a smaller amount

Graph

This means that the investor is ready to take some risk: Any "partial" hedging strategy whose value process V satisfies the capital constraint V<subs>0</subs> ≤ v will generate a non-trivial shortfall

Graph

Our aim is to minimize the shortfall risk

Graph

among all admissible strategies satisfying the capital constraint V<subs>0</subs> ≤ v (here we assume that ρ is defined on a suitable function space, so that the shortfall risk is well defined). Alternatively, we could minimize the cost under a given bound on the shortfall risk. In other words, the problem consists in constructing strategies which are efficient with respect to the trade-off between cost and shortfall risk.

It turns out that the construction of the optimal hedging strategy is carried out in two steps. The first one is to solve the "static" problem of minimizing

Graph

among all ℱ<subs>T</subs> -measurable random variables X ≥ 0 which satisfy the constraints

Graph

If X* solves this problem, then so does H ∧ X*. Hence, we may assume that 0 ≤ X* ≤ H. Thus, we can reformulate the problem as

Graph

The next step is to fit the terminal value V<subs>T</subs> of an admissible strategy to the optimal profile X*. It turns out that this step can be carried out without any further assumptions on our risk measure ρ. Thus, we assume at this point that the optimal X* of step one is granted, and we construct the corresponding optimal strategy.

Proposition 3.2.1

A superhedging strategy for a solution X* of (66) with initial investment π<subs>sup</subs>(X*) has minimal shortfall risk among all admissible strategies whose value process satisfies the capital constraint V<subs>0</subs> ≤ v.

Proof

Let V be the value process of any admissible strategy such that V<subs>0</subs> ≤ v. Due to Doob's systems theorem (e.g., Ref.[35], Theorem 5.15), V is a martingale under any P* ∈ &#55349;&#56491;, and so sup<subs>P*∈&#55349;&#56491;</subs>E*[V<subs>T</subs>] = V<subs>0</subs> ≤ v. Thus, X ≔ H ∧ V<subs>T</subs> satisfies the constraints in (66), and we get

Graph

Next let V* be a superhedging strategy for X* with initial investment π<subs>sup</subs>(X*) = sup<subs>P*∈&#55349;&#56491;</subs>E*[X*]. Then we have V*<subs>0</subs> = sup<subs>P*∈&#55349;&#56491;</subs>E*[X*] ≤ v and V<subs>T</subs>* ≥ 0. Moreover, V<subs>T</subs>* ≥ X* P-a.s., and thus

Graph

This concludes the proof.

Let us now return to the static problem defined by (66).

Proposition 3.2.2

If ρ is lower semicontinuous with respect to a.s. convergence of random variables in the class {−X | 0 ≤ X ≤ H}, then there exists a solution of the static optimization problem (66). In particular, there exists a solution if H is bounded and ρ is continuous from above.

The proof needs the following variant of Komlós'[49] principle of subsequences.

Lemma 3.2.1

Let X<subs>n</subs> be a sequence in L<sups>0</sups> such that sup<subs>n</subs>|X<subs>n</subs>|<∞ P-almost surely. Then there exists a sequence of convex combinations

Graph

which converges P-almost surely to some Y ∈ L<sups>0</sups>.

Proof

We can assume without loss of generality that sup<subs>n</subs>|X<subs>n</subs>|≤1 P -a.s.; otherwise we consider the sequence . Then (X<subs>n</subs>) is a bounded sequence in the Hilbert space L<sups>2</sups>. Since the closed unit ball in L<sups>2</sups> is weakly compact, the sequence (X<subs>n</subs>) has an accumulation point Y ∈ L<sups>2</sups>. For each n, the accumulation point Y belongs to the L<sups>2</sups> -closure &#55349;&#56478;<subs>n</subs> of conv{X<subs>n</subs>,X<subs>n+1</subs>,...}, due to the fact that a closed convex set in L<sups>2</sups> is also weakly closed. Thus, we can find Y<subs>n</subs> ∈ conv{X<subs>n</subs>,X<subs>n+1</subs>,...} such that . This sequence (Y<subs>n</subs>) converges P -a.s. to Y.

Proof of Proposition 3.2.2

Take X<subs>n</subs> with 0 ≤ X<subs>n</subs> ≤ H and sup<subs>P*∈&#55349;&#56491;</subs>E*[X<subs>n</subs>] ≤ v such that ρ(X<subs>n</subs> − H) converges to the infimum A of the shortfall risk. We can use Lemma 3.2.1 to select convex combinations Y<subs>n</subs> ∈ conv {X<subs>n</subs>,X<subs>n+1</subs>,...} which converge P -a.s. to some Y. Then 0 ≤ Y ≤ H and Fatou's lemma yields that

Graph

for all P* ∈ &#55349;&#56491;. The lower semicontinuity of ρ implies that

Graph

Moreover, the right-hand side is equal to A, due to the convexity of ρ. Hence, Y is the desired minimizer.

Combining Propositions 3.2.2 and 3.2.1 yields the existence of an optimal hedging strategy under risk aversion in a general arbitrage-free market model. So far, all arguments were practically the same as in the paper by Föllmer and Leukert[32].

Beyond the general existence statement of Proposition 3.2.2, it is sometimes possible to obtain an explicit formula for the optimal solution of the static problem if the market model is complete. Recall that model completeness is equivalent to the uniqueness of the equivalent martingale measure, i.e., to the condition &#55349;&#56491; = {P*}. Thus, the static optimization problem simplifies to

Graph

By substituting Y for H − X, this is equivalent to the problem

Graph

where ṽ ≔ E*[H] − v. We will now solve this problem in the case ρ = AV@R<subs>λ</subs> and thus recover a recent result by Sekine[66]. Note, however, that our proof is significantly shorter. It relies on the general idea that a minimax problem can be transformed into a standard minimization problem by using a duality result for the expression involving the maximum. In the case of AV@R<subs>λ</subs>, we can use Lemma 2.3.3, and this idea was first applied by Eichhorn and Römisch[25]. See also Quenez[56] and Hernández-Hernández and Schied[42] for applications of this general idea to optimal investment problems as discussed in Sections 3.3 and 3.4 below.

Theorem 3.2.1

Suppose that the price density ϕ ≔ dP*/dP has a continuous distribution under P. Then the problem (67) has a unique solution Y* for ρ=AV@R<subs>λ</subs>. Moreover, there exists a critical capital v* such that

Graph

and

Graph

for certain constants c, r* > 0.

Proof

Lemma 2.3.3 gives

Graph

for Y ≥ 0. Hence, Y* must solve

Graph

if r* ≥ 0 is such that

Graph

By Lemma 2.3.3, r* is a λ -quantile for Y*.
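The minimization representation of AV@R<subs>λ</subs> that underlies this step can be checked numerically. The sketch below assumes the familiar form AV@R<subs>λ</subs>(−Y) = (1/λ) min<subs>r</subs>(E[(Y − r)<sups>+</sups>] + λr) for a loss Y ≥ 0 on a finite sample space and compares it with the average of the largest losses carrying probability λ. The function names and the sample distribution are hypothetical.

```python
from fractions import Fraction as F

def avar_by_minimization(values, probs, lam):
    """(1/lam) * min over r of E[(Y - r)^+] + lam * r."""
    # The objective is convex and piecewise linear in r with kinks at the
    # atoms of Y, so it is enough to evaluate it at the atoms.
    def objective(r):
        return sum(p * max(y - r, 0) for y, p in zip(values, probs)) + lam * r
    return min(objective(r) for r in values) / lam

def avar_by_tail_average(values, probs, lam):
    """Average of the largest losses carrying total probability lam."""
    remaining, total = lam, F(0)
    for y, p in sorted(zip(values, probs), reverse=True):
        mass = min(p, remaining)      # split an atom at the boundary if needed
        total += mass * y
        remaining -= mass
        if remaining == 0:
            break
    return total / lam

# Loss Y uniform on {0, 1, 2, 3} (illustrative).
losses = [F(0), F(1), F(2), F(3)]
weights = [F(1, 4)] * 4
```

Any minimizing r is a λ-quantile of the loss, which is the property of r* used in the proof.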

Let us now solve (69). To this end, we consider first the case in which r* = 0. By writing Y = H(1 − ψ), we see that the solution is provided by the Neyman–Pearson lemma in the form of Proposition 2.3.2. Indeed, ψ* corresponding to Y* must be a solution of the problem

Graph

where

Graph

Thus, ψ* is of the form

Graph

for certain constants and c, and in turn Y* = H&#55349;&#56640;<subs>{ϕ≥c}</subs>.
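On a finite sample space, the Neyman–Pearson structure of the case r* = 0 becomes a greedy allocation. The following Python sketch assumes that the problem then reads: minimize E[Y] subject to 0 ≤ Y ≤ H and E*[Y] ≥ ṽ. It fills the states in decreasing order of the price density ϕ = dP*/dP. Since a discrete ϕ has atoms, the sketch allows a fractional value on the boundary state; under the continuity assumption of Theorem 3.2.1 no such randomization is needed. All names and numbers are illustrative.

```python
from fractions import Fraction as F

def cheapest_profile(P, P_star, H, v_tilde):
    """Minimize E[Y] subject to 0 <= Y <= H and E*[Y] >= v_tilde."""
    n = len(P)
    # Visit states in decreasing order of the price density phi = dP*/dP.
    order = sorted(range(n), key=lambda i: P_star[i] / P[i], reverse=True)
    Y, needed = [F(0)] * n, v_tilde
    for i in order:
        if needed <= 0:
            break
        full = P_star[i] * H[i]   # contribution to E*[Y] if Y = H on state i
        fraction = min(F(1), needed / full) if full > 0 else F(0)
        Y[i] = fraction * H[i]
        needed -= fraction * full
    if needed > 0:
        raise ValueError("E*[Y] >= v_tilde cannot be met with Y <= H")
    return Y

# Four equally likely states under P, with H = 1 (illustrative numbers).
P = [F(1, 4)] * 4
P_star = [F(2, 5), F(3, 10), F(1, 5), F(1, 10)]
H = [F(1)] * 4
Y_opt = cheapest_profile(P, P_star, H, F(1, 2))
```

The resulting profile equals H on the states with the highest price density and vanishes on the others, matching the form Y* = H&#55349;&#56640;<subs>{ϕ≥c}</subs> up to the boundary state.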

Now we consider the case r* > 0. Note first that we must have Y* ≥ H ∧ r*. Indeed, let us assume P[Y* < H ∧ r*] > 0. Then we could obtain a strictly lower risk AV@R<subs>λ</subs>(−Y*) either by decreasing the level r* in case P[Y* ≤ H ∧ r*] = 1 or, in case P[Y* > H ∧ r*] > 0, by shifting mass of Y* from {Y* > H ∧ r*} to the set {Y* < H ∧ r*}. Thus, we have to minimize E[( + H ∧ r* − r*)<sups>+</sups>] subject to 0 ≤  ≤ H − H ∧ r* and . Any satisfying these constraints must be concentrated on {H > r*}, so that the problem is equivalent to

Graph

But this problem is equivalent to the one for r* = 0 if we replace H by H − H ∧ r*. Hence, it is solved by * = (H − H ∧ r*)&#55349;&#56640;<subs>{ϕ≥c}</subs> for some constant c. It follows that

Graph

The preceding arguments also yield that for any r* = r ≤  there exists a unique solution Y<subs>r, ṽ</subs> to the minimization problem (69). Clearly, we have Y<subs>r, w̃</subs> ≥ Y<subs>r, ṽ</subs> if w̃ ≥ ṽ. Moreover, we can find some r(ṽ) such that E[(Y<subs>r, ṽ</subs> − r)<sups>+</sups>] + λ r is minimal. We leave it as an exercise for the reader to prove that r(w̃) ≥ r(ṽ) if w̃ ≥ ṽ. This fact then yields the uniqueness of solutions as well as the existence of the critical value v*.

The following example is taken from Ref.[61].

Example 3.2.1

In case H ≡ 1, the solution can be determined explicitly if ϕ has a continuous and strictly increasing distribution function. In order to compute the solution, let us fix ṽ ∈ (0, 1) and let

Graph

where r = r(c) is such that E*[Y<subs>r</subs>] = ṽ, i.e.,

Graph

This makes sense as long as c ≥ c<subs>0</subs>, where c<subs>0</subs> is defined via ṽ = E[ϕ;ϕ ≥ c<subs>0</subs>]. Due to the Neyman–Pearson lemma, Y<subs>r(c)</subs> then is the optimal solution of the problem (69) for r* ≔ r(c). The preceding theorem states that the solution of our original problem is within the class {Y<subs>r(c)</subs> | c ≥ c<subs>0</subs>}. Thus, we have to minimize

Graph

over c ≥ c<subs>0</subs>. Here we have used Lemma 2.4.1 in the second identity. This minimization problem can be simplified further by using the reparameterization c = q<subs>ϕ</subs>(t), where q<subs>ϕ</subs> is the (unique) quantile function for ϕ under P. Indeed, by letting

Graph

for , we simply have to minimize the function

Graph

over t ≥ t<subs>0</subs> ≔ F<subs>ϕ</subs>(c<subs>0</subs>). For t ≤ 1 − λ, we get R(t) = 1, which cannot be optimal. Moreover, it is easy to see that the function

Graph

has a unique maximizer t<subs>λ</subs> ∈ (1 − λ,1], which will define the solution as soon as t<subs>λ</subs> ≥ t<subs>0</subs> and as long as t = t<subs>0</subs> does not give a better result. If t<subs>λ</subs> ≤ t<subs>0</subs>, then R has no minimizer on (t<subs>0</subs>,1], and it follows that t* = t<subs>0</subs>. Thus, let us finally compare R(t<subs>λ</subs>) against R(t<subs>0</subs>) in case t<subs>λ</subs> > t<subs>0</subs>. Since t<subs>λ</subs> > 1 − λ, we have

Graph

and

Graph

Since t<subs>λ</subs> is the unique maximizer of the function t↦ (t − 1 + λ)<sups>+</sups>/Φ(t), we thus see that R(t<subs>λ</subs>) is strictly better than R(t<subs>0</subs>). Hence the solution is defined by

Graph

Note also that, for ‖ϕ‖<subs>∞</subs> > λ<sups>−1</sups>, t<subs>λ</subs> is the unique solution of the equation

Graph

Clearly, t<subs>λ</subs> is independent of ṽ, while t<subs>0</subs> decreases from 1 to 0 as ṽ increases from 0 to 1. Thus, by taking ṽ* as the capital level for which t<subs>λ</subs> = t<subs>0</subs>, we see that the optimal solution has the form Y* = &#55349;&#56640;<subs>{ϕ≥qϕ(t0)}</subs> for ṽ ≤ ṽ* and Y* = r* + (1 − r*)&#55349;&#56640;<subs>{ϕ≥qϕ(tλ)}</subs> for ṽ > ṽ*, where .

3.3. Optimal Investment Under Knightian Uncertainty

In this section, we return to the setting of Section 2.6, where an investor assesses payoffs at some time T by a robust utility functional of the form

Graph

Here, &#55349;&#56492; is a set of probability measures and U is a strictly concave utility function. The problem we are considering is to find trading strategies that are optimal with respect to the investor's preferences. This problem was first discussed systematically by Quenez[56] and Schied[62]. Our presentation draws from this second paper.

In this section, we consider a standard complete market model in continuous time. That is, we consider a market model consisting of one bond and d risky assets, whose price processes are denoted by . We may assume without loss of generality that the price of the bond is constant. The process S is assumed to be a semimartingale on the filtered probability space (Ω, ℱ,(ℱ<subs>t</subs>)<subs>0≤t≤T</subs>,P), and we emphasize that this includes the case of a discrete-time market model, in which prices are adjusted only at times t = 0,1,...,T: just set S<subs>t</subs> ≔ S<subs>[t]</subs> and ℱ<subs>t</subs> ≔ ℱ<subs>[t]</subs> for arbitrary t ∈ [0,T]. We assume that ℱ<subs>0</subs> is P-trivial and that the market is complete in the sense that there exists a unique probability measure P* that is equivalent to P and under which S is a d-dimensional local martingale. In a discrete-time setting, market completeness implies that Ω can be chosen as a finite set, and this will simplify certain assumptions on our set &#55349;&#56492;.

A self-financing trading strategy can be regarded as a pair (x,ξ), where x ∈ ℝ is the initial investment and is a predictable and S -integrable process. The value process X associated with (x,ξ) is given by X<subs>0</subs> = x and

Graph

For given x ∈ ℝ, we denote by &#55349;&#56499;(x) the set of all such processes X with X<subs>0</subs> ≤ x which are admissible in the sense that X<subs>t</subs> ≥ 0 for 0 ≤ t ≤ T and whose terminal wealth X<subs>T</subs> has a well-defined robust utility

Graph

Now we can state our main problem:

Graph

Definition 3.3.1

Let &#55349;&#56492; be a set of probability measures absolutely continuous with respect to P*. Q<subs>0</subs> ∈ &#55349;&#56492; is called a least favorable measure with respect to P* if the density π = dP*/dQ<subs>0</subs> (taken in the sense of the Lebesgue decomposition) satisfies

Graph

In the sequel, we will assume that &#55349;&#56492; is a convex set. Moreover, we will assume throughout this note that &#55349;&#56492; is equivalent to P* in the following sense:

Graph

Clearly, our problem (72) would not be well-posed without the implication "↠". The converse implication is economically natural, since a position with a positive price should lead to a non-vanishing utility. Note that (73) is strictly weaker than the condition that every measure in &#55349;&#56492; is equivalent to P*, which is often assumed in papers on model uncertainty; for a discussion see Cont[17].

Now we can state our first main result. It reduces the robust utility maximization problem to a standard utility maximization problem plus the computation of a least favorable measure, which is independent of the utility function.

Theorem 3.3.1

Suppose that &#55349;&#56492; admits a least favorable measure Q<subs>0</subs> ∼ P*. Then the robust utility maximization problem (72) is equivalent to the standard utility maximization problem with respect to Q<subs>0</subs>, i.e., to (72) with &#55349;&#56492; replaced by &#55349;&#56492;<subs>0</subs> ≔ {Q<subs>0</subs>}. More precisely, X*<subs>T</subs> ∈ &#55349;&#56499;(x) solves the robust problem (72) if and only if it solves the standard problem for Q<subs>0</subs>, and the corresponding value functions are equal, whether there exists a solution or not:

Graph

This result has the following economic consequence. Let ≻ denote the preference order induced by our robust utility functional, i.e.,

Graph

Then, although ≻ does not satisfy the axioms of (subjective) expected utility theory, optimal investment decisions with respect to ≻ are still made in accordance with the Savage/Anscombe–Aumann version of expected utility, provided that we take Q<subs>0</subs> as the subjective probability measure. The surprising part is that this subjective measure depends neither on the initial investment x = X<subs>0</subs> nor on the choice of the utility function U. If &#55349;&#56492; does not admit a least favorable measure, then it is still possible that the robust problem is equivalent to a standard utility maximization problem with a subjective measure Q, which then, however, will depend on x and U. This can be shown by suitably adjusting the arguments in Proposition 3.3.1 below; see Gundel[41] for details.

Let us now show that the condition Q<subs>0</subs> ∼ P* is always satisfied if &#55349;&#56492; is convex and closed in total variation. Recall that &#55349;&#56492; is closed in total variation if and only if {dQ/dP* | Q ∈ &#55349;&#56492;} is closed in L<sups>1</sups>(P*).

Lemma 3.3.1

Suppose that &#55349;&#56492; is convex and closed in total variation. Then every least favorable measure Q<subs>0</subs> is equivalent to P*.

Proof

Due to our assumptions and the Halmos–Savage theorem, &#55349;&#56492; contains a measure Q<subs>1</subs> ∼ P*. We get

Graph

Hence, also P*[π < ∞] = 1 and in turn P* ≪ Q<subs>0</subs>.

We also have the following converse to Theorem 3.3.1:

Theorem 3.3.2

Suppose Q<subs>0</subs> ∈ &#55349;&#56492; is such that for all utility functions and all x > 0 the robust utility maximization problem (72) is equivalent to the standard utility maximization problem with respect to Q<subs>0</subs>. Then Q<subs>0</subs> is a least favorable measure in the sense of Definition 3.3.1.

The proof will show that in the preceding theorem the class of all utility functions can be replaced by the smaller class of all bounded and continuously differentiable utility functions.

Let us next state the following elementary characterization of least favorable measures, which is a variant of Theorem 6.1 in Ref.[46].

Proposition 3.3.1

For Q<subs>0</subs> ∈ &#55349;&#56492; with Q<subs>0</subs> ∼ P* and π ≔ dP*/dQ<subs>0</subs>, the following conditions are equivalent.

a. Q<subs>0</subs> is a least favorable measure for P*.

b. For all decreasing functions f:(0, ∞] → ℝ such that inf<subs>Q∈&#55349;&#56492;</subs>E<subs>Q</subs>[f(π)∧ 0] > −∞,

Graph

c. For all increasing functions g:(0, ∞] → ℝ such that ,

Graph

d. Q<subs>0</subs> minimizes

Graph

among all Q ∈ &#55349;&#56492;, for all continuous convex functions Φ:[0, ∞) → ℝ such that I<subs>Φ</subs>(P*|Q) is finite for some Q ∈ &#55349;&#56492;.

Proof

(a)⇔(b) According to the definition, Q<subs>0</subs> is a least favorable measure if and only if Q<subs>0</subs> ∘ π<sups>−1</sups> stochastically dominates Q ∘ π<sups>−1</sups> for all Q ∈ &#55349;&#56492;. Hence, if f is bounded, then the equivalence of (a) and (b) is just the standard characterization of stochastic dominance (see, e.g., Ref.[35], Theorem 2.71). If f is unbounded but satisfies inf<subs>Q∈&#55349;&#56492;</subs>E<subs>Q</subs>[f(π)∧ 0] > −∞, then assertion (b) holds for f<subs>N</subs> ≔ (−N) ∨ f∧0. Thus, for all Q ∈ &#55349;&#56492; and N ∈ ℕ,

Graph

By sending N to infinity, it follows that E<subs>Q</subs>[f(π)∧ 0] ≥ E<subs>Q0</subs>[f(π)∧ 0] for every Q ∈ &#55349;&#56492;. After using a similar argument on 0 ∨ f(π), we get

Graph

(b)⇔(c) follows by changing signs.

(b) ⇒ (d) Clearly, I<subs>Φ</subs>(P*|Q) is well-defined and larger than Φ(1) for each Q ≪ P. Now take a Q<subs>1</subs> ∈ &#55349;&#56492; with I<subs>Φ</subs>(P*|Q<subs>1</subs>) < ∞, and denote by Φ′<subs>+</subs>(x) the right-hand derivative of Φ at x ≥ 0. Suppose first that Φ<subs>+</subs>′ is bounded. Since Φ(y) − Φ(x) ≥ Φ′<subs>+</subs>(x)(y − x), we have

Graph

where f(x) ≔ Φ<subs>+</subs>′(1/x) is a bounded decreasing function. Therefore ∫ f(π)dQ<subs>1</subs> ≥ ∫ f(π)dQ<subs>0</subs>, and Q<subs>0</subs> minimizes I<subs>Φ</subs>(P* | ·) on &#55349;&#56492;. If Φ<subs>+</subs>′ is unbounded, one can either use a straightforward approximation argument or apply Ref.[35], Corollary 2.62.

(d) ⇒ (b) It is enough to prove (b) for continuous bounded decreasing functions f. For such a function f let . Then Φ is convex. For Q<subs>1</subs> ∈ &#55349;&#56492; we let Q<subs>t</subs> ≔ t Q<subs>1</subs> + (1 − t)Q<subs>0</subs> and h(t) ≔ I<subs>Φ</subs>(P* | Q<subs>t</subs>). The right-hand derivative of h satisfies 0 ≤ h<subs>+</subs>′(0) = ∫ f(π)dQ<subs>1</subs> − ∫ f(π)dQ<subs>0</subs>, and the proof is complete.
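The stochastic dominance criterion used in the step (a)⇔(b) is easy to test on a finite sample space. The following Python sketch reads "Q<subs>0</subs> ∘ π<sups>−1</sups> stochastically dominates Q ∘ π<sups>−1</sups>" in the usual first-order sense, Q<subs>0</subs>[π ≤ t] ≤ Q[π ≤ t] for all t, and checks it against a finite list of candidate measures. The measures and function names are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as F

def cdf_of_pi(Q, pi, t):
    """Q[pi <= t] on a finite sample space."""
    return sum(q for q, x in zip(Q, pi) if x <= t)

def is_least_favorable(Q0, family, P_star):
    """Check Q0[pi <= t] <= Q[pi <= t] for all Q in family, with pi = dP*/dQ0."""
    pi = [p / q for p, q in zip(P_star, Q0)]   # requires Q0 > 0 on every state
    # Both distribution functions are step functions that jump only at the
    # values of pi, so it suffices to compare them at those values.
    return all(cdf_of_pi(Q0, pi, t) <= cdf_of_pi(Q, pi, t)
               for Q in family for t in set(pi))

def expectation(Q, values):
    return sum(q * v for q, v in zip(Q, values))

P_star = [F(1, 3)] * 3
Q0 = [F(1, 2), F(1, 3), F(1, 6)]
Q1 = [F(3, 5), F(3, 10), F(1, 10)]
Q2 = [F(2, 5), F(2, 5), F(1, 5)]
pi = [p / q for p, q in zip(P_star, Q0)]
f_of_pi = [1 / x for x in pi]                  # f(x) = 1/x is decreasing
```

In this example Q<subs>0</subs> passes the test against Q<subs>1</subs>, and the expectation of the decreasing function f(π) is indeed smallest under Q<subs>0</subs>, as condition (b) requires; against Q<subs>2</subs> the test fails.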

An important method for solving optimal investment problems is the duality approach via convex analysis, which is sometimes also called the 'martingale method'. Below we will state such a result which is valid in our present setting. It follows immediately by combining Theorem 3.3.1 with Proposition 3.3.1 and Theorem 2.0 of Kramkov and Schachermayer[53], which is the corresponding result for standard utility maximization problems. We have to assume that U is continuously differentiable and satisfies the Inada conditions

Graph

We denote by

Graph

the value function of the problem (72). Let

Graph

denote the convex conjugate of U and define the function

Graph

We also define the convex function

Graph

Theorem 3.3.3

Suppose that &#55349;&#56492; admits a least favorable measure Q<subs>0</subs>∼P* and that u(x) is finite for some x > 0. Then:

a. u(x) is finite for all x > 0, and v(y) < ∞ for y > 0 sufficiently large. The function v is continuously differentiable in the interior (y<subs>0</subs>,∞) of its effective domain. The function u is continuously differentiable on (0, ∞) and strictly concave on (0,x<subs>0</subs>), where x<subs>0</subs> ≔ −lim<subs>y ↓ y0</subs>v′(y). For x, y > 0,

Graph

Moreover, and .

b. For x < x<subs>0</subs> there exists a unique solution X*(x) ∈ &#55349;&#56499;(x) of (72), and its terminal wealth is of the form

Graph

c. For 0 < x < x<subs>0</subs> and y < y<subs>0</subs>,

Graph
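For a concrete instance of part (b), consider logarithmic utility on a finite complete market. The sketch below assumes that the optimal terminal wealth has the classical form I(yπ), where I is the inverse of U′ and y > 0 is fixed by the budget constraint E*[X<subs>T</subs>] = x; for U = log this gives I(z) = 1/z and y = 1/x. The choice of utility, the two measures, and the function names are illustrative assumptions.

```python
import math
from fractions import Fraction as F

def log_optimal_wealth(P_star, Q0, x):
    """Terminal wealth I(y * pi) for U = log, with y chosen so that E*[X_T] = x."""
    pi = [p / q for p, q in zip(P_star, Q0)]   # pi = dP*/dQ0
    y = 1 / x                                   # since E*[1 / (y * pi)] = 1 / y
    return [1 / (y * d) for d in pi]

def expected_log_utility(Q, X):
    return sum(float(q) * math.log(float(w)) for q, w in zip(Q, X))

P_star = [F(1, 3)] * 3
Q0 = [F(1, 2), F(1, 3), F(1, 6)]
X_opt = log_optimal_wealth(P_star, Q0, F(1))
budget = sum(p * w for p, w in zip(P_star, X_opt))
```

The optimal wealth is a decreasing function of π: it is largest in the states to which Q<subs>0</subs> assigns more weight than the pricing measure P*.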

A duality theorem for robust utility maximization in incomplete markets was first obtained by Quenez[56] under the additional assumption that every measure Q ∈ &#55349;&#56492; is equivalent to a given reference measure P. It was extended to the general case by Schied and Wu[64]; see also Föllmer and Gundel[30] for a different technique and Schied[68] for a further extension. One of the important features of this kind of duality result is that it reduces the original maximin problem to a dual problem, which then is of reduced complexity as it just consists in minimizing over a certain set of controls. We have seen this principle already in the proof of Theorem 3.2.1. In robust utility maximization, it has been employed by Quenez[56] together with stochastic control techniques involving backward stochastic differential equations. Hernández-Hernández and Schied[42] use duality combined with partial differential equations to characterize the value function of a robust utility problem as the unique bounded classical solution of a nonlinear Hamilton–Jacobi–Bellman equation.

Let us now turn to the proofs of Theorems 3.3.1 and 3.3.2. Let X* be a solution of the standard utility maximization problem for the least favorable measure Q<subs>0</subs>. Then it is well known that X*<subs>T</subs> = I(yπ) for some constant y > 0. Thus, one easily checks via Proposition 3.3.1 that X* is also a solution of the robust utility maximization problem. However, in order to show the full equivalence of the two problems, we must also take care of the situation in which the standard problem has no solution. Our key result is the following proposition.

Proposition 3.3.2

Let Q<subs>0</subs> ∼ P* be a least favorable measure and π = dP*/dQ<subs>0</subs>.

a. For any X ∈ &#55349;&#56499;(x) there exists such that

Graph

and such that for some deterministic decreasing function f:(0, ∞) → [0, ∞).

b. The terminal wealth of any solution X* of (72) is of the form X*<subs>T</subs> = f*(π) for a deterministic decreasing function f*:(0, ∞) → [0, ∞).

Proof of Proposition 3.3.2

(a) By market completeness, it suffices to construct a decreasing function f ≥ 0 such that E*[f(π)] ≤ x and

Graph

To this end, we denote by F<subs>Y</subs>(x) ≔ Q<subs>0</subs>[Y ≤ x] the distribution function and by q<subs>Y</subs>(t) a quantile function of a random variable Y with respect to the probability measure Q<subs>0</subs>.

Let us define a function f by

Graph

Then f is decreasing and satisfies f(q<subs>π</subs>) = E<subs>λ</subs>[h | q<subs>π</subs>], where λ is the Lebesgue measure and h(t) ≔ q<subs>XT</subs>(1 − t). Hence, Jensen's inequality for conditional expectations and Lemma 2.4.1 show that

Graph

where we have used Proposition 3.3.1 in the last step. Thus, f satisfies (74).

It remains to show that f(π) satisfies the capital constraint. To this end, we first use the lower Hardy–Littlewood inequality:

Graph

Here we may replace q<subs>XT</subs>(1 − t) = h(t) by E<subs>λ</subs>[h | q<subs>π</subs>](t) = f(q<subs>π</subs>(t)). We then get

Graph

Thus, f is as desired.
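On a uniform probability space with n points, quantile functions are just sorted values, so the lower Hardy–Littlewood bound becomes the mean of the products obtained by pairing the largest values of one variable with the smallest values of the other. The following Python sketch checks this by brute force over all couplings of two small samples. The data and function names are hypothetical.

```python
from itertools import permutations

def antimonotone_pairing(xs, ys):
    """Mean of products when the smallest x meets the largest y, and so on."""
    pairs = zip(sorted(xs), sorted(ys, reverse=True))
    return sum(a * b for a, b in pairs) / len(xs)

def mean_product(xs, ys):
    """E[XY] when the i-th values of xs and ys occur together with weight 1/n."""
    return sum(a * b for a, b in zip(xs, ys)) / len(xs)

xs = [1, 4, 2, 7]
ys = [3, 0, 5, 2]
lower_bound = antimonotone_pairing(xs, ys)
# Every rearrangement of ys against xs is a coupling with the same marginals.
all_couplings = [mean_product(xs, perm) for perm in permutations(ys)]
```

The minimum over all couplings is attained by the antimonotone arrangement, which is the discrete counterpart of the integral of q<subs>X</subs>(1 − t) q<subs>Y</subs>(t) over (0, 1).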

(b) Now suppose X* solves (72). If X*<subs>T</subs> is not Q<subs>0</subs>-a.s. σ(π)-measurable, then Y ≔ E<subs>Q0</subs>[X*<subs>T</subs> | π] must satisfy

Graph

due to the strict concavity of U. If we define as in (75) with Y replacing X<subs>T</subs>, then the proof of part (a) yields that

Graph

and by (76) and (79),

Graph

in contradiction to the optimality of X*. Thus, X*<subs>T</subs> is necessarily σ(π)-measurable and can hence be written as a (not yet necessarily decreasing) function of π.

If we define f* as in (75) with X*<subs>T</subs> replacing X<subs>T</subs>, then f*(π) is the terminal wealth of yet another solution in 𝒳(x). Clearly, we must have E*[X*<subs>T</subs>] = x = E*[f*(π)]. Thus, (77) and (78) yield that equality holds in the lower Hardy–Littlewood inequality (77) for X*<subs>T</subs>. But then the "only if" part of the lower Hardy–Littlewood inequality together with the σ(π)-measurability of X*<subs>T</subs> imply that X*<subs>T</subs> is a decreasing function of π.

Proof of Theorem 3.3.1

Proposition 3.3.2 implies that in solving the robust utility maximization problem (72) we may restrict ourselves to strategies whose terminal wealth is a decreasing function of π. By Proposition 3.3.1, the robust utility of such a terminal wealth is the same as the expected utility with respect to Q<subs>0</subs>. On the other hand, taking 𝒬<subs>0</subs> ≔ {Q<subs>0</subs>} in Proposition 3.3.2 implies that the standard utility maximization problem for Q<subs>0</subs> also requires only strategies whose terminal wealth is a decreasing function of π. Therefore, the two problems are equivalent, and Theorem 3.3.1 is proved.

Proof of Theorem 3.3.2

Let (U<subs>n</subs>) be a sequence of nonnegative and continuously differentiable utility functions that increase uniformly to the concave increasing function U(x) ≔ x∧1. Uniform convergence of U<subs>n</subs> implies convergence of the corresponding value functions:

Graph

If we assume that decreases fast enough to 0 as x↑∞, then E*[I<subs>1</subs><sups>+</sups>(cπ)] < ∞ for all c > 0, where π ≔ dP*/dQ<subs>0</subs> and I<subs>1</subs><sups>+</sups> is the inverse of on and I<subs>1</subs><sups>+</sups>(x) = 0 for . Market completeness and Ref.[35], Theorem 3.39 guarantee that, for every 0 < x ≤ 1 and each n ∈ ℕ, there exists a solution X<sups>n</sups> ∈ &#55349;&#56499;(x) for the standard utility maximization problem with utility function U<subs>n</subs> under Q<subs>0</subs>. Note that the preceding two statements also remain true for P*  Q<subs>0</subs> and that on {π = ∞}.

By a Komlós-type argument (see Lemma 3.3 in Ref.[53]), there exists a sequence (Y<subs>n</subs>) of convex combinations of the X<sups>k</sups><subs>T</subs> which converges P*-a.s. to some random variable X*<subs>T</subs> ≥ 0, which satisfies E*[X*<subs>T</subs>] ≤ x, due to Fatou's lemma. Hence, X*<subs>T</subs> corresponds to a value process X* ∈ 𝒳(x). Let us write Y<subs>n</subs> as the convex combination Y<subs>n</subs> = ∑<subs>k</subs> α<subs>k, n</subs>X<sups>k</sups><subs>T</subs>, where only finitely many α<subs>k, n</subs> are nonzero. Then,

Graph

due to (80). Hence, X* is optimal for the utility maximization problem with U and Q<subs>0</subs>. Since U is constant on [1,∞), we must have 0 ≤ X*<subs>T</subs> ≤ 1 P*-almost surely. Thus, X*<subs>T</subs> is a solution to the problem of maximizing E<subs>Q0</subs>[U(X)] = E<subs>Q0</subs>[X] under the constraints 0 ≤ X ≤ 1 and E*[X] ≤ x. Hence, the generalized Neyman–Pearson lemma in the form of Proposition 2.3.2 implies that X*<subs>T</subs> = 𝕀<subs>{π<q}</subs> + κ𝕀<subs>{π=q}</subs>, where q can be any x-quantile for the law of π under P*, and κ is a [0, 1]-valued random variable. In particular,

Graph

Note also that the x-quantile q is unique if P*[π = q] > 0.
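The Neyman–Pearson form of the optimal terminal wealth can be made concrete on a finite probability space. The following minimal sketch assumes strictly positive atom weights for both P* and Q<subs>0</subs>; the function name `neyman_pearson_payoff` and the numbers are illustrative assumptions and do not appear in the text.

```python
from fractions import Fraction


def neyman_pearson_payoff(p_star, q0, x):
    """Maximize E_Q0[X] over 0 <= X <= 1 subject to E*[X] <= x (finite space).

    Atoms are filled in order of increasing pi = p_star / q0; the atom at
    which the budget x runs out receives a fractional value (the kappa of
    the text). Assumes all entries of p_star and q0 are strictly positive.
    """
    order = sorted(range(len(q0)), key=lambda i: p_star[i] / q0[i])
    payoff = [Fraction(0)] * len(q0)
    budget = Fraction(x)
    for i in order:
        if budget <= 0:
            break
        take = min(Fraction(1), budget / p_star[i])  # affordable share of atom i
        payoff[i] = take
        budget -= take * p_star[i]
    return payoff


# Hypothetical four-atom example with pi = (0.4, 0.8, 1.2, 1.6).
q0 = [Fraction(1, 4)] * 4
p_star = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10), Fraction(2, 5)]
payoff = neyman_pearson_payoff(p_star, q0, Fraction(9, 20))

print(payoff)
```

In this example the payoff equals 1 on {π < q}, takes the value κ = 1/2 on {π = q}, and vanishes on {π > q}, with q = 6/5.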

Next, if Q ∈ 𝒬 is given, then

Graph

where we have used the fact that for all k. This inequality follows from the hypothesis of the theorem: solves both the standard and the robust utility maximization problems, and the corresponding value functions are equal, i.e.,

Graph

Finally, combining (82) with (81) yields Q[π ≤ q] = E<subs>Q</subs>[U(X*<subs>T</subs>)] ≥ E<subs>Q0</subs>[U(X*<subs>T</subs>)] = Q<subs>0</subs>[π ≤ q] for all but countably many and, in turn, all q ∈ [0, 1].

3.4. Examples of Least Favorable Measures

In this section, we will discuss two classes of examples in which least favorable measures can be determined. The first is a Black–Scholes market with uncertain drift. The second is provided by the classical Huber–Strassen theory, where 𝒬 is the σ-core of a submodular capacity.

3.4.1. Utility maximization with uncertain drift

Consider a Black–Scholes market model with a riskless bond B<subs>t</subs>, for which we assume B<subs>t</subs> ≡ 1, and with d risky assets that satisfy an SDE of the form

Graph

with a d-dimensional Brownian motion W and a volatility matrix σ<subs>t</subs> that has full rank. Now suppose the investor is uncertain about the "true" future drift in the market: any drift α is possible that is adapted to the filtration generated by W and satisfies α<subs>t</subs> ∈ C<subs>t</subs>, where C<subs>t</subs> is a nonrandom bounded closed convex subset of ℝ<sups>d</sups>. Let us denote by 𝒜 the set of all such processes α. This uncertainty in the choice of the drift can be expressed by the set

Graph

Under P* the drift α in (83) vanishes. It turns out that the optimal investment problem with uncertain drift can be solved by transforming it into a problem for uncertain volatility as studied by El Karoui et al.[27]. To this end, we denote by α<sups>0</sups><subs>t</subs> the element in C<subs>t</subs> that minimizes the norm |σ<subs>t</subs><sups>−1</sups>x| among all x ∈ C<subs>t</subs>.

Proposition 3.4.1.1

Suppose that σ<subs>t</subs> is deterministic and that both α<sups>0</sups><subs>t</subs> and σ<subs>t</subs> are continuous in t. Then 𝒬 admits a least favorable measure Q<subs>0</subs> with respect to P*, which is characterized by having the drift α<sups>0</sups>.

Proof

We will use arguments from Ref.[27] to check condition (d) of Proposition 3.3.1. The density process of Q ∈ 𝒬 with respect to P* has the form

Graph

where and W* is a d -dimensional P*-Brownian motion. Similarly, the density process Z ≔ Z<sups>Q0</sups> will involve the deterministic integrand . Let Φ be a convex function on ℝ<subs>+</subs>. We may assume without loss of generality that Φ has at most polynomial growth. Then v(t, x) ≔ E*[Φ(xZ<subs>t</subs>)] is a solution of the Black–Scholes equation . This fact and Itô's formula show that

Graph

One easily checks that the first term on the right is a martingale increment. Moreover, v is convex and |λ<subs>t</subs>|<sups>2</sups> ≥ |γ<subs>t</subs>|<sups>2</sups> by definition of α<sups>0</sups>. Hence, is a submartingale and

Graph
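The comparison established in this proof can be illustrated in the simplest special case. The sketch below assumes d = 1, a constant volatility σ, constant drifts α in an interval C = [a, b], and the convex function Φ(z) = z<sups>p</sups> with p ≥ 1; the density Z<subs>T</subs> = exp(λW*<subs>T</subs> − λ<sups>2</sups>T/2) with λ = α/σ is an assumption about the form of the (missing) displayed density process. All function names and numbers are illustrative.

```python
import math


def least_favorable_drift(a, b):
    """Point of the interval [a, b] closest to 0 (d = 1, so sigma only rescales)."""
    return min(max(0.0, a), b)


def expected_power_of_density(alpha, sigma, T, p):
    """E*[Z_T**p] for Z_T = exp(lam*W_T - lam**2*T/2) with constant lam = alpha/sigma.

    Under P*, lam*W_T is N(0, lam**2 * T), which gives the closed form
    exp(p*(p-1)*lam**2*T/2).
    """
    lam = alpha / sigma
    return math.exp(0.5 * p * (p - 1) * lam**2 * T)


# Hypothetical confidence interval for the drift and market parameters.
a, b, sigma, T, p = 0.02, 0.10, 0.2, 1.0, 2
alpha0 = least_favorable_drift(a, b)
baseline = expected_power_of_density(alpha0, sigma, T, p)

# Constant drifts on a grid in [a, b]; each should give a value >= baseline.
alphas = [a + 0.01 * k for k in range(9)]
values = [expected_power_of_density(alpha, sigma, T, p) for alpha in alphas]

print(alpha0, baseline, min(values))
```

In this toy setting the expectation E*[Φ(Z<subs>T</subs>)] is increasing in |λ|, so it is smallest for the drift of minimal norm, in line with the role of α<sups>0</sups> in the proposition.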

An obvious question is whether the strong condition that the volatility σ<subs>t</subs> and the drift α<sups>0</sups> are deterministic can be relaxed. One case of interest is, for instance, a local volatility model in which Equation (83) is replaced by the one-dimensional SDE

Graph

In this case, however, it will typically no longer be optimal to take the drift that is closest to the risk-neutral case α ≡ 0. The reason is that the utility of an investment can be reduced both by a small drift and by a large volatility, and these two requirements may compete with each other. This effect may also destroy the existence of a least favorable measure; see Hernández-Hernández and Schied[42] for the discussion of a related tradeoff effect. See also Schied[67] for examples in the setting of Proposition 3.4.1.1, in which a path-dependent volatility σ<subs>t</subs> or drift prevents the existence of a least favorable measure.

3.4.2. Examples within the Huber–Strassen theory

In the preceding section, the way of determining the set 𝒬 was to specify a "confidence set" around an estimate of a certain market parameter and to take for 𝒬 the class of all measures that are consistent with this confidence set. In practice, however, one would rather try to assign a high weight to the original estimate, while a measure concentrated on the outermost edge of the confidence set should receive a lower weight. This idea illustrates that the set 𝒬 may arise in a more complicated manner from the investor's preference relation than in the ad hoc approach of the preceding section.

The complexity of determining the set 𝒬 is reduced if one imposes additional assumptions on the underlying preference order. For instance, we have already discussed the assumption of comonotonic independence, which is reasonable insofar as comonotonic positions cannot act as mutual hedges; see Remark 2.6.1. Mathematically, comonotonic independence is essentially equivalent to the fact that the nonadditive set function

γ(A) ≔ sup<subs>Q∈𝒬</subs>Q[A], A ∈ ℱ<subs>T</subs>,

is submodular in the sense of Choquet:

γ(A ∪ B) + γ(A ∩ B) ≤ γ(A) + γ(B) for A, B ∈ ℱ<subs>T</subs>;

see Remark 2.6.1.

Assumption 1

Consider the following set of conditions.

a. γ is submodular.

b. 𝒬 is maximal in the sense that it contains every measure Q with Q[A] ≤ γ(A) for all A ∈ ℱ<subs>T</subs>.

c. There exists a Polish topology on Ω such that ℱ<subs>T</subs> is the corresponding Borel field and 𝒬 is compact.

Let us also comment on conditions (b) and (c) in Assumption 1. Condition (c) guarantees that γ is a capacity in the sense of Choquet[16]. Condition (b) implies that 𝒬 is convex and closed in total variation. Hence, Lemma 3.3.1 yields that any least favorable measure must be equivalent to P*. Moreover, under assumption (a), the set 𝒬 = {Q | Q ≤ γ} is equal to

Graph

see, e.g., Ref.[35], Theorem 4.88.

Consider the submodular set function

ν<subs>t</subs>(A) ≔ tγ(A) − P*[A], A ∈ ℱ<subs>T</subs>. (85)

It is shown in Lemmas 3.1 and 3.2 of Ref.[47] that under Assumption 1 there exists a decreasing family (A<subs>t</subs>)<subs>t>0</subs> ⊂ ℱ<subs>T</subs> such that A<subs>t</subs> minimizes ν<subs>t</subs> and such that A<subs>t</subs> = ∪<subs>s>t</subs>A<subs>s</subs>.

Definition 3.4.2.1 (Huber and Strassen)

The function

dP*/dγ(ω) ≔ inf{t | ω ∉ A<subs>t</subs>}, ω ∈ Ω,

is called the Radon–Nikodym derivative of P* with respect to γ.

The terminology "Radon–Nikodym derivative" comes from the fact that dP*/dγ coincides with the usual Radon–Nikodym derivative dP*/dQ in the case where 𝒬 = {Q}; see Ref.[47]. We will need the following simple lemma:

Lemma 3.4.2.1

Condition (73) implies that 0 < dP*/dγ < ∞ P*-almost surely.

Proof

Let ν<subs>t</subs> be as in (85). Clearly, dP*/dγ(ω) = ∞ if and only if ω ∈ A<subs>∞</subs> ≔ ∩<subs>0<t<∞</subs>A<subs>t</subs>. Since ν<subs>t</subs>(A<subs>t</subs>) ≤ ν<subs>t</subs>(∅) = 0, we have γ(A<subs>t</subs>) ≤ 1/t. It follows that γ(A<subs>∞</subs>) = 0, which by (73) implies that P[A<subs>∞</subs>] = 0.

Letting A<subs>0</subs> ≔ ∪<subs>0<t<∞</subs>A<subs>t</subs>, we see that if and only if . From ν<subs>t</subs>(A<subs>t</subs>) ≤ ν<subs>t</subs>(Ω) = t − 1, we find that . As t ↓ 0 we thus get .
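The two estimates used in this proof can be written out as follows. This is a sketch under the assumption that the set function in (85) has the form ν<subs>t</subs>(A) = tγ(A) − P*[A], which is consistent with the values ν<subs>t</subs>(∅) = 0 and ν<subs>t</subs>(Ω) = t − 1 quoted in the proof; it also uses that γ is monotone and that A<subs>∞</subs> ⊂ A<subs>t</subs> ⊂ A<subs>0</subs>.

```latex
\begin{align*}
\nu_t(A_t) \le \nu_t(\emptyset) = 0
  &\;\Longrightarrow\; t\,\gamma(A_t) \le P^*[A_t] \le 1
  \;\Longrightarrow\; \gamma(A_\infty) \le \gamma(A_t) \le \tfrac{1}{t}
  \xrightarrow[t\uparrow\infty]{} 0, \\
\nu_t(A_t) \le \nu_t(\Omega) = t - 1
  &\;\Longrightarrow\; P^*[A_t^c] \le t\,\bigl(1 - \gamma(A_t)\bigr) \le t
  \;\Longrightarrow\; P^*[A_0^c] \le P^*[A_t^c] \le t
  \xrightarrow[t\downarrow 0]{} 0.
\end{align*}
```

Under this assumption, the first line controls the set on which the derivative is infinite and the second line the set on which it vanishes.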

Let us now state the Huber–Strassen theorem from Ref.[47] in a form in which it will be needed here.

Theorem 3.4.2.1 (Huber–Strassen)

Under Assumption 1, 𝒬 admits a least favorable measure Q<subs>0</subs> with respect to any probability measure R on (Ω, ℱ<subs>T</subs>). Moreover, if R = P* and 𝒬 satisfies (73), then Q<subs>0</subs> is equivalent to P* and given by

dQ<subs>0</subs> = (dP*/dγ)<sups>−1</sups> dP*.
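On a finite space the Huber–Strassen construction can be carried out by brute force. The sketch below rests on several assumptions, since the displayed formulas are not reproduced in this text: it takes ν<subs>t</subs>(A) = tγ(A) − P*[A], defines A<subs>t</subs> as the smallest minimizer of ν<subs>t</subs>, sets dP*/dγ(ω) = inf{t | ω ∉ A<subs>t</subs>}, and obtains Q<subs>0</subs> by dividing P* by this derivative. The capacity `distortion_capacity` and all numbers are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets, smallest first."""
    return [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]


def huber_strassen_derivative(p_star, gamma):
    """pi(w) = inf{t : w not in A_t}, A_t the smallest minimizer of t*gamma(A) - P*[A].

    The forms of nu_t and pi are assumptions (see the lead-in).
    gamma maps frozensets to Fractions and is assumed submodular.
    """
    n = len(p_star)
    sets = subsets(n)
    prob = {A: sum((p_star[w] for w in A), Fraction(0)) for A in sets}
    # A_t can only change at values of t where two sets have equal nu_t.
    ts = set()
    for A in sets:
        for B in sets:
            if gamma(A) != gamma(B):
                t = (prob[A] - prob[B]) / (gamma(A) - gamma(B))
                if t > 0:
                    ts.add(t)
    pi = [None] * n
    for t in sorted(ts):
        nu = {A: t * gamma(A) - prob[A] for A in sets}
        best = min(nu.values())
        A_t = min((A for A in sets if nu[A] == best), key=len)  # smallest minimizer
        for w in range(n):
            if pi[w] is None and w not in A_t:
                pi[w] = t
    return pi


def distortion_capacity(A, n=3, lam=Fraction(1, 2)):
    """Hypothetical capacity gamma(A) = min(R[A]/lam, 1), R uniform on n points."""
    return min(Fraction(len(A), n) / lam, Fraction(1))


p_star = [Fraction(3, 4), Fraction(1, 8), Fraction(1, 8)]
pi = huber_strassen_derivative(p_star, distortion_capacity)
q0 = [p / d for p, d in zip(p_star, pi)]  # assumed form dQ0 = (dP*/dgamma)^(-1) dP*

print(pi, q0)
```

In this example P* puts more mass on the first point than the capacity allows, and the resulting Q<subs>0</subs> caps that mass at the bound 2/3 while rescaling the remaining points proportionally.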

Together with Theorem 3.3.1, we get a complete solution of the robust utility maximization problem within the large class of utility functionals that arise from sets 𝒬 as in Assumption 1. Before discussing particular examples, let us state the following converse of the Huber–Strassen theorem in order to clarify the role of condition (a) in Assumption 1.

Theorem 3.4.2.2

Suppose Ω is a Polish space with Borel field ℱ<subs>T</subs> and 𝒬 is a compact set of probability measures. If every probability measure on (Ω, ℱ<subs>T</subs>) admits a least favorable measure Q<subs>0</subs> ∈ 𝒬, then γ(A) = sup<subs>Q∈𝒬</subs>Q[A] is submodular.

For finite probability spaces, Theorem 3.4.2.2 is due to Huber and Strassen[46]. In the form stated above, it was proved by Lembcke[54]. An alternative formulation was given by Bednarski[10].

Let us now turn to the discussion of examples. The following example class was first studied by Bednarski[9] under slightly different conditions than here. These examples also play a role in the theory of law-invariant risk measures; see Kusuoka[53] and Sections 4.4 through 4.7 in Ref.[35].

Example 3.4.2.1

The following class of submodular set functions arises in Yaari's[68] "dual theory of choice under risk". Let ψ:[0, 1] → [0, 1] be an increasing concave function with ψ(0) = 0 and ψ(1) = 1. In particular, ψ is continuous on (0,1]. We define γ by

Graph

Then γ is submodular, and the set 𝒬 of all probability measures Q on (Ω, ℱ<subs>T</subs>) with Q[A] ≤ γ(A) can be described in terms of ψ; see Theorem 2.4.6. If (Ω, ℱ<subs>T</subs>) is a standard Borel space, then there exists a compact metric topology on Ω whose Borel field is ℱ<subs>T</subs>. For such a topology, 𝒬 is weakly compact, and so Assumption 1 is satisfied and 𝒬 admits a least favorable measure Q<subs>0</subs>. It can be explicitly determined in the case in which ψ(t) = (tλ<sups>−1</sups>)∧1 for some λ ∈ (0, 1). Indeed, it follows from Example 3.2.1 that the Radon–Nikodym derivative of P* with respect to γ is given by

Graph

where c is the normalizing constant and t<subs>λ</subs> is the unique maximizer of the function t↦ (t − 1 + λ)<sups>+</sups>/Φ(t).
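Submodularity of a distorted probability can be verified by enumeration on a small space. The sketch below assumes that γ has the form γ(A) = ψ(P[A]) for a reference probability P, which is an assumption about the missing displayed definition; the helper names and the four-point measure are illustrative. A convex distortion is included as a counterexample.

```python
from fractions import Fraction
from itertools import combinations


def is_submodular(gamma, n):
    """Check gamma(A | B) + gamma(A & B) <= gamma(A) + gamma(B) for all A, B."""
    sets = [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]
    return all(gamma(A | B) + gamma(A & B) <= gamma(A) + gamma(B)
               for A in sets for B in sets)


def distorted(p, psi):
    """Set function A -> psi(P[A]) for atom weights p (assumed form of gamma)."""
    return lambda A: psi(sum((p[w] for w in A), Fraction(0)))


# Hypothetical reference measure on four points.
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
lam = Fraction(1, 4)


def concave_psi(t):
    """psi(t) = (t / lam) ∧ 1, the concave distortion named in the text."""
    return min(t / lam, Fraction(1))


def convex_psi(t):
    """A convex distortion, for which submodularity fails."""
    return t * t


print(is_submodular(distorted(p, concave_psi), len(p)))
print(is_submodular(distorted(p, convex_psi), len(p)))
```

The check is exhaustive only for the chosen finite example and is meant as an illustration, not as a proof of the general statement.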

Example 3.4.2.2 (Weak Information)

Let Y be a measurable function on (Ω, ℱ<subs>T</subs>), and denote by μ its law under P*. For ν∼μ given, let

Graph

The robust utility maximization problem for this set 𝒬 was studied by Baudoin[8], who coined the terminology "weak information". The interpretation behind the set 𝒬 is that an investor has full knowledge about the pricing measure P* but is uncertain about the true distribution P of market prices and only knows that a certain functional Y of the stock price has distribution ν.

Define Q<subs>0</subs> by

Graph

Then Q<subs>0</subs> ∈ 𝒬 and the law of π ≔ dP*/dQ<subs>0</subs> = dμ/dν(Y) is the same under all Q ∈ 𝒬. Hence, Q<subs>0</subs> satisfies the definition of a least favorable measure. The same procedure can be applied to any measure R ∼ P*. In fact, 𝒬 fits into the framework of the Huber–Strassen theory, as is shown in the following proposition. Least favorable measures for this setting of "weak information" and its generalizations were first analyzed in robust statistics; see Huber[46], Section 10.3 and Plachky and Rüschendorf[55].
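The construction of Q<subs>0</subs> in the weak-information example is easy to reproduce on a finite space. The sketch below assumes that Q<subs>0</subs> is obtained by reweighting P* with the density (dν/dμ)(Y), which is an assumption about the missing displayed definition; the function names and the data are illustrative.

```python
from fractions import Fraction


def law(q, y):
    """Distribution of Y under the measure with atom weights q."""
    out = {}
    for w, value in enumerate(y):
        out[value] = out.get(value, Fraction(0)) + q[w]
    return out


def weak_information_measure(p_star, y, nu):
    """Q0[w] = (nu/mu)(y[w]) * P*[w], with mu the law of Y under P*.

    Assumed form dQ0 = (dnu/dmu)(Y) dP*; requires mu > 0 wherever Y takes values.
    """
    mu = law(p_star, y)
    return [nu[y[w]] / mu[y[w]] * p_star[w] for w in range(len(y))]


# Hypothetical example: four states, Y only distinguishes "down" (0) and "up" (1).
p_star = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 3), Fraction(1, 6)]
y = [0, 0, 1, 1]
nu = {0: Fraction(2, 5), 1: Fraction(3, 5)}

q0 = weak_information_measure(p_star, y, nu)
pi = [p / q for p, q in zip(p_star, q0)]  # dP*/dQ0, a function of Y only

print(q0, law(q0, y), pi)
```

Here Y has law ν under Q<subs>0</subs>, and dP*/dQ<subs>0</subs> depends on the state only through Y, which is why its law is the same under every measure that gives Y the distribution ν.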

Proposition 3.4.2.1

Suppose (Ω, ℱ<subs>T</subs>) is a standard Borel space. Then the set 𝒬 defined in Example 3.4.2.2 satisfies Assumption 1. In particular, γ(A) ≔ sup<subs>Q∈𝒬</subs>Q[A] is submodular.

Proof

If Q is a probability measure with Q[·] ≤ γ(·), then

Graph

Using the same argument on {Y > t} shows that Y has law ν under Q. Hence, 𝒬 is maximal in the sense of part (b) of Assumption 1.

To prove that part (a) holds, we will use Theorem 3.4.2.2. To this end, we may choose a compact metric topology on Ω such that Y is continuous and ℱ<subs>T</subs> is the Borel σ-algebra. Write P* = μ K* ≔ ∫μ(dy)K*(y, ·), where K*(y, ·) = P*[· | Y = y] is a regular conditional probability given Y. If R ≪ P*, then η ≔ R ∘ Y<sups>−1</sups> ≪ ν and R can be written as η K<subs>R</subs>, where K<subs>R</subs> is a stochastic kernel such that K<subs>R</subs>(y, ·) ≪ K*(y, ·) for η-a.e. y. Let ν = ν<subs>a</subs> + ν<subs>s</subs> be the Lebesgue decomposition of ν with respect to η into the absolutely continuous part ν<subs>a</subs> ≪ η and into the singular part ν<subs>s</subs>. If we let Q<subs>0</subs> ≔ ν<subs>a</subs>K<subs>R</subs> + ν<subs>s</subs>K*, then Q<subs>0</subs> ∈ 𝒬 and

Graph

Again, the distribution of π is the same under all Q ∈ 𝒬, and it follows that Q<subs>0</subs> is a least favorable measure. If R ≪̸ P*, then it is clear that any measure Q<subs>0</subs> will be least favorable for R if it is least favorable for the absolutely continuous part of R.

In the 1970s and 1980s, explicit formulas for Radon–Nikodym derivatives with respect to capacities were found in a number of examples such as sets 𝒬 defined in terms of ϵ-contamination or via probability metrics like total variation or Prohorov distance; we refer to Chapter 10 in the book by Huber[46] and the references therein. But, unless Ω is finite, these examples fail to satisfy either implication in (73). Nevertheless, they are still interesting for discrete-time market models.

REFERENCES

1 Acerbi, C.; Tasche, D. On the coherence of expected shortfall. Journal of Banking and Finance 2002, 26 (7), 1487–1503.

2 Anscombe, F.J.; Aumann, R.J. A definition of subjective probability. Ann. Math. Statistics 1963, 34, 199–205.

3 Artzner, P.; Delbaen, F.; Eber, J.-M.; Heath, D. Coherent measures of risk. Math. Finance 1999, 9, 203–228.

4 Artzner, P.; Delbaen, F.; Eber, J.-M.; Heath, D.; Ku, H. Coherent multiperiod risk adjusted values and Bellman's principle. To appear in Annals of Operations Research.

5 Artzner, P.; Delbaen, F.; Eber, J.-M.; Heath, D.; Ku, H. Coherent multiperiod risk measurement. Preprint, ETH Zürich 2003.

6 Artzner, P.; Delbaen, F.; Eber, J.-M.; Heath, D.; Ku, H. Multiperiod risk and coherent multiperiod risk measurement. Preprint, ETH Zürich 2003.

7 Barrieu, P.; El Karoui, N. Inf-convolution of risk measures and optimal risk transfer. Finance Stoch. 2005, 9 (2), 269–298.

8 Baudoin, F. Conditioned stochastic differential equations: theory, examples and application to finance. Stochastic Process. Appl. 2002, 100, 109–145.

9 Bednarski, T. On solutions of minimax test problems for special capacities. Z. Wahrsch. Verw. Gebiete 1981, 58, 397–405.

10 Bednarski, T. Binary experiments, minimax tests and 2-alternating capacities. Ann. Statist. 1982, 10, 226–232.

11 Carlier, G.; Dana, R.A. Core of convex distortions of a probability. J. Econom. Theory 2003, 113 (2), 199–222.

12 Carlier, G.; Dana, R.A. Insurance contracts with deductibles and upper limits. Preprint, Ceremade, Université Paris Dauphine, 2002.

13 Cheridito, P.; Delbaen, F.; Kupper, M. Coherent and convex monetary risk measures for bounded càdlàg processes. Stoch. Proc. Appl. 2004, 112, 1–22.

14 Cheridito, P.; Delbaen, F.; Kupper, M. Coherent and convex monetary risk measures for unbounded càdlàg processes. Finance Stoch. 2005, 9, 1713–1732.

15 Cheridito, P.; Delbaen, F.; Kupper, M. Dynamic monetary risk measures for bounded discrete-time processes. Electronic J. Probab. 2006, 11, 57–106.

16 Choquet, G. Theory of capacities. Ann. Inst. Fourier 1953/54, 5, 131–295.

17 Cont, R. Model uncertainty and its impact on the pricing of derivative instruments. Math. Finance 2006, 16, 519–542.

18 Cvitanic, J.; Karatzas, I. Generalized Neyman-Pearson lemma via convex duality. Bernoulli 2001, 7, 79–97.

19 Dana, R.-A. A representation result for concave Schur concave functions. Math. Finance 2005, 15, 613–634.

20 Delbaen, F. Coherent measures of risk on general probability spaces. In: Advances in Finance and Stochastics; Essays in Honour of Dieter Sondermann; Springer-Verlag, 2002; 1–37.

21 Delbaen, F. Coherent risk measures. Cattedra Galileiana. Scuola Normale Superiore, Classe di Scienze, Pisa, 2000.

22 Delbaen, F. The structure of m-stable sets and in particular of the set of riskneutral measures. To appear in Seminaire Probab. 2006, 39.

23 Denneberg, D. Non-Additive Measure and Integral; Theory Decis. Lib. Ser. B Math. Statist. Meth. 27; Kluwer Academic Publishers: Dordrecht, 1994.

24 Dunford, N.; Schwartz, J. Linear Operators. Part I: General Theory; Interscience Publishers: New York, 1958.

25 Eichhorn, A.; Römisch, W. Polyhedral risk measures in stochastic programming. SIAM J. Optimization 2005, 16 (1), 69–95.

26 Ekeland, I.; Temam, R. Convex Analysis and Variational Problems; Stud. Math. Appl. 1; North-Holland Publishing Co.: Amsterdam–Oxford; American Elsevier Publishing Co., Inc.: New York, 1976.

27 El Karoui, N.; Jeanblanc-Picqué, M.; Shreve, S. Robustness of the Black and Scholes formula. Math. Finance 1998, 8, 93–126.

28 Floret, K. Weakly Compact Sets; Lecture Notes in Math. 801; Springer-Verlag: Berlin, 1980.

29 Föllmer, H. Probabilistic Aspects of Financial Risk. Plenary Lecture at the Third European Congress of Mathematics. Proceedings of the European Congress of Mathematics; Barcelona 2000, Birkhäuser, Basel, 2001.

30 Föllmer, H.; Gundel, A. Robust projections in the class of martingale measures. To appear in Illinois J. Math.

31 Föllmer, H.; Kramkov, D. Optional decompositions under constraints. Probab. Theory Related Fields 1997, 109, 1–25.

32 Föllmer, H.; Leukert, P. Efficient hedging: cost versus shortfall risk. Finance Stoch. 2000, 4, 117–146.

33 Föllmer, H.; Schied, A. Convex measures of risk and trading constraints. Finance Stoch. 2002, 6 (4), 429–447.

34 Föllmer, H.; Schied, A. Stochastic Finance: An Introduction in Discrete Time; de Gruyter Studies in Mathematics 27; Walter de Gruyter & Co.: Berlin, 2002.

35 Föllmer, H.; Schied, A. Stochastic Finance: An Introduction in Discrete Time, 2nd revised and extended edition; de Gruyter Studies in Mathematics 27; Walter de Gruyter & Co.: Berlin, 2004.

36 Föllmer, H.; Schied, A. Robust preferences and convex measures of risk. Advances in Finance and Stochastics; Springer: Berlin, 2002, 39–56.

37 Frittelli, M.; Rosazza Gianin, E. Putting order in risk measures. J. Banking & Finance 2002, 26, 1473–1486.

38 Frittelli, M.; Rosazza Gianin, E. Dynamic convex risk measures. In: New Risk Measures in Investment and Regulation; G. Szegö Ed.; John Wiley & Sons: New York, 2003.

39 Frittelli, M.; Scandolo, G. Risk measures and capital requirements for processes. To appear in Math. Finance.

40 Gilboa, I.; Schmeidler, D. Maxmin expected utility with non-unique prior. J. Math. Econom. 1989, 18, 141–153.

41 Gundel, A. Robust utility maximization in complete and incomplete market models. Finance Stoch. 2005, 9 (2), 151–176.

42 Hernández-Hernández, D.; Schied, A. Robust utility maximization in a stochastic factor model. Statistics & Decisions 2006, 24 (3), 109–125.

43 Heath, D. Back to the future. Plenary lecture, First World Congress of the Bachelier Finance Society, Paris, 2000.

44 Heath, D.; Ku, H. Pareto equilibria with coherent measures of risk. Math. Finance 2004, 14 (2), 163–172.

45 Huber, P. Robust Statistics; Wiley Ser. Probab. Math. Statist.; Wiley: New York, 1981.

46 Huber, P.; Strassen, V. Minimax tests and the Neyman-Pearson lemma for capacities. Ann. Statist. 1973, 1, 251–263.

47 Jaschke, S.; Küchler, U. Coherent risk measures and good-deal bounds. Finance Stoch. 2001, 5 (2), 181–200.

48 Jouini, E.; Schachermayer, W.; Touzi, N. Law invariant risk measures have the Fatou property. Preprint, Université Paris Dauphine 2005.

49 Komlos, J. A generalization of a problem of Steinhaus. Acta Math. Acad. Sci. Hung. 1967, 18, 217–229.

50 Kramkov, D.; Schachermayer, W. The asymptotic elasticity of utility functions and optimal investment in incomplete markets. Ann. Appl. Probab. 1999, 9 (3), 904–950.

51 Kreps, D. Notes on the Theory of Choice; Westview Press: Boulder, CO, 1988.

52 Kunze, M. Verteilungsinvariante konvexe Risikomaße. Diplomarbeit, Humboldt-Universität zu Berlin, 2003.

53 Kusuoka, S. On law invariant coherent risk measures. Adv. Math. Econ. 2001, 3, 83–95.

54 Lembcke, J. The necessity of strongly subadditive capacities for Neyman-Pearson minimax tests. Monatsh. Math. 1988, 105, 113–126.

55 Plachky, D.; Rüschendorf, L. Conservation of the UMP- resp. Maximin-Property of Statistical Tests Under Extensions of Probability Measures. Goodness-of-fit (Debrecen, 1984); Colloq. Math. Soc. Janos Bolyai, 45; North-Holland: Amsterdam, 1987; 439–457.

56 Quenez, M. Optimal portfolio in a multiple-priors model. In Seminar on Stochastic Analysis, Random Fields and Applications IV; Progr. Probab., 58; Birkhäuser: Basel, 2004; 291–321.

57 Riedel, F. Dynamic coherent risk measures. Stochastic Process. Appl. 2004, 112 (2), 185–200.

58 Rüschendorf, L. Stochastically ordered distributions and monotonicity of the OC of an SPRT. Math. Operationsforschung, Series Statistics 1981, 12, 327–338.

59 Ryff, J. Orbits of L1-functions under doubly stochastic transformations. Trans. Amer. Math. Soc. 1965, 117, 92–100.

60 Savage, L.J. The Foundations of Statistics; Wiley Publ. Stat.; John Wiley and Sons: New York, 1954.

61 Schied, A. On the Neyman-Pearson problem for law-invariant risk measures and robust utility functionals. Ann. Appl. Probab. 2004, 14, 1398–1423.

62 Schied, A. Optimal investments for robust utility functionals in complete market models. Math. Oper. Research 2005, 30 (3), 750–764.

63 Schied, A. Optimal investments for risk- and ambiguity-averse preferences: a duality approach. To appear in Finance Stoch.

64 Schied, A.; Wu, C.-T. Duality theory for optimal investments under model uncertainty. Stat. Decisions 2005, 23 (3), 199–217.

65 Schmeidler, D. Subjective probability and expected utility without additivity. Econometrica 1989, 57 (3), 571–587.

66 Sekine, J. Dynamic minimization of worst conditional expectation of shortfall. Math. Finance 2004, 14, 605–618.

67 Weber, S. Distribution-invariant dynamic risk measures. Math. Finance 2006, 16 (2), 419–441.

68 Yaari, M. The dual theory of choice under risk. Econometrica 1987, 55, 95–116.

By Alexander Schied*

Reported by Author