Quantifying the Model Risk Inherent in the Calibration and Recalibration of Option Pricing Models

We focus on two particular aspects of model risk: the inability of a chosen model to fit observed market prices at a given point in time (calibration error) and the model risk due to recalibration of model parameters (in contradiction to the model assumptions). In this context, we follow the approach of Glasserman and Xu (2014) and use relative entropy as a pre-metric in order to quantify these two sources of model risk in a common framework, and consider the trade-offs between them when choosing a model and the frequency with which to recalibrate to the market. We illustrate this approach applied to the models of Black and Scholes (1973) and Heston (1993), using option data for Apple (AAPL) and Google (GOOG). We find that recalibrating a model more frequently simply shifts model risk from one type to another, without any substantial reduction of aggregate model risk. Furthermore, moving to a more complicated stochastic model is seen to be counterproductive if one requires a high degree of robustness, for example as quantified by a 99 percent quantile of aggregate model risk.


INTRODUCTION
The renowned statistician George E. P. Box wrote that "essentially, all models are wrong, but some are useful." 1 This is certainly true in finance, where many models and techniques that have been extensively empirically invalidated remain in widespread use, not just in academia, but also (perhaps especially) among practitioners. At times, the way models are used directly contradicts the model assumptions: as observed market prices change, parameters in option pricing models, which are assumed to be time-invariant, are recalibrated, often on a daily basis. Incorrect models, and model misuse, represent a source of risk that is increasingly being recognised - this is called "model risk." As a paper by the Board of Governors of the Federal Reserve System put it in 2011, 2 "The use of models invariably presents model risk, which is the potential for adverse consequences from decisions based on incorrect or misused model outputs and reports." In broad terms, one could identify four general classes of model risk inherent in the way mathematical models are used in finance, for example in (but not limited to) option pricing applications:
• Parameter uncertainty (and sensitivity to parameters) - let's call this "Type 0" model risk for short. If model parameters need to be statistically estimated, they will only be known up to some level of statistical confidence, and this parameter uncertainty induces uncertainty about the correctness of the model outputs. 3
• Inability to fit a model to a full set of simultaneous market observations - this is "calibration error," let's call it "Type 1" model risk for short. To the extent that a model cannot match observed prices on a given day, single-day (a.k.a. "cross-sectional") market data already contradicts the model assumptions. The classical example of this is the Black/Scholes implied volatility smile.
Date. This version: October 19, 2018. Part of the initial research for this paper was conducted at the Financial Mathematics Team Challenge 2016 at the University of Cape Town. We would like to thank Sam Cohen for helpful comments on earlier drafts of this paper. The usual disclaimers apply.
1 See Box and Draper (1987).
3 Examples of where this type of risk is considered explicitly in the literature include Löffler (2003), Bannör and Scherer (2013) and Kerkhof, Melenberg and Schumacher (2010).
• Change in parameters due to recalibration - let's call this "Type 2" model risk for short. Once one moves from one day to the next, this aspect of model risk becomes apparent: in order to again fit the market as closely as possible, it is common practice in the industry to recalibrate models. This recalibration results in model parameters (which the models assume to be fixed) changing from day to day, contradicting the model assumptions.
• The "true" dynamics of the state variables do not match the model dynamics 4 - let's call this violation of model assumptions "Type 3" model risk. 5 The classical example of this is the econometric rejection of the hypothesis that asset prices follow geometric Brownian motion, thus invalidating the key assumption in the seminal model of Black and Scholes (1973). This type of model risk would impact in particular the effectiveness of hedging strategies based on a model. 6
Note that there is a gradual transition between the different types of model risk, and, depending on one's modelling choices, one can to a certain extent trade off one type of model risk against another. For example:
• Less stringent requirements of an exact fit to market observations (Type 1) allow less frequent recalibration (Type 2).
• Instead of different model dynamics (Type 3), one could consider a parameterised family of models (Type 2).
• Regime-switching models "legalise" changes in parameters, so Type 2 becomes more like Type 3.
• Adding parameters shifts model risk from Type 1 to Type 2 (or, to a certain extent, to Type 0).
• Adding state variables shifts model risk from Type 2 to Type 3.
Glasserman and Xu (2014) propose relative entropy as a consistent pre-metric by which to measure model risk from different sources. 7 What matters in the application of mathematical models in finance is the probability distributions which the models imply, 8 either under a "risk-neutral" probability measure (for applications to the relative pricing of financial instruments) or the "physical" (a.k.a.
"real-world") probability measure (for risk management applications such as the calculation of expected shortfall). Each type of model risk manifests itself as some form of ambiguity about the "true" probability measures which should be used for these purposes, and being able to quantify different types of model risk in a unified setting, using a pre-metric for the divergence between distributions (like relative entropy), allows one to make an informed choice about the trade-offs between different sources of model risk. Glasserman and Xu (2014) postulate a "relative entropy budget" defining a set of models sufficiently close (in the sense of relative entropy) to a nominal reference model to be considered in an evaluation of model risk expressed as a "worst case" expectation - i.e., a worst-case price or a worst-case risk measure. However, they say little as to how one would typically obtain a specific number for this "relative entropy budget". In a sense, we invert this problem by noting that higher relative entropy between model distributions indicates higher model risk, and propose a method to jointly evaluate model risk of two types, based on how this model risk manifests itself when option pricing models are calibrated and recalibrated to liquid market instruments.
4 This type of model risk is considered for example in Kerkhof et al. (2010), who also relate it to identification risk, which they define as risk which "arises when observationally indistinguishable models have different consequences for capital reserves."
5 Boucher, Danielsson, Kouontchou and Maillet (2014) present a method for making value-at-risk more robust with respect to this source of model risk by "learning" from the results of model backtesting.
6 Detering and Packham (2016) take the approach of measuring model risk based on the residual profit/loss from hedging in a misspecified model.
7 Instead of using a relative entropy pre-metric, one could approach quantifying model risk in terms of optimal-transport distance, using for example the Wasserstein distance, which has recently become popular for this purpose (see Bartl, Drapeau and Tangpi (2018), Blanchet, Chen and Zhou (2018) and Feng and Schlögl (2018)). In the present paper, we follow the more established approach using relative entropy, which has its roots in the seminal work of Hansen and Sargent (see e.g. Hansen and Sargent (2006)).
We focus on the model risk inherent in the calibration and recalibration (i.e., in the above terminology, Types 1 and 2) of option pricing models, and to illustrate our approach we consider the models of Black and Scholes (1973) and Heston (1993), thus comparing the most classical option pricing model with its popular extension incorporating stochastic volatility. Clearly, if (as is often the case in practice) one focuses solely on calibration error, Heston (1993) will always be preferred to Black and Scholes (1973), and more frequent recalibration preferred to less. We quantify calibration and recalibration risk in both models applied to equity option data, and also explore the trade-off between these two types of model risk, finding that there is no longer a trivial answer to the question of which model, and which recalibration frequency, should be preferred when these two sources of model risk are considered in a unified framework.
The rest of the paper is organised as follows. Section 2 introduces a framework for the joint evaluation of model risk due to calibration error and due to model recalibration. The numerical implementation of the method is discussed in Section 3. Section 4 presents the results obtained by applying this method to option price data, and Section 5 concludes.

CALIBRATION ERROR, MODEL RISK DUE TO RECALIBRATION, AND TREATMENT OF LATENT STATE VARIABLES
As noted above, model risk is reflected in the ambiguity with regard to the "correct" probability distribution to use for relative pricing or risk assessment. Following Glasserman and Xu (2014), we quantify this ambiguity using the divergence between probability measures. In the present context, these can be classified as divergence measures defined as a function
$$D : S \times S \to [0, \infty], \qquad (1)$$
where $S$ is a space of all probability measures with a common support. More specifically, most divergence measures belong to the class of $f$-divergences, which give the divergence between two equivalent measures $Q$ and $P$ as 9
$$D_f(Q \| P) = E^P\!\left[ f\!\left( \frac{dQ}{dP} \right) \right], \qquad (2)$$
where $f$ is a convex function of the Radon-Nikodym derivative satisfying $f(1) = 0$. The Kullback-Leibler divergence (a.k.a. relative entropy) is the most common $f$-divergence, obtained by setting $f(m) = m \ln m$, i.e.
$$D(Q \| P) = E^P\!\left[ \frac{dQ}{dP} \ln \frac{dQ}{dP} \right]. \qquad (3)$$
In principle, the methodology of this paper applies to all such statistical distances, though in the empirical study the Kullback-Leibler divergence is adopted due to its simplicity and widespread use.
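For concreteness, once two discrete distributions on a common support are given, the $f$-divergence with $f(m) = m \ln m$ can be computed directly. The following minimal sketch (plain NumPy; the function names are ours, not the paper's) illustrates this:

```python
import numpy as np

def f_divergence(q, p, f):
    """D_f(Q||P) = E^P[f(dQ/dP)] for discrete distributions q, p on a common support."""
    q, p = np.asarray(q, float), np.asarray(p, float)
    m = q / p                       # Radon-Nikodym derivative dQ/dP, state by state
    return float(np.sum(p * f(m)))

def kl(q, p):
    """Relative entropy: the f-divergence with f(m) = m ln m."""
    return f_divergence(q, p, lambda m: m * np.log(m))

q = np.array([0.2, 0.5, 0.3])
p = np.array([0.25, 0.25, 0.5])
print(kl(q, p))   # strictly positive, since q differs from p
print(kl(q, q))   # zero: f(1) = 0 guarantees zero divergence for identical measures
```

The requirement $f(1) = 0$ is what makes the divergence vanish exactly when the two measures coincide, as the second call demonstrates.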
If we wish to quantify calibration error (Type 1 model risk) in this fashion, then in equations (1)-(3), the probability measure $P$ corresponds to the calibrated model and thus is parametric in some form. The probability measure $Q$, on the other hand, serves as a reference measure exactly matching observed market prices at a given point in time, unrestricted by the assumptions of the model under consideration. On calibrating an option pricing model, we may regard the measure $Q$ as some non-parametric risk-neutral measure that explains the market in full assuming absence of arbitrage. In practice, however, the measure $Q$ is not unique, as the market is usually incomplete. We therefore define the space of all probability measures that explain the market in full by $S_Q \subset S$.
We may further define the space of probability measures given by all possible choices of parameter values for the target model by $S_P \subset S$. The new calibration methodology proposed here aims to minimise the calibration error as quantified by the divergence between the two measures $P$ and $Q$, taken from their respective spaces, i.e.
$$(Q^*, P^*) = \arg\min_{Q \in S_Q,\, P \in S_P} D(Q \| P). \qquad (4)$$
That is, the new approach attempts to calibrate a model measure $P^*$ (i.e., a set of model parameters $\theta^*$) and a non-parametric perfect fit to the market (at a given point in time) $Q^*$, in a fashion which minimises the calibration error expressed by
$$\eta_1 = D(Q^* \| P^*).$$
This is not an end in itself - it is required in order to compare model risk due to calibration error and model risk due to recalibration (as specified below) in a unified framework.
9 See e.g. Ali and Silvey (1966), Csiszár (1967) or Ahmadi-Javid (2012).
The classical approaches of model calibration, such as minimising the mean-squared error between model and market prices for options, would be inappropriate in this context, as they would lead to unnecessarily high model risk quantities. It is the choice of divergence measure which informs the calibration procedure, resulting in a pair of probability measures, (Q * , P * ), one of which corresponds to the calibrated model while the other provides a consistent reference measure fitting the market exactly.
To quantify the model risk due to recalibration, let us consider the more specific case where the model is Markovian in a vector of observable state variables $X$, the model is characterised by a vector of model parameters $\theta$, and market prices are given for European options of a single maturity $T$. 10 Suppose we solved (4) yesterday (at time $t_{i-1}$) to obtain a $P^*$ - to be as explicit as possible, denote this by
$$P_{t_{i-1}, X_{t_{i-1}}(\omega), \theta^*_{t_{i-1}}}.$$
I.e., this is a (conditional) probability measure defined on all $\mathcal{F}_T$-measurable events, where the conditioning is on the state variables at time $t_{i-1}$, $X_{t_{i-1}}$, and we write $X_{t_{i-1}}(\omega)$ to express that the time $t_{i-1}$ realisations of the state variables are known at the time that these probabilities are evaluated. We write the subscript $\theta^*_{t_{i-1}}$ to express that these probabilities are evaluated in a model with parameters calibrated by solving (4) at time $t_{i-1}$. Furthermore, denote the non-parametric measure $Q^*$ resulting from solving (4) at time $t_{i-1}$ by $Q_{t_{i-1}}$. Now, if we recalibrate today (at time $t_i$) by solving (4), we obtain $Q_{t_i}$ and
$$P_{t_i, X_{t_i}(\omega), \theta^*_{t_i}}. \qquad (6)$$
We can then define the model risk quantity due to recalibration as
$$\eta_2 = D\big(P_{t_i, X_{t_i}(\omega), \theta^*_{t_i}} \,\big\|\, P_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}\big), \qquad (8)$$
which is the divergence between the (conditional) probability measures evaluated at time $t_i$, where one measure is based on the recalibrated parameters $\theta^*_{t_i}$ and the other is based on the previously calibrated parameters $\theta^*_{t_{i-1}}$ (thus expressing, in terms of divergence, the inconsistency with the model assumptions due to the fact that we are going "outside of the model" to change parameters in recalibration). The aggregate of calibration error and model risk due to recalibration is then
$$\eta_3 = D\big(Q_{t_i} \,\big\|\, P_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}\big),$$
i.e., the divergence between the non-parametric probability measure $Q_{t_i}$ obtained by solving (4) at time $t_i$, and the non-recalibrated parametric probability measure, consisting of probabilities conditional on the state at time $t_i$, but based on model parameters obtained by solving (4) at time $t_{i-1}$.
However, this approach minimises the divergence between the reference distribution and the recalibrated distribution, thus arguably overstating the divergence to the non-recalibrated (i.e. model-consistent) distribution, and therefore overstating the aggregate model risk $\eta_3$.
10 This last assumption of a single maturity $T$ avoids the need to constrain the choice of $Q$ to ensure the absence of calendar-spread arbitrage between non-parametric risk-neutral measures for different time horizons - parametric models typically ensure this by construction. If we appropriately constrain $Q$, this assumption can be lifted.
Alternatively, we may choose as the non-parametric reference distribution at time $t_i$
$$\tilde{Q}_{t_i} = \arg\min_{Q \in S_{Q_{t_i}}} D\big(Q \,\big\|\, P_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}\big), \qquad (10)$$
resulting in a lower aggregate model risk of
$$\tilde{\eta}_3 = D\big(\tilde{Q}_{t_i} \,\big\|\, P_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}\big).$$
Note that $\theta^*_{t_i}$ is still obtained by solving (4), because both $Q_{t_i}$ and $\tilde{Q}_{t_i}$ represent non-parametric probability measures fitting observed market prices exactly, so $\theta^*_{t_i}$ remains the best available parametric fit to the market at time $t_i$ ($\tilde{Q}_{t_i}$ is only used to determine the minimum divergence of the non-recalibrated model from a measure giving a perfect fit).
In the heuristic schematic of Figure 1, the recalibrated measure $P_{t_i, X_{t_i}(\omega), \theta^*_{t_i}}$ is the parametric probability measure "closest" to the set of non-parametric probability measures fitting the market exactly, where point C represents $Q_{t_i}$. If we do not recalibrate at time $t_i$, we end up with the parametric probability measure $P_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}$, for which $\tilde{Q}_{t_i}$ is the "closest" non-parametric probability measure fitting the market exactly.
In the case of the Kullback-Leibler divergence, note that if the Type 1 (calibration error) and Type 2 (recalibration) model risks involve independent Radon-Nikodym derivatives, then, in the first case considered above, the aggregate model risk equals the sum of the two components. In fact, the Radon-Nikodym derivatives, as random variables, play the key role in evaluating the two types of model risk. At the time the model is recalibrated, we again consider the optimisation (4), with $S_Q$ now changed to reflect the change in observed market prices, so we have the following Radon-Nikodym derivatives:
For calibration error:
$$m_1 = \frac{dQ_{t_i}}{dP_{t_i, X_{t_i}(\omega), \theta^*_{t_i}}}. \qquad (12)$$
For model risk due to recalibration:
$$m_2 = \frac{dP_{t_i, X_{t_i}(\omega), \theta^*_{t_i}}}{dP_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}}. \qquad (13)$$
For aggregate model risk:
$$m_1 m_2 = \frac{dQ_{t_i}}{dP_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}}. \qquad (14)$$
Abbreviating $P_{t_i, X_{t_i}(\omega), \theta^*_{t_{i-1}}}$ as $P$ and $P_{t_i, X_{t_i}(\omega), \theta^*_{t_i}}$ as $P^*$, the aggregate risk can be expressed in terms of $m_1$ and $m_2$ as
$$\eta_3 = E^P\big[m_1 m_2 \ln(m_1 m_2)\big] = E^P\big[m_2\, m_1 \ln m_1\big] + E^P\big[m_1\, m_2 \ln m_2\big]. \qquad (15)$$
If $m_1$ and $m_2$ are independent, then $\operatorname{cov}_P(m_2, m_1) = \operatorname{cov}_P(m_2, m_1 \ln m_1) = 0$, and the total model risk is equal to the sum of the calibration risk and the recalibration risk. Surprisingly, in our empirical exploration below we found that this equality holds quite closely for the Black-Scholes model. However, it typically does not hold in the Heston model, suggesting substantial dependence (of the Radon-Nikodym derivatives) between the calibration error and the model risk due to recalibration.
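The additivity under independence can be checked numerically. In the sketch below (our construction, not taken from the paper), the two Radon-Nikodym derivatives act on separate coordinates of a product space, so they are independent under $P$ and the residual $\eta_3 - \eta_1 - \eta_2$ vanishes:

```python
import numpy as np

def kl(q, p):
    q, p = np.asarray(q, float).ravel(), np.asarray(p, float).ravel()
    return float(np.sum(q * np.log(q / p)))

# Product-space construction: the recalibration re-weights the first
# coordinate only, the calibration error re-weights the second only.
pA, pB = np.array([0.5, 0.5]), np.array([0.3, 0.7])
psA    = np.array([0.6, 0.4])   # recalibrated marginal on coordinate A
qB     = np.array([0.5, 0.5])   # market-fitting marginal on coordinate B

P     = np.outer(pA,  pB)       # non-recalibrated model measure
Pstar = np.outer(psA, pB)       # recalibrated model measure
Q     = np.outer(psA, qB)       # non-parametric measure fitting the market

eta1 = kl(Q, Pstar)             # calibration error
eta2 = kl(Pstar, P)             # model risk due to recalibration
eta3 = kl(Q, P)                 # aggregate model risk
print(eta3 - (eta1 + eta2))     # residual: zero under independence
```

Replacing the outer products with a joint distribution that couples the two coordinates produces a non-zero residual, which is the behaviour the paper reports for the Heston model.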
We also consider models which involve one or more latent state variables. An example is the class of stochastic volatility models, where the volatility is taken as a latent state variable rather than as a model parameter (in the empirical examples below, we specifically consider the model of Heston (1993), which falls into this category). With a single stochastic volatility state variable, a model specified by a given set of parameters forms a one-dimensional manifold (Fig. 1(b)) of possible realisations of the state variable, rather than a point as in the Black-Scholes world (Fig. 1(a)). 11 Thus, the model which we are now considering is Markovian in a vector of state variables $(X, V)$, where the state variables $X$ are observable and the state variables $V$ are latent (unobservable). Then, the initial calibration problem (4) becomes
$$(Q^*, v^*, \theta^*) = \arg\min_{Q \in S_Q,\, v \in S_v,\, \theta \in S_\theta} D\big(Q \,\big\|\, P_{v, \theta}\big), \qquad (18)$$
where $S_v$ and $S_\theta$ are the sets of legitimate values of the state variables and the parameters, respectively, $\theta^*$ is the set of model parameters calibrated to the market, and $v^*$ is the best estimate of the latent state variables under the calibrated model. 12 The notation in (6) is amended to
$$P_{t_i, X_{t_i}(\omega), v^*_{t_i}, \theta^*_{t_i}}.$$
At time $t_i$, we have for the calibration error
$$\eta_1 = D\big(Q_{t_i} \,\big\|\, P_{t_i, X_{t_i}(\omega), v^*_{t_i}, \theta^*_{t_i}}\big).$$
The model risk due to recalibration is
$$\eta_2 = D\big(P_{t_i, X_{t_i}(\omega), v^*_{t_i}, \theta^*_{t_i}} \,\big\|\, P_{t_i, X_{t_i}(\omega), \hat{v}_{t_i}, \theta^*_{t_{i-1}}}\big),$$
where $\hat{v}_{t_i}$ is the best estimate of the latent state variables at time $t_i$ under the previously calibrated parameters $\theta^*_{t_{i-1}}$. The aggregate model risk, using $Q_{t_i}$, is
$$\eta_3 = D\big(Q_{t_i} \,\big\|\, P_{t_i, X_{t_i}(\omega), \hat{v}_{t_i}, \theta^*_{t_{i-1}}}\big),$$
or alternatively, using $\tilde{Q}_{t_i}$ and $\tilde{v}_{t_i}$ determined analogously to (10), i.e.,
$$(\tilde{Q}_{t_i}, \tilde{v}_{t_i}) = \arg\min_{Q \in S_{Q_{t_i}},\, v \in S_v} D\big(Q \,\big\|\, P_{t_i, X_{t_i}(\omega), v, \theta^*_{t_{i-1}}}\big),$$
which results in
$$\tilde{\eta}_3 = D\big(\tilde{Q}_{t_i} \,\big\|\, P_{t_i, X_{t_i}(\omega), \tilde{v}_{t_i}, \theta^*_{t_{i-1}}}\big).$$
We then have the following Radon-Nikodym derivatives:
For calibration error:
$$m_1 = \frac{dQ_{t_i}}{dP_{t_i, X_{t_i}(\omega), v^*_{t_i}, \theta^*_{t_i}}}. \qquad (25)$$
For model risk due to recalibration:
$$m_2 = \frac{dP_{t_i, X_{t_i}(\omega), v^*_{t_i}, \theta^*_{t_i}}}{dP_{t_i, X_{t_i}(\omega), \hat{v}_{t_i}, \theta^*_{t_{i-1}}}}. \qquad (26)$$
For aggregate model risk:
$$m_1 m_2 = \frac{dQ_{t_i}}{dP_{t_i, X_{t_i}(\omega), \hat{v}_{t_i}, \theta^*_{t_{i-1}}}}. \qquad (27)$$
Note that the key difference between (12)-(14) and (25)-(27) is that the change in $v$, being permitted by the model assumptions, does not contribute to the model risk quantities. In (4) and (18), we deliberately prioritise the minimisation of calibration error, as this is congruent with the (often exclusive) focus of practitioners on calibration error (with little or no regard to model risk due to recalibration). If desired, one could reformulate this approach to prioritise the minimisation of aggregate model risk, or of model risk due to recalibration.
11 Note that these graphs are for the purpose of heuristic illustration only - in particular, we are not requiring that the two sets of probability measures be convex.
12 This effectively treats the latent (unobserved) state variable as an additional parameter to be calibrated, but one whose recalibration does not contribute to (Type 2) model risk due to recalibration, because it is consistent with the model assumptions for this latent state variable to evolve stochastically. This does shift Type 2 model risk to Type 3, the risk that the state variable dynamics are not (econometrically) consistent with the dynamics assumed in the model. However, in the present paper we deliberately set aside Type 3 model risk for the purposes of our analysis, leaving the integration of all four types of model risk for future research.

NUMERICAL IMPLEMENTATION
In this section, we outline the numerical scheme for solving the minimisation problems arising when taking into account calibration error and model risk due to recalibration in the manner described in the previous section, including problems of the type (4) involving the optimal choice of two probability measures. In this case, an iterative process is required, optimising the two probability measures $Q$ and $P$ in turn until convergence, in the following manner:
1): Produce $P^{(0)}$ from a parametric model based on an initial guess of the model parameters (and latent state variables, where required).
2): Solve for $Q^{(0)}$ via Lagrange multipliers for the constrained problem that minimises $D(Q^{(0)} \| P^{(0)})$.
3): Solve for $P^{(1)}$ to obtain model parameters for the $P^{(1)}$ that minimises $D(Q^{(0)} \| P^{(1)})$.
4): Iterate steps 2 and 3: $P^{(n)} \to Q^{(n)} \to P^{(n+1)}$, until convergence.
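On a discretised state space, the iteration above can be sketched as follows. This is a toy reconstruction under simplifying assumptions (exact market prices rather than bid-ask bounds, a one-parameter lognormal model family, and synthetic "market" prices generated at a volatility of 0.25); none of the numbers come from the paper. Step 2 uses the exponential-tilting form of the inner solution, and Step 3 minimises the divergence over the model parameter:

```python
import numpy as np
from scipy.optimize import minimize, root

S = np.linspace(50.0, 200.0, 400)                   # terminal price grid

def model_density(sigma, S0=100.0, r=0.0, T=1.0):
    """Lognormal (Black/Scholes-type) density on the grid, renormalised."""
    mu = np.log(S0) + (r - 0.5 * sigma**2) * T
    p = np.exp(-(np.log(S) - mu)**2 / (2 * sigma**2 * T)) / S
    return p / p.sum()

strikes = np.array([90.0, 100.0, 110.0])
Z = np.maximum(S[None, :] - strikes[:, None], 0.0)  # call payoffs (r = 0)
C = Z @ model_density(0.25)                         # synthetic "market" prices

def kl(q, p):
    return float(np.sum(q * np.log(q / p)))

def q_tilt(lam, p):
    """Step 2 inner solution: q proportional to p * exp(lam . Z)."""
    a = lam @ Z
    w = p * np.exp(a - a.max())                     # stabilised exponent
    return w / w.sum()

theta = 0.4                                         # Step 1: initial guess
for _ in range(50):
    p = model_density(theta)
    lam = root(lambda l: Z @ q_tilt(l, p) - C, np.zeros(len(C))).x
    q = q_tilt(lam, p)                              # Step 2: fit market prices
    theta_new = minimize(lambda t: kl(q, model_density(t[0])),
                         [theta], bounds=[(0.05, 1.0)]).x[0]   # Step 3
    converged = abs(theta_new - theta) / theta < 1e-3          # Step 4
    theta = theta_new
    if converged:
        break
print(theta)   # recovers a volatility close to the 0.25 used to generate prices
```

Because the "market" prices here are generated within the model family itself, the iteration drives the calibration error to (numerically) zero; with real market data it converges instead to the minimal achievable divergence.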

In Step 1, the initial guess may be obtained in several different ways. A common way is to minimise the mean-squared error between model and market option prices at all available strikes. We opted for the Broyden/Fletcher/Goldfarb/Shanno (BFGS) algorithm for conducting this initial calibration of the model parameters and (where required) latent state variables.
In Step 2, we solve the following constrained minimisation problem using Lagrange multipliers:
$$\min_{Q} D(Q \| P^{(0)}) \quad \text{subject to} \quad B \le E^Q[Z] \le A. \qquad (29)$$
Note that here we specify the constraints in the form of expectations under the measure $Q$, where these expectations are the model prices of our calibration instruments under the non-parametric reference distribution $Q$. In general, $B$, $Z$ and $A$ are vectors; thus (29) is a "stack" of inequality constraints representing observed market prices. Notice also that, for generality, we "relax" each equality constraint into two inequality constraints, in order to account for the bid-ask spread of each option traded in the market: the vector $B$ denotes the bid prices, the vector $A$ the ask prices, and $Z$ the vector of discounted option payoffs. In a simplified scenario where exact option prices are given, we may set $B = A$. By introducing vectors of Lagrange multipliers $\lambda_B$ and $\lambda_A$, we convert the constrained problem into an unconstrained dual problem,
$$\max_{\lambda_B \ge 0,\, \lambda_A \ge 0}\; \min_{Q}\; \Big\{ D(Q \| P^{(0)}) + \lambda_B^{\top}\big(B - E^Q[Z]\big) + \lambda_A^{\top}\big(E^Q[Z] - A\big) \Big\}. \qquad (30)$$
In the case of the Kullback-Leibler divergence, solving the inner problem gives the probability density function $q^{(0)}$ of $Q^{(0)}$ in terms of the density $p^{(0)}$ of $P^{(0)}$,
$$q^{(0)}(x) = \frac{p^{(0)}(x)\, e^{(\lambda_B - \lambda_A)^{\top} Z(x)}}{\int p^{(0)}(s)\, e^{(\lambda_B - \lambda_A)^{\top} Z(s)}\, ds}. \qquad (31)$$
Substituting (31) into (30), we get a maximisation problem,
$$\max_{\lambda_B \ge 0,\, \lambda_A \ge 0}\; \lambda_B^{\top} B - \lambda_A^{\top} A - \ln \int p^{(0)}(s)\, e^{(\lambda_B - \lambda_A)^{\top} Z(s)}\, ds. \qquad (32)$$
If $B = A$, then the bid-ask terms collapse, representing the problem with exact market prices. If (component-wise) $A > B$, then the objective acquires a penalty term proportional to the bid-ask spread.
We may therefore transform the Lagrange multipliers by
$$\lambda = \lambda_B - \lambda_A, \qquad \lambda_+ = \lambda_B + \lambda_A,$$
and, noting that at the optimum at most one of $\lambda_B$ and $\lambda_A$ is non-zero in each component (so that $\lambda_+ = |\lambda|$), the objective function becomes
$$\max_{\lambda}\; \lambda^{\top} \frac{A + B}{2} - |\lambda|^{\top} \frac{A - B}{2} - \ln \int p^{(0)}(s)\, e^{\lambda^{\top} Z(s)}\, ds.$$
We may numerically solve the maximisation problem by setting its gradient with respect to $\lambda$ to zero,
$$\frac{A + B}{2} - \operatorname{sgn}(\lambda) \circ \frac{A - B}{2} - E^{Q^{(0)}}[Z] = 0, \qquad (37)$$
where the element-wise sign function $\operatorname{sgn}(\lambda)$ assigns 1, -1 or 0 to each element of $\lambda$. However, due to the discontinuity of the sign function, (37) cannot be solved directly in a stable way. To bypass this problem, we approximate the sign function with a continuous step function whose steepness is controlled by a parameter $\delta$; choosing this value carefully is critical for fast and stable convergence. We use Powell's hybrid method to solve the multidimensional system of equations (37). In Step 3, we use the L-BFGS-B algorithm to minimise the divergence with respect to the model parameters (or the latent state variables, or both).
Step 2 and Step 3 are repeated until convergence. The convergence criterion adopted here is that all the percentage changes of parameters after one iteration do not exceed a certain threshold, say 0.1%.
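The effect of smoothing the sign function can be illustrated with a small experiment. The specific surrogate below, $\tanh(x/\delta)$, is our assumption (the paper's exact step function is not reproduced in this extract); it converges to $\operatorname{sgn}(x)$ as $\delta \to 0$, and the resulting smooth system is solved without difficulty by Powell's hybrid method:

```python
import numpy as np
from scipy.optimize import root

def sgn_smooth(x, delta=1e-2):
    """Continuous surrogate for sgn: tanh(x/delta) -> sgn(x) as delta -> 0."""
    return np.tanh(x / delta)

# Toy gradient system g(lam) = 0 containing a |lam|-type penalty term,
# mimicking the structure of the bid-ask-spread-penalised optimality condition.
target = np.array([0.3, -0.7])
def g(lam, delta=1e-2):
    return (lam - target) + 0.05 * sgn_smooth(lam, delta)

sol = root(g, x0=np.zeros(2), method='hybr')   # Powell's hybrid method
print(sol.success, sol.x)                      # roots near target, shifted by the penalty
```

With $\operatorname{sgn}$ itself in place of the surrogate, the residual jumps at the origin and derivative-based root finders can stall; the smooth version trades a small bias (of order $\delta$) for stability.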

EMPIRICAL RESULTS: CALIBRATION ERROR AND MODEL RISK DUE TO RECALIBRATION
As an application example of the method described in the previous two sections, we consider historical data consisting of daily market prices for call options on AAPL and GOOG stock over a period from 6 January 2004 to 19 December 2008 for AAPL and 4 January 2005 to 19 December 2008 for GOOG. This gives us a reasonably straightforward application example free of extraneous complications, 13 while still covering reasonably liquid options and including a period of "interesting" market volatility (2007/8). From this data, we remove options very far away from the money, restricting the range of strikes from delta 2.5% to delta 97.5%. Furthermore, we remove prices of options which had zero trading volume on a given day, in order to avoid using prices which are likely to be stale.
On this data we consider two parametric models, Black and Scholes (1973) and Heston (1993) -arguably the two most popular option pricing models available, where the latter introduces a latent variable for stochastic volatility. The unified methodology, quantifying calibration error, model risk due to recalibration, and the aggregate of the two, allows us to explore the trade-off between calibration error (which is, unsurprisingly, reduced by moving from Black and Scholes (1973) to Heston (1993)) and model risk due to recalibration (which has hitherto been largely ignored) when moving from one parametric model to another as well as when changing the frequency with which the model is recalibrated.
We start by evaluating the calibration, recalibration and aggregate model risks under a Black/Scholes model, i.e. where the underlying asset price is assumed to follow geometric Brownian motion, with dynamics under the risk-neutral measure given by
$$dS(t) = r S(t)\, dt + \sigma S(t)\, dW(t), \qquad (39)$$
where $r$ is the continuously compounded risk-free rate of interest and $\sigma$ is a constant volatility parameter. We note that in the Black/Scholes model we obtain a simple closed-form expression for the recalibration risk defined in (8),
$$\eta_2 = \ln\frac{\sigma}{\sigma^*} + \frac{(\sigma^*)^2}{2\sigma^2} - \frac{1}{2} + \frac{\tau \big((\sigma^*)^2 - \sigma^2\big)^2}{8 \sigma^2},$$
where $\sigma^*$ is the correctly recalibrated Black/Scholes volatility parameter, $\sigma$ is the parameter value obtained in a previous calibration, and $\tau$ is the time to maturity. This formula is a consequence of the log-normal distribution of returns assumed in the Black/Scholes model.
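Because the closed-form recalibration entropy follows from the relative entropy between two normal log-return distributions, it can be cross-checked by quadrature. The sketch below does exactly that (the implementation and the illustrative parameter values are ours):

```python
import numpy as np

def bs_recal_entropy(sig_new, sig_old, tau):
    """Relative entropy D(P_new || P_old) between the log-return distributions
    implied by two Black/Scholes volatilities over horizon tau
    (reconstructed from the KL divergence between two normal laws)."""
    return (np.log(sig_old / sig_new) + sig_new**2 / (2 * sig_old**2) - 0.5
            + tau * (sig_new**2 - sig_old**2)**2 / (8 * sig_old**2))

# Numerical cross-check: integrate p_new * ln(p_new / p_old) on a fine grid.
sig_new, sig_old, tau, r = 0.22, 0.25, 0.5, 0.02
x = np.linspace(-3.0, 3.0, 200001)
dx = x[1] - x[0]

def pdf(sig):
    mu = (r - 0.5 * sig**2) * tau                  # Black/Scholes log-return mean
    return (np.exp(-(x - mu)**2 / (2 * sig**2 * tau))
            / np.sqrt(2 * np.pi * sig**2 * tau))

p_new, p_old = pdf(sig_new), pdf(sig_old)
kl_num = np.sum(p_new * np.log(p_new / p_old)) * dx
print(bs_recal_entropy(sig_new, sig_old, tau), kl_num)  # the two agree closely
```

Note that the divergence is zero exactly when the two volatilities coincide, and grows roughly quadratically in the recalibration move for small changes.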
We can express the aggregate model risk as the sum of the calibration error, the recalibration risk and a residual. As noted in equation (15), the residual is zero if the likelihood ratios involved in the calibration and recalibration risks are two independent random variables. In practice, the residual usually takes a small non-zero value. In Figure 2 we demonstrate the decomposition of the total model risk into the three components. 14 Unsurprisingly (as it is well documented that the Black/Scholes model cannot fit the implied volatility "smile" observed in most options markets), we see that calibration error typically predominates.
In the Heston model, the dynamics (39) are extended to allow for stochastic volatility, i.e.
$$dS(t) = r S(t)\, dt + \sigma(t) S(t)\, dW_1(t),$$
$$d\sigma^2(t) = \kappa \big(\theta - \sigma^2(t)\big)\, dt + \eta\, \sigma(t)\, dW_2(t).$$
This model involves two state variables, the underlying asset price $S(t)$ and the volatility $\sigma(t)$, and five model parameters $r, \kappa, \theta, \eta, \rho$, where $\rho$ is the correlation coefficient between the two Wiener processes,
$$d\langle W_1, W_2 \rangle_t = \rho\, dt,$$
$r$ is the risk-free rate, 15 and $\kappa$, $\theta$ and $\eta$ relate to the variance process, $\kappa$ being its rate of mean reversion, $\theta$ its long-run mean and $\eta$ its volatility.
13 Although these options are of the American type, i.e. permitting early exercise, AAPL and GOOG did not pay any dividends during this period. Thus the possibility of early exercise may be ignored (see Merton (1973)).
14 The vertical axis denotes the numerical value of the relative entropy.
15 In our empirical application examples, we take the risk-free rate as one of the financial variables observed in the market, but we do not explicitly take into account interest rate risk in our empirical analysis. For the short-dated options considered here, interest rate risk is known to be of relatively little importance - for a discussion of this issue, see e.g. Cheng, Nikitopoulos and Schlögl (2017) and the literature cited therein.
Following Gatheral (2006), the risk-neutral probability of exercise of a European call option with strike $K$ in the Heston model is given by
$$P_0 = \frac{1}{2} + \frac{1}{\pi} \int_0^{\infty} \operatorname{Re}\!\left[ \frac{\exp\big(C(u, \tau)\,\theta + D(u, \tau)\,v + i u x\big)}{i u} \right] du,$$
where $v$ is the current value of the variance $\sigma^2(t)$, $\tau = T - t$ is the time to maturity, and $x$ is the logarithmic forward moneyness of the option, i.e.
$$x := \ln \frac{S_t}{K\, B(t, T)},$$
with $B(t, T)$ the time-$t$ price of a zero coupon bond maturing at $T$. Furthermore,
$$C(u, \tau) = \kappa \left[ r_- \tau - \frac{2}{\eta^2} \ln\!\left( \frac{1 - g\, e^{-d \tau}}{1 - g} \right) \right], \qquad D(u, \tau) = r_- \, \frac{1 - e^{-d \tau}}{1 - g\, e^{-d \tau}},$$
with
$$r_{\pm} = \frac{\beta \pm d}{\eta^2}, \qquad g = \frac{r_-}{r_+}, \qquad d = \sqrt{\beta^2 - 2 \alpha \eta^2}.$$
The parameters $\alpha$ and $\beta$ are functions of $u$ (the Fourier transform variable corresponding to $x$):
$$\alpha(u) = -\frac{u^2}{2} - \frac{i u}{2}, \qquad \beta(u) = \kappa - i \rho \eta u.$$
It is noted that $P_0 = E^Q(\mathbf{1}_{S_T > K})$, since by definition $P_0$ is the probability of exercise. The probability density function of the risk-neutral measure is therefore obtained by differentiation. For simplicity, let $e^y$ denote the ratio of the terminal asset price to the forward price, i.e. $e^y = B(t, T) S_T / S_t$; the risk-neutral density with respect to $y$ is then
$$p(y) = \frac{\partial P_0}{\partial x}\bigg|_{x = -y} = \frac{1}{\pi} \int_0^{\infty} \operatorname{Re}\!\left[ e^{C(u, \tau)\theta + D(u, \tau)v - i u y} \right] du,$$
which can be calculated by fast Fourier transform (FFT).
The decomposition of the total model risk into the three components (the components due to calibration and recalibration, and the positive or negative residual measuring the departure from independence between the first two) when using the Heston model as the baseline is given in Figure 3. Again, since it is well documented that the Heston model can fit observed option prices better than Black/Scholes, it is unsurprising that in this case the relative entropy measuring calibration error is much lower - however, already in this set of example days it is evident that this comes at the price of increased model risk due to recalibration.
These observations are reinforced when we consider aggregate model risk, calibration error and model risk due to recalibration over the entire sample period, as presented in Tables 1 and 2. Note that the absolute numbers refer to relative entropy and thus lack a direct financial interpretation - what matters are the relative values when comparing across models and across recalibration frequencies, in particular when considering the aggregate model risk.
Here, we consider recalibrating the Black/Scholes and Heston models either daily, every three days, every week, every two weeks, or every quarter. We see that recalibrating more frequently has little effect on the aggregate model risk, whether using the Black/Scholes model or the Heston model. Essentially, recalibrating more frequently simply shifts calibration error into model risk due to recalibration, 16 highlighting the dangers in the common practice of focusing solely on the calibration of derivative pricing models, at the expense of all other sources of model risk.
In addition, we observe that if we are interested in "robustness" at a high level of confidence (looking at, say, the 99% quantile of aggregate model risk), moving from Black/Scholes to Heston also does not appear to deliver any advantage (it does yield some improvement in lower quantiles, as well as in average and median aggregate model risk). This means that when high levels of confidence are required, any gain in calibration accuracy delivered by the Heston model is offset by higher model risk due to recalibration. One should note that this last point holds even before considering Type 3 model risk, which may well be worse when additional state variables are introduced (as in the Heston model). For these results, in Tables 1 and 2 we calculated means, medians and quantiles across all available option maturities. If we consider only particular maturity "buckets", the same qualitative conclusions are evident - Tables 3 and 4 illustrate this in the case of daily recalibration.

CONCLUSION
Under our approach, less relative entropy implies less model risk, and we are able to evaluate two hitherto disparate sources of model risk (calibration error and model risk due to recalibration) in a unified fashion, and to examine the potential trade-off between the two. We have considered a simple choice between two models, and between different recalibration frequencies. "Putting a number on model risk" by calculating quantiles for the maximum model risk (quantified by relative entropy) over a time series of market data allows one to assess the added value (if any) of more complicated stochastic models of financial markets. In our application, we deliberately prioritise the minimisation of calibration error, as this is congruent with the (often exclusive) focus of practitioners on calibration error (with little or no regard to model risk due to recalibration). 17 Even in this case, we see that by including recalibration as one of the sources of aggregate model risk, recalibrating a model frequently to a changing market simply exchanges one source of model risk for another, and more complicated stochastic models may well underperform when aggregate model risk is taken into account.
16 Note that on days on which we do not recalibrate, the model risk due to recalibration is zero, because (consistent with the model assumptions) we are keeping previously calibrated parameters unchanged - so on those days aggregate model risk is entirely due to calibration error (which increases because the fit to market prices deteriorates when we do not recalibrate).