Dependent Credit Default Risk Modelling Using Bernoulli Mixture Models

Abstract: Generalized Linear Mixed Models (GLMMs) can be used to model the occurrence of defaults in a loan or bond portfolio. In this paper, we used a Bernoulli mixture model, a type of GLMM, to model the dependence among default events. We discussed how Bernoulli mixture models can be used to model portfolio credit default risk, with a probit link function. The general mathematical framework of GLMMs was examined, with a particular focus on using Bernoulli mixture models to model credit default risk measures. We showed how GLMMs can be mapped into Bernoulli mixture models. An important aspect of portfolio credit default modelling is the dependence among default events, and in the GLMM setting this may be captured using the so-called random effects. Both fixed and random effects influence the default probabilities of firms, and together these are taken as the systemic risk of the portfolio. After describing the model, we conducted an empirical study of its applicability using Standard and Poor's data, incorporating rating category (fixed effect) and time (random effect) as the components of the model that contribute to the systemic risk of the portfolio. We obtained estimates of the model parameters using the Maximum Likelihood (ML) estimation method.


Introduction
The literature provides ample evidence that credit default events show significant dependence. One of the stylized facts about credit default data is that periods with many defaults are generally preceded and followed by other periods with many defaults; this is presented clearly in McNeil and Wendin [1]. Credit contagion is another issue of interest that affects credit defaults; see Egloff [2] and Gieseck and Weber [3,4] for detailed discussions. Credit contagion refers to the propagation of economic distress from one firm to another. For example, a company may itself face increased risk if one of its major customers defaults. Financial institutions lending money or holding credit-risky assets are keen to capture dependent defaults, since a disproportionately high number of defaults within a set time period can have serious consequences.
The Basel Committee on Banking Supervision (BCBS) released the Basel III Accord in 2011, which encourages banks to create "in-house" risk models that can improve their ability to anticipate and withstand financial and economic stress. Core inputs to modern credit risk modelling and management techniques are the probabilities of default for each obligor and the default correlations. As such, the accuracy of the estimated default probabilities and default correlations determines the quality of the results of any credit risk model. The models have to be calibrated for the specific portfolio of assets held by a bank. Bangia [5] asserted that knowing the distributions of economic variables and the probabilities that characterize their dynamics will help tailor risk models to a bank's needs even further. By ignoring unexpected outcomes for the composition of assets in a portfolio, the underlying credit events assessing credit riskiness of the portfolio in question can be restricted.
On the one hand, we want to develop credit default risk models that can resolve the problem of cross-sectional dependency in default rates over time due to common economic conditions (the so-called systematic risk). On the other hand, we want to capture the serial dependency induced by the cyclical behavior of economic factors. Most portfolio credit risk models start with the assumption that, conditional on the economic factors, obligor defaults occur independently. As a result, the systematic risk of a portfolio is our major concern.
The class of generalized linear mixed models (GLMMs), which includes the Bernoulli mixture models, can be used to model both observed and unobserved elements of the systemic risk. The CreditRisk+ model of Credit Suisse [6] is one of the industry models that belongs to the class of generalized linear mixed models. As we shall see, well-chosen fixed and random effects provide a lot of model flexibility, allowing us to capture time-inhomogeneity in default rates as well as heterogeneity across individual obligors, credit rating categories, industry sectors, and any other desired groupings.
In this paper, we show how economically meaningful assumptions about the underlying factors causing defaults of obligors can be used to produce restrictions on cross-obligor default correlations using a Bernoulli mixture model. We show how these constraints can be applied by estimating model parameters from historical default data using maximum likelihood methods. We investigate the small sample properties of these estimators and apply them to credit rating default data from the S & P 500 database.

Portfolio Notations
We consider a portfolio of $d$ obligors, $d > 0$, and fix the time horizon $T$. Let $X = (X_1, \dots, X_d)$ be a $d$-dimensional random vector with continuous marginal distribution functions $F_i(x) = \mathbb{P}(X_i \le x)$. For $1 \le i \le d$, let the random variable $X_i$ be the asset value of the $i$-th obligor; we assume that default occurs when the asset value $X_i$ is less than the total liabilities of the obligor. We assume that the default dependence among obligors stems from the dependence among the components of the vector $X$. We introduce another random variable $S_i$, which is a state indicator for the $i$-th obligor. Assume that $S_i$ takes integer values in the set $\{0, 1, \dots, n\}$ representing credit rating classes; we interpret the value 0 as default and non-zero values as states of increasing creditworthiness. We assume that at time $t = 0$ all obligors are in a non-default state. We concentrate on the binary outcomes of default and non-default and ignore the finer categories of the non-defaulted obligors. We write

$$Y_i = \begin{cases} 1 & \text{if } S_i = 0, \\ 0 & \text{otherwise,} \end{cases} \qquad (1)$$

for the default indicator of the $i$-th obligor, and $p_i = \mathbb{P}(Y_i = 1)$ for its marginal default probability. In particular, we want to model the default correlation $\rho(Y_i, Y_j)$. For $i \neq j$, we have that

$$\rho(Y_i, Y_j) = \frac{\mathbb{E}(Y_i Y_j) - p_i p_j}{\sqrt{(p_i - p_i^2)(p_j - p_j^2)}}. \qquad (2)$$

Default risk at the portfolio level is mostly generated by correlated defaults, and default correlation is heavily influenced by a variety of parameters related to default risk. Molins and Vives [7] demonstrate that, near a certain threshold, a small shift in default correlation can trigger a phase transition, such as the collapse of credit portfolios or of the whole market system.
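The default correlation between two default indicators depends only on the two marginal default probabilities and the joint default probability. The following minimal Python sketch illustrates the arithmetic; the function name and the input values are hypothetical and are not taken from the data used in this paper.

```python
import math


def default_correlation(p_i, p_j, p_ij):
    """Correlation between two default indicators Y_i and Y_j.

    p_i, p_j : marginal default probabilities P(Y_i = 1) and P(Y_j = 1)
    p_ij     : joint default probability P(Y_i = 1, Y_j = 1) = E[Y_i * Y_j]
    """
    covariance = p_ij - p_i * p_j
    # The variance of a Bernoulli(p) indicator is p - p^2.
    return covariance / math.sqrt((p_i - p_i ** 2) * (p_j - p_j ** 2))


# Hypothetical inputs: 1% and 2% default probabilities, 0.05% joint probability.
rho = default_correlation(0.01, 0.02, 0.0005)
print(f"default correlation: {rho:.4f}")
```

If the two defaults were independent, the joint probability would equal the product of the marginals and the function would return zero.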

Bernoulli Mixture Models
We begin by giving the definition of Bernoulli mixture models. Let $\Psi_t = (\Psi_{t,1}, \dots, \Psi_{t,p})'$ be a $p$-dimensional random vector. Then we say that the vector $Y_t = (Y_{t,1}, \dots, Y_{t,m_t})'$ follows a Bernoulli mixture model with factor vector $\Psi_t$ if there are functions $p_{t,i} : \mathbb{R}^p \to [0, 1]$, for $1 \le i \le m_t$, such that conditional on $\Psi_t$, the components of $Y_t$ are Bernoulli random variables with success probability

$$p_{t,i}(\psi_t) = \mathbb{P}(Y_{t,i} = 1 \mid \Psi_t = \psi_t). \qquad (3)$$

In this setting, the subscript $t$ denotes the time period, $m_t$ is the number of obligors in period $t$, and the random vector $\Psi_t$ denotes the vector of risk factors. In these types of models, we assume that the default risk of an obligor depends on a set of risk factors (e.g. climatic, geographic, economic) and that these factors can also be modelled stochastically. We also assume that, given the realizations of the risk factors, the defaults of the individual obligors are independent. The dependence between defaults is introduced by the dependence of the individual default probabilities on the set of risk factors.
We let $\mathcal{R} = \{1, \dots, k\}$ be a set of rating classes where higher values indicate higher creditworthiness. We suppose that we can collect historical default data over $T$ time periods (yearly data) for $k$ different credit rating classes indexed by $r = 1, \dots, k$. We denote by $m_{t,r}$ the number of obligors in the year-$t$ cohort in credit rating class $r$ and by $M_{t,r}$ the number of defaulted obligors in credit rating class $r$ in year $t$, so that $m_t = \sum_{r=1}^{k} m_{t,r}$. For the $i$-th obligor, belonging to rating class $r$, the conditional default probability is taken to be

$$p_{t,i}(\psi_t) = h(\mu_r + x_{t,i}'\beta + z_{t,i}'\psi_t). \qquad (4)$$

The vectors $z_{t,i}$ and $x_{t,i}$ are known vectors corresponding to the covariates of the $i$-th obligor in period $t$, $\beta$ and $\mu = (\mu_1, \dots, \mu_k)$ are vectors of unknown regression parameters, and $h$ is a smooth, strictly increasing function from $\mathbb{R}$ to the unit interval, called the response function. The probit response function is a natural choice; for other choices, interested readers can see Joe [8].
For $y_t \in \{0, 1\}^{m_t}$, the conditional joint default probability for the random variables $Y_{t,1}, \dots, Y_{t,m_t}$ is given by

$$\mathbb{P}(Y_t = y_t \mid \Psi_t = \psi_t) = \prod_{i=1}^{m_t} p_{t,i}(\psi_t)^{y_{t,i}} \left(1 - p_{t,i}(\psi_t)\right)^{1 - y_{t,i}}. \qquad (5)$$

We are interested in the unconditional probability. By the De Finetti theorem (see Frey and McNeil [9] for a brief discussion), the unconditional distribution is obtained by integrating (5) over the distribution of the risk factors, so we have

$$\mathbb{P}(Y_t = y_t) = \int_{\mathbb{R}^p} \prod_{i=1}^{m_t} p_{t,i}(\psi_t)^{y_{t,i}} \left(1 - p_{t,i}(\psi_t)\right)^{1 - y_{t,i}} f(\psi_t)\, d\psi_t, \qquad (6)$$

where $f$ denotes the density of $\Psi_t$.

By the definition of Bernoulli mixture models, we can show that threshold models inspired by the Merton [10] model can be interpreted as Bernoulli mixture models. To show this, let $\varepsilon_{t,1}, \dots, \varepsilon_{t,m_t}$ be independent and identically distributed random variables with standard normal distribution function $\Phi$, which are also independent of $\Psi_t$. Following Merton, we assume that default of the $i$-th obligor during time period $t$ occurs when its asset value $V_{t,i}$ is less than its total liabilities, and we set $V_{t,i} = \varepsilon_{t,i} - x_{t,i}'\beta - z_{t,i}'\Psi_t$. Using equation (4), the default probability for the $i$-th obligor, in rating class $r$, at time period $t$ is

$$\mathbb{P}(Y_{t,i} = 1 \mid \Psi_t = \psi_t) = \mathbb{P}(V_{t,i} < \mu_r \mid \Psi_t = \psi_t) = \Phi(\mu_r + x_{t,i}'\beta + z_{t,i}'\psi_t), \qquad (7)$$

and it is easy to see that this is a Bernoulli mixture model with $p_{t,i}$ given by the probit response function. In this case, $\mu_r$ is treated as the total liabilities of the obligor. If we are able to collect default data over a number of time periods, then we can statistically estimate $\beta$, $\mu = (\mu_1, \dots, \mu_k)$ and the so-called hyperparameter $\theta$ of the distribution of $\Psi_t$, based on the realization of $Y_t$ and the covariates $x_{t,i}$ and $z_{t,i}$. To carry on with our discussion, we first give the following general theory about generalized linear mixed models.
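The equivalence between the threshold (Merton-type) view and the Bernoulli mixture view can be checked by simulation. The sketch below assumes a single normally distributed factor with standard deviation `sigma`, no covariates other than a threshold `mu`, and hypothetical parameter values; none of these choices are estimates from this paper. Both simulators should give a long-run default rate close to $\Phi(\mu / \sqrt{1 + \sigma^2})$, the unconditional default probability of a one-factor probit-normal model.

```python
import math
import random


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def defaults_threshold(mu, sigma, n_obligors, rng):
    """Threshold view: obligor i defaults when eps_i - psi < mu."""
    psi = rng.gauss(0.0, sigma)  # common risk factor for the period
    return sum(1 for _ in range(n_obligors) if rng.gauss(0.0, 1.0) - psi < mu)


def defaults_mixture(mu, sigma, n_obligors, rng):
    """Mixture view: given psi, defaults are independent Bernoulli(Phi(mu + psi))."""
    psi = rng.gauss(0.0, sigma)
    p = norm_cdf(mu + psi)  # conditional default probability
    return sum(1 for _ in range(n_obligors) if rng.random() < p)


def mean_default_rate(simulate, mu, sigma, n_obligors, n_periods, seed):
    """Average default rate over many simulated periods."""
    rng = random.Random(seed)
    total = sum(simulate(mu, sigma, n_obligors, rng) for _ in range(n_periods))
    return total / (n_obligors * n_periods)


# Hypothetical parameters, chosen only for illustration.
MU, SIGMA = -1.0, 0.5
rate_threshold = mean_default_rate(defaults_threshold, MU, SIGMA, 100, 1000, seed=1)
rate_mixture = mean_default_rate(defaults_mixture, MU, SIGMA, 100, 1000, seed=2)
exact = norm_cdf(MU / math.sqrt(1.0 + SIGMA ** 2))
print(rate_threshold, rate_mixture, exact)
```

The two simulated rates differ only by Monte Carlo noise, which is the practical content of interpreting the threshold model as a Bernoulli mixture model.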

Generalized Linear Mixed Models
Bernoulli mixture models are a type of GLMM, a class of models commonly used in statistics. This class of models can deal with data in continuous, discrete, or binary format, as well as data with multiple sources of random error. The CreditRisk+ [11] industry model fits in this general framework: the number of defaults, conditional on gamma-distributed latent factors, is Poisson distributed. GLMMs are characterized by (i) random effects $b_t$ with distribution $G$ and hyperparameters $\theta$, (ii) a distribution from the exponential family for the conditional response variable $Y_{t,i}$ given $b_t$, and (iii) a response function $h$ (its inverse is known as the link function) relating the systemic risk of the portfolio, $x_{t,i}'\beta + z_{t,i}'b_t$, to the responses. In the absence of $b_t$, the model is simply a Generalized Linear Model (GLM); see Lindsey [12] for concepts.
We study the responses $y_{t,i}$ and the covariates $x_{t,i}$ for obligor $i = 1, \dots, m_t$ and year $t = 1, \dots, T$. We let $x_t = (x_{t,1}, \dots, x_{t,m_t})$. Given a random effect vector $b_t$ of arbitrary dimension $q$ and covariates $x_t$, we assume that the conditional density of the responses $y_{t,i}$ belongs to the exponential family, such as Bernoulli or Poisson, with conditional mean

$$\mathbb{E}(Y_{t,i} \mid b_t) = h(\eta_{t,i}), \qquad \eta_{t,i} = x_{t,i}'\beta + z_{t,i}'b_t, \qquad (8)$$

for $i = 1, \dots, m_t$ and $t = 1, \dots, T$.
In a GLMM, the vector $x_{t,i}$ is designed to specify fixed effects through the fixed parameter vector $\beta$, and $z_{t,i}$ is designed to specify random effects through the vector $b_t$; hence the quantity $x_{t,i}'\beta + z_{t,i}'b_t$ determines the systemic risk of the model. Fixed effects may be entirely obligor-specific or shared across the portfolio (or parts of it). Time-inhomogeneity in default rates is caused by shared covariates that change over time, such as macroeconomic variables or other observed risk factors. Heterogeneity among obligors is created by obligor-specific covariates, such as balance sheet data. Time-dependent covariates that are identified at the start of time period $t$, or covariates that are realized during time period $t$ contemporaneously with the default indicators, can be included in the design vectors $x_{t,i}$ and $z_{t,i}$. In this setting, each component of the vector-valued random effect $b_t$ could be interpreted as the general state of the economy according to, for example, industry sector and/or geographical location; its components would then typically be strongly correlated, and the (observable) design vector $z_{t,i}$ holds the corresponding (possibly weighted) exposures of obligor $i$.
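To make the roles of the two design vectors concrete, the following sketch evaluates the conditional default probability $h(x'\beta + z'b)$ with a probit response function. The design (three rating classes coded as indicator covariates, a single random effect with unit exposure) and all numerical values are hypothetical choices for illustration only.

```python
import math


def probit_response(eta):
    """Probit response function h: the standard normal distribution function."""
    return 0.5 * math.erfc(-eta / math.sqrt(2.0))


def conditional_default_probability(x, beta, z, b):
    """Conditional mean h(x'beta + z'b) of a default indicator given the random effect b."""
    fixed_part = sum(x_j * beta_j for x_j, beta_j in zip(x, beta))
    random_part = sum(z_j * b_j for z_j, b_j in zip(z, b))
    return probit_response(fixed_part + random_part)


# Hypothetical design: the obligor sits in the second of three rating classes,
# so x is an indicator vector and beta holds one intercept per class.
x = [0.0, 1.0, 0.0]
beta = [-3.0, -2.0, -1.0]
z = [1.0]  # unit exposure to a single random effect

p_stress = conditional_default_probability(x, beta, z, [0.3])    # adverse year
p_benign = conditional_default_probability(x, beta, z, [-0.3])   # favourable year
print(p_stress, p_benign)
```

The same obligor has a higher conditional default probability in the adverse year, which is how a shared random effect induces dependence between obligors.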
For the purpose of this paper, it is enough to consider the case where $z_{t,i} = 1$ for all $i$. Conditional on the random effects $b_t$, the responses $y_t = (y_{t,1}, \dots, y_{t,m_t})$ on unit $t$ are treated as independent. By the De Finetti theorem (see Section 2.2), the unconditional joint distribution of $y_t$ is obtained by integrating out the random effects $b_t$, which creates dependence among the responses $y_{t,1}, \dots, y_{t,m_t}$ on unit $t$.

Maximum Likelihood Estimation of GLMMs
According to McNeil [13], models that are used in the industry (such as KMV, CreditMetrics and CreditRisk+) are less formally statistical in the estimation of the model parameters. The reason for this is that there is not enough default data for higher-rated firms to give reliable estimates of the model parameters. In this section, we discuss the general framework of the maximum likelihood (ML) method for fitting GLMMs. Other methods have been proposed in the literature (such as Bayesian estimation methods), but in this work we focus on the ML method.
We recall the notation used in Section 2.3. The unconditional density or mass function $f$ of the response vector $y_t = (y_{t,1}, \dots, y_{t,m_t})$ is given by

$$f(y_t \mid x_t, \beta, \theta) = \int_{\mathbb{R}^q} \mathbb{P}(Y_t = y_t \mid x_t, \beta, b_t)\, f(b_t \mid \theta)\, db_t, \qquad (9)$$

where $q = \dim(b_t)$ and $f(b_t \mid \theta)$ is the density of $b_t$. In order to capture the between-year dependence of defaults, we assume that the random effects $b_1, \dots, b_T$ are dependent. Since the $Y_{t,i}$ are conditionally independent (given $b_1, \dots, b_T$), the likelihood can be written in the form

$$L(\beta, \theta \mid \mathcal{D}) = \int_{\mathbb{R}^{q \times T}} \prod_{t=1}^{T} \mathbb{P}(Y_t = y_t \mid x_t, \beta, b_t)\, f(b_1, \dots, b_T \mid \theta)\, db_1 \cdots db_T, \qquad (10)$$

which is a $(q \times T)$-dimensional integral, where $\mathcal{D}$ denotes the data. Such a high-dimensional integral is difficult to handle numerically, and results are often inaccurate. In that context, Breslow [14] suggests that numerical approximation methods such as penalized quasi-likelihood (PQL) and marginal quasi-likelihood (MQL) are useful for solving this problem. Models with correlated random effects, on the other hand, are typically far too complicated to fit by numerical maximization of the likelihood. For correlated random effects, alternatives such as the expectation maximization (EM) algorithm or simulation of the full likelihood function using importance sampling might be used (see Gourieroux and Monfort [15]).
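When the random effect is one-dimensional, the integral over its distribution can be approximated directly by a deterministic quadrature rule. The sketch below uses composite Simpson's rule on a truncated range; this is a simple stand-in chosen for illustration, not the PQL, MQL, EM or importance-sampling methods mentioned above, and the helper names and parameter values are hypothetical. As a check, it compares the numerical value of $\mathbb{E}[\Phi(\mu + \Psi)]$, $\Psi \sim N(0, \sigma^2)$, with the closed form $\Phi(\mu / \sqrt{1 + \sigma^2})$.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def normal_pdf(psi, sigma):
    """Density of N(0, sigma^2) evaluated at psi."""
    return math.exp(-0.5 * (psi / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def expect_over_normal(g, sigma, n=400, width=8.0):
    """Approximate E[g(Psi)] for Psi ~ N(0, sigma^2) with composite Simpson's rule.

    The integration range is truncated to [-width * sigma, width * sigma];
    n must be even.
    """
    lower = -width * sigma
    step = 2.0 * width * sigma / n
    total = 0.0
    for j in range(n + 1):
        psi = lower + j * step
        weight = 1.0 if j in (0, n) else (4.0 if j % 2 == 1 else 2.0)
        total += weight * g(psi) * normal_pdf(psi, sigma)
    return total * step / 3.0


# Hypothetical parameter values.
MU, SIGMA = -2.0, 0.3
numeric = expect_over_normal(lambda psi: norm_cdf(MU + psi), SIGMA)
closed_form = norm_cdf(MU / math.sqrt(1.0 + SIGMA ** 2))
print(numeric, closed_form)
```

The cost of such a grid grows exponentially with the dimension of the random effect, which is why the approximation and simulation methods cited above are needed for higher-dimensional or correlated random effects.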

Bernoulli Model as GLMM
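We continue with the notation introduced in Section 2.2. We choose the response function $h$ such that $\mathbb{E}(Y_{t,i} \mid \Psi_t) = \Phi(\eta_{t,i}) =: h(\eta_{t,i})$, where $\Phi$ is the standard normal cumulative distribution function and $\eta_{t,i} = \mu_r + x_{t,i}'\beta + z_{t,i}'\Psi_t$. We can rewrite $\eta_{t,i}$ in the form $\eta_{t,i} = \tilde{x}_{t,i}'\tilde{\beta} + z_{t,i}'\Psi_t$, where $\tilde{x}_{t,i}$ stacks the rating-class indicators with $x_{t,i}$ and $\tilde{\beta} = (\mu', \beta')'$, and this setup is the same as in the GLMM specification. Using equation (3), the conditional default probability for the $i$-th obligor in period $t$ is given by

$$\mathbb{P}(Y_{t,i} = 1 \mid \Psi_t = \psi_t) = \Phi(\eta_{t,i}), \qquad (11)$$

so that the conditional joint distribution is given by

$$\mathbb{P}(Y_t = y_t \mid \Psi_t = \psi_t) = \prod_{i=1}^{m_t} \Phi(\eta_{t,i})^{y_{t,i}} \left(1 - \Phi(\eta_{t,i})\right)^{1 - y_{t,i}}. \qquad (12)$$

In GLMMs, the unconditional distribution of the responses is obtained by integrating out the random effects $\Psi_t$ (by the De Finetti theorem), and this greatly complicates ML estimation of the model parameters because of the high dimensionality. However, full ML estimation is possible for simple models. In that regard, we consider a one-factor Bernoulli mixture model with $\eta_{t,i} = \mu_r + x_{t,i}'\beta + \Psi_t = \tilde{x}_{t,i}'\tilde{\beta} + \Psi_t$, where $\Psi_t \sim N(0, \sigma^2)$, $\Psi_1, \dots, \Psi_T$ are independent and identically distributed random variables, and $z_{t,i} = 1$ for all $i$. Under these assumptions, the likelihood is easier to compute and is given by

$$L(\beta, \mu, \theta \mid \mathcal{D}) = \prod_{t=1}^{T} \int_{\mathbb{R}} \prod_{i=1}^{m_t} \mathbb{P}(Y_{t,i} = y_{t,i} \mid \beta, \mu, x_{t,i}, \psi_t)\, f(\psi_t)\, d\psi_t, \qquad (13)$$

where $f$ is the density of $\Psi_t$ and $\mathcal{D}$ denotes the data.

We suppose that we can collect default data over specific time periods (yearly data) for $k$ different credit rating classes, and let $r = 1, \dots, k$ index the credit rating classes. As introduced earlier, we take $m_{t,r}$ to be the number of obligors in the year-$t$ cohort in credit rating class $r$ and $M_{t,r}$ to be the number of defaulted obligors in that class in year $t$. Assuming that obligors within a rating class share the same conditional default probability $p_r(\psi_t)$ (for instance, $p_r(\psi_t) = \Phi(\mu_r + \psi_t)$ when rating class is the only fixed effect), the conditional distribution of the vector $M_t = (M_{t,1}, \dots, M_{t,k})$ is given by

$$\mathbb{P}(M_t = l_t \mid \Psi_t = \psi_t) = \prod_{r=1}^{k} \binom{m_{t,r}}{l_{t,r}}\, p_r(\psi_t)^{l_{t,r}} \left(1 - p_r(\psi_t)\right)^{m_{t,r} - l_{t,r}}, \qquad (14)$$

and the unconditional distribution is found by integrating over the distribution of the factor variable. Moreover, we assume that the vectors $M_s$ and $M_t$, $s \neq t$, are conditionally independent given $\Psi_s$ and $\Psi_t$.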
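For grouped default counts, the one-factor likelihood reduces to a product over years of one-dimensional integrals, each of which can be evaluated numerically. The sketch below is one possible way to do this, under the following assumptions: rating class is the only fixed effect, the conditional default probability of class $r$ is $\Phi(\mu_r + \psi)$ with $\psi \sim N(0, \sigma^2)$, and the integral is approximated by Simpson's rule. The function names and the counts are hypothetical; they are not the S&P data and not the software used for the results reported later.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def log_binom_pmf(m, l, p):
    """Log of the binomial probability of l defaults among m obligors."""
    p = min(max(p, 1e-300), 1.0 - 1e-15)  # guard the logarithms
    return (math.lgamma(m + 1) - math.lgamma(l + 1) - math.lgamma(m - l + 1)
            + l * math.log(p) + (m - l) * math.log1p(-p))


def period_likelihood(m_t, l_t, mu, sigma, n=200, width=8.0):
    """Likelihood of one year's default counts, integrating psi ~ N(0, sigma^2) out."""
    step = 2.0 * width * sigma / n
    total = 0.0
    for j in range(n + 1):
        psi = -width * sigma + j * step
        # Given psi, the class counts are independent binomials.
        log_cond = sum(log_binom_pmf(m, l, norm_cdf(mu_r + psi))
                       for m, l, mu_r in zip(m_t, l_t, mu))
        density = math.exp(-0.5 * (psi / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 1.0 if j in (0, n) else (4.0 if j % 2 == 1 else 2.0)
        total += weight * math.exp(log_cond) * density
    return total * step / 3.0


def log_likelihood(obligors, defaults, mu, sigma):
    """Log-likelihood over years, assuming the yearly factors are i.i.d."""
    return sum(math.log(period_likelihood(m_t, l_t, mu, sigma))
               for m_t, l_t in zip(obligors, defaults))


# Hypothetical counts: 3 years, 2 rating classes.
obligors = [[500, 200], [520, 210], [510, 190]]
defaults = [[2, 10], [6, 22], [1, 8]]
MU = [-2.6, -1.5]
for sigma in (0.05, 0.25, 0.60):
    print(sigma, log_likelihood(obligors, defaults, MU, sigma))
```

In practice this function would be passed to a numerical optimizer over $(\mu_1, \dots, \mu_k, \sigma)$; as $\sigma$ tends to zero it reduces to an ordinary binomial log-likelihood with independent defaults.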
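Under these assumptions, the joint distribution of $M_1, \dots, M_T$, where $M_t = (M_{t,1}, \dots, M_{t,k})$ is the vector of default counts in year $t$, is the product of the marginal distributions of the vectors $M_t$, so that we have

$$\mathbb{P}(M_1 = l_1, \dots, M_T = l_T) = \prod_{t=1}^{T} \int_{\mathbb{R}} \prod_{r=1}^{k} \binom{m_{t,r}}{l_{t,r}}\, p_r(\psi_t)^{l_{t,r}} \left(1 - p_r(\psi_t)\right)^{m_{t,r} - l_{t,r}} f(\psi_t)\, d\psi_t, \qquad (15)$$

where $p_r(\psi_t)$ is the conditional default probability of an obligor in rating class $r$, $m_{t,r}$ is the number of obligors in class $r$ in year $t$, and $f$ is the density of the factor $\Psi_t$. Therefore, using equation (2), the matrix of the estimated within-rating and between-rating default correlations has $(r, s)$-element given by

$$\hat{\rho}_{r,s} = \frac{\hat{\pi}_{rs} - \hat{\pi}_r \hat{\pi}_s}{\sqrt{(\hat{\pi}_r - \hat{\pi}_r^2)(\hat{\pi}_s - \hat{\pi}_s^2)}}, \qquad (16)$$

where $\hat{\pi}_r = \mathbb{E}[p_r(\Psi_t)]$ and $\hat{\pi}_{rs} = \mathbb{E}[p_r(\Psi_t)\, p_s(\Psi_t)]$ are evaluated at the estimated parameters, and the diagonal elements are the estimated within-rating default correlations.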
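The default correlations implied by a fitted one-factor probit model can be computed by evaluating the two expectations numerically. The sketch below does so with Simpson's rule, using the parameter estimates quoted in the Results section. Three things are assumptions of this sketch rather than statements of the paper: that the estimates are ordered CCC, B, BB, BBB, A; that 0.24 is the standard deviation (not the variance) of the random effect; and that the conditional default probability of class $r$ is $\Phi(\mu_r + \psi)$.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def expect_over_normal(g, sigma, n=400, width=8.0):
    """Approximate E[g(Psi)], Psi ~ N(0, sigma^2), with composite Simpson's rule."""
    step = 2.0 * width * sigma / n
    total = 0.0
    for j in range(n + 1):
        psi = -width * sigma + j * step
        density = math.exp(-0.5 * (psi / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 1.0 if j in (0, n) else (4.0 if j % 2 == 1 else 2.0)
        total += weight * g(psi) * density
    return total * step / 3.0


def default_correlation_matrix(mu, sigma):
    """Unconditional default probabilities and the implied default correlation matrix."""
    k = len(mu)
    pi = [expect_over_normal(lambda psi, m=m: norm_cdf(m + psi), sigma) for m in mu]
    rho = [[0.0] * k for _ in range(k)]
    for r in range(k):
        for s in range(k):
            # Joint default probability of two distinct obligors in classes r and s.
            pi_rs = expect_over_normal(
                lambda psi: norm_cdf(mu[r] + psi) * norm_cdf(mu[s] + psi), sigma)
            rho[r][s] = (pi_rs - pi[r] * pi[s]) / math.sqrt(
                (pi[r] - pi[r] ** 2) * (pi[s] - pi[s] ** 2))
    return pi, rho


# Estimates quoted in the text; the ordering and the reading of 0.24 are assumptions.
RATINGS = ["CCC", "B", "BB", "BBB", "A"]
MU_HAT = [-0.84, -1.69, -2.40, -2.92, -3.43]
SIGMA_HAT = 0.24
pi_hat, rho_hat = default_correlation_matrix(MU_HAT, SIGMA_HAT)
for name, row in zip(RATINGS, rho_hat):
    print(name, [f"{value:.5f}" for value in row])
```

The diagonal of the printed matrix holds the within-rating correlations and the off-diagonal entries the between-rating correlations.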

Empirical Study
This section shows how the GLMM ideas can be implemented in practice. We give one example, a model with five credit rating classes and binomial default counts, fitted using existing software.

Source of Data
The credit default data used in this paper were retrieved from the S & P 500 database. The default counts were collected for one-year periods, ranging from January 1981 to December 2000 ($T = 20$). The credit rating classes considered are A, BBB, BB, B and CCC, hence $k = 5$. Obligors in higher rating classes hardly ever default, so including them would not serve the purpose of the study. Table 1 shows the total number of obligors for each credit rating class together with the number of defaulted obligors in the 20-year period. The increased number of defaults between the recessions of 1990 and 1993 can be attributed to global risk factors. As expected, lower-rated classes such as CCC have high default rates compared to higher-rated classes such as A. Over the years, the default rate of the A rating class is almost zero; this is why we considered only 5 credit rating classes, since obligors rated higher than A rarely default and their plots would show nothing.

Results
The estimates $\hat{\mu}_r$ are presented in Table 2, with $p$-values less than $2 \times 10^{-16}$. We have $\hat{\mu} = (\hat{\mu}_{CCC}, \dots, \hat{\mu}_{A}) = (-0.84, -1.69, -2.40, -2.92, -3.43)$. We see that the estimates decrease with an increase in creditworthiness. For the credit rating class A, $\hat{\mu}_{A} = -3.43$, and this means that the credit rating class A is associated with 3.43 lower log-odds of default, compared to non-default, than the other credit rating categories. To find the odds ratio, we exponentiate the estimated value, as presented in Table 1. The odds ratio for the A credit rating class was found to be 3.2%, which means that for a "1 unit increase" of the credit rating class A to a higher rating class, we expect to see an (approximately) 96.8% decrease in the odds of the total number of defaulting obligors. The same analysis applies to the other credit rating classes. Strictly speaking, this log-odds reading is exact only under a logit link; under the probit link used here, $\hat{\mu}_r$ is on the standard normal quantile scale, so the odds figures should be read as indicative. As expected, low-rated credit rating classes have very high odds. The default probability estimates were calculated and are presented in Table 3. We can see that the default probabilities decrease with increasing creditworthiness: default probabilities for low-rated credit classes are higher than those for highly rated ones. Obligors with a credit rating of CCC are more likely to default than those with a credit rating of A. Table 4 gives the calculated estimates for the within-rating and between-rating default correlations. From the table, we see that the within-rating default correlations increase as the credit rating class decreases, and vice versa. For the between-rating default correlations, the estimates vary depending on the pair of credit classes considered. Note that the default correlations are correlations between event indicators for very low probability events and are necessarily very small.
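The arithmetic behind the 3.2% figure, and the default probabilities implied by the probit-normal model for comparison, can be reproduced as follows. The assignment of the three middle estimates to the classes B, BB and BBB is inferred from the stated ordering, and reading 0.24 as the standard deviation of the random effect is an assumption of this sketch; the printed probabilities are not claimed to match Table 3.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


# Estimates quoted in the text; class labels for the middle values are inferred.
MU_HAT = {"CCC": -0.84, "B": -1.69, "BB": -2.40, "BBB": -2.92, "A": -3.43}
SIGMA_HAT = 0.24  # assumed to be a standard deviation

exp_mu = {rating: math.exp(mu) for rating, mu in MU_HAT.items()}

# Unconditional default probability of a one-factor probit-normal model:
# E[Phi(mu + Psi)] = Phi(mu / sqrt(1 + sigma^2)) for Psi ~ N(0, sigma^2).
probit_pd = {rating: norm_cdf(mu / math.sqrt(1.0 + SIGMA_HAT ** 2))
             for rating, mu in MU_HAT.items()}

for rating in MU_HAT:
    print(f"{rating:>3}  exp(mu) = {exp_mu[rating]:.4f}  probit PD = {probit_pd[rating]:.5f}")
```

Exponentiating $-3.43$ gives approximately 0.032, which is the 3.2% quoted above, whereas the probit-implied default probability for the A class is smaller by roughly two orders of magnitude.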
In [16], the same data were fitted using a Gibbs sampler, which yielded virtually the same results.
The hyperparameter estimate $\hat{\sigma} = 0.24$ suggests that there is some significant variation within the random effects $\Psi_t$. In our model, we have assumed that the variance of the random effects is the same for all firms in all years and, as such, we might be concerned that the model does not allow for enough heterogeneity in the variance of the systemic risk of the portfolio. Moreover, while controlling for credit rating class and the repeated measures within the time periods, we have found evidence of an association between credit rating and the default of obligors. The $p$-values are statistically significant, so we can say that if there were no association between credit default and credit rating, then the probability of observing the S & P 500 data we have used in this work would be less than $2 \times 10^{-16}$ (almost zero).

Conclusion
This paper discusses Bernoulli mixture models (as a class of GLMMs) as a tool for modelling dependent credit default data. In Section 2 the most important concepts on model formulation and inference are summarized. The Bernoulli mixture model we looked at is relatively simple, but it allows for a handy formulation of systematic portfolio risk in terms of observed fixed effects and unobserved random effects in order to capture inhomogeneities in default rates throughout the portfolio and across time.
In this study we took into account only default and non-default outcomes. We have also mentioned that GLMMs encompass a number of well-known industry models used to model credit defaults. It would be reasonable to incorporate additional random effects in the model and allow for more heterogeneity, provided we had more information on the industrial and geographical sectors to which the obligors belong. It is our intention to make this literature more accessible to researchers in the field of quantitative risk management.