Comparison of survival analysis approaches to modelling credit risks

Credit risk is a critical area in finance and has drawn considerable research attention. As such, survival analysis has been widely used in credit risk, in particular to model a debt's time-to-default mechanism. In this study, we revisit different survival analysis approaches as applied to credit risk defaulters' data and assess their performance in light of the Kenyan context. In practice, inconsistency in the validity of the credit risk models used by many companies when predicting and analysing loan default is a common phenomenon that occurs unexpectedly. Loan defaults often cause major losses to creditors and can be quantified to great benefit if the correct models are used in advance. Here, we address the unbiasedness, analysis, and comparison of survival analysis approaches, particularly models of credit risk. We carry out data analysis using the Cox proportional hazard model and its extensions, as well as the mixture cure and non-cure model. We then compare the results systematically by investigating the most efficient and preferable model that produces the best estimates on the Kenyan real data sets. Results show the Cox Proportional Hazard (Cox PH) model is more efficient in the analysis of the Kenyan real data set compared to the frailty, mixture cure, and non-cure models.


Credit Risk
Credit risk is described as the danger of default on a debt that may arise when a borrower fails to make contractual remittance of payments. Furthermore, credit risk arises when two counter-parties engage in borrowing and lending (Jarrow, 2009). An "obligor" is a counter-party who has a financial obligation; for example, a debtor who owes us money, a bond issuer who promises interest, or a counter-party in a derivatives transaction. A "default" is a failure to fulfil part of the bargain that is obliged, for instance, failure to repay the specified loan or the interest/coupon on a loan/bond; it generally arises from lack of liquidity or insolvency and may entail bankruptcy.
Credit risk remains a critical area both in banking and in other lending institutions and is of great concern to many stakeholders, i.e., borrowers, institutions, and policy regulators. Since the advent of Value at Risk (VaR) models in the 1990s, VaR has led to the evolution of risk management practices across the globe. This consequently led to the famous Basel Committee accord of 1998, which allowed banks to seek mandatory supervisory approval for putting up capital requirements for market risks with respect to their internal models.
In order to mitigate the adverse effects associated with credit risk, credit institutions need to first quantify the risks and then formulate policies pertaining to risk management.
Consequently, the purpose of this dissertation is to review existing survival models in the literature and systematically assess their performance using real data from the Kenyan setting. Given the above situation, credit providers require more sophisticated credit risk measurement techniques that can better assess the risks of their clients. The country has currently adopted logistic regressions, which model a dichotomous variable before predicting good and bad clients. This, however, does not indicate when the initial classification will deteriorate over time. Moreover, with the adoption of the International Financial Reporting Standards 9 (IFRS 9), credit providers will be required to forecast not only the likelihood of default but also the time to default.

3 Survival Analysis
The pioneer of using survival analysis in the context of credit risk is documented to be Narain (1992), who proposed a survival analysis approach as an improvement on logistic regression (see also Thomas et al., 2002).

Problem Definition
Given the emergence of many financial services provider firms and the growth in the financial sector, in this study we fit different survival models to the Kenyan credit data and systematically compare the performance of the models in order to identify the best model that takes into consideration the time-varying aspect and produces effective results.

Objectives of the study
• Fit different survival models to Kenyan credit data.
• Systematically investigate the most efficient and preferable model that produces the best estimates on Kenyan real data sets.

Research Question
The question that we intend to answer at the end of the study is which model is most effective in modelling credit risk according to the Kenyan real data set.

5 Significance of the Study
Results obtained from this study will be mainly useful to commercial banks in computing the probability of default (behavior scoring) as well as the time to default (profit scoring) of their different clients. This is of importance since, with the adoption of the International Financial Reporting Standards 9 (IFRS 9) and the IRB framework, banks are required to compute both the probability of and time to default for asset loss provisioning as well as capital requirements assessment.
This work attempts to evaluate and give recommendations on the best survival model in the credit risk context in relation to the Kenyan data. This will help financial institutions that are credit providers in assessing their clients' risk of default, thus lowering their chances of losses arising from default.

Chapter 2
Literature Review

Introduction
In credit risk management, decisions are made on the basis of the creditworthiness of an individual or institution, which is determined through the use of credit scoring models. According to the CBK's annual supervision report (2016), banks and CRBs must work hand in hand to deliver credible credit scores. The consistent credit scores would then be incorporated into pricing models and credit risk appraisal.
In this section, we evaluate the conceptual foundations of credit risk analysis. Section 2.2 reviews the literature on survival analysis, while Section 2.3 discusses a case for survival analysis in Kenya.

Survival Analysis
Survival analysis was historically used in the medical and engineering fields, where the duration until the occurrence of an event of interest is examined, for instance, the time until death or machine failure (Collett, 2003; Kalbfleisch and Prentice, 2002; Cox and Oakes, 1984). In one such credit study, a Cox proportional hazard model with time-varying covariates was estimated. The initial experimental outcomes indicated that the possibility of default is sensitive to specific characteristics of both contracts and borrowers as well as macroeconomic conditions. The theory-based findings on the adverse effects of distinct interest rates on the probability of default were affirmed by the data. A decrease in the economy's real interest rate, as would be implied by an expansionist monetary policy, leads banks to assume more credit risk and to ease the analysis of borrowers' credit histories. By expanding credit operations, banks could compensate for financial losses due to a lower real interest rate. This strategy brings borrowers with a higher probability of default into the financial market. Conversely, higher rates of interest on loans intensify the chances of default because they reduce the borrower's capacity to settle their debt.
Jose Angelo Divino, Edna Souza Lima, and Jaime Orrillo (2013), however, warned that the previous results were based on a particular data set and set of variables. They might not hold for other samples or financial assets. The positive relationship between the probability of default and the loan interest rate might also be a result of risk-based pricing, whereby lenders charge higher rates to those portfolio segments that have historically shown higher default rates.
In more recent studies, Dirick et al. (2016) analyzed the performance of various survival analysis techniques applied to ten actual credit data sets. The data sets were acquired from UK and Belgian financial institutions and consist of loans to small enterprises and personal loans, with varying loan terms.
In their paper, Dirick et al. (2016) analyzed ten different data sets from five banks using different classes of models, that is, Cox PH, parametric/AFT, non-parametric, AFT/Cox PH + extensions, multi-event mixture cure, and mixture cure, as well as both statistical (AUC and default time predictions) and economic evaluation measures applicable to all model types considered, the "plain" survival models as well as the mixture cure models.
Since techniques for survival analysis are unable to cope with missing data, and since several data sets had a significant count of missing inputs, they preferred to employ the rule of thumb used in the benchmarking paper by Dejaeger et al. (2012).
As a result, for continuous inputs, median imputation was used when ≤ 25% of the values were missing, and the inputs were removed if more than 25% were missing. For categorical inputs, a missing value category was created if more than 15% of the values were missing; otherwise, the observations associated with the missing values were removed from the data set.
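The rule of thumb described above can be sketched in a few lines. The following is our own illustrative Python rendering of the stated thresholds, not Dejaeger et al.'s actual code; the function names are ours, and `None` stands in for a missing value:

```python
def impute_continuous(values, threshold=0.25):
    """Median-impute a continuous column when the missing fraction is <= threshold;
    otherwise return None to signal that the whole input should be removed."""
    frac = sum(v is None for v in values) / len(values)
    if frac > threshold:
        return None  # drop the input variable entirely
    observed = sorted(v for v in values if v is not None)
    n = len(observed)
    median = (observed[n // 2] if n % 2 == 1
              else 0.5 * (observed[n // 2 - 1] + observed[n // 2]))
    return [median if v is None else v for v in values]

def impute_categorical(values, threshold=0.15):
    """Create a 'MISSING' category when more than threshold of values are missing;
    otherwise return the row indices that should be removed."""
    frac = sum(v is None for v in values) / len(values)
    if frac > threshold:
        return [v if v is not None else "MISSING" for v in values], []
    drop_rows = [i for i, v in enumerate(values) if v is None]
    return values, drop_rows
```

For example, a continuous column with exactly 25% missing is median-imputed, while one with 75% missing is dropped outright.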
In their paper, they used two opposing definitions of censoring. First, censored cases are the loans that did not reach their predefined end date during the time of data gathering (called "mature" cases) and had experienced neither default nor early repayment by this time. According to the second definition, a censored case corresponds to a loan that did not experience default by the moment of data gathering; early loan settlement and mature cases are marked censored. This kind of censoring is used in models where default is the only event of interest.
The number of input variables in the resulting data sets varied from 6 to 31, and the number of observations from 7,521 to 80,641. For each observation, indicators for default, early repayment, and maturity were included, taking the value 1 for the respective event of interest that took place, and 0 for the others (note that only one event type can occur for each observation). For observations censored according to the first censoring definition, all indicators are zero. According to the second censoring definition, only defaults are considered uncensored.
In terms of their data sets, this means that censoring rates range from around 20% to 85% according to the first definition (used for the multiple event mixture cure model), whereas censoring percentages range from 94.56% up to 98.16% according to the second definition.
A training set and a test set consisting of 2/3 and 1/3 of the observations, respectively, were obtained by splitting each data set randomly. The models are estimated on the training sets, and the corresponding test sets are used for evaluation. For all the models, the software R is used.
In comparison, Cox PH-based models were all proven to work predominantly well, more so a Cox PH model in combination with penalized splines for the continuous covariates.
The Cox PH model often outperforms the multiple event mixture cure model. However, the mixture cure model is among the top models under economic evaluation and does not perform significantly differently in most of the cases. This model does not require the survival function to go to zero as time goes to infinity, which is often regarded as appropriate for credit scoring data, making it advantageous. However, the study also notes that finding a suitable evaluation measure to compare survival analysis methods persisted as an interesting setback, as the AUC did not seem to have the right properties to really differentiate one method from another.
The fact that some questions remain open in the existing literature inspired the researchers. First, except for Zhang and Thomas (2012), no attempt has been made primarily to contrast the available methods in one paper. Secondly, in most recent papers conclusions on the type of survival method to use could not be made explicitly, since only one data set was analyzed. Finally, the assessment remains mostly fixated on classification and the area under the receiver operating characteristic curve (AUC), as presented in most of the papers.

A Case for Survival Analysis in Kenya
Bellotti and Crook (2009) opted for multiple logistic regression given its ability to predict a nominal dependent variable from one or more independent variables. From their study, they concluded that the amount of loan being reimbursed was the main factor affecting default. However, they faced the challenge of lacking time-to-default variables, which are of interest in survival analysis.

Study Design
The study contributes to the body of knowledge of credit risk modelling using survival analysis approaches. The study fits the Cox PH model and its extensions, that is, penalized splines and the frailty model, as well as the mixture cure and non-cure model, to a real Kenyan data set.
Assessment is done to ascertain the most effective model in analyzing credit risk.

Data
The data used for the study was obtained from the Metropol Credit Reference Bureau.

Survival Analysis Framework
In survival analysis, we are usually concerned with the time variable, T, of an event of interest.
The survival function is usually articulated as the likelihood of not experiencing the incident of concern by some observed time t, hence yielding S(t) = P(T > t). In the setting of credit risk, default is the event of interest; see Dirick et al. (2016). Given the survival function, the probability density function is f(t) = -dS(t)/dt, and the hazard function is

h(t) = lim_{Δt→0} P(t ≤ T < t + Δt | T ≥ t) / Δt = f(t) / S(t),

where Δt is the change in time.
The hazard function models the instantaneous risk.
When carrying out survival analysis, censoring is done, that is, the incident of concern has not been witnessed at the time of assembling the data. For instance, Dirick et al. (2016) consider two types of censoring: one where some credit applicants had failed to pay, repaid in advance, or had their loans completely paid back at the completion of the loan period, and censoring is applied to the cases where none of the above events had been observed. In the second scenario, censoring labels exactly the cases that matured or repaid their loans early. Thus only the censored and default states are considered.
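The identities above linking S(t), f(t), and h(t) can be checked numerically. The sketch below is our own Python illustration using an exponential lifetime (a hypothetical choice for which the hazard f(t)/S(t) is constant and equal to the rate):

```python
import math

def surv_exp(t, lam):
    """S(t) = P(T > t) for an exponential lifetime with rate lam."""
    return math.exp(-lam * t)

def density_exp(t, lam):
    """f(t) = -dS(t)/dt = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

def hazard(t, lam, dt=1e-6):
    """Finite-difference version of
    h(t) = lim P(t <= T < t + dt | T >= t) / dt = f(t) / S(t)."""
    return (surv_exp(t, lam) - surv_exp(t + dt, lam)) / (dt * surv_exp(t, lam))
```

For any t, `hazard(t, lam)` stays close to `lam`, and `density_exp(t, lam) / surv_exp(t, lam)` equals it exactly, illustrating h(t) = f(t)/S(t).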

Cox Proportional Hazard Model
The Cox proportional hazard model is more flexible than any accelerated failure time (AFT) model, as it contains a non-parametric baseline hazard function, h_0(t), along with a parametric part (Cox, 1972). The Cox model has the advantage of preserving a variable in its original quantitative form and of using a maximum of information. However, very restrictive conditions of application make its use rather limited (Bugnard F., 1994). The model's hazard function is denoted as

h(t | x) = h_0(t) exp(β'x),

where the covariate vector is denoted by x and the parameter vector by β'.
The survival function is denoted as

S(t | x) = exp( -exp(β'x) ∫_0^t h_0(u) du ),

with R(t_i) denoting the set of individuals that have not yet defaulted at time t_i.

Cox Proportional Hazard Model with Penalized Splines
The hazard function in the Cox PH model assumes a proportional hazards structure with a log-linear model for the covariates. Thus for any continuous variable, e.g., age, the default hazard ratio between 5 and 10 years is the same as the hazard ratio between 50 and 55 years. This assumption normally does not hold; thus splines are used due to their flexibility, being functions defined by piecewise polynomials that are joined at points called "knots" (Therneau and Grambsch, 2000).
When the total number of knots in a given spline becomes sufficiently large, the fitted spline function depicts more variation than is justified by the data. The penalized spline can be considered a variant of the smoothing spline with a more flexible choice of knots, bases, and penalties. A smoothness penalty was introduced by O'Sullivan (1986), who implemented the procedure by penalizing the square of the second derivative of the fitted spline function. Thereafter, Eilers et al. (1996) revealed that this penalty could also be based on higher-order finite differences of adjacent B-splines. Vaupel et al. (1979) coined the term frailty and used it in univariate survival models.
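The difference-based penalty of Eilers et al. can be illustrated directly: for B-spline coefficients β, the roughness penalty is λ Σ (Δ²β_i)², which vanishes when the coefficients lie on a straight line. The sketch below is our own illustration; the smoothing parameter λ is hypothetical:

```python
def second_diff_penalty(beta, lam):
    """P-spline roughness penalty: lam times the sum of squared second-order
    differences of adjacent B-spline coefficients (Eilers & Marx style)."""
    d2 = [beta[i + 2] - 2 * beta[i + 1] + beta[i] for i in range(len(beta) - 2)]
    return lam * sum(x * x for x in d2)
```

Coefficients on a straight line (a linear fit) incur zero penalty, while an oscillating coefficient vector is penalized, which is exactly how the penalty discourages over-wiggly spline fits.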

Frailty Model
Frailty models offer an improved way of integrating random effects into a given model to account for association and unobserved heterogeneity. Generally, a frailty model can be considered a proportional hazards model with an unobserved multiplicative random effect Z, so that the marginal survival function is obtained by integrating over the frailty density g(z):

S(t | X) = ∫ S(t | Z = z, X) g(z) dz = E[ S(t | Z, X) ] = L(Λ(t | X))    (3.3.3.4)

where L denotes the Laplace transform of the frailty distribution and Λ(t | X) the cumulative hazard.
Univariate frailty models are not identifiable from the survival information alone. However, Elbers and Ridder (1982) proved that a frailty model with finite mean is identifiable with univariate data when covariates are included in the model.
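Equation (3.3.3.4) takes a closed form for a gamma frailty with mean 1 and variance θ, since the gamma Laplace transform gives S(t) = (1 + θΛ(t))^(-1/θ). The sketch below is our own Python illustration (θ and Λ are hypothetical values), with a Monte Carlo check of the expectation E[exp(-ZΛ)]:

```python
import math
import random

def gamma_frailty_surv(cum_hazard, theta):
    """Marginal survival via the gamma Laplace transform:
    S(t) = E[exp(-Z * Lambda(t))] = (1 + theta * Lambda(t)) ** (-1 / theta)
    for a gamma frailty Z with mean 1 and variance theta."""
    return (1.0 + theta * cum_hazard) ** (-1.0 / theta)

# Monte Carlo check: average the conditional survival over simulated frailties.
random.seed(1)
theta, Lam = 0.5, 1.2
# shape = 1/theta, scale = theta gives mean 1 and variance theta
draws = [random.gammavariate(1.0 / theta, theta) for _ in range(200_000)]
mc = sum(math.exp(-z * Lam) for z in draws) / len(draws)
```

The simulated average `mc` agrees with the closed form, which here equals (1 + 0.5 × 1.2)^(-2).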

Mixture Cure and Non-Cure Model
Conventionally, mixture cure models have been motivated by the presence of a subgroup of long-term survivors (Taylor, 2000; Peng and Dear, 2000). On the other hand, under non-mixture survival models, the incident of concern is anticipated to occur eventually. Both mixture cure and non-cure models are used in settings where a given fraction of the population under study will never experience the event of interest. The survival function of the mixture cure model is given as

S(t) = P(Y = 1) S(t | Y = 1) + P(Y = 0),

where Y is the susceptibility indicator (Y = 1 if an account is susceptible, and Y = 0 if not).
The conditional survival function modelling the cases that are susceptible is given by a Cox proportional hazards model:

S(t | Y = 1, x) = S_0(t)^{exp(β'x)},

where S_0 is the baseline survival function. In the non-cure mixture context, a Breslow-type estimator is used for estimation of the cumulative baseline hazard, similar to the Cox proportional hazards model. An excellent summary of the non-cure mixture model can be found in Tong et al. (2012).
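A minimal numeric sketch of the mixture cure survival function follows; the baseline survival and parameter values are hypothetical, chosen only for illustration. The key property is that as t grows, S(t) plateaus at the cure fraction P(Y = 0) instead of going to zero:

```python
import math

def mixture_cure_surv(t, p_susceptible, s0, beta_x):
    """S(t) = P(Y=1) * S(t | Y=1) + P(Y=0), with the susceptible part
    following a Cox PH form S(t | Y=1) = S0(t) ** exp(beta'x)."""
    s_u = s0(t) ** math.exp(beta_x)
    return p_susceptible * s_u + (1.0 - p_susceptible)

# Hypothetical exponential baseline survival for the susceptible accounts.
s0 = lambda t: math.exp(-0.1 * t)
```

At t = 0 the survival is 1, and for very large t it settles at 1 − p_susceptible, the fraction of accounts that will never default.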

The Proportional Hazard Model
The Cox PH model specifies the hazard of individual i as λ(t | x_i) = λ_0(t) exp(β'x_i). The partial likelihood is expressed as

L(β) = ∏_{i : δ_i = 1} exp(β'x_i) / Σ_j Y_j(t_i) exp(β'x_j),

where Y_j(u) = 1 when individual j is at risk at time u. The function depends on β, the parameter of interest, and is free of the baseline hazard λ_0(t). We then express the log partial likelihood function of β as

ℓ(β) = Σ_{i : δ_i = 1} [ β'x_i − log Σ_j Y_j(t_i) exp(β'x_j) ].

The log partial likelihood has a unique maximizer, which can be obtained by solving the partial likelihood (score) equation ∂ℓ(β)/∂β = 0.
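The score equation can be solved by Newton-Raphson. The sketch below is our own stdlib-only Python illustration for a single covariate with no tied event times (the study itself fitted these models in R); at each event time the failing subject's covariate is compared with a risk-set weighted average:

```python
import math

def cox_newton(times, events, x, iters=25):
    """Maximize the Cox log partial likelihood for one covariate
    by Newton-Raphson."""
    beta = 0.0
    order = sorted(range(len(times)), key=lambda i: times[i])
    for _ in range(iters):
        score, info = 0.0, 0.0
        for idx, i in enumerate(order):
            if not events[i]:
                continue                      # censored: no partial-likelihood term
            risk = order[idx:]                # subjects still at risk at times[i]
            w = [math.exp(beta * x[j]) for j in risk]
            tot = sum(w)
            xbar = sum(wj * x[j] for wj, j in zip(w, risk)) / tot
            x2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk)) / tot
            score += x[i] - xbar              # d logL / d beta
            info += x2 - xbar ** 2            # -d2 logL / d beta2
        beta += score / info                  # Newton step
    return beta
```

For the toy data below, the score equation can be solved by hand (it reduces to 4u² + u − 1 = 0 with u = e^β), and the iteration recovers the same root.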

The Frailty Model
We use the gamma distribution because it is easy to derive closed-form expressions for the survival, density, and hazard functions, owing to the simplicity of its Laplace transform.

The Mixture Cure Model
Note: Y_m denotes a matured observation, hence repaid at the maturity date; Y_d denotes the occurrence of the event of interest, default; and Y_e denotes early repayment.

For a single event we use a semi-parametric regression model in which the conditional survival probability at time t is modelled, yielding the unconditional survival function and the corresponding observed likelihood; given full information on Y, the complete likelihood function follows.

The gender of the individuals under study is male or female; we denote male as gender 1 and female as gender 2 for the analysis. The individuals are grouped into the age brackets of 18-33, 34-43, 44-53, and above 54 years. The amounts held by the individuals fall under different products, namely the current account, the loan account, and the credit card. The individuals under study are considered to be either married, divorced, single, or widowed.
The individuals' amounts are also grouped in ranges: 0-50,000; 50,001-100,000; 100,001-250,000; 250,001-500,000; 500,001-1,000,000; and over 1,000,000. Note that all the covariates have a p-value of less than 0.001 apart from marital status, which has a p-value of 0.006. The data can be summarized in a table.

Collett (1994) documents the Akaike Information Criterion (AIC) for a given model as a function of its maximized log-likelihood, ℓ, and the number of independently adjusted parameters within the model, K:

AIC = -2ℓ + 2K.

The criterion used for this study is the AIC, as it assigns a score to every single model and provides us with a choice of the model with the best score. The lower the AIC, the better the model.
The Akaike Information Criterion provides a versatile procedure for statistical model identification which is free from the ambiguities inherent in the application of conventional hypothesis testing procedures. The fact that maximum likelihood estimates are, under certain regularity conditions, asymptotically efficient shows that the likelihood function tends to be the quantity most sensitive to small variations of the parameters around the true values.
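The AIC computation itself is a one-liner. In the sketch below, the model names, log-likelihoods, and parameter counts are made-up illustration values, not the study's fitted results:

```python
def aic(loglik, k):
    """AIC = -2 * maximized log-likelihood + 2 * number of parameters."""
    return -2.0 * loglik + 2.0 * k

# Hypothetical fitted models: (maximized log-likelihood, parameter count).
models = {"model A": (-1234.5, 5), "model B": (-1230.2, 9)}
scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)   # lower AIC -> preferred model
```

Note how model B wins despite its extra parameters: its better log-likelihood outweighs the 2K complexity penalty.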

Fisher Scoring
Fisher scoring iteration is concerned with how the model was estimated (Pauline, 2018).
The Newton-Raphson iterative algorithm is used by default in R for logistic regression. Based on an initial approximation of the estimates, a model is fit, and the algorithm then searches for an enhanced fit using alternative approximations. It thus follows the same route using updated values for the estimates and fits the model again. The algorithm stops when it determines that further searching cannot yield any additional enhancement. In our model, we had 719 iterations before the process stopped and output the results.
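For the canonical logit link, Newton-Raphson and Fisher scoring coincide. The following one-covariate sketch is our own simplified Python illustration of the iteration just described, not R's actual `glm` code:

```python
import math

def fit_logistic(xs, ys, iters=50):
    """One-covariate logistic regression by Newton-Raphson (identical to
    Fisher scoring for the canonical logit link)."""
    a, b = 0.0, 0.0                          # intercept and slope estimates
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0      # gradient and information matrix
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)                # Fisher information weight
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det     # Newton step: solve H * step = g
        b += (h00 * g1 - h01 * g0) / det
    return a, b
```

On a small grouped data set the result can be verified against the exact log-odds of each covariate group.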

Results
A log-rank test was first done using the R statistical software, before any analysis, to check the significance of the variables. Spline graphs, box plots, and Kaplan-Meier curves were then generated to visualize the relationships of the variables in determining the probability of default.
Thereafter, the data was analyzed using the survival analysis approaches in the study to determine the most efficient at modelling credit risk.

Log Rank Estimation
Log-rank test estimation was done using the R statistical software in an attempt to study the significance of the variables in our data set. The log-rank test is used to test the null hypothesis that there is no difference between the populations in the probability of an event at any time point.
The analysis is based on the times of events. The log-rank test is based on the same assumptions as the Kaplan-Meier survival curve, namely, that censoring is unrelated to prognosis, that the survival probabilities are the same for subjects recruited early and late in the study, and that the events happened at the times specified. Deviations from these assumptions matter most if they are satisfied differently in the groups being compared, for example, if censoring is more likely in one group than another. Because the log-rank test is purely a test of significance, it cannot provide an estimate of the size of the difference between the groups or a confidence interval. From Table 2 above, it is clearly evident that the variables in our data set are significant to our study, since they all had p-values of less than 0.05.
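The two-sample log-rank statistic compares, at each event time, the observed number of defaults in one group with its expectation under equal hazards. The sketch below is our own stdlib-only Python illustration (the study used R's implementation):

```python
def logrank(times1, events1, times2, events2):
    """Two-sample log-rank statistic, approximately chi-squared with
    1 degree of freedom under the null of equal hazards."""
    data = ([(t, e, 1) for t, e in zip(times1, events1)] +
            [(t, e, 2) for t, e in zip(times2, events2)])
    event_times = sorted({t for t, e, _ in data if e})
    O = E = V = 0.0
    for t in event_times:
        n1 = sum(1 for tt, _, g in data if tt >= t and g == 1)  # at risk, group 1
        n2 = sum(1 for tt, _, g in data if tt >= t and g == 2)  # at risk, group 2
        d1 = sum(1 for tt, e, g in data if tt == t and e and g == 1)
        d2 = sum(1 for tt, e, g in data if tt == t and e and g == 2)
        n, d = n1 + n2, d1 + d2
        O += d1                               # observed events in group 1
        E += d * n1 / n                       # expected under equal hazards
        if n > 1:                             # hypergeometric variance term
            V += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    return (O - E) ** 2 / V
```

Identical groups give a statistic of zero, while clearly separated default times push the statistic past the usual 3.84 critical value at the 5% level.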

Kaplan-Meier (KM) curves
The Kaplan-Meier method was used to generate the KM curves of the variables. The curves were obtained only for the age bracket, gender, marital status, and product name covariates. It is shown from the KM curves that young people between the ages of 18 and 33 years, males, single individuals, and individuals with a credit card account are more likely to default on a loan, as shown below;
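The KM curves are built from the product-limit estimator S(t) = Π_{t_i ≤ t} (1 − d_i/n_i), with d_i defaults and n_i subjects at risk at event time t_i. A minimal sketch (our own Python rendering, handling right-censoring) that returns the stepped survival values:

```python
def kaplan_meier(times, events):
    """Product-limit estimate: at each distinct time, multiply the running
    survival by (1 - d/n), where d counts events and n counts those at risk."""
    pairs = sorted(zip(times, events))
    n = len(pairs)
    s, curve, at_risk, i = 1.0, [], n, 0
    while i < n:
        t = pairs[i][0]
        d = sum(1 for tt, e in pairs if tt == t and e)   # events at time t
        s *= (1.0 - d / at_risk)                         # censoring leaves s unchanged
        removed = sum(1 for tt, _ in pairs if tt == t)   # leave the risk set
        at_risk -= removed
        curve.append((t, s))
        i += removed
    return curve
```

With times [1, 2, 3] and all defaults observed, the curve steps down 2/3, 1/3, 0; a censored observation only shrinks the risk set without dropping the curve.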

Survival Analysis Models Estimation
The survival analysis models were analyzed using the R statistical software. The Cox proportional hazard model was generally significant. The model worked well with all the covariates in the estimation of PD. The frailty model likewise worked well with all the covariates.
Individuals in the age bracket of 18 to 33 years are more likely to default.
This can be attributed to most of them being youths who are still studying and depend on their parents for financial support. Mostly they are unemployed, thus lacking financial stability.
Males are more likely to default on a loan compared to females. This can be attributed to men being breadwinners for their families as well as having extended responsibilities. This can lead to one having many loans, which, if one is not financially stable, can lead to default. Individuals with a credit card account are most likely to default, followed by those with a current account and then the loan account individuals.
Individuals that are single are also more likely to default on a loan. This arises in that they are not committed to any responsibilities, as they have no one depending on them. Mostly they live a luxurious life which they may be unable to maintain; thus they will borrow money to spend with no future thought of investment.

Discussions
In this study, we evaluate the effectiveness of five survival analysis models in credit risk scoring.
We used the Akaike Information Criterion (AIC) as the main performance evaluation measure.
From the study, it is clearly evident that all the models were significant in the analysis of the Kenyan real data set. However, the Cox PH model outperformed the other models in comparison, although the other models did not perform significantly differently in most cases. The mixture cure and non-cure models performed essentially the same; however, the frailty model performed better.
Comparison between the Cox PH model, the frailty model, and the mixture and non-mixture models assuming different distributions was assessed using the AIC, where a lower AIC value indicates a better model fit. The study concludes that the Cox PH model is more efficient in the analysis of the Kenyan real data set compared to the frailty, penalized spline, and mixture cure and non-cure models. This was a result of it having the smallest AIC of 39,747, followed by the frailty model, which had an AIC of 42,100. The non-mixture cure model had an AIC of 44,478, and lastly the mixture cure model emerged as the least efficient survival analysis model with an AIC of 44,503, as shown below:

Model                      AIC
Cox PH model               39,747
Frailty model              42,100
Non-mixture cure model     44,478
Mixture cure model         44,503

Conclusions and Recommendations
Survival analysis is advantageous in that the time to default can be modeled, and not just whether an applicant will default or not. Furthermore, the models revisited collectively have the advantage of not requiring the survival function to go to zero as time goes to infinity, a property often regarded as appropriate for credit risk data. In the study, there was a challenge of finding an appropriate evaluation measure that is valid across all the survival analysis methods compared. In the future, it could be appropriate to extend the mixture cure and non-cure models and study their performance in contrast with the Cox PH model and some of its extensions. It would also be interesting to run all the models again on data that have been coarse-classified and compare the results with other researchers' studies.