Adaptive Neuro-Fuzzy Inference System for Mortgage Loan Risk Assessment

Abstract: Mortgage lending is one of the major businesses of mortgage institutions and usually involves granting loans to prospective customers who want to own a home but lack sufficient capital to do so. Granting mortgage loans carries considerable risk, which may eventually threaten the continuity of such institutions if not properly managed. In recent times, several techniques for mortgage loan risk assessment have been proposed. However, a technique that can learn and adapt while incorporating current knowledge of mortgage loan practices is still lacking. Therefore, this research proposes a hybrid decision support system in which a neural network is used to build learning and adaptive capabilities into a fuzzy inference module for mortgage loan risk assessment. The performance of the proposed hybrid system was investigated based on the accuracy of loan risk prediction and the mean absolute deviation metric. Experimental results show that the hybrid system performs better than the non-adaptive fuzzy inference system. Our findings suggest that the proposed method would efficiently predict the risk associated with mortgage loan applicants and thereby promote mortgage lending in such institutions.


Introduction
The concept of a mortgage rests on the fact that people need places to live but often cannot afford the high cost of such accommodation, and therefore need partial or full financial support from a bank (a mortgage bank). A mortgage is most often described as a security-backed loan: when a borrower obtains a mortgage from the bank, the physical property purchased or built with the loan is indirectly owned by the bank and serves as collateral for the loan until the borrower repays the money. When a borrower defaults (is unable to pay back within the stipulated period), the mortgage institution uses the property title to auction the property, and the outcome of the sale may or may not favor both parties. When customers meet the agreed terms of the mortgage loan, the institution makes its profit, becomes more stable, and functions effectively. However, when customers fail to meet the agreed terms, the institution is put at risk, which often results in a loss and affects its smooth operation. Such losses usually arise from unpaid loan installments comprising the principal and interest, loss of value at the auction sale relative to the current market price, and incurred administrative expenses. These losses have been largely attributed to the poor manner in which credit applicants are assessed before being granted such loans. Traditionally, loans were granted on a judgmental basis using the experience of credit officers, and such methods are associated with subjectivity, inconsistency, and individual preferences motivating decisions [1,2].
In an attempt to address the challenges associated with the traditional methods of granting mortgage loans and to help credit managers make good decisions, a number of methods for mortgage loan assessment have been proposed. First came the development of credit scoring systems and the use of statistical methods such as discriminant analysis and linear regression to evaluate loan risk. With advances in technology, several techniques such as fuzzy logic, neural networks, genetic algorithms, the analytic hierarchy process, and k-nearest neighbor, among others, have emerged. Increases in the complexity of decision making, loan defaults, and competition in the banking market, together with the limitations of statistical methods, have led to the use of these techniques in the management of loan risk. For instance, artificial neural networks (ANNs) are one of the artificial intelligence concepts that researchers have used for analyzing relationships among economic and financial phenomena, prediction, generating time series, optimization, and decision making [3]. This technique has provided an effective means of granting loans because it can model very complex linear and non-linear relationships, including mathematical and logical relationships that are unknown to credit managers, and because it has learning capability [4,5]. However, a major setback of the ANN approach is its black-box nature, which makes it difficult for credit managers to understand how a particular credit decision was made; an ANN is also difficult to modify once it has been trained. On the other hand, fuzzy logic (FL) is a powerful problem-solving technique whose approach imitates human decision-making strategies. FL has been successfully used to model nonlinear functions of arbitrary complexity to a desired degree of accuracy and can conveniently map an input space to an output space [6].
However, unlike ANNs, FL has major challenges: it lacks effective learning and adaptation capabilities, it requires considerable effort to develop a good set of membership function parameters, and its rule base construction usually requires extensive knowledge from domain experts.
In the current competitive and turbulent business environment, one of the basic tasks that banks are expected to handle is the minimization of their loan risk [7]. To reduce potential losses due to unreliable borrowers, banks must be able to measure mortgage loan risk properly. To achieve this, there is a need to combine the ANN and FL techniques in order to develop a robust system for mortgage loan risk assessment, since the strengths of each technique compensate for the drawbacks of the other. In this study, the concept of hybridizing neural networks and fuzzy logic (Neuro-Fuzzy), proposed in 1990 [8,9], is adopted to assess mortgage loan risk with the aim of providing a tool that would help mortgage institutions minimize their loan risk, maximize profit, and enhance the quality of their services. A major significance of this study is that it provides an objective and efficient platform for evaluating mortgage loan applicants, which is currently lacking in mortgage institutions.

Related Works
A credit risk assessment model for mortgage lending that uses discriminant analysis to classify customers' credit risk as either high or low was proposed in [10]. An evaluation of the model using 250 customers' records revealed that the four inputs used had different influences on the classification accuracy. Debt-to-income ratio appeared to be the most contributive input, while credit card debt was the least. Despite the good classification performance recorded, the model relied on restrictive statistical assumptions that are rarely satisfied in real life. Fernando et al. [11] described a framework to adjust trade-offs among mortgage loan evaluation criteria by applying the analytic hierarchy process to an existing credit scoring system used by a bank in Portugal. The model was developed by carrying out pairwise comparisons based on data elicited from a set of questionnaires. The outcome of the analysis showed the framework to be effective for a small dataset. Aida [12] presented a bank credit risk analysis with a K-nearest neighbor (K-NN) classifier. The aim was to predict defaults in short-term loans for a Tunisian commercial bank. The K-NN classifier's performance was evaluated using credit records of various customers who were granted loans between 2003 and 2006. Classification was performed with different values of K (2, 3, 4 and 5), and the best result (88.6%) was achieved when K=3. Ghatge et al. [13] proposed an ensemble neural network strategy for credit default evaluation. The model was used to forecast the credit risk of a panel of nationalized banks, with results showing that the NN model has a statistically significant predictive advantage over manual calculation. Shorouq et al. [4] developed a neural-network-based model for loan decision making using Jordanian commercial banks as a case study.
However, identifying the key variables that influence loan approval decisions was a critical issue, and the NN required a long training time. Also, the NN model was built by trial and error, which took a lot of effort. Umar et al. [14] used fuzzy logic to develop a credit scoring model for a micro-finance institution in Ghana in order to minimize credit defaults and ensure its continued existence. A comparative analysis between the existing model and their developed model showed that the developed model is more effective in evaluating credit applications where human judgment is involved. However, the use of neural networks to incorporate learning and adaptation into fuzzy logic systems for mortgage loan risk assessment has not been explored. Moreover, previous studies on mortgage loan risk have been limited to two possible output statuses. Therefore, a hybrid decision support system based on neural network and fuzzy logic concepts for mortgage loan risk assessment with four output statuses is proposed in this work.

Proposed System's Architecture
In this section, the framework of the proposed hybrid system for mortgage loan risk assessment is presented in Figure 1 and described as follows. The figure is a block diagram of the conceptualized hybrid system, which begins with a user who enters the required information about loan applicants into the database. This information is later retrieved from the database and fed into the fuzzification module as crisp values representing the mortgage loan risk assessment variables considered for a particular loan applicant. The fuzzy rule base of the system was built from the knowledge elicited from a set of mortgage loan experts, and it represents the backbone of the fuzzy inference engine. The neural network technique was employed to incorporate learning capability into the inference engine by automatically tuning the membership function parameters and fuzzy rules of the fuzzy inference system (FIS). The defuzzification module provides a means of displaying the outcome of the assessment for a particular loan applicant and shows whether or not the applicant is worthy of being granted the loan applied for. A total of 233 records of mortgage loan applicants were collected from a mortgage bank (Resort Savings and Loan Plc., Lagos, Nigeria). While preprocessing the dataset, we identified ten influential and relevant input variables (Age, Service Years, Income per annum, Dependent number, Loan Tenor, Employment Type, Civil Status, Loan History, Cash Flow and Nature of Occupation) that could aid decision making, and each of them was graded into three categories (low, moderate, and high) with the assistance of loan officers of the bank.

Adaptive Neuro-Fuzzy Inference System (ANFIS) for Loan Risk Assessment
The hybrid system was developed using the Sugeno fuzzy inference technique, with gradient descent combined with a least-squares estimate learning algorithm to incorporate learning and adaptation into the FIS. The hybrid system consists of five layers, as shown in the structural diagram presented in Figure 2, where $x_1, x_2, \dots, x_{10}$ represent the ten mortgage loan input variables obtained from each applicant's data; $A_1, A_2, A_3, B_1, B_2, B_3, \dots, J_1, J_2, J_3$ are the categories of their corresponding linguistic terms (low, moderate, and high); $w_1, w_2, \dots, w_n$ represent the firing strengths of the rules; and $\bar{w}_1, \bar{w}_2, \dots, \bar{w}_n$ represent the normalized firing strengths. In the first layer, each node corresponds to one linguistic term of one of the input variables. In other words, the output of this layer, $O_{1,i}$, represents the membership value that specifies the degree to which an input variable belongs to a fuzzy set, as shown in equation (1).
$$O_{1,i} = \mu_{A_i}(x_1) \ \text{or}\ \mu_{B_i}(x_2) \ \text{or}\ \dots \ \text{or}\ \mu_{J_i}(x_{10}), \qquad i = 1, 2, 3 \qquad (1)$$
where $x_1, x_2, \dots, x_{10}$ are the inputs to node $i$ and $A_i$, $B_i$, ..., or $J_i$ is a linguistic term ("low", "moderate", or "high") associated with this node. Equation (2) represents the generalized bell-shaped membership function (MF) employed in this study to define the degree of membership of each of the input variables.
$$\mu(x) = \frac{1}{1 + \left| \dfrac{x - c}{a} \right|^{2b}} \qquad (2)$$
where $c$ determines the center of the corresponding MF; $a$ is the half width; and $b$ (together with $a$) controls the slopes at the crossover points. The final shapes of the input variables' MFs are fine-tuned during network learning.
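As a minimal illustration of the generalized bell membership function, the following Python sketch evaluates the membership value 1 / (1 + |(x - c)/a|^(2b)) for a crisp input. The function names `gbell` and `fuzzify`, and the parameter values for the three linguistic terms of an age variable, are hypothetical and chosen only for illustration; the tuned parameters of the study are not reproduced here.

```python
def gbell(x, a, b, c):
    """Generalized bell membership value of crisp input x.

    a: half width, b: slope control, c: center of the membership function.
    """
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Hypothetical (a, b, c) parameters for the three terms of an "age" input.
AGE_TERMS = {
    "low": (10.0, 2.0, 25.0),
    "moderate": (10.0, 2.0, 45.0),
    "high": (10.0, 2.0, 65.0),
}


def fuzzify(x, terms):
    """Layer-1 outputs: membership of x in every linguistic term."""
    return {name: gbell(x, a, b, c) for name, (a, b, c) in terms.items()}


# An applicant aged 35 lies halfway between the "low" and "moderate" centers.
memberships = fuzzify(35.0, AGE_TERMS)
```

The membership equals 1 at x = c and 0.5 at x = c ± a, which is why a is described as the half width.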
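T-norm operators were employed in the second layer to compute the antecedent part of each rule in the fuzzy rule base. Here, each node calculates the firing strength of a rule via the multiplication operation, as shown in equation (3).
$$w_i = \mu_{A}(x_1) \times \mu_{B}(x_2) \times \dots \times \mu_{J}(x_{10}) \qquad (3)$$
where $w_i$ is the firing strength of the $i$th rule and each factor is the layer-1 membership value of the linguistic term that the rule uses for the corresponding input variable.
The output of this layer represents the firing strength of the corresponding fuzzy rule.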
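In the third layer, every node calculates the ratio of the $i$th rule's firing strength to the sum of all rules' firing strengths, as shown in equation (4).
$$\bar{w}_i = \frac{w_i}{\sum_{k=1}^{n} w_k} \qquad (4)$$
where $w_i$ is the firing strength of the $i$th rule and $n$ is the number of rules.
The outputs of this layer are called normalized firing strengths.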
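The second and third layers can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the membership values in the example are hypothetical, and each rule is assumed to receive one membership value per input variable from the first layer.

```python
from math import prod


def firing_strength(memberships):
    """Layer 2: product T-norm over the membership values of one rule's antecedents."""
    return prod(memberships)


def normalize(strengths):
    """Layer 3: divide each rule's firing strength by the sum over all rules."""
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fired; normalized strengths are undefined")
    return [w / total for w in strengths]


# Two hypothetical rules, each with two antecedent membership values.
w = [firing_strength([0.5, 0.8]), firing_strength([0.5, 0.2])]
w_bar = normalize(w)
```

Because of the normalization, the values in `w_bar` always sum to 1, so the final output is a weighted average of the rule consequents.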
In the fourth layer, every node has a node function that multiplies the node values by their corresponding weights, as shown in equation (5).
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left( p_i x_1 + q_i x_2 + \dots + r_i \right) \qquad (5)$$
where $\bar{w}_i$ is the normalized firing strength from layer 3 and $p_i, q_i, \dots, r_i$ represent the parameter set of the node. The parameters in this layer are referred to as consequent parameters, and they are modified during network learning using the least-squares estimate algorithm. The single node in the fifth layer computes the overall output as the summation of all incoming signals, as shown in equation (6), and this was further simplified into equations (7) and (8).
$$O_{5} = \sum_{i=1}^{n} \bar{w}_i f_i \qquad (6)$$
The overall output of the proposed system was then defuzzified into a crisp form, which represents the prediction result for a given mortgage loan applicant. The loan risk levels, the linguistic terms of the output prediction membership, and their associated value ranges are shown in Table 1.
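The fourth and fifth layers of a first-order Sugeno system can be sketched as follows. The sketch assumes that each rule's consequent is a linear function of the inputs plus a constant term; the function names, the two-input example, and all parameter values are hypothetical and are not taken from the study.

```python
def rule_output(x, coeffs, bias):
    """Linear consequent of one rule: coefficients times inputs, plus a constant."""
    return sum(p * xi for p, xi in zip(coeffs, x)) + bias


def anfis_output(x, norm_strengths, consequents):
    """Layers 4 and 5: weight each rule's consequent by its normalized
    firing strength and sum the results into one crisp output."""
    return sum(
        w_bar * rule_output(x, coeffs, bias)
        for w_bar, (coeffs, bias) in zip(norm_strengths, consequents)
    )


# Hypothetical two-input, two-rule example.
x = [1.0, 2.0]
consequents = [([1.0, 1.0], 0.0), ([0.0, 0.0], 10.0)]
y = anfis_output(x, [0.25, 0.75], consequents)  # 0.25 * 3 + 0.75 * 10
```

Mapping the crisp output to a risk level would then be a lookup against the value ranges in Table 1, which are not reproduced here.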

Experiment and Results
The proposed hybrid mortgage loan risk assessment system was implemented in MATLAB (Matrix Laboratory) R2014a, which was also used to pre-process the acquired dataset. During the testing phase of this work, the dataset of 233 mortgage loan applicants was divided into three portions (training, testing, and validation data): 60% (140) of the records were used for training, 20% (46) for testing, and the remaining 20% (46) for validating the system. The Sugeno-type FIS was constructed using the generalized bell membership function with three membership grades (low, moderate, and high) for each of the mortgage loan input variables. Figure 3 shows the bell-shaped membership function of one of the input variables. The fuzzy rule base was populated with 30 rules based on experts' knowledge in the field of mortgage loan administration. The antecedent parts of the rules were combined using the product operator, and each rule's weight was determined using the product implication method.
The developed FIS was trained using a hybrid learning algorithm (least-squares estimate and gradient descent). The training error tolerance and the number of epochs were set to 0 and 30, respectively, and the training process stopped when the training error goal was achieved. In the forward pass, the algorithm uses the least-squares estimate method to identify the consequent parameters; in the backward pass, the errors are propagated backward while the premise parameters are updated by gradient descent. During the training process, the membership function parameters of the FIS were adjusted accordingly, thereby enhancing its prediction performance. Table 2 summarizes the activities of the two passes of the hybrid learning algorithm adopted in developing the proposed mortgage loan risk assessment system.
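The forward pass of the hybrid algorithm can be illustrated with a small Python sketch. With the premise (membership function) parameters held fixed, the system output is linear in the consequent parameters, so they can be estimated by least squares. The sketch below assumes a single-input system with one rule per membership function and solves the normal equations by Gaussian elimination; these choices, the function names, and the parameter values are assumptions made for illustration and do not reproduce the MATLAB implementation used in the study. The gradient descent backward pass is omitted.

```python
def gbell(x, a, b, c):
    """Generalized bell membership value (half width a, slope b, center c)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def design_row(x, premise):
    """One row of the linear system: [w1*x, w1, w2*x, w2, ...],
    where w_i are the normalized firing strengths for input x."""
    w = [gbell(x, a, b, c) for (a, b, c) in premise]
    total = sum(w)
    row = []
    for wi in w:
        row.extend([wi / total * x, wi / total])
    return row


def fit_consequents(rows, targets):
    """Least-squares estimate: solve (A^T A) theta = A^T y."""
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    theta = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][k] * theta[k] for k in range(i + 1, n))
        theta[i] = s / m[i][i]
    return theta


# Hypothetical premise parameters (a, b, c) for two rules on one input.
premise = [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]
xs = [0.25 * k for k in range(21)]
rows = [design_row(x, premise) for x in xs]
```

Given target outputs `ys` for the inputs `xs`, `fit_consequents(rows, ys)` returns the slope and constant of each rule's consequent in order.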
The outcome of the training, validation, and testing sessions of the FIS is presented as follows. An optimal training result was achieved with an error of 1.3516e-7 at the optimal number of epochs (the 30th epoch), as shown in Figure 4. The performance of the trained FIS was then examined by plotting the training and checking errors against the number of epochs, as shown in Figure 5. The plot in Figure 5 shows the checking error as ♦♦ at the top, while the training error appears as ** at the bottom. The checking error assumed a constant value at all points during training, which implies that there is no model overfitting. In situations like this, ANFIS may choose the model parameters associated with any of the checking errors, since they are all equal. An average error of 1.3522e-07 was also recorded, which suggests that the FIS was adequately trained.
The next step was to check the trained FIS against both the testing and checking datasets to confirm that the FIS had been properly trained with appropriate sets of membership function values and was ready for use. The checking dataset was compared with the output of the trained FIS, where the blue plus points represent the checking data and the red star points represent the predicted outputs of the trained FIS, giving an average checking error of 0.0099 (Figure 6). The essence of this step was to test the generalization capability of the trained FIS at each epoch. Finally, the testing dataset was used to test the performance of the trained FIS, where the blue diamond points represent the testing data output and the red star points represent the predicted outputs of the trained FIS, with an average testing error of 0.0079 (Figure 7). This procedure was carried out to guard against overfitting, since the ANFIS structure is fixed and it tends to overfit the data on which it is trained. It can be observed that the testing and checking data are very close to the output of the trained FIS (Figures 6 and 7), which suggests that the trained FIS would most likely perform well on new mortgage loan applicant data. After developing and testing the proposed hybrid system, a relationship was established between each of the loan input variables and the corresponding prediction results. From this relationship, we identified the variables (age, credit history, income, and occupation) that exhibit a progressive, linear-like correlation with the system's prediction, as shown in Figure 8.
From the plots presented in Figure 8, it can be observed that these variables have a direct influence on the prediction outcome, since an increase in the value of a variable results in an increase in the prediction outcome. Considering also the relationship between pairs of input variables and the mortgage risk prediction outcome, it was observed that certain pairs of input variables have a significant influence on the prediction outcome. The results obtained for these variable pairs are shown in the three-dimensional surface viewer plots presented in Figure 9. In these 3-D surface plots, the blue surfaces indicate low prediction values, the yellow surfaces show high prediction outputs, and the intermediate (green-like) surfaces lying between the blue and yellow regions show moderate prediction values. The essence of investigating the relationship between the input variables and the prediction outcome was to identify the variables that are most influential in the granting of mortgage loans to applicants, so that more emphasis can be placed on such variables for better decision making.

System Evaluation
In this section, the performance of the proposed hybrid decision support system was investigated and compared with that of the non-adaptive FIS, based on 10 randomly selected records of mortgage loan applicants, using the prediction accuracy and mean absolute deviation metrics. Table 3 shows the outcome of risk prediction for the conventional, non-adaptive FIS and for the proposed system (ANFIS). The performance accuracies of the FIS and the ANFIS (proposed system) were computed as shown in equations (9) and (10), respectively, using the mean absolute deviation. The results of the evaluation show that the proposed hybrid system (ANFIS) provides better prediction results in assessing mortgage loan risk than the FIS.
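For reference, the mean absolute deviation between actual and predicted risk values can be computed as in the Python sketch below. It shows only the standard definition of the metric (the average of the absolute differences); the function name and the example values are hypothetical, and the way the reported accuracy percentages are derived from the deviation in equations (9) and (10) is not assumed here.

```python
def mean_absolute_deviation(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical risk scores for three applicants (not values from Table 3).
actual = [0.20, 0.50, 0.90]
predicted = [0.25, 0.45, 0.80]
mad = mean_absolute_deviation(actual, predicted)
```

A smaller deviation indicates that a system's predictions lie closer to the expected risk values, which is the basis on which the two systems are compared.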

Conclusion
Minimizing mortgage loan risk is one of the primary challenges faced by most mortgage institutions, and as a result, several methods for mortgage lending have been proposed in the past. However, none of these methods has provided both learning and adaptive capabilities, which are two core features of intelligent systems. In this work, a neuro-fuzzy decision support system for mortgage loan risk assessment is proposed. A neural network model was built to train a developed FIS, which predicts the risk levels of mortgage loan applicants. After several rounds of training, the hybrid system was able to learn the underlying relationships between applicants' input variables and their corresponding targets. The hybrid system eventually became adaptive, thereby correctly predicting the risk levels for new mortgage loan applicant data. Using the mean absolute deviation metric, the proposed hybrid system attained an overall average prediction accuracy of 95.9%, compared to 91.7% for the FIS. The findings of this study suggest that the proposed hybrid decision support system would greatly reduce the risk often associated with granting mortgage loans to applicants and also provide an objective evaluation platform for mortgage institutions. Further studies would employ more data for training and use a multi-criteria decision-making technique, such as the analytic hierarchy process, to rank the input variables in order of importance and then assign weights to them to improve the robustness of the system.