Predictability of Financial Crisis via Pair Coupling of Commodity Market and Stock Market

The complex interactions between the stock market and the commodity market during financial crises have been investigated by many researchers, but less is known about how useful the pair coupling of the two markets is for predicting financial crisis, where the pair coupling is the hidden essence of market interactions. This article investigates three kinds of couplings, namely time coupling, frequency coupling and space coupling, which are different aspects of the pair coupling. In addition, a two-layer model, namely CHMM-ANN, is proposed to investigate the couplings and evaluate their predictive ability. A Coupled Hidden Markov Model (CHMM) is adopted at the bottom level to capture the hidden couplings, and the couplings are then fed as input to a classical Artificial Neural Network (ANN) at the top level to predict financial crisis. The experimental results on real financial data confirm the advantages of the pair coupling in predicting financial crisis.


Introduction
Since the contagion effect of the subprime mortgage crisis that began in 2007 caused severe damage to the global economy, considerable attention has been paid to the complex transmissions and co-movements between different financial markets. In particular, the correlation between the stock market and the commodity market during financial crises is a crucial research area, since both market indexes are intrinsically linked with the economy [1]. In the literature, there is robust evidence documenting information transmission between the two markets, which leads to "market fluctuations" [2]; that is, the transmission is a key driver of changes in market indexes (e.g. the WTI oil price). Moreover, different transmission features are observed under structural changes in the economy [3]. Therefore, exploring the underlying pair coupling between the two markets could deepen the understanding of financial crisis. Here the pair coupling refers to the interactions and transmissions between two financial markets.
The main aim of this study is to investigate whether the pair coupling of the commodity market and the stock market can yield accurate predictions of financial crisis, a question that has not attracted much attention in the existing literature. In order to fully capture the pair coupling, three kinds of couplings, which reflect different aspects of the pair coupling, should be considered: time-coupling (TC), which represents the short-term (e.g. weekly) interactions between the two markets; frequency-coupling (FC), which indicates the market interactions across various time scales (this study investigates two kinds of FC, where FC(M) represents the mid-term coupling and FC(L) denotes the long-term coupling); and space-coupling (SC), which captures the market interactions in different spaces, e.g. different countries (in this study, SC-A represents the coupling between the stock market in country A and the commodity market). In addition, the complex couplings are hidden behind the observations (e.g. market indexes), which means that they cannot be observed directly from the original data. This greatly increases the difficulty of exploring them.
To address this issue, this study builds a two-layer model. At the bottom layer, a Coupled Hidden Markov Model (CHMM) is adopted to learn the three kinds of couplings between the commodity and stock markets, since the CHMM is a powerful model for capturing multiple coupled processes [4]. The learned hidden couplings are then fed into a classic Artificial Neural Network (ANN) to predict financial crisis.
The remainder of this study is organized as follows. Section 2 reviews the related literature on financial crisis forecasting methods. Section 3 presents the methodology, including the methods applied in the study and the proposed CHMM-ANN model. Section 4 describes the data, experimental settings and corresponding results. Section 5 concludes.

Literature Review
A financial crisis refers to a situation in which some financial assets suddenly lose a large part of their nominal value. Since a crisis such as the subprime mortgage crisis that began in 2007 has a large and damaging effect not only on individual investors but also on societies, there have been several attempts at crisis detection in order to avoid large losses. Generally, recent efforts at detecting financial crisis have taken the following three related forms.

Signal Approach
The signal approach was proposed by Kaminsky and Reinhart [5]. The basic idea is to find the differences in economic behavior on the eve of financial crises as opposed to normal periods. As illustrated in several studies [6][7][8][9], market indexes such as the exchange rate or the stock market index are often used as indicators. If they exceed a specified threshold, a crisis signal is produced. Since the main limitation is that the method relies on the selection of indicators and the threshold value, Kaminsky [5] proposes four methods for integrating information. But this does not solve all the problems: Yu et al. illustrate that a very high noise-to-signal ratio is produced if some of the indicators are strongly correlated [10], and market indexes are always closely related in the real world. This would lead to biased results.

Time Series Models
The basic idea behind this kind of approach is to predict the probability that a financial crisis occurs in the following time period by using the historical data of some selected explanatory market variables [11]. The typical models are Logistic and Probit models. For instance, Kumar et al. adopt the Logistic approach to predict emerging market currency crashes with pooled data on 32 developing countries from January 1985 to October 1999 [12], and similar works using the logistic model to predict financial crisis are easy to find [13][14][15]. In addition, Eichengreen et al. use the Probit approach to detect exchange market crises by using the data of 20 OECD countries from 1959 to 1993 [16]. Likewise, Berg and Pattillo apply a Probit regression technique to predict the Asian currency crisis [6]. Although these approaches can capture all the information contained in the selected market variables, the occurrence of financial crisis is a rare event and reveals non-linear characteristics, so models with linear assumptions may lead to disappointing results.

Machine Learning-Based Models
As computational technology has become widely used in business prediction, model-based approaches have begun to develop. This kind of approach adopts artificial intelligence and machine learning techniques to detect financial crisis [10]. Techniques such as Neural Network (NN) [17], Support Vector Machine (SVM) [18], Fuzzy Logic (FL) [19] and Decision Tree (DT) are adopted by researchers. Some recent studies reveal that the Artificial Neural Network (ANN) is a useful tool for crisis detection with promising results. For example, Fioramanti applies ANN to predict sovereign debt crises using data from 1980 to 2004 in developing countries, and the results demonstrate the superiority of ANN when compared with consolidated methods [20].
As cited above, the related approaches mostly focus on the selection of market variables to predict financial crisis, and little attention has been paid to the complex interactions between markets. Since it has been demonstrated that different interaction features of the stock and commodity markets are observed under structural changes in the economy [3], this paper predicts financial crisis by capturing the complex pair couplings between the two markets with a CHMM-ANN model. The proposed model first learns the pair coupling with the CHMM, and the couplings are then fed into the ANN to predict financial crisis.

Coupled Hidden Markov Model
The CHMM was proposed to model multiple processes with coupling relationships. It extends the Hidden Markov Model (HMM) [21], in which the system being modeled is assumed to be a Markov process with hidden states. A CHMM consists of more than one HMM chain, and each HMM represents one process. Its parameters include the prior probability of the initial state. In order to explore the pair coupling of the commodity market and the stock market, each market is matched to one Markov chain in this study, which means that each market index sequence is mapped to the observations of one Markov chain as input. Also, different time scales are selected to specify the frequency-coupling, and stock markets in various countries are adopted to describe the space-coupling. The extended forward-backward procedure provided by Zhong and Joydeep [22] is used to estimate the corresponding parameters.
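
To make the two-chain structure concrete, the following is a minimal sketch of the forward recursion for a CHMM with one chain per market. It is an illustration under stated assumptions, not the estimation procedure of [22]: observations are assumed to be discretised into two symbols (0 = index down, 1 = index up), the next state of each chain is assumed to follow a convex mix of a within-chain and a cross-chain transition matrix, and every name and number in `toy` is hypothetical.

```python
from itertools import product


def coupled_transition(prev_own, prev_other, nxt, a_own, a_cross, w_own):
    """P(next state of one chain | previous states of both chains).

    Assumed form: a convex mix of a within-chain transition matrix and a
    cross-chain one, with weight w_own on the chain's own past.
    """
    return w_own * a_own[prev_own][nxt] + (1.0 - w_own) * a_cross[prev_other][nxt]


def chmm_likelihood(obs_a, obs_b, model):
    """Forward recursion over the joint hidden state (state_a, state_b)."""
    k = len(model["pi_a"])
    states = list(product(range(k), repeat=2))
    # Initialisation: independent priors times emission probabilities.
    alpha = {
        (i, j): model["pi_a"][i] * model["pi_b"][j]
        * model["emit_a"][i][obs_a[0]] * model["emit_b"][j][obs_b[0]]
        for i, j in states
    }
    for t in range(1, len(obs_a)):
        new_alpha = {}
        for i2, j2 in states:
            total = 0.0
            for (i1, j1), mass in alpha.items():
                p_a = coupled_transition(i1, j1, i2, model["a_aa"], model["a_ba"], model["w_a"])
                p_b = coupled_transition(j1, i1, j2, model["a_bb"], model["a_ab"], model["w_b"])
                total += mass * p_a * p_b
            new_alpha[(i2, j2)] = (
                total * model["emit_a"][i2][obs_a[t]] * model["emit_b"][j2][obs_b[t]]
            )
        alpha = new_alpha
    return sum(alpha.values())


# Hypothetical parameters: chain "a" = commodity market, chain "b" = stock market.
toy = {
    "pi_a": [0.6, 0.4], "pi_b": [0.5, 0.5],
    "a_aa": [[0.9, 0.1], [0.2, 0.8]],  # commodity state -> commodity state
    "a_bb": [[0.7, 0.3], [0.3, 0.7]],  # stock state -> stock state
    "a_ba": [[0.8, 0.2], [0.4, 0.6]],  # stock state -> commodity state
    "a_ab": [[0.6, 0.4], [0.1, 0.9]],  # commodity state -> stock state
    "emit_a": [[0.8, 0.2], [0.3, 0.7]],
    "emit_b": [[0.9, 0.1], [0.2, 0.8]],
    "w_a": 0.7, "w_b": 0.5,
}

likelihood = chmm_likelihood([0, 1, 1], [0, 0, 1], toy)
```

In this sketch the cross-chain matrices and mixing weights play the role of the hidden coupling between the two markets; in the study itself such quantities are estimated from the index sequences rather than set by hand.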

Artificial Neural Network
An artificial neural network (ANN) is an interconnected group of nodes, akin to the vast network of neurons in a brain (https://en.wikipedia.org/wiki/Artificial_neural_network). The nodes (neurons) are the processing elements of the ANN, and the processing ability is represented by the connection weights W, which allow the ANN to learn directly from the inputs [23]. There are many types of neural networks. In this paper, a back-propagation network with a feed-forward architecture is used, which is one of the most frequently used forecasting techniques [24].

Figure 2. A three-layer back-propagation network.
As shown in Figure 2, the network has three layers, namely the input layer, the hidden layer and the output layer. In this study, X in the input layer represents the couplings learned from the CHMM. H in the hidden layer represents the relationship between X and Y, where Y in the output layer denotes whether the period is a financial crisis or not. The transformations from X to H and from H to Y follow a similar mechanism:
$H_j = f\left(\sum_i W_{ij} X_i\right)$ (1)
where $f(\cdot)$ is the sigmoid function given by $f(x) = \frac{1}{1 + e^{-x}}$. For the prediction of Y (i.e. crisis period or not), a threshold value of 0.5 is used, since the sigmoid transfer function produces a continuous output between 0 and 1. If the output value is less than 0.5, the prediction is a non-crisis period; otherwise it is a crisis period. The back-propagation learning algorithm employed in this study is described in detail by Bishop [25].

The Proposed Model
Figure 3 shows the flow chart of the proposed model. It contains the following three main stages:
(1) Data gathering and pre-processing. As shown in the figure, commodity market data at different time scales are collected (CM(S) represents short-term commodity market data, while CM(M) and CM(L) represent mid-term and long-term data, respectively). In addition, as described in Section 1, n stock markets are employed in order to investigate space coupling, and each stock market has data at three time scales. After data gathering, some pre-processing methods are applied (see Section 4.1).
(2) Pair coupling capturing. This stage uses the CHMM to capture the complex pair coupling between the stock market and the commodity market. As described in the section above, the pair coupling includes three kinds of couplings, namely time coupling (TC), frequency coupling (FC(M) and FC(L)), and space coupling (SC). Since the different kinds of couplings are combined in the real world, three kinds of combinations are captured here (shown in Figure 3): 1) SC-i|TC is the short-term coupling (TC) of the stock market in country i and the commodity market, namely the combination of SC and TC. 2) SC-i|FC(M) is the mid-term coupling (FC(M)) of the stock market in country i and the commodity market, namely the combination of SC and FC(M). 3) SC-i|FC(L) is the long-term coupling (FC(L)) of the stock market in country i and the commodity market, namely the combination of SC and FC(L).
(3) Financial crisis forecasting. This stage adopts the ANN to forecast financial crisis, and its input is one of the combinations obtained in stage 2. The output of this stage is the predicted crisis and non-crisis records. In addition, to reduce bias, 10-fold cross-validation is used in this stage to obtain the results.
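
Stage (3) can be illustrated with a small sketch of the forward pass of a three-layer sigmoid network, the 0.5 decision threshold, and the index bookkeeping for 10-fold cross-validation. This is only a sketch under assumptions: bias terms are omitted, the weights are made-up numbers rather than trained values, the folds are taken as contiguous blocks (the study does not state how folds are formed), and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights):
    """Each output unit is the sigmoid of a weighted sum of the inputs."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def predict_crisis(x, w_hidden, w_output, threshold=0.5):
    """Return the network output and the label (1 = crisis, 0 = non-crisis)."""
    hidden = layer(x, w_hidden)
    y = layer(hidden, w_output)[0]
    return y, (1 if y >= threshold else 0)


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for contiguous k-fold splits."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical coupling features for one record and made-up weights.
x = [0.2, 0.7]
w_hidden = [[1.5, -0.5], [-1.0, 2.0]]  # two hidden units, two inputs each
w_output = [[1.2, -0.8]]               # one output unit
score, label = predict_crisis(x, w_hidden, w_output)
folds = list(k_fold_indices(25, k=10))
```

In a full experiment, each fold's training indices would be used to fit the weights by back-propagation and the held-out indices to score the predictions.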

Data
This study aims to investigate the predictability of financial crisis by exploring the pair coupling of the commodity market and the stock market. The data of interest are therefore the indexes of the two markets. Here the WTI oil price index is selected to represent the commodity market, while the DJIA index represents the stock market [26]. Thus, the time-coupling is reflected by the interactions of the two indexes at the weekly time scale, which represents the short-term variation; the frequency-couplings are captured by the interactions at the bi-weekly and monthly time scales, which represent mid-term and long-term interactions, respectively. In addition, three other stock markets are employed to investigate the space-coupling: the Japanese stock market from Asia (Nikkei 225 index), the French stock market from Europe (CAC 40 index) and the Canadian stock market from North America (S&P/TSX Composite index).
All the data listed above are sourced from the Economic Research division of the Federal Reserve Bank of St. Louis (http://research.stlouisfed.org/) and cover the period from January 1990 to December 2010. The prices of each market are normalized into [0, 1] by $\tilde{p}_c = (p_c - p_c^{\min}) / (p_c^{\max} - p_c^{\min})$, where $p_c^{\max}$ and $p_c^{\min}$ are the maximum and minimum prices in market c, respectively. Since different markets observe different holidays, days with missing data are deleted, so that the selected days match across all the indexes. Then, with crisis periods dated according to the National Bureau of Economic Research (NBER) Business Cycle Dating Committee, the data are divided into two parts: a training set from January 1990 to December 2004 containing two crisis periods, and a testing set from January 2005 to December 2010 containing one crisis period.
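
The pre-processing just described (dropping days that are missing in any market, then scaling each market's prices into [0, 1] with its own minimum and maximum) can be sketched as follows. The function names, the dictionary layout and the sample prices are hypothetical, and doing the alignment before the scaling is an assumption, since the order of the two steps is not stated.

```python
def align_on_common_dates(series_by_market):
    """Keep only the dates that are present in every market's series."""
    common = set.intersection(*(set(s) for s in series_by_market.values()))
    dates = sorted(common)  # ISO date strings sort chronologically
    aligned = {m: [s[d] for d in dates] for m, s in series_by_market.items()}
    return dates, aligned


def min_max_normalise(prices):
    """Scale prices into [0, 1] using the series' own minimum and maximum."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        return [0.0 for _ in prices]  # constant series: avoid division by zero
    return [(p - lo) / (hi - lo) for p in prices]


# Made-up sample values, for illustration only.
raw = {
    "WTI": {"2005-01-03": 40.0, "2005-01-04": 44.0, "2005-01-05": 42.0},
    "DJIA": {"2005-01-03": 10700.0, "2005-01-05": 10600.0, "2005-01-06": 10650.0},
}
dates, aligned = align_on_common_dates(raw)
normalised = {market: min_max_normalise(p) for market, p in aligned.items()}
```

Here 2005-01-04 and 2005-01-06 are dropped because each is missing from one of the two markets, so every remaining day has an observation for all indexes.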

Parameter Settings
Good starting values for the parameters can help speed up the algorithm and ensure promising results. Table 1 lists the main parameter settings of the CHMM and ANN; all the selected parameters are chosen to achieve the best results on the training set.

Comparative Methods
To evaluate the proposed approach, the performance of the following three models is evaluated and compared:
1. Logistic Regression (LR): LR is used as a baseline model since it is a widely used approach with simple operation and balanced error distribution [27]. The corresponding parameters are obtained through maximum likelihood estimation (MLE).
2. ANN: This is a sub-model of the proposed approach, which uses only market indexes as input without considering the complex couplings captured by the CHMM.
3. CHMM-ANN: This is the approach proposed in this study, which first uses the CHMM to capture the complex pair couplings and then feeds the couplings into the ANN to forecast financial crisis.
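
For the LR baseline, maximum likelihood estimation can be illustrated by maximising the Bernoulli log-likelihood with batch gradient ascent. This is a minimal sketch, assuming a single normalised feature, a fixed learning rate and a fixed number of epochs; the optimiser actually used in the study is not stated, and the data below are invented for illustration.

```python
import math


def predict_proba(x, w, b):
    """P(crisis | x) under the logistic model."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Maximise the mean Bernoulli log-likelihood by batch gradient ascent."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = y - predict_proba(x, w, b)  # gradient of the log-likelihood
            grad_b += err
            for k in range(n_features):
                grad_w[k] += err * x[k]
        b += lr * grad_b / len(xs)
        w = [wi + lr * g / len(xs) for wi, g in zip(w, grad_w)]
    return w, b


# Invented, overlapping one-feature data (1 = crisis, 0 = normal).
xs = [[0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9]]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
```

The two classes overlap on purpose: with perfectly separable data the likelihood has no finite maximiser and the weights would keep growing.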

Evaluation Metrics
To evaluate the performance of the different methods, four evaluation measures are used: accuracy, precision, recall and AUC. These measures are defined in terms of the confusion matrix shown in Table 2.
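
As an illustration of how these four measures follow from predicted scores, the sketch below counts the confusion-matrix cells with crisis as the positive class and computes AUC as the probability that a randomly chosen crisis sample receives a higher score than a randomly chosen normal sample (ties count one half). The function names and sample values are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) with crisis (label 1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0


def auc(y_true, scores):
    """Rank-based AUC: P(score of a crisis sample > score of a normal sample)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, thresholded at 0.5.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
```

Unlike the other three measures, AUC is computed from the raw scores and therefore does not depend on the 0.5 threshold.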
In the confusion matrix, the crisis (distressed) samples are set as positive, since the rare class is more meaningful in a binary classification problem. The corresponding metrics are as follows:
1. $\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$, the proportion of correctly predicted samples, including both crisis and normal samples.
2. $\text{Precision} = \frac{TP}{TP + FP}$, the number of correctly identified crisis samples divided by the number of samples predicted as crisis periods.
3. $\text{Recall} = \frac{TP}{TP + FN}$, the proportion of actual crisis samples that are correctly predicted.
4. AUC (area under the ROC curve), an alternative tool used in binary classification analysis to evaluate model performance. It is based on the receiver operating characteristic (ROC) curve, which is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. It is a good and popular performance measure for highly imbalanced datasets [27][28]. An AUC of 0.5 indicates no discriminative power between the two classes, and an AUC of 1 indicates perfect discriminative power.

Experimental Results
Table 3 reports the accuracy of the different approaches. Several interesting findings emerge from the table.
First, the proposed CHMM-ANN model performs best when compared with the two benchmarks (compare the columns within the same row). For instance, in the first row of the table, the proposed CHMM-ANN shows improvements of around 8% and 14% over ANN and Logistic, respectively. The main reason is that the proposed model better captures the hidden couplings between the two markets, which are the key driver of market fluctuations, and these fluctuations are early signs of financial crisis.
Second, turning to the space-coupling (SC) part of the table, stock markets in different countries show different predictive power. SC-US performs best, followed by Japan, France and Canada, across the three approaches and with the different TC and FC settings. In other words, the couplings between the oil market and the US stock market are more predictive of financial crisis. For example, with TC, SC-US with CHMM-ANN has a gain of around 13% over SC-Canada. This can be interpreted easily, since the US is the world's largest net importer of crude oil [29] and the 2007 global financial crisis in the testing period originated in the US. Interestingly, Canada performs worst even though it is closer to the US than France and Japan are. The main reason is that the financial system in Canada is dominated by banks rather than markets; that is, Canada has a stable financial system that keeps it away from financial crisis, while the financial markets in France and Japan are more closely related to the US.
Moreover, further findings emerge from analyzing the time-coupling (TC) and frequency-coupling (FC) performance. The table shows that SC-US with time-coupling (TC) outperforms SC-US with frequency-coupling (FC), which means that the short-term coupling of the oil price and the US stock market index predicts financial crisis better than the mid-term and long-term couplings. This finding is consistent with earlier research [2], which reports that the linkage between the oil price and the US stock market index weakens in the long term. For the other countries, however, the results are the opposite: the mid-term and long-term couplings achieve better performance than the short-term coupling. The reason may be that the fluctuations of other countries' stock markets are influenced by the US stock market, and the information and risk transmissions introduce a time lag.
Figures 4 and 5 show the precision and recall performance of the pair coupling with the three approaches, where the horizontal axis is the number of predicted crisis records and the vertical axis is the value of the measure. The coupling between the US stock market and the commodity market (i.e. SC-US) is selected here because of its good performance reported above. The results from the two figures clearly show that the proposed CHMM-ANN approach outperforms the other two methods on all coupling aspects. For instance, the precision improvement with TC (CHMM-ANN(TC)) is as high as 20% against ANN(TC), and around 30% against LR-TC. Figure 4 shows that CHMM-ANN achieves higher recall than the other two approaches with any type of pair coupling.
Figure 6 depicts the AUC performance of the various approaches. CHMM-ANN clearly achieves the best prediction performance. For example, across the different kinds of pair coupling, the proposed method increases AUC by up to about 20% compared with ANN and by up to about 40% compared with LR. It is also interesting that CHMM-ANN with time-coupling (TC) achieves better results than with frequency-coupling (FC(M) and FC(L)), which means that the short-term interactions between the two markets capture deeper features of financial crisis. This is consistent with the accuracy results reported above.
In sum, all these results verify that the pair coupling of the stock market and the commodity market has strong predictive power for financial crisis, and that the proposed CHMM-ANN model is a useful tool for capturing the complex couplings.

Conclusion
The main interest of this paper is to investigate the predictability of financial crisis by capturing the pair coupling of the commodity market and the stock market. Three different couplings are tested: time-coupling (TC), which represents the short-term interactions; frequency-coupling (FC), which indicates the mid-term interactions (FC(M)) and long-term interactions (FC(L)); and space-coupling (SC), which captures the interactions between the commodity market and the stock markets in different countries. A two-layer model (CHMM-ANN) is designed in which the CHMM level captures the complex hidden couplings and the ANN level then uses the couplings to predict financial crisis. Twenty-one years of data (January 1990 to December 2010) from four countries are selected to conduct the experiments. The experimental results show that: 1) the proposed model beats the LR and ANN baselines on the various couplings; 2) the predictability of the space couplings, from high to low, is US, Japan, France and Canada; 3) for the US market, TC performs better than FC(M) and FC(L). All these findings verify the great importance of the pair coupling in understanding financial crisis. In addition, the good performance of the proposed model shows the superiority of CHMM-ANN in capturing the complex couplings. Future directions include: 1) extending the SC to more countries; and 2) employing deep learning methods to improve the model.