Chaotic Recurrent Neural Networks for Financial Forecast

In the past few decades, with the development of artificial intelligence and computer hardware, machine learning has been widely used in industry, healthcare, education, finance and other fields. Predicting financial time series more accurately with effective AI tools has long been one of the most active topics in the finance and AI communities. In this paper, we introduce a new type of recurrent neural network algorithm, the Chaotic Recurrent Neural Network (CRNN), which builds on Dr. Raymond’s original research on the Lee-Oscillator and on the Recurrent Neural Network (RNN), and we apply it to worldwide financial prediction. We replace the traditional activation function with a Lee-Oscillator neuron, which not only addresses the vanishing gradient problem of traditional recurrent neural networks during training but also provides an excellent memory correlation mechanism for long time series. The experimental results reveal that, in certain cases, CRNN achieves better forecast accuracy than popular neural networks widely applied to financial data, such as FFBPN, RNN and LSTM. The experiments are implemented in PyTorch and Python 3.8 and use 10 years (2010-2020) of major financial data (DJI, HSI, IXIC, SPX, SSE, SZSE and APPL) to forecast the closing price of the 31st day from the closing prices of the previous 30 days. Besides financial forecasting, CRNN has other potential applications, such as natural language processing and weather forecasting.


Introduction
In recent years, with the rapid development and application of big data, cloud computing, artificial intelligence and blockchain technology, the financial industry has been undergoing a massive technological transformation that has permeated every domain and phase of traditional finance, including banking, securities, insurance, funds and cryptocurrencies. The tremendous advancement of artificial intelligence and computational power makes it possible for an individual to forecast financial trends even without rich financial experience or extensive financial knowledge. Nowadays, besides traditional methods such as Fundamental Analysis and Technical Analysis, an increasing number of investors adopt artificial intelligence tools to forecast financial trends.
In the initial development stage of neural network algorithms, Artificial Neural Networks (ANN) and Deep Neural Networks (DNN) were commonly used for various complex tasks. An ANN is a machine learning algorithm that simulates the way the human brain works. By analogy with biological neurons, the original neural network algorithms express neuron behavior through activation functions and learn by adjusting the weights and biases between neurons. Through training, an ANN can learn patterns and perform classification, optimization, associative memory and even more complex operations [1]. A DNN can be regarded as an extension of an ANN with more hidden layers and more complex neural dynamics, used to solve more complex problems; financial forecasting is one of its application areas [2].
As practical application requirements kept growing, the neural network architectures of that time could no longer meet users' needs, and other neural networks were gradually proposed and demonstrated remarkable capabilities in various application areas, such as FFBPN, CNN, RNN and GAN [3][4][5]. Current research on financial time-series prediction is mainly based on recurrent neural networks, especially RNN and LSTM [6][7][8][9]. However, traditional RNN algorithms with sigmoid-based activation functions face two challenges when trained on time-series data: 1) vanishing gradients and 2) exploding gradients [10]. LSTM, introduced by Hochreiter & Schmidhuber in 1997, is a variant of RNN that mitigates these problems through its gate structure [11]. A large number of studies have shown that neural networks can model the direct relationships between influencing factors. In addition, by performing multiple hidden mappings on various data, these algorithms can uncover hidden relationships in the data [12][13][14][15].
The concept of the chaotic neural network was originally introduced by Aihara, Takabe and Toyoda [16], who pointed out that the operating mechanism of a real neuron is more complicated than a simple threshold. To simulate the biological behavior of neurons, they proposed a new kind of neural network, the chaotic neural network. From the perspective of simulating how the human brain works, chaotic neurons are more suitable activation functions for neural network algorithms than traditional sigmoid-based activation functions. Research on neurons and the human brain in recent decades has shown that various chaotic phenomena occur in the human brain [17] and that neuronal behavior is triggered by the interaction between excitatory and inhibitory neurons [18]. To simulate the neural behavior of real human brain neurons, the CONN (Chaotic Oscillatory-based Neural Network) and the Lee-Oscillator were proposed [19][20]. Their author argued that, by adopting a chaotic oscillator neuron as the activation function, a neural network can generate more highly random transitions than with the traditional Tanh activation function. Such chaotic neural network algorithms can predict or model highly complex problems in various application fields.
Based on the above characteristics, we decided to further investigate the RNN, which is widely used in natural language processing and time series prediction. LSTM changes the cell structure by adding a complex gate structure with point-wise multiplication operations (Forget gate, Input gate, Candidate cell state, Cell state and Output gate). In contrast, we attempt to construct a new recurrent neural network that avoids the RNN's shortcomings with only a minor adjustment. Dr. Raymond's study of the Lee-Oscillator reports that it not only provides an excellent memory correlation mechanism during long-term time series processing [19] but also resolves the overtraining and deadlock problems during training [21]. Inspired by this work, we created a new recurrent neural network algorithm, the Chaotic Recurrent Neural Network (CRNN), which uses the Lee-Oscillator as the RNN's activation function in place of a sigmoid-based function.
In this paper, to test the performance of CRNN, we collected 10 years of financial time series data for DJI, IXIC, SPX, HSI, SSE, SZSE and APPL, and used the closing prices of the previous 30 days to predict the closing price of the 31st day. We then compared CRNN with the most popular neural networks used to forecast financial trends, namely FFBPN, RNN and LSTM, and observed their respective performance.
The main contributions and originality of this research are:
1) We introduce the chaotic recurrent neural network (CRNN) algorithm, which applies the Lee-Oscillator as the activation function to resolve defects of the RNN such as vanishing gradients, overtraining and deadlock.
2) We designed and implemented a global financial index prediction experiment that compares popular time series AI tools (FFBPN, RNN, LSTM and CRNN).
3) Under the same experimental environment and the same hyper-parameters, CRNN outperforms the other neural networks in certain cases.

Recurrent Neural Networks
An RNN is a special ANN that can process input sequences through an internal feedback mechanism between neurons. A basic RNN unit is composed of three layers: the Input layer, the Hidden layer and the Output layer. The Hidden layer stores information from previous cell units and uses it when processing subsequent information in the current cell unit. Processing starts with the Input layer receiving the value $X_t$ at time step $t$. The Hidden layer combines the current input $X_t$ with the previous Hidden layer state $S_{t-1}$ from time step $t-1$ and applies the activation function (Tanh or ReLU; Tanh is written here):
$$S_t = \tanh(W S_{t-1} + U X_t + b_s) \quad (1)$$
where $W$ is the weight of the Hidden layer, $U$ is the weight of the Input layer and $b_s$ is the bias of the Hidden layer. In the Output layer, the Hidden layer state at time step $t$ is used to compute the output value $O_t$ with the softmax activation function and the output bias:
$$O_t = \mathrm{softmax}(V S_t + b_o) \quad (2)$$
where $V$ is the weight of the Output layer, $b_o$ is the bias of the Output layer and $S_t$ is the Hidden layer state at time step $t$. Figure 1 shows the architecture of the RNN.
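To make the vanishing gradient concern concrete, the following is a minimal sketch of a scalar Tanh RNN and of the derivative of the final state with respect to the initial state. The function names and the values of `w`, `u`, `b` and `s0` are illustrative assumptions, not settings from the experiments.

```python
import math


def rnn_states(xs, w=0.5, u=1.0, b=0.0, s0=0.0):
    """Run a scalar tanh RNN, S_t = tanh(w*S_{t-1} + u*x_t + b), and return all states."""
    states = [s0]
    for x in xs:
        states.append(math.tanh(w * states[-1] + u * x + b))
    return states


def gradient_wrt_initial_state(xs, w=0.5, u=1.0, b=0.0, s0=0.0):
    """Return dS_T/dS_0 via the chain rule: the product of w * (1 - S_t^2) over all steps."""
    grad = 1.0
    for s_t in rnn_states(xs, w, u, b, s0)[1:]:
        grad *= w * (1.0 - s_t * s_t)  # derivative of tanh(z) is 1 - tanh(z)^2
    return grad


if __name__ == "__main__":
    inputs = [0.1] * 30  # a 30-step window, matching the 30-day input length
    for steps in (1, 5, 10, 30):
        print(steps, gradient_wrt_initial_state(inputs[:steps]))
```

Because every factor in the product is smaller than `w` in magnitude, the gradient shrinks geometrically with the number of time steps whenever `|w| < 1`, which is why early inputs have little influence on training.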

Long-Short-Term Memory Neural Network
LSTM is the most widely used variant of RNN. It was designed mainly to solve the vanishing gradient problem that traditional RNNs commonly encounter during training, which allows it to analyze and process long sequences such as financial data. Compared with the classical recurrent neural network, the LSTM cell replaces the original RNN hidden layer with a structure composed of several gates and a cell state, which enables it not only to control the input flow but also to keep information cycling continuously; this gate mechanism is what lets LSTM avoid the vanishing gradient problem during training. An LSTM cell consists of a Forget gate, an Input gate, a Candidate cell state, a Cell state and an Output gate. Figure 2 shows the architecture of the LSTM.
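For reference, the five components listed above are commonly written in the standard LSTM form below. The symbols ($W_\ast$, $U_\ast$, $b_\ast$, $x_t$, $h_t$, $c_t$) are notation assumed here for illustration and may differ from the notation of the original equations; $\sigma$ is the logistic sigmoid and $\odot$ is point-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(Forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(Input gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(Candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(Cell state)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(Output gate)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(Hidden state)}
\end{aligned}
```

With four weighted transforms per step instead of one, an LSTM layer has roughly four times as many parameters as a plain RNN layer of the same hidden size.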

Chaotic Oscillatory Neural Network
The Lee-Oscillator is a chaotic oscillator neuron that can simulate the discrete-time and highly random characteristics of neural behavior, which lets it play a critical role as a CTU for simulating complex and chaotic problems such as financial prediction and weather prediction. The Lee-Oscillator is composed of four neurons: the Excitatory neuron, the Inhibitory neuron, the Input neuron and the Output neuron, denoted E, I, Ω and L, respectively.
$$E(t+1) = \mathrm{Sigmoid}(e_1 E(t) - e_2 I(t) + S(t) + \zeta) \quad (9)$$
$$\Omega(t+1) = \mathrm{Sigmoid}(S(t)) \quad (11)$$
where $e_1$, $e_2$ are the weights of the Excitatory neuron, $i_1$, $i_2$ are the weights of the Inhibitory neuron, $\zeta$ and $\zeta_I$ are the threshold values of the Excitatory neuron and the Inhibitory neuron respectively, and $S(t)$ is the external input stimulus at time step $t$. Figure 3 (a) shows the neural architecture and Figure 3 (b) shows the bifurcation diagram of the Lee-Oscillator.
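As a rough illustration of how such an oscillator can be iterated, the sketch below implements Eqs. (9) and (11) directly. The inhibitory update and the output rule L are assumed forms chosen for this example, the logistic sigmoid is an assumed reading of "Sigmoid", and all parameter values (`e1`, `e2`, `i1`, `i2`, `zeta_e`, `zeta_i`, `k`, `steps`) are hypothetical rather than the authors' settings.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid; the text only says 'Sigmoid', so this choice is an assumption."""
    return 1.0 / (1.0 + math.exp(-x))


def lee_oscillator(s, steps=100, e1=5.0, e2=5.0, i1=5.0, i2=5.0,
                   zeta_e=0.0, zeta_i=0.0, k=500.0):
    """Iterate the oscillator for a fixed external stimulus s and return the output L.

    The E and Omega updates follow Eqs. (9) and (11). The I update and the
    output L are ASSUMED forms; all parameter values are illustrative only.
    """
    e, i, omega = 0.0, 0.0, 0.0
    for _ in range(steps):
        e_next = sigmoid(e1 * e - e2 * i + s + zeta_e)  # Eq. (9)
        i_next = sigmoid(i1 * e - i2 * i + zeta_i)      # assumed inhibitory update
        omega = sigmoid(s)                              # Eq. (11)
        e, i = e_next, i_next
    # Assumed output rule: the E - I difference only matters near s = 0;
    # elsewhere the response is dominated by the sigmoid term omega.
    return (e - i) * math.exp(-k * s * s) + omega


if __name__ == "__main__":
    for stimulus in (-1.0, -0.05, 0.0, 0.05, 1.0):
        print(stimulus, lee_oscillator(stimulus))
```

Under these assumptions the response follows the plain sigmoid for stimuli far from zero, and the excitatory-inhibitory interaction only contributes in a narrow band around zero, which is the qualitative behavior the bifurcation diagram describes.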

Chaotic Recurrent Neural Network
As mentioned in the previous sections, an RNN suffers from vanishing gradients as the number of time steps increases, which limits its ability to learn long-term dependencies. By using the Lee-Oscillator neuron as the activation function of the RNN, we constructed a chaotic RNN, called CRNN. On the one hand, owing to the characteristics of the Lee-Oscillator, it can provide an excellent memory correlation mechanism during long-term time series processing. On the other hand, it can also resolve the vanishing gradient problem of neural networks that use traditional sigmoid-based activation functions. The neural dynamics of CRNN are given by:
$$S_t = \mathrm{LeeOsc}(W S_{t-1} + U X_t + b_s) \quad (13)$$
$$O_t = \mathrm{Softmax}(V S_t + b_o) \quad (14)$$
where $W$ and $U$ are the weights of the Hidden layer and the Input layer respectively, $V$ is the weight of the Output layer, $b_s$ and $b_o$ are the biases of the Hidden layer and the Output layer, $S_t$ is the Hidden layer state at time step $t$, LeeOsc denotes the Lee-Oscillator neuron and Softmax is a common activation function. The neural architecture is shown in Figure 4. Compared with the complex structure and the many point-wise multiplication operations of LSTM, CRNN's architecture is clearly simpler, which means that the CRNN algorithm needs fewer operations when processing data and training the model. If CRNN performs better than RNN or LSTM in a given machine learning task, it may be a good choice for users.
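The structural point, that CRNN keeps the RNN recurrence and changes only the activation, can be sketched as a forward pass with a pluggable activation. This is a plain-Python illustration under assumed names (`recurrent_forward`, `matvec`) and hypothetical weights, not the authors' PyTorch implementation; `math.tanh` stands in for the activation, and a Lee-Oscillator function would be passed instead to obtain a CRNN.

```python
import math
from typing import Callable, List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def recurrent_forward(xs, W, U, V, b_s, b_o, activation: Callable[[float], float]):
    """Apply S_t = activation(W S_{t-1} + U X_t + b_s) and O_t = softmax(V S_t + b_o).

    `activation` is applied element-wise: math.tanh gives a plain RNN, while a
    Lee-Oscillator function would give the CRNN variant.
    """
    state = [0.0] * len(b_s)
    outputs = []
    for x in xs:
        pre = [r + i + b for r, i, b in zip(matvec(W, state), matvec(U, x), b_s)]
        state = [activation(p) for p in pre]
        logits = [o + b for o, b in zip(matvec(V, state), b_o)]
        outputs.append(softmax(logits))
    return outputs, state


if __name__ == "__main__":
    # Hypothetical weights: 1 input feature, 2 hidden units, 2 outputs.
    W = [[0.1, 0.0], [0.0, 0.1]]
    U = [[1.0], [-1.0]]
    V = [[1.0, 0.0], [0.0, 1.0]]
    outs, final_state = recurrent_forward([[0.5], [0.2]], W, U, V,
                                          [0.0, 0.0], [0.0, 0.0], math.tanh)
    print(outs, final_state)
```

Swapping the activation leaves the number of weight matrices unchanged, whereas an LSTM adds separate weights for each gate.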

Experiments Implementation
We conducted forecast experiments for our algorithm on various financial datasets covering 2010 to 2020. All datasets were fetched from Tushare [22], a free and open financial data sharing platform. Seven datasets from two categories are used: 1) financial indices: Dow Jones Index (DJI), Hang Seng Index (HSI), Nasdaq Index (IXIC), S&P 500 Index (SPX), Shanghai Securities Composite Index (SSE) and Compositional Index of Shenzhen Stock Market (SZSE); 2) the stock of Apple Inc. (APPL). We downloaded about 10 years of raw time series data for each dataset, used the first 70% as the training set and kept the remaining 30% for out-of-sample verification. The forecast target is the closing price: the closing prices of 30 consecutive days are used to predict the closing price of the 31st day. Forecast error on the closing price is measured with three indicators: 1) MSE (Mean Squared Error); 2) RMSE (Root Mean Squared Error); 3) MAE (Mean Absolute Error). We ran the experiments in Python 3.8 using packages such as Tushare, NumPy, Math, Matplotlib and Pandas. We implemented CRNN and the FFBPN, RNN and LSTM baselines in PyTorch, a machine learning framework widely used for neural-network-based academic research and development. All algorithms were trained for 100 epochs with the same optimizer and loss function: the ADAM (Adaptive Moment Estimation) optimizer with a learning rate of 0.001 and the MSE (Mean Squared Error) loss. The experiments were run in a single hardware environment: 1) OS: Windows 10; 2) CPU: Intel(R) Core(TM) i5-8265U @ 1.60GHz; 3) Memory: 8 GB.
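The data preparation described above (30-day input windows and a chronological 70/30 split) can be sketched as follows. The helper names are hypothetical, the price series is synthetic, and splitting after windowing rather than before is an assumption made for this example.

```python
from typing import List, Sequence, Tuple


def make_windows(prices: Sequence[float], window: int = 30) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive closing prices with the next day's close."""
    return [(list(prices[i:i + window]), prices[i + window])
            for i in range(len(prices) - window)]


def chronological_split(samples: list, train_fraction: float = 0.7):
    """Keep time order: the earliest samples train, the rest are held out."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


if __name__ == "__main__":
    closes = [100.0 + 0.5 * day for day in range(100)]  # synthetic prices, not real data
    samples = make_windows(closes)
    train, test = chronological_split(samples)
    print(len(samples), len(train), len(test))
```

The split is not shuffled, so the held-out samples always come from later dates than the training samples, which avoids leaking future prices into training.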

Experimental Results
The prediction results were evaluated with three indicators: 1) Mean Squared Error (MSE), 2) Root Mean Squared Error (RMSE) and 3) Mean Absolute Error (MAE). As shown in Table 1 and Figure 5, our Chaotic Recurrent Neural Network (CRNN) algorithm achieves better prediction accuracy than the other neural network algorithms on DJI, IXIC, HSI, SPX and SZSE. For SSE and APPL, however, LSTM obtains more accurate predictions than the other algorithms. The results suggest that, compared with an RNN using a traditional sigmoid-based activation function, using the Lee-Oscillator as the activation function lets CRNN capture complex, highly random, chaotic and irregular time series patterns more effectively. Furthermore, in some financial forecasting tasks, CRNN achieves more accurate forecasts than LSTM with fewer operations and a simpler structure. In addition, to mirror the Moving Average, a traditional technical analysis method widely used in the stock market, we also used 5, 10, 20 and 30 days as the length of the input sequence. The results showed that the longer the input sequence, the more accurate the prediction, which is consistent with the characteristics of the CRNN algorithm. However, users who want to apply CRNN to actual financial forecasting tasks are advised to test its performance on the specific historical financial data in advance.
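For clarity, the three indicators can be computed as in the minimal sketch below. The function names and the sample values are illustrative only and are not taken from Table 1.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Squared Error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error: square root of MSE, in the same unit as the price."""
    return math.sqrt(mse(actual, predicted))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    actual = [100.0, 102.0, 101.0]      # made-up closing prices
    predicted = [101.0, 100.0, 101.0]   # made-up forecasts
    print(mse(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

MSE and RMSE penalize large errors more heavily than MAE, so reporting all three shows whether a model's error is driven by a few large misses or by many small ones.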

Discussion
A number of recent studies and applications have shown that nonlinear dynamical systems with complex and irregular behavior can be characterized using chaotic neural network algorithms [23][24][25]. As shown in Figure 3 (b), the bifurcation diagram of the Lee-Oscillator is composed of a "Sigmoid area" and a "Chaotic area". The "Sigmoid area" reproduces the classical sigmoid function, while the "Chaotic area" imitates the chaotic, highly dynamic and random movement of real-world time series. In addition, when the output of the chaotic function is confined to the range -1 to 1, as in Figure 3 (b), the chaotic neuron never outputs exactly 0: an input near 0 oscillates in the "Bifurcation area" and produces a random value near 0, which means that the input at any time step does not vanish in subsequent processing. In other words, the Lee-Oscillator can provide an excellent memory correlation mechanism during long-term time series processing. Owing to this feature, we infer that CRNN also has the potential to handle complex natural language processing tasks. Meanwhile, due to its unique oscillatory nonlinear characteristic, the chaotic neuron can simulate highly irregular changes better and achieve more accurate predictions with smaller errors than neural network algorithms that use traditional sigmoid-based activation functions. For future research, we see two directions based on CRNN: 1) applying CRNN to natural language processing tasks, which require the algorithm to process sequences that are as long as possible; 2) integrating CRNN with the Quantum Finance model, intelligent agents and reinforcement learning algorithms to establish a financial trading system.

Conclusions
In this research, we propose a chaotic recurrent neural network (CRNN) that uses a chaotic neuron, the Lee-Oscillator, as the activation function instead of a traditional sigmoid-based activation function. Unlike existing recurrent neural network algorithms, which use sigmoid-based activation functions, and inspired by the chaotic oscillatory nonlinearity and sigmoid-based neural dynamics demonstrated by previous research on chaotic neural networks, we integrated the RNN with the Lee-Oscillator to create a new recurrent neural network algorithm, CRNN. The new algorithm not only addresses the vanishing gradient problem of traditional recurrent neural networks during training but also provides an excellent memory correlation mechanism during long-term time series processing. We tested the applicability of CRNN by using the closing prices of 30 days from various financial time series datasets to predict the closing price of the 31st day. We compared CRNN with currently popular neural network algorithms (FFBPN, RNN and LSTM) across these prediction tasks, and the experimental results show that CRNN outperforms FFBPN, RNN and even LSTM on the DJI, IXIC, HSI, SPX and SZSE prediction tasks.
Based on the experimental results, we conclude that, compared with the traditional sigmoid-based RNN, and in some cases even LSTM, CRNN can successfully track price trends from past chaotic, random and complex data and produce more accurate predictions. In addition, CRNN has the potential to handle long-sequence problems in other application domains, such as natural language processing.