Application of ARIMAX Model on Forecasting Nigeria’s GDP

This paper proposes an appropriate ARIMAX model for forecasting Nigeria's GDP. The data used for the study are sourced from the World Bank for the period 1990-2019. The ARIMA model is fitted on the residuals using the Box-Jenkins approach. The Bayesian Information Criterion (BIC) is adopted to assess the adequacy of the models. The raw data satisfy the assumption of no multicollinearity once export is eliminated, and the residual series is stationary after first differencing. This study shows that import is a significant exogenous variable for the GDP dynamics. The ARIMA (0,1,1), with a BIC value of 35.253, is considered the appropriate model to be combined with the exogenous variable. The results showed that the ARIMAX (0,1,1) is adequate for forecasting Nigeria's GDP based on the Theil's U forecast accuracy measures.


Introduction
Gross Domestic Product (GDP) is the total monetary or market value of all the finished goods and services produced within a country's borders in a specific time period. As a broad measure of overall domestic production, it functions as a comprehensive scorecard of the country's economic health. GDP includes all private and public consumption, government outlays, investments, additions to private inventories, paid-in construction costs, and the foreign balance of trade (exports are added, imports are subtracted). GDP is an important indicator to measure a country's wealth and economic strength. GDP is part of the National income.
Ekhosuehi et al [7] analyzed the link between debt servicing and export earnings of Nigeria using the Koyck-kind (KARMAX) model approach, with data extracted from the World Bank Database for the period 1970-2018. Their findings revealed that the KARMAX model obtained through the maximum likelihood (ML) method is preferable when compared with the prediction error and the instrumental variable methods. Zheng [20] investigated least squares methods for the identification of ARMAX using 2000 simulated input-output samples. The analysis revealed that the BELSX method is the best of the selected methods for identifying ARMAX, because its accuracy is very close to that of the prediction error (PE) method. Liu et al [9] proposed a method for locating and quantifying damage in shear structures, based on a damage indicator of ARMAX model residual-based KLD, using numerical simulation of damage detection on a six-story shear building structure. The results showed that the proposed CSDF curves of the ARMAX model residual can clearly locate the structural damage, and that the proposed residual-based KLD damage indicator can locate and quantify the damage in a data-driven way.
Durka and Pastorekova [6] carried out a comparative study between ARIMA and ARIMAX in order to find which model is better for forecasting macroeconomic time series in Slovakia. They applied the Box-Jenkins modelling approach using Gross Domestic Product (GDP) per capita as the output series and the unemployment rate as the input series, and the fitted ARIMAX model explained 92.7% of the variation in GDP. Musundi et al [12] modelled and forecasted Kenyan GDP using an ARIMA model with annual data spanning 1960 to 2012, sourced from the Kenya National Bureau of Statistics. Their findings showed that ARIMA (2,2,2) is the best ARIMA model to forecast Kenya's GDP based on minimum AIC.
Ning et al [13] examined Shaanxi GDP using data extracted from the Shaanxi Statistical Yearbook for 1952 to 2007. Applying the Box-Jenkins approach, they found that ARIMA (1, 2, 1) is the best ARIMA model for forecasting Shaanxi's GDP. Ghazo [8] carried out research on Jordan's GDP and CPI using data from 1976 to 2019. Applying the ARIMA and Box-Jenkins approaches, the results revealed that ARIMA (3,1,1) is the best model to forecast Jordan's GDP based on the AIC and SIC criteria. On the other hand, Atanu et al [3] examined Nigeria's GDP using data from 1981 to 2019, employed the Box-Jenkins approach and obtained ARIMA (1, 2, 1) as the best forecasting model. Zakai [19] investigated the forecasting of Gross Domestic Product (GDP) for Pakistan using quarterly data from 1953 until 2012 with an ARIMA (1, 1, 0) model; the findings revealed that Pakistan's GDP will increase for the years 2013-2025. Maity and Chatterjee [10] examined India's GDP growth rate using data spanning 60 years; ARIMA (1, 2, 2) was obtained as the best model and used to forecast India's GDP, and the result showed that the predicted values increase over time.
Abonazel and Abd-Elftah [1] studied the Egyptian GDP using annual data from the World Bank for 1965-2016. ARIMA (1,2,1) was selected as the best model based on minimum AIC, BIC and MSE to forecast the country's GDP for 2017-2026. Their result revealed that Egyptian GDP will rise steadily. Salah and Tanzim [15] predicted Bangladesh's GDP from 2019 to 2025 using an ARIMA (1,2,1) model. Their study showed that Bangladesh's GDP trend is steadily improving. Chikumbe and Sikota [5] examined the GDP of Zambia using annual data from 1960 to 2018. Applying the Box-Jenkins approach, they obtained ARIMA (5,2,0) as the best model to forecast Zambia's GDP based on the minimum AIC and BIC. The model was used to forecast the next eight (8) years, and the result revealed that there will be a decline in the GDP trend for the period 2020-2022. Touama [16] studied the Jordan GDP series over the period 2003-2013 and applied the ARIMA process. The result revealed that ARIMA (0,1,2) is an adequate model for forecasting Jordanian GDP. Wabomba et al [18] evaluated Kenyan GDP for 1960-2012 using the ARIMA approach. Their study established ARIMA (2,2,2) as the best model to forecast future GDP.
Arneja et al [2] analyzed the GDP of India using annual data spanning 1980 to 2017, and ARIMA (1,1,7), obtained as the best model, was used to forecast the Indian GDP. Their result revealed that India's GDP will keep rising in the future. Miah et al [11] examined the GDP of Bangladesh using data ranging from 1960 to 2017. Applying the ARIMA process, they found that ARIMA (1,2,1) is the best model for forecasting Bangladesh's GDP. Forecasting the next thirteen years with this model, they found that the GDP of Bangladesh is expected to improve over the forecast period. Awel [4] carried out an analysis of the real GDP of Ethiopia using annual data spanning 1981 to 2014. ARIMA (1,1,1) was selected as the best forecasting model and was used to forecast the period 2015-2017. Rana [14] examined the ARIMA forecast model using monthly data from Mid-July 2016 to Mid-July 2018, applying the Box-Jenkins approach to select ARIMA (0,1,2) as the best model to forecast the GDP of Nepal.
The aim of this study is to fit an appropriate ARIMAX model for Nigeria's Gross Domestic Product (GDP) and other economic variables. The ARIMAX model is similar to a multivariate regression model, but it also takes advantage of any autocorrelation present in the residuals of the regression, which improves the accuracy of the forecast (Ulyah et al [17]). By using the ARIMAX model, it is assumed that the future values of Nigeria's GDP and other economic variables depend linearly on their past values, as well as on the values of past (stochastic) shocks.

Functional Variables
This study adopts a dependent variable $Y_t$ and four predictor variables $X_{1t}, X_{2t}, X_{3t}, X_{4t}$, where $Y_t$ is Gross Domestic Product, $X_{1t}$ is the Exchange Rate, $X_{2t}$ is the Interest Rate, $X_{3t}$ is the Export (million US$), and $X_{4t}$ is the Import (million US$). GDP can be measured using three methods: the Product Method, the Income Method and the Expenditure Method. This study adopts the most common one, the Expenditure Method, which is defined as
$$GDP = C + I + G + NX$$
where $C$ is the Consumer Expenditure, $I$ is the Investment, $G$ is the Government Expenditure, and $NX$ is the Net Export (Export − Import).
The variables adopted in this study all affect the determination of GDP by the expenditure method.

ARIMA with Exogenous Variable (ARIMAX)
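The ARIMAX model is a composition of the autoregressive integrated moving average (ARIMA) model and significant exogenous variables; an exogenous variable is a covariate $X_t$ that influences the observed time series values $Y_t$. In other words, the ARIMAX model is a multiple linear regression model with AR (Autoregressive) terms and/or MA (Moving Average) terms. It is denoted by ARIMAX ($p$, $d$, $q$), which is the ARIMA ($p$, $d$, $q$) with significant exogenous variables, expressed as
$$\phi(B)\,\Delta^{d} Y_t = \beta X_t + \theta(B)\,\varepsilon_t$$
where $\phi(B)$ and $\theta(B)$ are the autoregressive and moving average polynomials in the backshift operator $B$, $\Delta^{d}$ denotes differencing $d$ times, $\beta$ is the coefficient of the exogenous variable, and $\varepsilon_t$ is the random disturbance.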

Multiple Regression
The multiple linear regression model describes the relationship between one dependent (response) variable $Y$ and two or more independent (predictor) variables $(X_1, X_2, \cdots, X_k)$, and is written as
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \varepsilon_i$$
where $\varepsilon_i$ is the error term, for $i = 1, 2, \cdots, n$; $j = 1, 2, \cdots, k$.

Autoregressive (AR) Model
The AR model is the regression of the current observation against one or more past observations. That is, the current observation is generated by a weighted average of past time series data going back $p$ periods, together with a random disturbance in the current period. The process is denoted as AR($p$) and it is defined as
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t$$
where $\phi_1, \phi_2, \cdots, \phi_p$ are the parameters of the AR model; $Y_t$ is the current observation; and $Y_{t-1}, Y_{t-2}, \cdots, Y_{t-p}$ are past observations. $\phi(B)$ is the autoregressive polynomial and it is expressed as
$$\phi(B) = 1 - \phi_1 B - \phi_2 B^{2} - \cdots - \phi_p B^{p}$$

Moving Average (MA) Model
The MA model is the regression of the current observation on the current and past forecast errors. In an MA process of order $q$, each observation is generated by a weighted average of random disturbances going back $q$ periods. The moving average model of order $q$ is denoted by MA($q$) and it is defined as
$$Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}$$
where $\theta_1, \theta_2, \cdots, \theta_q$ are the parameters of the MA model; $\varepsilon_t$ is the current disturbance; and $\varepsilon_{t-1}, \varepsilon_{t-2}, \cdots, \varepsilon_{t-q}$ are the past disturbances. $\theta(B)$ is the moving average polynomial and it is expressed as
$$\theta(B) = 1 + \theta_1 B + \theta_2 B^{2} + \cdots + \theta_q B^{q}$$

Autoregressive Integrated Moving Average (ARIMA) Model
The ARIMA model is the combination of the autoregressive (AR) model and the moving average (MA) model. It is usually denoted by ARIMA ($p$, $d$, $q$), where $p$ is the order of the AR model, $d$ is the number of times that the actual observations need to be differenced so as to become stationary, and $q$ is the order of the moving average model. The ARIMA model is given as
$$\phi(B)\,\Delta^{d} Y_t = \theta(B)\,\varepsilon_t$$
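As a small illustration of how the AR, differencing and MA pieces combine, the sketch below builds an ARIMA (0, 1, 1) path from a given list of disturbances. It assumes the MA term enters with a plus sign, a zero disturbance before the sample, and a starting level `y0`; the function name and the example numbers are hypothetical and are not taken from the study's data.

```python
def simulate_arima_011(disturbances, theta, y0=0.0):
    """Build Y_t = Y_{t-1} + e_t + theta * e_{t-1} from given disturbances."""
    path = []
    prev_y = y0
    prev_e = 0.0  # assumption: the disturbance before the sample is zero
    for e in disturbances:
        # d = 1: the model describes the change Y_t - Y_{t-1}
        y = prev_y + e + theta * prev_e
        path.append(y)
        prev_y, prev_e = y, e
    return path


# Hypothetical disturbances, chosen only to make the arithmetic easy to follow.
example_path = simulate_arima_011([1.0, 0.0, 0.0, 2.0], theta=0.5)
print(example_path)
```

A single shock of 1.0 in the first period raises the level by 1.0 immediately and by a further 0.5 one period later, after which the level stays put until the next shock.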

ARIMAX Model Fitting
ARIMAX model fitting follows two stages: fitting a multiple linear regression of the response variable on the predictor variables, and modelling the residual series with ARIMA using the Box-Jenkins approach.

Under Stage One: Multiple Linear Regression Fitting
Here, the first step is to diagnose the data, making sure that all the assumptions of multiple regression are met. Second, the data are modelled using multiple linear regression. The final step is to test the parameters for significance, since ARIMAX deals with significant exogenous variables. The residual series is then obtained for use in the ARIMA modelling.
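The stage-one steps can be sketched in a few lines of plain Python. This is a minimal illustration under simplifying assumptions: the regression has a single predictor, and the variance inflation factor is computed for the two-predictor case only, where it reduces to 1 / (1 − r²). The function names and data values are hypothetical and do not reproduce the study's tables.

```python
def mean(values):
    return sum(values) / len(values)


def fit_simple_ols(x, y):
    """Return (intercept, slope, residuals) for y = b0 + b1 * x + error."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # The residual series is what stage two models with ARIMA.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def vif_two_predictors(x1, x2):
    """VIF for either of two predictors: 1 / (1 - r^2), r = their correlation."""
    m1, m2 = mean(x1), mean(x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 ** 2 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)


# Hypothetical data, not the World Bank series used in the study.
import_values = [1.0, 2.0, 3.0, 4.0]
gdp_values = [3.0, 5.0, 7.0, 9.0]
intercept, slope, residuals = fit_simple_ols(import_values, gdp_values)
vif = vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 4.0, 3.0])
print(intercept, slope, residuals, vif)
```

With more than two predictors, each VIF would come from regressing one predictor on all the others, which needs a full multiple regression rather than the pairwise correlation used here.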

Under Stage Two: ARIMA Modelling
This stage is also divided into three major steps under the Box-Jenkins approach. First, however, we have to check whether the residual series is stationary, that is, whether it has constant mean, constant variance, and serial correlation that does not change over time. If the series is not stationary, first or second differencing is needed to bring it to stationarity. The differencing of the actual observations is denoted $\Delta^{d} Y_t$, where $d$ is the number of times the series is differenced to attain stationarity. The first difference of $Y_t$ is $\Delta Y_t = Y'_t$, defined as
$$Y'_t = Y_t - Y_{t-1}$$
and the second difference is $\Delta^{2} Y_t = Y''_t$, defined as
$$Y''_t = Y'_t - Y'_{t-1} = Y_t - 2Y_{t-1} + Y_{t-2}$$
The model with the lowest Akaike Information Criterion (AIC) and/or Bayesian Information Criterion (BIC) is considered the best among the candidate models.
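The differencing operation can be illustrated with a short sketch. The function name and the example series are hypothetical; the series is chosen so that the second difference is constant.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass maps Y_t to Y_t - Y_{t-1}."""
    result = list(series)
    for _ in range(d):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Hypothetical series with a quadratic trend.
series = [1, 4, 9, 16, 25]
first_diff = difference(series, d=1)   # still trending upward
second_diff = difference(series, d=2)  # constant, so the trend is removed
print(first_diff, second_diff)
```

Each pass shortens the series by one observation, which matters for short annual samples.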

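Autoregressive (AR) model with significant exogenous variables (ARX)
$$\phi(B)\,Y_t = \beta X_t + \varepsilon_t$$
Moving Average (MA) model with significant exogenous variables (MAX)
$$Y_t = \beta X_t + \theta(B)\,\varepsilon_t$$
Autoregressive Moving Average with exogenous variable (ARMAX)
$$\phi(B)\,Y_t = \beta X_t + \theta(B)\,\varepsilon_t$$
where $X_t$ is the exogenous variable at time $t$, and $\beta$ is the coefficient of the exogenous variable.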
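As a worked illustration, the general form with differencing can be specialised to the orders $p = 0$, $d = 1$, $q = 1$. The expansion assumes the general form $\phi(B)\,\Delta^{d} Y_t = \beta X_t + \theta(B)\,\varepsilon_t$ with the MA polynomial written with plus signs, $\theta(B) = 1 + \theta_1 B$; under the opposite sign convention the sign of $\theta_1$ flips.

```latex
\begin{align*}
\phi(B)\,\Delta^{d} Y_t &= \beta X_t + \theta(B)\,\varepsilon_t \\
\Delta Y_t &= \beta X_t + (1 + \theta_1 B)\,\varepsilon_t
  && (p = 0,\ d = 1,\ q = 1) \\
Y_t - Y_{t-1} &= \beta X_t + \varepsilon_t + \theta_1 \varepsilon_{t-1} \\
Y_t &= Y_{t-1} + \beta X_t + \varepsilon_t + \theta_1 \varepsilon_{t-1}
\end{align*}
```

With no AR terms, $\phi(B) = 1$, so the change in the series is explained by the exogenous variable plus the current and previous disturbances.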

Measure of Forecast Accuracy
The measure of forecast accuracy adopted in this study is Theil's $U$. Theil's $U$ shows how closely the forecast conforms to the values of the future periods. In its definition, $Y_t$ is the actual value at time period $t$, $\hat{Y}_t$ is the forecast value, and $n$ is the number of data points.
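Two forms of Theil's statistic are widely used, and the sketch below implements both as commonly stated in the forecasting literature. It is an assumption that these correspond to the formulas intended here, since the exact expressions are not recoverable from the text. `theil_u1` is bounded between 0 and 1, while `theil_u2` compares the forecast with a naïve no-change forecast. The function names and data are hypothetical.

```python
import math


def theil_u1(actual, forecast):
    """Theil's U1: RMSE scaled by the root mean squares of both series."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    scale = (math.sqrt(sum(a ** 2 for a in actual) / n)
             + math.sqrt(sum(f ** 2 for f in forecast) / n))
    return rmse / scale


def theil_u2(actual, forecast):
    """Theil's U2: relative forecast error against a no-change forecast."""
    steps = range(len(actual) - 1)
    num = sum(((forecast[t + 1] - actual[t + 1]) / actual[t]) ** 2 for t in steps)
    den = sum(((actual[t + 1] - actual[t]) / actual[t]) ** 2 for t in steps)
    return math.sqrt(num / den)


# Hypothetical values, not the study's GDP series.
actual = [100.0, 110.0, 120.0, 130.0]
naive = [100.0, 100.0, 110.0, 120.0]  # each forecast repeats the previous actual

perfect_u1 = theil_u1(actual, actual)
perfect_u2 = theil_u2(actual, actual)
naive_u2 = theil_u2(actual, naive)
print(perfect_u1, perfect_u2, naive_u2)
```

A perfect forecast gives 0 under both forms, and the no-change forecast gives a U2 of 1, which is the benchmark used when reading values below or above 1.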
$U$ is the measure of forecast quality and shows how adequate the ARIMAX model is. Following its definition in equation (17): when $U = 0$, the ARIMAX model obtained forecasts perfectly; when $U < 1$, it forecasts better than the naïve model; when $U = 1$, it forecasts as well as the naïve model; and when $U > 1$, it does not forecast as well as the naïve model.

Results/Findings
Figure 1 shows the time plot of GDP, Exchange Rate, Interest Rate, Export, and Import from 1990 to 2019. The estimated coefficients of the predictor variables (Exchange Rate, Interest Rate, Export and Import) and the variance inflation factor (VIF) are shown in Table 1.
Table 1 showed that the VIF for Exchange Rate and Interest Rate is less than 4, while the VIF of Export and Import is greater than 4. This implies that multicollinearity exists among the predictor variables. The correlation matrix in Table 2 shows that Export and Import are highly correlated; Export is taken as the cause of the multicollinearity and is discarded from the list of predictor variables. The estimated coefficients of the remaining predictor variables (Exchange Rate, Interest Rate, and Import) and the variance inflation factor (VIF) are shown in Table 3.
Table 3 showed that the VIF of Exchange Rate, Interest Rate and Import is less than 4, which implies that there is no multicollinearity. Thus, a multiple linear regression can be fitted to the data of interest.
Inferences on the parameters showed that Exchange Rate and Interest Rate are not significant, while Import is significant with a p-value less than 0.05. The regression model is then given as
$$\hat{Y}_t = 9032802.234 + 4.778\,X_{4t}$$
where $X_{4t}$ is Import.
Notes to Table 4: a. The underlying process assumed is independence (white noise). b. Based on the asymptotic chi-square approximation.
Table 4 showed that the Q-statistic at the 16th lag has a p-value of 0.005, which is less than 0.05, implying that the residual series does not have constant mean and constant variance. Thus, the residual series is not stationary and needs to be differenced. The estimated ACF and PACF for the first-differenced residuals and the Box-Ljung statistic (Q-statistic) are shown in Table 5.
Notes to Table 5: a. The underlying process assumed is independence (white noise). b. Based on the asymptotic chi-square approximation.
The Q-statistic at the 16th lag of the first-differenced residual series has a p-value of 0.005, which is less than 0.05, denoting constant mean and variance as shown in Table 5, implying that the first-differenced residual series is stationary. Hence there is a need to estimate the ACF and PACF.
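For readers who want to see how the quantities in such tables are obtained, the sketch below computes the sample autocorrelations and the Box-Ljung Q-statistic from a series. It uses the standard textbook formulas and stops short of the p-value, which requires the chi-square distribution. The function names and the example series are hypothetical.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(series, max_lag):
    """Box-Ljung statistic: Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(series)
    acf = sample_acf(series, max_lag)
    return n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(acf, start=1))


# Hypothetical trending series, not the study's residuals.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
r1 = sample_acf(series, 1)[0]
q1 = ljung_box_q(series, 1)
print(r1, q1)
```

Under the white-noise assumption, Q is compared with a chi-square distribution whose degrees of freedom depend on the number of lags and, for model residuals, on the number of fitted parameters.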
The correlograms of the ACF and PACF, shown in Figures 2 and 3 respectively, indicate that the original residual series is not stationary, since the lags do not show a rapid fall, as indicated by Table 5. Figures 4 and 5 show the correlograms of the first-differenced ACF and PACF respectively. The first-differenced ACF shows a rapid fall of the lags, indicating that the first-differenced residual series is stationary. The PACF plot is the basis for the model selection.
Table 6 shows the various ARIMA models selected from Table 5 under the range of exploration, with their Bayesian Information Criterion (BIC) and Q-statistic. ARIMA (0, 1, 1) has the lowest BIC value, 35.253, implying that it is the best ARIMA model, as the table shows. The selected model ARIMA (0, 1, 1) has a Q-statistic of 12.904 and a p-value of 0.743, which is greater than 0.05, implying that the model is adequate and that an exogenous variable can be added to it. Hence, the best ARIMAX model is ARIMAX (0, 1, 1).
Table 7 shows the estimated parameters of ARIMAX (0,1,1). The moving average term has an estimated parameter of −0.244 with a p-value of 0.222, which exceeds the 0.05 level used elsewhere in this study, so the MA parameter is not statistically significant at that level. The fitted ARIMAX (0, 1, 1) for the first-differenced series $\hat{Y}'_t$ has a coefficient of 4.778 on the exogenous variable (Import) and a coefficient of −0.244 on the lagged disturbance $\varepsilon_{t-1}$. The estimated ARIMAX is given in Appendix 1.
Theil's $U$ forecast accuracy, computed in equation (20), is 0.1861, which is less than 1; this implies that the proposed ARIMAX (0,1,1) is a good model. The forecast quality of 0.05886 given by Theil's $U$ shows that the ARIMAX (0,1,1) model obtained is adequate.

Conclusion
The aim of this paper is to identify an adequate ARIMAX model for forecasting Nigeria's GDP. A multiple linear regression was fitted to the data once its assumptions were met. ARIMA models were then fitted to the residuals, the explored models were compared based on the Bayesian Information Criterion (BIC), and ARIMA (0, 1, 1) was selected. The ARIMA model and the significant exogenous variable were then combined. The Theil's U statistic showed that ARIMAX (0,1,1) is appropriate for forecasting Nigeria's GDP.
The fitted ARIMAX (0,1,1) model proposed in this study can play an important role in monetary policy. The Central Bank of Nigeria can use the proposed model to predict short-term macroeconomic conditions and decide on its future operations. The ARIMAX (0,1,1) will help to determine whether it is justified to continue or withdraw a monetary policy, and it can serve as a useful tool in the decision-making process of the monetary authorities so as to optimize resources.