On Consistency of Tests for Stationarity in Autoregressive and Moving Average Models of Different Orders

One of the most important assumptions about econometric and time series data is stationarity. This study therefore suggests that, when deciding by classical methods whether economic data are stationary, it is useful to perform tests of the null hypothesis of stationarity as well as tests of the null hypothesis of a unit root. The study compared the power and type I error of the Augmented Dickey-Fuller (ADF), Kwiatkowski-Phillips-Schmidt-Shin (KPSS) and Phillips-Perron (PP) tests of the null hypothesis of stationarity against the alternative of a unit root at different orders of autoregressive and moving average models and various sample sizes. Simulation studies were conducted using the R statistical package to investigate the performance of the stationarity and unit root tests at sample sizes 20, 40, ..., 200 for the first, second and third orders of autoregressive (AR), moving average (MA) and mixed autoregressive moving average (ARMA) models. The relative performance of the tests was examined through their percentage power and type I error rates. The study concluded that PP is the best over all the conditions considered (models, sample sizes and orders), and that PP also performs best in terms of type I error rate.


Introduction
The most important methods for dealing with econometric and time series data, in the case of model fitting, include autoregressive (AR) models, moving average (MA) models, and mixed autoregressive moving average (ARMA) models. The basic assumption of these models is stationarity: the data fitted to them should be stationary.
Autoregressive and moving average models are mathematical models of the persistence, or autocorrelation, in a time series. The models are widely used in econometrics, hydrology, engineering and other fields. There are several possible reasons for fitting AR, MA and ARMA models to data. Modeling can contribute to understanding the physical system by revealing something about the physical process that builds persistence into the series. The models can also be used to predict the behavior of a time series or econometric data from past values. Such a prediction can be used as a baseline to evaluate the possible importance of other variables to the system. They are widely used for prediction of economic and industrial time series. Another use of AR, MA and ARMA models is simulation, in which synthetic series with the same persistence structure as an observed series can be generated. Simulations can be especially useful for establishing confidence intervals for statistics and estimated econometric quantities.
These studies suggest that, in trying to decide by classical methods whether economic data are stationary or integrated, it would be useful to perform tests of the null hypothesis of stationarity as well as tests of the null hypothesis of a unit root. This paper provides a straightforward test of the null hypothesis of stationarity against the alternative of a unit root at different orders of autoregressive and moving average models and various sample sizes. There have been surprisingly few previous attempts to test the null hypothesis of stationarity.
[1] considers a test statistic which is essentially the F statistic for 'superfluous' deterministic trend variables; this statistic should be close to zero under the stationary null but not under the alternative of a unit root. [2] considers the Dickey-Fuller test statistics, but estimates both trend-stationary and difference-stationary models and then uses the bootstrap to evaluate the distribution of these statistics.
The outcomes of this research will help a researcher understand which of the tests of stationarity can be used for a particular order of autoregressive and moving average models, and at what sample size each test is reliable.

Autoregressive Processes
An autoregressive model is simply a linear regression of the current value of the series against one or more prior values of the series. The number of prior values used, p, is called the order of the AR model. AR models can be analyzed with various methods, including standard linear least squares techniques.
Assume that the current value of the series is linearly dependent upon its previous values, with some error. Then we have the linear relationship

$$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \cdots + \alpha_p X_{t-p} + e_t \quad (1)$$

where $\alpha_1, \alpha_2, \ldots, \alpha_p$ are the autoregressive parameters and $e_t$ is a white noise process with zero mean and variance $\sigma_e^2$.
Autoregressive processes are, as their name suggests, regressions on themselves. Specifically, a pth-order autoregressive process $\{X_t\}$ satisfies Equation (1). The current value of the series $X_t$ is a linear combination of the p most recent past values of itself plus an "innovation" term $e_t$ that incorporates everything new in the series at time t that is not explained by the past values. Thus, for every t, we assume that $e_t$ is independent of $X_{t-1}, X_{t-2}, X_{t-3}, \ldots$. [3] carried out the original work on autoregressive processes.
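As an illustration (a Python sketch, not the study's actual R code), an AR(p) series of the form of Equation (1) can be simulated by direct recursion on past values; the coefficient 0.5 below is one of the stationary parameter choices used later in the simulation study:

```python
import numpy as np

def simulate_ar(alphas, n, burn=200, seed=0):
    """Simulate n observations from
    X_t = alpha_1*X_{t-1} + ... + alpha_p*X_{t-p} + e_t
    with standard-normal white noise e_t; a burn-in period is discarded
    so the series starts close to its stationary distribution."""
    rng = np.random.default_rng(seed)
    p = len(alphas)
    e = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(p, n + burn):
        x[t] = sum(a * x[t - 1 - j] for j, a in enumerate(alphas)) + e[t]
    return x[burn:]

x = simulate_ar([0.5], n=200)   # stationary AR(1) with alpha_1 = 0.5
```

Because the coefficient satisfies the stationarity condition, the simulated series fluctuates around a constant mean of zero.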

Moving Average Model
The general moving average model of order q, generally represented by MA(q), can be given as follows:

$$X_t = e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q} \quad (2)$$

The terminology moving average arises from the fact that $X_t$ is obtained by applying the weights $1, -\theta_1, \ldots, -\theta_q$ to the variables $e_t, e_{t-1}, \ldots, e_{t-q}$.
The moving average model was first considered by [4] and later by [5].
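For example, an MA(q) series as in Equation (2) can be generated directly from the white-noise sequence; no recursion or burn-in is needed, since an MA process is a finite weighted sum of past innovations and is always stationary (an illustrative Python sketch, not the study's R code):

```python
import numpy as np

def simulate_ma(thetas, n, seed=0):
    """Simulate n observations from
    X_t = e_t - theta_1*e_{t-1} - ... - theta_q*e_{t-q}."""
    rng = np.random.default_rng(seed)
    q = len(thetas)
    e = rng.standard_normal(n + q)   # q extra draws to supply the initial lags
    return np.array([e[t] - sum(th * e[t - 1 - j] for j, th in enumerate(thetas))
                     for t in range(q, n + q)])

x = simulate_ma([0.5, 0.3], n=500)   # MA(2) with theta_1 = 0.5, theta_2 = 0.3
```

With unit innovation variance the theoretical variance is $1 + \theta_1^2 + \theta_2^2 = 1.34$, and the sample variance of the simulated series is close to this value.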

Autoregressive Moving Average Model (ARMA)
In some applications, the autoregressive or moving average models discussed in the previous sections become cumbersome because we may need a high-order model with many parameters to adequately describe the dynamic structure of the data. To overcome this, autoregressive moving average (ARMA) models are introduced [6]. Basically, an ARMA model combines the ideas of autoregressive and moving average models into a compact form so that the number of parameters used is kept small. The concept of the ARMA model is highly relevant in volatility modeling [7].
If we assume that the series is partly autoregressive and partly moving average, we obtain a quite general time series model (ARMA):

$$X_t = \alpha_1 X_{t-1} + \cdots + \alpha_p X_{t-p} + e_t - \theta_1 e_{t-1} - \cdots - \theta_q e_{t-q} \quad (3)$$

We say that $X_t$ is a mixed autoregressive moving average process of orders p and q respectively. Symbolically, the model is represented by ARMA(p, q).
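Combining the two recursions gives a compact simulation of Equation (3); the ARMA(1,1) case below is an illustrative Python sketch (not the study's R code) with stationary parameter values:

```python
import numpy as np

def simulate_arma11(alpha, theta, n, burn=200, seed=0):
    """Simulate ARMA(1,1): X_t = alpha*X_{t-1} + e_t - theta*e_{t-1},
    discarding a burn-in so the series is close to stationarity."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        x[t] = alpha * x[t - 1] + e[t] - theta * e[t - 1]
    return x[burn:]

x = simulate_arma11(0.5, 0.3, n=300)
```

Stationarity of an ARMA process depends only on the AR part, so the condition $|\alpha| < 1$ suffices here.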

Stationarity
In statistics, a stationary process is a stochastic process whose joint probability distribution does not change when shifted in time. Consequently, parameters such as the mean and variance, if they exist, also do not change over time.
The most important assumption made about time series data is that of stationarity. The basic idea of stationarity is that the probability laws that govern the behavior of the process do not change over time. Indeed, the process is in statistical equilibrium. Specifically, a process $\{Y_t\}$ is said to be strictly stationary if the joint distribution of $Y_t$ is the same as that of $Y_{t-k}$ for all t and k. In other words, the Y's are (marginally) identically distributed [8,9]. It then follows that $E(Y_t) = E(Y_{t-k})$ for all t and k, so that the mean function is constant over time. Additionally, $\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t-k})$ for all t and k, so that the variance is also constant over time. Also, a basic assumption of a stationary time series is white noise, i.e. the error term of the model must be normally distributed with mean zero and variance $\sigma^2$.
Based on these, the autoregressive parameters fixed for our simulation were chosen to satisfy the stationarity conditions. The first-order autoregressive process is stationary when $|\alpha_1| < 1$, obtained from the characteristic equation $1 - \alpha_1 x = 0$.

For the second-order autoregressive model

$$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + e_t \quad (4)$$

we introduce the autoregressive characteristic polynomial

$$\alpha(x) = 1 - \alpha_1 x - \alpha_2 x^2 \quad (5)$$

and the corresponding AR characteristic equation

$$1 - \alpha_1 x - \alpha_2 x^2 = 0 \quad (6)$$

It may be shown that, subject to the condition that $e_t$ is independent of $X_{t-1}, X_{t-2}, \ldots$, a stationary solution to Equation (4) exists if and only if the roots of the AR characteristic equation exceed 1 in absolute value (modulus). We sometimes say that the roots should lie outside the unit circle in the complex plane. This statement generalizes to the pth-order case without change. In the second-order case, the roots of the quadratic characteristic equation (6) are easily found to be

$$x = \frac{\alpha_1 \pm \sqrt{\alpha_1^2 + 4\alpha_2}}{-2\alpha_2} \quad (7)$$

The AR(2) process is stationary if the absolute values of the roots in (7) both exceed 1, which is equivalent to the conditions $\alpha_1 + \alpha_2 < 1$, $\alpha_2 - \alpha_1 < 1$ and $|\alpha_2| < 1$.
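The root condition generalizes directly to any order p and is easy to check numerically; a minimal sketch (illustrative, not part of the paper) that tests whether all roots of the characteristic polynomial lie outside the unit circle:

```python
import numpy as np

def ar_is_stationary(alphas):
    """Check stationarity of an AR(p) process by testing whether all roots
    of 1 - alpha_1*x - ... - alpha_p*x^p lie outside the unit circle."""
    # np.roots expects coefficients ordered from highest degree to constant
    coeffs = [-a for a in alphas[::-1]] + [1.0]
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

ok = ar_is_stationary([0.5, 0.3])        # satisfies all three AR(2) conditions
bad = ar_is_stationary([0.6, 0.5])       # alpha_1 + alpha_2 >= 1: not stationary
```

For $\alpha_1 = 0.5, \alpha_2 = 0.3$ the roots are approximately $1.17$ and $-2.84$, both outside the unit circle, so the process is stationary.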

Methodology
Simulation studies were conducted to investigate the performance of tests of stationarity at different orders of autoregressive and moving average models. Several data sets were simulated, following the assumptions of stationarity mentioned earlier, at different sample sizes and orders of autoregressive (AR), moving average (MA), and mixed autoregressive moving average (ARMA) models. The effects of sample size, model order and the stationarity of the models on the power of the stationarity tests were examined. The tests considered in this study are the augmented Dickey-Fuller (ADF), Kwiatkowski-Phillips-Schmidt-Shin (KPSS) and Phillips-Perron (PP) tests.

To illustrate the important statistical issues associated with autoregressive unit root tests, consider the simple AR(1) model

$$y_t = \phi y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{iid}(0, \sigma^2)$$

The test statistic is

$$t_{\phi=1} = \frac{\hat{\phi} - 1}{SE(\hat{\phi})}$$

where $\hat{\phi}$ is the least squares estimate and $SE(\hat{\phi})$ is the usual standard error estimate. The test is a one-sided left-tail test. If $y_t$ is stationary (i.e. $|\phi| < 1$), then it can be shown [10] that $\sqrt{T}(\hat{\phi} - \phi) \xrightarrow{d} N(0, 1 - \phi^2)$.
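The statistic $t_{\phi=1}$ can be computed with ordinary least squares; a minimal Python sketch of the no-constant, no-trend case (illustrative, not the study's implementation):

```python
import numpy as np

def df_tstat(y):
    """Dickey-Fuller t-statistic for H0: phi = 1 in y_t = phi*y_{t-1} + e_t,
    estimated by OLS with no constant or trend."""
    y_lag, y_cur = y[:-1], y[1:]
    phi_hat = np.dot(y_lag, y_cur) / np.dot(y_lag, y_lag)
    resid = y_cur - phi_hat * y_lag
    s2 = np.dot(resid, resid) / (len(y_cur) - 1)   # residual variance
    se = np.sqrt(s2 / np.dot(y_lag, y_lag))        # SE(phi_hat)
    return (phi_hat - 1.0) / se

rng = np.random.default_rng(1)
e = rng.standard_normal(500)
walk = np.cumsum(e)          # unit-root series (phi = 1)
ar = np.zeros(500)           # stationary AR(1) with phi = 0.5
for t in range(1, 500):
    ar[t] = 0.5 * ar[t - 1] + e[t]
```

For the stationary series the statistic is strongly negative, while for the random walk it follows the nonstandard Dickey-Fuller distribution rather than N(0, 1), which is why special critical values are required.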

Augmented Dickey-Fuller Test
In statistics and econometrics, an augmented Dickey-Fuller test (ADF) is a test for a unit root in a time series sample. It is an augmented version of the Dickey-Fuller test for a larger and more complicated set of time series models. The augmented Dickey-Fuller (ADF) statistic, used in the test, is a negative number. The more negative it is, the stronger the rejection of the hypothesis that there is a unit root, at some level of confidence.

Testing Procedure
The testing procedure for the ADF test is the same as for the Dickey-Fuller test, but it is applied to the model

$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_{p-1} \Delta y_{t-p+1} + \varepsilon_t \quad (8)$$

where $\alpha$ is a constant, $\beta$ is the coefficient on a time trend, and $p$ is the lag order of the autoregressive process. By including lags of order p, the ADF formulation allows for higher-order autoregressive processes. This means that the lag length p has to be determined when applying the test. One possible approach is to test down from high orders and examine the t-values on the coefficients. An alternative approach is to examine information criteria such as the Akaike information criterion, the Bayesian information criterion or the Hannan-Quinn information criterion.
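The information-criterion approach to choosing the lag length can be sketched as follows (a hypothetical helper, not from the paper; the AIC here uses the standard Gaussian approximation $T \log \hat{\sigma}^2 + 2k$ with $k$ the number of estimated coefficients):

```python
import numpy as np

def adf_aic(y, p):
    """AIC of the ADF regression
    dy_t = a + g*y_{t-1} + d_1*dy_{t-1} + ... + d_p*dy_{t-p} + e_t
    fitted by OLS with p lagged differences (constant, no trend)."""
    dy = np.diff(y)
    X = np.array([[1.0, y[t]] + [dy[t - 1 - j] for j in range(p)]
                  for t in range(p, len(dy))])
    z = dy[p:]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    T = len(z)
    sigma2 = np.dot(resid, resid) / T
    return T * np.log(sigma2) + 2 * (p + 2)   # p + 2 estimated coefficients

rng = np.random.default_rng(3)
y = np.cumsum(rng.standard_normal(300))       # example unit-root series
best_p = min(range(4), key=lambda p: adf_aic(y, p))
```

The lag order minimizing the criterion is then used in the ADF regression (8).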
The unit root test is then carried out under the null hypothesis $\gamma = 0$ against the alternative hypothesis $\gamma < 0$. Once a value for the test statistic is computed, it can be compared to the relevant critical value for the Dickey-Fuller test. If the test statistic is less than the (larger negative) critical value (this test is non-symmetrical, so we do not consider an absolute value), then the null hypothesis of $\gamma = 0$ is rejected and no unit root is present.

[11,12] proposed an LM test for testing trend and/or level stationarity (the KPSS test). That is, the null hypothesis is now a stationary process. Taking the null hypothesis as a stationary process and the unit root as the alternative is in accordance with a conservative testing strategy: if we reject the null hypothesis, we can be fairly confident that the series indeed has a unit root. Therefore, if the results of the tests above indicate a unit root but the result of the KPSS test indicates a stationary process, one should be cautious and opt for the latter result. The null hypothesis is $H_0: \sigma_u^2 = 0$.

Kwiatkowski, Phillips, Schmidt and Shin (KPSS)
Under the null hypothesis $\sigma_u^2 = 0$, the test statistic is

$$\eta = \frac{1}{T^2 \hat{\sigma}^2} \sum_{t=1}^{T} S_t^2, \qquad S_t = \sum_{i=1}^{t} e_i$$

where $e_t$ are the residuals from the regression of $y_t$ on a constant and a time trend, and $\hat{\sigma}^2$ is a consistent estimate of their long-run variance.
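A simplified sketch of the statistic follows (illustrative only: the actual KPSS test uses a heteroskedasticity- and autocorrelation-consistent long-run variance in the denominator, which is replaced here by the plain residual variance for brevity):

```python
import numpy as np

def kpss_stat(y, trend=True):
    """Simplified KPSS statistic: eta = sum(S_t^2) / (T^2 * sigma2_hat),
    with S_t the partial sums of residuals from regressing y on a constant
    (and, if trend=True, a linear trend).  Uses the plain residual
    variance, not the long-run HAC estimate of the full test."""
    T = len(y)
    X = np.column_stack([np.ones(T), np.arange(1, T + 1)]) if trend \
        else np.ones((T, 1))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta               # detrended residuals
    S = np.cumsum(e)               # partial sums
    sigma2 = np.dot(e, e) / T
    return np.sum(S ** 2) / (T ** 2 * sigma2)

rng = np.random.default_rng(2)
eta_stationary = kpss_stat(rng.standard_normal(500))          # small under the null
eta_walk = kpss_stat(np.cumsum(rng.standard_normal(500)))     # large under a unit root
```

Large values of the statistic thus lead to rejection of the stationarity null.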

Methods of Simulations and Analyses
A set of replications of data sets was generated from the first, second and third orders of the autoregressive, moving average, and autoregressive moving average models stated in Equations (1), (2) and (3).
Here, $X_t$ are the present responses, simulated from random samples of the normal distribution, and $X_{t-j}$, $j = 1, 2, 3$, are the past responses of the first, second and third orders respectively, with $i = 1, 2, \ldots, 5000$ indexing the replications. $e_t$ is a random error, known as white noise, which is also normally distributed. For the simulation study, the choice of parameters considered by [9] was adopted to ensure stationarity of the data generated for the AR, MA and ARMA models; these were fixed as 0.1, 0.2, 0.3 and 0.5. The sample sizes simulated from each model are 20, 40, 60, ..., 200. At each sample size, the simulation study was performed 5000 times for the different models.
Different stationarity/unit root tests were then used to analyze each set of simulated data. The test statistics are ADF, KPSS and PP; the comparison was made by counting the number of times each test accepted the stationarity of the simulated data in 5000 replications. The percentages of acceptance were recorded in Tables 1-6 for each case of the models, orders of autoregressive and moving average, and sample sizes. The test with the higher acceptance rate was considered the best.
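The counting procedure described above can be sketched in miniature (a toy Monte Carlo using the simple no-constant DF test with its approximate 5% critical value of -1.95 and only 500 replications; the study itself used 5000 replications of ADF, KPSS and PP in R):

```python
import numpy as np

def df_tstat(y):
    """DF t-statistic for H0: phi = 1 (no constant, no trend)."""
    y_lag, y_cur = y[:-1], y[1:]
    phi = np.dot(y_lag, y_cur) / np.dot(y_lag, y_lag)
    resid = y_cur - phi * y_lag
    se = np.sqrt(np.dot(resid, resid) / (len(y_cur) - 1) / np.dot(y_lag, y_lag))
    return (phi - 1.0) / se

def rejection_rate(alpha, n, reps=500, crit=-1.95, seed=0):
    """Fraction of replications in which the DF test rejects the unit root
    for AR(1) data with coefficient alpha (alpha = 1 gives the size)."""
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(reps):
        e = rng.standard_normal(n)
        x = np.zeros(n)
        for t in range(1, n):
            x[t] = alpha * x[t - 1] + e[t]
        if df_tstat(x) < crit:
            count += 1
    return count / reps

power = rejection_rate(0.5, n=100)   # stationary data: rejection rate = power
size = rejection_rate(1.0, n=100)    # unit-root data: rejection rate = size
```

Power should be high for stationary data, and the size should fall near the nominal 5% level; this is exactly the comparison recorded in Tables 1-6.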
The variation in the comparison of the three stationarity/unit root tests provides an indication of the sensitivity of the methods. Thus, the best method(s) were recommended for research with various sample sizes, models and orders of autoregressive and moving average.

Data Analyses
The results of the three stationarity tests are presented in Tables 1-6 for the first, second and third orders of the AR, MA and ARMA models. Data were simulated using the R software package to investigate the performance of ADF, KPSS and PP on the simulated data at sample sizes 20, 40, 60, 80, 100, 120, 140, 160, 180 and 200. The relative performance of the tests was observed at orders p = 1, 2 and 3 for data generated from different forms of autoregressive (AR), moving average (MA) and autoregressive moving average (ARMA) models. The assumptions of stationarity were observed for all cases of data generated. The experiments were repeated 5000 times for the three statistics and each sample size, and the percentages of acceptance were recorded in Tables 1-6.

Table 1 presents the power of all the test procedures when the underlying time series model is a stationary AR. All the procedures produced reasonably high power over all the sample sizes and orders considered, except at order 2, where ADF (Augmented Dickey-Fuller) and KPSS produced extremely low power compared to PP. Under this condition, Phillips-Perron (PP) has the highest power over all the sample sizes and AR orders considered. Table 2 presents a similar analysis for a stationary MA; the power of the tests is extremely high over all the sample sizes and orders considered, and a similar conclusion as for AR was observed. Table 3 presents the power for the mixed model (stationary ARMA): all the test procedures produced high power over all the sample sizes at order 1, but ADF and KPSS produced low power over all the sample sizes at orders 2 and 3. Table 4 presents the empirical percentage type I error rate of the tests over the sample sizes and orders for non-stationary AR models; the main focus here is to compare the empirical type I error rate with the nominal 5%.
Of all the test procedures, only PP produced an estimate reasonably close to the nominal 5% level, with considerable improvement as the order increases. The worst of all the test procedures under this condition is ADF. Table 5 presents the results for non-stationary MA models: PP appears to produce the best estimate, though not as good as in Table 4, followed by KPSS, with ADF the worst. Finally, Table 6 presents the ARMA results for the empirical percentage type I error rate; PP is still the best procedure under this condition, although its performance appears poor at order 1.

Conclusion
Figures 1-6 show the same results as stated in Tables 1-6 above.
Generally speaking, in terms of validity of the tests based on percentage power, PP is the best over all the conditions considered (models, sample sizes and orders). In terms of usability based on the type I error rate, PP is still the best.

Recommendation
The PP test is preferred since it produced reasonably high power and estimates of type I error close to the nominal value. The conclusion above implies that the PP test correctly rejects a false null hypothesis and at the same time identifies when not to reject.