Bayesian and Frequentist Approach to Time Series Forecasting with Application to Kenya’s GDP per Capita

Real GDP per capita is an important indicator of a country's or region's economic activity and is often used by decision makers in the development of economic policies. Expectations about future GDP per capita can be a primary determinant of investment, employment, wages, profits and stock market activity. This study applied both the frequentist and the Bayesian approaches to Kenya's GDP per capita time series for the period 1980 to 2017, as obtained from the World Bank data portal. Autoregressive integrated moving average (ARIMA) models and state space models were fitted. The results showed that the local linear trend model and the ARIMA(1,2,1) model are both appropriate for forecasting GDP per capita, but the former outperforms the latter. The local linear trend model was used to perform a 3-step-ahead forecast, and the forecast values were US$ 1717.694, US$ 1844.446 and US$ 1971.198 for 2018, 2019 and 2020 respectively. The findings of this study showed that state space models, which use the Bayesian approach, outperform ARIMA models, which use the frequentist approach, in time series forecasting.


Background
Gross domestic product (GDP) is the total value of all the finished goods and services produced within a country's borders in a specific time period, usually one year [1]. Real GDP per capita, on the other hand, is the average income per person in a country or region after isolating the effect of price changes (inflation or deflation). GDP per capita is used by economists to monitor the status and growth of output in an economy, and when combined with measures of purchasing power parity (PPP) it is used to measure people's living standards [2]. It correlates closely with the trend in living standards over time and is used to compare living standards across countries with different populations. An increase in a country's real GDP per capita means an improvement in the living standards of that country.
Real GDP per capita is a better measure of economic growth than real GDP because it takes inflation and population change into account. As real GDP grows, it is assumed that everyone in the chain will benefit and that the growth will have a trickle-down effect on the population, thus improving the standard of living of every person in that economy. A positive or negative change in GDP per capita has a significant effect on the stock market, and investors pay attention to this change when developing an investment idea or strategy.
The world's average GDP per capita dropped from US$ 10,871.178 in 2014 to US$ 10,714.466 in 2017, while Kenya's rose from US$ 1,335.123 to US$ 1,594.834 over the same period. Compared to the world average and to the figures for Seychelles and Luxembourg, Kenya's GDP per capita is far lower, and it is necessary to formulate policies aimed at improving it.
Per capita GDP indicates whether an economy is expanding or contracting and can be used as an indication of a nation's economic growth, decline, or recession. Its significance as a measure of economic development can be seen in three aspects. Firstly, GDP per capita reflects the level and degree of economic development in industrialized countries. Secondly, if individual income levels in a country do not vary much between residents, the data can be used to measure social justice and equality. Finally, GDP per capita has been seen to be related to the level of social stability in a country. Forecasting future economic outcomes is important in the decision-making process of central banks in all countries. Scientific prediction of GDP per capita therefore has theoretical and practical significance for the formulation of economic development goals.
The ARIMA model, developed by Box and Jenkins, has been one of the most widely used models for modelling and forecasting future values of time series data. The study [3] found that the ARIMA(2,2,2) model was the most appropriate model for predicting Kenya's real GDP, [4] identified the ARIMA(1,1,1) as the best model for predicting the GDP of China, and [5] used the Box-Jenkins technique to show that the ARIMA(1,1,0) is the best model for predicting the GDP of Pakistan. However, other empirical studies such as [6][7] have shown that there are other models that can outperform the ARIMA models.
According to [8], the state space model (also known as the dynamic linear model) provides a methodology for treating a wide range of problems in time series analysis. In this model, the development of the system over time is determined by an unobserved series of vectors (θ1, θ2, ..., θn) associated with a series of observations (y1, y2, ..., yn). The application of the Kalman filter in state space modelling leads to minimum variance linear unbiased estimates (MVLUE) of the model parameters. State space models can be discussed from three perspectives: the local level model, the local level model with trend, and the basic structural model; more details are found in section 3.3.3. Estimates of future observations of a time series can be obtained dynamically using the Kalman filter, while the minimum mean square error (MMSE) estimators of the model parameters can be computed by a smoothing algorithm [9].
The study [10] compared the forecasts made by a basic form of the structural model with the forecasts made by ARIMA models and concluded that there may be strong arguments in favor of using state space models in practice. Another study [11] showed that a structural time series model is appropriate for analyzing a time series with trend, seasonal, cyclical and regression components, in both the time and frequency domains. According to [12], the posterior probability is spread widely among many models, and Bayesian models are superior to choosing any single model. The study [13] suggested that Bayesian model averaging is a useful alternative to other forecasting procedures, especially because of the flexibility with which new information can be incorporated. A study [14] showed that Bayesian models can outperform ARIMA models, especially when forecasting over a short horizon.
An accurate prediction of real GDP per capita is necessary to get an insightful idea of the future trend in living standards. Raw current and historical data cannot be used to develop suitable economic policies and strategies or in the allocation of funds to a particular industry. This requires an accurate, efficient and reliable estimate of GDP per capita for some period ahead. A wide range of models can be used for prediction; each has its own characteristics, advantages and disadvantages. This study aimed at identifying the best statistical model for predicting Kenya's real GDP per capita. The frequentist (ARIMA model) and the Bayesian (State space model) approaches were used.

Statement of the Problem
Both real GDP and GDP per capita measure a country's economic activity, but the former is the more widely used measure. However, the latter correlates closely with the trend in living standards and is a better measure because it takes population growth into consideration. It is used to compare standards of living across countries with different populations or from one period to another. The world's average GDP per capita dropped from US$ 10,871.178 in 2014 to US$ 10,714.466 in 2017, with Luxembourg recording the highest GDP per capita of US$ 105,803. In Kenya it rose from US$ 1,335.123 in 2014 to US$ 1,594.834 in 2017, but the living standard in Kenya is still low. A report by KNBS (2018) showed that its growth dropped from 2.9% in 2016 to 1.9% in 2017. Accurate prediction is necessary in order to understand the future trend in standards of living. Raw current and historical data cannot be used to develop suitable economic policies and strategies or in the allocation of funds to a particular industry. This requires a reliable estimate of GDP per capita for some period ahead. This study aimed at identifying the best model for predicting Kenya's real GDP per capita.

General Objective
The general objective of this study was to identify the most appropriate model for forecasting Kenya's real GDP per capita.

Specific Objectives
The specific objectives of the study were:
1. To identify an appropriate state space model for forecasting Kenya's GDP per capita.
2. To fit an appropriate ARIMA model for forecasting the GDP per capita.
3. To compare the models identified in (1) and (2) above and use the best model to forecast 3 years ahead.

Significance of the Study
Kenya's average GDP per capita was US$ 1,594.8350 in 2017. This is very low compared to the world's average GDP per capita of US$ 10,714.466. Most previous studies on economic growth have put more emphasis on real GDP than on real GDP per capita. A country's aggregate economic growth is not what matters most; what matters most is whether the people living in the country are getting wealthier. Real GDP per capita takes inflation and population growth into consideration, making it a good measure of economic growth and a good indicator of standards of living. The ARIMA model has been widely used in modelling economic time series data, but there are other models that can be used for this type of data, such as state space models. One of the advantages of the state space model is that the model parameter estimates are updated as newer information becomes available. This study used the most recent data to build statistical models based on the Bayesian and frequentist approaches. The best model was identified and used to perform a 3-step-ahead forecast of Kenya's real GDP per capita.

Literature Review
This section presents the theoretical, empirical and conceptual framework. Section 2.1 covers the theoretical literature, section 2.2 is on Empirical literature and section 2.3 presents the conceptual framework. These three sections are further discussed in subsections.

Classical Theory of Economic Growth
The classical theory of economic growth was developed by Adam Smith, David Ricardo, and Robert Malthus in the eighteenth and nineteenth centuries. The theory states that every economy has a steady-state GDP and that any deviation from that steady state is temporary, so the economy will eventually return to it [15]. This is because when GDP grows, population also increases, leading to a higher demand on the limited resources, and GDP eventually falls back to the steady state. When GDP goes below the steady state, population decreases and leads to a lower demand on the resources, which in turn raises GDP back to its steady state.

Neoclassical Growth Theory
This theory was developed by Robert Solow and Trevor Swan in 1956. According to this theory, economic growth is affected by labor, capital, and technology, and most of all by technological advances [16]. Output per worker increases with capital per worker but at a decreasing rate, reflecting diminishing marginal returns. As per this theory, economic growth will not take place unless there are technological advances, which lead to the adjustment of labor and capital. The theory also implies that if all nations have access to the same technology, standards of living will eventually become equal. This model fails to explain how technology is a factor of growth.

Forecasting GDP Using the ARIMA Model
The ARIMA methodology has been widely applied by many researchers in modelling and forecasting future GDP rates. The autoregressive integrated moving average (ARIMA) model was first popularized by a study [17]. The future values of a time series are predicted as a linear combination of its own past values and a series of random shocks or innovations. ARIMA modelling is an iterative process that involves four stages: identification, estimation, diagnostic checking and forecasting. The model is applied to stationary time series. It can also be applied to a non-stationary time series that can be transformed to a stationary one [18].
The study [3] used Kenya's GDP time series data for the period 1960-2012 to build a class of ARIMA models using the Box-Jenkins procedure and showed that the ARIMA(2,2,2) model was the most appropriate model for modelling Kenya's GDP. The study [19] used GDP time series data from Shaanxi for the period 1952-2007 to perform a 6-year forecast of the province's GDP. They identified the ARIMA(1,2,1) model as the most appropriate model for predicting the GDP of Shaanxi.
The study [4] identified the ARIMA(1,1,1) as the best model for fitting the GDP of China using time series data for the period 1978 to 2006. This was followed by a prediction from 2007 to 2011; the error between the actual and predicted values was small, indicating that the ARIMA model is a precise and effective method for forecasting the GDP time series. The study [20] applied the Box-Jenkins methodology in modelling and forecasting the real GDP rate in Greece using time series data for the period 1980-2013. They used the fitted model to forecast GDP for the years 2015, 2016 and 2017. The statistical results showed that Greece's real GDP rate was steadily improving.
The study [5] used Pakistan's GDP time series data from 1953 to 2012, obtained from the IMF, to construct an ARIMA model for predicting Pakistan's future GDP. After investigating a set of ARIMA models following the Box-Jenkins technique, ARIMA(1,1,0) was found to be the best fit for the data. This model was used for predicting the GDP of Pakistan from 2013 to 2025. The predicted GDP was found to be 23477 billion and 103918 billion rupees for 2013 and 2025 respectively.
The study [6] used time series data for the period 1993 to 2009 to study the GDP per capita for the top 5 ranked regions in Sweden. They used ARIMA, VAR and AR(1) models to fit the regional GDP per capita using data for the period 1993 to 2004, and the data for the last 5 years was then used to evaluate the performance of the prediction. After comparing the performance of the three models on several statistical measures, they found that all three models are valid for forecasting real GDP per capita but that AR(1) outperformed the other models. Another study [7] compared the performance of VAR, ARIMA, and Bridge models in forecasting quarterly GDP growth for Albania. Their empirical results showed that the VAR model outperformed the other models and that the ARIMA models had the worst forecast performance of the three. Nevertheless, the ARIMA models performed better than naive models, because Theil's U statistic was lower than 1.

Forecasting GDP per Capita
GDP per capita is often used as a measure of economic development and is one of the most important measures in macroeconomics. It is a widely used indicator for country-level income and has been used in modeling health outcomes, mortality trends, cause specific mortality estimation, health system performance and finances, and several other topics of interest. Again, it is one of the most regularly measured economic indicators, with estimates produced quarterly or annually by countries themselves as well as agencies such as the World Bank, UN and the IMF. When combined with the purchasing power parity (PPP), GDP per capita is used to measure people's standards of living.
The study [21] applied the extrapolation method to estimate the real GDP per capita for more than 100 countries using data for 16 countries. The data for the other countries were estimated using a short-cut method which extrapolates the relationship found for the 16 countries between real GDP per capita and certain independent variables. These estimates were subject to a large margin of error but were closer to the true figures than the most commonly used comparisons of nominal GDP per capita.
The study [22] examined the real GDP per capita growth rate of 19 selected OECD member countries for the period between 1950 and 2007. The findings indicated that the growth rate of real GDP per capita can be represented as the sum of two components: a monotonically decreasing economic trend and fluctuations related to the change in some specific age population. According to this study, the economic trend is modeled by an inverse function of real GDP per capita with a constant numerator. The statistical analysis showed a very weak linear trend in the annual increment of GDP per capita for the USA, Japan, France, and Italy, and a larger positive linear trend in annual increments for the UK, Australia and Canada.
Another study [23] investigated the evolution of real GDP per capita in the United States using a two-component model; the first component was the growth trend and the second component was the fluctuations around the growth trend. The trend component was found to be inversely proportional to the attained level of real GDP per capita. The second component was defined as half of the growth rate of the number of 9-year-olds. VAR, VECM, and linear regression were used to estimate the goodness of fit and RMSE. The highest R² of 0.95 and the lowest RMSE were obtained in the VAR representation. The cointegration tests showed that the deviations of real economic growth from the growth trend are driven by the change in the number of 9-year-olds.
The study [24] considered two methods of forecasting real per capita GDP at various horizons: univariate time series models estimated country by country, and cross-country growth regressions. The results of the study showed that there were only modest differences between these two approaches. Both models performed similarly to forecasts generated by the World Bank's Unified Survey. The results did not highlight which model outperformed the other but suggested that there are potential gains from combining time series and growth regression-based forecasting approaches.
In summary, there are many models that can be used to forecast macroeconomic time series variables such as real GDP per capita. The study [25] argued that ARIMA models are robust especially when generating short-run GDP forecasts and have frequently outperformed more sophisticated structural models in terms of short-run forecasting ability.
Other empirical studies such as [6][7] have shown that ARIMA models can be outperformed by other models.

Bayesian Analysis of State Space Models
The Bayesian philosophy was developed by Reverend Thomas Bayes in the 18th century, but at the time it was not widely used because of its complexity. Owing to its advantages and to advances in computing, it was revived in the 20th century and its use in econometrics has increased rapidly since then. This method allows the prior information available before seeing the data to be incorporated. The Bayesian paradigm is natural for prediction and takes into account uncertainty about both the model parameters and the model itself.
The study [11] developed the Bayesian Structural Time Series (BSTS) model which falls under state space models. In this model an unobserved latent state is predicted using noisy measurements of the observed quantity. This model assumes that the noise is normally distributed and that we have some idea of how the latent state evolves over time. The time dependency in this model is computed using a combination of Kalman filtering, Kalman smoothing, and sampling from posterior distributions using Markov Chain Monte Carlo methods.
The study [26] described a system for short-term forecasting that averages over different combinations of predictors. The system combined a structural time series model for the target series with a regression component capturing the contributions of contemporaneous search query data. Even though their system focused on using search engine data to forecast economic time series, they also suggested that the underlying statistical methods could be applied to more general short-term forecasting with large numbers of contemporaneous predictors.
The study [27] suggested that the use of MCMC methods has made complex time series models amenable to Bayesian analysis. Their study focused on ARIMA models and their fractionally integrated counterparts, state space models, Markov switching and mixture models, models allowing for time-varying volatility, and recent approaches to non-parametric Bayesian modelling of time series. They concluded that Bayesian time series modelling is alive and well and that its advantages deserve further exploration.
According to [8], state space models provide a methodology for treating a wide range of problems in time series analysis. The development of the system over time is determined by an unobserved series of vectors (θ1, θ2, ..., θn) associated with a series of observations (y1, y2, ..., yn). Kalman filtering leads to minimum variance linear unbiased estimates of the model parameters and can be used to predict future observations of a time series. The minimum mean square error estimator of the model parameters can be computed by a smoothing algorithm [9].
The study [10] compared the forecasts made by a structural model with those made by ARIMA models and concluded that there may be strong arguments in favor of using structural models in practice. Another study [11] showed that a structural time series model is appropriate for analyzing a time series with trend, seasonal, cyclical and regression components. The study [12] found that the posterior probability is spread widely among many models and suggested that Bayesian models are superior to choosing any single model.
The study [28] described practical methods for implementing Bayesian model averaging with factor models. They developed simulation algorithms that can efficiently select the model with the highest marginal likelihood. These methods were used to forecast U.S. GDP using data on 162 time series. The results indicated that models containing factors outperform autoregressive models in forecasting GDP at short horizons.

Summary of Reviewed Literature
The reviewed literature focused mostly on ARIMA models, state space models, GDP and GDP per capita. Several studies, such as [3][5][17][19][20], showed that the ARIMA model is appropriate for modelling and forecasting GDP. However, other empirical studies such as [6][7] indicate that the ARIMA models can be outperformed by other models. The studies [14][26][27][28] have shown that state space models are the most appropriate models for fitting and forecasting short time series data. Most statistical time series models can be expressed in state space form, including all ARIMA and VARMA models.

Research Gaps
Most of the past empirical studies identified ARIMA models as the most appropriate models for fitting and forecasting real GDP or GDP per capita. Among these studies there is little research on real GDP per capita in developing countries, and almost none on Kenya. Real GDP per capita responds differently to changes in technology, health and other socio-economic variables. Bayesian models are appropriate for forecasting with time series data, especially over a short horizon, but have been under-utilized because they are complex. The development of computational algorithms in statistical software has made them easier to apply. The study [14] noted that Bayesian models are alive and well and that there is a need to explore the advantages that can be gained from using Bayesian methods on time series data. This study was meant to bridge this literature gap.

Research Design
In this study the experimental research design was used, since the data was subjected to two different models with the aim of identifying the best model and then performing a 3-step-ahead forecast using the identified model. The study [29] stated that experimental design is used where there is time priority in a causal relationship, consistency in a causal relationship, and where the magnitude of the correlation is great. The longitudinal study design was applied since measurements on the study population were taken sequentially at regular time intervals, i.e. annually.

Target Population, Sampling Frame, Sample Size and Sampling Technique
Target population refers to the entire group of individuals or objects which the researcher wishes to study and draw conclusions about. The target population for this study was Kenya's GDP per capita. The sampling frame was the real GDP per capita for the period 1980-2017. If the target population is less than 100, the whole population should be included in the study and a census survey undertaken (Sperling, Gay, & Airasian, 2003). For this study, a census survey was undertaken since the sampling frame was less than 100; hence no sampling was done. The data used in this study was retrieved from the World Bank's data portal and was available for the period between 1980 and 2017.

Data Processing and Analysis
In this study, the R statistical software package was used for data processing and analysis. Kenya's real GDP per capita time series was used to construct the ARIMA and the state space models. This study focused on the additive time series model, which takes the form:
$$y_t = \mu_t + \gamma_t + \varepsilon_t \tag{1}$$
where $y_t$ is the observed value, $\mu_t$ is the trend component, $\gamma_t$ is the seasonal component and $\varepsilon_t$ is a random component assumed to be white noise.

Autoregressive Integrated Moving Average (ARIMA) Model
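An ARIMA(p,d,q) process is obtained when an autoregressive (AR) process of order p is combined with a moving average (MA) process of order q, and a d-th difference is taken so as to make the ARMA(p,q) process stationary. An AR(p) process is expressed as:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t \tag{2}$$
while an MA(q) process is expressed as:
$$Y_t = \varepsilon_t + \beta_1 \varepsilon_{t-1} + \beta_2 \varepsilon_{t-2} + \cdots + \beta_q \varepsilon_{t-q} \tag{3}$$
When equations (2) and (3) are combined they yield the ARMA(p,q) process represented by equation (4) below.
$$Y_t = \sum_{i=1}^{p} \phi_i Y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \beta_j \varepsilon_{t-j} \tag{4}$$
A non-stationary ARMA(p,q) process can be transformed to a stationary process through differencing, yielding an ARIMA(p,d,q) process as shown below.
$$\Delta^d Y_t = \sum_{i=1}^{p} \phi_i \Delta^d Y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \beta_j \varepsilon_{t-j} \tag{5}$$
This can be rewritten as:
$$W_t = \sum_{i=1}^{p} \phi_i W_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \beta_j \varepsilon_{t-j}, \qquad W_t = \Delta^d Y_t \tag{6}$$
A time series $\{Y_t\}$ is said to follow an ARIMA model if the d-th difference, $W_t = \Delta^d Y_t$, is a stationary ARMA process. The ARIMA(0,0,0) process is known as a white noise process; $\phi_i$ and $\beta_j$ are parameters to be estimated and $\varepsilon_t$ is a white noise process, for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, q$.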
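As a worked illustration, the ARIMA(1,2,1) specification reported in the abstract can be written out explicitly. The expansion below is a sketch that assumes the sign convention in which the MA coefficient enters with a plus sign (some texts use a minus sign instead); $\phi_1$ and $\beta_1$ stand for the single AR and MA coefficients.

```latex
\begin{align*}
W_t &= \Delta^2 Y_t = Y_t - 2Y_{t-1} + Y_{t-2}, \\
W_t &= \phi_1 W_{t-1} + \varepsilon_t + \beta_1 \varepsilon_{t-1}, \\
Y_t &= (2 + \phi_1)\, Y_{t-1} - (1 + 2\phi_1)\, Y_{t-2} + \phi_1 Y_{t-3}
       + \varepsilon_t + \beta_1 \varepsilon_{t-1}.
\end{align*}
```

The last line follows by substituting the first line into the second and collecting terms in $Y_{t-1}$, $Y_{t-2}$ and $Y_{t-3}$.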

(i). Testing Stationarity
The basic concept of stationarity is that the probability laws that govern the behavior of the process do not change over time. A process $\{Y_t\}$ is said to be strictly stationary if the joint distribution of $Y_{t_1}, Y_{t_2}, Y_{t_3}, \ldots, Y_{t_n}$ is the same as the joint distribution of $Y_{t_1-k}, Y_{t_2-k}, Y_{t_3-k}, \ldots, Y_{t_n-k}$ for all choices of time points $t_1, t_2, t_3, \ldots, t_n$ and all choices of time lag $k$. If a process is strictly stationary and has finite variance, then the covariance function must depend only on the time lag. A stochastic process $\{Y_t\}$ is said to be weakly stationary if its mean function is constant over time and $\mathrm{cov}(Y_t, Y_{t-k}) = \mathrm{cov}(Y_0, Y_k)$ for all time $t$ and lag $k \geq 0$.
A time plot for the real GDP per capita gave the general trend of the series. The ADF unit root test was run to determine whether the series was stationary. Since the series was found to be non-stationary, it was transformed to a stationary time series through differencing.
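The differencing step can be sketched in a few lines of Python. This is only an illustration: the function name `difference` and the quadratic example series are assumptions made here, not the study's code or data (the study itself used R).

```python
def difference(series, d=1):
    """Return the d-th difference of a sequence of numbers."""
    values = list(series)
    for _ in range(d):
        # Each pass replaces the series with its successive changes.
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


# Hypothetical series with a quadratic trend; these are not the Kenyan data.
y = [t * t + 3 * t + 5 for t in range(8)]
first_diff = difference(y, d=1)    # still increasing, so not yet stationary
second_diff = difference(y, d=2)   # constant, so the trend has been removed

print(first_diff)
print(second_diff)
```

Each round of differencing shortens the series by one observation, which is why a twice-differenced series of 38 annual values has 36 points.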

(ii). Box-Jenkins Methodology
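Box and Jenkins (1976) developed the ARIMA model for the purpose of estimating and forecasting a univariate time series. In order to use this methodology, one should have either a stationary time series or a time series that can be transformed to a stationary time series. After making the process stationary, the ARIMA(p,d,q) process can be represented by an ARMA(p,q) process as shown in equations (5) and (6) above. If $\{y_t\}$ is an ARMA(p,q) process then:
$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t + \beta_1 \varepsilon_{t-1} + \cdots + \beta_q \varepsilon_{t-q} \tag{9}$$
which can be written as:
$$y_t - \phi_1 y_{t-1} - \cdots - \phi_p y_{t-p} = \varepsilon_t + \beta_1 \varepsilon_{t-1} + \cdots + \beta_q \varepsilon_{t-q} \tag{10}$$
where $\phi_i$ and $\beta_j$ are model parameters of the AR and MA parts respectively. Equation (10) above can be re-written as:
$$\phi(B)\, y_t = \beta(B)\, \varepsilon_t \tag{11}$$
where:
$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{12}$$
$$\beta(B) = 1 + \beta_1 B + \beta_2 B^2 + \cdots + \beta_q B^q \tag{13}$$
and $B$ is the backshift operator, $B y_t = y_{t-1}$.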

(iii). Model Identification
The ACF and PACF plots of the transformed time series were used to identify the orders of the AR and MA terms in the ARIMA model. Candidate models were then compared using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are given by:
$$AIC = -2\log(L) + 2(p + q) \tag{14}$$
$$BIC = -2\log(L) + (p + q)\log(n) \tag{15}$$
where $L$ is the likelihood of the data, $n$ is the sample size, and $p$ and $q$ are the lag orders of the AR and MA terms respectively. The model with the lowest AIC or BIC value was taken to be the best fit for the data.
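The selection rule can be illustrated with a short Python sketch. The penalty terms follow the BIC expression in equation (15) and the matching AIC form $-2\log(L) + 2(p+q)$; the log-likelihood values and the effective sample size below are hypothetical numbers chosen for illustration, not results from the study.

```python
import math


def aic(log_lik, p, q):
    """Akaike information criterion with p AR and q MA parameters."""
    return -2.0 * log_lik + 2 * (p + q)


def bic(log_lik, p, q, n):
    """Bayesian information criterion, as in equation (15)."""
    return -2.0 * log_lik + (p + q) * math.log(n)


# Hypothetical maximized log-likelihoods for three candidate ARIMA(p,2,q) fits,
# keyed by (p, q). These are made-up values, not the study's estimates.
candidates = {(1, 1): -200.0, (2, 2): -199.5, (0, 1): -203.0}
n_obs = 36  # assumed: 38 annual observations minus 2 lost to second differencing

scores = {
    order: (aic(ll, *order), bic(ll, *order, n_obs))
    for order, ll in candidates.items()
}
best_by_aic = min(scores, key=lambda order: scores[order][0])
best_by_bic = min(scores, key=lambda order: scores[order][1])

print("best (p, q) by AIC:", best_by_aic)
print("best (p, q) by BIC:", best_by_bic)
```

Because the BIC penalty grows with $\log(n)$, it tends to favor smaller models than AIC once $n$ exceeds about 8, so the two criteria need not always agree.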

(iv). Diagnostic Checks
After fitting the model, diagnostic checking was carried out to ensure that the selected model was the most appropriate. The adequacy of the model was assessed by checking whether the sequence of residuals formed a white noise process. This was achieved by running the Ljung-Box test for independence and the Shapiro-Wilk test for normality.
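To make the independence check concrete, the Ljung-Box statistic $Q = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)$ can be computed directly from the residual autocorrelations $r_k$. The sketch below is a plain-Python illustration with hypothetical residuals; it computes only the statistic, not a p-value, and the function names are assumptions made here.

```python
def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(x, h):
    """Ljung-Box Q statistic using lags 1..h."""
    n = len(x)
    return n * (n + 2) * sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, h + 1))


# Hypothetical residuals, for illustration only.
residuals = [0.5, -1.2, 0.3, 0.8, -0.4, -0.9, 1.1, 0.2, -0.6, 0.4, -0.3, 0.7]
q_stat = ljung_box_q(residuals, h=5)

# 11.07 is the 95% quantile of a chi-square distribution with 5 degrees of
# freedom. For residuals from a fitted ARMA(p, q) model the degrees of freedom
# are usually reduced to h - p - q, so this cut-off is a simplification.
print("Q =", q_stat, "reject independence:", q_stat > 11.07)
```

A small Q relative to the chi-square cut-off is consistent with white-noise residuals, which is what an adequate model should leave behind.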

(v). Forecasting Real GDP per Capita
One of the primary objectives of building a model for a time series is to be able to forecast future values of that series. This study focused on both in-sample and out-of-sample forecasting. Based on the available time series data up to time $t$, i.e., $\{Y_1, Y_2, Y_3, \ldots, Y_{t-1}, Y_t\}$, our aim was to forecast the value of $Y_{t+m}$ that will occur $m$ time units into the future. The minimum mean square error forecast is given by:
$$\hat{Y}_t(m) = E(Y_{t+m} \mid Y_1, Y_2, \ldots, Y_t) \tag{16}$$

Bayes Theorem
A model is recognized as Bayesian when a probability distribution and Bayes' theorem are used to describe uncertainty regarding the unknown parameters. The Bayesian approach to model formulation begins by quantifying the researcher's existing state of knowledge and assumptions [32]. This prior knowledge is then combined with the likelihood function, which is the joint probability of the data under the stated model assumptions. The posterior distribution is obtained by combining the prior and likelihood information. This combination constitutes Bayes' theorem and can be illustrated by the relationship shown below.
$$\text{posterior} \propto \text{prior} \times \text{likelihood}$$
Bayes' theorem states that the probability that event A occurs, given that event B has occurred, is equal to the probability that both A and B occur, divided by the probability that B occurs, i.e.
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{17}$$
From equation (17) above, if we let A be the parameter(s) $\theta$ and B the observed data $y_t$, then we have:
$$P(\theta \mid y_t) = \frac{P(y_t \mid \theta)\, P(\theta)}{P(y_t)} \tag{18}$$
where:
$P(\theta \mid y_t)$ ≡ the posterior probability.
$P(y_t \mid \theta)$ ≡ the likelihood of obtaining the data given the parameter $\theta$.
$P(\theta)$ ≡ the prior probability of $\theta$.
$P(y_t)$ ≡ the probability of obtaining the data under all admissible parameter values.
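A small numerical sketch may help fix ideas. It applies the prior-times-likelihood rule to a parameter restricted to three candidate values; the Bernoulli setting, the candidate values and the data (3 successes, 1 failure) are all hypothetical choices made for illustration and are unrelated to the GDP series.

```python
def posterior(prior, likelihood):
    """Combine prior and likelihood over a discrete set of parameter values."""
    unnormalized = {theta: prior[theta] * likelihood[theta] for theta in prior}
    evidence = sum(unnormalized.values())  # probability of the data over all theta
    return {theta: weight / evidence for theta, weight in unnormalized.items()}


# Hypothetical example: theta is a success probability with three candidate values.
thetas = [0.2, 0.5, 0.8]
prior = {theta: 1.0 / len(thetas) for theta in thetas}  # uniform prior
successes, failures = 3, 1                              # made-up data
likelihood = {
    theta: theta ** successes * (1 - theta) ** failures for theta in thetas
}
post = posterior(prior, likelihood)

for theta in thetas:
    print(theta, round(post[theta], 4))
```

Dividing by the evidence only rescales the weights so that they sum to one, which is why the relationship is often written as a proportionality.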

State Space/Dynamic Linear Model
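State space models are also known as dynamic linear models (DLM). The idea behind state space models is that an observable $y_t$ is generated by an observation or measurement equation, i.e.,
$$Y_t = F_t' \theta_t + v_t \tag{19}$$
where $v_t \sim N(0, V_t)$, and is expressed in terms of an unobservable state vector $\theta_t$. The state $\theta_t$ is modeled dynamically through a system or transition equation as shown below.
$$\theta_t = G_t \theta_{t-1} + \omega_t \tag{20}$$
with $\omega_t \sim N(0, W_t)$, where the error terms $\omega_t$ and $v_t$ are mutually independent. Normality is usually assumed and a prior distribution is required to describe the initial state vector $\theta_0$. The general univariate state space model is represented by equations (19) and (20), where:
$Y_t$ ≡ the observation series at time t
$F_t$ ≡ a vector of known constants
$\theta_t$ ≡ the vector of model state parameters
$v_t$ ≡ a stochastic error term
$G_t$ ≡ a matrix of known coefficients that defines the systematic evolution of the state vector across time
$\omega_t$ ≡ a stochastic error term having a normal distribution with mean zero and variance $W_t$
The two stochastic series $\{v_t\}$ and $\{\omega_t\}$ are assumed to be independent. The dynamic linear model can be broken down further into the local level model, the local linear trend model and the basic structural model, as follows.
Local level model:
$$Y_t = \mu_t + \varepsilon_t \tag{21}$$
$$\mu_t = \mu_{t-1} + \xi_t \tag{22}$$
Local linear trend model:
$$Y_t = \mu_t + \varepsilon_t \tag{23}$$
$$\mu_t = \mu_{t-1} + \beta_{t-1} + \xi_t \tag{24}$$
$$\beta_t = \beta_{t-1} + \eta_t \tag{25}$$
Basic structural model:
$$Y_t = \mu_t + \gamma_t + \varepsilon_t \tag{26}$$
$$\mu_t = \mu_{t-1} + \beta_{t-1} + \xi_t \tag{27}$$
$$\beta_t = \beta_{t-1} + \eta_t, \qquad \eta_t \sim N(0, H_t) \tag{28}$$
$$\gamma_t = -\sum_{j=1}^{s-1} \gamma_{t-j} + k_t \tag{29}$$
Here $\mu_t$ is the level, $\beta_t$ the slope and $\gamma_t$ the seasonal component with period $s$, while $\varepsilon_t$, $\xi_t$, $\eta_t$ and $k_t$ are mutually independent zero-mean normal disturbances.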
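To show how the local linear trend model generates data, the sketch below simulates it in plain Python. It assumes the standard form of the model (observation = level + noise; level = previous level + previous slope + noise; slope = previous slope + noise) with constant standard deviations; the function name, parameter names and numerical values are hypothetical.

```python
import random


def simulate_local_linear_trend(n, level0, slope0, sd_obs, sd_level, sd_slope, seed=0):
    """Simulate n observations from a local linear trend model."""
    rng = random.Random(seed)
    level, slope = level0, slope0
    observations = []
    for _ in range(n):
        # Observation equation: the level is seen through measurement noise.
        observations.append(level + rng.gauss(0.0, sd_obs))
        # State equations: the level moves by the current slope, then the
        # slope itself takes a random-walk step.
        level = level + slope + rng.gauss(0.0, sd_level)
        slope = slope + rng.gauss(0.0, sd_slope)
    return observations


# Hypothetical settings, for illustration only.
path = simulate_local_linear_trend(
    n=10, level0=100.0, slope0=5.0, sd_obs=2.0, sd_level=1.0, sd_slope=0.5
)
print([round(v, 2) for v in path])
```

Setting all three standard deviations to zero reduces the model to a deterministic straight line, which is a convenient sanity check on the recursion.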

(i). The Kalman Filter
The Kalman filter is a recursive set of equations used to update the estimated parameters as new observations become available. Filtering updated our knowledge of the system each time a new observation $y_t$ was brought in. The idea of updating in the Kalman filter is related to the Bayesian approach; indeed, the theory behind the Kalman filter is Bayesian. The Kalman smoothing algorithm was used to obtain the best estimate of the state at any point in the sample. Let $Y_{t-1}$ be the vector of observations $(y_1, y_2, \ldots, y_{t-1})'$ for $t = 2, 3, \ldots$ and assume that the conditional distributions are $\theta_t \mid Y_{t-1} \sim N(a_t, P_t)$, $\theta_t \mid Y_t \sim N(a_{t|t}, P_{t|t})$ and $\theta_{t+1} \mid Y_t \sim N(a_{t+1}, P_{t+1})$, where $a_t$ and $P_t$ are known. Our objective is to calculate $a_{t|t}$, $P_{t|t}$, $a_{t+1}$ and $P_{t+1}$ when $y_t$ is brought in. We refer to $a_{t|t}$ as the filtered estimator of the state $\theta_t$ and to $a_{t+1}$ as the one-step-ahead predictor of $\theta_{t+1}$. An important part is played by the one-step-ahead prediction error $v_t$ of $y_t$:
$$v_t = y_t - a_t$$
According to Durbin and Koopman (2012a), if we let $Z_t = Var(v_t \mid Y_{t-1})$ and $K_t = P_t / Z_t$, then it can be shown that, for the local level model and for $t = 1, \ldots, n$:
$$Z_t = P_t + V_t$$
$$a_{t|t} = a_t + K_t v_t, \qquad P_{t|t} = P_t (1 - K_t)$$
$$a_{t+1} = a_t + K_t v_t, \qquad P_{t+1} = P_t (1 - K_t) + W_t \tag{33}$$
The relations shown in the above equations are referred to as the Kalman filter. Kalman filtering accumulates information about the time series as it moves forward through the list of $(a_t, P_t)$ elements, i.e. the mean and variance. The Kalman smoother moves backward through time, distributing information about later observations to successively earlier $(a_t, P_t)$ pairs.
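The filtering recursion can be sketched for the simplest case. The code below assumes a local level model (the observation is the state plus noise, and the state follows a random walk) with constant variances; `var_obs` and `var_state` are hypothetical names for the observation and state disturbance variances, and the data are made up.

```python
def kalman_filter_local_level(ys, a1, p1, var_obs, var_state):
    """Run the Kalman filter for a local level model.

    Returns the filtered (mean, variance) pairs and the final one-step-ahead
    prediction (mean, variance) of the state.
    """
    a, p = a1, p1                   # predicted state mean and variance
    filtered = []
    for y in ys:
        v = y - a                   # one-step-ahead prediction error
        z = p + var_obs             # variance of the prediction error
        k = p / z                   # Kalman gain
        a_filt = a + k * v          # filtered state mean
        p_filt = p * (1 - k)        # filtered state variance
        filtered.append((a_filt, p_filt))
        a = a_filt                  # prediction for the next period
        p = p_filt + var_state
    return filtered, (a, p)


# Hypothetical observations and variances, for illustration only.
observations = [10.2, 10.8, 11.5, 11.1, 12.3]
filtered, next_prediction = kalman_filter_local_level(
    observations, a1=10.0, p1=4.0, var_obs=1.0, var_state=0.25
)
for mean, variance in filtered:
    print(round(mean, 3), round(variance, 3))
```

The gain `k` lies between 0 and 1: a large prior variance relative to the observation noise pulls the filtered mean towards the new observation, while a small one keeps it close to the prediction.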

(ii). Updating Prior to Posterior and Forecasting
Model forecasts were derived from the prior information and the observation equation. The prior information on the state vector for time (t+1) was summarized as a normal distribution with mean $a_{t+1}$ and covariance $R_{t+1}$: $\theta_{t+1} \mid D_t \sim N[a_{t+1}, R_{t+1}]$, where $D_t$ denotes the state of knowledge at time t. From the prior information, forecasts were generated using the observation equation. The forecast quantity $Y_{t+1}$ is a linear combination of the normally distributed variables $\theta_{t+1} \mid D_t$ and $v_{t+1}$, and was therefore normally distributed with mean $f_{t+1}$ and variance $Q_{t+1}$: $Y_{t+1} \mid D_t \sim N[f_{t+1}, Q_{t+1}]$. Given the prior for time (t+1), the implied prior for time (t+2) from the same standpoint, with no additional information, is $p(\theta_{t+2} \mid D_t)$. This prior was obtained by applying the system equation. The likelihood, a function of the model parameters, is the conditional forecast distribution evaluated at the observed value and has the normal form. The prior information is combined with the likelihood to yield the posterior distribution.
Forecasting k steps ahead requires the prior information to be projected into the future through repeated application of the system equation.
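Repeated application of the system equation can be sketched for a local linear trend model, assuming the usual matrices F = (1, 0)' and G = [[1, 1], [0, 1]] and a diagonal state variance. The function name and argument layout are hypothetical, and the 2x2 matrix products are written out by hand to keep the sketch dependency-free.

```python
def forecast_local_linear_trend(level, slope, C, V, W_level, W_slope, steps):
    """Return [(f_k, Q_k)] for k = 1..steps from the current posterior.

    C = (var(level), cov(level, slope), var(slope)) is the posterior
    covariance; V, W_level and W_slope are assumed constant variances.
    """
    a_level, a_slope = level, slope
    r11, r12, r22 = C
    out = []
    for _ in range(steps):
        # System equation: a = G a and R = G R G' + W
        a_level, a_slope = a_level + a_slope, a_slope
        r11, r12, r22 = (r11 + 2 * r12 + r22 + W_level,
                         r12 + r22,
                         r22 + W_slope)
        # Observation equation: f = F'a and Q = F'RF + V
        out.append((a_level, r11 + V))
    return out
```

As a usage note, the forecasts reported in this study increase by a constant 126.752 per year, which is what a local linear trend forecast mean (level plus k times slope) implies. Calling the function with a level of 1590.942 and a slope of 126.752 (values back-calculated here from the reported forecasts, not reported by the authors) returns forecast means close to 1717.694, 1844.446 and 1971.198; the variances it returns depend entirely on the hypothetical covariance inputs.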

Results and Discussion
This section presents the results of the fitted ARIMA and state space models that were found to be appropriate models for predicting Kenya's real GDP per capita. The results of the two models were compared and the best model was used to perform a 3-step ahead forecast of the real GDP per capita.

Data
The data used in this study were obtained from the World Bank data portal and organized as shown in table 1 below.

Trend
A time plot of the observed and smoothed real GDP per capita was drawn to reveal the general trend. The time plot showed an exponentially increasing trend. A plot of $Y_t$ against $Y_{t-1}$ showed a strong correlation between $Y_t$ and $Y_{t-1}$.

Stationarity of the Data
The Augmented Dickey-Fuller (ADF) test was run at the 5% level of significance to check the stationarity of the observed and differenced time series. The null hypothesis for this test is that the data are non-stationary. The results of the ADF test are shown in table 2 below. The p-values obtained for the observed series, the first difference and the second difference were 0.99, 0.1025689 and 0.01 respectively. The p-values for the observed and first-difference series were greater than the significance level; we therefore failed to reject the null hypothesis and concluded that both the observed GDP per capita series and its first difference were non-stationary. The p-value for the second difference, however, was smaller than the level of significance; we therefore rejected the null hypothesis in favor of the alternative and concluded that the second-difference transformation made the series stationary.
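The differencing transformation applied before each ADF test can be sketched as below. The function name is hypothetical, and the ADF test itself is not implemented here; in practice it would come from a statistics package.

```python
def difference(series, order=1):
    """Apply the differencing operator (1 - B) to `series` `order` times."""
    out = list(series)
    for _ in range(order):
        # Each pass replaces the series by its successive changes
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out
```

Each pass shortens the series by one observation, so a second-differenced series of n annual values has n - 2 points. A series with a quadratic trend, such as 1, 4, 9, 16, 25, becomes constant after two passes, which illustrates why a strongly curved trend can need d = 2.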

Model Identification
The ACF and PACF plots of the observed and transformed series are shown in figure 3 below. The ACF plot of the second difference suggested that the transformed series followed an ARMA(1,1) process. Therefore, the appropriate model for fitting the GDP per capita was identified as ARIMA(1,2,1).

Model Fitting
The ARIMA(1,2,1) model identified in the section above has two parameters to be estimated, $\phi_1$ and $\beta_1$. These model parameters were estimated through the maximum likelihood approach and the estimates were found to be $\phi_1 = 0.328775$ and $\beta_1 = -0.8340786$. In the fitted ARIMA model for the real GDP per capita, the series $\varepsilon_t$ follows a white noise process with $\varepsilon_t \sim N(0, \sigma^2)$.
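For reference, a generic ARIMA(1,2,1) model can be written with the backshift operator $B$ as in the first line below, and expanded into a recursion in the second. The sign convention on the moving average term (a plus sign, as in many software packages) is an assumption here and may differ from the authors' convention. The third line substitutes the estimates $\phi_1 = 0.328775$ and $\beta_1 = -0.8340786$ reported in this study under that assumed convention.

```latex
(1 - \phi_1 B)(1 - B)^2 Y_t = (1 + \beta_1 B)\,\varepsilon_t

Y_t = (2 + \phi_1)\,Y_{t-1} - (1 + 2\phi_1)\,Y_{t-2} + \phi_1 Y_{t-3}
      + \varepsilon_t + \beta_1 \varepsilon_{t-1}

Y_t = 2.328775\,Y_{t-1} - 1.657550\,Y_{t-2} + 0.328775\,Y_{t-3}
      + \varepsilon_t - 0.8340786\,\varepsilon_{t-1}
```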

Diagnostic Checking
The residuals of the fitted model were examined to check whether they were independent and normally distributed. A plot of the standardized residuals, their ACF and the p-values of the Ljung-Box statistic was obtained.
The ACF values are within the confidence band except at lag 3, where there is a very weak serial correlation. The Ljung-Box p-values at different lags are all above the significance level, indicating that the residuals are independent. The Ljung-Box test was also run on the standardized residuals of the fitted model, with the null hypothesis that the residuals are independent. The p-value obtained from this test was 0.5402776, so we failed to reject the null hypothesis and concluded that the residuals were independent. To check whether the residuals were normally distributed, the Shapiro-Wilk test for normality was run, with the null hypothesis that the residuals are normally distributed. The p-value for this test was 0.1505323, so we failed to reject the null hypothesis and concluded that the residuals were normally distributed. A plot of the residuals, together with their ACF, histogram, density and quantile plots, is shown in figure 5. These plots, together with the Shapiro-Wilk and Ljung-Box tests, showed that the residuals of the fitted model are independent and normally distributed. We therefore concluded that the ARIMA(1,2,1) model with $\phi_1 = 0.328775$ and $\beta_1 = -0.8340786$ is a good fit for the real GDP per capita.
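The Ljung-Box statistic behind the independence test is built from the sample autocorrelations of the residuals. The sketch below computes both using the standard formulas; the function names are hypothetical, and the p-value (which requires a chi-square distribution function) is deliberately left out.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum over k of rho_k^2 / (n - k)."""
    n = len(x)
    rho = sample_acf(x, max_lag)
    return n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(rho, start=1))
```

A large Q relative to the chi-square reference distribution gives a small p-value and leads to rejecting independence; a small Q, as reported for the ARIMA(1,2,1) residuals, does not.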

Forecasting Real GDP per Capita
A 3-step ahead forecast of Kenya's real GDP per capita was performed using the fitted ARIMA(1,2,1) model. The point forecasts and their 95% confidence limits were obtained, together with the in-sample forecasts and the associated 95% confidence intervals.
Figure 6. Plot of observed, fitted and predicted values.

Introduction
In this section we focused on two forms of state space models, i.e. the local level model (LLM) and the local linear trend model (LLTM). The two models were fitted by use of the Kalman filter and the Kalman smoother. The best state space model was identified and the GDP per capita was forecasted using this model. The general representation of the state space model is:
Observation Equation: $Y_t = F_t' \theta_t + v_t, \quad v_t \sim N(0, V_t)$
System Equation: $\theta_t = G_t \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$
The parameters of this model were obtained by the maximum likelihood method and the variances were assumed to be unknown.

Local Level Model (LLM)
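A state space model is a local level model if the coefficients of $\theta_t$ in the observation equation and of $\theta_{t-1}$ in the system equation are both 1. The local level model is represented by the equations:
Observation Equation: $Y_t = \theta_t + v_t, \quad v_t \sim N(0, V_t)$
System Equation: $\theta_t = \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$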

Building the Local Level Model
The local level model was built by setting the order of the model to 1. The maximum likelihood estimates were obtained and the convergence component was zero, indicating that the algorithm converged successfully. The estimated parameters were used to fit a model that generated the filtered and smoothed estimates, with $F_t = 1$, $G_t = 1$, $V_t = 10^{-4}$ and $W_t = 4840.971764$. V and W are the variances of the random components in the observation and transition equations respectively. The BFGS optimization algorithm was used to obtain the maximum likelihood estimates and the estimated Hessian matrix at the optimum. The asymptotic variance matrix of the maximum likelihood estimators is given by the inverse of the Hessian matrix of the negative log likelihood function. The predicted states, the filtered and smoothed estimates of the state vector, and the corresponding variances were also obtained.
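The quantity a numerical optimizer such as BFGS would minimize here is the negative Gaussian log likelihood, which can be accumulated from the Kalman filter's one-step prediction errors. The sketch below shows this for the local level model; it is an illustration under assumed constant variances, with a hypothetical function name, and is not the routine used in the study.

```python
import math


def local_level_neg_loglik(y, V, W, a1, P1):
    """Negative log likelihood of a local level model via prediction errors."""
    a, P = a1, P1
    nll = 0.0
    for obs in y:
        v = obs - a                      # one-step prediction error
        Z = P + V                        # prediction error variance
        nll += 0.5 * (math.log(2 * math.pi * Z) + v * v / Z)
        K = P / Z                        # Kalman gain
        a = a + K * v                    # next one-step predictor mean
        P = P * (1 - K) + W              # next one-step predictor variance
    return nll
```

Minimizing this function over V and W gives the maximum likelihood estimates, and the inverse of its Hessian at the minimum approximates their asymptotic variance matrix. Variances are often optimized on a log scale so that they stay positive.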

Diagnostic Checking
The residuals of the fitted model were examined to check whether they were independent and normally distributed. Several diagnostic plots are shown in figure 7 below. The Ljung-Box test for independence and the Shapiro-Wilk test for normality were performed and the results are shown in table 5 below. The p-value obtained from the Shapiro-Wilk test was 0.2199502. Based on this p-value we failed to reject the null hypothesis and concluded that the residuals were normally distributed. The p-value obtained from the Ljung-Box test was 0.0013537. Based on this p-value we rejected the null hypothesis and concluded that the residuals were not independent but serially correlated. Since the residuals were correlated, we concluded that the local level model is not a good fit for the real GDP per capita. We therefore considered the local linear trend model, which is of order 2.

Local Linear Trend Model
A local linear trend model takes the form:
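In the state space notation used in this section, the local linear trend model is commonly written as follows. The symbols $\mu_t$ (level), $\beta_t$ (slope), $\omega_{1,t}$ and $\omega_{2,t}$ are assumptions chosen here for illustration and may differ from the authors' notation.

```latex
Y_t = \mu_t + v_t, \qquad v_t \sim N(0, V)

\mu_t = \mu_{t-1} + \beta_{t-1} + \omega_{1,t}, \qquad \omega_{1,t} \sim N(0, W_1)

\beta_t = \beta_{t-1} + \omega_{2,t}, \qquad \omega_{2,t} \sim N(0, W_2)

\theta_t = \begin{pmatrix} \mu_t \\ \beta_t \end{pmatrix}, \quad
F = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
G = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad
W = \begin{pmatrix} W_1 & 0 \\ 0 & W_2 \end{pmatrix}
```

Setting the slope and its variance to zero recovers the local level model, which is why the local linear trend model is described as the order 2 extension.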

Building the Local Linear Trend Model
The local linear trend model was fitted using the same steps and functions used to fit the local level model, except that the order was set to 2 in the dlmModPoly function, which adds a trend component. The maximum likelihood estimates for the parameters were obtained, where $V_t$ is the observational error variance (V) and $W_t$ is the transitional error variance (W).

Diagnostic Checking
The results of the Ljung-Box and Shapiro-Wilk tests are as follows. The p-value obtained from the Shapiro-Wilk normality test was 0.1436483. We failed to reject the null hypothesis and concluded that the residuals are normally distributed. The p-value obtained from the Ljung-Box test was 0.7146218. We failed to reject the null hypothesis and concluded that the residuals are independent. The results of the Ljung-Box and Shapiro-Wilk tests showed that the local linear trend model is a good fit for the real GDP per capita.

Forecasting
Both in-sample and out-of-sample forecasting were performed using the fitted local linear trend model. A plot of the observed, fitted (filtered), smoothed and forecasted values is shown in figure 8 below. The 3-step ahead out-of-sample forecast using the local linear trend model is shown in table 8 below. The in-sample forecast (filtered estimates), the smoothed estimates and the associated 95% confidence intervals are shown in table 9 below.

Performance of the Fitted ARIMA and State Space Model
The ARIMA(1,2,1) model was identified as the best ARIMA fit for the real GDP per capita, while the local linear trend model (LLTM) was identified as the most appropriate state space model. The mean error (ME), standard error (SE), mean absolute error (MAE), mean percentage error (MPE), mean absolute percentage error (MAPE), mean absolute scaled error (MASE), AIC, BIC and the log likelihood were used to compare the accuracy and performance of the two models. The table below summarizes the performance of the two models. The rule of thumb is to select the model that maximizes the likelihood, minimizes the error measures as well as the AIC and BIC, and has the R-squared closest to one. In this case, the local linear trend model was identified as the most appropriate model for fitting and forecasting the real GDP per capita for Kenya.
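Several of the comparison measures listed above can be sketched as follows, using their standard definitions. The function names are hypothetical, MASE is scaled by the in-sample one-step naive forecast error (one common convention, assumed here), and the parameter count `k` passed to the information criteria is left to the user.

```python
import math


def accuracy(actual, fitted):
    """ME, MAE, MPE, MAPE and MASE for paired actual and fitted values."""
    errors = [a - f for a, f in zip(actual, fitted)]
    n = len(errors)
    pct = [100 * e / a for e, a in zip(errors, actual)]
    mae = sum(abs(e) for e in errors) / n
    # Scale: mean absolute error of the naive forecast y_hat_t = y_{t-1}
    naive_mae = sum(abs(b - a) for a, b in zip(actual, actual[1:])) / (n - 1)
    return {
        "ME": sum(errors) / n,
        "MAE": mae,
        "MPE": sum(pct) / n,
        "MAPE": sum(abs(p) for p in pct) / n,
        "MASE": mae / naive_mae,
    }


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik
```

A MASE below 1 means the model's in-sample errors are smaller on average than those of the naive forecast, which makes it a convenient scale-free check alongside AIC and BIC.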

Summary of the Forecasts
The predicted values using state space and ARIMA models are as shown in table 11 below. The Local linear trend model was identified as the best fit and a 3-step ahead forecast of the real GDP per capita is as in table 12 below. A plot of the observed, predicted and smoothed estimates is as shown in figure 9 below.

Summary
Expectations about future GDP per capita can be the primary determinant of investments, employment, wages, profits and stock market activities. The ARIMA model uses the frequentist approach in forecasting the future values of a time series, while state space models use the Bayesian approach. This study used time series data from the World Bank for the period 1980-2017 to compare the performance of the ARIMA and state space models. The results showed that the ARIMA(1,2,1) and the local linear trend models are appropriate models for forecasting Kenya's real GDP per capita. The accuracy of the two models was compared and the local linear trend model (a form of state space model) was found to perform better than the ARIMA model because it had a larger log likelihood, smaller MAE, SE, MASE, AIC and BIC, and an R-squared closer to 1. The findings of this study are consistent with those of [10,12,14,27,28], who concluded that Bayesian models are alive and well and are appropriate for prediction, especially over a short horizon.

Conclusion
State space models outperformed ARIMA models in their forecasting ability in this study. This indicates that the Bayesian approach is superior to the frequentist approach in time series forecasting. An advantage of the Bayesian approach is that the model parameters are updated whenever a new observation is brought in. These models are therefore appropriate for generating future values of a macroeconomic time series.

Recommendations
The results of this study showed that state space models, which are a class of Bayesian models, outperform the autoregressive integrated moving average models, which employ the frequentist approach, in time series forecasting. We therefore recommend the use of state space models in forecasting the future values of a macroeconomic time series.

Suggestions for Further Research
There are many models that can be used to forecast future values of a time series. We suggest further studies to determine whether there are other models that can outperform the state space models in their predictive ability. More research can be conducted to establish ways that can ensure a sustained increase in real GDP per capita especially in developing countries like Kenya.