Modelling and Forecasting Inflation Rate in Kenya Using SARIMA and Holt-Winters Triple Exponential Smoothing

In this paper, two forecasting models are used: the Box-Jenkins procedure employing SARIMA, and Holt-Winters triple exponential smoothing. Published Consumer Price Index (CPI) data from the Kenya National Bureau of Statistics (KNBS) for the period November 2011 to October 2016 were used. We compare the forecasted values of the two models and choose the best model based on the lowest mean absolute scaled error (MASE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The three-step Box-Jenkins model-building procedure was employed first, followed by Holt-Winters triple exponential smoothing. The study found that the SARIMA model was better than Holt-Winters triple exponential smoothing according to the MASE, MAE and MAPE results.


Introduction and Motivation
Inflation is the rate at which the general level of prices for goods and services in an economy rises, consequently causing the purchasing power of the currency to fall. In Kenya, the Consumer Price Index is used to calculate inflation rates on a yearly basis, even though the data are collected and computed monthly by the KNBS. KNBS defines the Consumer Price Index (CPI) as a measure of the weighted aggregate change in retail prices paid by consumers for a given basket of goods and services. Price changes are measured by re-pricing the same basket of goods and services at regular intervals and comparing aggregate costs with the costs of the same basket in a selected base period. Price data for constructing the indices are collected by the Kenya National Bureau of Statistics through a survey of retail prices for consumption goods and services. The percentage change of the CPI over a one-year period is what is usually referred to as inflation.
In 2016 the rate has been almost stable, with the highest value in January at 7.78% and the lowest in May at 5.01%. In November 2016, inflation was 6.68%, the highest since February 2016 (7.08%). According to Otu, A. O. et al [1], the effect of inflation is considered a crucial issue for a country. Inflation problems can make living conditions in a country much harder for many people.
People living on fixed incomes suffer most when the prices of commodities rise, since they cannot buy as much as they could previously. Monetary policy consists of decisions and actions taken by the Central Bank of Kenya (CBK) to ensure that the supply of money in the economy is consistent with the growth and price objectives set by the government. The objective of monetary policy is to maintain price stability in the economy. Price stability refers to the maintenance of low and stable inflation.
Inflation is caused by many factors, including microeconomic factors and even natural factors like rain and drought. According to CBK [16], periods of drought or excessive rain can cause the prices of food to increase, leading to an increase in the inflation rate. International factors like increases or decreases in oil prices can also lead to changes in inflation, reflecting movements in energy and transport costs. Depreciation of the exchange rate against the major currencies can also cause inflation, since Kenya is a net importer of goods. Inflation can also be caused by factors that influence the demand for goods and services, like the amount of money ordinary people have available to spend.
The Central Bank's monetary policy decisions are made to maintain a low and stable inflation rate over time, which is an indication of price stability. Inflation is a general increase in price levels over time. It is based on the prices of various consumer goods and services, which are evaluated and statistically represented in the Consumer Price Index (CPI). The year-on-year inflation rate is determined by comparing the CPI for a particular month with the CPI of the same month in the previous year.
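The year-on-year comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the KNBS computation: the function name `year_on_year_inflation` and the index values in `toy_cpi` are hypothetical and are not taken from the data used in this paper.

```python
def year_on_year_inflation(cpi, periods_per_year=12):
    """Return the year-on-year inflation rate (%) for every month that
    has an observation exactly one year earlier in the series."""
    rates = []
    for t in range(periods_per_year, len(cpi)):
        # Percentage change of the index relative to the same month last year.
        rates.append(100.0 * (cpi[t] / cpi[t - periods_per_year] - 1.0))
    return rates


# Hypothetical index values (NOT KNBS data): 13 consecutive monthly readings.
toy_cpi = [100.0 + 0.5 * i for i in range(13)]
toy_rates = year_on_year_inflation(toy_cpi)
print(toy_rates)  # a single rate: month 13 compared with month 1
```

With 13 monthly readings only one rate can be computed, since each rate needs the index value from twelve months earlier.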
Inflation is a key factor in helping people make sound decisions on financial matters. Deflation, on the other hand, can affect the economy by reducing the profitability of companies and hampering investor confidence. Deflation occurs when there is a general decrease in prices over time due to a collapse in demand or an increased supply of goods and services. Hence inflation rates need to be stable, so that price stability supports a more stable economy and drives economic growth.
In Kenya, inflation is controlled through monetary policy, whose objective is to spur economic growth. Monetary policy decisions are made by the Monetary Policy Committee (MPC). The MPC meets at least once every two months and reviews data and analysis from various sources, including the Central Bank departments, enabling it to decide on any action to maintain or vary its stance.

Justification of the Study
Not only have many forecasting methods been developed in the past, but equally many methods of measuring forecast accuracy. Hyndman R. J. and Koehler A. B. [2], for example, studied and compared measures of accuracy and settled on the mean absolute scaled error (MASE) as the best measure of forecast accuracy.
Time series researchers have also compared several forecasting models with a view to determining which is the better model, depending on the nature of the data and the industry. For example, Virginia Gathingi [3] modelled inflation in Kenya using ARIMA and VAR. Similarly, Ingabire J. and Mung'atu J. K. [4] compared ARIMA and VAR models in forecasting the inflation rate in Rwanda. In the IMF Working Paper series, Tim, C. and Dongkoo, C. [5] studied modelling and forecasting inflation in India using a bivariate VAR. They found that broad money supply, the exchange rate and import prices are relevant indicators that affect inflation, especially in the manufacturing sector in India.
Other studies that concentrated on inflation include Otu et al [1], who discussed the application of SARIMA models in modelling and forecasting Nigeria's inflation rates, while Uwilingiyimana C., Mungatu J. and Harerimana J. [6] conducted a study on forecasting inflation in Kenya using two models, ARIMA (1, 1, 12) and GARCH (1, 2), and a combination of the two models, ARIMA (
The rest of the study is structured as follows: Section 2 highlights the empirical literature, whereas Section 3 presents the methodology. Section 4 reports the results of the empirical analysis, and Section 5 discusses and concludes the study.

Research Objective
This paper aims to predict the value of the CPI in Kenya twelve months ahead and to establish the best method for forecasting inflation, in order to support proper economic and financial management.

Scope of the Study
This research uses secondary data, based on facts and figures collected and published by the Kenya National Bureau of Statistics (KNBS) on its website, to forecast the CPI, which is the measure of inflation in Kenya. Data from November 2011 to October 2016 was used.

Literature Review
Otu et al [1] discussed the application of SARIMA models in modelling and forecasting Nigeria's inflation rates. They employed the Box-Jenkins approach to build an Autoregressive Integrated Moving Average (ARIMA) model of monthly inflation rates for the period November 2003 to October 2013, a total of 120 data points. They found that the seasonal ARIMA $(1,1,1)(0,0,1)_{12}$ was the best model to forecast Nigeria's inflation rate.
Gathingi, V. [3] modelled inflation in Kenya using ARIMA and VAR models with data from January 2005 to June 2013. When the author compared the two models, VAR was better than ARIMA (1, 1, 0) because of its smaller RMSE, MAE and MAPE. However, she concluded that, although ARIMA uses only univariate historical consumer price data to model inflation, the ARIMA (1, 1, 0) model emerged as the best model, showing strong evidence of substantial inflation inertia even with the exclusion of independent variables.
Ingabire J. and Mung'atu J. K. [4] compared ARIMA and VAR models in forecasting the inflation rate in Rwanda. After carrying out all the necessary diagnostic checks, the study indicated that the ARIMA (3, 1, 4) model was better than the VAR model in predicting inflation in Rwanda. However, they further stated that the ARIMA model may be efficient only for short-term forecasting.
Jere, S. and Siyanga, M. [7] studied forecasting of the inflation rate in Zambia using Holt's exponential smoothing. In that paper, exponential smoothing uses a linear combination of the previous values of the data to model and predict future values. Even though both models gave almost similar results, the ARIMA (12, 1, 0) model was more adequate than Holt's exponential smoothing. They further concluded that Holt's exponential smoothing is as good as an ARIMA model, given the small differences in MAPE and RMSE, and that Holt's exponential smoothing is less complicated, since it does not require specialised software to implement, as is the case for ARIMA models.
Hamidreza M. and Leila S. [8] examined a seasonal long memory process, denoted SARFIMA, to study and predict Iran's oil supply. They fitted both SARIMA and SARFIMA models and estimated the parameters using the CSS method. The results of their analysis indicated that the best model was SARFIMA $(0,1,1)(0,0.199,0)_{12}$, which was used to predict the quantity of oil supply in Iran.
Puthran et al [9] studied and forecasted the Indian motorcycle industry by comparing SARIMA and Holt-Winters models. They found that even though both models were good, the Holt-Winters method was better than the SARIMA model because it had lower MSE, MAE and MAPE values.
Udom P. and Phumchusri N. [10] compared the application of three forecasting methods to the sales volume of a plastics distributor in Thailand for the period between January 2004 and December 2012. They compared the ARIMA method, the moving average method and the Holt-Winters exponential smoothing method. They employed five raw-material data sets from the distributor, and their results showed that the ARIMA model was better than the other methods when compared using the mean absolute percentage error (MAPE).
Hyndman R. J. and Koehler A. B. [2], in their paper titled "Another look at measures of forecast accuracy", analysed and compared measures of accuracy for univariate time series forecasts. They analysed scale-dependent measures, measures based on percentage errors, measures based on relative errors, relative measures, and scaled errors. They proposed that scaled errors, through the mean absolute scaled error (MASE), become the standard measure of forecast accuracy. They stated, however, that depending on the type of data, the mean absolute error (MAE) and the mean absolute percentage error (MAPE) could also be used. They argued that MAE should be used if the data series are on the same scale, whereas MAPE should be used when the data are all positive and much greater than zero, since it is very simple to use and interpret. In this paper we employ all three measures to test model accuracy.
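The three accuracy measures can be sketched as follows. This is a minimal illustration under stated assumptions rather than the computation used in this paper: the function names and the toy numbers are hypothetical, and MASE is scaled by the in-sample mean absolute error of the naive forecast (lag `m = 1`; a seasonal naive forecast would use `m = 12` for monthly data).

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error (%); requires non-zero actual values."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, training, m=1):
    """Mean absolute scaled error: out-of-sample MAE divided by the
    in-sample MAE of the naive forecast that repeats the value m steps back."""
    scale = sum(abs(training[t] - training[t - m])
                for t in range(m, len(training))) / (len(training) - m)
    return mae(actual, forecast) / scale


# Hypothetical numbers, for illustration only.
training = [100.0, 102.0, 104.0, 106.0]
actual = [108.0, 110.0]
forecast = [107.0, 112.0]

mae_value = mae(actual, forecast)
mape_value = mape(actual, forecast)
mase_value = mase(actual, forecast, training)
print(mae_value, mape_value, mase_value)
```

A MASE below 1 means the forecast errors are, on average, smaller than the in-sample errors of the naive method.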

The Box-Jenkins Procedure
The Box-Jenkins procedure is concerned with fitting an ARIMA model to data. It has four parts: identification, estimation, verification or diagnostic checking and forecasting.

Model Identification
Normally we assume that the data series is stationary. The initial step in model identification is to estimate the sample autocorrelation function (ACF) and partial autocorrelation function (PACF) and to compare them with the theoretical ACF and PACF of candidate models.
Non-stationarity can also be detected from an autocorrelation plot. Specifically, it is often indicated by an autocorrelation plot with very slow decay.
Box and Jenkins recommend the differencing approach to achieve stationarity. In our data the series was not stationary, and we therefore had to difference it once to make it stationary. This was mainly done to straighten out trends and to reduce heteroscedasticity (that is, to produce approximately uniform variability in the series over the sample range). If, on plotting, the data is not stationary in variance, the series has to be transformed with the objective of making it stationary in both mean and variance.
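As an illustration of the transformation and differencing step, the sketch below takes logarithms (one common variance-stabilising transformation, assumed here) and then first differences. The function name `log_diff` and the toy series are hypothetical.

```python
import math


def log_diff(series):
    """Return the first differences of the natural log of a positive series."""
    logs = [math.log(x) for x in series]
    # Differencing removes a trend in the mean; the log damps growing variance.
    return [logs[t] - logs[t - 1] for t in range(1, len(logs))]


# Hypothetical series growing by 10% per period (not CPI data).
toy_series = [100.0, 110.0, 121.0]
toy_log_diffs = log_diff(toy_series)
print(toy_log_diffs)  # each value is close to log(1.1)
```

For a series that grows at a constant percentage rate, the log differences are constant, which is why this transformation removes an exponential trend.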

Model Estimation
Estimating the parameters for Box-Jenkins models involves numerically approximating the solutions of nonlinear equations. In our case we employ R software to estimate our model using maximum likelihood estimation.

Diagnostic Checks
Once the model has been identified and the parameters estimated, diagnostic checks are applied to the fitted model, Box G. E. P. et al [11]. In this paper we use residual-based methods in our diagnostic checks.

Forecasting
We evaluate forecasts using both subjective and objective means. The subjective examination looks for large errors and/or failures to detect turning points. The analyst may be able to explain such problems by unusual, unforeseen or unprovided-for events. Great care should be taken to avoid explaining too many of the errors by strikes etc. In an objective evaluation of a forecast we may use various standard measures. If $x_i$ is the actual datum for period $i$ and $f_i$ is the forecast, then the error is defined as $e_i = x_i - f_i$.
In this study, the selected SARIMA model was used to forecast the mean monthly CPI for the period November 2016 to October 2018, using the observed data for the period November 2011 to October 2016.

General ARIMA Process
The autoregressive moving average process, ARMA(p, q), is defined by
$$X_t - \sum_{r=1}^{p} \phi_r X_{t-r} = \sum_{s=0}^{q} \theta_s \varepsilon_{t-s},$$
where $\varepsilon_t$ is white noise. This process is stationary for appropriate $\phi$, $\theta$.
Consider two models in which $X_t$ is unobserved, $Y_t$ is observed, and $\phi_t$ and $\eta_t$ are independent white noise sequences. Note that $X_t$ is AR(1). In our case, if the CPI process, denoted by $Y_t$, is not stationary, we look at the first-order difference process $\nabla Y_t = Y_t - Y_{t-1}$, or at the second-order differences $\nabla^2 Y_t$. This can continue until we find a difference with the expected outcome.
If we find that the differenced process is stationary, we can look for an ARMA model of it. The process $Y_t$ is said to be an autoregressive integrated moving average process, ARIMA(p, d, q), if $X_t = \nabla^d Y_t$ is an ARMA(p, q) process. The general ARIMA(p, d, q) is denoted by
$$\phi(B)\,(1 - B)^d\,Y_t = \theta(B)\,\varepsilon_t,$$
where $B$ is the backshift operator, $B Y_t = Y_{t-1}$, $\phi(B) = 1 - \sum_{r=1}^{p} \phi_r B^r$ and $\theta(B) = \sum_{s=0}^{q} \theta_s B^s$.

The SARIMA Process
Chatfield, C. [12] states that if the series is seasonal, with s time periods per year, then a seasonal ARIMA (abbreviated SARIMA) model may be obtained as a generalization of an ARIMA model. Let $B^s$ denote the operator such that
$$B^s X_t = X_{t-s}.$$
Then seasonal differencing is written as
$$\nabla_s X_t = (1 - B^s) X_t = X_t - X_{t-s}.$$
Given that our data is monthly, with 12 months per year (s = 12), the seasonal difference is
$$\nabla_{12} X_t = (1 - B^{12}) X_t = X_t - X_{t-12}.$$
The main objective of seasonal differencing is to remove seasonal trends and seasonal random walks from the data. Chatfield C. [12] further explains that a seasonal autoregressive term, for example, is one in which $X_t$ depends linearly on $X_{t-s}$. A SARIMA model with non-seasonal terms of order (p, d, q) and seasonal terms of order (P, D, Q) is abbreviated as a SARIMA$(p, d, q)(P, D, Q)_s$ model and may be written
$$\phi(B)\,\Phi(B^s)\,W_t = \theta(B)\,\Theta(B^s)\,Z_t, \qquad W_t = \nabla^d \nabla_s^D X_t,$$
where $\Phi$ and $\Theta$ denote polynomials in $B^s$ of order P and Q respectively. One model that is particularly useful for seasonal data is the SARIMA model of order $(0,1,1)(0,1,1)_{12}$.
Shumway R. H. and Stoffer D. S. [13] confirm that the multiplicative seasonal autoregressive integrated moving average model, or SARIMA model, of Box and Jenkins (1970) is given by
$$\Phi_P(B^s)\,\phi(B)\,\nabla_s^D \nabla^d X_t = \Theta_Q(B^s)\,\theta(B)\,Z_t,$$
where $Z_t$ is the usual Gaussian white noise process. Since our CPI data is monthly, s = 12, and the equation becomes
$$\Phi_P(B^{12})\,\phi(B)\,\nabla_{12}^D \nabla^d X_t = \Theta_Q(B^{12})\,\theta(B)\,Z_t.$$
Under this method we employ monthly CPI data with 12 seasons per year (s = 12), where the seasonal first-order AR(1) term uses $X_{t-12}$ to predict $X_t$, while the seasonal first-order MA(1) term uses $Z_{t-12}$ as its predictor.
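The seasonal and ordinary differencing operators can be illustrated with a short sketch. The helper `difference` and the toy series (a linear trend plus a repeating 12-month pattern) are hypothetical and are not the CPI data.

```python
def difference(x, lag=1):
    """Apply the operator (1 - B^lag): x_t minus x_{t-lag}."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Hypothetical monthly series: linear trend plus a fixed 12-month pattern.
pattern = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
toy = [0.5 * t + pattern[t % 12] for t in range(36)]

seasonal = difference(toy, lag=12)   # removes the repeating pattern
both = difference(seasonal, lag=1)   # then removes what is left of the trend
print(seasonal[:3], both[:3])
```

Here the seasonal difference leaves only the constant yearly increase of the trend, and a further first difference reduces the series to zeros.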

Holt-Winters Triple Exponential Smoothing
Triple exponential smoothing takes into account seasonal changes as well as trends. It is an extension of Holt's method to capture seasonality. The Holt-Winters seasonal method comprises the forecast equation and three smoothing equations: one for the level $\ell_t$, one for the trend $b_t$, and one for the seasonal component $s_t$, with smoothing parameters $\alpha$, $\beta^*$ and $\gamma$. The Holt-Winters exponential smoothing model is used for data that exhibit both trend and seasonality. Seasonality is the tendency of time series data to exhibit behaviour that repeats itself every $m$ periods. The term season is used to represent the period of time before the behaviour begins to repeat itself. The Holt-Winters method has two versions, additive and multiplicative, the use of which depends on the characteristics of the particular time series.

Holt-Winters Multiplicative Method
In the multiplicative method, α, β and γ are smoothing constants that must be estimated in such a way that the MASE of the forecast errors is minimized.
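For reference, a standard textbook statement of the multiplicative Holt-Winters equations is sketched below. It is given as an assumption about the intended form, not as the exact equations of the original manuscript. Here $m$ is the number of periods in a season ($m = 12$ for monthly data), $h$ is the forecast horizon, $k$ is the integer part of $(h-1)/m$, and $\ell_t$, $b_t$ and $s_t$ are the level, trend and seasonal components.

```latex
\begin{align}
\hat{y}_{t+h|t} &= (\ell_t + h\,b_t)\; s_{t+h-m(k+1)}, \\
\ell_t &= \alpha\,\frac{y_t}{s_{t-m}} + (1-\alpha)\,(\ell_{t-1} + b_{t-1}), \\
b_t &= \beta^{*}\,(\ell_t - \ell_{t-1}) + (1-\beta^{*})\,b_{t-1}, \\
s_t &= \gamma\,\frac{y_t}{\ell_{t-1} + b_{t-1}} + (1-\gamma)\,s_{t-m}.
\end{align}
```

In this form the seasonal component acts as a multiplicative index around the trend, which suits series whose seasonal swings grow with the level.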

Data Analysis
Our analysis uses two methods: Box-Jenkins, which is sometimes referred to as the ARIMA model, and Holt-Winters triple exponential smoothing. First, the monthly CPI data (from November 2011 to October 2016) was plotted to observe the presence of trend and to check for stationarity. The data collected is shown in Table 1 below.

Model Identification
From Fig. 1(a), we can clearly see that the data has a trend, since it moves steadily upward. Our data exhibited trend but no seasonality, and hence no seasonal difference was taken. Instead we transformed the data by taking logs and first differences and re-evaluated the trend; Fig. 1(b) clearly indicates no presence of trend. We confirmed stationarity with the Augmented Dickey-Fuller (ADF) test, whose results indicated stationarity, as shown below.
The hypotheses of the Augmented Dickey-Fuller t-test are:
$H_0$: The data needs to be differenced to make it stationary.
$H_1$: The data is stationary and does not need to be differenced.
Test regression trend
Call: lm(formula = z.diff ~ z.lag.1 + 1 + tt)
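The regression call quoted above corresponds to the trend version of the ADF test regression. A hedged sketch of its usual form follows; the symbols $\mu$, $\beta$, $\gamma$ and $\delta_j$ are notation assumed here and do not appear in the original output. The printed call contains no lagged difference terms, which corresponds to $k = 0$.

```latex
\begin{equation}
\Delta z_t = \mu + \beta t + \gamma z_{t-1} + \sum_{j=1}^{k} \delta_j\, \Delta z_{t-j} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \quad \text{versus} \quad H_1:\ \gamma < 0 .
\end{equation}
```

Rejecting $H_0$ (a unit root) supports treating the transformed series as stationary.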

Model Estimation and Selection
Fig. 2 shows the ACF and PACF. Seasonal spikes are observed in the ACF and PACF after lag 1 (at lags 11, 12 and 16), which reflects that no seasonal difference of the series was taken. This also indicates a seasonal model with SAR(1) and SMA(0). Therefore (1, 0, 0) forms the (P, D, Q) part. For the non-seasonal part, the PACF shows a spike at lag 1 and no further spikes until lags 11, 12, 15 and 16, after which it cuts off. This indicates AR(1), and since we differenced our data once, the (p, d, q) part is (1, 1, 0). From this estimation we see that D = 0, implying that the data does not have seasonality and hence there is no need for seasonal differencing. The appropriate SARIMA model for forecasting the monthly CPI, and hence inflation, is therefore $(1,1,0)(1,0,0)_{12}$. The ACF and PACF plots used to identify potential AR and MA terms are shown below.
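Written out in operator form, the selected model can be sketched as follows. The notation is assumed for illustration: $Y_t$ denotes the (log-transformed) CPI series, $\phi_1$ the non-seasonal AR coefficient, $\Phi_1$ the seasonal AR coefficient and $Z_t$ white noise; none of the estimated coefficient values are implied.

```latex
\begin{align}
(1 - \phi_1 B)\,(1 - \Phi_1 B^{12})\,(1 - B)\,Y_t &= Z_t, \\
W_t &= Y_t - Y_{t-1}, \\
W_t &= \phi_1 W_{t-1} + \Phi_1 W_{t-12} - \phi_1 \Phi_1 W_{t-13} + Z_t .
\end{align}
```

The last line shows that the differenced series depends on its values one, twelve and thirteen months earlier, the thirteen-month term arising from the product of the two AR factors.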

Diagnostics Checks
After identification and estimation of the model, diagnostic checks were carried out using the residuals of the model. Figure 4 shows the standardized residuals, the normal Q-Q plot of the standardized residuals, the ACF plot of the residuals and the p-values for the Ljung-Box statistic. From Figure 4, the standardized residuals indicate that there is no trend in the residuals and no change in variance over time. The ACF of the residuals shows no significant autocorrelations, hence the estimated model is good, and the normal Q-Q plot of the standardized residuals portrays a normal distribution, consistent with an independent and identically distributed sequence with a mean of zero and a constant variance. We confirmed the plot results with the Shapiro-Wilk test for normality, where the null hypothesis is that the residuals are normally distributed. Our p-value was 0.1019, which is greater than 0.05, hence we fail to reject the null hypothesis and conclude that the residuals are normally distributed. In the plot of p-values for the Ljung-Box-Pierce statistic, each statistic considers the accumulated residual autocorrelation from lag 1 up to and including the lag on the horizontal axis. The dashed blue line is at 0.05. All p-values are above it, so there is no evidence of remaining autocorrelation. The p-value of the Ljung-Box-Pierce statistic at lag 24 is 0.5168, which confirms that the statistic is non-significant for the residuals. The diagnostic results confirm that we cannot reject the null hypothesis of independence of the residuals. Hence we conclude that our SARIMA $(1,1,0)(1,0,0)_{12}$ model is suitable for forecasting inflation.
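The accumulated-autocorrelation statistic behind the Ljung-Box plot can be sketched as below. This is a minimal illustration with hypothetical function names and toy residuals; it computes only the statistic $Q = n(n+2)\sum_{k=1}^{K} r_k^2/(n-k)$ and not the p-value, which additionally requires the chi-square distribution function.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic accumulated over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


# Hypothetical, strongly alternating residuals (not the model's residuals).
toy_residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q_lag1 = ljung_box_q(toy_residuals, max_lag=1)
print(q_lag1)
```

Large values of Q relative to the chi-square reference distribution indicate remaining autocorrelation; the alternating toy residuals give a large Q even at lag 1, unlike the residuals reported above.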

Forecasting
Forecasted values for January, February and March 2017 are 176.0305, 176.5741 and 177.3643 respectively.

Model Estimation and Selection
Hyndman et al [14] coined the smoothing triplet (E, T, S), whose three components are error, trend and seasonality. In this paper we employed ETS in R for automatic selection of our model, in a similar manner to Hyndman et al [15] in their application of the automatic forecasting strategy to the M-competition data and IJF-M3 competition data. The selected model was then tested for accuracy based on errors. Figure 5 below indicates that the selected model is ETS (A, A, N). This means that our model has additive errors, additive trend and no seasonality, which corresponds to Holt's linear method with additive errors. This confirms the lack of seasonality, in agreement with our SARIMA model results.

Determination of Smooth Parameters
From the results above we see that $\alpha = 0.9999$, which is very close to 1. This indicates that more weight is placed on the most recent observation and less on earlier data. Our $\beta = 0.0001$, which is very close to zero. This means that the trend estimate places little weight on recent data, so the slope changes very slowly. There is no $\gamma$, which confirms the absence of a seasonal component in our model.
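The effect of these parameter values can be illustrated with a small sketch of Holt's linear method. It is written in the component form and treats the reported β as the trend smoothing weight, which is an assumption (with α this close to 1 the alternative parameterisations nearly coincide). The function names, starting values and toy observations are hypothetical.

```python
def holt_linear(y, alpha, beta, level0, trend0):
    """Run Holt's linear (additive trend, no seasonality) recursions
    and return the final level and trend."""
    level, trend = level0, trend0
    for obs in y:
        prev_level = level
        # Level: weighted average of the new observation and the previous forecast.
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Trend: weighted average of the latest level change and the old trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend


def holt_forecast(level, trend, h):
    """h-step-ahead forecast: last level plus h times the last trend."""
    return level + h * trend


# Hypothetical observations and starting values (not the CPI data).
final_level, final_trend = holt_linear(
    [10.0, 12.0, 11.0, 15.0], alpha=0.9999, beta=0.0001, level0=10.0, trend0=1.0
)
print(final_level, final_trend, holt_forecast(final_level, final_trend, 1))
```

With α near 1 the level ends almost exactly at the last observation, and with β near 0 the trend stays almost at its starting value, so forecasts are essentially the last observation plus a fixed slope per month.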

Diagnostic Check
Forecasted values for January, February and March 2017 are 177.4458, 178.2016 and 178.9574 respectively.

Discussion
The two prediction models were compared using MASE, MAE and MAPE. From Table 4, the SARIMA model had lower MASE, MAE and MAPE values than Holt-Winters. This differs from Puthran et al [9], whose study of the Indian motorcycle industry found that Holt-Winters was more precise than the SARIMA model. On model selection, our results agree with Udom P. and Phumchusri N. (2014), who confirmed that ARIMA $(1,0,1)(1,0,1)_{12}$ was a better model than Holt-Winters based on the mean absolute percentage error. Considering that MAPE is one of the accuracy measures discussed by Hyndman R. J. and Koehler A. B. (2006), this paper is confident in making the comparison. Finally, we compared the actual CPI values for January and February 2017 against the values forecasted with SARIMA and Holt-Winters, using analysis of variance, with a view to finding out whether they are significantly different. We obtained the results below.

ANOVA for Actual vs Forecasted Values
Here we test whether the forecasted values of the two models differ, and also test them against the actual values for January and February 2017. From Table 5 above, we see that the F-statistic is 1.38 with a p-value of 0.3745. We therefore fail to reject the null hypothesis that the data sets are not significantly different.
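A one-way ANOVA of this kind can be sketched as follows. The sketch assumes three groups (actual, SARIMA and Holt-Winters) with two monthly values each, giving 2 and 3 degrees of freedom; this design is inferred, not stated in Table 5. The function names and toy groups are hypothetical, and the closed-form p-value is valid only when the numerator degrees of freedom equal 2.

```python
def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def p_value_two_numerator_df(f_stat, df_within):
    """Upper-tail probability of the F distribution when the numerator df is 2."""
    return (1.0 + 2.0 * f_stat / df_within) ** (-df_within / 2.0)


# Hypothetical toy groups (not the CPI values or forecasts).
toy_f, toy_df1, toy_df2 = one_way_anova_f([[1.0, 2.0], [2.0, 3.0], [4.0, 6.0]])

# The F statistic reported in the text, under the assumed df of (2, 3).
p_reported = p_value_two_numerator_df(1.38, 3)
print(toy_f, toy_df1, toy_df2, p_reported)
```

Under the assumed degrees of freedom, F = 1.38 gives a p-value of roughly 0.376, close to the reported 0.3745; the small gap is consistent with the F-statistic having been rounded.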

Conclusion
The main objective of this research paper was to forecast inflation in Kenya using SARIMA and Holt-Winters (triple exponential smoothing). The study results indicate that both models give almost similar results, but when MASE, MAE and MAPE were compared, the SARIMA model was more accurate, since it had the minimum MASE, MAE and MAPE values. This research employed a univariate approach to forecast inflation. More research can be done using multivariate analysis to forecast the CPI using factors that affect the components used in calculating the CPI, e.g. exchange rates, oil prices and weather conditions. The results will be helpful for better financial planning and budgeting. The forecasted CPI for January 2017 is 176.04; the forecasted inflation rates for January and February 2017 are 6.45% and 7.1% respectively, while the actual inflation rates for January and February 2017 from the KNBS website are 6.99% and 7.9%.