Comparative Study of GCV-MCP Hybrid Smoothing Methods for Predicting Time Series Observations

Generalized Cross Validation (GCV) is a popular method for choosing the complexity of statistical models and is well known for its optimal properties. Mallows' Cp criterion (MCP) is a powerful tool for selecting smoothing parameters for spline estimates with non-Gaussian data. Most past work applied the GCV and MCP smoothing methods to time series data, but these methods overfit the data in the presence of autocorrelated errors. A new smoothing method is proposed as a hybrid of GCV and MCP. The predictive performance of the Hybrid GCV-MCP is compared with that of GCV and MCP using data generated in a simulation study and real-life data on the all-SITC export and import price index in Nigeria for the years 2001-2018. The comparison was carried out with a program written in R and is based on the predictive mean square error (PMSE) criterion. The experimental results show that the PMSE of the three smoothing methods decreases as the sample size and the smoothing parameter increase. The study found that the Hybrid GCV-MCP smoothing method performed better than the classical GCV and MCP for both the simulated and the real-life data.


Introduction
There are several ways of modeling time-series observations through nonparametric regression techniques to make predictions; one such technique is the spline smoothing method [1]. The spline smoothing model assumes that observations $y_i$ are taken at times $t_i$, for $i = 1, \ldots, n$, and that the $y_i$ are generated by a model of the form

$$y_i = f(t_i) + \varepsilon_i, \quad i = 1, \ldots, n \tag{1}$$

where $y_i$ is the value of the time series at time $t_i$, $f$ is an unknown smooth function, and $\varepsilon_i$ is normally distributed with standard deviation $\sigma_i$. The smoothing spline is the solution to this nonparametric regression problem: among all functions $f \in C^2[a, b]$, that is, functions with two continuous derivatives, it is the one that minimizes the penalized residual sum of squares

$$S(f) = \sum_{i=1}^{n} \{y_i - f(t_i)\}^2 + \lambda \int_a^b \{f''(t)\}^2 \, dt \tag{2}$$

The first term in the equation is the residual sum of squares, which measures the goodness of fit to the data. The second term is a roughness penalty, which is large when the integrated squared second derivative of the regression function is large, and $\lambda$ is a smoothing parameter. If $\lambda$ approaches 0, then $\hat f_\lambda$ simply interpolates the observations; if $\lambda$ is very large, then $\hat f_\lambda$ is selected so that $\hat f_\lambda''$ is everywhere 0, which implies an overall linear least-squares fit to the observations. The smoothing parameter $\lambda$ therefore plays a key role in controlling the trade-off between the goodness of fit, represented by $\sum_{i=1}^{n} \{y_i - f(t_i)\}^2$, and the smoothness of the estimate, measured by $\int_a^b \{f''(t)\}^2 \, dt$.

The smoothing spline solution of the minimization problem in equation (2) is known as a "natural cubic spline" with knots at the design points $x_1, \ldots, x_n$ (here $x_i = t_i$). From this point of view, a specially structured spline interpolation that depends on a chosen value of $\lambda$ becomes a suitable approximation of the function $f$ in model (2). Let

$$\begin{pmatrix} \hat f_\lambda(x_1) \\ \vdots \\ \hat f_\lambda(x_n) \end{pmatrix} = S_\lambda \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \quad \text{or} \quad \hat{\mathbf{f}}_\lambda = S_\lambda \mathbf{y} \tag{3}$$

where $\hat f_\lambda$ is a natural cubic spline with knots at $x_1, \ldots, x_n$ for a fixed $\lambda > 0$, and $S_\lambda$ is the well-known $n \times n$ symmetric, positive-definite smoother matrix, which depends on $\lambda$ and the knot points $x_1, \ldots, x_n$, but not on $\mathbf{y}$.
The function $\hat f_\lambda$, the estimate of the function $f$, is obtained by cubic spline interpolation subject to the condition $\hat f_\lambda(x_i) = (\hat{\mathbf{f}}_\lambda)_i$, $i = 1, 2, \ldots, n$.
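The role of the smoothing parameter can be illustrated with a small sketch. It does not implement the natural cubic smoothing spline described above; as a stand-in it uses a discrete analogue (a Whittaker-type smoother) in which the integrated squared second derivative is replaced by the sum of squared second differences on equally spaced points. The function names and the toy data are illustrative assumptions.

```python
def second_difference_penalty(n):
    """Return K = D^T D, where D is the (n-2) x n second-difference matrix."""
    K = [[0.0] * n for _ in range(n)]
    for r in range(n - 2):
        row = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, di in row.items():
            for j, dj in row.items():
                K[i][j] += di * dj
    return K


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def smooth(y, lam):
    """Minimise sum (y_i - f_i)^2 + lam * sum (second differences of f)^2.

    The minimiser solves (I + lam * K) f = y, so the discrete smoother
    matrix is S = (I + lam * K)^(-1), which depends on lam but not on y.
    """
    n = len(y)
    K = second_difference_penalty(n)
    A = [[(1.0 if i == j else 0.0) + lam * K[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, y)


y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]          # toy data, illustration only
nearly_interpolating = smooth(y, 1e-8)      # lam close to 0: fit follows the data
nearly_linear = smooth(y, 1e8)              # lam very large: fit is close to a straight line
```

In this discrete setting the two limiting cases behave as described for the spline: a very small λ reproduces the observations almost exactly, while a very large λ drives the second differences of the fit towards zero.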
The smoothing parameter determines the smoothing level. For $\lambda = 0$, $\hat f_\lambda(x_i)$ equals $y_i$, the value of the time series at time $x_i$. The greater the smoothing parameter, the greater the difference between $\hat f_\lambda(x_i)$ and $y_i$. The most popular method in time series analysis is the classical Autoregressive Moving Average (ARMA) approach, which assumes linear dependence on past observations and innovations. Classical linear modelling has been applied to well-known data samples and found to have limitations [1]. Growing awareness of departures from the ARMA model has drawn spline smoothing researchers into nonlinear time series analysis. Many of them have studied the modeling of time series observations with the Generalized Cross-Validation (GCV), Generalized Maximum Likelihood (GML), and Mallows' Cp criterion (MCP) methods [2]. The GCV method was extended to estimate the smoothing parameter together with an autocorrelated error term [2]. A smoothing spline was represented by a state-space model, and the CV, GCV, and GML estimation methods were extended to an autoregressive moving average error term [3]. A cross-validation method was employed to estimate some smoothing parameters [4], and more recently the GML, GCV, and MCP methods were extended to estimate the smoothing parameter when the data are correlated [5]. Almost all of these methods were developed for time series observations, while some others require that the design points be equally spaced. In real time series, usually neither the function $f$ nor the standard deviations $\sigma_i$ are known, so the true error of the estimate cannot be calculated directly. However, when $\hat f_\lambda(x_i)$ is a linear combination of the $y_i$, the Generalized Cross-Validation (GCV) function is one of the criteria most often used to select the optimal knots [6, 7, 8]. It offers several advantages over other methods, including asymptotic optimality, invariance to transformation, and no need for a known population variance.
Mallows' Cp criterion (MCP) is a smoothing parameter selection method that requires an estimate of the error variance [9, 10, 11]. The MCP method is used to select the smoothing parameter for spline estimation with non-Gaussian data [12, 13]. In a simulation study, the MCP smoothing method estimated knots that were close to the original knots, so the MCP method can be considered when choosing the optimal knots [14]. Numerical experiments showed that the direct MCP methods perform better than the existing indirect MCP [15]. Generalized Cross-Validation (GCV), Mallows' Cp criterion, and Generalized Maximum Likelihood (GML) were examined in a simulation study, which found that for large samples, and because of the effect of replication, GCV and Mallows' Cp criterion have the same asymptotic result [16].
In this paper, a hybrid smoothing method, the Hybrid GCV-MCP, is developed by combining the optimal properties of GCV and MCP. The performance and efficiency of the Hybrid GCV-MCP smoothing method are compared with those of the classical GCV and MCP smoothing methods for predicting time-series observations. Following this brief introduction to smoothing time series observations with Generalized Cross-Validation (GCV) and Mallows' Cp criterion (MCP), the next section presents a literature review on GCV and MCP, and section three presents the materials and methods. Section four compares the three methods through a simulation study and an application to real-life data, and the last section presents the conclusions.

Literature Review
The literature establishes that Generalized Cross-Validation and Mallows' Cp criterion have been applied to time series observations in the past. Generalized Cross-Validation (GCV) was compared with Mallows' Cp criterion (MCP), and GCV was recommended as a good smoothing parameter selection method for small and medium-sized samples [17]. The smoothing spline method was applied to fit a curve to a noisy data set, where the selection of the smoothing parameter is essential. An improved Cp criterion for spline smoothing, based on Stein's unbiased risk estimate, was proposed for selecting the smoothing parameter. The resulting fitted curve was shown to be superior to, and more stable than, those from commonly used selection criteria, and the criterion possesses the same asymptotic optimality as Cp [18]. Most data-driven smoothing parameter selection methods were compared for large and small sample sizes. A parallel of Akaike's information criterion (GFAIC) and Generalized Cross-Validation (GCV) were recommended as the best selection criteria: for large samples the GFAIC method appeared more appropriate, while for small samples the GCV criterion was proposed [19]. Two types of results that support the use of GCV for variable selection under the assumption of sparsity were investigated. The first type of result is based on the well-established links between GCV on the one hand and Mallows' Cp and the Stein Unbiased Risk Estimator (SURE) on the other. The result states that, in a regularized or penalized least squares problem, GCV performs as well as Cp or SURE as an estimator of the prediction error for penalties in the neighborhood of the optimal value [20]. GCV was compared with GML by a Monte Carlo method using a program written in R. It was found that GML was better than GCV because it is stable, works well in all simulations and at all sample sizes, and does not overfit the data when the sample size is small [21]. MCP and CV were compared for selecting the optimal knots. The criteria for selecting the best model were the mean squared error and R-square. A simulation was performed on a truncated spline function with errors generated from a normal distribution, for varied sample sizes and error variances. The results of the simulation study showed that CV estimates the knots more accurately than MCP [14]. Nonparametric regression problems were also considered, and a model-averaging procedure for the smoothing spline regression problem was developed [22].
The main motivation of this paper is to present a new smoothing method for predicting time series observations that performs better than the classical methods. All of the methods reviewed above have some limitations when modeling time-series observations. The combination of GCV and MCP, the Hybrid GCV-MCP, is proposed to address the over-smoothing and overfitting of time series observations across the number of knots, smoothing parameters, and time series sizes.

Data Collection
This research was conducted to evaluate the performance of the three smoothing methods, i.e. GCV, MCP, and Hybrid GCV-MCP. Time-series data were simulated using a program coded in R (version 3.2.3) for time series sizes of 50, 100, and 150. The number of replications was 1,000 for each sample size. For each simulated data set, the Predictive Mean Square Error (PMSE) was used to evaluate the quality and performance of the methods.

Equation Used for Generating Values in the Simulation
The equation used to generate the data in the simulation study, which was conducted to evaluate and compare the performance of the three smoothing methods, includes an error term $\varepsilon_t \sim N(0, \sigma^2)$ that is independently and identically distributed with zero mean and a standard deviation of 0.8 or 1.0.

Experimental Design
The experimental plan applied in this research work was designed as follows:
1. Three time series sizes (T) of 50, 100, and 150 were considered for the simulation.
2. Four smoothing parameters were considered, i.e. λ = 1, 2, 3, 4.
3. Two standard deviations were considered, i.e. σ = 0.8 and 1.0.
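The design above can be laid out as a grid of settings. The following is a minimal sketch, not the authors' R program. The generating equation of the study is not reproduced here, so `hypothetical_signal` is a placeholder trend chosen purely for illustration, and the seed is arbitrary.

```python
import itertools
import math
import random

SIZES = (50, 100, 150)       # time series sizes T
LAMBDAS = (1, 2, 3, 4)       # smoothing parameters
SIGMAS = (0.8, 1.0)          # error standard deviations
REPLICATIONS = 1000          # replications per setting, as stated in the text


def hypothetical_signal(u):
    """Placeholder smooth trend; NOT the generating equation used in the study."""
    return math.sin(2.0 * math.pi * u)


def simulate_series(T, sigma, rng):
    """One series y_t = signal(t / T) + eps_t, with i.i.d. eps_t ~ N(0, sigma^2)."""
    return [hypothetical_signal(t / T) + rng.gauss(0.0, sigma)
            for t in range(1, T + 1)]


# Every combination of time series size, smoothing parameter and sigma.
design = list(itertools.product(SIZES, LAMBDAS, SIGMAS))

rng = random.Random(2018)    # arbitrary seed, used only for reproducibility
example = simulate_series(50, 0.8, rng)
```

The grid has 3 × 4 × 2 = 24 settings; in a full run each setting would be repeated `REPLICATIONS` times and the PMSE averaged over the replications.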

Evaluation of the Smoothing Methods
In this study, the Predictive Mean Square Error (PMSE) is used as the quality measure for evaluating the performance of a smoothing or curve fitting procedure. The PMSE is the expected value of the squared difference between the fitted value implied by the predictive function $\hat f(t)$ and the value of the observed function $f(t)$, where $f(t)$ is the observed value and $\hat f(t)$ is the fitted (predicted or estimated) value. It is used to measure the performance and quality of a predictor or of smoothing methods such as Cross-Validation, Generalized Cross-Validation, and Mallows' Cp criterion. The PMSE can be divided into two terms: the first is the sum of the squared biases of the fitted values, and the second is the sum of the variances of the fitted values.
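The definition and its two-term split can be sketched as follows. The averaging over the n points (the 1/n factor) and the function names are assumptions, since the text gives the PMSE only in words. The bias and variance are computed over a set of replicate fits against a fixed target.

```python
def pmse(observed, fitted):
    """Average squared difference between observed and fitted values."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    n = len(observed)
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / n


def bias_variance(target, replicate_fits):
    """Split the average squared error over replications into bias^2 and variance."""
    n = len(target)
    r = len(replicate_fits)
    bias_sq = 0.0
    variance = 0.0
    for i in range(n):
        fits_i = [fit[i] for fit in replicate_fits]
        mean_i = sum(fits_i) / r
        bias_sq += (mean_i - target[i]) ** 2
        variance += sum((v - mean_i) ** 2 for v in fits_i) / r
    return bias_sq / n, variance / n


# Toy numbers, for illustration only.
single = pmse([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])

target = [0.0, 0.0]
fits = [[1.0, 1.0], [3.0, -1.0]]
b2, var = bias_variance(target, fits)
average_error = sum(pmse(target, fit) for fit in fits) / len(fits)
```

On these toy numbers the average squared error over the two replicate fits equals the squared bias plus the variance, which is the decomposition described above.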

Generalized Cross-Validation (GCV)
Generalized Cross-Validation (GCV) was proposed by [23] and [24] as a replacement for Cross-Validation (CV); it is the most popular method for choosing the complexity of statistical models. The basic principle of cross-validation is to leave the data points out one at a time and to choose the value of λ under which the missing data points are best predicted by the remainder of the data. To be precise, let $\hat f_\lambda^{(-i)}$ be the smoothing spline calculated from all the data pairs except $(t_i, y_i)$, using the value λ for the smoothing parameter. The cross-validation choice of λ is then the value of λ that minimizes the cross-validation score

$$CV(\lambda) = n^{-1} \sum_{i=1}^{n} \{y_i - \hat f_\lambda^{(-i)}(t_i)\}^2$$

As for model selection in regression generally, this score can be expressed through the diagonal elements of the matrix $A(\lambda)$ [25]:

$$CV(\lambda) = n^{-1} \sum_{i=1}^{n} \left( \frac{y_i - \hat f_\lambda(t_i)}{1 - A_{ii}(\lambda)} \right)^2 \tag{8}$$

A related criterion, called Generalized Cross-Validation, was suggested in [24]. It is obtained from (8) by replacing $A_{ii}(\lambda)$ by its average value, $n^{-1} \operatorname{tr} A(\lambda)$, which gives the score

$$GCV(\lambda) = n^{-1} \frac{RSS(\lambda)}{\{1 - n^{-1} \operatorname{tr} A(\lambda)\}^2}$$
where $RSS(\lambda)$ is the residual sum of squares, $\sum_{i=1}^{n} \{y_i - \hat f_\lambda(t_i)\}^2$. The authors of [24] also give theoretical arguments to show that Generalized Cross-Validation should asymptotically choose the best possible value of λ, in the sense of minimizing the average squared error at the design points. This predicted good performance is borne out by published practical examples in [27]. The Generalized Cross-Validation method is well known for its optimal properties [14]. If there exists an $n \times n$ matrix $A(\lambda)$, the influence matrix, with the property

$$\begin{pmatrix} \hat f_\lambda(t_1) \\ \vdots \\ \hat f_\lambda(t_n) \end{pmatrix} = A(\lambda) \, \mathbf{y}$$

then $W_0(\lambda)$ can be rewritten as

$$W_0(\lambda) = \frac{1}{n} \sum_{k=1}^{n} \frac{\{y_k - \hat f_\lambda(t_k)\}^2}{\{1 - a_{kk}(\lambda)\}^2}$$

where $a_{kk}(\lambda)$, $k \in \{1, 2, \ldots, n\}$, is the $k$th diagonal element of $A(\lambda)$.

GCV is a modified form of CV, which is a conventional method for choosing the smoothing parameter. The GCV score, which is constructed by analogy with the CV score, can be obtained from the ordinary residuals by dividing by the factors $1 - (S_\lambda)_{ii}$. The underlying idea of GCV is to replace the factors $1 - (S_\lambda)_{ii}$ in cross-validation with the average score $1 - n^{-1} \operatorname{tr}(S_\lambda)$. Thus, by analogy with ordinary cross-validation, summing the squared residuals and correcting them by the factor $\{1 - n^{-1} \operatorname{tr}(S_\lambda)\}^2$ gives the GCV score function

$$GCV(\lambda) = \frac{n^{-1} \sum_{i=1}^{n} \{y_i - \hat f_\lambda(x_i)\}^2}{\{1 - n^{-1} \operatorname{tr}(S_\lambda)\}^2}$$

where $n$ is the number of observations $\{x_i, y_i\}$, $\lambda$ is the smoothing parameter, and $(S_\lambda)_{ii}$ is the $i$th diagonal element of the smoother matrix $S_\lambda$.
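The two scores can be computed directly from any linear smoother matrix. The sketch below is a plain illustration of the CV and GCV formulas for a smoother matrix S supplied as a list of rows. The toy smoother (a simple averaging matrix) and the data are assumptions chosen so the arithmetic is easy to follow; they are not the spline smoother of the study.

```python
def fitted_values(S, y):
    """Fitted values of a linear smoother: f_hat = S y."""
    n = len(y)
    return [sum(S[i][j] * y[j] for j in range(n)) for i in range(n)]


def cv_score(S, y):
    """Cross-validation score using the diagonal elements of S."""
    n = len(y)
    f_hat = fitted_values(S, y)
    return sum(((y[i] - f_hat[i]) / (1.0 - S[i][i])) ** 2 for i in range(n)) / n


def gcv_score(S, y):
    """GCV score: each diagonal element is replaced by the average trace(S) / n."""
    n = len(y)
    f_hat = fitted_values(S, y)
    rss = sum((y[i] - f_hat[i]) ** 2 for i in range(n))
    trace = sum(S[i][i] for i in range(n))
    return (rss / n) / (1.0 - trace / n) ** 2


# Toy smoother: every fitted value is the sample mean, so S has all entries 1/n.
y = [1.0, 2.0, 3.0, 6.0]
n = len(y)
S = [[1.0 / n] * n for _ in range(n)]

cv = cv_score(S, y)
gcv = gcv_score(S, y)
```

For this toy smoother all diagonal elements are equal, so replacing them by their average changes nothing and the CV and GCV scores coincide. With a smoother whose diagonal elements differ, the two scores generally differ.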

Mallows' Cp Criterion
Mallows' Cp criterion (MCP) was developed by [26] to assess the fit of a regression model estimated by ordinary least squares. It is applied in model selection, where several predictor variables are available for forecasting an outcome and the aim is to find the best model among subsets of the predictors. The smaller the value of Cp, the more precise the model.
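For reference, the classical regression form of Mallows' Cp is shown below. It is the standard textbook expression for subset selection and is not necessarily the exact spline-smoothing form used in this study. The symbols $SSE_p$, $\hat\sigma^2$, $n$, and $p$ are introduced here only for this illustration.

```latex
% Classical Mallows' Cp for a subset model with p parameters:
%   SSE_p        residual sum of squares of the subset model
%   \hat\sigma^2 estimate of the error variance (usually from the full model)
%   n            number of observations
C_p = \frac{SSE_p}{\hat{\sigma}^2} - n + 2p
```

A subset model with little bias has $C_p$ close to $p$, which is why small values of the criterion are preferred.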

Hybrid GCV-MCP Method
A hybrid is the combination of two models or methods. Combinations of different methods have frequently been used in research to obtain better performance by exploiting the particular strengths of each method. A combination of GCV and MCP is expected to provide a more accurate and precise predictive model for forecasting than either individual smoothing method. The GCV method is well known for its optimal properties in smoothing estimation [6], while the MCP method has been successfully applied to estimate smoothing parameters for spline estimates with non-Gaussian data and to fit data appropriately [8, 24, 25, 26]. The proposed smoothing method combines the optimal properties of GCV and MCP.
The minimizer of GCV is given in (13), while that of Mallows' Cp criterion is given in (14). A new smoothing method is therefore proposed by introducing an additional weighting parameter $g$ and combining the properties of Generalized Cross-Validation (GCV) and Mallows' Cp criterion (MCP). Combining the quantities of the two methods is intended to yield a smoothing model with optimal performance that does not overfit the data. In the minimizer of the hybrid of (13) and (14), $n$ is the number of observations; $g$ is the weight, with $0 < g < 1$; $\mathbf{y} = (y_1, \ldots, y_n)^T$ is the vector of observations; $\hat{\mathbf{f}} = (\hat f(t_1), \ldots, \hat f(t_n))^T$ is $S_\lambda \mathbf{y}$; and $(S_\lambda)_{ii}$ is the $i$th diagonal element of the smoother matrix $S_\lambda$.

Simulation Study
In this section, a simulation study is carried out to compare the behavior of the Hybrid GCV-MCP smoothing method with that of two classical smoothing methods, namely Generalized Cross-Validation (GCV) and Mallows' Cp criterion (MCP), when estimating time-series observations. Before the simulation experiments were run, datasets for the different simulation combinations were generated using code written in R 3.2.3. Our data generation procedure, with accompanying descriptions, is given in Table 1.

Table 2 presents the predictive mean square error (PMSE) of the three smoothing methods for the three sample sizes at σ = 0.8. For the GCV smoothing method, the PMSE decreases as the time series size increases: for smoothing parameter λ = 1, it falls from 0.053273 at T = 50 to 0.027264 at T = 100 and further to 0.025485 at T = 150. For the Hybrid GCV-MCP smoothing method, the PMSE decreases from 0.042392 at T = 50 to 0.040716 at T = 100 and further to 0.003851 at T = 150. For the MCP smoothing method, the PMSE also decreases as the time series size increases: for λ = 1, it falls from 0.048021 at T = 50 to 0.034561 at T = 100 and further to 0.025485 at T = 150.

Table 3 (smoothing parameters λ = 1, 2, 3 and 4; time T = 50, 100 and 150; standard deviation σ = 1.0) presents the PMSE of the three smoothing methods for the three sample sizes at σ = 1.0. For the GCV smoothing method, the PMSE decreases as the time series size increases: for smoothing parameter λ = 1, it falls from 0.126826 at T = 50 to 0.0424942 at T = 100 and further to 0.0242396 at T = 150.
For the Hybrid GCV-MCP smoothing method, the PMSE decreases from 0.087162 at T = 50 to 0.016832 at T = 100 and further to 0.004103 at T = 150. For the MCP smoothing method, the PMSE also decreases as the time series size increases: for smoothing parameter λ = 1, it falls from 0.056372 at T = 50 to 0.044455 at T = 100 and further to 0.037378 at T = 150.

Smoothing Curves of the Time Series Observation
Figures 1 to 6 present the observed and estimated values for GCV, Hybrid GCV-MCP, and MCP after 1,000 replications. From these plots it is observed that Hybrid GCV-MCP and GCV have smaller PMSEs than MCP. The plots also indicate that the Generalized Cross-Validation (GCV), Mallows' Cp criterion (MCP), and Hybrid GCV-MCP smoothing methods all provide a good fit to the time series observations, but that the Hybrid GCV-MCP and GCV smoothing methods provide better estimates than MCP, based on the Predictive Mean Square Error (PMSE), at all sample sizes. The Hybrid method is more stable when the sample size is small or medium (T = 50 and 100). This behavior of the Hybrid smoothing method is somewhat similar to the findings of [27] and [13].

Table 5: Ranks of the smoothing methods and preferred method (σ = 1.0)
Time      GCV   Hybrid GCV-MCP   MCP   Preferred method
T = 50    2     3                1     MCP
T = 100   2     1                3     Hybrid
T = 150   2     1                3     Hybrid

Tables 4 and 5 above present the ranks and the preferred smoothing method among the three smoothing methods (GCV, Hybrid GCV-MCP, and MCP) at the three time series sizes (T = 50, 100 and 150) when the standard deviation is 0.8 and 1.0, respectively.
From the results presented in Table 4, it can be seen that the Hybrid GCV-MCP method had the smallest predictive mean square error when the time series size is small or moderate (T = 50 and 100), while the MCP smoothing method had the smallest predictive mean square error when the time series size is large (T = 150).
From the results presented in Table 5, it can be seen that the MCP smoothing method had the smallest predictive mean square error when the time series size is small (T = 50), while the Hybrid GCV-MCP smoothing method had the smallest predictive mean square error when the time series size is moderate or large (T = 100 and 150).
In summary, Hybrid GCV-MCP is the best smoothing method for predicting time-series observations across the three time series sizes (T = 50, 100, and 150) and both standard deviations (σ = 0.8 and 1.0), based on the predictive mean square error (PMSE) criterion. This finding differs from that of [15] but is somewhat similar to those of [27] and [13].
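The ranking step behind a comparison of this kind can be sketched as follows. The PMSE values are the ones quoted in the text for λ = 1 and σ = 1.0. Because only one smoothing parameter is used, the resulting ranks need not agree with the published rank tables, which may aggregate over all four values of λ. The function name is an illustrative assumption.

```python
# PMSE values quoted in the text for lambda = 1 and sigma = 1.0.
PMSE_LAMBDA_1_SIGMA_1 = {
    50: {"GCV": 0.126826, "Hybrid GCV-MCP": 0.087162, "MCP": 0.056372},
    100: {"GCV": 0.0424942, "Hybrid GCV-MCP": 0.016832, "MCP": 0.044455},
    150: {"GCV": 0.0242396, "Hybrid GCV-MCP": 0.004103, "MCP": 0.037378},
}


def rank_methods(scores):
    """Rank methods by PMSE; rank 1 is the smallest (best) value."""
    ordered = sorted(scores, key=scores.get)
    return {method: rank for rank, method in enumerate(ordered, start=1)}


ranks = {T: rank_methods(scores) for T, scores in PMSE_LAMBDA_1_SIGMA_1.items()}
preferred = {T: min(scores, key=scores.get)
             for T, scores in PMSE_LAMBDA_1_SIGMA_1.items()}

for T in sorted(ranks):
    print(T, ranks[T], "preferred:", preferred[T])
```

On these quoted values MCP has the smallest PMSE at T = 50 and Hybrid GCV-MCP has the smallest at T = 100 and 150.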

Application of Smoothing Methods to Real Life Data
This section shows the performance of the Hybrid GCV-MCP smoothing method on real time series observations. The dataset consists of the all Standard International Trade Classification (SITC) export and import price index in Nigeria for 2001-2018, collected from CBN (2018). To provide continuity, the logarithm of the all-SITC product export price index is taken as the response variable (export), and import is taken as the nonparametric covariate (import). The appropriate smoothing spline time series model is thus given by

$$\log(\text{export}_i) = f(\text{import}_i) + \varepsilon_i, \quad 1 \le i \le 216 \tag{17}$$

The dataset contains 216 observations of the all-SITC export and import price index in Nigeria, which are used for this analysis. The outcomes calculated for the model in equation (17) are given in the following tables and figure. Table 8 presents the results for the real-life data on the all-SITC export and import price index in Nigeria for 2001-2018. The autocorrelation test showed that autocorrelation exists in the time series observations (X-squared = 96.7395, df = 1, p-value < 2.2e-16). Table 6 shows that the time series is stationary (Dickey-Fuller = -3.6471, lag order = 5, p-value = 0.03021), while Table 7 indicates that the Hybrid GCV-MCP smoothing method performed better than GCV and MCP.
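An autocorrelation check with one degree of freedom can be sketched as below. The text does not state which test produced the reported X-squared value, so this sketch assumes a Box-Pierce statistic at lag 1, $Q = n r_1^2$, compared with a chi-square distribution with one degree of freedom. The toy series is an illustration and is not the SITC data.

```python
import math


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return num / denom


def box_pierce_lag1(x):
    """Box-Pierce statistic Q = n * r1^2 and its chi-square(1) p-value."""
    n = len(x)
    r1 = lag1_autocorrelation(x)
    q = n * r1 ** 2
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value


# Toy trending series, illustration only.
series = [float(t) for t in range(1, 21)]
q, p_value = box_pierce_lag1(series)
```

A small p-value indicates that the lag-1 autocorrelation is unlikely to be zero, which is the situation the smoothing methods in this study are meant to handle.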

Conclusions
In this paper, experimental results were obtained using GCV, MCP, and Hybrid GCV-MCP for predicting time-series observations. The performance of the Hybrid model was compared with that of Generalized Cross-Validation (GCV) and Mallows' Cp criterion (MCP). GCV is used to choose the complexity of statistical models, while MCP is a powerful tool for selecting smoothing parameters for spline estimates with non-Gaussian data. The results showed that MCP performed better than GCV for small sample sizes. Overall, the Hybrid GCV-MCP method achieved good forecasting results and performed better than the two classical smoothing methods (GCV and MCP), based on the smallest predictive mean square error (PMSE). The proposed smoothing method was also applied to real-life data on the all-SITC export and import price index in Nigeria for the years 2001-2018, and the PMSE results showed that Hybrid GCV-MCP performed better than GCV and MCP.
Finally, the proposed Hybrid GCV-MCP is recommended as the best smoothing method for time series observations, mostly for medium and large sample sizes.