Towards Efficiency in the Residual and Parametric Bootstrap Techniques

Many bootstrap methods are available for statistical analysis, especially in econometrics, biometrics, statistics and sampling. The aim of this paper is to ascertain the accuracy and efficiency of the estimates from the independent and identically distributed (iid) simple linear regression (SLR) model under a variety of assessment conditions using bootstrap techniques. The analysis was carried out with the S-plus statistical package on hypothetical data sets drawn from a normal distribution with different group proficiency levels. In the course of the analysis, 268,800 scenarios were replicated 1000 times. The results show a significant difference between the performances of the two bootstrap methods used, namely the residual and parametric bootstrap techniques. The largest bias and standard error were always associated with model HP311, while the smallest bias and standard error were associated with model HR311. The exception was group proficiency level 3, N(1, 0.25): when the sample sizes were 200, 1000 and 10000, model HP311 rather than model HR311 produced the smallest bias and standard error. The significantly better performance of the residual bootstrap indicates that this technique can be used to assess comparative performance and can yield very accurate, consistent, fast and highly reliable statistical inference under several assessment conditions.


Introduction
The bootstrap is not simply another statistical technique but rather a general approach to statistical inference with very broad applicability and very mild modeling assumptions. Any advantage it offers results from the way the sample information is processed. For instance, in the case of samples from a normal distribution, all the information about the distribution of the sample mean is summarized in the sample mean and variance (standard error for samples), which are jointly sufficient statistics. Thus, other ways of processing the sample information in this case do not yield any better results. The bootstrap is most useful in the many econometric applications where no finite-sample distribution of the test statistic is readily available, even though it is computationally more demanding than other sampling techniques. There are several forms of the bootstrap, but two methods (residual and parametric) are usually used when the data are independent and identically distributed (iid). This study examines them to ascertain which is the more efficient with respect to proficiency level, bias and standard error on SLR models. A second aim is to estimate the test statistics of the functional models and to determine the best model under those conditions. The analysis is aided by an S-plus program (stat 4) into which the residual and parametric methods and functions were incorporated. In addition, hypothetical data sets from a normal distribution with different group proficiency levels are used to support the arguments in the paper.
[1] considered, as a special case, a parametric regression model $y_i = m_\theta(x_i) + e_i$, $i = 1, \ldots, n$, where $m_\theta(x_i)$ is the regression function and $e_i$ denotes the associated $i$th error. For a fixed design and a parametric regression function, we may proceed by resampling the residuals $\hat e_i = y_i - m_{\hat\theta}(x_i)$, where $\hat\theta$ is a parameter estimator. Naive bootstrap samples $e_i^*$, drawn from the empirical distribution of the centered residuals $\hat e_i$, are used to obtain the bootstrap regression model $y_i^* = m_{\hat\theta}(x_i) + e_i^*$. This approach is called model-based parametric bootstrapping. Each bootstrap sample provides an estimate of the regression parameter(s) by the same estimation procedure that was used with the original fitted model (e.g., ordinary least squares). From all the bootstrap replicates we get a simulation approximation to the bootstrap distribution of the regression parameter(s), and this approximation to the sampling distribution of the parameter estimate(s) is then used to make inferences about the parameter(s).
When an explicit parametric equation is not available, an alternative is the block bootstrap, which consists of resampling blocks of subsamples in order to capture the dependence in the data. When the bootstrap replicates obtained from the block bootstrap are not stationary, the stationary bootstrap proposed by [2] can be used. [3] introduced the tapered block bootstrap. The idea of the block bootstrap has also been extended to the spatial setting [4]. In model-based inference, [5] underscored the importance of the choice of residuals for residual bootstrapping. Different modifications of this simple idea allow for adapting to random designs, heteroscedastic models or situations where the regression function is not totally specified or is unknown [6], [7], [8]. For example, similar to the ideas of the bootstrap in regression models, given an explicit dependence structure such as an autoregressive model, $y_i = m(y_{i-1}, y_{i-2}, \ldots, y_{i-p})$, one may proceed by resampling from the residuals. An overview of the residual bootstrap methods for estimation and prediction in time series and regression can be found in [9], [10], [11], [12], [13]. In addition, [14], [15], [16] and [17] worked on the general efficiency of the bootstrap.
This study establishes which of the two bootstrap methods is the more efficient for the SLR model with independent and identically distributed (iid) errors, and adds this result to the literature.

Methodology
In this section, hypothetical data sets will be bootstrapped using the procedures below to develop residual and parametric bootstrap (PB) models aided by an S-plus package.

The Residual Bootstrap
Assuming the error terms in the SLR model are independent and identically distributed with common variance $\sigma^2$, we can generally make very accurate inferences by using the residual bootstrap. We do not need to assume that the errors follow the normal distribution or any other known distribution. The first step in the residual bootstrap is to obtain the OLS estimates $\hat\beta$ and the residuals $\hat u_t$. Unless the quantity to be bootstrapped is invariant to the variance of the error terms, it is advisable to rescale the residuals so that they have the correct variance. The simplest type of rescaled residual is $\ddot u_t = \left(\frac{n}{n-k}\right)^{1/2} \hat u_t$, where $n$ is the sample size and $k$ is the number of regression coefficients ($k = 2$ in the SLR model). The residual bootstrap DGP based on these rescaled residuals is $y_t^* = X_t\hat\beta + u_t^*$, with $u_t^* \sim \mathrm{EDF}(\ddot u_t)$. The bootstrap errors here are said to be 'resampled' from the $\ddot u_t$. That is, they are drawn from the empirical distribution function, or EDF, of the $\ddot u_t$. This function assigns probability $1/n$ to each of the $\ddot u_t$. Thus, each of the bootstrap error terms can take on $n$ possible values, namely the values of the $\ddot u_t$, each with probability $1/n$.
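The residual bootstrap steps described above can be sketched in code. The following Python example is only an illustration under stated assumptions and is not the S-plus program used in the study: the function names `fit_slr` and `residual_bootstrap`, the simulated data, the seeds and the true coefficients (1 and 2) are hypothetical, and the rescaling factor sqrt(n / (n - k)) with k = 2 is the usual degrees-of-freedom correction, assumed here.

```python
import numpy as np


def fit_slr(x, y):
    """OLS fit of y = b1 + b2 * x + u; returns (coefficients, residuals)."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def residual_bootstrap(x, y, n_boot=1000, seed=0):
    """Residual bootstrap for the SLR model; returns (beta_hat, replicates)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = len(y), 2  # k = number of regression coefficients (SLR assumed)
    beta_hat, resid = fit_slr(x, y)
    # Rescale the residuals so they have the correct variance (assumed form).
    rescaled = np.sqrt(n / (n - k)) * resid
    fitted = beta_hat[0] + beta_hat[1] * x
    replicates = np.empty((n_boot, k))
    for b in range(n_boot):
        # Draw bootstrap errors from the EDF: each value has probability 1/n.
        u_star = rng.choice(rescaled, size=n, replace=True)
        replicates[b], _ = fit_slr(x, fitted + u_star)
    return beta_hat, replicates


# Hypothetical data: normal regressor and normal errors, n = 200.
rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, size=200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=200)

beta_hat, replicates = residual_bootstrap(x, y)
print("OLS estimates:", beta_hat)
print("bootstrap bias:", replicates.mean(axis=0) - beta_hat)
print("bootstrap standard error:", replicates.std(axis=0, ddof=1))
```

Because the bootstrap errors are drawn with replacement from the rescaled residuals themselves, no distributional form for the errors has to be specified; only the iid assumption is used.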

Parametric Bootstrap
The reason the parametric bootstrap often works well is that least squares estimates and test statistics are generally not very sensitive to the distribution of the error terms. When the distribution of the errors is assumed to be known, the parametric bootstrap DGP is $y_t^* = X_t\hat\beta + u_t^*$, with $u_t^* \sim \mathrm{NID}(0, s^2)$. Here it is assumed that the errors are normally distributed, so the bootstrap error terms are independent normal random variates whose variance $s^2$ is the usual ordinary least squares estimate of the error variance. Similar methods can be used with any model estimated by maximum likelihood, but their validity generally depends on the strong assumptions inherent in maximum likelihood estimation.
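A minimal Python sketch of the parametric bootstrap DGP for the SLR model follows. It is an illustration under stated assumptions rather than the study's S-plus code: the function name `parametric_bootstrap`, the simulated data, the seeds and the true coefficients are hypothetical, and the error variance is estimated as the residual sum of squares divided by n - 2.

```python
import numpy as np


def parametric_bootstrap(x, y, n_boot=1000, seed=0):
    """Parametric (normal-error) bootstrap for y = b1 + b2 * x + u.

    Returns (beta_hat, s2, replicates).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = len(y), 2  # k = number of regression coefficients (SLR assumed)
    X = np.column_stack([np.ones(n), x])
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - k)  # usual OLS estimate of the error variance
    fitted = X @ beta_hat
    replicates = np.empty((n_boot, k))
    for b in range(n_boot):
        # Bootstrap errors are independent N(0, s2) variates.
        u_star = rng.normal(0.0, np.sqrt(s2), size=n)
        replicates[b] = np.linalg.lstsq(X, fitted + u_star, rcond=None)[0]
    return beta_hat, s2, replicates


# Hypothetical data: normal regressor and normal errors, n = 200.
rng = np.random.default_rng(7)
x = rng.normal(0.0, 1.0, size=200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=200)

beta_hat, s2, replicates = parametric_bootstrap(x, y)
print("OLS estimates:", beta_hat, "error variance estimate:", s2)
print("bootstrap standard error:", replicates.std(axis=0, ddof=1))
```

The only difference from resampling residuals is the source of the bootstrap errors: here they are simulated from the assumed normal distribution instead of being drawn from the empirical distribution of the residuals.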

Evaluation Criteria
The following evaluation criteria were used to investigate and understand the bootstrap DGP methods, to assess the impact of different proficiency levels on the bias, standard error and root mean square error (RMSE) of the SLR models, and to estimate the test statistics of the functional models. To achieve this, a satisfactory degree of smoothness for the distributions, together with nineteen assessment conditions, is used to estimate proficiency level, bias and standard error from the two bootstrap methods in order to determine the best model. In the bias tests of [18] and [19], a difference of 0.1 standard deviation units is generally considered relatively large, whereas a difference of 0.25 is regarded as very large. The same convention is adopted in this study. The assessment conditions can be classified into two categories. The first category comprises the five factors, and the second comprises the kernel density, three test lengths and the quantile-quantile plot. It is pertinent to note that 268,800 scenarios were replicated 1000 times.
These nineteen assessment conditions are described below, starting with the first category, the five factors.
a. Factor 1: Bootstrap method. As indicated earlier, the residual and parametric bootstrap methods were considered.
b. Factor 2: Restrictions. In total, two rescalings and seven transformations were carried out to improve the bootstrap data generating processes in the study.
c. Factor 3: Degree of group proficiency difference. The populations do not need to be equivalent in ability level. It is therefore essential to investigate this factor, since it is reflected by the magnitude of the differences in the means of the examinees' ability distributions. The bootstrapped proficiency levels are investigated and evaluated in four forms, $N(0, 1)$, $N(0, \sigma^2)$, $N(0, s^2)$ and N, denoted X, M, Z and Q respectively. Here, the standard error is used to obtain the group differences, except in the last form, where both the mean and the standard error were considered, although [20] and [21] used only the mean differences in their studies. Since $\sigma^2 = 1.00001$, the forms X and M are approximately the same based on the simulated values, so they are treated as one form.
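The bias, standard error and RMSE criteria named in this section can be computed from a set of bootstrap replicates as in the Python sketch below. It rests on assumptions not stated in the text: the helper names `evaluate` and `classify_bias` are hypothetical, bias is measured against a reference value supplied by the user, the standard deviation used to scale the bias must also be supplied, and the label "small" for differences under 0.1 standard deviation units is an addition for illustration.

```python
import numpy as np


def evaluate(estimates, reference):
    """Return (bias, standard error, RMSE) of estimates around a reference."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - reference
    se = estimates.std(ddof=1)
    rmse = np.sqrt(np.mean((estimates - reference) ** 2))
    return bias, se, rmse


def classify_bias(bias, sd):
    """Label a bias by its size in standard deviation units (0.1 and 0.25)."""
    ratio = abs(bias) / sd
    if ratio >= 0.25:
        return "very large"
    if ratio >= 0.1:
        return "relatively large"
    return "small"  # hypothetical label, not taken from the text


# Hypothetical replicates of a slope estimate whose reference value is 2.0.
rng = np.random.default_rng(0)
slopes = rng.normal(2.0, 0.07, size=1000)
bias, se, rmse = evaluate(slopes, reference=2.0)
print(bias, se, rmse, classify_bias(bias, sd=se))
```

Applying the same two functions to the replicates from each bootstrap method and each sample size gives directly comparable bias, standard error and RMSE values.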

Data Analysis
i. The results obtained from the residual bootstrap data generating process (DGP), when applied to the hypothetical data sets with fixed sample size, are as follows:

Interpretation of Results
This section interprets the hypothetical bootstrap models obtained when the bootstrap DGPs have uncorrelated error terms drawn from the X, M, Z and Q distributions.

Bias and standard error of the SLR on a hypothetical data set
As the sample size increased, the bias obtained from all the residual bootstrap models (HR311) decreased at almost all estimated values, which is to be expected from the behaviour of estimation bias. It can also be noted that, although the bias over the estimated range for the parametric bootstrap models (HP311) was large (in absolute value), the estimated curves from the different parametric bootstrap models were closer to one another when the sample size was 20,000 than when it was 200 (Table 1). Across all the conditions considered, models HR311 yielded much smaller bias and standard error than the other models at almost all score points. The regression coefficients ($b_1$ and $b_2$) of HR311 and HP311 have a positive relationship with HYPt and are also highly significant at the 5% level.
A general observation is that, across the different group proficiency levels, the standard error decreased as the sample size increased, and the different parametric bootstrap models became more similar to one another. As with the conditional bias, the bias generally decreased as the bootstrap level increased. In Test 1, across the three sample sizes and the three group proficiency levels, the largest bias and standard error were always associated with model HP311, while the smallest bias and standard error were associated with model HR311. However, the difference in bias between model HR311 and model HP311 in Test 2 was less than 0.0003. For Test 3, including or excluding the lower-end scores made no difference, and HR311 still yielded the smallest bias and standard error (see Tables 1 and 2). The exception was group proficiency level 3, N(1, 0.25): when the sample sizes were 200, 1000 and 10000, model HP311 rather than model HR311 produced the smallest bias and standard error. Note. Bold type marks the smallest value in each row.

Conclusion
The main finding is that, under nearly all bootstrap conditions, the HR311 functional models produced smaller bias and standard error than the HP311 functional models. The regression coefficients ($b_1$ and $b_2$) of HR311 and HP311 have a positive relationship with HYPt and are also highly significant at the 5% level. The results show a significant difference between the performances of the two bootstrap methods used, namely the residual and parametric bootstrap techniques. The largest bias and standard error were always associated with model HP311, while the smallest bias and standard error were associated with model HR311. The exception was group proficiency level 3, N(1, 0.25): when the sample sizes were 200, 1000 and 10000, model HP311 rather than model HR311 produced the smallest bias and standard error. The significantly better performance of the residual bootstrap indicates that this technique can be used to assess comparative performance and can yield very accurate and efficient statistical inference under the several assessment conditions considered.