Efficiency Comparisons of Different Estimators for Panel Data Models with Serially Correlated Errors: A Stochastic Parameter Regression Approach

This paper considers panel data models in which the errors are first-order serially correlated and the regression parameters are stochastic. The generalized least squares (GLS) estimators for these models are derived and examined. In addition, an alternative to the GLS estimators for small samples, called the simple mean group (SMG) estimator, is proposed. Efficiency comparisons between the GLS and SMG estimators are carried out. The Monte Carlo studies indicate that the SMG estimator is more reliable than the GLS estimators in most situations, especially when the model includes one or more non-stochastic parameters.


Introduction
In panel data models, the pooled least squares estimator is the best linear unbiased estimator (BLUE) under the classical assumptions of the general linear regression model. These assumptions are discussed in [1,2]. An important assumption for panel data models is that the individuals in the database are drawn from a population with a common regression parameter vector. In other words, the parameters of a classical panel data model must be non-stochastic. In practice, this assumption is not satisfied in most economic models; see, e.g., [3,4]. In this paper, panel data models are studied when this assumption is relaxed. In this case, the model is called the stochastic parameter regression (SPR) model. This model has been examined in several publications, such as [5][6][7][8][9][10][11][12][13]. Some statistical and econometric publications refer to this model as Swamy's model, e.g., [14][15][16][17][18].
In the SPR model, Swamy [5] assumed that the individuals in the database are drawn from a population whose regression parameters consist of a common non-stochastic component and a stochastic component that allows the parameters to differ from unit to unit. This model has been developed further by many researchers; see, e.g., [19][20][21].
Generally, SPR models have been applied in several fields, especially in finance and economics, and they constitute a unifying setup for many statistical problems. For example, Boot and Frankfurter [22] used the SPR model to examine the optimal mix of short- and long-term debt for firms. Feige and Swamy [23] applied this model to estimate demand equations for liquid assets, while Boness and Frankfurter [24] used it to examine the concept of risk classes in finance. Recently, Westerlund and Narayan [25] used the stochastic parameter approach to predict stock returns at the New York Stock Exchange.
The main objective of this paper is to provide the researcher with some guidelines on how to select the appropriate estimator for panel data models when the parameters are stochastic or mixed-stochastic. To achieve this objective, the conventional estimators of these models are examined in small samples. In addition, an alternative consistent estimator for these models is proposed under the assumption that the errors are first-order serially correlated.
The rest of the paper is organized as follows. Section 2 provides the generalized least squares (GLS) estimators for the case in which the parameters of the model are stochastic. Section 3 presents an appropriate estimator when the parameters are mixed-stochastic. In Section 4, an alternative estimator for these models is proposed. Section 5 contains the results of the Monte Carlo simulation studies. Finally, Section 6 offers concluding remarks.

The Model with Stochastic Parameters
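Let there be observations for $N$ cross-sectional units over $T$ time periods. Suppose the variable $y$ for the $i$th unit at time $t$ is specified as a linear function of $K$ strictly exogenous variables, $x_{kit}$, in the following form:
$y_{it} = \sum_{k=1}^{K} \beta_{ki} x_{kit} + u_{it} = x_{it}\beta_i + u_{it}, \quad i = 1, \dots, N;\ t = 1, \dots, T, \quad (1)$
where $u_{it}$ denotes the random error term, $x_{it}$ is a $1 \times K$ vector of exogenous variables, and $\beta_i$ is the $K \times 1$ vector of regression parameters. Stacking (1) over time:
$y_i = X_i\beta_i + u_i, \quad (2)$
where $y_i = (y_{i1}, \dots, y_{iT})'$, $X_i = (x_{i1}', \dots, x_{iT}')'$, and $u_i = (u_{i1}, \dots, u_{iT})'$. The following assumptions are made.
Assumption 1: The errors have zero mean, i.e., $E(u_i) = 0$ for all $i$.
Assumption 2: The errors have a different variance for each unit and are first-order serially correlated: $u_{it} = \rho_i u_{i,t-1} + \varepsilon_{it}$, with $|\rho_i| < 1$, where the innovations $\varepsilon_{it}$ have zero mean and variance $\sigma^2_{\varepsilon_i}$ and are uncorrelated over time and across units. It is assumed that in the initial time period the errors have the same properties as in subsequent periods. So, assume that $E(u_{i0}^2) = \sigma^2_{\varepsilon_i}/(1 - \rho_i^2)$.
Assumption 3: The exogenous variables are non-stochastic (in repeated samples) and hence independent of the other variables in the model, and $rank(X_i) = K$ for all $i = 1, \dots, N$, where $K < T, N$.
Assumption 4: The vector of regression parameters is specified as $\beta_i = \bar{\beta} + \mu_i$, where $\bar{\beta} = (\bar{\beta}_1, \dots, \bar{\beta}_K)'$ is a vector of non-stochastic parameters and $\mu_i = (\mu_{1i}, \dots, \mu_{Ki})'$ is a vector of random variables with zero means and constant variance-covariances: $E(\mu_i\mu_j') = \Psi^*$ if $i = j$ and $0$ otherwise, where $\Psi^* = diag\{\psi_k^2\}$; for $k = 1, \dots, K$. Assume also that $E(\mu_i u_{jt}) = 0$ for all $i$ and $j$.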
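Using assumption 4, the model in (2) can be rewritten as:
$Y = X\bar{\beta} + Z\mu + u, \quad (3)$
where $Y = (y_1', \dots, y_N')'$, $X = (X_1', \dots, X_N')'$, $u = (u_1', \dots, u_N')'$, $\mu = (\mu_1', \dots, \mu_N')'$, and $Z = diag\{X_i\}$; for $i = 1, \dots, N$. Under assumptions 1 to 4, the BLUE of $\bar{\beta}$ and its variance-covariance matrix are:
$\hat{\bar{\beta}}_{SPRSC} = (X'\Lambda^{*-1}X)^{-1}X'\Lambda^{*-1}Y; \quad var(\hat{\bar{\beta}}_{SPRSC}) = (X'\Lambda^{*-1}X)^{-1}, \quad (4)$
where $\Lambda^* = V + Z(I_N \otimes \Psi^*)Z'$, with $V = diag\{\sigma^2_{\varepsilon_i}\Omega_{ii}\}$ and
$\Psi^* = \frac{1}{N-1}\left[\sum_{i=1}^{N}\beta_i^*\beta_i^{*\prime} - \frac{1}{N}\sum_{i=1}^{N}\beta_i^*\sum_{i=1}^{N}\beta_i^{*\prime}\right] - \frac{1}{N}\sum_{i=1}^{N}\sigma^2_{\varepsilon_i}(X_i'\Omega_{ii}^{-1}X_i)^{-1}, \quad (5)$
where $\beta_i^* = (X_i'\Omega_{ii}^{-1}X_i)^{-1}X_i'\Omega_{ii}^{-1}y_i$, with $\Omega_{ii}$ the $T \times T$ matrix whose $(t, s)$ element is $\rho_i^{|t-s|}/(1 - \rho_i^2)$.
Note that $\hat{\bar{\beta}}_{SPRSC}$ can be rewritten as a weighted average of the GLS estimators for each cross-sectional unit:
$\hat{\bar{\beta}}_{SPRSC} = \sum_{i=1}^{N} W_i^*\beta_i^*$, where $W_i^* = \left\{\sum_{j=1}^{N}\left[\Psi^* + \sigma^2_{\varepsilon_j}(X_j'\Omega_{jj}^{-1}X_j)^{-1}\right]^{-1}\right\}^{-1}\left[\Psi^* + \sigma^2_{\varepsilon_i}(X_i'\Omega_{ii}^{-1}X_i)^{-1}\right]^{-1}$.
To make the $\hat{\bar{\beta}}_{SPRSC}$ estimator feasible, we suggest using the following consistent estimators for $\rho_i$ and $\sigma^2_{\varepsilon_i}$:
$\hat{\rho}_i = \frac{\sum_{t=2}^{T}\hat{u}_{it}\hat{u}_{i,t-1}}{\sum_{t=2}^{T}\hat{u}_{i,t-1}^2}; \quad \hat{\sigma}^2_{\varepsilon_i} = \frac{\hat{\varepsilon}_i'\hat{\varepsilon}_i}{T - K}, \quad (7)$
where $\hat{u}_i = (\hat{u}_{i1}, \dots, \hat{u}_{iT})' = y_i - X_i\hat{\beta}_i$; $\hat{\beta}_i = (X_i'X_i)^{-1}X_i'y_i$, while $\hat{\varepsilon}_i = (\hat{\varepsilon}_{i1}, \hat{\varepsilon}_{i2}, \dots, \hat{\varepsilon}_{iT})'$; $\hat{\varepsilon}_{i1} = \hat{u}_{i1}\sqrt{1 - \hat{\rho}_i^2}$, and $\hat{\varepsilon}_{it} = \hat{u}_{it} - \hat{\rho}_i\hat{u}_{i,t-1}$ for $t = 2, \dots, T$.^1 Replacing $\rho_i$ by $\hat{\rho}_i$ in the $\Omega_{ii}$ matrix gives a consistent estimator of $\Omega_{ii}$, say $\hat{\Omega}_{ii}$. Using $\hat{\sigma}^2_{\varepsilon_i}$ and $\hat{\Omega}_{ii}$ gives consistent estimators of $V$ and $\Psi^*$, say $\hat{V}$ and $\hat{\Psi}^*$. Using the consistent estimators ($\hat{\sigma}^2_{\varepsilon_i}$, $\hat{\Omega}_{ii}$, and $\hat{\Psi}^*$) gives a consistent estimator of $\Lambda^*$, say $\hat{\Lambda}^*$, which is then used to obtain a feasible version of $\hat{\bar{\beta}}_{SPRSC}$.
Note that in the non-stochastic parameter model, the errors are assumed to be cross-sectionally heteroskedastic as well as first-order serially correlated. However, the individuals in the database are drawn from a population with a common regression parameter vector $\bar{\beta}$, i.e., $\beta_1 = \dots = \beta_N = \bar{\beta}$. Therefore the BLUE of $\bar{\beta}$, under assumptions 1 to 3, is:
$\hat{\bar{\beta}}_{PLS} = (X'V^{-1}X)^{-1}X'V^{-1}Y$;
this estimator has been termed the pooled least squares (PLS) estimator. Using $\hat{V}$ as defined above gives the feasible PLS (FPLS) estimator.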
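As a rough illustration of the feasible step described above, the following Python sketch estimates the autocorrelation coefficient from one unit's OLS residuals and then transforms the residuals so that they are serially uncorrelated. The function names are hypothetical, and the moment estimator of the autocorrelation coefficient and the $T - K$ divisor are assumed standard forms that may differ in detail from the paper's expressions in (7).

```python
import math


def estimate_rho(resid):
    """Moment estimate of the AR(1) coefficient from one unit's residuals.

    Assumed form: sum of u_t * u_{t-1} over sum of u_{t-1}^2, for t = 2..T.
    """
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den


def whiten(resid, rho):
    """Scale the first residual by sqrt(1 - rho^2); quasi-difference the rest."""
    eps = [resid[0] * math.sqrt(1.0 - rho ** 2)]
    eps += [resid[t] - rho * resid[t - 1] for t in range(1, len(resid))]
    return eps


def sigma2_eps(eps, k):
    """Innovation variance estimate with an assumed (T - K) divisor."""
    return sum(e * e for e in eps) / (len(eps) - k)
```

In use, `estimate_rho` would be applied to the residuals of each unit's OLS regression, and the resulting value passed to `whiten` before computing `sigma2_eps`.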
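In the standard stochastic parameter model presented by Swamy [5], the errors are assumed to be cross-sectionally heteroskedastic and serially independent. As for the parameters, he assumed the same conditions as in assumption 4. Therefore, the BLUE of $\bar{\beta}$, under Swamy's [5] assumptions, is:
$\hat{\bar{\beta}}_{SPR} = (X'\Lambda^{-1}X)^{-1}X'\Lambda^{-1}Y$, where $\Lambda = diag\{\sigma^2_{u_i}I_T\} + Z(I_N \otimes \Psi)Z'$.
To make the $\hat{\bar{\beta}}_{SPR}$ estimator feasible, Swamy [27] used the following unbiased and consistent estimators:
$\hat{\Psi} = \frac{1}{N-1}\left[\sum_{i=1}^{N}\hat{\beta}_i\hat{\beta}_i' - \frac{1}{N}\sum_{i=1}^{N}\hat{\beta}_i\sum_{i=1}^{N}\hat{\beta}_i'\right] - \frac{1}{N}\sum_{i=1}^{N}\hat{\sigma}^2_{u_i}(X_i'X_i)^{-1}; \quad \hat{\sigma}^2_{u_i} = \frac{\hat{u}_i'\hat{u}_i}{T - K}$,
where $\hat{u}_i$ is defined in (7). Swamy [6,7] showed that the $\hat{\bar{\beta}}_{SPR}$ estimator, under Swamy's [5] assumptions, is consistent as both $N, T \to \infty$ and is asymptotically efficient as $T \to \infty$.
It is worth noting that, just as in the error-components model, the estimated values of $\Psi^*$ and $\Psi$ are not necessarily non-negative definite, so negative values of the estimated variances of $\hat{\bar{\beta}}_{SPRSC}$ and $\hat{\bar{\beta}}_{SPR}$ may be obtained. To avoid this problem, the following consistent estimators of $\Psi^*$ and $\Psi$ can be used:
$\hat{\Psi}^{*+} = \frac{1}{N-1}\left[\sum_{i=1}^{N}\hat{\beta}_i^*\hat{\beta}_i^{*\prime} - \frac{1}{N}\sum_{i=1}^{N}\hat{\beta}_i^*\sum_{i=1}^{N}\hat{\beta}_i^{*\prime}\right]; \quad \hat{\Psi}^{+} = \frac{1}{N-1}\left[\sum_{i=1}^{N}\hat{\beta}_i\hat{\beta}_i' - \frac{1}{N}\sum_{i=1}^{N}\hat{\beta}_i\sum_{i=1}^{N}\hat{\beta}_i'\right]$.
Swamy [5] suggested using $\hat{\Psi}^{+}$ if the estimated variance of $\hat{\bar{\beta}}_{SPR}$ is found to be negative.^2 Although these estimators ($\hat{\Psi}^{*+}$ and $\hat{\Psi}^{+}$) are biased, they are non-negative definite and consistent when $T \to \infty$; see [16,28]. Moreover, these estimators may be suitable for moderate or large samples, but they are not suitable for small samples.
^1 The estimator of $\rho_i$ in (7) is consistent, but it is not unbiased. See [26] for other suitable consistent estimators that are often used in practice.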

The Model with Mixed-Stochastic Parameters
In this section, the GLS estimator for the model with mixed (stochastic and non-stochastic) parameters is derived. In this case, the (mixed SPR) model can be written as:
$y_i = W_i\theta_i + u_i = X_{1i}\beta_{1i} + X_{2i}\beta_2 + u_i, \quad (8)$
where $y_i$ and $u_i$ are defined in (2), $W_i = (X_{1i}, X_{2i})$, where $X_{1i}$ and $X_{2i}$ are $T \times K_1$ and $T \times K_2$ matrices of observations on $K_1$ and $K_2$ explanatory variables, respectively. $\theta_i = (\beta_{1i}', \beta_2')'$, where $\beta_{1i}$ is a $K_1 \times 1$ vector of parameters assumed to be stochastic with mean $\bar{\beta}_1$ and variance-covariance matrix $\Psi_1$, and $\beta_2$ is a $K_2 \times 1$ vector of parameters assumed to be non-stochastic, where $K_1 + K_2 = K$. The model in (8) applies to each of the $N$ cross-sections. Supposing that $\beta_{1i} = \bar{\beta}_1 + \mu_{1i}$, these individual equations can be combined as:
$Y = W\gamma + e, \quad (9)$
where $Y$ is defined in (3), $W = (W_1', \dots, W_N')'$, $\gamma = (\bar{\beta}_1', \beta_2')'$, and $e = (e_1', \dots, e_N')'$, where $e_i = X_{1i}\mu_{1i} + u_i$. Under Swamy's [5] assumptions, this model has been examined by Swamy [27] and Rosenberg [30]. In this paper, however, the model is examined under assumptions 1 to 4, so the variance-covariance matrix of $e$ is:
$\Pi = V + Z_1(I_N \otimes \Psi_1)Z_1'$, where $V$ is the variance-covariance matrix of the errors $u$ and $Z_1 = diag\{X_{1i}\}$.
The GLS estimator of $\gamma$ is:
$\hat{\gamma}_{MSPRSC} = (W'\Pi^{-1}W)^{-1}W'\Pi^{-1}Y, \quad (10)$
where $X_1 = (X_{11}', \dots, X_{1N}')'$ and $X_2 = (X_{21}', \dots, X_{2N}')'$. Since the mixed SPR model is a special case of the SPR model in which the variances of certain parameters are assumed to be equal to zero, the feasible estimator of $\gamma$ can be obtained by the following algorithm:
Step 1: Calculate $\hat{\Psi}^*$ as in (5), using the consistent estimators of $\sigma^2_{\varepsilon_i}$ and $\Omega_{ii}$ given in (7).
Step 2: Obtain the estimate of $\Psi_1$, say $\hat{\Psi}_1$, by removing the rows and columns for the non-stochastic parameters (those within the $\beta_2$ vector) from the $\hat{\Psi}^*$ matrix.
Step 3: Obtain the estimate of $\Pi$, say $\hat{\Pi}$, by using $\hat{\Psi}_1$ and the consistent estimators in (7).
Step 4: Finally, use $\hat{\Pi}$ in (10) to get the feasible estimator of $\gamma$.
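Step 2 of the algorithm above amounts to deleting rows and columns from an estimated covariance matrix. A minimal Python sketch, assuming the matrix is held as a list of lists and that the positions of the non-stochastic parameters are already known (the function name and argument names are hypothetical):

```python
def stochastic_block(psi_hat, nonstochastic):
    """Drop the rows and columns of the non-stochastic parameters.

    psi_hat: square covariance estimate as a list of lists.
    nonstochastic: positions (0-based) of the parameters treated as fixed.
    """
    drop = set(nonstochastic)
    keep = [j for j in range(len(psi_hat)) if j not in drop]
    return [[psi_hat[r][c] for c in keep] for r in keep]
```

For a model with an intercept and one slope, passing `[1]` would keep only the intercept variance, which corresponds to the case in which the slope is treated as non-stochastic.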
The main point in this algorithm is step 2, i.e., how to determine the non-stochastic parameters in the model. This requires a statistical test for randomness of the parameters. In this paper, Swamy's [5] test is used. The basic idea of this test is as follows: since $\bar{\beta}$ is fixed for every $i$, as given in assumption 4, random variation can be tested indirectly by testing whether or not the non-stochastic parameter vectors $\beta_i$ are all equal. That is, the null hypothesis is:
$H_0: \beta_1 = \beta_2 = \dots = \beta_N = \bar{\beta}$.
The test statistic is given in (11), with each unknown matrix replaced by its estimate. Swamy [5] showed that, under $H_0$, the test statistic in (11) is asymptotically chi-square distributed, with $K(N - 1)$ degrees of freedom, as $T \to \infty$ and $N$ is fixed. Swamy's [5] test can be applied to the mixed SPR model as in the SPR model. To begin, suppose that the mixed SPR model in (8) can be rewritten as:^3
$y_i = A_{1i}\alpha_{1i} + A_{2i}\alpha_{2i} + X_{2i}\beta_2 + u_i, \quad (12)$
where $\beta_{1i} = (\alpha_{1i}', \alpha_{2i}')'$, where $\alpha_{1i}$ is an $h_1 \times 1$ vector of stochastic parameters to be included in a test of some hypotheses, and $\alpha_{2i}$ is an $h_2 \times 1$ vector of stochastic parameters that are to be excluded from the test; $X_{1i} = (A_{1i}, A_{2i})$, where $A_{1i}$ and $A_{2i}$ are $T \times h_1$ and $T \times h_2$ matrices, respectively, of observations on independent variables; and all other terms were defined when discussing equation (8). As previously noted, the mixed SPR model can be rewritten as:
$Y = A_1\gamma_1 + A_2\gamma_2 + X_2\beta_2 + e$,
where $Y$, $X_2$, and $e$ are defined in (3), (10), and (9), respectively, $A_1 = (A_{11}', \dots, A_{1N}')'$, $A_2 = (A_{21}', \dots, A_{2N}')'$, and $\gamma_1$ and $\gamma_2$ are the means of the stochastic parameters $\alpha_{1i}$ and $\alpha_{2i}$, respectively.
In the mixed SPR model, procedures are available to test the following hypothesis for randomness of parameters:
$H_0: \alpha_{11} = \alpha_{12} = \dots = \alpha_{1N} = \gamma_1$.
This is analogous to the indirect test for randomness in the SPR model. Here there may be a subset of parameters that are initially assumed stochastic but that are to be tested for randomness. The test statistic used to conduct the test compares $\hat{\gamma}_1$, the estimated vector of parameters assuming they are non-stochastic, with $\hat{\alpha}_{1i}$ (for $i = 1, \dots, N$), the separate estimates of the parameters. If the null hypothesis is not rejected, the parameters $\alpha_{1i}$ are non-stochastic and should be treated in the manner of the $\beta_2$ vector of parameters in (12). But if the null hypothesis is rejected, the parameters $\alpha_{1i}$ are treated as stochastic.
^3 See [2] for more information about this test.
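To make the idea of the homogeneity test concrete, the sketch below computes a Swamy-type statistic for the special case of a constant plus one regressor, assuming serially independent errors within each unit (the classical form of the test, not the serially correlated variant studied here). Each unit's OLS estimate is compared with a precision-weighted pooled estimate. The function names are hypothetical, and the p-value is left out because the Python standard library has no chi-square distribution.

```python
def ols_fit(x, y):
    """OLS of y on a constant and x; returns (coefficients, X'X, s^2)."""
    t = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = t * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (t * sxy - sx * sy) / det
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return (b0, b1), ((t, sx), (sx, sxx)), rss / (t - 2)


def swamy_statistic(units):
    """units: list of (x, y) pairs. Returns (statistic, degrees of freedom)."""
    fits = [ols_fit(x, y) for x, y in units]
    # Precision of each unit's OLS estimate: X'X / s^2.
    prec = [[[xtx[r][c] / s2 for c in range(2)] for r in range(2)]
            for _, xtx, s2 in fits]
    # Precision-weighted pooled estimate: solve (sum P_i) b = sum P_i b_i.
    a = [[sum(p[r][c] for p in prec) for c in range(2)] for r in range(2)]
    rhs = [sum(p[r][0] * f[0][0] + p[r][1] * f[0][1]
               for p, f in zip(prec, fits)) for r in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    pooled = ((a[1][1] * rhs[0] - a[0][1] * rhs[1]) / det,
              (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det)
    stat = 0.0
    for p, f in zip(prec, fits):
        d = (f[0][0] - pooled[0], f[0][1] - pooled[1])
        stat += (d[0] * (p[0][0] * d[0] + p[0][1] * d[1])
                 + d[1] * (p[1][0] * d[0] + p[1][1] * d[1]))
    return stat, 2 * (len(units) - 1)
```

The returned statistic would be compared with a chi-square critical value for the returned degrees of freedom; large values point to parameters that vary across units.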

An Alternative Estimator
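Generally, it is easy to verify that under assumptions 1 to 4 the PLS and SPR estimators are unbiased for $\bar{\beta}$, with variance-covariance matrices:
$var(\hat{\bar{\beta}}_{PLS}) = G_1\Lambda^*G_1'; \quad var(\hat{\bar{\beta}}_{SPR}) = G_2\Lambda^*G_2', \quad (13)$
where $G_1 = (X'V^{-1}X)^{-1}X'V^{-1}$ and $G_2 = (X'\Lambda^{-1}X)^{-1}X'\Lambda^{-1}$, with $V$ the variance-covariance matrix of the errors $u$ and $\Lambda$ the variance-covariance matrix used by the SPR estimator. The efficiency gains from the use of the SPRSC estimator can be summarized in the following equations:
$EG_{PLS} = var(\hat{\bar{\beta}}_{PLS}) - var(\hat{\bar{\beta}}_{SPRSC}) = (G_1 - G_0)\Lambda^*(G_1 - G_0)'; \quad EG_{SPR} = var(\hat{\bar{\beta}}_{SPR}) - var(\hat{\bar{\beta}}_{SPRSC}) = (G_2 - G_0)\Lambda^*(G_2 - G_0)', \quad (14)$
where $G_0 = (X'\Lambda^{*-1}X)^{-1}X'\Lambda^{*-1}$. Since $\Lambda$, $V$ and $\Lambda^*$ are positive definite matrices, the $EG_{PLS}$ and $EG_{SPR}$ matrices are positive semi-definite. In other words, the SPRSC estimator is more efficient than the PLS and SPR estimators. These efficiency gains increase as $|\rho_i|$ and/or $\psi_k^2$ increase. However, these efficiency gains may not be achieved in practice because the estimates of $\Lambda$ and $\Lambda^*$ are not always positive definite, especially in small samples, as explained above. Therefore, in the following, an alternative estimator is proposed that is more suitable for the model than the SPRSC estimator when the sample size is small. Moreover, the properties of this estimator are studied.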
When Swamy [27] From (15) and (16), and since $\hat{\beta}_i$ is an unbiased estimator of $\beta_i$, we suggest the following estimator as an alternative to SPR and SPRSC:
$\hat{\bar{\beta}}_{SMG} = \frac{1}{N}\sum_{i=1}^{N}\hat{\beta}_i$.
Note that this estimator is the simple average of the ordinary least squares estimators ($\hat{\beta}_i$), so it is defined in the econometric literature^4 as the simple mean group (SMG) estimator. The SMG estimator is also used by Pesaran and Smith [31] for the estimation of dynamic panel data (DPD) models with stochastic parameters.^5 It is easy to verify that the SMG estimator is consistent for $\bar{\beta}$ when both $N, T \to \infty$. Moreover, the statistical properties of the SMG estimator are explained in the following lemma:
Lemma 1: If assumptions 1 to 4 are satisfied, then the SMG estimator is unbiased for $\bar{\beta}$, and a consistent estimator of the variance-covariance matrix of $\hat{\bar{\beta}}_{SMG}$ is:
The next lemma explains the asymptotic variance properties (as $T \to \infty$ with $N$ fixed) of the SPRSC, SPR, and SMG estimators. definite for all $i$, then the estimated asymptotic variance-covariance matrices of the SPRSC, SPR, and SMG estimators are:
Lemma 2 shows that the means and the variance-covariance matrices of the limiting distributions of the SPRSC, SPR, and SMG estimators are the same and are equal to $\bar{\beta}$ and $\frac{1}{N}\Psi^{+}$, respectively, even if the errors are correlated as in assumption 2. Therefore, the SPRSC estimator is not expected to be asymptotically more efficient than SPR and SMG. This does not mean that the SPRSC estimator cannot be more efficient than SPR and SMG in small samples when the errors are correlated as in assumption 2; this is examined in the following Monte Carlo simulation.
^4 Such as [12,17].
^5 For more information about the estimation methods for DPD models, see, e.g., [29][32][33][34][35][36].
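The SMG estimator described above is simple enough to sketch in a few lines of Python for a model with a constant and one regressor. The function names are hypothetical. The variance returned here is the usual mean-group estimator (the cross-unit sample variance of the OLS estimates divided by the number of units); this is an assumed form and may differ from the expression in Lemma 1.

```python
def ols_line(x, y):
    """OLS intercept and slope for one unit (constant plus one regressor)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return (ybar - slope * xbar, slope)


def smg(units):
    """units: list of (x, y) pairs, one per cross-sectional unit.

    Returns the simple average of the unit-wise OLS estimates and an
    assumed mean-group variance estimate for each coefficient.
    """
    betas = [ols_line(x, y) for x, y in units]
    n = len(betas)
    mean = [sum(b[j] for b in betas) / n for j in range(2)]
    var = [sum((b[j] - mean[j]) ** 2 for b in betas) / (n * (n - 1))
           for j in range(2)]
    return mean, var
```

Because only unit-wise OLS fits and a sample variance are involved, no estimate of the parameter covariance matrix has to be inverted, which is why the negative-variance problem discussed earlier does not arise for this estimator.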

The Simulation Studies
In this section, two Monte Carlo simulation studies are conducted. The first examines the problem of negative variance estimates and the power of Swamy's test in different models (non-stochastic, stochastic, and mixed-stochastic) when the sample size is small or moderate. The second compares the behavior of the pooled least squares ($\hat{\bar{\beta}}_{PLS}$), simple mean group ($\hat{\bar{\beta}}_{SMG}$), and stochastic parameter ($\hat{\bar{\beta}}_{SPR}$, $\hat{\bar{\beta}}_{SPRSC}$, and $\hat{\gamma}_{MSPRSC}$) estimators in small samples. The programs used to set up the Monte Carlo simulation studies, written in the R language, are available upon request.^6 Monte Carlo experiments were carried out, in the two studies, based on the following data generating process:
$y_{it} = x_{it}\beta_i + u_{it} = x_{it}\bar{\beta} + x_{it}\mu_i + u_{it}, \quad i = 1, \dots, N;\ t = 1, \dots, T, \quad (19)$
where $x_{it} = (1, x_{1it})$, $\bar{\beta} = (\bar{\beta}_0, \bar{\beta}_1)'$, and $\mu_i = (\mu_{0i}, \mu_{1i})'$.
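The data generating process can be sketched in Python as follows. This is only an illustration, not the R programs mentioned above: the function and argument names are hypothetical, the default values (regressor mean 5 and standard deviation 10, mean parameters of 5, parameter variance 25) follow the design reported for the second study, and normally distributed innovations with a stationary first-period error are assumed. Unlike the design in the paper, this sketch redraws the regressor on every call instead of holding it fixed across trials.

```python
import math
import random


def simulate_panel(n_units, n_periods, beta_bar=(5.0, 5.0),
                   psi_sd=(5.0, 5.0), rho=0.35, sigma_eps=1.0, seed=0):
    """Simulate y_it = b0_i + b1_i * x_it + u_it with AR(1) errors.

    Returns a list of (x, y) pairs, one per cross-sectional unit.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        x = [rng.gauss(5.0, 10.0) for _ in range(n_periods)]
        # Unit-specific parameters: common mean plus a random deviation.
        beta = [b + rng.gauss(0.0, s) for b, s in zip(beta_bar, psi_sd)]
        # First-period error drawn from the stationary distribution.
        u = rng.gauss(0.0, sigma_eps / math.sqrt(1.0 - rho ** 2))
        y = []
        for t in range(n_periods):
            if t > 0:
                u = rho * u + rng.gauss(0.0, sigma_eps)
            y.append(beta[0] + beta[1] * x[t] + u)
        panel.append((x, y))
    return panel
```

Setting an entry of `psi_sd` to zero makes the corresponding parameter non-stochastic, which gives the non-stochastic and mixed-stochastic designs.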

First Study: Negative Variance Estimates and the Power of Test
In this study, the model in (19) was generated as in the second simulation study below, but with the following settings: $\rho = 0$, $\psi_k^2 = 5$, $M = 10000$ Monte Carlo trials, and $N = T = 5, 10, 20, 25,$ and $50$. The simulation results are summarized in Figures 1 and 2. Specifically, Figure 1 presents the percentage of negative variance (PNV) estimates of $\hat{\Psi}$, while the results for the power of Swamy's test are presented in Figure 2. Figure 1 indicates that negative variance estimates do not appear when $N = T \geq 10$ if the parameters are stochastic. However, if one or more of the parameters is non-stochastic, the values of PNV are close to zero when $N = T \geq 50$. Moreover, the values of PNV increase as the value of $\sigma_{\varepsilon}$ increases. Figure 2 indicates that when the parameters are stochastic the power of the test is very high (close to one) when $N = T \geq 20$. However, if one or more of the parameters is non-stochastic, the power of the test remains low even if $N = T \geq 50$. Moreover, the power of the test increases as the value of $\sigma_{\varepsilon}$ decreases.
From Figures 1 and 2, we conclude that if the sample size ($N$ and/or $T$) is less than 20, the efficiency of the stochastic parameter estimators ($\hat{\bar{\beta}}_{SPR}$, $\hat{\bar{\beta}}_{SPRSC}$, and $\hat{\gamma}_{MSPRSC}$) is strongly affected. Therefore, the efficiency of these estimators in small samples is examined next.

Second Study: The Performance of Estimators in Small Samples
In this study, the model in (19) was generated as follows:
1. The values of the independent variable, $x_{1it}$, were generated as independent normally distributed random variables with mean 5 and standard deviation 10. The values of $x_{1it}$ were allowed to differ for each cross-sectional unit. However, once generated for all $N$ cross-sectional units, the values were held fixed over all Monte Carlo trials.
2. The parameters, $\beta_{0i}$ and $\beta_{1i}$, were generated as in assumption 4: $\beta_i = (\beta_{0i}, \beta_{1i})' = \bar{\beta} + \mu_i$, where $\bar{\beta} = (5, 5)'$, and $\mu_i$ were generated as multivariate normal with zero mean vector and variance-covariance matrix $\Psi^* = diag\{\psi_k^2\}$; $k = 0, 1$. The values of $\psi_k^2$ were chosen to be fixed for all $k$ and equal to 0 or 25. Note that when $\psi_k^2 = 0$, the parameters are non-stochastic.
To compare the small-sample performance of the different estimators, three different types of regression parameters (non-stochastic, stochastic, and mixed-stochastic) were designed in this simulation study. To sharpen the comparison between these estimators, the relative efficiency ratio (RER) of each estimator was calculated. The RER of any estimator, for a Monte Carlo experiment, is calculated relative to the appropriate estimator in each model of this simulation study. For example, the RER value of the FPLS estimate of $\bar{\beta}_0$ when all regression parameters are stochastic is calculated as follows:
Step 1: Calculate the mean of the variance over the $M$ Monte Carlo trials for the FPLS and SPRSC estimators, where $\widehat{var}(\hat{\bar{\beta}}_{0,PLS})$ and $\widehat{var}(\hat{\bar{\beta}}_{0,SPRSC})$ are obtained using the feasible formulas for (13) and (4), respectively.
Step 2: Find the RER value from the two means in Step 1.
The simulation results are summarized in Figures 3 to 6. Specifically, Figure 3 displays the results for the stochastic parameter model and Figure 4 for the non-stochastic parameter model. Figure 5 displays the results when the intercept parameter is stochastic and the slope parameter is non-stochastic; we refer to this model as the Mixed-stochastic type-I model. Figure 6 displays the inverse case, when the intercept parameter is non-stochastic and the slope parameter is stochastic; we refer to this model as the Mixed-stochastic type-II model. The different formulas for the variances of the estimators used in this study are summarized in Table 1.
Figure 3 indicates that the values of LRER for SPR and SMG are very close and almost equal to zero for all simulation situations (for every value of $\sigma_{\varepsilon}$ and $\rho$). This means that the efficiency of SPR and SMG is close to that of the SPRSC estimator even if $\sigma_{\varepsilon} = 10$ and $\rho = .95$, so SPR and SMG are good alternative estimators to SPRSC in stochastic parameter models. FPLS, however, is an inefficient estimator (highest LRER) for this model even if $\sigma_{\varepsilon} = 1$ and $\rho = .35$. Figure 4 indicates that the SPR and SPRSC estimators have greater LRER than SMG for every value of $\sigma_{\varepsilon}$ and $\rho$. This means that the SMG estimator is more efficient than the SPR and SPRSC estimators and is a good alternative estimator to FPLS in non-stochastic parameter models. Figures 5 and 6 indicate that FPLS is an inefficient estimator (highest LRER) for this model for every value of $\sigma_{\varepsilon}$ and $\rho$. Also, the SPR and SPRSC estimators have greater LRER than SMG in most situations, especially when the parameter is non-stochastic. The SMG estimator is therefore more efficient than the SPR and SPRSC estimators and is a good alternative estimator to MSPRSC in mixed-stochastic parameter models.

Conclusion
In this paper, the GLS (FPLS, SPR, SPRSC, and MSPRSC) and SMG estimators of panel data models are examined when the errors are first-order serially correlated and the regression parameters are stochastic, non-stochastic, or mixed-stochastic. Efficiency comparisons for these estimators indicate that the SMG and stochastic parameter estimators (SPR, SPRSC, and MSPRSC) are equivalent when $T$ is sufficiently large. Moreover, the performance of all of the estimators above has been investigated by Monte Carlo simulations. The Monte Carlo results suggest that, in the non-stochastic parameter model, the SMG estimator is more efficient than the SPR and SPRSC estimators and is therefore a good alternative estimator to FPLS. In the stochastic parameter model, the FPLS estimator is not suitable, but SPR and SMG are good alternative estimators to SPRSC. In the mixed-stochastic parameter model, only the SMG estimator is a good alternative to MSPRSC. Consequently, we conclude that the SMG estimator is suitable for all three models, especially in small samples and when the model includes one or more non-stochastic parameters.