On the Restricted Poisson Ridge Regression Estimator

Abstract: The Poisson regression model is widely used for modeling count data, in which the response variable takes non-negative integer values. However, strong correlation between the explanatory variables causes multicollinearity, which inflates the variance of the maximum likelihood estimator (MLE) and makes parameter estimation unstable. Multicollinearity can be tackled by using biased estimators, such as the ridge estimator, to reduce the variance of the estimated regression coefficients. An alternative approach is to specify exact linear restrictions on the parameters in addition to the regression model. In this paper, the restricted Poisson ridge regression estimator (RPRRE) is suggested to handle multicollinearity in the Poisson regression model with exact linear restrictions on the parameters. In addition, the conditions under which the suggested estimator is superior to some existing estimators are discussed based on the mean squared error (MSE) matrix criterion. Moreover, a simulation study and a real data application are provided to illustrate the theoretical results. The results indicate that the RPRRE outperforms the other estimators considered in terms of scalar mean squared error (SMSE). Therefore, the RPRRE is recommended for the Poisson regression model when multicollinearity is present.

Another technique for dealing with multicollinearity is to impose exact linear restrictions on the parameters in addition to the regression model. Duffy and Santner [9] presented the restricted maximum likelihood estimator (RMLE) in the logistic regression model. Then, by combining the RMLE with biased estimators, many restricted biased estimators were proposed for the logistic regression model, such as the restricted logistic ridge regression estimator (RLRRE) by Saleh and Kibria [28], the restricted logistic Liu regression estimator (RLLRE) by Şiray et al. [29], and the restricted logistic Liu-type regression estimator (RLLTRE) by Asar et al. [5]. Recently, Månsson and Kibria [17] introduced both the unrestricted and restricted Poisson Liu regression estimators for the Poisson regression model. This paper suggests a new restricted estimator, the restricted Poisson ridge regression estimator (RPRRE), which handles multicollinearity in the Poisson regression model by combining the RMLE and the Poisson ridge regression estimator (PRRE), and compares it with the other estimators considered through a simulation study and a real data application. The rest of this paper is organized as follows. In Section 2, the Poisson regression model specification and estimation are given. In Section 3, the restricted Poisson ridge regression estimator is suggested and its statistical properties are explained. In Section 4, mean squared error (MSE) matrix comparisons of the suggested estimator, RPRRE, and some existing estimators are derived. In Section 5, a simulation study is conducted to examine the performance of the suggested estimator according to the scalar mean squared error (SMSE) criterion. In Section 6, a real data set is analyzed to illustrate some of the theoretical results. Finally, the conclusion is provided in Section 7.

Poisson Regression Model Specification and Estimation
For analyzing count data, the Poisson regression model is the most widely employed. It assumes that the mean and the variance are equal, which is known as equidispersion. Let $y_i$, $i = 1, 2, \ldots, n$, be a count random variable that has a Poisson distribution with probability mass function
$$f(y_i) = \frac{\exp(-\mu_i)\,\mu_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots, \qquad (1)$$
where $\mu_i = \exp(x_i^T \beta)$ is the mean of the response variable $y_i$, $x_i = (x_{i0}, \ldots, x_{ip})^T$ is the $i$th row of $X$, which is an $n \times (p+1)$ matrix of explanatory variables, and $\beta = (\beta_0, \beta_1, \ldots, \beta_p)^T$ is a $(p+1) \times 1$ vector of regression coefficients.
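Using the maximum likelihood estimation method, one can estimate the Poisson regression coefficients by differentiating the log-likelihood function with respect to $\beta$ and setting the derivatives equal to zero. From (1), the likelihood function is given by
$$L(\beta) = \prod_{i=1}^{n} \frac{\exp(-\mu_i)\,\mu_i^{y_i}}{y_i!}. \qquad (2)$$
Then, the log-likelihood function is given as follows:
$$\ell(\beta) = \sum_{i=1}^{n} \left[ y_i x_i^T \beta - \exp(x_i^T \beta) - \log(y_i!) \right]. \qquad (3)$$
From (3), setting the first partial derivative with respect to $\beta$ equal to zero gives
$$\frac{\partial \ell(\beta)}{\partial \beta} = \sum_{i=1}^{n} \left[ y_i - \exp(x_i^T \beta) \right] x_i = 0. \qquad (4)$$
Since Equation (4) is nonlinear in $\beta$, the iterative weighted least squares (IWLS) algorithm can be used to obtain a solution. Then, the MLE of $\beta$ can be obtained by
$$\hat{\beta}_{MLE} = D^{-1} X^T \hat{W} \hat{z},$$
where $D = X^T \hat{W} X$, $\hat{W} = \mathrm{diag}(\hat{\mu}_i)$, and $\hat{z}$ is the vector with $i$th element $\hat{z}_i = \log(\hat{\mu}_i) + (y_i - \hat{\mu}_i)/\hat{\mu}_i$. Consequently, the MSE and SMSE of $\hat{\beta}_{MLE}$ are given by
$$\mathrm{MSE}(\hat{\beta}_{MLE}) = D^{-1}, \qquad \mathrm{SMSE}(\hat{\beta}_{MLE}) = \mathrm{tr}(D^{-1}) = \sum_{j=1}^{p+1} \frac{1}{\lambda_j},$$
where $\lambda_j$ is the $j$th eigenvalue of the matrix $D$. Under multicollinearity some $\lambda_j$ are close to zero, so the SMSE becomes large; this leads to inaccurate parameter estimation and large variances for the estimates (Kibria et al. [14]). Therefore, many other estimators have been suggested for Poisson regression to tackle this problem. One of the most important estimation approaches is to use biased estimators, among them the Poisson ridge regression estimator (PRRE) by Månsson and Shukur [20], which is defined as
$$\hat{\beta}_{PRRE} = Z_k \hat{\beta}_{MLE}, \qquad Z_k = (D + kI)^{-1} D, \quad k \ge 0.$$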
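The IWLS solution of the score equation for the Poisson MLE can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `poisson_mle_iwls`, the zero starting value, and the stopping tolerance are hypothetical choices, and the design matrix `X` is assumed to already contain a column of ones for the intercept. The function returns the estimate together with the weighted cross-product matrix $X^T \hat{W} X$, with $\hat{W} = \mathrm{diag}(\hat{\mu}_i)$, whose eigenvalues determine the SMSE of the MLE.

```python
import numpy as np

def poisson_mle_iwls(X, y, tol=1e-10, max_iter=100):
    """Poisson MLE (log link) by iteratively weighted least squares.

    X : (n, p+1) design matrix, assumed to include the intercept column.
    y : (n,) vector of counts.
    Returns (beta_hat, D) where D = X' W X at the final fitted means.
    """
    beta = np.zeros(X.shape[1])          # assumed starting value
    for _ in range(max_iter):
        eta = X @ beta                   # linear predictor
        mu = np.exp(eta)                 # fitted means, also the IWLS weights
        z = eta + (y - mu) / mu          # working response
        XtW = X.T * mu                   # X' W with W = diag(mu)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(X @ beta)
    D = (X.T * mu) @ X
    return beta, D
```

At convergence the score $X^T(y - \hat{\mu})$ is numerically zero, and `np.trace(np.linalg.inv(D))` gives the asymptotic SMSE of the MLE, which grows quickly when `D` has an eigenvalue near zero.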
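The statistical properties of $\hat{\beta}_{PRRE}$ are given by
$$E(\hat{\beta}_{PRRE}) = (D + kI)^{-1} D \beta, \qquad \mathrm{Bias}(\hat{\beta}_{PRRE}) = -k (D + kI)^{-1} \beta, \qquad \mathrm{Cov}(\hat{\beta}_{PRRE}) = (D + kI)^{-1} D (D + kI)^{-1},$$
$$\mathrm{MSE}(\hat{\beta}_{PRRE}) = (D + kI)^{-1} D (D + kI)^{-1} + k^2 (D + kI)^{-1} \beta \beta^T (D + kI)^{-1},$$
$$\mathrm{SMSE}(\hat{\beta}_{PRRE}) = \sum_{j=1}^{p+1} \frac{\lambda_j}{(\lambda_j + k)^2} + k^2 \sum_{j=1}^{p+1} \frac{\alpha_j^2}{(\lambda_j + k)^2},$$
where $\alpha_j$ is the $j$th element of $Q^T \beta$, and $Q$ is an orthogonal matrix defined so that $Q^T D Q = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_{p+1})$. An alternative technique to improve estimation under multicollinearity is to include prior information on the parameters in the form of exact linear restrictions in addition to the regression model. The resulting estimator is called a restricted estimator.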
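As a hedged illustration of the ridge step, the sketch below computes a Poisson ridge estimate as $(D + kI)^{-1} D \hat{\beta}_{MLE}$ and evaluates the eigenvalue form of its SMSE, $\sum_j \lambda_j/(\lambda_j + k)^2 + k^2 \sum_j \alpha_j^2/(\lambda_j + k)^2$. The function names `prre` and `smse_prre` are hypothetical, and the inputs (the matrix $D = X^T \hat{W} X$, an MLE vector, and a true coefficient vector for the SMSE) are assumed to be supplied by the caller.

```python
import numpy as np

def prre(beta_mle, D, k):
    """Ridge-type estimate (D + kI)^{-1} D beta_mle for a biasing parameter k >= 0."""
    identity = np.eye(D.shape[0])
    return np.linalg.solve(D + k * identity, D @ beta_mle)

def smse_prre(beta, D, k):
    """Scalar MSE of the ridge-type estimator at the coefficient vector beta."""
    lam, Q = np.linalg.eigh(D)           # D = Q diag(lam) Q'
    alpha = Q.T @ beta                   # coefficients in canonical form
    variance = np.sum(lam / (lam + k) ** 2)
    squared_bias = k ** 2 * np.sum(alpha ** 2 / (lam + k) ** 2)
    return variance + squared_bias
```

With `k = 0` both functions reduce to the MLE quantities; increasing `k` lowers the variance term and raises the squared-bias term, which is the trade-off the biasing parameter controls.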
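According to (1), assume that the following exact linear restriction is considered for the parameter vector $\beta$:
$$R\beta = h,$$
where $R$ is a $q \times (p+1)$ known matrix, and $h$ is a $q \times 1$ vector of known constants. Then, the restricted maximum likelihood estimator (RMLE) according to Duffy and Santner [9] is as follows:
$$\hat{\beta}_{RMLE} = \hat{\beta}_{MLE} + D^{-1} R^T (R D^{-1} R^T)^{-1} (h - R\hat{\beta}_{MLE}).$$
Following Şiray et al. [29] and Månsson et al. [18], the SMSE of $\hat{\beta}_{RMLE}$ is as follows:
$$\mathrm{SMSE}(\hat{\beta}_{RMLE}) = \sum_{j=1}^{p+1} a_{jj} + \sum_{j=1}^{p+1} \delta_j^2,$$
where $a_{jj}$ is the $j$th diagonal element of the matrix $Q^T A Q$ with $A = D^{-1} - D^{-1} R^T (R D^{-1} R^T)^{-1} R D^{-1}$, and $\delta_j$ is the $j$th element of the vector $\delta^T Q$ with $\delta = D^{-1} R^T (R D^{-1} R^T)^{-1} (h - R\beta)$.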
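The restricted estimator can be illustrated with a short NumPy sketch. It assumes the usual form of the restricted maximum likelihood correction, $\hat{\beta}_{MLE} + D^{-1} R^T (R D^{-1} R^T)^{-1} (h - R\hat{\beta}_{MLE})$, where $D = X^T \hat{W} X$, $R$ is the restriction matrix, and $h$ the vector of constants; the function name `rmle` is hypothetical, and $R$ is assumed to have full row rank so that the inner matrix is invertible.

```python
import numpy as np

def rmle(beta_mle, D, R, h):
    """Restricted MLE: adjust beta_mle so that the restrictions R beta = h hold.

    D : (p+1, p+1) positive definite matrix X' W X.
    R : (q, p+1) restriction matrix, assumed to have full row rank.
    h : (q,) vector of known constants.
    """
    Dinv_Rt = np.linalg.solve(D, R.T)                 # D^{-1} R'
    gap = h - R @ beta_mle                            # violation of the restrictions
    return beta_mle + Dinv_Rt @ np.linalg.solve(R @ Dinv_Rt, gap)
```

The returned vector satisfies the restrictions exactly, and it coincides with the unrestricted estimate whenever that estimate already satisfies them.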

The Restricted Poisson Ridge Regression Estimator
In this section, following (10) and (17), the restricted Poisson ridge regression estimator (RPRRE) is suggested and is defined by the following form:
$$\hat{\beta}_{RPRRE} = Z_k \hat{\beta}_{RMLE},$$
where $Z_k = (D + kI)^{-1} D$, $k \ge 0$.
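Under the exact restrictions $R\beta = h$, the statistical properties of $\hat{\beta}_{RPRRE}$ are given by
$$E(\hat{\beta}_{RPRRE}) = Z_k \beta, \qquad \mathrm{Bias}(\hat{\beta}_{RPRRE}) = -k (D + kI)^{-1} \beta, \qquad \mathrm{Cov}(\hat{\beta}_{RPRRE}) = Z_k A Z_k^T,$$
where $Z_k = (D + kI)^{-1} D$ and $A = D^{-1} - D^{-1} R^T (R D^{-1} R^T)^{-1} R D^{-1}$. Consequently, the MSE of $\hat{\beta}_{RPRRE}$ is given as
$$\mathrm{MSE}(\hat{\beta}_{RPRRE}) = Z_k A Z_k^T + k^2 (D + kI)^{-1} \beta \beta^T (D + kI)^{-1}.$$
Following Najarian et al. [23], the SMSE of $\hat{\beta}_{RPRRE}$ can be obtained by
$$\mathrm{SMSE}(\hat{\beta}_{RPRRE}) = \sum_{j=1}^{p+1} \frac{\lambda_j - g_{jj}}{(\lambda_j + k)^2} + k^2 \sum_{j=1}^{p+1} \frac{\alpha_j^2}{(\lambda_j + k)^2},$$
where $g_{jj}$ is the $j$th element of $\mathrm{diag}(G)$, and $G = Q^T R^T (R D^{-1} R^T)^{-1} R Q$.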
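The following sketch shows one way the restricted ridge estimator and its scalar MSE could be computed. It assumes the estimator is obtained by premultiplying the restricted MLE by $Z_k = (D + kI)^{-1} D$, and that the restrictions $R\beta = h$ hold exactly, so that the bias comes only from the ridge shrinkage. The function names `rprre` and `smse_rprre` are hypothetical. The SMSE is evaluated through the diagonal of $Q^T R^T (R D^{-1} R^T)^{-1} R Q$, where $Q$ holds the eigenvectors of $D$.

```python
import numpy as np

def rprre(beta_mle, D, R, h, k):
    """Restricted ridge-type estimate Z_k * (restricted MLE), Z_k = (D + kI)^{-1} D."""
    Dinv_Rt = np.linalg.solve(D, R.T)
    beta_rmle = beta_mle + Dinv_Rt @ np.linalg.solve(R @ Dinv_Rt, h - R @ beta_mle)
    return np.linalg.solve(D + k * np.eye(D.shape[0]), D @ beta_rmle)

def smse_rprre(beta, D, R, k):
    """Scalar MSE at beta, assuming the restrictions R beta = h are exactly true."""
    lam, Q = np.linalg.eigh(D)                        # D = Q diag(lam) Q'
    alpha = Q.T @ beta
    RQ = R @ Q
    inner = R @ np.linalg.solve(D, R.T)               # R D^{-1} R'
    g = np.diag(RQ.T @ np.linalg.solve(inner, RQ))    # diagonal of Q'R'(R D^{-1} R')^{-1} R Q
    variance = np.sum((lam - g) / (lam + k) ** 2)
    squared_bias = k ** 2 * np.sum(alpha ** 2 / (lam + k) ** 2)
    return variance + squared_bias
```

Because the diagonal elements `g` are non-negative, the variance term here never exceeds the corresponding unrestricted ridge term $\sum_j \lambda_j/(\lambda_j + k)^2$ for the same `k`.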

The Comparisons of the Estimators
In this section, the superiority of the suggested estimator, RPRRE, over the MLE, PRRE, and RMLE is examined using the MSE matrix criterion.
The following lemmas will be used in the comparisons among the estimators: Lemma 1: Assume that $A$ and $B$ are square matrices of the same order such that $A > 0$ and $B \ge 0$. Then, $A + B > 0$.

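Theorem. The RPRRE is always superior to the PRRE.
Proof. Under the restrictions $R\beta = h$ the two estimators have the same bias, so $\mathrm{MSE}(\hat{\beta}_{PRRE}) - \mathrm{MSE}(\hat{\beta}_{RPRRE}) = Z_k T Z_k^T$ with $T = D^{-1} R^T (R D^{-1} R^T)^{-1} R D^{-1}$. Since $T$ is nonnegative definite, so is $Z_k T Z_k^T$. Hence, the RPRRE is always superior to the PRRE in the sense of MSE.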

RPRRE Versus RMLE
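From (21) and (27), the difference $\mathrm{MSE}(\hat{\beta}_{RMLE}) - \mathrm{MSE}(\hat{\beta}_{RPRRE})$ can be computed. Therefore, the RPRRE is superior to the RMLE if and only if $\lambda_{\max}(N_2 M^{-1}) < 1$, where $\lambda_{\max}(\cdot)$ denotes the maximum eigenvalue.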

Simulation Study
In order to assess the performance of the suggested estimator, RPRRE, relative to the MLE, PRRE, and RMLE by means of the SMSE, a Monte Carlo simulation study is performed. The response variable of the Poisson regression model is generated using pseudo-random numbers from the Poisson distribution with mean $\mu_i = \exp(x_i^T \beta)$, where the slope coefficients are set equal, $\beta_1 = \beta_2 = \cdots = \beta_p$. According to McDonald and Galarneau [21] and Kibria [12], the following formula is considered to generate the explanatory variables:
$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \quad i = 1, \ldots, n, \; j = 1, \ldots, p, \qquad (33)$$
where $\rho^2$ represents the coefficient of correlation between any two explanatory variables, and $z_{ij}$ are independent pseudo-random numbers generated from the standard normal distribution. Four explanatory variables are generated by (33). The simulation is repeated 1000 times for each combination of $n$, $\rho$, and $k$, and the simulated SMSE is computed by
$$\mathrm{SMSE}(\hat{\beta}^{*}) = \frac{1}{1000} \sum_{l=1}^{1000} (\hat{\beta}^{*}_{l} - \beta)^T (\hat{\beta}^{*}_{l} - \beta),$$
where $\hat{\beta}^{*}_{l}$ is the value of any of the estimators in the $l$th replication. The results of the simulation are given in Tables 1-3.
According to Tables 1-3, the estimated SMSE values of the MLE, PRRE, RMLE, and RPRRE increase as the degree of correlation increases. In addition, as the sample size increases, the estimated SMSE values of all estimators decrease. Also, increasing the biasing parameter $k$ decreases the estimated SMSE values of the PRRE and RPRRE. In all cases, the MLE has the worst performance (the largest SMSE value). Moreover, for all selected values of $n$, $\rho$, and $k$, the RPRRE has the best performance compared to the MLE, PRRE, and RMLE, since it has the smallest SMSE value. Therefore, the RPRRE can be used in practical applications to tackle multicollinearity in Poisson regression.
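Two building blocks of such a simulation can be sketched as follows. The first generates correlated explanatory variables with the McDonald and Galarneau construction, $x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}$, so that any two columns have correlation $\rho^2$; the second averages the squared estimation error over replications. The function names, the seed, and the sample size in the usage lines are illustrative assumptions, not the settings used in the study.

```python
import numpy as np

def generate_explanatory(n, p, rho, rng):
    """Draw an (n, p) matrix whose columns have pairwise correlation rho**2."""
    z = rng.standard_normal((n, p + 1))               # independent N(0, 1) draws
    shared = z[:, [p]]                                # common component z_{i,p+1}
    return np.sqrt(1.0 - rho ** 2) * z[:, :p] + rho * shared

def simulated_smse(estimates, beta):
    """Average of (b_l - beta)'(b_l - beta) over the rows b_l of `estimates`."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(beta, dtype=float)
    return np.mean(np.sum(diff ** 2, axis=1))

# Illustrative use with hypothetical settings: four regressors, rho = 0.9.
rng = np.random.default_rng(2024)
X = generate_explanatory(n=50, p=4, rho=0.9, rng=rng)
```

In a full study, each replication would draw the response from a Poisson distribution with mean $\exp(x_i^T \beta)$, compute every estimator, and pass the stacked estimates to `simulated_smse`.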

Real Data Application
In this section, an application to real data from the 2018 FIFA World Cup, obtained from https://www.fifa.com, is considered. The data set involves 32 teams. The response variable is the number of matches won, and the 5 explanatory variables are the number of goals scored ($X_1$), the number of goals conceded ($X_2$), the number of clean sheets ($X_3$), the number of shots ($X_4$), and the number of assists ($X_5$).
First, to check the adequacy of the fit of the Poisson regression model to these data, the residual deviance test is used. The residual deviance is 11.121 with 27 degrees of freedom and the p-value is 0.9970, so the data set is well fitted by the Poisson regression model. Also, the condition number (CN) is used to check for multicollinearity among the explanatory variables; it is computed from $\lambda_{\max}$ and $\lambda_{\min}$, the maximum and minimum eigenvalues of the matrix $D$, respectively. The estimated SMSE values of the estimators are given in Table 4, and the coefficients of the Poisson regression and the corresponding standard errors (SE) of these estimators are given in Table 5 for different values of $k$.
From Table 4, the RPRRE has the lowest SMSE value for all values of $k$ considered, while the largest is obtained by the MLE, which suffers from multicollinearity. Additionally, from Table 5, the standard errors of all coefficients decrease as $k$ increases for all estimators, and the RPRRE has the lowest values, confirming its superiority over the MLE, PRRE, and RMLE.
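The two diagnostics used here can be sketched in NumPy. The deviance function implements the standard Poisson residual deviance, $2 \sum_i [y_i \log(y_i/\hat{\mu}_i) - (y_i - \hat{\mu}_i)]$, with the convention that the logarithmic term is zero when $y_i = 0$. For the condition number, the square root of the eigenvalue ratio is used, which is one common convention and an assumption here, since the exact formula is not reproduced in the text. Both function names are hypothetical.

```python
import numpy as np

def poisson_deviance(y, mu):
    """Residual deviance of a Poisson fit with observed counts y and fitted means mu."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    y_safe = np.where(y > 0, y, 1.0)                  # avoids log(0); term is 0 when y = 0
    log_term = np.where(y > 0, y * np.log(y_safe / mu), 0.0)
    return 2.0 * np.sum(log_term - (y - mu))

def condition_number(D):
    """sqrt(lambda_max / lambda_min) of a symmetric positive definite matrix D (assumed form)."""
    lam = np.linalg.eigvalsh(D)
    return np.sqrt(lam.max() / lam.min())
```

The deviance would then be compared with a chi-square distribution on the residual degrees of freedom, and a large condition number of $X^T \hat{W} X$ would indicate multicollinearity.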

Conclusion
In this paper, the RPRRE was suggested for the Poisson regression model with exact linear restrictions on the parameters to tackle the problem of multicollinearity. Further, based on the MSE matrix criterion, conditions for the superiority of the RPRRE over the MLE, PRRE, and RMLE were given. Moreover, a Monte Carlo simulation study and a real data application were conducted to compare the performance of the RPRRE with that of the MLE, PRRE, and RMLE according to the SMSE criterion. The results indicate that the RPRRE is superior to the MLE, PRRE, and RMLE in the SMSE sense, so it is a better alternative to these estimators in Poisson regression when multicollinearity is present. Therefore, the RPRRE is recommended for applications in which multicollinearity must be addressed.