Poisson Ridge Regression Estimators: A Performance Test

In multiple regression analysis, the independent variables are assumed to be uncorrelated with one another; when this assumption fails, the problem of multicollinearity occurs. Multicollinearity can produce inaccurate estimates of the regression coefficients, inflate the standard errors of the regression coefficients, deflate the partial t-tests for the regression coefficients, give misleading p-values and degrade the predictive ability of the model. Several methods exist to address this problem, and one of the best known is ridge regression. The purpose of this research is to study the performance of some popular ridge regression estimators for Poisson regression models in the presence of multicollinearity, based on the effects of sample size and correlation level on their Average Mean Square Error (AMSE). The average MSE of k was used as the performance criterion. A Monte Carlo simulation study was conducted to compare the performance of fifty (50) k estimators under four experimental conditions, namely: correlation, number of explanatory variables, sample size and intercept. From the results of the analysis as summarized in the Tables, the MSEs of the estimators were smaller with fewer explanatory variables and a larger intercept value. It was also observed that some estimators performed better on the average than others at all correlation levels, sample sizes, intercept values and numbers of explanatory variables.


Introduction
In multiple regression analysis, predictions are made for one variable on the basis of several other variables. For example, suppose we were interested in predicting the level of exposure of an individual; variables such as age, gender, environment, academic qualifications and occupation might all contribute towards the level of exposure. Another example could be when analyzing why people move to cities; job opportunities, availability of basic amenities, schools and standard of living are important factors. In both examples, some or all covariates would be highly correlated. In the case of highly correlated explanatory variables, it is usually impossible to interpret the estimates of the individual coefficients. Such a problem is often referred to as multicollinearity.
Multicollinearity exists whenever two or more of the predictor variables in a regression model are moderately or highly correlated. Multicollinearity can create inaccurate estimates of the regression coefficients, inflate the standard errors of the regression coefficients, deflate the partial t-tests for the regression coefficients, give false p-values and degrade the predictability of the model.
The ridge regression method is a well-known and efficient remedial measure in the presence of multicollinearity. The method was first introduced by Hoerl and Kennard [6], who showed both analytically and by means of Monte Carlo simulations that it attains a smaller total Mean Square Error (MSE) than the OLS estimator. They suggested that a small positive number k (k > 0) be added to the diagonal elements of the X'X matrix from the multiple regression, and the resulting estimator,

β̂_RR = (X'X + kI)^(-1) X'y,

is known as the ridge regression (RR) estimator. Different techniques for estimating the ridge parameter k have been suggested by many researchers: Hoerl and Kennard [6], Hoerl and Kennard [7], Algamal and Alanaz [3], Asar and Genc [4], Asar and Genc [5], Kaciranlar and Dawoud [8], Kibria et al [12], Mansson and Shukur [13], Alkhamisi et al [1], Alkhamisi and Shukur [2] and Muniz and Kibria [14], to mention but a few. However, work on the estimation of the ridge parameter under the Poisson regression model is limited. Schaeffer et al [16] worked on the simple Poisson regression model, Muniz and Kibria [14] considered Poisson RR models, Kibria et al [11] generalized some ridge parameter estimators proposed for logistic regression by Kibria et al [12] to Poisson ridge regression, and Zaldivar [17] considered the performance of some Poisson ridge regression estimators.
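The RR estimator simply augments the diagonal of X'X by k before inversion. A minimal Python sketch (illustrative only; the paper itself provides no code, and its computations were done in R):

```python
import numpy as np

def ridge_estimate(X, y, k):
    """Linear ridge regression estimator: beta = (X'X + kI)^(-1) X'y, k >= 0.

    With k = 0 this reduces to the OLS estimator; k > 0 shrinks the
    coefficients toward zero, trading a little bias for lower variance.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```

Because the eigenvalues of X'X are inflated by k, the solve stays stable even when X'X is near-singular, which is exactly the multicollinear setting discussed above.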
One of the standard statistical methods for analyzing count data is the Poisson Regression (PR) model. This model has found widespread use in microeconometrics when the dependent variable y of the regression model is Poisson distributed. In the presence of multicollinearity, when the parameters of the Poisson regression model are estimated using the maximum likelihood (ML) method, the estimated parameters become unstable with high variance, resulting in an increased probability of committing a Type II error in any hypothesis test regarding the estimated parameters (Kibria et al [11]).
The purpose of this article is to determine the best estimators among 50 selected ridge parameter estimators (k), based on the effects of sample size and correlation level on their performance. The performance of the estimators is judged by the average MSE of k.

Literature
Zaldivar [17] investigated some Ridge Regression (RR) estimators for estimating the ridge parameter of the Poisson regression model and proposed five new estimators. A simulation study was conducted to compare the performance of estimators from the literature with the newly proposed ones, using the average Mean Square Error (MSE), the percentage of times Poisson Ridge Regression (PRR) outperforms MLE, and the Average Mean Absolute Percentage Error (AMAPE) as performance criteria. The five new estimators performed well, producing small MSE values. A real-life data set was also used to illustrate the findings.
Kibria et al [11] conducted a simulation study of some biasing parameters for ridge-type estimation of Poisson regression, generalizing some estimators proposed for logistic regression by Kibria et al [12]. In that work they included the average value of k and the standard deviation of k as performance criteria alongside the Mean Square Error (MSE). These criteria are very informative because, when several estimators have equal estimated MSE, those with a low average value and a low standard deviation of k should be preferred.
Muniz and Kibria [14] proposed a Poisson Ridge Regression (PRR) estimator; by means of Monte Carlo simulations they evaluated the traditional ML estimator and this new method using different estimators of the ridge parameter k. The results of the simulation study showed that the sample size, the value of the intercept, the number of independent variables and the correlation between the independent variables are important factors for the performance of the different estimation methods. The results also showed that the proposed PRR method, regardless of which ridge estimator is used, has a lower MSE than the ML method in all the situations evaluated. Many other researchers have worked on ridge and Poisson estimators, with varying conclusions; see Hoerl and Kennard [6], Hoerl and Kennard [7], Algamal and Alanaz [3], Asar and Genc [4], Asar and Genc [5], Kaciranlar and Dawoud [8], Kibria et al [12], Mansson and Shukur [13], Alkhamisi et al [1], Alkhamisi and Shukur [2], Khalaf Ghazi [9] and Kibria [10].

Methodology
This section starts by defining the Poisson Regression model and the traditional ML estimation method.

The Poisson Regression
Poisson regression is similar to regular multiple regression except that the dependent variable y is an observed count that follows the Poisson distribution. Thus the possible values of y are the non-negative integers: 0, 1, 2, 3, and so on. The model of this regression is given as

y = μ + ε,  μ = exp(Xβ),

where y is an n × 1 vector of responses that is Poisson distributed, X is an n × (p + 1) data matrix with p explanatory variables, β is a (p + 1) × 1 vector of coefficients and ε is an n × 1 vector of random errors.
The parameters of this model are estimated using the Maximum Likelihood (ML) method and the following iteratively reweighted least-squares algorithm:

β̂_ML = (X'ŴX)^(-1) X'Ŵẑ,

where ẑ is a vector whose ith element equals ẑ_i = log(μ̂_i) + (y_i − μ̂_i)/μ̂_i, and Ŵ is a matrix whose off-diagonal elements are zero and whose ith diagonal element equals μ̂_i, with μ̂_i = exp(x_i'β̂) and x_i the ith row of X.
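The iteration above can be sketched in a few lines of Python (a hypothetical helper for illustration; the paper supplies no code, and its simulations were run in R):

```python
import numpy as np

def poisson_irls(X, y, n_iter=25, tol=1e-8):
    """Iteratively reweighted least squares for Poisson regression (log link).

    Implements the update beta = (X'WX)^(-1) X'W z with
    z_i = log(mu_i) + (y_i - mu_i)/mu_i and W = diag(mu_i).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)              # fitted means mu_i = exp(x_i' beta)
        z = np.log(mu) + (y - mu) / mu     # working response z_i
        XtW = X.T * mu                     # X'W without forming the full W matrix
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

Starting from beta = 0 gives mu_i = 1 for all i, so the first step is an ordinary least-squares fit to y − 1; subsequent steps reweight by the current fitted means.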

Poisson Ridge Regression
In the presence of multicollinearity, the weighted matrix of cross products (X'ŴX) is near-singular. For this model, the following extension of the linear ridge regression estimator (RRE) was proposed by Mansson and Shukur [13]:

β̂_PRR = (X'ŴX + kI)^(-1) X'ŴX β̂_ML,  k > 0.
They showed that the above estimator limits the increase of the weighted sum of squared errors. The aim of Poisson Ridge Regression (PRR) is to find a value of k that is large enough to stabilize the estimates, but not so large that it introduces substantial bias. When such a k is found, the MSE of the ridge regression (RR) estimator is smaller than that of the ML estimator. The MSE of the RR estimator equals

MSE(β̂_PRR) = Σ_j λ_j/(λ_j + k)^2 + k^2 Σ_j α_j^2/(λ_j + k)^2,

where λ_j is the jth eigenvalue of X'ŴX and α_j is the jth coefficient in the canonical (eigenvector) parameterization; the first term is the variance and the second is the squared bias.
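The ridge adjustment to a fitted ML coefficient vector can be sketched as follows (an illustrative Python fragment under the formula β̂_PRR = (X'ŴX + kI)^(-1) X'ŴX β̂_ML; function name and interface are assumptions, not the authors' code):

```python
import numpy as np

def poisson_ridge(X, beta_ml, k):
    """Poisson ridge estimator (X'WX + kI)^(-1) X'WX beta_ML,
    with W = diag(exp(x_i' beta_ML)) evaluated at the ML fit."""
    mu = np.exp(X @ beta_ml)                 # fitted means from the ML fit
    XtWX = X.T @ (X * mu[:, None])           # X'WX without forming W explicitly
    p = XtWX.shape[0]
    return np.linalg.solve(XtWX + k * np.eye(p), XtWX @ beta_ml)
```

Setting k = 0 recovers β̂_ML exactly, and any k > 0 shrinks every canonical coordinate by the factor λ_j/(λ_j + k) < 1.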

Some Methods for Estimating the Ridge Parameter (k)
Various methods have been suggested by different researchers for estimating the ridge parameter k.
Hoerl and Kennard [6] were the first to propose a ridge parameter estimator, and their estimator formed the basis upon which other estimators were proposed. This first estimator is

k̂_HK = σ̂^2 / α̂_max^2,

where σ̂^2 is the residual variance estimate and α̂_max is the maximum element of α̂, the coefficient vector expressed in the canonical (eigenvector) parameterization. Muniz et al [14] and Kibria et al [12] proposed k_5 to k_8, and Kibria et al [11] proposed k_9 to k_16; these estimators are functions of the λ_j and the α̂_j, where λ_j is the jth eigenvalue of X'ŴX.
Muniz et al [14] also proposed estimators based on square-root transformations of the quantities above.

Performance of the Estimators
To investigate the performance of the Poisson Ridge Regression (PRR) and Maximum Likelihood (ML) methods, the Average MSE (AMSE) is obtained using the following equation:

AMSE(β̂) = (1/H) Σ_{h=1}^{H} (β̂_h − β)'(β̂_h − β),

where H = 1000 is the number of replicates, β̂_h is the estimator of β obtained from ML or PRR in the hth replicate, and (β̂_h − β)'(β̂_h − β) is the squared error (SE).
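The AMSE over replicates reduces to a one-line computation; the following Python sketch (an illustrative helper, not the authors' code) makes the averaging explicit:

```python
import numpy as np

def amse(beta_hats, beta_true):
    """Average MSE over H replicates: (1/H) * sum_h (b_h - beta)'(b_h - beta).

    beta_hats: H x p array of estimates, one row per replicate.
    beta_true: length-p vector of true coefficients.
    """
    diffs = np.asarray(beta_hats) - np.asarray(beta_true)
    return np.mean(np.sum(diffs**2, axis=1))
```

For example, two replicates [1, 1] and [3, 3] against the truth [2, 2] each have squared error 2, so the AMSE is 2.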

The Monte Carlo Simulation
This section consists of a brief description of how the data is generated and the factors varied in the simulation study.

The Design of the Experiment
The dependent variable y of the Poisson regression model is generated in R using pseudo-random numbers from the Poisson distribution, with mean

μ_i = exp(β_0 + β_1 x_{i1} + ⋯ + β_p x_{ip}),  i = 1, 2, …, n.    (4.1)

The parameter values in equation (4.1) are chosen so that Σ_{j=1}^{p} β_j^2 = 1 and β_1 = β_2 = ⋯ = β_p. These are common restrictions in many simulation studies (Zaldivar [17]).
The independent variables are generated as follows:

x_{ij} = (1 − ρ^2)^(1/2) z_{ij} + ρ z_{i,p+1},  i = 1, …, n; j = 1, …, p,

where the z_{ij} are pseudo-random numbers from the standard normal distribution and ρ^2 represents the degree of correlation between any two explanatory variables. In the design of the experiment, three values of ρ^2 are considered: 0.85, 0.90 and 0.99. Other factors varied include the sample size (n = 50 and 200), the number of explanatory variables (p = 4 and 8) and the intercept value (β_0 = −1, 0, 1).
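The design above can be reproduced in a few lines. This Python sketch mirrors the paper's R setup under the stated restrictions (equal slopes with Σβ_j² = 1); the function name and defaults are assumptions for illustration:

```python
import numpy as np

def generate_data(n=50, p=4, rho2=0.90, intercept=1.0, seed=0):
    """Generate collinear predictors x_ij = sqrt(1-rho2)*z_ij + sqrt(rho2)*z_{i,p+1}
    (so any two columns have correlation rho2) and Poisson responses with
    mean exp(beta_0 + x_i' beta), using equal slopes with sum beta_j^2 = 1."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, p + 1))                    # z_ij and shared z_{i,p+1}
    X = np.sqrt(1 - rho2) * z[:, :p] + np.sqrt(rho2) * z[:, [p]]
    beta = np.full(p, 1.0 / np.sqrt(p))                    # equal slopes, norm 1
    mu = np.exp(intercept + X @ beta)                      # Poisson means (4.1)
    y = rng.poisson(mu)
    return X, y, beta
```

The shared standard-normal column z_{i,p+1} is what induces the pairwise correlation ρ² among the predictors, so raising rho2 toward 0.99 makes X'X increasingly ill-conditioned.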

Results / Findings
In this work, multicollinearity in the Poisson regression model was discussed and the performance of the fifty selected ridge parameter estimators was compared by Monte Carlo simulation; the detailed AMSE values are summarized in the Tables.

Conclusion
Simulations were varied by sample size (n); the sample sizes used were n = 50 and n = 200. Generally, the larger the sample size, the smaller the MSE values. It is also important to note that the sample size affects the performance of individual k estimators. With n = 50 and p = 4, the k estimators with the lowest total MSE values were k37, k36, k34, k45, k46, k8, k35, k47, k40 and k21. With n = 50 and p = 8, the k estimators with the lowest total MSE values were k36, k37, k14, k16, k12, k34, k7, k8, k45, k40, k46, k21, k35, k29 and k27. With n = 200 and p = 8, the estimators with the lowest total MSE values were k36, k37, k12, k14, k16, k34, k40, k45, k46, k7, k35, k8 and k47.
Simulations were also varied by the number of explanatory variables (p), with p = 4 and 8. It was found that the total MSE increases as p increases; therefore, PRR is best used when the number of explanatory variables is small.
Estimators were judged by their total MSE values across the different sample sizes, correlation levels, numbers of explanatory variables and intercept values. The estimators that produced the lowest total MSE values were: k34, k40, k45, k35, k36, k12, k47, k21, k37 and k8.
In conclusion, the MSEs of the estimators were smaller with fewer explanatory variables and a larger intercept value. It was also observed that a handful of the estimators listed above performed better on the average at all correlation levels, sample sizes, intercept values and numbers of explanatory variables.