Regularized Nonlinear Least Trimmed Squares Estimator in the Presence of Multicollinearity and Outliers

This study proposes a regularized robust Nonlinear Least Trimmed Squares estimator that relies on an Elastic net penalty in nonlinear regression. The regularization parameters were selected using a robust cross-validation criterion, and the optimal model coefficients were estimated through a Newton-Raphson iterative algorithm. A Monte Carlo simulation was conducted to verify the theoretical properties outlined in the methodology, covering scenarios with and without multicollinearity and outliers. The proposed procedure performed well compared to NLS and NLTS, yielding relatively lower values of MSE and Bias. Furthermore, a real data analysis demonstrated satisfactory performance of the suggested technique.


Introduction
Nonlinear regression is one of the most widely used models for analyzing the effect of explanatory variables on a response variable. For example, Khademzadeh et al. [9] used MapReduce to model large-scale nonlinear regression problems, and Ramalho and Ramalho [13] used moment-based estimation of nonlinear regression models with boundary outcomes and endogeneity for nonnegative and fractional responses. Moreover, Tabatabai et al. [15] used robust nonlinear regression to estimate drug concentration and tumor size-metastasis.
The field of computational statistics has developed over time with the introduction of the computer, which has expanded statistical methods and algorithms and enabled the generation and storage of significant amounts of data. Studies in genomics, medicine, epidemiology, marketing and the basic sciences frequently encounter data sets in which collinearity among the study variables affects the analysis [2].
Regularization techniques have been widely used to solve ill-posed problems arising in maximum likelihood or least squares estimation and have proved successful in several fields, including regression analysis and machine learning [16]. The methodology involves adding restrictions or a penalty to a model [14], with the objective of preventing overfitting [11]. It also improves predictive accuracy in situations with many predictor variables, high collinearity, a need for a sparse solution, or grouped variables in a high-dimensional dataset [5]. Tikhonov [17] developed this mathematical technique of regularization while working on the solution of ill-posed problems. In the vast literature, penalty functions based on several norms, such as the $\ell_1$ (Lasso) and $\ell_2$ (ridge) norms, are available. Ando, Konishi and Imoto [1] introduced radial basis functions with a hyperparameter and constructed nonlinear regression models with the help of regularization. Tateishi, Matsui and Konishi [16] constructed nonlinear regression models with Gaussian basis functions using a weighted regularization for analyzing data with complex structures. Farnoosh, Ghasemian and Fard [4] proposed a weighted ridge penalty on a fuzzy nonlinear regression model using fuzzy numbers and Gaussian basis functions. Jiang, Jiang and Song [7] developed weighted composite regression estimation and used Adaptive Lasso and SCAD regularization to achieve simultaneous model estimation and variable selection. Zucker et al. [19] developed an approximate version of the Stefanski-Nakamura corrected-score approach, using regularization to obtain an approximate solution of the relevant integral equation, and Hang et al. [6] proposed a graph-regularized nonlinear ridge regression (RR) model for remote sensing data analysis, including hyperspectral image classification and atmospheric aerosol retrieval.
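As a concrete illustration (not drawn from the cited works), Tikhonov regularization in the linear case yields the familiar ridge estimate $\hat{\boldsymbol{\beta}} = (X^{\top}X + \lambda I)^{-1}X^{\top}y$. A minimal sketch:

```python
import numpy as np

# Minimal sketch of Tikhonov (ridge) regularization in the linear case,
# given for illustration only: adding lam * I stabilizes the inversion
# of X'X when the columns of X are nearly collinear.
def ridge_estimate(X, y, lam):
    """Return (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

With `lam = 0` this reduces to ordinary least squares when X has full column rank; increasing `lam` shrinks the estimate toward zero.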
Although these regularization methods have shown exceptional performance in various fields, they rely on the least squares loss function, which is influenced by outliers [12]. Outlier-resistant regularized methods have been developed by replacing the least squares loss function with a robust alternative [5]. Among these methods are Huber's M-estimators, MM-estimators, Least Trimmed Squares and Least Median of Squares estimators. Lim [10] proposed robust Ridge regression estimation procedures for nonlinear models with a varying variance structure.
Regularization parameter selection is an essential problem in Lasso-type methods, since effective variable selection and model estimation depend on well-tuned parameters. AIC, BIC, Cross-validation, Mallows' $C_p$ criterion and the generalized information criterion have been suggested for choosing regularization parameters. However, Lasso-type penalties are not differentiable at zero, so the estimators cannot be derived analytically; local quadratic approximation, LARS and coordinate descent algorithms have been developed to settle this issue.
The Elastic net penalty, a regularization method, has been established to encourage a grouping effect when predictors are highly correlated, and is also useful when there are more predictors than observations in linear regression [18]. Section 2 introduces the proposed regularized robust nonlinear Least Trimmed Squares estimator with an Elastic net penalty in nonlinear regression models. In Section 3, the procedure for estimation and parameter selection of the proposed methodology is derived. Lastly, the efficiency of the strategy is investigated through a Monte Carlo study and a real data analysis in Section 4.

Regularized Nonlinear Least Trimmed Squares Regression
The nonlinear regression model has the form
$$y_i = f(\mathbf{x}_i, \boldsymbol{\beta}) + \varepsilon_i, \qquad i = 1, 2, \ldots, n, \qquad (1)$$
where $y_i$ is the response, $f$ is a known nonlinear function, $\boldsymbol{\beta}$ is the $p$-vector of unknown parameters and $\varepsilon_i$ is the error term. However, the error distribution in model (1) may exhibit heavy tails and asymmetry due to data inadequacies, often caused by measurement or recording errors. Robust techniques are commonly used to accommodate the violations of assumptions that such errors cause.
The Nonlinear Least Trimmed Squares (NLTS) estimator is a high-breakdown regression technique which safeguards against wild observations [3]. It estimates the parameters of model (1) by minimizing the sum of the $s$ smallest squared residuals,
$$\hat{\boldsymbol{\beta}}_{\mathrm{NLTS}} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{s} r_{(i)}^2(\boldsymbol{\beta}), \qquad (2)$$
where $r_{(1)}^2(\boldsymbol{\beta}) \le r_{(2)}^2(\boldsymbol{\beta}) \le \cdots \le r_{(n)}^2(\boldsymbol{\beta})$ are the ordered squared residuals and $r_i(\boldsymbol{\beta}) = y_i - f(\mathbf{x}_i, \boldsymbol{\beta})$. The nonlinear function is linearized by a first-order Taylor series expansion about an initial value $\boldsymbol{\beta}_0$,
$$f(\mathbf{x}_i, \boldsymbol{\beta}) \approx f(\mathbf{x}_i, \boldsymbol{\beta}_0) + \nabla f(\mathbf{x}_i, \boldsymbol{\beta}_0)^{\top}(\boldsymbol{\beta} - \boldsymbol{\beta}_0). \qquad (3)$$
If the Jacobian $\mathbf{J}(\boldsymbol{\beta}_0)$ of $f$ at $\boldsymbol{\beta}_0$ is of full rank $p$, the Gauss-Newton step is unique and is presented as
$$\boldsymbol{\beta}_1 = \boldsymbol{\beta}_0 + \left(\mathbf{J}^{\top}\mathbf{J}\right)^{-1}\mathbf{J}^{\top}\mathbf{r}(\boldsymbol{\beta}_0).$$
However, in the presence of multicollinearity this inversion may be ill-conditioned and the step unreliable. The study therefore proposes to enhance the estimation performance of NLTS by minimizing the penalized objective function
$$Q(\boldsymbol{\beta}) = \sum_{i=1}^{s} r_{(i)}^2(\boldsymbol{\beta}) + \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=1}^{p} \beta_j^2. \qquad (4)$$
The objective function (4) combines the NLTS loss with an Elastic net penalty, and is envisaged to produce stable model estimates when multicollinearity is present together with outliers. The NLTS-Elnet estimator is based only on the $s$ observations with the smallest residuals, so that the effect of outliers is relaxed, where $n/2 < s \le n$. The trimming constant $s$ determines the breakdown point of the estimator; $s = 0.75n$ is used here to include a sufficient number of observations [8].
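The trimmed objective can be sketched numerically as follows. The model function `f` and the penalty weights `lam1`, `lam2` are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

# Sketch of the NLTS-Elnet objective: the sum of the s smallest squared
# residuals plus an elastic net penalty. The choice of model function f
# and the penalty weights lam1, lam2 are illustrative assumptions.
def nlts_elnet_objective(beta, x, y, f, s, lam1, lam2):
    r2 = (y - f(x, beta)) ** 2       # squared residuals r_i^2(beta)
    trimmed = np.sort(r2)[:s].sum()  # sum of the s smallest r^2_(i)
    penalty = lam1 * np.abs(beta).sum() + lam2 * (beta ** 2).sum()
    return trimmed + penalty
```

Because only the s smallest squared residuals enter the loss, a single gross outlier in y leaves the objective unchanged at the true parameters whenever s < n.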

Parameter Selection and Estimation of the NLTS-Elnet Estimator
The objective function (4) above can be written as the trimmed loss (5) plus the Elastic net penalty term (6),
$$L(\boldsymbol{\beta}) = \sum_{i=1}^{s} r_{(i)}^2(\boldsymbol{\beta}), \qquad (5)$$
$$P_{\lambda}(\boldsymbol{\beta}) = \lambda_1 \sum_{j=1}^{p}|\beta_j| + \lambda_2 \sum_{j=1}^{p}\beta_j^2. \qquad (6)$$
However, the penalty term (6) cannot be differentiated analytically, since the $\ell_1$ component is not differentiable at zero. The local quadratic approximation (LQA) of a Lasso-type penalty has been proposed to settle this drawback. Suppose $\boldsymbol{\beta}_0$ is given as an initial value close to the minimizer of (4); then for $\beta_{j0} \neq 0$ the local quadratic approximation of the $\ell_1$ term is
$$\lambda_1 |\beta_j| \approx \lambda_1 |\beta_{j0}| + \frac{\lambda_1}{2|\beta_{j0}|}\left(\beta_j^2 - \beta_{j0}^2\right), \qquad (7)$$
so that the penalized objective becomes locally quadratic in $\boldsymbol{\beta}$. The regression parameter vector $\boldsymbol{\beta}$ is then derived using the Newton-Raphson iterative method,
$$\boldsymbol{\beta}^{(k+1)} = \boldsymbol{\beta}^{(k)} - \left[\nabla^2 Q\!\left(\boldsymbol{\beta}^{(k)}\right)\right]^{-1} \nabla Q\!\left(\boldsymbol{\beta}^{(k)}\right), \qquad (8)$$
where $\nabla Q$ and $\nabla^2 Q$ denote the gradient and Hessian of the locally quadratic objective. The study proposes to use a robust Generalized Cross Validation (GCV) criterion to select the regularization parameters of the suggested procedure, in which the residual sum of squares in the ordinary GCV criterion is replaced by the trimmed sum of squared residuals (5).
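A minimal sketch of one Newton-Raphson step under the LQA, shown for a linear model so that the Jacobian is simply X; in the nonlinear case X would be replaced by the Jacobian of f at the current iterate, and the trimming step is omitted here for brevity.

```python
import numpy as np

# One Newton-Raphson step for the quadratic surrogate of the penalized
# objective sum(r_i^2) + lam1*||beta||_1 + lam2*||beta||_2^2, with the
# l1 term replaced by its local quadratic approximation (LQA) around
# the current iterate. Linear model for brevity; trimming omitted.
def lqa_newton_step(X, y, beta, lam1, lam2, eps=1e-8):
    r = y - X @ beta
    # LQA curvature lam1/|beta_j| plus the ridge curvature 2*lam2
    D = np.diag(lam1 / (np.abs(beta) + eps) + 2.0 * lam2)
    grad = -2.0 * X.T @ r + D @ beta   # gradient of the surrogate
    hess = 2.0 * X.T @ X + D           # Hessian of the surrogate
    return beta - np.linalg.solve(hess, grad)
```

Each step solves a ridge-like linear system, which is exactly why the LQA makes the non-differentiable penalty tractable; with lam1 = lam2 = 0 a single step recovers the ordinary least squares solution in the linear case.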

Data Analysis and Discussion
In this section, a Monte Carlo study is conducted to investigate the behaviour of the proposed NLTS-Elnet estimator on simulated and real data, with its performance compared to NLS and NLTS. The Mean Squared Error (MSE) and Bias are considered in evaluating performance. A trimming constant of s = 0.75n is used to include a sufficient number of observations. A standard normal error distribution with 30% Uniform(-1, 4) and Student's t(3) contamination is considered. Subsection (4.1) studies the properties of the NLTS-Elnet, NLTS and NLS estimators using simulated data sets. Subsection (4.2) exhibits the behaviour of the NLTS-Elnet, NLS and NLTS methods on the SENIC data set obtained from the Hospital Infections Program for the 1975-76 study period.

Study on the Regularized Least Trimmed Squares Estimator
In this subsection, a Monte Carlo study is carried out to evaluate the performance of the proposed nonlinear regression estimator. Samples of size 10, 20, 40, 80 and 200 are generated from the exponential regression model (10) adopted from [7]. Tables (1), (2) and (3) display the MSE and Bias under the standard normal error distribution and in the presence of the Uniform and Student's t error contaminations. Moreover, parameter estimates of model (10) are displayed, where the three covariates $x_1$, $x_2$, and $x_3$ are highly correlated. All the methodologies perform poorly when n = 10 and improve as the sample size increases. Furthermore, NLTS-Elnet has the lowest values of MSE under all the considered error distributions. Its values also decrease consistently from n = 10 to n = 200, except under U(-1, 4) contamination, unlike for NLS and NLTS. Moreover, the model estimates of the proposed method appear to approach a stable value faster than those of the competing methods.
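The contamination scheme described above can be sketched as follows; interpreting the 30% as an independent per-observation mixing probability is an assumption about how the contamination is implemented.

```python
import numpy as np

# Sketch of the contaminated error scheme in the simulation study:
# standard normal errors with about 30% of observations replaced by
# draws from Uniform(-1, 4) or Student's t with 3 degrees of freedom.
# Treating 30% as an independent mixing probability is an assumption.
def contaminated_errors(n, kind="uniform", frac=0.3, seed=None):
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)        # clean N(0, 1) errors
    mask = rng.random(n) < frac       # roughly 30% contaminated
    if kind == "uniform":
        e[mask] = rng.uniform(-1.0, 4.0, size=mask.sum())
    elif kind == "t":
        e[mask] = rng.standard_t(3, size=mask.sum())
    else:
        raise ValueError("kind must be 'uniform' or 't'")
    return e
```

The Uniform(-1, 4) component has mean 1.5, so this contamination shifts and skews the errors, while the t(3) component keeps the mean near zero but thickens the tails.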

Real Data Application
In this section, the NLTS-Elnet, NLS and NLTS estimators are applied to the SENIC data set obtained from the Hospital Infections Program for the 1975-76 study period. Model (11) is used to describe the relationship between the infection rate (y) and hospital characteristics including length of stay and routine culturing. The estimates in Table (5) are broadly comparable across the methods, with the proposed methodology yielding the lowest MSE and Bias values.

Conclusions
A regularized robust Nonlinear Least Trimmed Squares estimator has been proposed in this study by adding an Elastic net penalty to the NLTS loss function. Robust generalized cross-validation was used to select the regularization parameters. The Monte Carlo study and the real-world data example show that the proposed methodology performs well in several situations compared to NLS and NLTS, in terms of yielding relatively lower values of MSE and Bias in the presence of multicollinearity. A trimming constant of s = 0.75n was predetermined to enhance efficiency by including the majority of the data points. Future work remains on the simultaneous selection of optimal values of the regularization parameters and the trimming constant. The variable selection behaviour of the proposed estimator with a large number of predictors is another problem that could be explored.