Robust Estimation of Finite Population Totals Using a Model Based Approach in the Presence of Two Auxiliary Variables

The utilization of auxiliary information during surveys increases the accuracy of estimators, thereby giving more reliable estimates of the population parameters of interest. It has been established that, in the presence of more than one auxiliary variable, more robust estimators can be formed by combining different estimators such as product, ratio or even regression estimators, with each individual estimator using its own random variable. One of the most commonly used methods is the ratio method of estimating finite population totals, which is the foundation of all the other methods that use auxiliary information. In this paper, an estimator of the ratio-exponential class that uses two auxiliary variables is proposed and its variance derived. The coverage probabilities of the resulting confidence intervals were then estimated. Results showed that the confidence intervals of the proposed estimator were narrower than those of the well-known Horvitz-Thompson estimator. Two datasets from the agricultural and environmental sectors were used to investigate the properties of the estimator, and they gave satisfactory results. The mean squared error criterion was used to assess the performance of the proposed estimator, and in both cases it had the smallest mean squared error. The analysis in this paper is of great importance in understanding environmental and agricultural data.


Introduction
The main purpose of surveys conducted at local, national and international levels is to gather information that helps the public and private sectors make effective policy [1]. Information regarding a study variable is obtained only for the sampled elements; the way auxiliary information relates to the study variable across the sample allows inferences about the non-sampled portion of the population. Auxiliary information in the ratio type of estimator was first used in 1820 [2]. Since then, as may be seen from the studies that followed, it has emerged that the more robust estimators are the ones that use auxiliary information [1]. In recent years, researchers have proposed estimators that are more efficient in estimating finite population totals using two auxiliary variables [4][5].
It has been established that, in the presence of multivariate auxiliary variables, more robust estimators can be formed by combining different estimators such as ratio, product or even regression estimators, with each individual estimator using its own random variable [6]. Researchers who have used auxiliary information at the estimation stage of parametric super population models include Chambers and Dunstan [7], Wang and Dorfman [8], and Rao et al. [9].
A study of the use of auxiliary information in double sampling found that the proposed estimators performed better than the mean per unit estimator and than other estimators that do not utilize auxiliary information, and that they are not asymptotically optimum with two auxiliary variables [10].
The use of local polynomial regression with two auxiliary variables to estimate the population total has been investigated as well [11]. A super population approach was used, in which a working model was assumed and a simulation was carried out. In all the models considered, local linear regression dominated linear regression when the model was specified incorrectly. In this paper, a new estimator is proposed, its variance and mean squared error are derived, and it is compared with the Horvitz-Thompson estimator under the design-based approach. The proposed estimator is motivated by the work of Kadilar and Cingi [4][5]. Two datasets from the agricultural and environmental sectors were used to investigate the properties of the estimator, and they gave satisfactory results.

Some Useful Information
Consider an auxiliary variable $x$, correlated with the variable of interest $y$, that is obtained for every unit in a sample drawn by simple random sampling from the study population, and suppose in addition that the population mean $\bar{X}$ of $x$ is known. The linear regression estimate of $\bar{Y}$, the population mean of $y$, is:
$$\bar{y}_{lr} = \bar{y} + b(\bar{X} - \bar{x})$$
where $b$ is an estimate of the change in $y$ when $x$ is increased by one unit, and $\bar{y}$, $\bar{x}$ are the sample means of $y$ and $x$ respectively. The MSE of the estimate is given by:
$$MSE(\bar{y}_{lr}) = \frac{1-f}{n} S_y^2 (1 - \rho^2)$$
where $f = n/N$, $n$ is the size of the sample, $N$ is the size of the population, $S_y^2$ is the population variance of $y$, and $\rho$ is the correlation coefficient between $y$ and $x$.
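The regression estimate of the mean and its large-sample MSE can be illustrated with a minimal Python sketch. The function names, the least-squares choice for the slope b, and the sample values are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def regression_mean_estimate(y, x, x_pop_mean):
    """Linear regression estimate of the population mean of y.

    y, x       : sample values of the study and auxiliary variables
    x_pop_mean : known population mean of the auxiliary variable
    """
    y_bar, x_bar = mean(y), mean(x)
    # Least-squares slope b: estimated change in y per unit increase in x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    return y_bar + b * (x_pop_mean - x_bar)


def approx_mse(n, N, s2_y, rho):
    """Large-sample MSE: (1 - f) / n * S_y^2 * (1 - rho^2), with f = n / N."""
    f = n / N
    return (1 - f) / n * s2_y * (1 - rho ** 2)


# Hypothetical sample lying exactly on y = 2x + 1, with known population mean 5.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
estimate = regression_mean_estimate(y, x, x_pop_mean=5.0)
mse = approx_mse(n=4, N=20, s2_y=10.0, rho=0.9)
```

Because the sampled x values are below the known population mean, the estimate is adjusted upwards from the plain sample mean of y; the MSE shrinks by the factor (1 - rho^2) relative to the mean per unit estimator.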
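Auxiliary information can be used at the sampling stage as well as at the estimation stage. The regression estimate of the population mean when there are two auxiliary variables $x$ and $w$ will be:
$$\bar{y}_{lr} = \bar{y} + b_1(\bar{X} - \bar{x}) + b_2(\bar{W} - \bar{w})$$
The MSE of the estimator is given by:
$$MSE(\bar{y}_{lr}) = \frac{1-f}{n} S_y^2 \left(1 - \rho_{yx}^2 - \rho_{yw}^2 + 2\rho_{yx}\rho_{yw}\rho_{xw}\right)$$
where $\rho_{yx}$, $\rho_{yw}$ and $\rho_{xw}$ are the correlation coefficients between the indicated pairs of variables. Survey variables are often estimated with the help of auxiliary variables; under a super population approach, a working model involving the two auxiliary variables is assumed.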
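A minimal Python sketch of a regression estimate with two auxiliary variables follows. It assumes that the two slopes are the separate simple least-squares slopes of y on each auxiliary variable, in which case the first-order MSE is (1 - f)/n * S_y^2 * (1 - rho_yx^2 - rho_yw^2 + 2 rho_yx rho_yw rho_xw). The variable names x and w, the function names and the data are hypothetical.

```python
from statistics import mean


def slope(y, a):
    """Simple least-squares slope of y on a single auxiliary variable a."""
    y_bar, a_bar = mean(y), mean(a)
    num = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y))
    den = sum((ai - a_bar) ** 2 for ai in a)
    return num / den


def two_aux_regression_mean(y, x, w, x_pop_mean, w_pop_mean):
    """y_bar + b1 * (X_bar - x_bar) + b2 * (W_bar - w_bar)."""
    b1, b2 = slope(y, x), slope(y, w)
    return mean(y) + b1 * (x_pop_mean - mean(x)) + b2 * (w_pop_mean - mean(w))


def two_aux_mse(n, N, s2_y, rho_yx, rho_yw, rho_xw):
    """First-order MSE under the separate-slopes assumption stated above."""
    f = n / N
    bracket = 1 - rho_yx ** 2 - rho_yw ** 2 + 2 * rho_yx * rho_yw * rho_xw
    return (1 - f) / n * s2_y * bracket


# Hypothetical sample of four units with assumed known population means.
y = [3.0, 4.0, 8.0, 9.0]
x = [1.0, 2.0, 3.0, 4.0]
w = [2.0, 1.0, 4.0, 3.0]
estimate = two_aux_regression_mean(y, x, w, x_pop_mean=3.0, w_pop_mean=2.0)
mse = two_aux_mse(n=4, N=20, s2_y=10.0, rho_yx=0.8, rho_yw=0.6, rho_xw=0.5)
```

Here the sample mean of x is below its population mean and the sample mean of w is above its population mean, so the two adjustments pull the estimate in opposite directions.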

Sampling with One Auxiliary Variable
The use of auxiliary information in finite population sampling increases the precision of estimators of the population mean, total or distribution function. A researcher who comes across an auxiliary variable during sampling should first consider how to utilize it as efficiently as possible. The auxiliary information may be correlated with the study character and may be put to use at the design stage, at the estimation stage, or at both stages. When sampling using one auxiliary variable $w$, the regression estimate of $\bar{Y}$, the population mean of $y$, is given by:
$$\bar{y}_{lr} = \bar{y} + b(\bar{W} - \bar{w})$$
where $\bar{W}$ is the known population mean of $w$ and $b$ is an estimate of the change in $y$ when $w$ is increased by one unit. From the foregoing, the estimate of the population total is given by:
$$\hat{Y}_{lr} = N\bar{y}_{lr} = N\left[\bar{y} + b(\bar{W} - \bar{w})\right]$$
The MSE of the regression estimate is given by:
$$MSE(\bar{y}_{lr}) = \frac{1-f}{n} S_y^2 (1 - \rho^2)$$
where $f = n/N$, $S_y^2$ is the population variance of $y$, and $\rho = S_{yw}/(S_y S_w)$ is the population correlation coefficient between $y$ and $w$.
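To illustrate how the estimate of the population total follows from the regression estimate of the mean, the sketch below multiplies the regression estimate by the population size N and compares it with the simple expansion estimate N times the sample mean. The function name, the least-squares slope and the numbers are assumptions made for illustration only.

```python
from statistics import mean


def regression_total_estimate(y, w, w_pop_mean, N):
    """N * [y_bar + b * (W_bar - w_bar)], with b the least-squares slope of y on w."""
    y_bar, w_bar = mean(y), mean(w)
    b = (sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y))
         / sum((wi - w_bar) ** 2 for wi in w))
    return N * (y_bar + b * (w_pop_mean - w_bar))


# Hypothetical sample from a population of N = 20 units in which y = 2w + 1
# and the population mean of w is known to be 5, so the true total is 220.
w = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
N = 20
expansion_total = N * mean(y)  # ignores the auxiliary variable
regression_total = regression_total_estimate(y, w, w_pop_mean=5.0, N=N)
```

In this constructed case the sample happens to contain only small values of w, so the expansion estimate falls well short of the true total while the regression estimate recovers it.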

Estimators in Literature Using Two Auxiliary Variables
An estimator for the population mean that relies on the assumption that the means of the two auxiliary variables are known was proposed by Abu-Dayyeh [12]. Their estimator is given by: An exponential-ratio estimator proposed in [13] for estimating the finite population mean is given by: Utilizing a linear combination of two auxiliary variables, Jinglu Lu [14] proposed an exponential ratio type estimator given by:

The Datasets
To demonstrate the performance of the estimator given in equation (12) relative to the Horvitz-Thompson estimator of the finite population total, two datasets were used. Scatter plots were drawn to explore the linear relationships among the variables in each dataset. The relationship between each pair of variables is linear and positive, as indicated in Figures 1 and 2. This observation agrees with the existing literature, in that the auxiliary variables should be positively correlated with the study variable in question.
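For reference, the Horvitz-Thompson estimator used as the benchmark weights each sampled value by the inverse of its inclusion probability. The sketch below is a minimal illustration; the function name and the sample are hypothetical, and simple random sampling without replacement is assumed so that every inclusion probability equals n/N.

```python
def horvitz_thompson_total(y, inclusion_probs):
    """Horvitz-Thompson estimate of a population total: sum of y_i / pi_i."""
    if len(y) != len(inclusion_probs):
        raise ValueError("y and inclusion_probs must have the same length")
    return sum(yi / pi for yi, pi in zip(y, inclusion_probs))


# Hypothetical simple random sample of n = 4 units from N = 20, so pi_i = n / N.
N = 20
y = [3.0, 5.0, 7.0, 9.0]
pi = [len(y) / N] * len(y)
ht_total = horvitz_thompson_total(y, pi)  # equals N * mean(y) under this design
```

Under this equal-probability design the estimator reduces to the population size times the sample mean, which uses no auxiliary information at all.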

Results and Discussions
In order to use two auxiliary variables, the variables have to be positively correlated. As indicated in Figure 1, there is a positive relationship between the girth of the trees and both the volume and the height. This supports the requirement that the auxiliary variables should be positively correlated with the study variable. The same conclusion of positive correlation between the study variable and the auxiliary information can be deduced from Figure 2.
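The visual check of positive correlation can be complemented numerically with Pearson correlation coefficients. The sketch below is illustrative only: the measurements are made-up values, not the data analysed in the paper, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)


# Made-up tree measurements (illustrative values only, not the paper's data).
girth = [8.0, 10.0, 12.0, 14.0, 18.0]
height = [70.0, 72.0, 76.0, 71.0, 82.0]
volume = [10.0, 16.0, 21.0, 26.0, 56.0]

correlations = {
    "girth-volume": pearson_r(girth, volume),
    "girth-height": pearson_r(girth, height),
    "height-volume": pearson_r(height, volume),
}
all_positive = all(r > 0 for r in correlations.values())
```

A coefficient close to 1 for each pair would be consistent with the linear, positive relationships that the scatter plots are meant to show.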
In order to compare the performance of the proposed estimator with that of the Horvitz-Thompson estimator, the mean squared error was investigated. As seen in Table 1, the proposed estimator gives the minimum variance compared to the existing estimator under the two populations. The confidence intervals of the proposed estimator were calculated and the results tabulated in Table 2. A wider confidence interval expresses a higher level of uncertainty about the estimate. As seen from Table 2, the confidence intervals of the proposed estimator are narrower than those of the Horvitz-Thompson estimator at the 95% level.
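The kind of comparison described here can be sketched with a small Monte Carlo experiment. The code below is not the paper's analysis: it uses a synthetic population, and a simple one-variable regression estimator stands in for the proposed estimator, whose formula is not reproduced in this section. All names, the population model and the variance estimators are assumptions made for illustration.

```python
import random
from statistics import mean, variance


def simulate(N=200, n=20, reps=1000, seed=1):
    """Empirical MSE and mean 95% interval length for two estimators of a total."""
    rng = random.Random(seed)
    # Synthetic population (an assumption): y is roughly linear in x.
    x_pop = [rng.uniform(10, 50) for _ in range(N)]
    y_pop = [5 + 2 * xi + rng.gauss(0, 3) for xi in x_pop]
    true_total, x_pop_mean = sum(y_pop), mean(x_pop)
    f, z = n / N, 1.96
    stats = {"ht": {"mse": 0.0, "ci_length": 0.0},
             "reg": {"mse": 0.0, "ci_length": 0.0}}
    for _ in range(reps):
        idx = rng.sample(range(N), n)  # simple random sample without replacement
        x = [x_pop[i] for i in idx]
        y = [y_pop[i] for i in idx]
        x_bar, y_bar = mean(x), mean(y)
        b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
        # Residual variance about the fitted line, with n - 2 degrees of freedom.
        s2_e = sum((yi - y_bar - b * (xi - x_bar)) ** 2
                   for xi, yi in zip(x, y)) / (n - 2)
        totals = {"ht": N * y_bar,
                  "reg": N * (y_bar + b * (x_pop_mean - x_bar))}
        var_hat = {"ht": N ** 2 * (1 - f) / n * variance(y),
                   "reg": N ** 2 * (1 - f) / n * s2_e}
        for k in stats:
            stats[k]["mse"] += (totals[k] - true_total) ** 2 / reps
            stats[k]["ci_length"] += 2 * z * var_hat[k] ** 0.5 / reps
    return stats


results = simulate()
```

With a strongly correlated auxiliary variable, the estimator that uses it should show both a smaller empirical MSE and shorter intervals than the Horvitz-Thompson estimator, which is the pattern the tables are reported to display.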

Conclusion
In this study, a ratio-regression type estimator using two auxiliary variables has been developed. The confidence intervals of the proposed estimator were narrower than those of the Horvitz-Thompson estimator. The proposed estimator was also found to be more efficient than the traditional design-based Horvitz-Thompson estimator: the comparison was made in terms of MSE, and the MSE of the proposed estimator was smaller than that of the Horvitz-Thompson estimator.