Estimation of Parameters of the Two-Parameter Rayleigh Distribution Based on Progressive Type-II Censoring Using Maximum Likelihood Method via the NR and the EM Algorithms

In this article, maximum likelihood estimates for the location and scale parameters of the two-parameter Rayleigh distribution are obtained based on progressive type-II censored samples using the Newton-Raphson (NR) method and the Expectation-Maximization (EM) algorithm. A simple algorithm discussed in [2-3] is used for generating progressive type-II censored samples. Based on this censoring scheme, approximate asymptotic variances are derived and used to construct approximate confidence intervals for the parameters. The performance of the two maximum likelihood estimation algorithms is compared in terms of root mean squared error (RMSE) and coverage rates obtained by simulation. The simulation results show that, in nearly all combinations of simulation conditions, the estimators based on the EM algorithm have lower RMSE and narrower confidence intervals than those obtained using the NR algorithm. Finally, an illustrative example with real-life data is provided to show how maximum likelihood estimation using the two algorithms works in practice.


Introduction
The two-parameter Rayleigh distribution is a particular case of the Weibull distribution that is widely used in reliability theory and life testing. Rayleigh [25] introduced this distribution in connection with a problem in acoustics. The Rayleigh distribution is closely related to other distributions, including the chi-square and most extreme value distributions. In addition, the hazard function of this distribution increases with time. As a result, the distribution has attracted several researchers, and it occurs in different forms, including the one-parameter Rayleigh distribution and the two-parameter Burr type X distribution, also known as the generalized Rayleigh distribution. According to Surles and Padgett [26], the two-parameter Rayleigh distribution is an extreme value distribution that is effective in modeling general life data.
In the literature, several authors have extensively studied estimation, inference, and prediction issues for the one-parameter Rayleigh distribution, although not much has been done on the two-parameter Rayleigh distribution. Interested readers are referred to [9, 10, 15, 16] for an introduction to the Rayleigh distribution.
Recently, Khan, Provost, and Singh [17] considered the predictive inference based on doubly censored samples for the two-parameter Rayleigh distribution. Very recently, Dey, Dey, and Kundu [12] derived interval and point estimates of the scale and location parameters of a two-parameter Rayleigh distribution using progressive Type-II censored samples.
A continuous random variable X is said to have a two-parameter Rayleigh distribution with scale parameter λ and location parameter µ if its density function is given by:
$$f(x; \lambda, \mu) = 2\lambda (x - \mu)\, e^{-\lambda (x - \mu)^2}, \qquad x > \mu,\ \lambda > 0. \tag{1}$$
The corresponding distribution function for x > µ is given by:
$$F(x; \lambda, \mu) = 1 - e^{-\lambda (x - \mu)^2}. \tag{2}$$
The presence of the location parameter makes the two-parameter Rayleigh distribution more effective in analyzing real-life data sets than the one-parameter Rayleigh distribution.
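The density, distribution function, and quantile function can be sketched in a few lines of Python. This is a minimal illustration that assumes the parameterization with density 2λ(x − µ)exp(−λ(x − µ)²) for x > µ; the function names are hypothetical and are not taken from the cited references.

```python
import math


def rayleigh2_pdf(x, lam, mu):
    """Assumed density 2*lam*(x - mu)*exp(-lam*(x - mu)^2) for x > mu, else 0."""
    if x <= mu:
        return 0.0
    d = x - mu
    return 2.0 * lam * d * math.exp(-lam * d * d)


def rayleigh2_cdf(x, lam, mu):
    """Distribution function 1 - exp(-lam*(x - mu)^2) for x > mu, else 0."""
    if x <= mu:
        return 0.0
    d = x - mu
    return 1.0 - math.exp(-lam * d * d)


def rayleigh2_quantile(p, lam, mu):
    """Inverse of the distribution function for 0 <= p < 1."""
    return mu + math.sqrt(-math.log(1.0 - p) / lam)
```

The quantile function is what makes inverse-transform sampling possible: a uniform draw p on [0, 1) is mapped to a lifetime by `rayleigh2_quantile(p, lam, mu)`.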
In reliability testing, an experimenter may cease testing before all the experimental units fail because of time constraints or lack of funds. Samples that result from such situations are known as censored samples. There are numerous censoring methods available to an experimenter, with type-I and type-II censoring being the most commonly used schemes in life testing. A mixture of these two schemes results in a hybrid censoring scheme. However, type-I, type-II, and hybrid censoring schemes do not allow the removal of experimental units before the terminal point of the experiment. The progressive type-II censoring scheme allows such removal, and hence it has gained popularity in life-testing and reliability experiments. In this paper, we consider the progressive type-II censoring scheme.
In the recent statistical literature, progressive censoring has attracted many reliability practitioners and theoreticians. Interested readers are referred to [2-4]. For more recent references, see [24, 27] and the references cited therein.
Recently, Lio, Chen, and Tsai [19] investigated inference for the parameters of the generalized Rayleigh distribution based on a progressive type-I interval censoring scheme. The study revealed that using progressive type-I interval censored samples to compute the MLEs with the Expectation-Maximization algorithm yields more accurate and precise parameter estimates.
The purpose of this article is to develop an estimation procedure for the scale and location parameters of the two-parameter Rayleigh distribution based on the progressive type-II censoring scheme. We first derive the maximum likelihood estimators of the unknown parameters. Since the MLEs of the location and scale parameters of the two-parameter Rayleigh distribution cannot be obtained in explicit form, we propose the use of the NR and the EM algorithms to compute them. Progressive type-II right-censored samples are treated as incomplete data, and hence both the EM and the NR algorithms are suitable iterative numerical procedures for finding the MLEs. For more information regarding the EM algorithm, including its application and its advantages over the NR method, readers are referred to [1, 18, 29]. For the derivation and application of Newton-type methods, refer to [20, 21, 23].
The rest of the article is organized as follows. In Section 2, the progressive type-II censoring scheme is briefly discussed and the MLEs of the scale and location parameters are derived based on progressive type-II censoring using the EM and NR algorithms. Based on this censoring scheme, approximate asymptotic variances are derived and used to construct approximate confidence intervals for the parameters. In Section 3, simulation results and discussions are provided. In Section 4, an illustrative example is provided using real-life data. In the final section, a conclusion is provided.

Progressive Type-II Censoring Scheme
Let n identical items be put on a life-testing experiment at time 0, with the corresponding lifetimes $X_1, X_2, X_3, \dots, X_n$ being independent and identically distributed with the density function given in equation (1). Further, suppose that an integer m < n, the number of units to be observed until failure, is fixed at the beginning of the experiment, with the censoring scheme $(R_1, R_2, \dots, R_m)$ also specified in advance.
Progressive censoring then occurs in m failure stages as follows. At the time of the first failure, $X_{1:m:n}$, $R_1$ surviving items are randomly drawn and removed from the $n-1$ surviving units in the experiment, leaving $n-1-R_1$ surviving units. At the time of the second failure, $X_{2:m:n}$, $R_2$ items are randomly drawn and removed from the $n-2-R_1$ surviving units, leaving $n-2-R_1-R_2$ surviving items in the experiment. The process continues until the m-th failure time $X_{m:m:n}$ is observed (the m-th stage), when all $R_m = n - m - R_1 - R_2 - \dots - R_{m-1}$ surviving items are removed from the life-testing experiment. We write $x_j$ for $x_{j:m:n}$, j = 1, 2, 3, …, m, to make the notation simple.
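The removal mechanism described above can be simulated directly. The sketch below is not the generation algorithm of [2-3]; it is a plain simulation of the withdrawals, assuming lifetimes with distribution function 1 − exp(−λ(x − µ)²) for x > µ. The function name and arguments are hypothetical.

```python
import math
import random


def progressive_type2_sample(n, scheme, lam, mu, rng=None):
    """Simulate a progressive type-II censored sample by direct withdrawal.

    scheme is (R_1, ..., R_m); it must satisfy m + sum(R_j) == n.
    Returns the m observed failure times in increasing order.
    """
    rng = rng or random.Random()
    m = len(scheme)
    if m + sum(scheme) != n:
        raise ValueError("need m + sum(R_j) == n")
    # Inverse-transform draws from the assumed distribution function.
    alive = [mu + math.sqrt(-math.log(1.0 - rng.random()) / lam)
             for _ in range(n)]
    observed = []
    for r in scheme:
        failed = min(alive)          # next failure among the units on test
        alive.remove(failed)
        observed.append(failed)
        for unit in rng.sample(alive, r):   # withdraw r survivors at random
            alive.remove(unit)
    return observed
```

For example, `progressive_type2_sample(10, [2, 0, 1, 0, 2], lam=2.0, mu=1.0)` returns five ordered failure times; the remaining five units are withdrawn along the way.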

Maximum Likelihood Estimation Based on Progressive Type-II Censoring
MLE is one of the standard techniques for estimating the unknown parameters of a distribution or model. The principal idea behind this method is to select the parameter value under which the observed data are most likely.
Suppose n identical units are placed on a life-testing experiment at the same time. Let $x_{1:m:n}, x_{2:m:n}, \dots, x_{m:m:n}$ be a progressive type-II censored sample from the density function in equation (1). According to Balakrishnan and Aggarwala [2], m ordered failures out of the sample of size n are observed under this scheme, and $R_1, R_2, \dots, R_m$ surviving units are randomly drawn and removed from the experiment at the respective failure stages. The likelihood function based on a progressive type-II censored sample, as in [2], is given by:
$$L(\mu, \lambda) = C \prod_{i=1}^{m} f(x_{i:m:n}) \left[1 - F(x_{i:m:n})\right]^{R_i}, \tag{3}$$
where $C = n (n - R_1 - 1) \cdots (n - R_1 - R_2 - \dots - R_{m-1} - m + 1)$.
Substituting f(.) and F(.) into equation (3) and ignoring the constant term, the likelihood function of µ and λ based on the progressive type-II censored sample can be written as:
$$L(\mu, \lambda) \propto \prod_{i=1}^{m} 2\lambda (x_{i:m:n} - \mu) \exp\left\{-\lambda (R_i + 1)(x_{i:m:n} - \mu)^2\right\}. \tag{4}$$
The log-likelihood function of (4) is written as:
$$\ell(\mu, \lambda) = m \ln(2\lambda) + \sum_{i=1}^{m} \ln(x_{i:m:n} - \mu) - \lambda \sum_{i=1}^{m} (R_i + 1)(x_{i:m:n} - \mu)^2, \tag{5}$$
where $x = (x_{1:m:n}, x_{2:m:n}, \dots, x_{m:m:n})$ denotes the progressive type-II right-censored data from a population with density function and distribution function given in equations (1) and (2), respectively.
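As an illustration, the log-likelihood can be coded directly. The sketch below assumes the density 2λ(x − µ)exp(−λ(x − µ)²), so that, with the constant dropped, the log-likelihood is m ln(2λ) + Σ ln(x_i − µ) − λ Σ (R_i + 1)(x_i − µ)². The function names are hypothetical. For a fixed µ, setting the derivative with respect to λ to zero gives λ = m / Σ (R_i + 1)(x_i − µ)², which the second function returns.

```python
import math


def log_likelihood(mu, lam, x, R):
    """Log-likelihood (constant dropped) under the assumed density.

    x holds the m observed failure times, R the numbers withdrawn at each stage.
    Returns -inf outside the parameter space (lam <= 0 or mu >= min(x)).
    """
    if lam <= 0 or mu >= min(x):
        return float("-inf")
    m = len(x)
    s = sum((r + 1) * (xi - mu) ** 2 for xi, r in zip(x, R))
    return (m * math.log(2.0 * lam)
            + sum(math.log(xi - mu) for xi in x)
            - lam * s)


def lambda_given_mu(mu, x, R):
    """Maximiser of the log-likelihood in lam when mu is held fixed."""
    return len(x) / sum((r + 1) * (xi - mu) ** 2 for xi, r in zip(x, R))
```

Because λ has a closed form for each µ, only µ has to be found numerically once this profile relation is used; this is one reason iterative methods are needed for the pair (µ, λ).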

Expectation-Maximization (EM) Algorithm
We propose the use of the EM algorithm discussed in [7] as follows.
Let the complete data be W = (Y, Z), of which only part is observed: $Y = (y_1, y_2, \dots, y_m)$ denotes the observed data and Z, with elements $Z_{jk}$, denotes the censored (missing) data. The log-likelihood function of the complete data set is given by equation (6), and the MLEs of the parameters λ and µ based on W are obtained from equations (7) and (8). The E-step of the EM algorithm requires replacing any function of $Z_{jk}$, say $h(Z_{jk})$, by $E(h(Z_{jk}) \mid Z_{jk} > y_j)$; equations (7) and (8) then become equations (9) and (10). We make use of the theorem of Ng et al. [22], which states that, given $Y_j = y_j$, the conditional distribution of $Z_{jk}$ is a truncated two-parameter Rayleigh distribution with left truncation at $y_j$. The conditional expectations in equations (9) and (10) are obtained from this truncated distribution. The M-step of the (h+1)-th iteration of the EM algorithm is completed by substituting these conditional expectations into equations (9) and (10), where $\lambda^{(h+1)}$ is the estimate of λ at the (h+1)-th iteration of the EM algorithm.
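The E-step and M-step can be illustrated with a deliberately simplified sketch in which µ is held fixed and only λ is updated. This is an assumption made for illustration and is not the two-parameter update of the article. Under the assumed distribution function 1 − exp(−λ(x − µ)²), the quantity (Z − µ)² is exponential with rate λ, so for a unit withdrawn at y_j the conditional expectation is E[(Z − µ)² | Z > y_j] = (y_j − µ)² + 1/λ. The function name is hypothetical.

```python
def em_lambda(x, R, mu, lam0=1.0, tol=1e-4, max_iter=1000):
    """EM iterations for lam with mu held fixed (simplified illustration).

    x: observed failure times, R: numbers withdrawn at each failure.
    Stops when successive estimates differ by less than tol.
    """
    n = len(x) + sum(R)
    observed_part = sum((xi - mu) ** 2 for xi in x)
    lam = lam0
    for _ in range(max_iter):
        # E-step: expected (Z - mu)^2 for each withdrawn unit, at the current lam.
        censored_part = sum(r * ((xi - mu) ** 2 + 1.0 / lam)
                            for xi, r in zip(x, R))
        # M-step: complete-data MLE of lam with the expectations plugged in.
        new_lam = n / (observed_part + censored_part)
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    return lam
```

In this simplified setting the iteration has the fixed point λ = m / Σ (R_j + 1)(y_j − µ)², which is the value that maximizes the censored-data log-likelihood for the given µ, so the output can be checked against a closed form.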

The Newton-Raphson Algorithm
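We directly extend the argument for deriving the Newton-Raphson algorithm for optimization in one dimension to two-dimensional problems, as discussed by Devore and Berk [8], giving the two-parameter Newton-Raphson method as:
$$\theta^{(k+1)} = \theta^{(k)} - J(\theta^{(k)})^{-1} S(\theta^{(k)}), \tag{16}$$
where $\theta = (\mu, \lambda)$, $J(\theta)$ is the Hessian matrix (the matrix whose (i, j) entry is the second derivative of the log-likelihood with respect to $\theta_i$ and $\theta_j$), and $S(\theta)$ is the score function (the vector of first derivatives).
The entries of $S(\theta)$ and $J(\theta)$ are obtained by differentiating the log-likelihood in equation (5) and are substituted into equation (16). The procedure is repeated until there is no significant difference between successive estimates.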
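A minimal sketch of the two-parameter iteration follows. It assumes the log-likelihood m ln(2λ) + Σ ln(x_i − µ) − λ Σ (R_i + 1)(x_i − µ)² and differentiates it by hand; the derivatives below are my own working under that assumption, not formulas quoted from the article. The step-halving safeguard that keeps µ below the smallest observation and λ positive is also an addition for robustness. Function names are hypothetical.

```python
import math


def score_and_hessian(mu, lam, x, R):
    """Score vector and Hessian of the assumed log-likelihood at (mu, lam)."""
    m = len(x)
    w = [r + 1 for r in R]
    d = [xi - mu for xi in x]
    s_mu = -sum(1.0 / di for di in d) + 2.0 * lam * sum(wi * di for wi, di in zip(w, d))
    s_lam = m / lam - sum(wi * di * di for wi, di in zip(w, d))
    h_mm = -sum(1.0 / di ** 2 for di in d) - 2.0 * lam * sum(w)
    h_ml = 2.0 * sum(wi * di for wi, di in zip(w, d))
    h_ll = -m / lam ** 2
    return (s_mu, s_lam), ((h_mm, h_ml), (h_ml, h_ll))


def newton_raphson(x, R, mu0, lam0, tol=1e-4, max_iter=100):
    """Iterate theta <- theta - J^{-1} S until successive estimates agree to tol."""
    mu, lam = mu0, lam0
    for _ in range(max_iter):
        (s1, s2), ((a, b), (_, d)) = score_and_hessian(mu, lam, x, R)
        det = a * d - b * b
        # J^{-1} S for a symmetric 2x2 matrix, written out explicitly.
        step_mu = (d * s1 - b * s2) / det
        step_lam = (a * s2 - b * s1) / det
        new_mu, new_lam = mu - step_mu, lam - step_lam
        # Safeguard (an addition): halve the step while it leaves the parameter space.
        while new_mu >= min(x) or new_lam <= 0:
            step_mu, step_lam = step_mu / 2.0, step_lam / 2.0
            new_mu, new_lam = mu - step_mu, lam - step_lam
        converged = abs(new_mu - mu) < tol and abs(new_lam - lam) < tol
        mu, lam = new_mu, new_lam
        if converged:
            break
    return mu, lam
```

A call such as `newton_raphson(x, R, mu0=1.0, lam0=0.6)` needs a starting value of µ strictly below the smallest observed failure time; the quality of the starting values matters more here than for the EM iteration, since a plain Newton step is not guaranteed to increase the likelihood.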

Approximate Interval Estimation
The approximate asymptotic variances of the location and scale parameters, and the corresponding confidence intervals, are obtained as follows. The exact asymptotic variance of µ cannot be obtained in explicit form. We rely on the results of Dey, Dey, and Kundu [11], who applied the Corollary of Theorem 3 of Smith [28], to approximate the asymptotic variance of µ by $V(\hat{\mu} - \mu)$ using the inverse of the observed information. Using equation (20), approximate confidence intervals for the parameters are then constructed.
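The generic construction of an approximate (Wald-type) interval from an estimate and its approximate variance can be sketched as follows. The variance used for λ here, λ̂²/m, is the inverse of the observed information −∂²ℓ/∂λ² = m/λ² with µ treated as fixed, under the assumed log-likelihood m ln(2λ) + Σ ln(x_i − µ) − λ Σ (R_i + 1)(x_i − µ)². This is a simplifying assumption for illustration, not the variance expressions of the article, and it does not cover µ. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def wald_interval(estimate, variance, level=0.95):
    """Approximate interval: estimate +/- z * sqrt(variance)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def lambda_variance(lam_hat, m):
    """Inverse observed information for lam with mu held fixed (assumption)."""
    return lam_hat ** 2 / m
```

For instance, `wald_interval(lam_hat, lambda_variance(lam_hat, m))` gives a 95% interval whose width shrinks at the rate 1/√m as the number of observed failures grows.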

Results and Discussions
In this section, a simulation study is performed to compare the performance of MLEs of the two-parameter Rayleigh distribution obtained using the NR method and the EM algorithm based on progressive type-II censored samples.
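Progressive type-II right-censored samples from the two-parameter Rayleigh distribution were generated using the algorithms discussed in [2-3].
In comparing the performance of the MLEs, the measures considered were the root mean squared error (RMSE) and the width of the 95% approximate confidence interval of the MLEs. Suppose $\hat{\theta}_i$ is the estimate of a parameter θ obtained from the i-th of N simulated samples; then
$$\mathrm{RMSE}(\hat{\theta}) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{\theta}_i - \theta)^2}.$$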
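The comparison measures can be computed from the simulation output with a few helper functions. The sketch below assumes that each replication yields a point estimate and a (lower, upper) interval; the function names are hypothetical, and the coverage rate is included because it is the usual companion of interval width.

```python
import math


def rmse(estimates, true_value):
    """Root mean squared error of the estimates around the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


def mean_width(intervals):
    """Average width of a list of (lower, upper) intervals."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


def coverage_rate(intervals, true_value):
    """Fraction of intervals that contain the true value."""
    return sum(lo <= true_value <= hi for lo, hi in intervals) / len(intervals)
```

A narrow average width is only meaningful alongside the coverage rate, since an interval can be short simply because it misses the true value too often.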
In this paper, samples of sizes 20, 30, 40, 50, and 70 were used, and the censoring schemes considered are given in Tables 1 and 2.
It is important to note that, for all the above censoring schemes, no restriction was imposed on the maximum number of iterations, and convergence was assumed to occur when the absolute differences between successive estimates were less than 0.0001.
A summary of the results in Tables 3-7 is provided below.
i. The MLEs obtained using the EM algorithm have lower RMSE than those obtained by the NR method in nearly all combinations of simulation conditions.
ii. The widths of the 95% approximate confidence intervals of the parameters µ and λ obtained using the EM algorithm tend to be smaller than those obtained by the NR method in nearly all combinations of simulation conditions. According to Gulhar et al. [13], a smaller width is better because it captures the true parameter value within a small span, and the results are more accurate and precise.
iii. For a fixed sample size n (e.g., n=30), we noted that as the number of failures m increases (i.e., from 20 to 25), the RMSE and the widths of the confidence intervals of the MLEs obtained using both the EM and NR algorithms decrease (compare Tables 3 and 4, and Tables 5 and 6). This implies that the performance of the MLEs improves.
iv. When the number of failures m is fixed, we observed that as the sample size n increases, the RMSE and the widths of the 95% approximate confidence intervals of the MLEs obtained using both the EM and NR algorithms decrease (see Table 7). This indicates that the MLEs are consistent.
v. When the value of one parameter is fixed, we noted that as the value of the other parameter increases, the RMSE of all the estimates increases, which indicates the consistency of the estimators.
vi. Additionally, if m=0, no failures are observed; under this condition no samples are generated, and hence it is not possible to obtain the corresponding MLEs. On the other hand, if $R_1 = R_2 = \dots = R_m = 0$, then n=m (the complete-sample situation); under this condition the estimates are extremely biased.

Example Using Real-Life Data
We now consider a real-life data set to illustrate how maximum likelihood estimation using the NR method and the EM algorithm for the two-parameter Rayleigh distribution works in practice. We used progressive type-II censoring to analyze real data representing the survival times (in years) of 46 patients given chemotherapy treatment, as discussed in [5]. That discussion indicated that the Rayleigh distribution is acceptable for this data set (it provides a good fit). From these data, progressive type-II censored samples were generated with m=20, 30, and 40. From the results, it is observed that:
i. The MLEs obtained using the EM algorithm have narrower confidence intervals than those obtained using the NR method, except when m=40. A smaller width is better because it captures the true parameter value within a small span, and the results are more accurate and precise.
ii. For both methods, the MLEs and the widths of the 95% approximate confidence intervals decrease as the number of failures increases (i.e., from 20 to 40) for nearly all the values of m, which indicates the consistency of the estimators.

Conclusions
In this study, the problem of computing the MLEs of the parameters of the two-parameter Rayleigh distribution based on generated progressive type-II censored samples was addressed. In particular, the MLEs were obtained using the NR and the EM algorithms. Approximate asymptotic variances of the MLEs were also derived and used to construct approximate confidence intervals for the parameters.
The simulation results clearly show that the MLEs obtained using the EM algorithm have lower levels of RMSE and narrower widths of the corresponding confidence intervals compared to those obtained using the NR algorithm. However, the NR method may yield better estimates especially when is greater than 70%. This shows that both the EM and NR algorithms can be used in estimation problem, but we can conclude that the EM algorithm is highly recommended as it provides better estimates. Al-Zahrani and Gindwan [1] and Helu, Samawi, & Raqab [14] obtained similar simulation results.