Parameter Estimation of Kumaraswamy Distribution Based on Progressive Type II Censoring Scheme Using Expectation-Maximization Algorithm

This project considers the problem of estimating the parameters of the Kumaraswamy distribution from test units observed under a progressive Type-II censoring scheme. The progressive Type-II censoring scheme allows units to be removed at intermediate stages of the test, not only at the terminal point. The maximum likelihood estimates (MLEs) of the parameters are derived using the Expectation-Maximization (EM) algorithm. The expected Fisher information matrix is also computed using the missing value principle. Using this matrix, asymptotic 95% confidence intervals for the parameters are constructed. Through simulations, the behaviour of these estimates is studied and compared under different censoring schemes and parameter values. It is concluded that, as the sample size increases, the estimated parameter values move closer to the true values, and the variances and the widths of the confidence intervals decrease. In addition, more efficient estimates are obtained with censoring schemes that remove units from the right.


Introduction
Censored sampling arises in a life-testing experiment whenever the experimenter does not observe (either intentionally or unintentionally) the failure times of all units placed on test.
According to Horst, "a data sample is said to be censored when, either by accident or design, the value of the variables under investigation is unobserved for some of the items in the sample." [1] Inference based on censored sampling has been studied for more than 50 years by numerous authors for a wide range of lifetime distributions.
In this study, we assume that the lifetimes follow the Kumaraswamy distribution. This distribution was introduced by Kumaraswamy as a probability density function for double-bounded random processes. [2] It is applicable to many natural phenomena whose outcomes have lower and upper bounds, such as the heights of individuals, scores obtained on a test, atmospheric temperatures, and hydrological data.
The two-parameter Kumaraswamy distribution has a PDF and CDF given by equations (1) and (2), respectively. Kumaraswamy and Ponnambalam et al. [2,3] have pointed out that, depending on the choice of the parameters, this distribution can be used to approximate many distributions, such as the uniform, the triangular, or almost any unimodal distribution, and can also reproduce results of the beta distribution. The basic properties of the distribution have been given by Jones. [4]
Inferential issues for the Kumaraswamy distribution based on censored data have been addressed by Gholizadeh et al. [5], who considered Bayesian estimation of the Kumaraswamy distribution under progressively Type-II censored samples. Tabassum et al. [6] explored the Bayesian analysis of the Kumaraswamy distribution under a failure-censoring sampling scheme. Feroze et al. [7] estimated the parameters of the Kumaraswamy distribution under progressive Type-II censoring with random removals using the maximum likelihood method.
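As a concrete reference point, the Kumaraswamy density, distribution function and quantile function can be sketched in a few lines of Python. The block assumes the common form F(x) = 1 - (1 - x^a)^b on (0, 1). The names `a`, `b`, `kw_pdf`, `kw_cdf` and `kw_quantile` are illustrative only, and which of the paper's θ and λ corresponds to `a` and which to `b` is not fixed here.

```python
def kw_pdf(x, a, b):
    """Density a*b*x^(a-1)*(1 - x^a)^(b-1) on (0, 1); zero elsewhere.

    Assumed parameterisation: a is the inner exponent, b the outer one.
    """
    if not 0.0 < x < 1.0:
        return 0.0
    return a * b * x ** (a - 1.0) * (1.0 - x ** a) ** (b - 1.0)


def kw_cdf(x, a, b):
    """Distribution function 1 - (1 - x^a)^b, clipped to [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 - (1.0 - x ** a) ** b


def kw_quantile(u, a, b):
    """Inverse of kw_cdf for u in [0, 1), from solving u = F(x) for x."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


# Round trip: the CDF evaluated at a quantile returns the probability.
print(kw_cdf(kw_quantile(0.3, 2.0, 3.0), 2.0, 3.0))
```

Under this assumed form, setting a = b = 1 gives a constant density of 1 on (0, 1), which is the uniform special case mentioned above.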
Most recently, Mostafa et al. [8] derived parameter estimators of the Kumaraswamy distribution based on a general progressive Type-II censoring scheme using maximum likelihood and Bayesian approaches. Other recent work on progressive censoring includes, but is not limited to, [9][10][11][12][13][14]. As far as we know, no one has described the EM algorithm for determining the MLEs of the parameters of the Kumaraswamy distribution based on a progressive Type-II censoring scheme.
The purpose of this study is to estimate the shape and scale parameters of the Kumaraswamy distribution under progressive type-II censoring using the EM algorithm and to compare the results under different censoring schemes.
In this work, we propose to use the EM algorithm for computing the MLEs, because the EM algorithm is relatively robust to the choice of initial values compared to the traditional Newton-Raphson (NR) method. [15,16] Recent relevant references on the EM algorithm and censoring include [17 and 20].

Progressive Type II Censoring
Suppose n identical units are put on a test and the lifetimes of the n units are denoted by $X_1, \ldots, X_n$. The integer m < n, fixed at the beginning of the experiment, is the number of units that are observed completely until failure.
The censoring occurs progressively in m stages, which yield the failure times of the m completely observed units. At the $j$-th failure, $R_j$ of the surviving units are removed from the test at random, so that $n = m + R_1 + \cdots + R_m$.
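From equations (1) and (2), the likelihood function based on a progressive Type-II censored sample $x_1 < x_2 < \cdots < x_m$, where $R_j$ is the number of surviving units removed at the $j$-th failure, is as follows:
$$L(\theta, \lambda) = C \prod_{j=1}^{m} f(x_j; \theta, \lambda)\,\left[1 - F(x_j; \theta, \lambda)\right]^{R_j}, \qquad (4)$$
where $C = n(n - R_1 - 1)\cdots(n - R_1 - \cdots - R_{m-1} - m + 1)$. The log-likelihood function of equation (4) can be written as follows:
$$\ln L(\theta, \lambda) = \ln C + \sum_{j=1}^{m} \ln f(x_j; \theta, \lambda) + \sum_{j=1}^{m} R_j \ln\left[1 - F(x_j; \theta, \lambda)\right].$$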
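A minimal sketch of the progressively censored log-likelihood, up to the additive constant ln C, is given below. It assumes the Kumaraswamy form F(x) = 1 - (1 - x^a)^b; the names `a`, `b`, `kw_logpdf`, `kw_logsf` and `censored_loglik` are illustrative, and the mapping of `a` and `b` to the paper's θ and λ is left open. The data values are made up for the demonstration.

```python
import math


def kw_logpdf(x, a, b):
    """log of a*b*x^(a-1)*(1 - x^a)^(b-1) for 0 < x < 1 (assumed form)."""
    return (math.log(a) + math.log(b)
            + (a - 1.0) * math.log(x)
            + (b - 1.0) * math.log1p(-x ** a))


def kw_logsf(x, a, b):
    """log of the survival function 1 - F(x) = (1 - x^a)^b."""
    return b * math.log1p(-x ** a)


def censored_loglik(a, b, x, removals):
    """Sum over failures of log f(x_j) + R_j * log(1 - F(x_j)).

    The constant ln C is omitted because it does not depend on (a, b).
    """
    return sum(kw_logpdf(xj, a, b) + rj * kw_logsf(xj, a, b)
               for xj, rj in zip(x, removals))


# Hypothetical observed failure times and removal counts.
x = [0.21, 0.35, 0.48, 0.62]
removals = [1, 0, 2, 3]
print(censored_loglik(2.0, 3.0, x, removals))
```

Each removed unit contributes only a survival term at the failure time where it was withdrawn, which is why the removal counts appear as multipliers of `kw_logsf`.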

EM Algorithm
We propose the EM algorithm, introduced by Dempster et al. [22], to find the MLEs. Let Z be the censored data. We consider the censored data as missing data. The combination (X, Z) = W forms the complete data set, and the log-likelihood function is written in terms of W.
In the E-step, one needs to compute the pseudo-likelihood function. This can be obtained from the complete-data log-likelihood by replacing any function $g(z_j)$ of a lifetime censored at the $j$-th failure with its conditional expectation $E[g(Z_j) \mid Z_j > x_j]$. Given $X_j = x_j$, the conditional distribution of $Z_j$ follows a truncated Kumaraswamy distribution with left truncation at $x_j$. That is,
$$f(z_j \mid x_j; \theta, \lambda) = \frac{f(z_j; \theta, \lambda)}{1 - F(x_j; \theta, \lambda)}, \qquad z_j > x_j.$$
The conditional expectations in equations (6) and (7) are therefore taken with respect to this truncated density.
Thus, in the M-step of the $(k+1)$-th iteration of the EM algorithm, the value of $\theta^{(k+1)}$ is first obtained by solving the corresponding likelihood equation.
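The E-step ingredient described above, a conditional expectation under the left-truncated distribution, can be approximated numerically. The following is a hedged sketch, not the closed-form expressions of the paper: it assumes the form F(z) = 1 - (1 - z^a)^b with illustrative names `a` and `b`, and the helper `truncated_expectation` and its midpoint rule are choices made for this example.

```python
import math


def truncated_expectation(g, c, a, b, n_nodes=20000):
    """Approximate E[g(Z) | Z > c] for Z with CDF 1 - (1 - z^a)^b.

    Given Z > c, the conditional survival probability
    ((1 - Z^a) / (1 - c^a))^b is uniform on (0, 1), so g is averaged
    over midpoint nodes of that uniform variable.
    """
    base = 1.0 - c ** a
    total = 0.0
    for i in range(n_nodes):
        v = (i + 0.5) / n_nodes                       # survival probability
        z = (1.0 - base * v ** (1.0 / b)) ** (1.0 / a)
        total += g(z)
    return total / n_nodes


# Conditional means of two log terms for a unit removed at c = 0.4.
a, b, c = 2.0, 3.0, 0.4
print(truncated_expectation(math.log, c, a, b))
print(truncated_expectation(lambda z: math.log(1.0 - z ** a), c, a, b))
```

Under the assumed form, the second expectation has the closed form ln(1 - c^a) - 1/b, which gives a convenient check on the numerical rule.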

Asymptotic Variance-Covariance Matrix of the MLEs
The variance-covariance matrix provides a measure of precision for the parameter estimators and is derived from the log-likelihood function. We first compute the variance-covariance matrix of the parameters θ and λ by considering a complete data set from the Kumaraswamy distribution.
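For such a case, the log-likelihood function based on X is obtained as follows:
$$\ln L(\theta, \lambda) = \sum_{i=1}^{n} \ln f(x_i; \theta, \lambda). \qquad (13)$$
Using equation (13), the Fisher information matrix for the complete data set is the negative expected matrix of second partial derivatives of $\ln L$ with respect to θ and λ, and the variance-covariance matrix of the parameters θ and λ is given by its inverse. In these expressions, $\psi(x)$ and $\psi'(x)$ are the digamma and trigamma functions respectively.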
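In this work, we are interested in deriving the asymptotic variance-covariance matrix of the MLEs based on the EM algorithm. For this we use the procedure established by Louis and Tanner. [23,24] The idea of this procedure is given by
$$I_{\text{obs}}(\eta) = I_{c}(\eta) - I_{\text{mis}}(\eta),$$
where $I_{c}(\eta)$, $I_{\text{obs}}(\eta)$ and $I_{\text{mis}}(\eta)$ denote the complete, observed, and missing (expected) information, respectively, and η = (θ, λ). The Fisher information matrix for a single observation which is censored at the time of the $j$-th failure is given by
$$I_{\text{mis}}^{(j)}(\eta) = -E\left[\frac{\partial^2 \ln f(z_j \mid x_j; \eta)}{\partial \eta\, \partial \eta^{T}}\right],$$
where the expected values of the second partial derivatives of the log-likelihood function of Z given X are calculated entry by entry. Note that $I_{\text{mis}}^{(j)}(\eta)$ is a function of $x_j$ and η, since the expectation is taken with respect to $Z_j$; therefore, the expected information matrix is simply
$$I_{\text{mis}}(\eta) = \sum_{j=1}^{m} R_j\, I_{\text{mis}}^{(j)}(\eta),$$
where $R_j$ is the number of units removed at the $j$-th failure. Using equation (21), approximate 100(1−α)% confidence intervals for θ and λ are obtained, respectively, as
$$\hat{\theta} \pm z_{\alpha/2}\sqrt{V_{11}}, \qquad \hat{\lambda} \pm z_{\alpha/2}\sqrt{V_{22}},$$
where $V_{11}$ and $V_{22}$ are the diagonal elements of the estimated variance-covariance matrix $I_{\text{obs}}^{-1}(\hat{\eta})$ and $z_{\alpha/2}$ is the upper α/2 point of the standard normal distribution.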
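The last two steps, subtracting the missing information from the complete information and turning the result into normal-approximation intervals, can be sketched as follows. This assumes the missing information principle in its usual form (observed = complete - missing). The function names and all matrix entries and estimates below are hypothetical numbers chosen for illustration, not results from the study.

```python
import math
from statistics import NormalDist


def observed_information(complete_info, missing_info):
    """Entrywise difference of two 2x2 information matrices."""
    return [[complete_info[i][j] - missing_info[i][j] for j in range(2)]
            for i in range(2)]


def invert_2x2(mat):
    """Inverse of a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = mat
    det = p * s - q * r
    if det == 0.0:
        raise ValueError("information matrix is singular")
    return [[s / det, -q / det], [-r / det, p / det]]


def wald_intervals(estimates, info, alpha=0.05):
    """Normal-approximation intervals: estimate +/- z * sqrt(variance)."""
    cov = invert_2x2(info)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return [(est - z * math.sqrt(cov[i][i]), est + z * math.sqrt(cov[i][i]))
            for i, est in enumerate(estimates)]


# Hypothetical information matrices and parameter estimates.
info_obs = observed_information([[10.0, 1.0], [1.0, 30.0]],
                                [[6.0, 1.0], [1.0, 5.0]])
print(wald_intervals([2.0, 1.0], info_obs))
```

A larger missing-information matrix leaves less observed information, so the intervals widen as more units are censored.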

Numerical Results and Discussion
In this section, a simulation study is conducted to investigate how the above estimators perform in estimating the parameters of the Kumaraswamy distribution based on progressive Type-II censored data. The samples were generated using the algorithms of Balakrishnan and Sandhu and of Aggarwala and Balakrishnan (1998). [25,26] The censoring schemes considered are given in Table 1.
Table 1. Censoring schemes (columns: Scheme; Censoring rate).
All the computational results were obtained using R software.
Table 3 clearly shows that the widths of the 95% confidence intervals tend to decrease as the sample size increases.
Table 4 has been extracted from Table 2 to illustrate the effect of the number of censored units on the parameter estimates. The results in Table 4 show that, when the sample size is kept constant, better estimates are obtained when fewer units are censored. Schemes 4-6 give better estimates than schemes 1-3 because 3 units are censored in each of schemes 4-6, whereas 6 units are censored in each of schemes 1-3. The removal of units in schemes 1, 2 and 3 was done at the 12th, 6th, and 1st failures respectively, and scheme 1, which is right censored, gave the best estimates, followed by scheme 2 (centre censored) and lastly scheme 3 (left censored). The same trend was observed across all the censoring schemes, i.e. the right-censored schemes gave the best estimates, followed by the centre-censored and then the left-censored schemes.
Table 6 also shows that, as the sample size increases, the estimated value of the parameter becomes closer to the true value and the variances of the MLEs decrease. However, these variances are much higher than those obtained in Table 2. The widths of the confidence intervals are also larger under this set of parameter values and tend to decrease as the sample size increases.
Table 6 likewise reveals that, for a constant sample size, reducing the number of censored units leads to better estimates. In each of schemes 7-9, 7 units are censored, while in each of schemes 10-12, 3 units are censored, and schemes 10-12 gave better estimates than schemes 7-9. The removal of units in schemes 7, 8 and 9 was done at the 18th, the 9th and 10th, and the 1st failures respectively, and scheme 7 gave the best estimates, followed by scheme 8 and finally scheme 9. This trend was observed across all the censoring schemes, i.e. the right-censored schemes gave the best estimates, followed by the centre-censored and then the left-censored schemes.
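The sample-generation step of the simulation study can be sketched with the uniform-transformation algorithm of Balakrishnan and Sandhu. The code below is a Python illustration rather than the R code used for the reported results. It assumes the Kumaraswamy quantile function for F(x) = 1 - (1 - x^a)^b, and the names `bs_uniform_sample`, `progressive_sample`, `a` and `b` are hypothetical.

```python
import random


def kw_quantile(u, a, b):
    """Quantile function of the assumed form F(x) = 1 - (1 - x^a)^b."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def bs_uniform_sample(removals, rng):
    """Progressive Type-II censored sample from Uniform(0, 1).

    Steps: draw W_1..W_m uniform; set
    V_i = W_i^(1 / (i + R_m + ... + R_{m-i+1})); then
    U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}.
    """
    m = len(removals)
    w = [rng.random() for _ in range(m)]
    v = [w[i - 1] ** (1.0 / (i + sum(removals[m - i:])))
         for i in range(1, m + 1)]
    u, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= v[m - i]              # running product V_m ... V_{m-i+1}
        u.append(1.0 - prod)
    return u


def progressive_sample(a, b, removals, rng):
    """Map the uniform censored sample through the quantile function."""
    return [kw_quantile(u, a, b) for u in bs_uniform_sample(removals, rng)]


rng = random.Random(123)
print(progressive_sample(2.0, 3.0, [0, 0, 0, 0, 5], rng))  # removals at the last failure
```

Because the uniform sample is already ordered, no sorting or explicit removal of units is needed, which keeps repeated simulation runs inexpensive.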

Conclusion
This study has addressed the problem of estimating the parameters of the Kumaraswamy distribution based on progressive Type-II censored data. It is shown that the MLEs of the scale and shape parameters can be obtained by using the EM algorithm.
A comparison of the MLEs, their variances and their confidence intervals is made by simulation for different censoring schemes. It is observed that:
i. for an increasing sample size, the estimated value of the parameter becomes closer to the true value, the variances of the MLEs decrease and the widths of the confidence intervals narrow.
ii. the best estimates are obtained when units are removed from the right, followed by removal at the centre, and the poorest when units are removed from the left.
iii. reducing the number of units removed in the censoring scheme leads to better estimates for a fixed sample size.
iv. an increase in the true parameter values leads to estimates with larger variances and wider confidence intervals.