Analysis of Progressive First-Failure-Censoring for Non-normal Model Using Competing Risks Data
A. A. Modhesh1, G. A. Abd-Elmougod2
1Department of Mathematics, Faculty of Science, Taiz University, Taiz, Yemen
2Department of Mathematics, Faculty of Science, Taif University, KSA
To cite this article:
A. A. Modhesh, G. A. Abd-Elmougod. Analysis of Progressive First-Failure-Censoring for Non-normal Model Using Competing Risks Data. American Journal of Theoretical and Applied Statistics. Vol. 4, No. 6, 2015, pp. 610-618. doi: 10.11648/j.ajtas.20150406.33
Abstract: Competing risks data usually arise in studies in which the death or failure of an individual or an item may be classified into one of T ≥ 2 mutually exclusive causes. In this paper, we study the competing risks model when the data are progressively first-failure-censored. Based on this type of censoring, we derive the maximum likelihood estimators (MLEs) of the unknown parameters. Approximate confidence intervals and two bootstrap confidence intervals are also proposed. First-failure censoring, progressive Type II censoring, Type II censoring and the complete sample arise as special cases of these results. A real data set is analyzed for illustrative purposes, and the different methods are compared using Monte Carlo simulations.
Keywords: Burr XII Distribution, Progressive First-Failure-Censoring, Competing Risks, Maximum Likelihood Method, Bootstrap
1. Introduction
In medical studies and reliability analyses an investigator is often interested in the assessment of a specific risk in the presence of other risk factors. In the statistical literature this is known as the analysis of the competing risks model. A lifetime experiment with different risk factors competing for the failure of the experimental units is considered. The data for such a competing risks model consist of the lifetime of the failed item and an indicator variable denoting the cause of failure. For example, the competing risks for a prostate cancer patient may include prostate cancer itself, heart disease and all other causes. The effects of the other competing risks may play an important role in survival studies on slowly progressing diseases such as prostate cancer. In engineering applications, the causes or risks may signify either multiple modes of failure for a complex unit or the multiple components or subsystems that comprise an entire system. A system failure is caused by the earliest onset of any of these component failures; in this respect, the framework is that of a system with components connected in series. Several studies have been carried out under this assumption with the risks following different lifetime distributions, namely the exponential, lognormal, gamma, Weibull, generalized exponential or exponentiated Weibull; see, for example, Moeschberger et al., Pascual, Cramer and Schmiedt, Sarhan et al., Sarhan, Alwasel, Kundu and Basu, and Kundu and Sarhan.
Censoring occurs when exact lifetimes are known only for a portion of the individuals or units under study, while for the remainder only partial information on the lifetimes is available. There are several types of censored tests. The most common censoring schemes are Type I (time) censoring, where the life testing experiment is terminated at a prescribed time, and Type II (failure) censoring, where the experiment is terminated upon a pre-fixed number of failures. However, the conventional Type I and Type II censoring schemes do not have the flexibility of allowing removal of units at points other than the terminal point of the experiment. A generalization of Type II censoring is progressive Type II censoring, a method that enables an efficient exploitation of the available resources by continual removal of a prespecified number of surviving test units at each failure time. The removal of units before failure may be intentional, to save time and cost, or necessary when some items have to be removed for use in another experiment. See Wu et al. and Wu and Yu for extensive reviews of the literature on progressive censoring. When the lifetimes of products are very long, the experimental time of a Type II censored life test can still be too long. Because of this lack of flexibility, Johnson described a life test in which the experimenter groups the test units into several sets, each an assembly of test units, and then runs all the test units simultaneously until the first failure occurs in each group. Such a censoring scheme is called first-failure censoring. If the experimenter desires to remove some sets of test units before observing the first failures in these sets, the life test plan is called the progressive first-failure-censoring scheme.
In this scheme, first-failure censoring is combined with progressive censoring as in Wu and Kuş. Suppose that n independent groups with k items within each group are put on a life test. As soon as the first failure (say X_1) occurs, R_1 groups and the group in which the first failure is observed are randomly removed from the test; when the second failure (say X_2) occurs, R_2 groups and the group in which the second failure is observed are randomly removed; and finally, as soon as the m-th failure (say X_m) occurs, the remaining R_m groups and the group in which the m-th failure is observed are removed. The data X_1 < X_2 < … < X_m are called progressively first-failure-censored order statistics with progressive censoring scheme R = (R_1, …, R_m), and for each i the indicator δ_i takes the value 1 or 2 according to the cause of failure. It is clear that n = m + R_1 + … + R_m. For a given censoring scheme R, the likelihood function of the observed data is
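The displayed likelihood did not survive extraction. A standard form for progressively first-failure-censored competing risks data with two causes, written here as a hedged reconstruction consistent with the Wu and Kuş setup, is

```latex
L(\alpha_1,\alpha_2,\beta)
  \propto k^{m}\prod_{i=1}^{m}
  \bigl[f_{1}(x_{i})\,S_{2}(x_{i})\bigr]^{I(\delta_{i}=1)}
  \bigl[f_{2}(x_{i})\,S_{1}(x_{i})\bigr]^{I(\delta_{i}=2)}
  \bigl[S_{1}(x_{i})\,S_{2}(x_{i})\bigr]^{k(R_{i}+1)-1}
```

where each observed group minimum contributes a density factor for its cause, a survival factor for the competing cause, and a survival factor for the k(R_i + 1) − 1 surviving items removed with it.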
where S_j(·) and h_j(·), j = 1, 2, denote the reliability and failure rate functions under cause j, respectively.
It is clear that the progressive first-failure-censoring scheme contains the following censoring schemes as special cases:
The first-failure censored scheme when R = (0, 0, …, 0).
Progressive Type II censored order statistics when k = 1.
The usual Type II censored order statistics when k = 1 and R = (0, …, 0, n − m).
The complete sample case when k = 1 and R = (0, 0, …, 0).
The main aim of this paper is to develop confidence intervals and the MLEs for the Burr XII parameters based on a progressively first-failure-censored sample in the presence of competing risks. The organization of the paper is as follows. The model description and notation used throughout this paper are introduced in Section 2. The MLEs of the unknown parameters are presented in Section 3. Approximate confidence intervals and two parametric bootstrap confidence intervals are discussed in Section 4. A real data set, due to Hoel, is analyzed in Section 5. In Section 6, the different methods are compared by conducting Monte Carlo simulations. Some concluding remarks are made in Section 7.
2. Model Assumptions and Notations
Before proceeding any further, we describe some notations we are going to use in this paper.
X_i : lifetime of the i-th unit.
X_{ji} : lifetime of the i-th individual under cause j, j = 1, 2.
F(·) : cumulative distribution function (cdf) of X_i.
f(·) : probability density function (pdf) of X_i.
F_j(·) : cdf of X_{ji}.
f_j(·) : pdf of X_{ji}.
S_j(·) : survival function of X_{ji}.
δ_i : indicator variable denoting the cause of failure of the i-th individual.
To simplify the notation, we will hereafter write X_i instead of X_{i:m:n:k}, i = 1, …, m. The model studied in this paper satisfies the following assumptions:
I) The lifetime of unit i is denoted by X_i, i = 1, …, m. The time at which unit i fails due to cause j is X_{ji}, j = 1, 2. That is, X_i = min(X_{1i}, X_{2i}).
II) The distribution of the random variable X_{ji} is Burr XII with shape parameters α_j and β, j = 1, 2. That is, the pdf and cdf of X_{ji}, i = 1, …, m, are
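The displayed formulas were lost in extraction; the standard Burr XII forms consistent with the shape-parameter description in this section (a hedged reconstruction) are

```latex
f_j(x) = \alpha_j \beta\, x^{\beta-1}\bigl(1+x^{\beta}\bigr)^{-(\alpha_j+1)},
\qquad
F_j(x) = 1-\bigl(1+x^{\beta}\bigr)^{-\alpha_j},
\qquad x>0,\ \alpha_j,\beta>0 .
```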
The corresponding reliability and failure rate functions of the Burr XII distribution at some time t > 0 are
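Correspondingly, under the standard Burr XII parametrization F_j(x) = 1 − (1 + x^β)^{−α_j} (again a hedged reconstruction, not the paper's verbatim display):

```latex
S_j(t) = \bigl(1+t^{\beta}\bigr)^{-\alpha_j},
\qquad
h_j(t) = \frac{f_j(t)}{S_j(t)} = \frac{\alpha_j \beta\, t^{\beta-1}}{1+t^{\beta}} .
```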
The two-parameter Burr XII distribution has a unimodal or decreasing failure rate function. It is clear that the parameter α_j does not affect the shape of the failure rate function; that shape is governed by β. Thus the shape parameter β plays an important role. Its capacity to assume various shapes often permits a good fit for describing biological, clinical or other experimental data sets.
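The four Burr XII functions above can be evaluated numerically as follows. This is a minimal sketch assuming the standard parametrization F(x) = 1 − (1 + x^b)^(−a), with a standing for α_j and b for β; the function names are illustrative, not from the paper.

```python
# Burr XII with shape parameters a (alpha_j) and b (beta), assuming the
# standard parametrization F(x) = 1 - (1 + x**b)**(-a).

def burr_pdf(x, a, b):
    return a * b * x ** (b - 1) * (1 + x ** b) ** (-(a + 1))

def burr_cdf(x, a, b):
    return 1 - (1 + x ** b) ** (-a)

def burr_sf(x, a, b):          # reliability (survival) function
    return (1 + x ** b) ** (-a)

def burr_hazard(x, a, b):      # failure rate h(x) = f(x) / S(x)
    return a * b * x ** (b - 1) / (1 + x ** b)
```

Note that a cancels out of the shape of burr_hazard as a function of x (it only scales the level), while b alone determines whether the hazard is decreasing (b ≤ 1) or unimodal (b > 1), matching the remark above.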
3. Maximum Likelihood Estimation
Based on the observed sample (X_1, δ_1), (X_2, δ_2), …, (X_m, δ_m), the likelihood function is given by
The log-likelihood function, without the additive constant, can be written as follows:
Calculating the first partial derivatives of the log-likelihood with respect to α_1, α_2 and β and equating them to zero, we obtain the likelihood equations.
Hence, the MLEs of α_1 and α_2, for fixed β, are respectively given by
Substituting these expressions back into the log-likelihood, we obtain the profile likelihood function for β as
Therefore, the MLE of β, say β̂, can be obtained by maximizing the profile likelihood with respect to β; the maximizing value is obtained from the resulting profile likelihood equation.
Thus, the MLE of the parameter β can be obtained by solving this nonlinear equation using, for example, Newton-Raphson or fixed-point iteration. The corresponding MLEs α̂_1 and α̂_2 of the parameters α_1 and α_2 are then computed from the closed-form expressions above.
Notice that both m_1 and m_2, the numbers of failures due to causes 1 and 2, follow binomial distributions with sample size m; in particular, m_1 ~ Bin(m, α_1/(α_1 + α_2)).
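The estimation recipe above (closed-form α̂_j for fixed β, then a one-dimensional search over the profile likelihood of β) can be sketched as follows. Since the paper's displayed equations were lost, this follows the standard development: with group size k and removals R_i, each α̂_j(β) = m_j / [k Σ (R_i + 1) ln(1 + x_i^β)], and a coarse grid search stands in for the Newton-Raphson step. All names here are illustrative assumptions.

```python
import math

# x: observed group-minimum failure times; d: causes (1 or 2);
# R: removal numbers; k: group size.

def loglik(a1, a2, b, x, d, R, k):
    # Log-likelihood (up to an additive constant) of the competing-risks
    # Burr XII model under progressive first-failure censoring.
    m1 = sum(1 for di in d if di == 1)
    m2 = len(d) - m1
    ll = m1 * math.log(a1) + m2 * math.log(a2) + len(x) * math.log(b)
    for xi, Ri in zip(x, R):
        u = math.log(1 + xi ** b)
        ll += (b - 1) * math.log(xi) - u - (a1 + a2) * k * (Ri + 1) * u
    return ll

def alpha_hats(b, x, d, R, k):
    # Closed-form MLEs of alpha_1 and alpha_2 for fixed beta.
    W = k * sum((Ri + 1) * math.log(1 + xi ** b) for xi, Ri in zip(x, R))
    m1 = sum(1 for di in d if di == 1)
    return m1 / W, (len(d) - m1) / W

def profile_mle(x, d, R, k, grid=None):
    # Maximize the profile likelihood of beta over a coarse grid
    # (a stand-in for Newton-Raphson / fixed-point iteration).
    grid = grid or [0.1 * j for j in range(1, 60)]
    best = max(grid, key=lambda b: loglik(*alpha_hats(b, x, d, R, k), b, x, d, R, k))
    a1, a2 = alpha_hats(best, x, d, R, k)
    return a1, a2, best
```

Because α̂_1(β) and α̂_2(β) share the same denominator, their ratio always equals m_1/m_2, which mirrors the binomial structure of m_1 noted above.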
4. Confidence Intervals
In this section, we propose different confidence intervals. One is based on the asymptotic distribution of the MLEs; the other two are bootstrap confidence intervals.
4.1. Approximate Confidence Intervals
From the log-likelihood function, we obtain the second partial derivatives that form the observed Fisher information matrix.
To find confidence intervals for the estimators, we determine the asymptotic distribution of the maximum likelihood estimator of the vector of unknown parameters θ = (α_1, α_2, β), which produces approximate confidence intervals. It is known that the asymptotic distribution of the MLE is given by
where I^{-1}(θ) is the variance-covariance matrix of the vector of unknown parameters θ. In practice, we usually estimate I^{-1}(θ) by I^{-1}(θ̂), the inverse of the observed information matrix evaluated at the MLE.
Therefore, the approximate two-sided 100(1 − γ)% confidence intervals for α_1, α_2 and β are, respectively, given by
Here, z_{γ/2} is the upper (γ/2)-th percentile of the standard normal distribution.
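The Wald-type interval described above is simply estimate ± z × standard error, with the variance taken from the diagonal of the inverse observed information matrix. A minimal sketch (function name illustrative; z = 1.96 hard-coded for the 95% level):

```python
import math

def wald_ci(estimate, variance, z=1.96):
    # Normal-approximation (Wald) interval: estimate +/- z * sqrt(variance),
    # where the variance comes from the inverse observed information matrix.
    half = z * math.sqrt(variance)
    return estimate - half, estimate + half
```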
4.2. Bootstrap Confidence Intervals
In this subsection, we propose two confidence intervals based on parametric bootstrap methods: (i) the percentile bootstrap method (Boot-p), based on the idea of Efron, and (ii) the bootstrap-t method (Boot-t), based on the idea of Hall. The construction of confidence intervals using both methods is illustrated briefly in the following steps:
Step 1: From the original data (X_1, δ_1), (X_2, δ_2), …, (X_m, δ_m), compute the MLEs of the parameters α_1, α_2 and β.
Step 2: Use α̂_1, α̂_2 and β̂ from Step 1 to generate a bootstrap sample with the same values of m, k and censoring scheme R. We use the algorithm proposed by Balakrishnan and Sandhu, together with the fact that a progressive first-failure-censored sample from a population with distribution function F(x) can be viewed as a progressive Type II censored sample from a population with distribution function 1 − (1 − F(x))^k; the number of cause-1 failures is generated from the Bin(m, α̂_1/(α̂_1 + α̂_2)) distribution.
Step 3: Repeat Step 2 N times, obtaining the bootstrap estimates θ̂*_1, θ̂*_2, …, θ̂*_N for each parameter θ ∈ {α_1, α_2, β}.
Step 4: Arrange all θ̂* values in ascending order to obtain the ordered bootstrap sample θ̂*_{(1)} ≤ θ̂*_{(2)} ≤ … ≤ θ̂*_{(N)}.
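Step 2's generation can be sketched as follows. The Balakrishnan-Sandhu algorithm produces progressively Type II censored uniform order statistics, which are then inverted through the group-minimum distribution 1 − (1 − F(x))^k; for the Burr XII cdf F(x) = 1 − (1 + x^b)^(−a) this inversion has a closed form. This is a hedged illustration under those standard assumptions, not the paper's code.

```python
import random

def progressive_uniform(R, rng=random):
    # Balakrishnan-Sandhu (1995): generate progressively Type II censored
    # order statistics from Uniform(0, 1) for removal scheme R.
    m = len(R)
    w = [rng.random() for _ in range(m)]
    # V_i = W_i ** (1 / (i + R_m + ... + R_{m-i+1}))
    v = [w[i] ** (1.0 / (i + 1 + sum(R[m - i - 1:]))) for i in range(m)]
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}
    u, prod = [], 1.0
    for i in range(m):
        prod *= v[m - i - 1]
        u.append(1.0 - prod)
    return u  # nondecreasing values in (0, 1)

def pffc_burr_sample(R, k, a, b, rng=random):
    # Invert the group-minimum cdf 1 - (1 - F(x))**k for Burr XII F.
    out = []
    for ui in progressive_uniform(R, rng):
        s = (1.0 - ui) ** (1.0 / k)              # survival of a single unit
        out.append((s ** (-1.0 / a) - 1.0) ** (1.0 / b))
    return out
```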
I- Percentile bootstrap method (Boot-p)
Let G(x) = P(θ̂* ≤ x) be the cdf of θ̂*. Define θ̂_{Boot-p}(x) = G^{-1}(x) for a given x. The approximate 100(1 − γ)% bootstrap confidence interval of θ is then given by (θ̂_{Boot-p}(γ/2), θ̂_{Boot-p}(1 − γ/2)).
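The percentile construction can be illustrated for a generic estimator as follows; in the paper the estimator would be the Burr XII MLE of each parameter, and the resampling would follow Step 2 rather than the simple case-resampling used here for brevity. Names and defaults are assumptions.

```python
import random

def boot_p_ci(data, estimator, gamma=0.05, B=1000, rng=random):
    # Percentile (Boot-p) interval: resample, re-estimate, take the
    # gamma/2 and 1 - gamma/2 empirical quantiles of the estimates.
    n = len(data)
    stats = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)
    )
    lo = stats[int((gamma / 2) * B)]
    hi = stats[int((1 - gamma / 2) * B) - 1]
    return lo, hi
```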
II- Bootstrap-t method (Boot-t)
Compute the following statistic:
where Var(θ̂*) is obtained from the Fisher information matrix. Using the T* values, determine the upper and lower bounds of the confidence interval of θ as follows: let H(x) = P(T* ≤ x) be the cdf of T*. For a given x, define θ̂_{Boot-t}(x) = θ̂ + √Var(θ̂) H^{-1}(x).
Here also, Var(θ̂) can be computed in the same manner as Var(θ̂*). The approximate 100(1 − γ)% confidence interval of θ is then given by (θ̂_{Boot-t}(γ/2), θ̂_{Boot-t}(1 − γ/2)).
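The Boot-t construction (studentize each bootstrap estimate, take percentiles of T*, map back through θ̂ + se·H^{-1}(·)) is illustrated here for the sample mean with se = s/√n; the paper instead uses the Fisher-information-based variance of the MLE. All names are illustrative.

```python
import math
import random

def boot_t_ci(data, gamma=0.05, B=1000, rng=random):
    n = len(data)

    def mean_se(d):
        # Point estimate and its standard error (here: mean, s / sqrt(n)).
        m = sum(d) / len(d)
        var = sum((x - m) ** 2 for x in d) / (len(d) - 1)
        return m, math.sqrt(var / len(d))

    est, se = mean_se(data)
    t_stats = []
    for _ in range(B):
        m_star, se_star = mean_se([data[rng.randrange(n)] for _ in range(n)])
        t_stats.append((m_star - est) / se_star)   # studentized statistic T*
    t_stats.sort()
    lo = est + se * t_stats[int((gamma / 2) * B)]
    hi = est + se * t_stats[int((1 - gamma / 2) * B) - 1]
    return lo, hi
```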
Hall showed that the Boot-t confidence interval is better than the Boot-p confidence interval from an asymptotic point of view.
5. Data Analysis
We consider in this section a real-life data set originally reported by Hoel and later analyzed by several authors; see, for example, Pareek et al., Sarhan et al. and Cramer and Schmiedt. It was obtained from a laboratory experiment in which male mice received a radiation dose of 300 roentgens. The cause of death for each mouse was determined by autopsy. Restricting the analysis to two causes of death, we consider thymic lymphoma as cause 1 and combine all other causes of death as cause 2, and record the numbers of deaths due to each cause.
The mean, standard deviation and coefficient of skewness were calculated for each of the two causes of death. The measure of skewness indicates that the data are positively skewed for cause 1 and negatively skewed for cause 2. For computational ease, each data point has been divided by a fixed constant.
To check the validity of the model, we compute the Kolmogorov-Smirnov (K-S) statistic to assess whether the Burr XII model is suitable for these data. The maximum likelihood estimates of α and β are obtained separately for the two causes of death, and the K-S distances and associated p-values are computed for each. Based on the p-values, the Burr XII model is found to fit the data well. We have plotted the empirical and fitted survival functions in Fig. 1 and Fig. 2 for both data sets; observe that they fit the data very well.
The observations in the data are then randomly grouped into groups of equal size, and a pre-determined progressive first-failure censoring scheme is applied; a progressively first-failure-censored competing risks sample of death times is thereby obtained.
The censored sample contains deaths due to cause 1 and deaths due to cause 2, and we compute the MLEs of the unknown parameters from it. Before doing so, we plot the profile log-likelihood function in Fig. 3. From Fig. 3 it is clear that the profile log-likelihood function is unimodal, so the iteration for solving the likelihood equation is started at a value close to its mode. We then obtain the MLEs α̂_1, α̂_2 and β̂ together with the approximate confidence intervals for α_1, α_2 and β and the corresponding Boot-p and Boot-t confidence intervals, and report the means of the bootstrap samples.
6. Monte Carlo Simulations
In this section we perform simulation experiments to observe the behavior of the different methods. Monte Carlo simulations were performed utilizing 1000 progressively first-failure-censored samples for each simulation. The samples were generated using the algorithm described in Balakrishnan and Sandhu with (β, α_1, α_2) = (1.5, 0.3, 0.7) and (1.5, 0.4, 0.5) and different choices of n, m and k. We use the fact that a progressively first-failure-censored sample is a progressively Type II censored sample from a population with distribution function 1 − (1 − F(x))^k. For each data point, we assigned the cause of failure as 1 or 2 with probability α_1/(α_1 + α_2) and α_2/(α_1 + α_2), respectively. We consider the following different sampling schemes:
Scheme I, Scheme II and Scheme III: three different pre-specified removal patterns R for each choice of (n, m).
The MLE of the parameter β is then computed by solving the profile likelihood equation using Newton-Raphson iteration. Once β is estimated, α̂_1 and α̂_2 are obtained from their closed-form expressions. We compute the average estimates and mean squared errors (MSEs) of the MLEs, and the estimated coverage probabilities and average lengths are computed for the different methods of estimation. For both Boot-p and Boot-t, bootstrap replications were generated for each simulated sample. Tables 1-4 summarize the obtained results.
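The design above (simulate repeatedly, estimate, then summarize by average estimate and MSE) can be sketched in miniature. A trivial estimator, the sample mean of Uniform(0, 1) draws, stands in for the Burr XII MLE; the loop structure is what carries over to the paper's study.

```python
import random

def monte_carlo(true_value=0.5, n=50, reps=1000, rng=random):
    # Repeat: simulate a sample, compute the estimate; then summarize
    # the estimates by their average and mean squared error.
    estimates = [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]
    avg = sum(estimates) / reps
    mse = sum((e - true_value) ** 2 for e in estimates) / reps
    return avg, mse
```

In the paper's setting, coverage probabilities are obtained the same way: count in what fraction of the reps each confidence interval contains the true parameter value.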
Table 1. The average estimates of α_1, α_2 and β and their mean squared errors (within brackets) of the MLEs, for different censoring schemes.
Table 2. The average estimates of α_1, α_2 and β and their mean squared errors (within brackets) of the MLEs, for different censoring schemes.
Table 3. The average 95% confidence lengths and the corresponding coverage percentages (within brackets) of Boot-p and Boot-t, for different censoring schemes.
Table 4. The average 95% confidence lengths and the corresponding coverage percentages (within brackets) of the MLEs, Boot-p and Boot-t, for different censoring schemes.
7. Conclusions
In this paper, we have analyzed progressive first-failure censoring in the presence of competing risks. In particular, we have assumed that the latent failure times under the competing risks follow independent Burr XII distributions with a common shape parameter β. We compared different statistical inference procedures for the unknown parameters based on the MLE, Boot-p and Boot-t methods in this setting. We conducted a simulation study to assess the performance of all these procedures, and a numerical example was presented to illustrate all the methods of inference developed in this paper. This work can be extended in several directions: Bayesian inference is possible through the inclusion of suitable prior distributions, and the inferences can be extended to allow for more than two causes. Based on the results of the simulation study, some points are clear from this experiment. Even for small sample sizes, we observe the following:
The results obtained in this paper specialize to: (a) first-failure-censored order statistics by taking R = (0, 0, …, 0); (b) progressively Type II censored order statistics for k = 1; (c) the usual Type II censored order statistics for k = 1 and R = (0, …, 0, n − m); (d) the complete sample for k = 1 and R = (0, 0, …, 0).
From the tables, as expected, for all the methods the average lengths and the MSEs decrease as n and m increase.
From the tables, in most cases the estimated coverage probability is close to the nominal level of 95% for the different methods.
We also observe very stable coverage probabilities (quite close to the nominal level). Moreover, the performances of the MLE, Boot-p and Boot-t methods are satisfactory even for small sample sizes, as their actual coverage probabilities are close to the specified nominal levels in most cases.