A New Approach to Dose Estimation in Drug Development Based on Maximization of Likelihood of Grouped Data
Nicholas A. Nechval^{1, *}, Gundars Berzins^{2}, Vadims Danovics^{3}
^{1}Department of Mathematics, Baltic International Academy, Riga, Latvia
^{2}Department of Management, University of Latvia, Riga, Latvia
^{3}Department of Marketing, University of Latvia, Riga, Latvia
To cite this article:
Nicholas A. Nechval, Gundars Berzins, Vadims Danovics. A New Approach to Dose Estimation in Drug Development Based on Maximization of Likelihood of Grouped Data. American Journal of Theoretical and Applied Statistics. Special Issue: Novel Ideas for Efficient Optimization of Statistical Decisions and Predictive Inferences under Parametric Uncertainty of Underlying Models with Applications. Vol. 5, No. 2-1, 2016, pp. 12-20. doi: 10.11648/j.ajtas.s.2016050201.13
Abstract: Identifying the ‘right’ dose is one of the most critical and difficult steps in the clinical development process of any medicinal drug. Its importance cannot be overstated: selecting too high a dose can result in unacceptable toxicity and associated safety problems, while choosing too low a dose leads to smaller chances of showing sufficient efficacy in confirmatory trials, thus reducing the chance of approval for the drug. The optimal dose is the dose that gives the desired effect with minimum side effects. The dose of a drug is of course ‘optimal’ only for a given subject, but not necessarily for any other. In view of this, the objective of a dose-finding trial is not to determine a single fixed dose for use in the early phases of clinical trials or in medical practice, but to determine an interval of doses within which there is a stated degree of confidence that the defined, acceptable therapeutic response and the frequency of adverse reactions will lie above and below, respectively, certain acceptable predetermined levels. If the subject samples used in the dose-finding studies adequately represent the subject population for which the drug is intended, the interval of doses so defined can be applied to the subject population as a whole. In this paper, we propose a technique based on maximization of the likelihood function in order to estimate the maximal tolerated dose (MTD) and minimal effective dose (MED) on the basis of l samples of subjects, which are grouped in the simplest way. The necessary and sufficient conditions for the existence and uniqueness of the maximum likelihood estimates are derived. The proposed approach to dose estimation in drug development is simple and suitable for medical practice. Numerical examples are given.
Keywords: Drug Development, Dose Estimation, Grouped Data, Likelihood Function, Maximization
1. Introduction
The proper understanding and characterization of the dose-response relationship for a new compound is a fundamental step in the clinical development process of any medicinal drug. The determination of the doses of a new pharmaceutical preparation to be used in clinical practice is a very important issue in the early phases of clinical trials: this is recognized not only in the literature [1] but is also mentioned in the F.D.A. guidelines for the clinical evaluation of drugs [2].
A common problem in toxicological and drug development studies is to assess the biological activity of a chemical compound. For this purpose, a dose-response experiment is conducted in which several doses of the compound are administered to separate groups of experimental units. There are two primary goals in these studies. In a toxicological study the goal is to estimate a safe dose that will not cause some undesirable effect (e.g., toxicity, carcinogenicity), whereas in a drug development study the goal is to estimate the lowest dose that will cause some desirable effect. Indeed, in Phase I clinical trials, researchers test a new drug or treatment in a small group of subjects for the first time in order to evaluate its safety, identify side-effects and determine a therapeutically useful interval of doses. The upper end of the interval is the maximal tolerated dose (MTD) and the lower end of the interval is the minimal effective dose (MED). Here we first address the estimation of the minimal effective dose; the generalization to the maximal tolerated dose is straightforward.
A well-established approach to search for MED and MTD is based on the construction of a dose–toxicity curve. Robbins and Monro [3] and later Wu [4] tried to estimate, through stochastic approximation, the MTD as a quantile of this curve. Eichhorn and Zacks [5,6] studied the sequential search problem through linear regression dose–toxicity models. Inference within the dose–response framework that accounts for the model uncertainty was discussed by Pinheiro, Bretz, and Branson [7]; Bornkamp, Pinheiro, and Bretz [8]; and Whitney and Ryan [9], who used parametric nonlinear models to characterize the dose–response relationship and proposed to use model averaging techniques to account for model uncertainty. Within this approach several parametric nonlinear models are fitted to the data and information from all models is combined, using information criteria, for both estimation and inference. Although some of these methods perform quite well in small sample dose–response study simulations, their main drawback is that they require an explicit dose–toxicity curve, often nonparametric, which can be artificial and complicated. Gasparini and Eisele [10] proposed a curve-free method: modelling the probabilities of toxicity directly as the unknown parameter of interest, a product of beta prior distributions is considered.
In this paper, the parameters of interest are the probability distribution functions of a suitable dose in drug development (from the point of view of toxicity and from the point of view of efficacy, respectively), which are determined via a new approach based on maximization of likelihood of grouped data.
2. Minimal Effective Dose Estimation
2.1. Likelihood Function of Grouped Data
Let us assume that the random variable X, which represents an effective dose level of a drug (from the point of view of efficacy) for a randomly chosen subject, has a continuous cumulative distribution function F_{θ_{e}}(x) (probability density function f_{θ_{e}}(x)) with an unknown parametric vector θ_{e}. We consider l (l ≥ the number of components of the unknown parametric vector) random samples of subjects of sizes N_{j}, j=1(1)l. The N_{j} subjects of the jth random sample are assigned to the dose d_{j} of a drug. Let n_{j} be the number of subjects in the jth sample for which the effective dose of a drug is less than d_{j}. It is assumed, without loss of generality, that 0 < d_{1} < d_{2} < … < d_{l}.
The problem is to estimate the unknown parametric vector θ_{e}. For this purpose, the likelihood function of the grouped data can be used:
L(θ_{e}) = \prod_{j=1}^{l} [F_{θ_{e}}(d_{j})]^{n_{j}} [1 − F_{θ_{e}}(d_{j})]^{N_{j} − n_{j}}.   (1)
Consider a situation described by a location-scale family of probability distribution functions, indexed by a parametric vector θ_{e} = (μ_{e}, σ_{e}),
F_{θ_{e}}(x) = F((x − μ_{e})/σ_{e}),   (2)
where −∞ < μ_{e} < ∞ is a location parameter, σ_{e} > 0 is a scale parameter, and the distribution F(z) of Z = (X − μ_{e})/σ_{e} does not depend on any unknown parameters.
Assumption 1. F(z) is a strictly increasing continuous function for all z ∈ (−∞, ∞), i.e., F(z_{1}) < F(z_{2}) for all z_{1} < z_{2}.
For computational convenience, the maximum likelihood estimate is obtained by maximizing the log-likelihood function, ln L(θ_{e}). This is because the two functions, L(θ_{e}) and ln L(θ_{e}), are monotonically related to each other, so the same maximum likelihood (ML) estimate is obtained by maximizing either one. Thus,
ln L(θ_{e}) = \sum_{j=1}^{l} [n_{j} ln F(z_{j}) + (N_{j} − n_{j}) ln(1 − F(z_{j}))],   (3)
where
z_{j} = (d_{j} − μ_{e})/σ_{e},  j=1(1)l.   (4)
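The grouped-data log-likelihood can be evaluated in a few lines. The following is a minimal sketch, assuming the standard normal distribution as the standardized distribution of Z; the function names and the data are hypothetical and are not taken from the paper.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, used here as one possible choice for the distribution of Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def grouped_log_likelihood(mu, sigma, doses, n, N, cdf=normal_cdf):
    """Log-likelihood of grouped data: n[j] of N[j] subjects have an
    effective dose below doses[j]."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for d_j, n_j, N_j in zip(doses, n, N):
        p = cdf((d_j - mu) / sigma)              # P(X < d_j)
        p = min(max(p, 1e-12), 1.0 - 1e-12)      # keep log() finite
        total += n_j * math.log(p) + (N_j - n_j) * math.log(1.0 - p)
    return total

# Hypothetical illustrative data (NOT the data of the paper's examples).
doses = [2.0, 4.0, 6.0, 8.0]
n = [1, 3, 7, 9]
N = [10, 10, 10, 10]
print(grouped_log_likelihood(5.0, 2.5, doses, n, N))
```

Any constant factor that does not depend on the parameters (for example binomial coefficients) is omitted here, since it does not change the location of the maximum.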
2.2. Log-Likelihood Equations
To obtain the log-likelihood equations for the unknown parameters μ_{e} and σ_{e}, the log-likelihood function in Equation (3) is differentiated partially with respect to μ_{e} and σ_{e}, respectively, and set equal to zero to provide the maximum likelihood estimates (MLEs), \hat{μ}_{e} and \hat{σ}_{e}, of the unknown parameters μ_{e} and σ_{e}, respectively, as follows:
(5)
(6)
where
(7)
Equations (5) and (6), which represent a necessary condition for the existence of the MLEs, can be reduced to:
(8)
(9)
It is usually not possible to obtain a closed-form solution of Equations (8) and (9). Hence, a numerical iterative technique can be used to obtain the MLEs, \hat{μ}_{e} and \hat{σ}_{e}, of the parameters μ_{e} and σ_{e}, respectively.
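One possible iterative technique is Fisher scoring. The sketch below is an illustration under stated assumptions, not the procedure used in the paper (which relies on a spreadsheet solver): it assumes a normal model, reparameterizes P(X < d) as Φ(a + b·d) with a = −μ/σ and b = 1/σ, and uses hypothetical function names.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fit_normal_grouped(doses, n, N, iters=50, tol=1e-10):
    """Fisher scoring for P(X < d) = Phi(a + b*d); returns (mu, sigma)."""
    a, b = 0.0, 0.0                       # start with p = 0.5 at every dose
    for _ in range(iters):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for d, nj, Nj in zip(doses, n, N):
            eta = a + b * d
            p = min(max(Phi(eta), 1e-10), 1.0 - 1e-10)
            w = phi(eta) / (p * (1.0 - p))
            r = (nj - Nj * p) * w         # contribution to the score vector
            k = Nj * phi(eta) * w         # contribution to the expected information
            u0 += r
            u1 += r * d
            i00 += k
            i01 += k * d
            i11 += k * d * d
        det = i00 * i11 - i01 * i01
        da = (i11 * u0 - i01 * u1) / det  # solve the 2x2 system I * step = U
        db = (i00 * u1 - i01 * u0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    if b <= 0:
        raise ValueError("no finite MLE: response does not increase with dose")
    return -a / b, 1.0 / b

# Hypothetical illustrative data (NOT the data of the paper's examples).
mu_hat, sigma_hat = fit_normal_grouped([2.0, 4.0, 6.0, 8.0], [1, 3, 7, 9], [10] * 4)
print(mu_hat, sigma_hat)
```

The reparameterization is a design choice: the normal log-likelihood is concave in (a, b), which tends to make the iteration well behaved, and the estimates of μ and σ are recovered at the end.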
2.3. Existence and Uniqueness of the ML Estimates
The maximum likelihood estimates need not exist nor be unique. In this section, we show that the MLEs, \hat{μ}_{e} and \hat{σ}_{e}, which maximize (1), exist in the above case and are unique.
Assumption 2. and exist and are continuous functions for all z ∈ (−∞, ∞).
Assumption 3. is a strictly increasing function for all z ∈ (−∞, ∞); is a strictly decreasing function for all z ∈ (−∞, ∞).
It follows from (8) and (9) that the MLEs, \hat{μ}_{e} and \hat{σ}_{e}, exist and are unique if and only if
(10)
or
(11)
A sufficient condition for \hat{θ}_{e} = (\hat{μ}_{e}, \hat{σ}_{e}) to be a maximum of (3) is that the Hessian matrix
(12)
evaluated at \hat{θ}_{e} satisfies the following condition [11]:
(13)
where
(14)
It follows from (5) and (6) that
(15)
where
(16)
(17)
(18)
(19)
Then
(20)
and
(21)
i.e., the MLEs, \hat{μ}_{e} and \hat{σ}_{e}, exist and are the unique values that maximize (1). Thus, the following theorem has been proven.
Theorem 1. Suppose that X is a random variable coming from a situation described by a location-scale family of probability distribution functions, indexed by a parametric vector θ_{e} = (μ_{e}, σ_{e}),
F_{θ_{e}}(x) = F((x − μ_{e})/σ_{e}),   (22)
where −∞ < μ_{e} < ∞ is a location parameter, σ_{e} > 0 is a scale parameter, and the distribution of Z = (X − μ_{e})/σ_{e} does not depend on any unknown parameters.
To find an estimate of the unknown parametric vector θ_{e}, the likelihood function,
L(θ_{e}) = \prod_{j=1}^{l} [F_{θ_{e}}(d_{j})]^{n_{j}} [1 − F_{θ_{e}}(d_{j})]^{N_{j} − n_{j}},   (23)
of the observed grouped data n_{j} and N_{j} − n_{j} from l random samples of sizes N_{j}, j=1(1)l, respectively, is used, where
(24)
are fixed values,
(25)
Then the estimates, \hat{μ}_{e} and \hat{σ}_{e}, which maximize (23), exist and are unique if and only if the following conditions are satisfied:
1) F(z) is a strictly increasing continuous function for all z ∈ (−∞, ∞),
2) and exist and are continuous functions for all z ∈ (−∞, ∞),
3) is a strictly increasing function for all z ∈ (−∞, ∞); is a strictly decreasing function for all z ∈ (−∞, ∞).
4)
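The second-order check behind the theorem can also be carried out numerically. The sketch below uses the standard sufficient condition for a strict local maximum of a two-parameter function at a stationary point (a negative first diagonal entry and a positive determinant of the Hessian). It assumes a normal model and approximates the Hessian by central finite differences; the data, the evaluation point and all function names are hypothetical.

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik(mu, sigma, doses, n, N):
    """Grouped-data log-likelihood under an assumed normal model."""
    total = 0.0
    for d, nj, Nj in zip(doses, n, N):
        p = min(max(Phi((d - mu) / sigma), 1e-12), 1.0 - 1e-12)
        total += nj * math.log(p) + (Nj - nj) * math.log(1.0 - p)
    return total

def hessian(f, x, y, h=1e-4):
    """Central finite-difference approximation of the Hessian of f at (x, y)."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h**2)
    return fxx, fxy, fyy

def is_negative_definite(fxx, fxy, fyy):
    """2x2 negative definiteness: fxx < 0 and det > 0."""
    return fxx < 0.0 and fxx * fyy - fxy**2 > 0.0

# Hypothetical data; (5.0, 2.26) is an approximate maximizer for these data.
doses, n, N = [2.0, 4.0, 6.0, 8.0], [1, 3, 7, 9], [10, 10, 10, 10]
f = lambda mu, sigma: loglik(mu, sigma, doses, n, N)
H = hessian(f, 5.0, 2.26)
print(H, is_negative_definite(*H))
```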
2.4. Estimation of Minimal Effective Dose
Numerical methods can be used to find the ML estimate \hat{θ}_{e} = (\hat{μ}_{e}, \hat{σ}_{e}). Then an estimate of the minimal effective dose (MED) for randomly chosen subjects, which elicits a prescribed lowest therapeutic response, is given by
(26)
where
(27)
α is a significance level (say, α = 0.05).
3. Maximal Tolerated Dose Estimation
3.1. Likelihood Function of Grouped Data
Let us assume that the random variable Y, which represents a tolerated dose level of a drug (from the point of view of toxicity) for a randomly chosen subject, has a continuous cumulative distribution function F_{θ_{t}}(y) (probability density function f_{θ_{t}}(y)) with an unknown parametric vector θ_{t}. We consider l (l ≥ the number of components of the unknown parametric vector) random samples of subjects of sizes N_{j}, j=1(1)l. The N_{j} subjects of the jth random sample are assigned to the dose d_{j} of a drug. Let m_{j} be the number of subjects in the jth sample for which the tolerated dose of a drug is more than d_{j}. It is assumed, without loss of generality, that 0 < d_{1} < d_{2} < … < d_{l}.
The problem is to estimate the unknown parametric vector θ_{t}. For this purpose, the likelihood function of the grouped data can be used:
L(θ_{t}) = \prod_{j=1}^{l} [1 − F_{θ_{t}}(d_{j})]^{m_{j}} [F_{θ_{t}}(d_{j})]^{N_{j} − m_{j}}.   (28)
Consider a situation described by a location-scale family of probability distribution functions, indexed by the parametric vector θ_{t} = (μ_{t}, σ_{t}),
F_{θ_{t}}(y) = F((y − μ_{t})/σ_{t}),   (29)
where −∞ < μ_{t} < ∞ is a location parameter, σ_{t} > 0 is a scale parameter, and the distribution F(z) of Z = (Y − μ_{t})/σ_{t} does not depend on any unknown parameters.
Assumption 1. F(z) is a strictly increasing continuous function for all z ∈ (−∞, ∞), i.e., F(z_{1}) < F(z_{2}) for all z_{1} < z_{2}.
For computational convenience, the maximum likelihood estimate is obtained by maximizing the log-likelihood function, ln L(θ_{t}). This is because the two functions, L(θ_{t}) and ln L(θ_{t}), are monotonically related to each other, so the same maximum likelihood estimate is obtained by maximizing either one. Thus,
ln L(θ_{t}) = \sum_{j=1}^{l} [m_{j} ln(1 − F(z_{j})) + (N_{j} − m_{j}) ln F(z_{j})],   (30)
where
z_{j} = (d_{j} − μ_{t})/σ_{t},  j=1(1)l.   (31)
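Because m_{j} counts subjects whose tolerated dose exceeds d_{j}, the tolerated-dose likelihood has the same form as the effective-dose likelihood with N_{j} − m_{j} in place of n_{j}. The sketch below illustrates this substitution, so that one fitting routine can serve both problems. It assumes a normal model, and the function names are hypothetical.

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik_effective(mu, sigma, doses, n, N):
    """n[j] of N[j] subjects have a dose level below doses[j]."""
    total = 0.0
    for d, nj, Nj in zip(doses, n, N):
        p = min(max(Phi((d - mu) / sigma), 1e-12), 1.0 - 1e-12)
        total += nj * math.log(p) + (Nj - nj) * math.log(1.0 - p)
    return total

def loglik_tolerated(mu_t, sigma_t, doses, m, N):
    """m[j] of N[j] subjects tolerate doses[j] (Y > d_j), so
    N[j] - m[j] subjects have a tolerated dose not exceeding d_j."""
    below = [Nj - mj for mj, Nj in zip(m, N)]
    return loglik_effective(mu_t, sigma_t, doses, below, N)

# Hypothetical illustrative data (NOT the data of the paper's examples).
print(loglik_tolerated(10.0, 2.0, [6.0, 8.0, 10.0, 12.0], [10, 8, 5, 2], [10] * 4))
```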
3.2. Log-Likelihood Equations
To obtain the log-likelihood equations for the unknown parameters μ_{t} and σ_{t}, the log-likelihood function in Equation (30) is differentiated partially with respect to μ_{t} and σ_{t}, respectively, and set equal to zero to provide the MLEs, \hat{μ}_{t} and \hat{σ}_{t}, of the unknown parameters μ_{t} and σ_{t}, respectively, as follows:
(32)
(33)
where
(34)
Equations (32) and (33), which represent a necessary condition for the existence of the MLEs, can be reduced to:
(35)
(36)
It is usually not possible to obtain a closed-form solution of Equations (35) and (36). Hence, a numerical iterative technique can be used to obtain the MLEs, \hat{μ}_{t} and \hat{σ}_{t}, of the parameters μ_{t} and σ_{t}, respectively.
3.3. Existence and Uniqueness of the ML Estimates
The maximum likelihood estimates need not exist nor be unique. In this section, we show that the MLEs, \hat{μ}_{t} and \hat{σ}_{t}, which maximize (28), exist in the above case and are unique.
Assumption 2. and exist and are continuous functions for all z ∈ (−∞, ∞).
Assumption 3. is a strictly increasing function for all z ∈ (−∞, ∞); is a strictly decreasing function for all z ∈ (−∞, ∞).
It follows from (35) and (36) that the MLEs, \hat{μ}_{t} and \hat{σ}_{t}, exist and are unique if and only if
(37)
or
(38)
A sufficient condition for \hat{θ}_{t} = (\hat{μ}_{t}, \hat{σ}_{t}) to be a maximum of (30) is that the Hessian matrix
(39)
evaluated at \hat{θ}_{t} satisfies the following condition [11]:
(40)
where
(41)
It follows from (32) and (33) that
(42)
where
(43)
(44)
(45)
(46)
Then
(47)
and
(48)
i.e., the MLEs, \hat{μ}_{t} and \hat{σ}_{t}, exist and are the unique values that maximize (28). Thus, the following theorem has been proven.
Theorem 2. Suppose that Y is a random variable coming from a situation described by a location-scale family of probability distribution functions, indexed by the parametric vector θ_{t} = (μ_{t}, σ_{t}),
F_{θ_{t}}(y) = F((y − μ_{t})/σ_{t}),   (49)
where −∞ < μ_{t} < ∞ is a location parameter, σ_{t} > 0 is a scale parameter, and the distribution of Z = (Y − μ_{t})/σ_{t} does not depend on any unknown parameters.
To find an estimate of the unknown parametric vector θ_{t}, the likelihood function,
L(θ_{t}) = \prod_{j=1}^{l} [1 − F_{θ_{t}}(d_{j})]^{m_{j}} [F_{θ_{t}}(d_{j})]^{N_{j} − m_{j}},   (50)
of the observed grouped data m_{j} and N_{j} − m_{j} from l random samples of sizes N_{j}, j=1(1)l, respectively, is used, where
(51)
are fixed values,
(52)
Then the estimates, \hat{μ}_{t} and \hat{σ}_{t}, which maximize (50), exist and are unique if and only if the following conditions are satisfied:
1) F(z) is a strictly increasing continuous function for all z ∈ (−∞, ∞),
2) and exist and are continuous functions for all z ∈ (−∞, ∞),
3) is a strictly increasing function for all z ∈ (−∞, ∞); is a strictly decreasing function for all z ∈ (−∞, ∞).
4)
3.4. Estimation of Maximal Tolerated Dose
Numerical methods can be used to find the ML estimate \hat{θ}_{t} = (\hat{μ}_{t}, \hat{σ}_{t}). Then an estimate of a safe dose (maximal tolerated dose (MTD)) for randomly chosen subjects that will not cause some undesirable effect (e.g., toxicity, carcinogenicity) is given by
(53)
where
(54)
α is a significance level (say, α = 0.05).
It is clear that the upper end of the interval of suitable doses is the maximal tolerated dose (MTD) and the lower end of the interval is the minimal effective dose (MED).
Statistical inference. If the dose of a drug d is given by
(55)
then there is more than 100(1−α)% assurance that d will be a suitable dose for randomly chosen subjects from the point of view of efficacy as well as from the point of view of toxicity.
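The interval of suitable doses can be sketched numerically once the four parameters have been estimated. The example below rests on an assumption that should be checked against the paper's own formulas: under a normal model, the MED is taken as the (1−α) quantile of the fitted effective-dose distribution and the MTD as the α quantile of the fitted tolerated-dose distribution. The function name and the parameter values are hypothetical.

```python
from statistics import NormalDist

def dose_interval(mu_e, sigma_e, mu_t, sigma_t, alpha=0.05):
    """Return (med, mtd, non_empty) under the quantile assumption above."""
    z = NormalDist().inv_cdf(1.0 - alpha)   # standard normal (1 - alpha) quantile
    med = mu_e + sigma_e * z                # P(X <= med) = 1 - alpha
    mtd = mu_t - sigma_t * z                # P(Y >= mtd) = 1 - alpha
    return med, mtd, med <= mtd

# Hypothetical parameter estimates, for illustration only.
med, mtd, non_empty = dose_interval(5.0, 1.0, 12.0, 1.0, alpha=0.025)
print(med, mtd, non_empty)
```

When the third returned value is false, the estimated MED exceeds the estimated MTD and no dose satisfies both requirements at the chosen α.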
4. Numerical Examples
4.1. Example 1
Let us assume that the random variable X, which represents an effective dose level of a drug (from the point of view of efficacy) for a randomly chosen subject, has a continuous cumulative distribution function of a normal distribution, i.e.,
(56)
where θ_{e} = (μ_{e}, σ_{e}) is an unknown parametric vector, −∞ < μ_{e} < ∞ is a location parameter and σ_{e} > 0 is a scale parameter. There are l=4 random samples of subjects of size N_{j}=10, j=1(1)l, and l=4 fixed doses d_{j}, j=1(1)l, of a drug, where
(57)
The N_{j}=10 subjects of the jth random sample are assigned to the dose d_{j} of a drug. Let n_{j} be the number of subjects in the jth sample for which the effective dose of a drug is less than d_{j}.
The observed grouped data n_{j} and N_{j} − n_{j} from l random samples of sizes N_{j}, j=1(1)l, respectively, are given as follows:
(58)
and
(59)
It follows from (11) that
(60)
i.e., the MLEs, \hat{μ}_{e} and \hat{σ}_{e}, exist and are the unique values that maximize (1). Thus, using the Solver of Excel 2010 to maximize (1), we have that
(61)
where
(62)
(63)
Then the estimate of the minimal effective dose (MED) for randomly chosen subjects, which elicits a prescribed lowest therapeutic response, is given by
(64)
where the significance level α = 0.025.
Let us assume that the random variable Y, which represents a tolerated dose level of a drug (from the point of view of toxicity) for a randomly selected subject, also has a continuous cumulative distribution function of a normal distribution, i.e.,
(65)
The observed grouped data m_{j} and N_{j} − m_{j} from l random samples of sizes N_{j}, j=1(1)l, respectively, are given as follows:
(66)
and
(67)
It follows from (38) that
(68)
i.e., the MLEs, \hat{μ}_{t} and \hat{σ}_{t}, exist and are the unique values that maximize (28). Thus, using the Solver of Excel 2010 to maximize (28), we have that
(69)
where
(70)
(71)
Then the estimate of a safe dose (maximal tolerated dose (MTD)) for a randomly chosen subject, which will not cause some undesirable effect (e.g., toxicity, carcinogenicity), is given by
(72)
where the significance level α = 0.025.
Thus, the upper end of the interval is the maximal tolerated dose (MTD) and the lower end of the interval is the minimal effective dose (MED) = 7.0557.
Statistical inference. If the dose d of a drug is given by
(73)
then there is more than 97.5% assurance that d will be a suitable dose for randomly chosen subjects from the point of view of efficacy as well as from the point of view of toxicity.
4.2. Example 2
Consider the situation described in Example 1. Let us assume that
(74)
where
(75)
is a cumulative distribution function of a two-parameter Weibull distribution with an unknown parametric vector consisting of a scale parameter and a shape parameter. Using the Solver of Excel 2010 to maximize the likelihood function of the grouped data (58) and (59),
(76)
the following MLEs are obtained:
(77)
where
(78)
(79)
Then the estimate of the minimal effective dose (MED) for randomly selected subjects, which elicits a prescribed lowest therapeutic response, is given by
(80)
where the significance level α = 0.025.
Now, let us assume that
(81)
where
(82)
is a cumulative distribution function of a two-parameter Weibull distribution with an unknown parametric vector consisting of a scale parameter and a shape parameter. Using the Solver of Excel 2010 to maximize the likelihood function of the grouped data (66) and (67),
(83)
the following MLEs are obtained:
(84)
where
(85)
(86)
Then the estimate of a safe dose (maximal tolerated dose (MTD)) for a randomly chosen subject, which will not cause some undesirable effect (e.g., toxicity, carcinogenicity), is given by
(87)
where the significance level α = 0.025.
Maximum likelihood inference. Since
(88)
and
(89)
it follows from (88) and (89) that the normal distribution better fits the grouped data of Example 1 than the Weibull distribution.
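The comparison of candidate models by their maximized likelihoods can be sketched as follows. This is an illustration under stated assumptions: the data are hypothetical (not those of Example 1), the Weibull CDF is written as 1 − exp(−(x/β)^δ), and a simple compass (pattern) search replaces the spreadsheet solver used in the paper. For simplicity the search keeps both parameters positive, which is adequate here because all doses are positive.

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def weibull_cdf(x, beta, delta):
    """Two-parameter Weibull CDF with scale beta and shape delta."""
    return 1.0 - math.exp(-((x / beta) ** delta))

def loglik(cdf, a, b, doses, n, N):
    """Grouped-data log-likelihood for a two-parameter CDF cdf(x, a, b)."""
    total = 0.0
    for d, nj, Nj in zip(doses, n, N):
        p = min(max(cdf(d, a, b), 1e-12), 1.0 - 1e-12)
        total += nj * math.log(p) + (Nj - nj) * math.log(1.0 - p)
    return total

def compass_search(f, a, b, step=1.0, tol=1e-6):
    """Maximize f(a, b) over a > 0, b > 0 by axis-aligned trial moves."""
    best = f(a, b)
    while step > tol:
        moved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            if a + da <= 0.0 or b + db <= 0.0:
                continue                  # stay inside the positive quadrant
            val = f(a + da, b + db)
            if val > best:
                a, b, best, moved = a + da, b + db, val, True
        if not moved:
            step /= 2.0                   # refine once no move improves f
    return a, b, best

# Hypothetical illustrative data (NOT the data of the paper's examples).
doses, n, N = [2.0, 4.0, 6.0, 8.0], [1, 3, 7, 9], [10, 10, 10, 10]

mu, sigma, ll_norm = compass_search(
    lambda a, b: loglik(normal_cdf, a, b, doses, n, N), 5.0, 2.0)
beta, delta, ll_weib = compass_search(
    lambda a, b: loglik(weibull_cdf, a, b, doses, n, N), 5.0, 2.0)

better = "normal" if ll_norm > ll_weib else "Weibull"
print(ll_norm, ll_weib, better)
```

Both models have two parameters, so comparing the maximized log-likelihoods directly is equivalent to comparing the maximized likelihoods.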
5. Conclusion
Dose–response experiments are an important part of biomedical research to study relationships between increasing doses of a therapeutic compound and a variety of responses. Typically, the response represents a phenotypical effect of a compound such as inhibition, stimulation, toxicity, or expression level of a certain gene. The primary goal of such an experiment is to detect a dose–response relationship and to determine the nature of the relationship wherever it exists. In this article, we focus on a continuous response and an experimental design with fixed number of doses. If the dose–response relationship exists, it may be monotone, that is, the compound effect (increasing or decreasing) becomes stronger (or stays the same) with an increasing dose. This property is very common in real applications, especially when inhibition or toxicity is measured. More general umbrella-shaped profiles [12] can occur within the context of overdosing and therefore a decreasing (increasing) effect is expected after reaching some threshold dose. These properties are not discussed in this paper. It is assumed only that the dose of a drug suitable for a randomly chosen subject is a random variable that follows a certain law of probability distribution. If there are a few models of the probability distribution for the aforementioned random variable, then the best model is the one that provides the greatest value of the maximum likelihood function of the grouped data. The methodology described here can be extended in several different directions to handle various problems that arise in practice. We have illustrated the proposed methodology for location-scale distributions (such as, say, the normal, lognormal, Gumbel, Weibull distribution, etc.). Applications to other distributions could follow directly.
References