Two-Sided Generalized Gumbel Distribution with Application to Air Pollution Data
Mustafa Ç. Korkmaz
Artvin Çoruh University, Department of Statistics and Computer Sciences, Artvin/TURKEY
To cite this article:
Mustafa Ç. Korkmaz. Two-Sided Generalized Gumbel Distribution with Application to Air Pollution Data. International Journal of Statistical Distributions and Applications. Vol. 1, No. 1, 2015, pp. 19-26. doi: 10.11648/j.ijsd.20150101.14
Abstract: We introduce a univariate generalized form of the Gumbel distribution via the two-sided distribution structure. We obtain some of its properties, such as special cases, density shapes, the hazard rate function and moments. We give the maximum likelihood estimators of the parameters of this two-sided generalized Gumbel distribution, together with an algorithm for computing them. Finally, an application to air pollution data is given to demonstrate the potential of the distribution for modeling real data.
Keywords: Gumbel Distribution, Two-Sided Distribution, Generalized Gumbel Distribution, Exponentiated Gumbel Distribution
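1. Introduction
The Gumbel distribution, denoted by Gu, was introduced by the German statistician Emil J. Gumbel (1958). It is defined by the following cumulative distribution function (cdf) and probability density function (pdf):
$$G(x;\mu,\sigma) = \exp\left\{-\exp\left[-\frac{x-\mu}{\sigma}\right]\right\} \qquad (1)$$
and
$$g(x;\mu,\sigma) = \frac{1}{\sigma}\exp\left[-\frac{x-\mu}{\sigma}\right]\exp\left\{-\exp\left[-\frac{x-\mu}{\sigma}\right]\right\} \qquad (2)$$
respectively, where $x\in\mathbb{R}$, $\mu\in\mathbb{R}$ and $\sigma>0$.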
The Gu distribution is frequently used for modelling in many areas, such as the environmental, engineering and actuarial sciences. It is also known as the extreme value distribution of type I. In addition, the Gu distribution is a limit distribution of the generalized extreme value distribution (Von Mises, 1954). Kotz and Nadarajah (2000) describe this distribution and its applications in detail. To increase the flexibility of the Gu distribution, several generalizations have been proposed in the literature, such as the beta-Gu distribution (Nadarajah and Kotz, 2004), the generalized Gu distribution (Cooray, 2010), the Kumaraswamy-Gu distribution (Cordeiro et al., 2012), the Gu-Weibull distribution (Al-Aqtash et al., 2014) and the exponentiated generalized Gu distribution (Andrade et al., 2015). For more information on Gumbel and extreme value distributions, see Gumbel (1958), Johnson et al. (1995), Kotz and Nadarajah (2000), and Beirlant et al. (2006).
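On the other hand, Korkmaz and Genç (2015) introduced a two-sided generalized class of distributions with the following cdf:
$$F(x;\alpha,\beta,\xi) = \begin{cases} \beta\left[\dfrac{G(x;\xi)}{\beta}\right]^{\alpha}, & x \le G^{-1}(\beta;\xi) \\[2ex] 1-(1-\beta)\left[\dfrac{1-G(x;\xi)}{1-\beta}\right]^{\alpha}, & x > G^{-1}(\beta;\xi) \end{cases} \qquad (3)$$
where $\alpha>0$ is the shape parameter, $0<\beta<1$ is the reflection parameter, $\xi$ is the parameter vector of the base distribution, $G(\cdot;\xi)$ is the cdf of the base distribution and $G^{-1}(\cdot;\xi)$ is its inverse. The corresponding pdf is
$$f(x;\alpha,\beta,\xi) = \begin{cases} \alpha\, g(x;\xi)\left[\dfrac{G(x;\xi)}{\beta}\right]^{\alpha-1}, & x \le G^{-1}(\beta;\xi) \\[2ex] \alpha\, g(x;\xi)\left[\dfrac{1-G(x;\xi)}{1-\beta}\right]^{\alpha-1}, & x > G^{-1}(\beta;\xi) \end{cases} \qquad (4)$$
where $g(\cdot;\xi)$ is the pdf of the base distribution.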
Since the standard two-sided power distribution (Van Dorp and Kotz, 2002) is used as the generator of the class, it underlies this generalized two-sided class. Since $\alpha$ comes from the standard two-sided power distribution, this parameter also controls the kurtosis and the tails of the distribution. This general two-sided class can provide alternative distributions for modeling not only positive data but also negative data with high kurtosis. Korkmaz and Genç (2015) also define some members of this class using ordinary distributions such as the exponential, Weibull, normal, Fréchet, half logistic, Pareto, Gumbel and Kumaraswamy distributions. The two-sided generalized normal distribution is studied in detail by those authors. In this paper we obtain some properties of the generalized form of the Gumbel distribution, referred to as the two-sided generalized Gumbel (TSGG) distribution, which was defined by Korkmaz and Genç (2015) via the two-sided distribution structure.
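From (1), (2), (3) and (4), the cdf and pdf of the TSGG distribution are easily obtained as
$$F(x;\alpha,\beta,\xi) = \begin{cases} \beta^{1-\alpha}\exp\left(-\alpha e^{-z}\right), & x \le x_\beta \\[1ex] 1-(1-\beta)^{1-\alpha}\left[1-\exp\left(-e^{-z}\right)\right]^{\alpha}, & x > x_\beta \end{cases} \qquad (5)$$
and
$$f(x;\alpha,\beta,\xi) = \begin{cases} \dfrac{\alpha}{\sigma}\,\beta^{1-\alpha}\, e^{-z}\exp\left(-\alpha e^{-z}\right), & x \le x_\beta \\[2ex] \dfrac{\alpha}{\sigma}\,(1-\beta)^{1-\alpha}\, e^{-z}\exp\left(-e^{-z}\right)\left[1-\exp\left(-e^{-z}\right)\right]^{\alpha-1}, & x > x_\beta \end{cases} \qquad (6)$$
respectively, where $z=(x-\mu)/\sigma$, $\xi=(\mu,\sigma)$ is the parameter vector, and $x_\beta=\mu-\sigma\log(-\log\beta)$. We also note that $x_\beta$ is the reflection point of the distribution.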
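The piecewise cdf and pdf can be checked numerically with a short sketch. It assumes the standard Gumbel cdf $G(x)=\exp\{-\exp[-(x-\mu)/\sigma]\}$ as the base distribution and the two-sided construction $\beta[G/\beta]^{\alpha}$ below the reflection point and $1-(1-\beta)[(1-G)/(1-\beta)]^{\alpha}$ above it. The function names are illustrative and are not taken from the paper.

```python
import math

def gumbel_cdf(x, mu, sigma):
    """Gumbel cdf G(x) = exp(-exp(-(x - mu) / sigma))."""
    return math.exp(-math.exp(-(x - mu) / sigma))

def gumbel_pdf(x, mu, sigma):
    """Gumbel pdf g(x) = exp(-z - exp(-z)) / sigma with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma

def reflection_point(beta, mu, sigma):
    """x_beta = G^{-1}(beta) = mu - sigma * log(-log(beta))."""
    return mu - sigma * math.log(-math.log(beta))

def tsgg_cdf(x, alpha, beta, mu, sigma):
    """Assumed TSGG cdf: two-sided power construction applied to G."""
    G = gumbel_cdf(x, mu, sigma)
    if G <= beta:                      # equivalent to x <= x_beta
        return beta * (G / beta) ** alpha
    return 1.0 - (1.0 - beta) * ((1.0 - G) / (1.0 - beta)) ** alpha

def tsgg_pdf(x, alpha, beta, mu, sigma):
    """Assumed TSGG pdf: derivative of tsgg_cdf on each branch.

    Intended for moderate x only; far in the left tail G underflows to 0.
    """
    G = gumbel_cdf(x, mu, sigma)
    g = gumbel_pdf(x, mu, sigma)
    if G <= beta:
        return alpha * g * (G / beta) ** (alpha - 1.0)
    return alpha * g * ((1.0 - G) / (1.0 - beta)) ** (alpha - 1.0)
```

With $\alpha=1$ both branches collapse to the Gumbel cdf, and the cdf equals $\beta$ at the reflection point, which gives two quick sanity checks on any implementation.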
In the rest of this paper, we obtain the special cases and explore the density shapes of (6). We examine the hazard rate function and derive formulas for the rth moment. We also consider the maximum likelihood estimation of the parameters. Finally, we end the paper with a real data application.
2. Special Cases and Shapes
When $\alpha=1$ and $\alpha=2$, the TSGG distribution reduces to the Gumbel distribution and the triangular-Gumbel distribution, respectively. When $\beta\to1$, the distribution becomes the exponentiated-Gumbel distribution with cdf $[G(x)]^{\alpha}$, that is, the Lehmann type 1-Gumbel distribution, denoted here by L1-Gu. When $\beta\to0$, the distribution converges to the distribution with cdf $1-[1-G(x)]^{\alpha}$, that is, the Lehmann type 2-Gumbel distribution, denoted here by L2-Gu. We note that the L2-Gu distribution was also introduced by Nadarajah (2006). Further, the TSGG distribution is in fact a mixture of the L1-Gu distribution truncated above at $x_\beta=\mu-\sigma\log(-\log\beta)$ and the L2-Gu distribution truncated below at the same point, with mixing parameter $\beta$, that is,
$$f(x) = \beta\, f_{L1}(x \mid -\infty, x_\beta) + (1-\beta)\, f_{L2}(x \mid x_\beta, \infty)$$
where $f_{L1}(x \mid a, b)$ denotes the pdf of the doubly truncated L1-Gu distribution with truncation points a and b, and similarly for $f_{L2}$. Hence the TSGG distribution has properties of both the L1-Gu distribution and the L2-Gu distribution.
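Because each branch of the cdf is a power transform of the Gumbel cdf, the cdf can be inverted branch by branch, which gives a simple inverse-transform sampler. The sketch below assumes the cdf $\beta[G/\beta]^{\alpha}$ for $G\le\beta$ and $1-(1-\beta)[(1-G)/(1-\beta)]^{\alpha}$ otherwise, with $G$ the Gumbel cdf. The lower branch is taken with probability $\beta$, which matches the mixing weight of the truncated Lehmann type 1 component. The names `tsgg_quantile` and `tsgg_sample` are hypothetical.

```python
import math
import random

def tsgg_cdf(x, alpha, beta, mu, sigma):
    """Assumed TSGG cdf built on the Gumbel cdf G."""
    G = math.exp(-math.exp(-(x - mu) / sigma))
    if G <= beta:
        return beta * (G / beta) ** alpha
    return 1.0 - (1.0 - beta) * ((1.0 - G) / (1.0 - beta)) ** alpha

def tsgg_quantile(p, alpha, beta, mu, sigma):
    """Invert the piecewise cdf for 0 < p < 1."""
    if p <= beta:
        # Lower branch: beta * (G / beta)**alpha = p, solved for log G.
        log_G = math.log(beta) + (math.log(p) - math.log(beta)) / alpha
    else:
        # Upper branch: 1 - (1 - beta) * ((1 - G) / (1 - beta))**alpha = p.
        t = (1.0 - beta) * ((1.0 - p) / (1.0 - beta)) ** (1.0 / alpha)
        log_G = math.log1p(-t)          # log G = log(1 - t)
    # Gumbel quantile: x = mu - sigma * log(-log G).
    return mu - sigma * math.log(-log_G)

def tsgg_sample(n, alpha, beta, mu, sigma, seed=None):
    """Draw n values by inverse transform; p = 0 is skipped to avoid log(0)."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        p = rng.random()
        if p > 0.0:
            out.append(tsgg_quantile(p, alpha, beta, mu, sigma))
    return out
```

In a large sample, the share of draws at or below the reflection point should be close to $\beta$, which is an easy empirical check of the mixture weight.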
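The first and second derivatives of $\log f(x)$ for the TSGG distribution on the support $x<x_\beta$ are
$$\frac{d\log f(x)}{dx} = \frac{\alpha e^{-z}-1}{\sigma}, \qquad \frac{d^{2}\log f(x)}{dx^{2}} = -\frac{\alpha e^{-z}}{\sigma^{2}}$$
respectively, where $z=(x-\mu)/\sigma$.
These derivatives show that $f(x)$ has a mode at $x=\mu+\sigma\log\alpha$ on the support $x<x_\beta$, provided that this point lies below the reflection point $x_\beta=\mu-\sigma\log(-\log\beta)$. On the other support, $x>x_\beta$, it may also have a mode at the solution of $d\log f(x)/dx=0$. As a result, the TSGG distribution can be bimodal. Further, $f(x)\to0$ as $x\to\pm\infty$.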
Plots of the pdf (6) for some parameter values are given in Figure 1.
3. Hazard Rate Function
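The hazard rate function is defined by
$$h(x) = \frac{f(x)}{1-F(x)}$$
and it is an important quantity in lifetime modeling. For the TSGG distribution, the hazard rate function is given by
$$h(x) = \begin{cases} \dfrac{\alpha\,\beta^{1-\alpha}\, e^{-z}\exp\left(-\alpha e^{-z}\right)}{\sigma\left[1-\beta^{1-\alpha}\exp\left(-\alpha e^{-z}\right)\right]}, & x \le x_\beta \\[3ex] \dfrac{\alpha\, e^{-z}\exp\left(-e^{-z}\right)}{\sigma\left[1-\exp\left(-e^{-z}\right)\right]}, & x > x_\beta \end{cases}$$
where $z=(x-\mu)/\sigma$ and $x_\beta=\mu-\sigma\log(-\log\beta)$. We plot the hazard rate function of the TSGG distribution in Figure 2. From Figure 2 we see that the TSGG distribution can have an increasing hazard rate, like the ordinary Gu distribution. In contrast to the Gu distribution, the hazard rate function of the TSGG distribution can also be first unimodal and then increasing for some selected values of $\alpha$ and $\beta$. This property makes the TSGG distribution more flexible than the ordinary Gu distribution. Moreover, $h(x)\to0$ as $x\to-\infty$ and $h(x)\to\alpha/\sigma$ as $x\to\infty$.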
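A minimal sketch of the hazard rate follows, assuming the TSGG pdf and cdf are the two-sided power transform of the Gumbel cdf $G$. Above the reflection point the ratio $f/(1-F)$ simplifies to $\alpha\,g/(1-G)$, that is, $\alpha$ times the Gumbel hazard, so the code uses that form directly. The function name is illustrative.

```python
import math

def tsgg_hazard(x, alpha, beta, mu, sigma):
    """Hazard h(x) = f(x) / (1 - F(x)) under the assumed TSGG form."""
    z = (x - mu) / sigma
    u = math.exp(-z)
    G = math.exp(-u)                    # Gumbel cdf
    g = u * G / sigma                   # Gumbel pdf
    if G <= beta:                       # lower branch, x <= x_beta
        pdf = alpha * g * (G / beta) ** (alpha - 1.0)
        survival = 1.0 - beta * (G / beta) ** alpha
        return pdf / survival
    # Upper branch: the (1 - beta) factors cancel, leaving alpha * g / (1 - G).
    one_minus_G = -math.expm1(-u)       # accurate 1 - G when G is close to 1
    return alpha * g / one_minus_G
```

Using `math.expm1` for $1-G$ avoids loss of precision in the right tail, where the hazard levels off near $\alpha/\sigma$.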
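4. Moments
Using the definition of moments and setting $u=\exp[-(x-\mu)/\sigma]$, we write the rth moment of the TSGG distribution as
$$E(X^{r}) = \alpha\beta^{1-\alpha}\int_{-\log\beta}^{\infty}(\mu-\sigma\log u)^{r}e^{-\alpha u}\,du + \alpha(1-\beta)^{1-\alpha}\int_{0}^{-\log\beta}(\mu-\sigma\log u)^{r}e^{-u}\left(1-e^{-u}\right)^{\alpha-1}du.$$
Using the binomial expansion for $(\mu-\sigma\log u)^{r}$, the rth moment can be obtained as
$$E(X^{r}) = \alpha\sum_{k=0}^{r}\binom{r}{k}\mu^{r-k}(-\sigma)^{k}\left[\beta^{1-\alpha}I_{1}(k)+(1-\beta)^{1-\alpha}I_{2}(k)\right]$$
where $I_{1}(k)$ and $I_{2}(k)$ denote the integrals
$$I_{1}(k)=\int_{-\log\beta}^{\infty}(\log u)^{k}e^{-\alpha u}\,du \quad\text{and}\quad I_{2}(k)=\int_{0}^{-\log\beta}(\log u)^{k}e^{-u}\left(1-e^{-u}\right)^{\alpha-1}du$$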
respectively. In particular, for r = 1, using equations (22.214.171.124) and (126.96.36.199) in Prudnikov et al. (1986), we obtain the following cases for calculating the expected value
where denotes the exponential integral and is (Prudnikov et al., Eq. 188.8.131.52, 1986). Thus,
We plot the skewness and kurtosis measures in Figure 3 where .
We can observe empirically that the TSGG distribution can be left skewed, symmetric or right skewed. So it is much more flexible than the Gu distribution.
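The skewness and kurtosis can also be explored numerically. The sketch below integrates $x^{r}f(x)$ with composite Simpson's rule, splitting the range at the reflection point where the density has a kink. It assumes the TSGG pdf is the two-sided power transform of the Gumbel pdf, uses the standardized third and fourth central moments as the skewness and kurtosis measures, and truncates the integral at $z=-8$ and $z=60$ in units of $(x-\mu)/\sigma$. These bounds are an assumption that suits moderate $\alpha$ and would need widening for very small $\alpha$.

```python
import math

def tsgg_pdf(x, alpha, beta, mu, sigma):
    """Assumed TSGG pdf, evaluated on the log scale to avoid underflow issues."""
    z = (x - mu) / sigma
    u = math.exp(-z)                               # -log G(x)
    log_g = -math.log(sigma) - z - u               # log of the Gumbel pdf
    if u >= -math.log(beta):                       # G(x) <= beta
        log_w = (alpha - 1.0) * (-u - math.log(beta))
    else:
        log_w = (alpha - 1.0) * (math.log(-math.expm1(-u)) - math.log1p(-beta))
    return alpha * math.exp(log_g + log_w)

def simpson(f, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with an even number n of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def tsgg_moments(alpha, beta, mu, sigma, z_lo=-8.0, z_hi=60.0):
    """Return (mean, variance, skewness, kurtosis) by numerical integration."""
    xb = mu - sigma * math.log(-math.log(beta))    # reflection point
    lo, hi = mu + sigma * z_lo, mu + sigma * z_hi

    def raw(r):
        f = lambda x: x ** r * tsgg_pdf(x, alpha, beta, mu, sigma)
        return simpson(f, lo, xb) + simpson(f, xb, hi)

    m1, m2, m3, m4 = (raw(r) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return m1, var, mu3 / var ** 1.5, mu4 / var ** 2
```

For $\alpha=1$ the output should agree with the known Gumbel values (mean $\mu+\gamma\sigma$ with Euler's constant $\gamma$, variance $\pi^{2}\sigma^{2}/6$, kurtosis 5.4), which is a convenient check before varying $\alpha$ and $\beta$.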
5. Maximum Likelihood Estimation
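Let $X_{1},\dots,X_{n}$ be a random sample of size n from the TSGG distribution and let $X_{(1)}\le X_{(2)}\le\dots\le X_{(n)}$ denote the corresponding order statistics. Then the log-likelihood function is given by
$$\ell = n\log\alpha + \sum_{i=1}^{n}\log g(x_{(i)};\mu,\sigma) + (\alpha-1)\log M(r;\mu,\sigma) \qquad (20)$$
where
$$M(r;\mu,\sigma) = \prod_{i=1}^{r-1}\frac{G(x_{(i)};\mu,\sigma)}{G(x_{(r)};\mu,\sigma)}\prod_{i=r+1}^{n}\frac{1-G(x_{(i)};\mu,\sigma)}{1-G(x_{(r)};\mu,\sigma)}$$
for r = 1,2,…,n and the reflection point is taken at $x_{(r)}$, so that $\beta=G(x_{(r)};\mu,\sigma)$.
From Van Dorp and Kotz (2002) and Korkmaz and Genç (2015), the estimate of the reflection point is one of the order statistics. Accordingly, we have the maximum likelihood estimates (MLEs) of $\beta$ and $\alpha$ as
$$\hat\beta = G(x_{(\hat r)};\mu,\sigma), \qquad \hat r = \arg\max_{r\in\{1,\dots,n\}} M(r;\mu,\sigma)$$
and, by equating to zero the first derivative of (20) with respect to $\alpha$,
$$\hat\alpha = -\frac{n}{\log M(\hat r;\mu,\sigma)}.$$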
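For $\mu$ and $\sigma$, the associated likelihood estimating equations are obtained by equating to zero the first derivatives of the log-likelihood with respect to $\mu$ (25) and with respect to $\sigma$ (26). These equations have no closed-form solution, so we need an iterative procedure to find the estimates of the $\mu$ and $\sigma$ parameters. We may describe this procedure with an algorithm: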
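Step 1: Set $k=0$ and choose initial values $\mu^{(0)}$ and $\sigma^{(0)}$ for $\mu$ and $\sigma$ in the log-likelihood.
Step 2: Compute the corresponding estimates of the reflection point, $\beta$ and $\alpha$ at the current values $\mu^{(k)}$ and $\sigma^{(k)}$.
Step 3: Update $\mu$ and $\sigma$ by using (25) and (26) to find $\mu^{(k+1)}$ and $\sigma^{(k+1)}$.
Step 4: If the change between two successive iterations is less than a given tolerance, then stop. Otherwise, set $k=k+1$ and go to Step 2.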
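The iteration can be sketched in code under some loudly stated assumptions. For fixed $(\mu,\sigma)$, `profile_step` searches the order statistics for the reflection point by maximizing the product $M(r)$ of the ratios $G(x_{(i)})/G(x_{(r)})$ below and $[1-G(x_{(i)})]/[1-G(x_{(r)})]$ above, then sets $\hat\alpha=-n/\log M(\hat r)$, following the two-sided power results of Van Dorp and Kotz (2002). The update of $\mu$ and $\sigma$ does not solve the paper's estimating equations (25) and (26); a simple compass search on the profile log-likelihood is swapped in as a stand-in, and it stops when its step size falls below the tolerance. All function names are hypothetical.

```python
import math

def profile_step(xs_sorted, mu, sigma):
    """For fixed (mu, sigma) return (alpha_hat, beta_hat, log-likelihood)."""
    n = len(xs_sorted)
    z = [(x - mu) / sigma for x in xs_sorted]
    u = [math.exp(-zi) for zi in z]
    log_G = [-ui for ui in u]                          # log G(x_i)
    log_S = [math.log(-math.expm1(-ui)) for ui in u]   # log(1 - G(x_i))
    below = 0.0                  # sum of log G over indices < r
    above = sum(log_S)           # becomes the sum of log(1 - G) over indices > r
    best_r, best_log_M = 0, -math.inf
    for r in range(n):
        above -= log_S[r]
        log_M = below - r * log_G[r] + above - (n - 1 - r) * log_S[r]
        if log_M > best_log_M:
            best_r, best_log_M = r, log_M
        below += log_G[r]
    alpha = -n / best_log_M
    beta = math.exp(log_G[best_r])
    sum_log_g = sum(-math.log(sigma) - zi - ui for zi, ui in zip(z, u))
    loglik = n * math.log(alpha) + sum_log_g + (alpha - 1.0) * best_log_M
    return alpha, beta, loglik

def fit_tsgg(xs, mu0, sigma0, tol=1e-8, max_iter=500):
    """Compass search over (mu, log sigma); a stand-in for equations (25)-(26)."""
    xs_sorted = sorted(xs)

    def objective(mu, log_sigma):
        try:
            return profile_step(xs_sorted, mu, math.exp(log_sigma))[2]
        except (OverflowError, ValueError, ZeroDivisionError):
            return -math.inf     # treat numerically invalid points as infeasible

    mu, ls = mu0, math.log(sigma0)
    best = objective(mu, ls)
    step = 0.5
    for _ in range(max_iter):
        if step < tol:
            break
        s = math.exp(ls)
        moves = [(mu + step * s, ls), (mu - step * s, ls),
                 (mu, ls + step), (mu, ls - step)]
        value, cand = max((objective(m, l), (m, l)) for m, l in moves)
        if value > best:
            best, (mu, ls) = value, cand
        else:
            step *= 0.5          # no improving neighbour: refine the step
    alpha, beta, loglik = profile_step(xs_sorted, mu, math.exp(ls))
    return {"alpha": alpha, "beta": beta, "mu": mu,
            "sigma": math.exp(ls), "loglik": loglik}
```

A call such as `fit_tsgg(data, mu0, sigma0)` needs starting values for $\mu$ and $\sigma$; the sample mean and standard deviation are a natural but unverified choice. The search only finds a local maximum, so several starting points may be worth trying.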
We note that the usual regularity conditions required for the asymptotic normality of the MLEs are not satisfied for the TSGG distribution, since the support of the pdf of the TSGG depends on the parameters and the pdf is not differentiable at the reflection point. In addition, the estimator of the reflection point is based on the order statistics. So, the observed information matrix, which is used to obtain the asymptotic variances of the MLEs, can be found numerically via optimization procedures in software packages such as R, Maple and Matlab.
6. Data Analysis
In this section, we give a real data application. The MLEs of all parameters are computed by using the optim function in R with the L-BFGS-B method. This function also gives the numerically differentiated observed information matrix. The data are from the New York State Department of Conservation and correspond to the daily ozone level measurements in New York in May-September, 1973. Recently, Nadarajah (2008), Leiva et al. (2010), Cordeiro et al. (2013) and Korkmaz and Genç (2014) analyzed these data. To see the performance of the TSGG distribution, we fit it to this data set. After fitting the TSGG distribution to this data set, we find the following MLE results:
and where standard errors are given in parentheses.
We also give the value of the Kolmogorov-Smirnov goodness-of-fit test statistic as 0.0175, with a p-value of 0.5995. Hence, we do not reject the null hypothesis that the data set comes from the TSGG distribution. We give the fitted TSGG density and empirical cdf plots in Figure 4. These conclusions are also supported by Figure 4. Therefore, we show that the TSGG distribution has real data modeling potential.
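The Kolmogorov-Smirnov statistic for a fitted TSGG model can be computed with a short sketch. It assumes the TSGG cdf is the two-sided power transform of the Gumbel cdf and uses the classical asymptotic Kolmogorov series for the p-value. That series treats the parameters as known, so with estimated parameters the p-value is only approximate. The ozone data are not reproduced here, and the function names are illustrative.

```python
import math

def tsgg_cdf(x, alpha, beta, mu, sigma):
    """Assumed TSGG cdf built on the Gumbel cdf G."""
    G = math.exp(-math.exp(-(x - mu) / sigma))
    if G <= beta:
        return beta * (G / beta) ** alpha
    return 1.0 - (1.0 - beta) * ((1.0 - G) / (1.0 - beta)) ** alpha

def ks_statistic(data, cdf):
    """Largest gap between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        F = cdf(x)
        # Compare F with the empirical cdf just after and just before x.
        d = max(d, (i + 1) / n - F, F - i / n)
    return d

def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic p-value 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 n d^2)."""
    lam = math.sqrt(n) * d
    if lam < 0.1:
        return 1.0               # the series converges slowly; the limit is 1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))
```

Given fitted values, the statistic would be obtained as `ks_statistic(data, lambda x: tsgg_cdf(x, alpha, beta, mu, sigma))`, and the p-value from `kolmogorov_pvalue(d, len(data))`.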