Assessing the Impact of Square Root Transformation on Weibull-Distributed Error Component of a Multiplicative Error Model

This paper examines whether the assumed fundamental structure of the error component (unit mean and constant variance) is maintained after the square root transformation of a Weibull-distributed error component of a multiplicative error model, and whether the variances of the transformed and untransformed distributions are equal. The error component of a Multiplicative Error Model (MEM) can follow a Weibull distribution, W(σ, n), where σ and n are the shape and scale parameters respectively, and data transformation is a popular remedial measure for stabilizing the variance of a data set prior to statistical modeling. This paper therefore investigates the impact of the square root transformation on the mean and variance of a Weibull-distributed error component of a MEM. The mean and variance of W(σ, n) and of the square-root-transformed distribution are calculated for σ = 6, 7, ..., 99, 100, with the corresponding values of n for which the mean of the untransformed distribution equals one. The MEM(2,0) fitted under the square root transformation gave a better fit than the MEM(2,0) fitted to the original data. The paper concludes that the square root transformation yields better results, in that it reveals constancy in variance, when a MEM with a Weibull-distributed error component is used and data transformation is deemed necessary to stabilize the variance of the data set.


Rationale
Multiplicative Error Models (MEMs) were originally introduced as Autoregressive Conditional Duration (ACD) models by Engle and Russell [8] and were generalized to any non-negative time-varying process by Engle [7]. MEMs provide an observation-driven approach for dynamic non-negative variables; see Ramachandran and Tsokos [16].
Irrespective of the specification of the function $\mu(\cdot)$, the assumption according to [8] is that the time dependence in the durations can be subsumed in their conditional expectations (4), in such a way that $x_t/\psi_t$ is independent and identically distributed.
$\psi_{t-1}$ denotes the information set available at time $t-1$. The realization of (3) supports distributions such as the Weibull, Gamma and Log-Normal [1,3,12]. In this case, the $\varepsilon_t$ are independently and identically distributed as Weibull$(1, \alpha)$, and the $\psi_t$ are proportional to the conditional expectation of $x_t$. The parameter restrictions end with $\alpha_1 + \beta_1 < 1$. The conditional lag in time is guaranteed by the first three conditions, while the last condition ensures non-negativity.
Property (4) provides a link to (6), in which $\Gamma(\cdot)$ is the gamma function. If $\alpha = 1$, the Weibull distribution reduces to the exponential distribution.
The realization of (3) also supports the left-truncated normal distribution, whose effect on the error component of the multiplicative time series model has been studied under the logarithmic, inverse, square, inverse square and square root transformations [6,9,10,13,14].
A random variable $X$ has a Weibull distribution, $X \sim \text{Weibull}(\alpha, n)$, with shape parameter $\alpha > 0$ and scale parameter $n > 0$. The probability density function of a standardized Weibull random variable $X$ is given by Tsay [18] in terms of the usual gamma function $\Gamma$. The Weibull distribution is used in reliability and survival analysis to model the lifetimes of objects and organisms and service times.
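The standardized (unit-mean) Weibull density can be checked numerically. The sketch below is only an illustration: it assumes the common unit-mean form $f(x) = \alpha c^{\alpha} x^{\alpha-1} \exp[-(cx)^{\alpha}]$ with $c = \Gamma(1 + 1/\alpha)$, which may differ in notation from the formula in [18]. The function names and the shape value are hypothetical.

```python
import math


def standardized_weibull_pdf(x, alpha):
    """Unit-mean Weibull density with shape alpha (assumed parametrization)."""
    if x <= 0.0:
        return 0.0
    c = math.gamma(1.0 + 1.0 / alpha)  # constant that rescales the mean to 1
    return alpha * c**alpha * x ** (alpha - 1.0) * math.exp(-((c * x) ** alpha))


def raw_moment(k, alpha, upper=5.0, steps=100_000):
    """k-th raw moment by the midpoint rule on [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**k * standardized_weibull_pdf(x, alpha)
    return total * h


alpha = 6.0  # hypothetical shape value, chosen only for illustration
mean = raw_moment(1, alpha)
variance = raw_moment(2, alpha) - mean**2
# Closed form for the variance of a unit-mean Weibull variable
closed_form_variance = (
    math.gamma(1.0 + 2.0 / alpha) / math.gamma(1.0 + 1.0 / alpha) ** 2 - 1.0
)
print(f"mean = {mean:.6f}")
print(f"variance = {variance:.6f} (closed form {closed_form_variance:.6f})")
```

Under this parametrization the variance depends only on the shape parameter, which is why fixing the mean at one leaves a single free parameter.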

Research Problem
When distributions are non-normal (e.g., highly skewed, multimodal, or heavy-tailed), the ability to identify a viable probability distribution using the normal-theory approach is reduced. Under normal-theory statistics, probability distributions are symmetric and roughly bell-shaped. When distributions depart from normality in the ways listed above, the estimation of means and variances is affected, leading to erroneous statistical results.

Research Importance
Data transformations are useful in many aspects of statistical work, often for stabilizing the variance of the data. Non-constant variance is quite common in time series data; in financial time series analysis, for example, the problem is often to model non-negative-valued processes. This occurs when considering variables such as volumes, trade durations, realized volatility and daily price range.
To enhance approaches to forecasting, researchers most frequently transform data to make its distribution "normal" and thus fulfill one of the assumptions of a parametric comparison of means. Other reasons for data transformation include more informative graphs of the data, better outlier identification and increased sensitivity of statistical tests. Succinctly put, data transformation is a mathematical operation that changes the measurement scale of a variable. According to Chatfield [5], if there is a trend in the series and the variance appears to increase with the mean, it may be advisable to transform the data; in particular, if the standard deviation is directly proportional to the mean, a logarithmic transformation is appropriate. He outlines the following reasons for transformation: (i) to stabilize the variance, (ii) to make the seasonal effect additive and (iii) to normalize the data.
A successful transformation is achieved when the desirable properties of a data set remain unchanged after transformation. These basic properties or assumptions form the object of interest for this study and include (i) unit mean and (ii) constant variance. Iwueze pioneered work on the implications of logarithmic transformation for the error component of the multiplicative model [10]. That work prompted further interest in this area of time series analysis, adding several studies to the literature [6,9,13,14].

Motivation
The purpose of this study is to determine whether the assumed fundamental structure of the error component (unit mean and constant variance) is maintained after the power transformation, and to investigate whether the variances of the transformed and untransformed error components (i.e., $\sigma_1^2$ and $\sigma_2^2$) are equal. To examine this, the Weibull distribution, a non-normal distribution whose distributional characteristics fit $N(1, \sigma^2)$, is studied, considering its flexibility and its asymptotic properties relative to multiplicative error modeling.

Research Objectives
To investigate the effect of the square root transformation of Weibull data on the error component of a multiplicative error model, the following were examined: the pdf of the transformed distribution, the k-th uncorrected moments of the transformed distribution, expressions for its mean and variance, the relative change in mean and variance between the transformed and untransformed distributions, and tests of model fit. The procedure adopts the following steps as its aims and objectives: (i) obtain the pdf of the transformed distribution; (ii) obtain the k-th uncorrected moments of the transformed distribution; (iii) obtain expressions for the mean and variance of the transformed distribution; (iv) determine the relative change in mean and variance between the transformed and untransformed distributions.

The p-th Power Transformed Weibull-Distributed Random Variable
The power transformation, a form of transformation frequently used in statistical analysis, is defined in [15], where $|J|$ is the absolute value of the Jacobian of the p-th power transformation. The pdf of $Y_t$, denoted $f(y_t)$, is then obtained from the pdf of the untransformed variable and $|J|$.
Now suppose the error component $e_t$ of a Multiplicative Error Model (MEM) is assumed to follow a Weibull distribution, with probability density function (pdf) denoted $f(e_t)$. Using (12) and (13), we obtain the pdf of the p-th power transformed Weibull distribution. To establish the required property of this pdf, we proceed as follows. In (17), a change of variable gives
$y_t = n^{p} z^{p/\sigma}$ and $dy_t = \frac{p\,n^{p}}{\sigma}\, z^{p/\sigma - 1}\, dz$ (19)
Substituting (18) and (19) into (17) gives the required result. The pdf of the transformed Weibull variable under the square root transformation is thus the case $p = 1/2$ of this result.
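The derivation in this section can be sketched as follows. This is a reconstruction under an assumption, not the authors' exact equations: the untransformed density is taken in the standard form with shape $\sigma$ and scale $n$, which is consistent with the change of variable labelled (19) in the text.

```latex
% Assumed untransformed Weibull density (shape \sigma, scale n)
\[
f(e_t) = \frac{\sigma}{n}\left(\frac{e_t}{n}\right)^{\sigma-1}
         \exp\!\left[-\left(\frac{e_t}{n}\right)^{\sigma}\right], \qquad e_t > 0 .
\]
% Power transformation Y_t = e_t^{p}, p > 0, so e_t = y_t^{1/p}
\[
|J| = \left|\frac{d e_t}{d y_t}\right| = \frac{1}{p}\, y_t^{1/p - 1},
\qquad
g(y_t) = f\!\left(y_t^{1/p}\right)|J|
       = \frac{\sigma}{p\, n^{\sigma}}\; y_t^{\sigma/p - 1}
         \exp\!\left[-\frac{y_t^{\sigma/p}}{n^{\sigma}}\right], \qquad y_t > 0 .
\]
% Check that g integrates to one, using z = y_t^{\sigma/p} / n^{\sigma}
\[
y_t = n^{p} z^{p/\sigma}, \qquad
dy_t = \frac{p\, n^{p}}{\sigma}\, z^{p/\sigma - 1}\, dz, \qquad
\int_0^{\infty} g(y_t)\, dy_t = \int_0^{\infty} e^{-z}\, dz = 1 .
\]
% Square root transformation, p = 1/2
\[
g(y_t) = \frac{2\sigma}{n^{\sigma}}\; y_t^{2\sigma - 1}
         \exp\!\left[-\frac{y_t^{2\sigma}}{n^{\sigma}}\right], \qquad y_t > 0 .
\]
```

Under this assumption the square-root-transformed variable is again Weibull, with shape $2\sigma$ and scale $n^{1/2}$.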

The k-th Uncorrected Moment of $Y_t$, $E(Y_t^k)$
The mean and higher-order raw moments can be used to describe the distribution of any random variable fairly well. Even the celebrated Central Limit Theorem, which forms the basis for inferential statistics, relies on moments, to mention just one example of their importance in probability and statistics. The moments of the transformed Weibull distribution were derived by inserting the substitutions in (15) and the corresponding results in (16) into (17), from which the expressions for the mean and variance under the square root transformation follow.
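The calculation described in the abstract (σ = 6, ..., 100 with n chosen so that the untransformed mean is one) can be sketched numerically. The code below assumes the standard Weibull raw-moment formula $E(X^r) = n^r\,\Gamma(1 + r/\sigma)$ for shape σ and scale n, so that $E(Y^k) = E(X^{k/2})$ for $Y = \sqrt{X}$. The function names are hypothetical, and the output is not a reproduction of the paper's tables.

```python
import math


def weibull_raw_moment(r, shape, scale):
    """E[X^r] for X ~ Weibull(shape, scale); r may be fractional."""
    return scale**r * math.gamma(1.0 + r / shape)


def summarize(shape):
    """Mean and variance before and after the square root transformation."""
    scale = 1.0 / math.gamma(1.0 + 1.0 / shape)  # n such that E[X] = 1
    mean_x = weibull_raw_moment(1.0, shape, scale)
    var_x = weibull_raw_moment(2.0, shape, scale) - mean_x**2
    # Y = sqrt(X), so E[Y] = E[X^(1/2)] and E[Y^2] = E[X]
    mean_y = weibull_raw_moment(0.5, shape, scale)
    var_y = weibull_raw_moment(1.0, shape, scale) - mean_y**2
    return {
        "shape": shape,
        "scale": scale,
        "mean_x": mean_x,
        "var_x": var_x,
        "mean_y": mean_y,
        "var_y": var_y,
        "relative_change_var": (var_x - var_y) / var_x,
    }


table = [summarize(shape) for shape in range(6, 101)]
for row in (table[0], table[-1]):
    print(
        f"sigma={row['shape']:3d}  n={row['scale']:.4f}  "
        f"mean_x={row['mean_x']:.4f}  var_x={row['var_x']:.6f}  "
        f"mean_y={row['mean_y']:.4f}  var_y={row['var_y']:.6f}"
    )
```

In this sketch the transformed mean stays slightly below one and the transformed variance is smaller than the untransformed variance for every shape value in the range.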

Numerical Applications
The data are the daily closing prices of the Egyptian stock index (EGX 30), obtained from DataStream. The data include only official trading days (Monday-Friday), excluding public holidays, and span 04/08/2004 to 13/04/2018 (3051 trading days).

Method of Data Analysis
In analyzing the data, the following procedure is applied: (i) plot the data; (ii) test for homogeneity of variance; (iii) assess and apply the appropriate data transformation; (iv) fit the appropriate MEM to the original data; (v) fit the appropriate MEM to the transformed data; (vi) carry out residual analysis.

Plot of the data
The plot of the data is given in Figure 1. It shows a trend, pronounced up-and-down movement and evidence of non-constant variance.

Testing for Homogeneity of Variance
Levene's and Bartlett's tests are employed to test for homogeneity of variance by grouping the data set into months of 25 trading days, leaving only the last group with 26 trading days. The results of the tests are given in Figure 2. The p-values for the two tests are both reported as 0.000, which suggests that the variance of the data set is not constant and that a variance-stabilizing transformation is needed.
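The grouping and Bartlett's statistic can be sketched with the standard library alone. This is a minimal illustration, not the MINITAB procedure used in the paper. The series is synthetic (a hypothetical stand-in for the index), the p-value uses the Wilson-Hilferty normal approximation to the chi-square distribution rather than an exact tail probability, and Levene's test is omitted.

```python
import math
import random
import statistics


def group_consecutive(series, size):
    """Split into consecutive groups of `size`; the remainder joins the last group."""
    k = len(series) // size
    groups = [list(series[i * size:(i + 1) * size]) for i in range(k)]
    groups[-1].extend(series[k * size:])
    return groups


def bartlett_statistic(groups):
    """Bartlett's test statistic and its degrees of freedom (k - 1)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    variances = [statistics.variance(g) for g in groups]
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1.0 + (
        sum(1.0 / (n - 1) for n in sizes) - 1.0 / (total - k)
    ) / (3.0 * (k - 1))
    return numerator / correction, k - 1


def chi2_sf_approx(x, df):
    """Wilson-Hilferty normal approximation to P(chi-square with df > x)."""
    x = max(x, 0.0)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * df))) / math.sqrt(
        2.0 / (9.0 * df)
    )
    return 1.0 - statistics.NormalDist().cdf(z)


# Synthetic stand-in for the index: the spread grows with the level
rng = random.Random(0)
series = [
    (1000.0 + 3.0 * t) * (1.0 + 0.01 * rng.gauss(0.0, 1.0)) for t in range(3051)
]

groups = group_consecutive(series, 25)
stat, df = bartlett_statistic(groups)
p_value = chi2_sf_approx(stat, df)
print(f"groups = {len(groups)}, last group size = {len(groups[-1])}")
print(f"Bartlett statistic = {stat:.2f}, df = {df}, approximate p-value = {p_value:.4f}")
```

With 3051 observations, groups of 25 give 121 full groups plus a final group of 26, which matches the grouping described in the text.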

Assessment of the Appropriate Data Transformation
[Figure 2: Test for Equal Variances for Egypt Stock Index]

Here we employ Bartlett's technique as applied by Akpanta and Iwueze [2]. Since the data set is daily, the technique involves partitioning the data into groups of fairly equal sizes. For this purpose, the data is partitioned into groups, and the slope coefficient ($\lambda$) of the regression of the natural logarithm of the group standard deviations on the natural logarithm of the group means determines the appropriate transformation to be adopted. The groups' means and standard deviations and their natural logarithms are given in Table 1, while the fitted line plot of the natural logarithm of the groups' standard deviations, $\log_e(\sigma_i)$, against the natural logarithm of the groups' means, $\log_e(\bar{X}_i)$, is given in Figure 3. At the 5% level of significance, $Z_{cal} = 0.7687 < Z_{tabulated} = 1.95$, so there is no evidence to reject the null hypothesis. Therefore $\lambda \cong 0.5$ is adopted, which, by Bartlett's technique as applied by Akpanta and Iwueze [2], affirms that the square root transformation is the most appropriate.
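The regression of log standard deviation on log mean can be sketched as below. The sketch relies on the standard variance-stabilizing argument: if the group standard deviation is proportional to the group mean raised to the power λ, then the power 1 − λ (the logarithm when λ = 1) approximately stabilizes the variance, so λ near 0.5 points to the square root. The data are synthetic and the function name is hypothetical. The z statistic is a simple Wald-type check against 0.5 and may not match the test used in [2].

```python
import math
import random
import statistics


def log_sd_on_log_mean(groups):
    """Least-squares slope of log(group sd) on log(group mean), with its standard error."""
    xs = [math.log(statistics.fmean(g)) for g in groups]
    ys = [math.log(statistics.stdev(g)) for g in groups]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(residual_ss / (len(xs) - 2) / sxx)
    return slope, se


# Synthetic groups whose standard deviation grows like the square root of the mean
rng = random.Random(1)
groups = []
for i in range(122):
    level = 50.0 * (i + 1)
    sd = 2.0 * math.sqrt(level)
    groups.append([rng.gauss(level, sd) for _ in range(25)])

slope, se = log_sd_on_log_mean(groups)
z = (slope - 0.5) / se  # Wald-type statistic for a slope of 0.5
print(f"slope = {slope:.4f}, standard error = {se:.4f}, z = {z:.4f}")
print(f"suggested power = {1.0 - slope:.2f}")
```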

Fitting the Appropriate MEM to the Original Data Set
The plot of the autocorrelation function (ACF) in Figure 4 shows a gradual exponential decay, and the plot of the partial autocorrelation function (PACF) in Figure 5 shows a cut-off, on the basis of which the MEM(2,0) is fitted to the original data.

Fitting the Appropriate MEM to the Transformed Data
The plot of the square-root-transformed series is given in Figure 6, while its ACF and PACF are given in Figures 7 and 8 respectively. Figure 6 shows an apparent reduction in the fluctuation of the transformed series compared with the original series. Figure 7 shows a gradual exponential decay and Figure 8 shows a cut-off at lag 2, suggesting that the MEM(2,0) in Equation 1 can once again be fitted, this time to the square-root-transformed data.
Repeating the analysis gives the maximum likelihood estimates of the coefficients of the MEM(2,0) model, with their standard errors in parentheses: 82.4136 (29.1944), 1.1615 (0.0179) and −0.1622 (0.0179). Each estimate exceeds twice its standard error in absolute value (the smallest ratio is 82.4136/29.1944 ≈ 2.82), which indicates that the parameters are statistically significant at the 5% level.
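How the error component is recovered from a fitted MEM(2,0) can be illustrated with a small simulation. The sketch assumes the form $x_t = \mu_t \varepsilon_t$ with $\mu_t = \omega + \alpha_1 x_{t-1} + \alpha_2 x_{t-2}$, which is an assumption here because the model equation is not reproduced in this text. The coefficient values, the starting value and the shape parameter are hypothetical and are not the estimates reported above.

```python
import math
import random
import statistics


def simulate_mem20(n_obs, omega, a1, a2, shape, x0, seed=0):
    """Simulate x_t = mu_t * eps_t with unit-mean Weibull errors (assumed model form)."""
    rng = random.Random(seed)
    scale = 1.0 / math.gamma(1.0 + 1.0 / shape)  # scale giving E[eps] = 1
    xs = [x0, x0]
    errors = []
    for _ in range(n_obs):
        mu = omega + a1 * xs[-1] + a2 * xs[-2]
        eps = rng.weibullvariate(scale, shape)  # arguments: (scale, shape)
        errors.append(eps)
        xs.append(mu * eps)
    return xs, errors


def mem20_residuals(xs, omega, a1, a2):
    """Error component: each observation divided by its conditional mean."""
    return [
        xs[t] / (omega + a1 * xs[t - 1] + a2 * xs[t - 2]) for t in range(2, len(xs))
    ]


# Hypothetical parameter values, chosen only so that the simulation is stable
omega, a1, a2 = 0.5, 0.6, 0.3
xs, errors = simulate_mem20(3000, omega, a1, a2, shape=100.0, x0=5.0)
residuals = mem20_residuals(xs, omega, a1, a2)
print(f"residual mean = {statistics.fmean(residuals):.4f}")
print(f"residual variance = {statistics.variance(residuals):.6f}")
```

Dividing each observation by its conditional mean returns the simulated errors, which is the sense in which the residual series is examined for unit mean and constant variance in the residual analysis.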

Residual Analysis
The mean and variance of the residual series for the untransformed data are 1.0004 and $2.8 \times 10^{-4}$ respectively, while the mean and variance of the residual series for the square-root-transformed data are 1.0001 and $7.1 \times 10^{-5}$ respectively (computed with MINITAB).

[Figure 8: Partial Autocorrelation Function for the Transformed Data (with 5% significance limits for the partial autocorrelations)]

The constant variance assumption for the error component of the model fitted to the untransformed data can be further verified by testing the null hypothesis $H_0$ that the error series is stationary against the alternative that it is not. The test was carried out with the Cox-Stuart test, the turning point test and the KPSS test, whose p-values are 0.4734, 0.7506 and 0.1000 respectively. Since each p-value exceeds 0.05, the null hypothesis is not rejected, which supports a constant variance; the overall variance was calculated as $2.8 \times 10^{-4}$. The unit mean assumption can likewise be verified with a one-sample t-test of the null hypothesis $H_0$ that the mean of the error series equals 1 against the alternative that it does not. The mean was calculated as 1.0004 and the t-test yields a p-value of 0.1923 (> 0.05), suggesting that the error series has unit mean (results obtained with MINITAB).
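Two of the checks named above can be sketched with the standard library: the turning point test (a test of randomness) and the one-sample t-test of a unit mean. The residual series below is synthetic, and both p-values use the normal approximation, which is reasonable for a series of this length but is not the exact calculation a statistics package would report. The Cox-Stuart and KPSS tests are not shown.

```python
import math
import random
import statistics


def turning_point_test(series):
    """Turning point count, z statistic and two-sided normal p-value."""
    n = len(series)
    count = sum(
        1
        for i in range(1, n - 1)
        if (series[i] - series[i - 1]) * (series[i + 1] - series[i]) < 0
    )
    expected = 2.0 * (n - 2) / 3.0
    variance = (16.0 * n - 29.0) / 90.0
    z = (count - expected) / math.sqrt(variance)
    p = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z)))
    return count, z, p


def one_sample_t(series, mu0=1.0):
    """t statistic for H0: mean = mu0, with a normal-approximation p-value."""
    n = len(series)
    t = (statistics.fmean(series) - mu0) / (statistics.stdev(series) / math.sqrt(n))
    p = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))
    return t, p


# Synthetic unit-mean Weibull residuals (hypothetical shape value)
rng = random.Random(2)
shape = 60.0
scale = 1.0 / math.gamma(1.0 + 1.0 / shape)
residual_series = [rng.weibullvariate(scale, shape) for _ in range(3049)]

tp_count, tp_z, tp_p = turning_point_test(residual_series)
t_stat, t_p = one_sample_t(residual_series, mu0=1.0)
print(f"turning points = {tp_count}, z = {tp_z:.3f}, p = {tp_p:.4f}")
print(f"t = {t_stat:.3f}, p = {t_p:.4f}")
```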
Recall that the variance is a non-negative quantity whose distribution has support on $[0, \infty)$. To investigate this, non-overlapping windows of the error series were constructed, giving 102 windows, of which 101 have length 30 and one has length 21. The variance was calculated within each window, yielding a sample of 102 variances. The histogram of the variances is shown in Figure 10(a). The p-value of the Kolmogorov-Smirnov goodness-of-fit test for the fitted Weibull distribution is 0.1699 (> 0.05), indicating a decent fit of the Weibull distribution to the sample of variances (see Figure 10(a)). Figure 9(b) shows the scatter plot of the error component of the MEM(2,0) fitted to the square-root-transformed data. There is no obvious pattern in the plot and the error term appears to be centred around 1 (red line), suggesting unit mean and constant variance for the error term.
The constant variance assumption for the error component of the model fitted to the square-root-transformed data can be verified in the same way, by testing the null hypothesis $H_0$ that the error series is stationary against the alternative that it is not. The p-values of the Cox-Stuart test, the turning point test and the KPSS test are 0.4125, 0.8172 and 0.1000 respectively. Since each p-value exceeds 0.05, the null hypothesis is not rejected, which supports a constant variance; the overall variance was calculated as $7.1 \times 10^{-5}$. The unit mean assumption was verified with a one-sample t-test of the null hypothesis $H_0$ that the mean of the error series equals 1 against the alternative that it does not. The mean was calculated as 1.0001 and the t-test yields a p-value of 0.2073 (> 0.05), suggesting that the error series has unit mean.
Again, recall that the variance is a non-negative quantity whose distribution has support on $[0, \infty)$. To investigate this, non-overlapping contiguous windows of the error series were constructed, giving 102 windows, of which 101 have length 30 and one has length 21. The variance was calculated within each window, yielding a sample of 102 variances. The histogram of the variances is shown in Figure 10(b). The Weibull distribution was fitted to the sample of 102 variances by maximum likelihood; the parameter estimates are 1.2337 and $7.6 \times 10^{-5}$. The p-value of the Kolmogorov-Smirnov goodness-of-fit test for the fitted Weibull distribution is 0.2123 (> 0.05), indicating a decent fit of the Weibull distribution to the sample of variances (see Figure 10(b)).
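The windowing step and a maximum likelihood fit of the Weibull distribution to the window variances can be sketched as follows. The residual series is synthetic, the function names are hypothetical, and the fit solves the usual Weibull likelihood equation for the shape by bisection. The Kolmogorov-Smirnov test is not shown, and the software used in the paper may estimate the parameters differently.

```python
import math
import random
import statistics


def window_variances(series, length):
    """Variances over contiguous non-overlapping windows, plus the window sizes."""
    windows = [series[i:i + length] for i in range(0, len(series), length)]
    sizes = [len(w) for w in windows]
    return [statistics.variance(w) for w in windows if len(w) >= 2], sizes


def weibull_mle(sample):
    """Maximum likelihood estimates (shape, scale) for a positive sample."""
    m = max(sample)
    xs = [x / m for x in sample]  # rescale to (0, 1]; the shape is scale-invariant
    logs = [math.log(x) for x in xs]
    mean_log = statistics.fmean(logs)

    def score(k):
        weights = [x**k for x in xs]
        weighted = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / k - mean_log

    lo, hi = 0.01, 100.0  # score is negative at lo and positive at hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = m * statistics.fmean(x**shape for x in xs) ** (1.0 / shape)
    return shape, scale


# Synthetic unit-mean Weibull residuals standing in for the error series
rng = random.Random(3)
error_shape = 100.0
error_scale = 1.0 / math.gamma(1.0 + 1.0 / error_shape)
error_series = [rng.weibullvariate(error_scale, error_shape) for _ in range(3051)]

variances, sizes = window_variances(error_series, 30)
shape_hat, scale_hat = weibull_mle(variances)
print(f"windows = {len(sizes)}, last window length = {sizes[-1]}")
print(f"Weibull fit to window variances: shape = {shape_hat:.4f}, scale = {scale_hat:.3e}")
```

With 3051 observations and a window length of 30, the split gives 101 full windows and a final window of 21 observations, as described in the text.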
To compare the fitted MEM under the two data structures, the mean error (ME), root mean square error (RMSE), mean absolute error (MAE), mean percentage error (MPE), mean absolute percentage error (MAPE) and Akaike information criterion (AIC) are used. The model with consistently smaller error measures and AIC is considered the better one. From Table 2, the MEM(2,0) fitted to the square-root-transformed data gave a better fit than the MEM(2,0) fitted to the original data. In conclusion, the square root transformation reduced the variance from $2.8 \times 10^{-4}$ to $7.1 \times 10^{-5}$ while the unit mean assumption remained intact, in line with the theory.
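The error measures named above can be computed as in the sketch below, using their usual definitions with the error taken as actual minus fitted. The sign and percentage conventions are assumptions, since the paper does not state them, and the numbers are illustrative only. AIC is not shown because it requires the fitted likelihood.

```python
import math
import statistics


def error_measures(actual, fitted):
    """ME, RMSE, MAE, MPE and MAPE, with error defined as actual minus fitted."""
    errors = [a - f for a, f in zip(actual, fitted)]
    return {
        "ME": statistics.fmean(errors),
        "RMSE": math.sqrt(statistics.fmean(e * e for e in errors)),
        "MAE": statistics.fmean(abs(e) for e in errors),
        "MPE": 100.0 * statistics.fmean(e / a for e, a in zip(errors, actual)),
        "MAPE": 100.0 * statistics.fmean(abs(e / a) for e, a in zip(errors, actual)),
    }


# Illustrative values only, not taken from Table 2
actual = [100.0, 200.0, 400.0]
fitted = [110.0, 190.0, 400.0]
measures = error_measures(actual, fitted)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```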

Conclusion
In this study, the probability density function of the square-root-transformed two-parameter Weibull distribution and its distributional properties were derived. The k-th uncorrected moments, including the second, third and fourth moments, as well as the mean and variance, were obtained. The findings established that decreasing the value of the scale parameter is meaningful and effective for the square root transformation. The study confirmed the consistency of the unit mean and constant variance assumptions in the transformed and untransformed data sets for a Weibull-distributed error component of the multiplicative error model. The MEM(2,0) fitted to the square-root-transformed data gave a better result than the MEM(2,0) fitted to the original data. The square root transformation reduced the variance from $2.8 \times 10^{-4}$ to $7.1 \times 10^{-5}$, while the unit mean was unaffected by the transformation.

Recommendation
The Weibull distribution has been shown to be very flexible in modeling various types of lifetime distributions with monotone failure rates, and data transformation has proven to be a necessary tool for stabilizing variance in statistical analysis. From this study, the following recommendations are made. The square root transformation is suitable for transforming a Weibull-distributed random variable in a multiplicative error model.
More research should be conducted to investigate other mixtures of the Weibull class of distributions, given its known flexibility and its applications in modeling lifetime events. More research should also be conducted on applying the results of this study in various fields, especially quality assurance, environmental and engineering applications.