NON GAUSSIAN LONGITUDINAL ANALYSIS OF PROGRESSION OF DIABETES MELLITUS PATIENTS USING FASTING BLOOD SUGAR LEVEL: A CASE OF DEBRE BERHAN REFERRAL HOSPITAL, ETHIOPIA

Abstract: Diabetes mellitus is a group of metabolic diseases characterized by hyperglycaemia resulting from defects in insulin secretion, insulin action, or both. It is a chronic disease with a high prevalence and is a growing concern worldwide. There are two types of diabetes, Type I and Type II. A retrospective longitudinal study was conducted from 1st September 2012 to 30th August 2015 in Debre Berhan referral hospital. The main objective of the study was a non-Gaussian longitudinal analysis of the progression of diabetes mellitus patients, using fasting blood sugar level counts following insulin and metformin treatment, and the identification of factors predicting the progression of the disease. A total of 248 diabetes mellitus patients were included in the study, of whom 111 (44.8%) were females and the remaining 137 (55.2%) were males. Generalized estimating equations were used to model the progression of the disease, and the exchangeable structure was selected as the most appropriate working correlation structure. The study showed that age, male sex, primary education, urban residence, and the interactions of primary education and age with time were statistically associated with the progression of fasting blood sugar level over time. Moreover, on average, fasting blood glucose decreased in a quadratic pattern over time after patients were initiated on the insulin and metformin program. Finally, the generalized estimating equation model exhibited the best fit for these data, with smaller disturbance in its estimated standard errors.

should be measured repeatedly per individual, giving what is called longitudinal data. Since the measurements are correlated within individuals, classical regression techniques cannot be used; instead, more flexible and powerful models were employed to handle such data [7]. The main aim of data analysis using the linear mixed model is to define an adequate error covariance structure in order to obtain efficient estimates of the regression parameters. Statistical software now includes the covariance structure as part of the statistical model, so the covariance matrix can be used to estimate the fixed effects of treatment and time by means of the generalized least squares method [9].
The general objective of this study was a non-Gaussian longitudinal analysis of the progression of diabetes mellitus patients, using fasting blood sugar level following insulin and metformin follow-up, and the identification of factors associated with fasting blood sugar level in Debre Berhan referral hospital, Ethiopia.

2. Data and Methodology:
2.1 Data:
The study included all diabetes mellitus patients, both Type I and Type II, who were placed under insulin and metformin follow-up in the case unit between 1st September 2012 and 30th August 2015 G.C. Fasting blood sugar level was categorized as under normal condition or below normal condition, and this categorization was used to assess whether fasting blood sugar level was well controlled over time in Debre Berhan referral hospital during the three-year period. The total number of patients included in this study was 248, of whom 111 (44.8%) were females and the remaining 137 (55.2%) were males.

Methodology:
The data come from a longitudinal observational study and are unbalanced, since some patients do not have data until the end of the study. Because the response variable in this case is categorized, and therefore non-Gaussian, the GEE method was proposed to tackle this problem.

Generalized Estimating Equations (GEE):
Generalized Estimating Equations were introduced by [10] as a method of dealing with correlated data. GEE is the most popular approach to fitting marginal models and is an extension of GLM to the analysis of longitudinal data. In this method, the correlation between measurements is modelled by assuming a working correlation matrix; specifying the working correlation matrix correctly improves the efficiency of the parameter estimates. Models fitted by GEE are marginal models that estimate only population-averaged regression coefficients. The estimators are consistent and asymptotically normal, and by relying on independence across subjects the variance of the regression coefficients is estimated consistently even when the assumed correlation structure is incorrect [12]. GEE analysis is generally valid only when the data are missing completely at random (MCAR); otherwise it can give a biased estimator of the regression parameters in the mean model.

The Marginal Mean Model:
Assume that N patients are measured repeatedly through time, and let $y_{ij}$ denote the response for the $i$th patient at the $j$th time, where $y_{ij}$ is a count response variable with non-negative integer values. The mean is related to the covariates by a log link function:

$$g(\mu_{ij}) = \log(\mu_{ij}) = X_{ij}^{T}\beta \qquad (1)$$

Where:
$\mu_{ij}$: the mean of $y_{ij}$, which is related to the covariates $X_{ij}$ by the link function
$X_{ij}$: a $p \times 1$ vector of covariates
$\beta$: a $p \times 1$ vector of unknown regression coefficients of $X_{ij}$
$g(\cdot)$: the log link function, as $y_{ij}$ is a count

Working Correlation Structures:
The GEE estimator of the regression parameters is most efficient if the working correlation matrix is correctly specified. Hence it is desirable to choose, from a set of candidate working structures, the one that is closest to the underlying structure. With GEE, the adjustment for within-subject dependence is carried out by assuming a priori a certain working correlation structure for the repeated measurements of the outcome variable Y.
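The marginal mean model in equation (1) can be illustrated with a minimal Python sketch. The function name `marginal_mean` and the coefficient values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def marginal_mean(x, beta):
    """Mean under a log link: mu = exp(x' beta), as in equation (1)."""
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    linear_predictor = sum(xi * bi for xi, bi in zip(x, beta))
    return math.exp(linear_predictor)


# Hypothetical coefficients: an intercept and one binary covariate.
beta = [0.5, 0.2]

mu_reference = marginal_mean([1.0, 0.0], beta)  # covariate = 0
mu_exposed = marginal_mean([1.0, 1.0], beta)    # covariate = 1

# Under a log link the covariate acts multiplicatively on the mean,
# so the ratio of the two means equals exp(0.2).
ratio = mu_exposed / mu_reference
```

Because the link is logarithmic, each coefficient is interpreted on the multiplicative scale: exponentiating a coefficient gives the ratio of means between two groups that differ only in that covariate.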
Before carrying out the GEE analysis, the within-subject correlation structure was chosen based on the results of exploring the correlation structure of the observed data. Accordingly, two proposed working correlation structures were compared.

I. Independent Structure:
This is the correlation structure that the GEE model assumes by default. With this structure, the correlations between subsequent measurements are assumed to be zero; that is, measurements within an individual are independent of each other.

II. Exchangeable Correlation Structure (Compound Symmetry):
This structure assumes that the correlations between subsequent measurements are the same, irrespective of the length of the time interval. Generally, assuming no missing data, the $J \times J$ covariance matrix for $y_i$ is modelled as:

$$V_i = \phi A_i^{1/2} R_i A_i^{1/2}$$

where $\phi$ is a GLM dispersion parameter, which is assumed to be 1 for count data, $A_i$ is a diagonal matrix of variance functions, and $R_i$ is the working correlation matrix of $y_i$.
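The two working correlation structures, and the working covariance built from a dispersion parameter, a diagonal matrix of variance functions and a working correlation matrix, can be sketched in plain Python. This is an illustrative sketch only: it assumes the Poisson variance function (variance equal to the mean) for the count response, and the means and the correlation value 0.5 are hypothetical rather than estimates from the study.

```python
import math


def independence(n_times):
    """Independence structure: identity matrix (zero within-subject correlation)."""
    return [[1.0 if j == k else 0.0 for k in range(n_times)]
            for j in range(n_times)]


def exchangeable(n_times, alpha):
    """Exchangeable structure: the same correlation alpha for every pair of times."""
    return [[1.0 if j == k else alpha for k in range(n_times)]
            for j in range(n_times)]


def working_covariance(mu, corr, phi=1.0):
    """V = phi * A^(1/2) R A^(1/2), with A = diag(mu) (assumed Poisson variance)."""
    root = [math.sqrt(m) for m in mu]
    n = len(mu)
    return [[phi * root[j] * corr[j][k] * root[k] for k in range(n)]
            for j in range(n)]


# Hypothetical means at three follow-up times for one patient.
mu = [4.0, 9.0, 16.0]

V_exch = working_covariance(mu, exchangeable(3, 0.5))
V_ind = working_covariance(mu, independence(3))
```

Under independence the working covariance is diagonal, whereas under the exchangeable structure every pair of measurements on the same patient shares the same correlation regardless of how far apart the follow-up times are.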

Method of Estimation and Statistical Inference:
Estimation is more difficult in the mixed model than in the general linear model, because in the mixed model the random effects and the covariance structure of the random error must be estimated in addition to the fixed effects. Both ML and REML were used for estimation of the parameters in this study. The ML method finds the parameter estimates that are most likely to occur given the data. Both methods are based on the likelihood principle, which gives estimators with the properties of consistency, asymptotic normality, and efficiency. The differences between the two are:
• REML handles strong correlations among the responses more effectively.
• The difference between ML and REML estimates increases as the number of fixed effects in the model increases.
For the GEE model, which is not based on a full likelihood, inference instead relies on Wald statistics constructed from the asymptotic normality of the estimators together with their estimated covariance matrix.

International Journal of Current Research and Modern Education (IJCRME)
Impact Factor: 6.925, ISSN (Online): 2455 -5428 (www.rdmodernresearch.com) Volume 4, Issue 1, 2019

3. Results and Discussions:
3.1 Results:
Fasting blood sugar levels were recorded for diabetes mellitus patients enrolled in the case unit between 1st September 2012 and 30th August 2015 G.C. in Debre Berhan referral hospital, a period of three years. The total number of patients included in this study was 248, of whom 111 (44.8%) were females and the remaining 137 (55.2%) were males. The aim of exploratory data analysis is to understand the data structure and determine the relevant modelling approaches suitable for the data.

Table 2: The mean fasting blood sugar level of patients increased at an increasing rate until time two, then decreased slowly until time five. The largest standard deviation, 96.85, occurred at time two. The number of measurements of fasting blood sugar level count rose and fell between follow-up times, indicating that the data have both intermittent and dropout missing observations; the number of missing values also increased over time.

Before any analysis, the assumptions about the data must be checked. For the fasting blood sugar level data, the Shapiro-Wilk test, box plots and q-q plots were used to check normality.
The Shapiro-Wilk test for normality is available when using the distribution platform to examine a continuous variable. The null hypothesis for this test is that the data are normally distributed. Count data can be well approximated by a normal distribution when the counts become large, so logarithmic and square root transformations were carried out to normalize the data and to identify which one best satisfies the normal approximation. The actual fasting blood sugar level counts were not normal at any time point, as the test showed significant deviation from normality. Likewise, the square root of fasting blood sugar level was not normal by the Shapiro-Wilk test. The log-transformed fasting blood sugar level, however, satisfied normality: the p-value of the test was greater than 0.05 at each time point except time five. The q-q plots of the overall data by follow-up time likewise indicated that the logarithmic transformation satisfies normality.
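The effect of the square root and logarithmic transformations on a right-skewed variable can be illustrated informally. The sketch below does not implement the Shapiro-Wilk test; as a simpler stand-in it compares the sample skewness of the raw, square-root and log-transformed values. The readings are hypothetical and are not data from this study.

```python
import math


def skewness(values):
    """Sample skewness: the third standardized central moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed readings, for illustration only.
readings = [80, 95, 100, 110, 120, 135, 150, 180, 240, 400]

skew_raw = skewness(readings)
skew_sqrt = skewness([math.sqrt(v) for v in readings])
skew_log = skewness([math.log(v) for v in readings])

# The log transformation pulls in the long right tail more strongly
# than the square root, so its skewness is the smallest of the three.
for label, value in [("raw", skew_raw), ("sqrt", skew_sqrt), ("log", skew_log)]:
    print(f"{label:>4}: skewness = {value:.3f}")
```

A skewness closer to zero suggests a more symmetric distribution, which is in line with the finding that the log scale came closest to normality; a formal test such as Shapiro-Wilk is still needed for inference.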

Table 4: In GEE parameter estimation, the empirical standard error estimates are robust estimates that do not depend on the correctness of the structure imposed on the working correlation matrix. The model-based standard error estimates are based directly on the assumed correlation structure, and they are better estimates if the assumed model for the correlation structure is correct [13]. The model-based standard errors are obtained using the modelse option on the repeated statement. The GEE analysis estimates the scale parameter along with the empirical standard error estimates, although the scale parameter is not required for the GEE analysis; these results are shown in Appendix B (Table A4 and Table A5). In Table 4.8, the exponentiated intercept, exp(0.7702) = 2.1601, is an estimate of the mean fasting blood sugar level at baseline (Time = 0), and the intercept is significantly different from zero (p = 0.0001).

Estimated Coefficients for GEE models for Fasting Data Taken in Debre Berhan
The estimate for male patients is exp(0.0175) = 1.0176 (p = 0.0551), which implies that the mean fasting blood sugar level for males is 1.0176 times that of the reference group (females). With p = 0.0551, this difference is borderline, lying just above the 5% level of significance. Moreover, at baseline the mean fasting blood sugar level among working patients was 0.9961 times the mean among ambulatory patients (reference group), a difference that is not significant (p = 0.7353). Relative to patients with tertiary education (reference group), the estimates for illiterate, primary and secondary patients were 1.0036 (p = 0.862), 1.0671 (p = 0.0293) and 1.0189 (p = 0.8315), respectively. Thus the effects for illiterate and secondary patients are not significant, whereas the effect for primary patients is significant at the 5% level of significance. The above discussion is based on the GEE model with the exchangeable working correlation, since this was the selected working correlation structure.
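The exponentiated estimates quoted above can be reproduced with a short Python sketch. Only the two log-scale coefficients stated in the text (0.7702 and 0.0175) are used; the dictionary keys and the helper `percent_change` are hypothetical names introduced for illustration.

```python
import math

# Log-scale GEE estimates as quoted in the text.
log_estimates = {"intercept": 0.7702, "male": 0.0175}

# Under a log link, exp(coefficient) is a multiplicative effect on the mean.
ratios = {name: math.exp(b) for name, b in log_estimates.items()}


def percent_change(ratio):
    """Convert a ratio of means into a percentage change in the mean."""
    return (ratio - 1.0) * 100.0


for name, ratio in ratios.items():
    print(f"{name}: exp(b) = {ratio:.4f}")

# The ratio reported for primary education, 1.0671, corresponds to a mean
# about 6.7% higher than in the reference group.
primary_change = percent_change(1.0671)
```

The computed values agree with the reported 2.1601 and 1.0176 up to rounding in the fourth decimal place.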

Discussions:
For the GEE model, the exchangeable correlation structure (compound symmetry) was selected as the appropriate working correlation structure based on the exploratory analysis, and the independence structure was included simply for the sake of comparison. The comparison of working correlation structures in this paper found that the exchangeable structure fits the fasting data better than independence and AR(1). This study also compared the LMM and GEE models using the ratio of their standard error estimates [11] and found that GEE fits better than LMM, with a small disturbance, given their marginal interpretations; this supports the findings of [11]. The term marginal means that, in the model specification, the expected value of the response variable, log fasting blood sugar level, depends only on covariates (fixed effects) and does not depend on subject-specific random effects or directly on previous responses of the subject. Since the purpose is to describe changes in the population mean rather than changes within subjects, the within-subject correlation is regarded as a nuisance characteristic in the GEE model [3].
The results of this study indicate that the education level of patients is a key determinant of fasting blood sugar level. In the generalized linear mixed model, the interactions of illiterate and primary educational level with time had p-values of 0.0504 and 0.021, respectively, so the primary-level interaction is significant at the 5% level and the illiterate-level interaction is borderline. Education exposes patients to information and empowers them to be more aware of their own health with respect to fasting blood sugar level. This supports the research findings of [8], who found that the risk of diabetes decreases with increasing education level. From the final model results of generalized linear mixed model; age,

Conclusions:
This study evaluated the relationship between the progression of diabetes, as measured longitudinally by fasting blood sugar level count, and its possible covariates using longitudinal analysis methodologies. In the linear mixed model, the main effects of age, male sex, urban residence and time, and the interactions of age, illiterate, primary, secondary and urban with time, were significantly associated with the progression of fasting blood sugar level count over time. In the generalized linear mixed model, the main effects of age, male sex and time, and the interactions of illiterate, primary, urban and age with time, were significantly associated with the progression of fasting blood sugar level over time.