Geostatistical Analysis of Infant Mortality Rate in Ethiopia

This paper presents a spatial statistical analysis of the infant mortality rate in Ethiopia. The analysis tests for significant spatial autocorrelation and fits a generalized linear mixed model with a spatial covariance structure. The results show that the distribution of infant mortality is significantly spatially autocorrelated. Geographic, economic and health variables are used to estimate the model. Several of the examined variables have a significant effect in the model, whereas others do not. The results highlight the role of improving education in reducing the risk of infant mortality. Males and children who are large at birth are more exposed, and the risk differs considerably from one zone to another.


Introduction
Infant mortality is the death of a child before completing the first year of life. The number of infant deaths per 1000 live births is called the infant mortality rate. This rate can be taken as an indicator of the health care and wellbeing of a society [3]. The most common causes of infant mortality are birth defects, preterm birth, low birth weight, maternal complications of pregnancy and injuries such as suffocation [3,6]. These causes are significantly associated with structural factors such as economic development, general living conditions, social wellbeing and the quality of the environment [4]. In its 2005 Human Development Report, the United Nations stated that child mortality is the most powerful indicator of divergence in human development [5].
Recent World Health Organization (WHO) reports show that 75% of under-five mortality occurs within the first year of life. The risk of infant mortality in African countries is 55 per 1000 live births, more than five times the rate in European countries, which is 10 per 1000 live births [1]. Over time, infant mortality rates have clearly declined: the estimated rate was 122 deaths per 1000 live births in 1960 and 32 deaths per 1000 live births in 2015 [1,2,7].
When the data being analyzed are collected across geographical units, it is important to test for spatial dependence. If spatial autocorrelation is detected, the locations of the units carry information about the variables. Ignoring this association can lead to biased estimators and false conclusions. Spatial statistical methods are therefore needed to improve the precision of the results [8].
Spatial statistical modelling is now widely used in medicine, biology, demography, environmental science and other fields because of the need to describe spatial variability in the data. The generalized linear mixed model is a useful tool for analyzing spatial data [9,10].
In this paper, the spatial autocorrelation of the infant mortality rate in Ethiopia is investigated. Moran's I and related tests are used to examine the significance of spatial autocorrelation of the infant mortality rate among the Ethiopian regions. Based on the results, a generalized linear mixed model with a spatial covariance structure is fitted. Fifteen independent variables covering geographic, demographic and social domains are used in the model. The analysis is carried out using ArcGIS, GeoDa and SAS.

Spatial Autocorrelation
Spatial autocorrelation analysis examines whether what happens at one location is related to what happens at neighboring locations, and measures the strength of that relationship from geographical data [11,12,13]. The data must be points or polygons. Measures of spatial autocorrelation are classified into two types: global and local. Global measures summarize in a single value the pattern of distribution of a single variable [14,15]. The distribution pattern describes whether the data are spread randomly across the whole region; the analysis distinguishes between two cases, clustering or no clustering [16,17]. Several global measures are available, of which the most common is Moran's I [11]. It uses the points or polygons of the regions together with the variable values to compute the spatial autocorrelation [15]. In its standard form, the formula is:

$$I = \frac{N}{\sum_{i}\sum_{j} W_{ij}} \cdot \frac{\sum_{i}\sum_{j} W_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i}(x_i - \bar{x})^2}$$

where $N$ is the number of observations (points or polygons), $\bar{x}$ is the mean of the variable values, $x_i$ is the variable value at a particular location, $x_j$ is the variable value at another location, and $W_{ij}$ is a weight indexing the location of i relative to j. With contiguity-based weights, $W_{ij}$ is zero for locations that are not neighbors, so the inner sum effectively runs over the n neighboring regions of location i.

Local measures are used to calculate the spatial autocorrelation between each sub-region and the neighboring sub-regions that share a border with it. The local version of Moran's I, in its standard form, is [14]:

$$I_i = \frac{x_i - \bar{x}}{S^2}\sum_{j} W_{ij}\,(x_j - \bar{x}), \qquad S^2 = \frac{1}{N}\sum_{k}(x_k - \bar{x})^2$$

where $W_{ij}$ is a weight indexing the location of i relative to j and n is the number of neighboring regions over which the sum effectively runs. These formulas are commonly used to test for spatial association in demography [8,9], economics [18], disease studies [19] and ecology [20].
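As an illustration of how these measures are computed, the following is a minimal Python sketch rather than the procedure used in ArcGIS or GeoDa. It assumes the standard definitions of global and local Moran's I (with the local statistic scaled by the variance computed over all N locations) and a one-sided permutation test for positive autocorrelation. The function names, the four-region weight matrix and the values in x are hypothetical and serve only as an example.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x and an (N x N) spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                     # deviations from the mean
    num = (w * np.outer(z, z)).sum()     # sum_i sum_j W_ij * z_i * z_j
    return len(x) / w.sum() * num / (z ** 2).sum()

def local_morans_i(x, w):
    """Local Moran's I_i for each location, scaled by S^2 = sum(z^2) / N."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    s2 = (z ** 2).sum() / len(x)
    return z / s2 * (w @ z)              # z_i / S^2 * sum_j W_ij * z_j

def permutation_p_value(x, w, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of random relabelings with I >= observed."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, w)
    sims = np.array([morans_i(rng.permutation(x), w) for _ in range(n_perm)])
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)

# Hypothetical example: four regions in a row; adjacent regions share a border.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
x = np.array([1.0, 2.0, 3.0, 4.0])       # made-up rates, increasing along the row

global_i = morans_i(x, w)
local_i = local_morans_i(x, w)
p_value = permutation_p_value(x, w, n_perm=99)
print(global_i, local_i, p_value)
```

With this scaling, the local values sum to the global statistic multiplied by the total of the weights, which is a convenient consistency check on any implementation.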

Generalized Linear Mixed Model
Generalized linear mixed models (GLMMs) have attracted statisticians' attention over recent decades. The word "generalized" refers to non-normal distributions of the response variable, and the word "mixed" refers to random effects in addition to the usual fixed effects of regression analysis [21,22]. In other words, the GLMM is an extension of the generalized linear model in which the linear predictor contains random effects in addition to the fixed effects. In its standard form, the GLMM is:

$$g(E[y \mid \gamma]) = \eta$$

where $y$ is the response vector, $g$ is the link function and $\eta$ is the linear predictor of fixed and random effects:

$$\eta = X\beta + Z\gamma$$

where $X$ is the design matrix of the fixed effects, $\beta$ is a vector of unknown fixed-effect parameters, $Z$ is the design matrix of the random effects, $\gamma$ is a vector of unknown random-effect parameters assumed to be normally distributed with mean 0 and variance $G$, and $\varepsilon$ is the vector of random errors of $y$ about its conditional mean, assumed to be uncorrelated with $\gamma$ and normally distributed with mean 0 and variance $R$ [23,24,25]. GLMMs have become widely used in many fields because of their flexibility in handling different types of data and distributions.
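To make the role of the link function and the two design matrices concrete, here is a minimal Python sketch, assuming the standard formulation in which the logit of the conditional mean equals Xβ + Zγ. It only evaluates the linear predictor and the implied probabilities and does not estimate anything. The matrices, coefficient values and function names are hypothetical.

```python
import numpy as np

def inverse_logit(eta):
    """Inverse of the logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-eta))

def conditional_mean(X, beta, Z, gamma):
    """E[y | gamma] = g^{-1}(X beta + Z gamma) for a logit link g."""
    eta = X @ beta + Z @ gamma           # fixed part plus random part
    return inverse_logit(eta)

# Hypothetical example: four children in two clusters.
X = np.array([[1, 0],                    # columns: intercept, one binary covariate
              [1, 1],
              [1, 0],
              [1, 1]], dtype=float)
beta = np.array([-2.0, 0.5])             # made-up fixed-effect coefficients
Z = np.array([[1, 0],                    # cluster membership indicators
              [1, 0],
              [0, 1],
              [0, 1]], dtype=float)
gamma = np.array([0.3, -0.3])            # made-up cluster random effects

mu = conditional_mean(X, beta, Z, gamma)
print(mu)                                # conditional probability of death per child
```

Children in the same cluster share a random effect, so their predicted probabilities move together even when their covariates differ.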

Data and Analysis
Cross-sectional Demographic and Health Surveys (DHS) are conducted in many countries around the world. The data in this paper are taken from the DHS conducted by the Ethiopian government in 2011 [26]. The survey aimed to provide demographic and health estimates for Ethiopia as a whole. It involved 596 randomly selected clusters and a total of 11654 households covering all administrative areas in the country. In this paper, the dependent variable is the death of the child within the first year of life (a binary variable), and 15 independent variables that potentially affect infant mortality are considered. This section tests for spatial association in the data and estimates a generalized linear mixed model with a spatial covariance structure.
To test for spatial autocorrelation in the data, Moran's I is used. The null hypothesis is that the spatial distribution of the infant mortality rate is random; the alternative hypothesis is that the distribution is spatially correlated. The observed value of Moran's I is 0.141, with standard error 0.07, Z-value 2.149 and P-value 0.017. This indicates significant spatial autocorrelation in the distribution of the infant mortality rate in Ethiopia. Figure 1 and Figure 2 display the Moran's I scatter plot and a map of the infant mortality rate showing how the rates are distributed.

Since spatial association is present in the data, the estimated model must account for this correlation. A generalized linear mixed model is therefore estimated. Death of the child is treated as a binary response with a logit link function. The fixed effects are region, place of residence, respondent education, electricity availability, wealth index, twin status of the child, sex of the child, size of the child at birth, members of household, number of children less than five years, total children ever born, age of respondent at first birth, number of living children, hemoglobin level and husband/partner's age. The random effect is the cluster location, with a Gaussian spatial covariance structure (SP(GAU)).

The results in Table 1 show that seven of the examined effects have a significant impact on the infant mortality rate in Ethiopia: twin status of the child, size of the child at birth, number of children less than five years, total children ever born, respondent age at first birth, number of living children and husband/partner's age (p-value less than 0.05 for each). The remaining variables in the model do not have a significant impact.
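The Gaussian spatial covariance structure makes the random effects of nearby clusters more alike than those of distant clusters. The sketch below is a hypothetical Python illustration, not the SAS code used for the analysis. It assumes the parameterization σ² exp(−d²/ρ²), where d is the Euclidean distance between two cluster locations on planar coordinates. The coordinates and parameter values are made up.

```python
import numpy as np

def gaussian_spatial_cov(coords, sigma2, rho):
    """Covariance matrix with entries sigma2 * exp(-d_ij^2 / rho^2).

    coords: (n x 2) planar coordinates of the cluster locations (assumed).
    sigma2: variance of the random effect; rho: range parameter.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise coordinate differences
    d2 = (diff ** 2).sum(axis=-1)                    # squared Euclidean distances
    return sigma2 * np.exp(-d2 / rho ** 2)

# Hypothetical example: three clusters spaced 5 distance units apart on a line.
coords = [[0, 0], [3, 4], [6, 8]]
G = gaussian_spatial_cov(coords, sigma2=2.0, rho=5.0)
print(G)
```

The diagonal equals the variance σ², and the covariance decays smoothly toward zero as the distance between clusters grows relative to ρ.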
Table 2 presents the estimates and odds ratios (OR) of the model with the spatial covariance structure (SP(GAU)). With Dire Dawa zone as the reference region, the odds of a child dying within the first year are 86.6% higher for children in Tigray zone (OR=1.686). In general, with the exception of Gambela zone, the risk of a child dying before completing the first year is higher in all other regions than in Dire Dawa (OR > 1). In Gambela zone (OR=0.969), the odds are 3.1% lower than in Dire Dawa. In terms of place of residence, the odds of infant mortality in urban areas are 7.2% higher than in rural areas (OR=1.072). Respondent education clearly affects the risk of infant mortality: relative to higher education, the risk is more than four times as high if the respondent has no education, more than three times as high with primary education, and more than twice as high with secondary education. Availability of electricity reduces the risk of infant mortality: with electricity available as the reference, the odds ratio for no electricity is 1.470. The wealth index shows mixed directions: relative to the richest group, the risk is higher for the poorer group (OR=1.139) and lower for the poorest (OR=0.892), middle (OR=0.841) and richer (OR=0.629) groups. The number of children in the pregnancy clearly affects infant mortality: relative to the second child of a multiple birth, the odds of dying within the first year are 80.3% lower for a single birth (OR=0.197) and 88.5% higher for the first child of a multiple birth (OR=1.885). With respect to the sex of the child, males are more exposed to infant mortality than females, with odds 21.2% higher (OR=1.212).
Regarding the size of the child at birth, the relationship with infant mortality is positive: relative to very small size, the risk is more than double when the size is very large or larger than average (OR=2.382/2.406), compared with OR=1.066 and 0.878 for average and smaller-than-average size respectively. Members of household, total children ever born and respondent age at first birth have a positive effect on infant mortality: each additional unit above the mean increases the risk (OR=1.046, 2.475 and 1.054 respectively). On the other hand, the number of children less than five years, the number of living children, hemoglobin level and husband/partner's age have a negative effect on infant mortality: the risk decreases with each unit above the mean (OR=0.318, 0.344, 0.998 and 0.966 respectively).
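The percentage statements above follow from the odds ratios through the relation (OR − 1) × 100, which gives the percentage change in the odds (not in the probability) relative to the reference category. A minimal Python sketch of this conversion, using a few of the odds ratios reported above, is shown here; the function name and dictionary labels are hypothetical.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds relative to the reference category."""
    return (odds_ratio - 1.0) * 100.0

# Odds ratios taken from the results reported in the text.
reported = {
    "urban vs rural": 1.072,
    "Gambela vs Dire Dawa": 0.969,
    "single birth vs 2nd of multiple": 0.197,
    "male vs female": 1.212,
}

for label, odds_ratio in reported.items():
    change = percent_change_in_odds(odds_ratio)
    direction = "higher" if change > 0 else "lower"
    print(f"{label}: OR={odds_ratio:.3f}, odds {abs(change):.1f}% {direction}")
```

An odds ratio above 1 therefore reads as a percentage increase in the odds and an odds ratio below 1 as a percentage decrease.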

Conclusion
From the results of this paper, we conclude the following: 1) The infant mortality rate is spatially correlated among Ethiopian regions. 2) Improving education plays an important role in reducing the risk of infant mortality. 3) Availability of electricity in the household increases the probability that the child completes the first year of life. 4) Males are more exposed to the risk than females. 5) The size of the child at birth significantly affects the infant mortality rate, and the relationship is positive. 6) The infant mortality risk differs considerably from one zone to another.