Principal Component Analysis of Birth Weight of Child, Maternal Pregnancy Weight and Maternal Pregnancy Body Mass Index: A Multivariate Analysis

Background: Birth weight, maternal body mass index and maternal weight are perhaps the most important and reliable indicators of neonatal and infant survival as well as of physical growth and mental development. The main objective of this study was to identify the determinants of birth weight, maternal body mass index and maternal weight simultaneously, based on the 2016 Ethiopia Demographic and Health Survey; the analysis was implemented in the statistical package R. Methods: A cross-sectional study design was used with data from the 2016 Ethiopia Demographic and Health Survey. In the principal component model, the first two components accounted for 97% of the total population variance, so these two components replaced the three original response variables (birth weight, maternal body mass index and maternal weight) without much loss of information. A bivariate linear regression model was therefore used to identify factors that affect the first two principal components simultaneously. Results: Family size, region, frequency of reading newspapers, frequency of watching television and preferred waiting time for birth were statistically significant at the 5% level of significance for the first principal component. In addition, size of child, region and maternal age group were statistically significant for the second principal component of birth weight of child, maternal pregnancy weight and maternal pregnancy body mass index in Ethiopia. Conclusion: Family size, region, frequency of reading newspapers, frequency of watching television, size of child, maternal age group and preferred waiting time were significant predictors of the first two principal components. Hence, interventions should be given to pregnant women during antenatal care to minimize the risk.


Introduction
Birth weight, maternal body mass index and maternal weight are among the most important and reliable determinants of neonatal and infant survival as well as of physical growth and mental development. Ronnenberg et al. reported that maternal nutritional status is important to maternal and fetal wellbeing and that BMI is influenced by ethnicity and genetics [1].
Akgun et al. showed that nutrient intake and weight gain during pregnancy are the two main factors affecting maternal and infant outcomes [2].
Dönmez et al. reported that women gained excess weight before and during pregnancy, which leads to obesity. Because the prevalence of obesity is increasing rapidly worldwide, especially among women, many women begin pregnancy overweight or obese, and this can cause problems during pregnancy and birth [3].
Restrepo-Méndez et al. found a U-shaped relationship between age and low birth weight [4]. Auger et al. concluded that rural relative to urban residence, as well as low socio-economic status (represented by maternal education), is associated with low birth weight [5].
Globally, out of 139 million live births, about 20 million are low birth weight, and nearly 95.6% of these occur in developing countries. According to the 2011 Ethiopian Demographic and Health Survey, only 5% of children were weighed at birth.
A study conducted in Scotland from 2002 to 2004 showed that 20% of women who received antenatal care were obese, representing a twofold increase over the previous 10 years [9]. A similar study conducted in the United States showed an increase from 16% to 36% between 1980 and 1999 among women who received antenatal care [10]. In Sub-Saharan Africa (SSA), about 13% of babies are born with low birth weight (LBW) each year [6,7]. In Ethiopia, 11-13% of babies are born with low birth weight [8]. Gestational weight gain is also higher than ever before, with approximately 40% of pregnant women gaining more weight than is recommended [11]. Obesity during pregnancy may cause adverse outcomes, not only in the mother but also in the child.

Study Design and Methods
A cross-sectional study was conducted to identify the directions along which the predictor variables explain the most variation in birth weight, pregnancy weight and pregnancy body mass index. The study identified factors that have the greatest effects on pregnancy and its outcome simultaneously, in chronological order.
Study Area and Population
This study was carried out in Ethiopia, based on the 2016 Demographic and Health Survey, and included the pregnant women who participated in that survey.

Data Collection Procedures
This research used the 2016 Ethiopia Demographic and Health Survey as its source of data; it is the fourth comprehensive and nationally representative population and health survey. An important feature of the data set is that it provides in-depth information on demographic and health aspects of households. The data were collected by the Central Statistical Agency at the request of the Ministry of Health [8]. Data collection took place from January 18, 2016, to June 27, 2016.
Inclusion and Exclusion Criteria of the Study
Mothers who were pregnant and who remembered their child's birth weight, pregnancy weight and body mass index, as recorded from January 18, 2016, to June 27, 2016, were included in the study. The study therefore included 1996 pregnant women.
Variables Included in the Study
Response Variables
This study used child birth weight, maternal pregnancy weight and pregnancy body mass index as response variables.
Explanatory Variables
The predictor variables studied as determinants of child birth weight, maternal weight and maternal body mass index simultaneously were number of tetanus injections before pregnancy, age group, family size, frequency of watching TV, preferred waiting time, husband's educational level, frequency of reading a newspaper or magazine, desire for more children, size of child at birth and region.
Principal Component Analysis
Principal component analysis (PCA) is concerned with explaining the variance-covariance structure of a set of variables through a few linear combinations of these variables. It is one of a family of techniques for taking high-dimensional data and using the dependencies between the variables to represent it in a more tractable, lower-dimensional form without losing too much information. PCA is one of the simplest and most robust ways of doing such dimensionality reduction. It is a multivariate technique that analyzes a data table in which observations are described by several inter-correlated quantitative dependent variables.
Goals of Principal Component Analysis
In this study, PCA was used for the following purposes:
To extract the most important information from the data table.
To compress the size of the data set by keeping only this important information.
To simplify the description of the data set.
To analyze the structure of the observations and the variables.
In order to achieve these goals, PCA computes new variables called principal components, which are obtained as linear combinations of the original variables. The first principal component is required to have the largest possible variance.
The second component is computed under the constraint of being orthogonal to the first component and of having the largest possible inertia. The other components are computed likewise. The values of these new variables for the observations are called factor scores, and these factor scores can be interpreted geometrically as the projections of the observations onto the principal components.
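The construction described above can be illustrated with a minimal sketch. It assumes synthetic data generated only for illustration (not the survey data) and uses NumPy rather than the R package used in the study; the variable names are hypothetical.

```python
import numpy as np

# Synthetic stand-in data (NOT the survey data): 200 observations of
# three variables, two of which share a common latent source.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
Y = np.hstack([
    latent + 0.3 * rng.normal(size=(200, 1)),
    latent + 0.3 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

# Center the variables, then eigen-decompose the sample covariance matrix.
Yc = Y - Y.mean(axis=0)
cov = np.cov(Yc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues

# Reorder so that the first component has the largest variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Factor scores: projections of the observations onto the components.
scores = Yc @ eigvecs

# The scores are uncorrelated and their variances equal the eigenvalues.
score_cov = np.cov(scores, rowvar=False)
off_diagonal = score_cov - np.diag(np.diag(score_cov))
print("component variances:", np.round(np.diag(score_cov), 3))
print("largest off-diagonal covariance:", np.abs(off_diagonal).max())
```

The orthogonality constraint appears here as the orthonormal columns of `eigvecs`, and the "largest possible variance" requirement corresponds to sorting the eigenvalues in decreasing order.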
The Principal Component Model
If the observed variables are $Y_1, Y_2, \dots, Y_m$ and the new transformed variables are $\mathrm{PCA}_1, \mathrm{PCA}_2, \dots, \mathrm{PCA}_n$ (with $n = m$ when all components are kept), then the components may be expressed as linear functions of the observed variables:

$$
\begin{aligned}
\mathrm{PCA}_1 &= e_{11}Y_1 + e_{12}Y_2 + e_{13}Y_3 + \dots + e_{1m}Y_m \\
\mathrm{PCA}_2 &= e_{21}Y_1 + e_{22}Y_2 + e_{23}Y_3 + \dots + e_{2m}Y_m \\
&\;\;\vdots \\
\mathrm{PCA}_n &= e_{n1}Y_1 + e_{n2}Y_2 + e_{n3}Y_3 + \dots + e_{nm}Y_m
\end{aligned}
\tag{1}
$$

The $\mathrm{PCA}_i$ are uncorrelated; $\mathrm{PCA}_1$ explains as much as possible of the original variance in the data set, $\mathrm{PCA}_2$ explains as much as possible of the remaining variance, and so on. Equation (1) gives a small set of linear combinations that are uncorrelated with each other, which avoids the multicollinearity problem. Moreover, the linear combinations chosen have maximal variance, and a good regression design chooses values of the covariates that are spread out.
Estimation of Principal Component Coefficients
To estimate the coefficients of the principal components, note first that the variance of the $i$th principal component equals the $i$th eigenvalue $\lambda_i$ of the variance-covariance matrix $\Sigma$.
The corresponding eigenvectors $e_1$ through $e_n$ give the principal component coefficients. The eigenvalues, and hence the variances, are ordered as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$. The eigenvalues and eigenvectors of a covariance matrix differ from those of the associated correlation matrix. PCA of the covariance matrix is therefore meaningful only if the variables are expressed in the same units, and PCA of the correlation matrix should be used when the variables are on different scales.
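The difference between the two choices can be seen in a short sketch. It assumes three independent synthetic variables on deliberately different scales (illustrative values, not the survey data); the helper `variance_shares` is a hypothetical name introduced here.

```python
import numpy as np

def variance_shares(matrix):
    """Eigenvalues of a symmetric matrix as shares of their sum, largest first."""
    eigvals = np.linalg.eigvalsh(matrix)[::-1]
    return eigvals / eigvals.sum()

rng = np.random.default_rng(1)
n = 500
# Hypothetical, mutually independent variables with very different spreads.
a = rng.normal(55.0, 12.0, n)  # large spread
b = rng.normal(22.0, 4.0, n)   # medium spread
c = rng.normal(3.2, 0.9, n)    # small spread
Y = np.column_stack([a, b, c])

# PCA on the covariance matrix is dominated by the variable with the
# largest variance; PCA on the correlation matrix treats all three alike.
cov_shares = variance_shares(np.cov(Y, rowvar=False))
corr_shares = variance_shares(np.corrcoef(Y, rowvar=False))
print("covariance-based shares: ", np.round(cov_shares, 3))
print("correlation-based shares:", np.round(corr_shares, 3))
```

In this sketch the covariance-based first component mostly reflects the variable with the largest spread, whereas the correlation-based shares are close to equal because the variables were generated independently.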
Proportion of Total Population Variance of Principal Components
The proportion of total variance due to the $k$th component is

$$
\frac{\lambda_k}{\lambda_1 + \lambda_2 + \dots + \lambda_n}, \qquad k = 1, 2, \dots, n,
$$

and the proportion of total variance due to the first $k$ components is

$$
\frac{\lambda_1 + \lambda_2 + \dots + \lambda_k}{\lambda_1 + \lambda_2 + \dots + \lambda_n}, \qquad k = 1, 2, \dots, n.
$$

To decide how many principal components should be retained, it is common to summarize the results of a principal component analysis by the proportion of total variance.
If most (for instance, 80% to 90%) of the total population variance, for large $n$, can be attributed to the first one, two or three components, then these components can replace the original $n$ variables without much loss of information [13].
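As a worked illustration of this retention rule, the sketch below uses the proportions of variance reported later in this paper for the first two components (0.64 and 0.33). The third value is an assumption, inferred as the remainder because the shares of the three components must sum to 1; the function name and the 0.90 threshold are likewise illustrative.

```python
from itertools import accumulate

# Proportions reported in this study for the first two components.
proportions = [0.64, 0.33]
# Assumption: the third share is the remainder, since all shares sum to 1.
proportions.append(round(1.0 - sum(proportions), 2))

# Cumulative proportion of variance explained by the first k components.
cumulative = [round(c, 2) for c in accumulate(proportions)]

def components_needed(cumulative_shares, threshold):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    for k, share in enumerate(cumulative_shares, start=1):
        if share >= threshold:
            return k
    return len(cumulative_shares)

print(cumulative)
print(components_needed(cumulative, 0.90))
```

With these values the cumulative share reaches 0.97 at the second component, so a 90% threshold is met by retaining two components.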
The Correlation Between Components and Variables
The correlation between the component $\mathrm{PCA}_i$ and the variable $Y_k$ is

$$
\rho_{\mathrm{PCA}_i, Y_k} = \frac{e_{ik}\sqrt{\lambda_i}}{\sqrt{\sigma_{kk}}},
$$

where $\sigma_{kk}$ is the variance of $Y_k$.
Multivariate Multiple Linear Regression Models
After performing PCA, this study used multivariate multiple linear regression models, with $p > 1$ predictors and $m > 1$ response variables. The response variables are linear functions of the parameters $\beta_0, \beta_1, \dots, \beta_p$.
Each response is assumed to follow its own regression model, so that

$$
Y = X\beta + \varepsilon \tag{6}
$$

where $\beta' = \{\beta_0, \beta_1, \dots, \beta_p\}$ and $\varepsilon' = (\varepsilon_1, \varepsilon_2, \dots, \varepsilon_m)$ has $E(\varepsilon) = 0$ and $\mathrm{var}(\varepsilon) = \Sigma$. Thus, the error terms associated with different responses on the same trial are correlated. The parameter values are obtained by estimation. According to Nkurunziza et al., the most widely used estimation method is multivariate least squares [12]. In this study, backward elimination was used for variable selection. There are also different model diagnostic frameworks for identifying, analyzing and interpreting data in a given context to identify possible needs. A first step of the regression diagnostics is to inspect the significance of the regression coefficients, as well as the coefficient of determination ($R^2$), which indicates how well the linear regression model fits the data. For this study, plots of residuals versus fitted values, normal Q-Q plots and scale-location (spread-location) plots were used.
Table 2 shows that the mean (SD) weight of mothers living in the Somali and Tigray regions is 64.73 (16.47) and 50.11 (8.02) kg, respectively. In addition, the mean (SD) body mass index of mothers living in the Addis Ababa and Tigray regions is 24.84 (4.22) and 20.31 (2.92) kg/m², respectively. Furthermore, the mean (SD) birth weight of children born in the SNNP and Amhara regions is 3.5 (1.12) and 3 (0.87) kg, respectively.
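The multivariate least squares estimation and the $R^2$ check described in the methods above can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis: the data are simulated, the two predictors and the coefficient matrix `B_true` are hypothetical, and NumPy is used in place of R.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Hypothetical design matrix: intercept, one numeric predictor, one 0/1 dummy.
x_num = rng.normal(size=n)
x_dummy = rng.integers(0, 2, size=n)
X = np.column_stack([np.ones(n), x_num, x_dummy])

# Hypothetical true coefficients, one column per response (m = 2).
B_true = np.array([[0.5, -1.0],
                   [-0.2, 0.3],
                   [1.0, 0.5]])

# Errors of the two responses are correlated within an observation.
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.0]])
E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
Y = X @ B_true + E

# Multivariate least squares: B_hat solves the normal equations
# (X'X) B = X'Y for both response columns at once.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Coefficient of determination, computed separately for each response.
residuals = Y - X @ B_hat
ss_res = (residuals ** 2).sum(axis=0)
ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
r_squared = 1.0 - ss_res / ss_tot
print(np.round(B_hat, 2))
print(np.round(r_squared, 2))
```

Each column of `B_hat` coincides with the ordinary least squares fit of the corresponding response; the correlation between the error terms matters for joint inference rather than for the point estimates.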

Statistical Results
The mean (SD) weight of mothers who did not read a newspaper or magazine and of those who read one at least once per week was 55.47 (11.74) and 60.13 (11.86) kg, respectively. Moreover, the mean (SD) body mass index of mothers who read a newspaper or magazine at least once per week and of those who did not read at all was 23.82 (4.53) and 22.12 (4.32) kg/m², respectively.
The mean (SD) weight of mothers who watched TV at least once per week and of those who did not watch at all was 60.45 (12.29) and 52.48 (9.41) kg, respectively. In addition, the mean (SD) body mass index of mothers who watched TV at least once per week and of those who did not watch at all was 24.13 (4.55) and 20.85 (3.37) kg/m², respectively.
With respect to size of child, mothers of larger children had higher weight and mothers of smaller children had lower weight. The mean (SD) weight of mothers who preferred to wait less than 12 months and of those who did not know their preferred waiting time before the birth of another child was 61.07 (12.99) and 60.92 (17.09) kg, respectively. Moreover, the mean (SD) birth weight of children born to mothers who preferred to wait 4 years and to those who did not know their preferred waiting time was 3.17 (0.83) and 3.42 (0.91) kg, respectively. Mean maternal weight, body mass index and birth weight of child were higher for younger mothers than for older mothers.
Furthermore, mean maternal weight and body mass index were higher among mothers who received more tetanus injections before pregnancy, whereas the mean birth weight of children born to these mothers was lower. The E-statistic test of multivariate normality was used, and it indicated multivariate normality (P = 0.25). In addition, the residual plot in Figure 2 shows the residuals against the fitted values without distinct patterns, which supports linearity and constant variance, and Figure 1 supports normality of the errors because the residual points follow the straight dashed line.
Principal Component Analysis
Principal component analysis is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables. The principal components of birth weight, body mass index and maternal weight, together with their proportions of the total population variance, are given in Table 3. Table 3 shows that the first two components account for 97% of the variance, and the bar plot of each component's variance (see Figure 3) shows how the first two components dominate. To achieve the goals of PCA, new variables were computed as linear combinations of the original variables. The first and second principal components account for proportions of variance of 0.64 and 0.33, respectively. Table 4 gives the estimated coefficients of the principal components of birth weight, body mass index and maternal weight. Based on Table 4, the response variables birth weight, maternal weight and body mass index are reduced from three to two without much loss of information. The new response variables therefore explain 97% of the variation in the original variables, as shown in Table 3.
Furthermore, this study used $\mathrm{PCA}_1$ and $\mathrm{PCA}_2$ as response variables, obtained from the following principal component model:

$$
\begin{aligned}
\mathrm{PCA}_1 &= -0.7\,\text{Maternal Weight} - 0.7\,\text{Maternal Body Mass Index} - 0.1\,\text{Birth Weight} \\
\mathrm{PCA}_2 &= -0.07\,\text{Maternal Weight} - 0.07\,\text{Maternal Body Mass Index} - 0.99\,\text{Birth Weight}
\end{aligned}
$$

According to the principal component model, when maternal weight increases by one kg, the first and second principal components decrease by 0.7 and 0.07 units, respectively, when the other variables remain constant. Similarly, when maternal body mass index increases by one kg/m², the first and second principal components decrease by 0.7 and 0.07 units, respectively, when the other variables remain constant. In addition, when the birth weight of the baby increases by one kg, the first and second principal components decrease by 0.1 and 0.99 units, respectively, when the other variables remain constant. Finally, the two principal components replace the three original variables without much loss of information, and each original variable contributes to each principal component, although the contributions differ. Therefore, a predictor that has a statistically significant effect on a principal component also affects the original variables indirectly.
After the overall assumptions were checked and the number of response variables was reduced, a bivariate multiple linear regression model was fitted to $\mathrm{PCA}_1$ and $\mathrm{PCA}_2$. The significant predictors were modeled based on the estimated parameter values shown in Table 3. In the fitted bivariate linear regression model relating $\mathrm{PCA}_1$ and $\mathrm{PCA}_2$ to the explanatory variables, the predictor variables explain 81% and 80% of the variation in the first and second principal components of birth weight of child, maternal body mass index and maternal weight, respectively. Therefore, the model is a good fit to the data.
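The component equations earlier in this section can be evaluated directly, which makes the sign interpretation concrete. The sketch below takes the loadings as read from those equations, with the signs assumed negative to match the "decrease" interpretation in the text; the dictionary keys, the function name and the two observations are hypothetical.

```python
# Loadings as read from the component equations in the text; the signs are
# taken as negative, matching the interpretation given there.
LOADINGS = {
    "PCA1": {"maternal_weight": -0.7, "maternal_bmi": -0.7, "birth_weight": -0.1},
    "PCA2": {"maternal_weight": -0.07, "maternal_bmi": -0.07, "birth_weight": -0.99},
}

def component_score(values, component):
    """Linear combination of the three variables for one component."""
    weights = LOADINGS[component]
    return sum(weights[name] * values[name] for name in weights)

# Two hypothetical observations that differ only by one unit of maternal weight.
base = {"maternal_weight": 0.0, "maternal_bmi": 0.0, "birth_weight": 0.0}
heavier = dict(base, maternal_weight=1.0)

delta_pca1 = component_score(heavier, "PCA1") - component_score(base, "PCA1")
delta_pca2 = component_score(heavier, "PCA2") - component_score(base, "PCA2")
print(round(delta_pca1, 2), round(delta_pca2, 2))  # -0.7 -0.07
```

Because all loadings are negative under this reading, a predictor that raises a component score corresponds to lower values of the original variables, and a predictor that lowers the score corresponds to higher values.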
The model parameters for the first principal component are interpreted as follows. The mean of $\mathrm{PCA}_1$ decreased by 0.19 units when the number of household members increased by one, with the effects of the other variables held constant. This result is in line with a previous study [14]. In addition, the mean of $\mathrm{PCA}_1$ for mothers living in Somali, SNNP, Harari, Addis Ababa and Dire Dawa increased by 1.3, 0.77, 0.95, 1.11 and 1.02 units, respectively, compared with mothers living in Tigray, with the effects of the other variables held constant. This finding is consistent with the study by Ronnenberg et al. [1]. Furthermore, the mean of $\mathrm{PCA}_1$ for mothers who read a newspaper at least once per week increased by 0.54 units compared with mothers who did not read at all, with the effects of the other variables held constant. Moreover, the mean of $\mathrm{PCA}_1$ for mothers who watched TV less than once per week and at least once per week decreased by 0.57 and 0.9 units, respectively, compared with those who did not watch at all. This finding is consistent with the study by Gupta et al. [15].
The model parameters for the second principal component are interpreted as follows. The mean of the second principal component for mothers living in Amhara, Somali, Gambela and Addis Ababa increased by 0.81, 0.47, 0.49 and 0.52 units, respectively, compared with those living in Tigray, with the effects of the other variables held constant. This result is in line with the previous study by Ronnenberg et al. [1]. In addition, the mean of the second principal component for mothers aged 20-24 increased by 0.63 units compared with those aged 15-19, with the effects of the other variables held constant. This result is in line with the previous study by Restrepo-Méndez et al. [4].
Finally, the mean of the second principal component for children of larger than average size, average size, smaller than average size and very small size increased by 0.53, 1.02, 1.5 and 1.78 units, respectively, compared with children of very large size, with the effects of the other variables held constant. This result is in line with the previous studies by Furlong et al. [16,17].

Conclusion
Principal component analysis of birth weight of child, maternal pregnancy weight and maternal pregnancy body mass index reveals that 97% of the variation is accounted for by the first two principal components. The first and second principal components account for proportions of variance of 0.64 and 0.33, respectively. This study therefore examined the effects of the explanatory variables when the principal components of birth weight of child, maternal pregnancy weight and maternal pregnancy body mass index were fitted jointly.
This study determined the factors that simultaneously affect the first two principal components of birth weight of child, maternal pregnancy weight and maternal pregnancy body mass index among pregnant women in Ethiopia. Family size, region, frequency of reading newspapers, frequency of watching television and preferred waiting time for birth were statistically significant at the 5% level of significance for the first principal component. Furthermore, size of child, region and maternal age group were statistically significant at the 5% level of significance for the second principal component. The results show that the first principal component increased when family size in the household decreased; that is, when family size decreased, birth weight of child, maternal weight and maternal body mass index decreased. Furthermore, the first principal component of mothers living in Somali, SNNP, Harari, Addis Ababa and Dire Dawa decreased compared with those living in Tigray, which implies that birth weight of child, maternal pregnancy weight and body mass index increased. In addition, the first principal component of mothers who read a newspaper more than once per week increased compared with those who did not read at all, which implies that birth weight of child, maternal body mass index and maternal weight decreased. Moreover, the first principal component of mothers who watched TV increased compared with those who did not watch at all, which implies that birth weight, body mass index and maternal weight decreased.
Finally, the second principal component of mothers living in Somali, Amhara, Gambela and Addis Ababa increased compared with those living in Tigray, even when size of child and maternal age differ; this implies that, except for birth weight of child, maternal pregnancy weight and maternal body mass index increased.