Determinants of Household Food Insecurity in Rural Ethiopia: Multiple Linear Regression (Classical and Bayesian Approaches)

This paper examines the determinants of food insecurity among rural households in Ethiopia using data from the Household Consumption and Expenditure (HCE) and Welfare Monitoring (WM) Surveys conducted in 2011 by the Central Statistical Agency (CSA). Bayesian multiple linear regression analysis was employed to identify the determinants of rural household food insecurity as measured by diet quality. Based on the diet quality measure, 68% of rural households were food secure and 32% were food insecure. The results show that educational level of the household head, annual per capita expenditure, farm land size, number of oxen owned, distance to input source, age of the household head, household size, gender of the household head, participation in off-farm activities, production storage, and shocks such as a rise in food prices, flood, drought and illness were the most important determinants of household food insecurity. Accordingly, the study suggests that a judicious combination of interventions that enhance income diversification opportunities in rural areas, through the promotion of off-farm activities, family planning, education, training and extension services, could help enhance household food security. Awareness creation on better and more productive use of resources such as production storage should also be emphasized in rural areas. In general, improvements in the fourteen predictor variables have the potential to increase the number of food-secure households in rural Ethiopia.


Introduction
The latest report on the State of Food Insecurity in the World (FAO, 2015) estimates the number of undernourished people in 2014-16 at 795 million, or 10.9% of the total population, a reduction from 18.6% in 1990-92 [1]. The report notes that the vast majority of the hungry (780 million people) live in the developing world, where the overall share of the hungry currently stands at 12.9% of the total population. The same report estimates that the share of undernourished people in Ethiopia in 2014-16 is 32%, down from 74.8% in 1990-92. According to the report, this improvement could be attributed to several interlinked factors, including the high GDP growth rate the country has experienced in recent years and the existing Productive Safety Net Program (PSNP). This attribution echoes other studies such as [2,3].

Statement of the Problems
Food insecurity has persisted in the country, as many rural households have lost their means of livelihood due to recurrent drought and crop failures. This study is therefore intended to fill the gap by addressing the following research questions: 1. Do demographic and socio-economic predictors significantly affect the diet quality/dietary diversity score of rural households in Ethiopia? 2. What is the level of contribution of these variables to the diet quality/dietary diversity score of rural households in Ethiopia?

Objectives of the Study
The major objective of this study is to determine the factors affecting food insecurity in rural Ethiopia. The specific objectives are:
1. To determine the factors influencing the diet quality/dietary diversity score of rural households.
2. To apply a Bayesian multiple linear regression model to diet quality.
3. To provide relevant recommendations for policy makers and suggest directions for future studies.

Data Source
The data used in this study come from the Household Consumption and Expenditure (HCE) survey and the Welfare Monitoring (WM) survey, both conducted by the Central Statistical Agency (CSA) in 2011 [4]. In each rural Enumeration Area (EA), 12 households were selected, and in each urban EA, 16 households were selected. The 2011 HCE and WM surveys covered all rural and urban areas of the country except the non-sedentary populations in Afar (three zones) and Somali (six zones).

Variables Considered in the Study
The response and explanatory variables considered to affect household food security status were selected on the basis of similar earlier studies and the data available on the subject.

Response Variables
The response variable is the Household Dietary Diversity Score (HDDS), a continuous measure based on diet quality. We analyze it using the Bayesian Multiple Linear Regression Model (BMLRM).

Explanatory Variables/Factors
Independent variables that might affect Household Food Security (HFS) status are selected for investigation in this study. The choice of explanatory variables was based primarily on literature reviews of the factors influencing HFS in the country. The variables identified in the literature as determinants of Household Food Insecurity Status (HFIS) are classified into demographic, socio-economic and shock variables.
(i) Demographic Variables: The demographic characteristics are age of head of household (AGE), gender of head of household (GEND) and household size (HHSZ).
(ii) Socio-Economic Variables: The following socio-economic factors are included in the model: educational level of head of household (EDUC), annual per capita expenditure of a household (EXPEND), farm land size of a household (FLSZ), number of oxen owned by the farm household (OXEN), household distance to input source (DIST), household use of improved technology (USETECH), household crop store (PRODUCTION), and household members participating in off-farm activities (OFFFARM).
(iii) Shock Variables: Shocks faced by households, namely a rise in the price of food items (PRICE), drought (DROUGHT), illness (ILLNESS) and flood (FLOOD).

The Model
Let $y$ denote the dependent (or study) variable that is linearly related to $k$ independent (or explanatory) variables $X_1, X_2, \ldots, X_k$ through the parameters $\beta_1, \beta_2, \ldots, \beta_k$, and write

$$y = X_1\beta_1 + X_2\beta_2 + \cdots + X_k\beta_k + \varepsilon.$$

This is called the multiple linear regression model. The parameters $\beta_1, \beta_2, \ldots, \beta_k$ are the regression coefficients associated with $X_1, X_2, \ldots, X_k$ respectively, and $\varepsilon$ is the random error component reflecting the difference between the observed and fitted linear relationship. There can be various reasons for such a difference, e.g., the joint effect of variables not included in the model, or random factors that cannot be accounted for in the model.
The regression coefficient $\beta_j$ represents the expected change in $y$ per unit change in the independent variable $X_j$. In the usual multiple regression problem, we are interested in describing the variation in a response variable $y$ in terms of $p$ predictor variables $x_1, \ldots, x_p$. We describe the mean value of $y_i$, the response for the $i$th individual, as

$$E(y_i \mid \beta, X) = \beta_1 x_{i1} + \cdots + \beta_p x_{ip}, \quad i = 1, 2, \ldots, n, \tag{2}$$

where $x_{i1}, \ldots, x_{ip}$ are the predictor values for the $i$th individual and $\beta_1, \ldots, \beta_p$ are unknown regression parameters. If we let $x_i = (x_{i1}, \ldots, x_{ip})$ denote the row vector of predictors for the $i$th individual and $\beta = (\beta_1, \ldots, \beta_p)^T$ the column vector of regression coefficients, we can re-express the mean value as

$$E(y_i \mid \beta, X) = x_i\beta.$$

The $y_i$ are assumed to be conditionally independent given the values of the parameters and the predictor variables. In the ordinary linear regression setting, we assume equal variances, where $\mathrm{var}(y_i \mid \theta, X) = \sigma^2$.
We let $\theta = (\beta_1, \ldots, \beta_p, \sigma^2)$ denote the vector of unknown parameters. Finally, we assume that the errors $\varepsilon_i = y_i - E(y_i \mid \beta, X)$ are independent and normally distributed with mean 0 and variance $\sigma^2$.
In matrix notation, this model can be written for all observations as

$$y = X\beta + \varepsilon.$$

Using this notation, the observed data model can now be expressed as

$$y \mid \beta, \sigma^2, X \sim N_n(X\beta, \sigma^2 I),$$

where $y$ is the vector of observations; $X$ is the design matrix with rows $x_1, \ldots, x_n$; $I$ is the identity matrix; and $N_p(\mu, A)$ indicates a multivariate normal distribution of dimension $p$ with mean vector $\mu$ and variance-covariance matrix $A$.

Parameter Estimation
We define the estimates of the parameters $\beta_0, \beta_1, \ldots, \beta_p$ as $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_p$, where these estimates minimize the sum of squared residuals $\sum_i e_i^2$, with

$$e_i = y_i - (\hat\beta_0 + \hat\beta_1 x_{i1} + \cdots + \hat\beta_p x_{ip}).$$

When written out for each individual parameter, the formulas for these estimates can become complex. However, by introducing matrix notation, the formula becomes compact. In fact, the formula for the least squares estimates is

$$\hat\beta = (X^T X)^{-1} X^T Y,$$

where $X^T$ is the transpose of the matrix $X$. The objective in using this formula is to minimize the Euclidean length of the difference $\lVert Y - X\hat\beta \rVert$. Suppose $\beta^*$ is a minimizing vector. Then the line $y = \beta_0^* + \beta_1^* x_1 + \beta_2^* x_2 + \cdots + \beta_p^* x_p$ is referred to as the least squares line fit to the data.

To see how to find $\beta^*$, fix a vector $Y$ in $\mathbb{R}^n$. As $\beta$ varies, the vectors $X\beta$ form a subspace of $\mathbb{R}^n$, the column space of $X$. If we wish $Y - X\beta$ to have minimum length, then it must be orthogonal to the column space of the matrix $X$. Let $\beta^*$ be such a vector, so that $Y - X\beta^*$ is orthogonal to the column space. Then the inner product of $Y - X\beta^*$ with $X\beta$ is zero for any vector $\beta$. Thus,

$$(X\beta)^T (Y - X\beta^*) = 0$$

for all vectors $\beta$. Recall that $(X\beta)^T = \beta^T X^T$. Then, using a bit of algebra,

$$\beta^T (X^T Y - X^T X \beta^*) = 0$$

for all vectors $\beta$. The fixed vector $X^T Y - X^T X\beta^*$ is orthogonal to every vector $\beta$ only if it is the zero vector. That is,

$$X^T Y = X^T X \beta^*.$$

Since $X$ is an $n \times (p+1)$ matrix (a column of ones for the intercept plus the $p$ predictors), $X^T X$ is a $(p+1) \times (p+1)$ matrix, and if $X^T X$ is invertible, then $X^T Y = X^T X\beta^*$ has the unique solution $\hat\beta = \beta^*$. Namely,

$$\hat\beta = (X^T X)^{-1} X^T Y.$$
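The orthogonality argument above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the synthetic data, the coefficient values and the helper name `ols_normal_equations` are illustrative assumptions. It solves the normal equations and verifies that the residual vector is orthogonal to the columns of the design matrix.

```python
import numpy as np

def ols_normal_equations(X, y):
    """Solve the normal equations X^T X b = X^T y for b."""
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(XtX, Xty)

# Hypothetical synthetic data: an intercept column plus two predictors.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 0.5, -0.3])   # assumed values, for illustration only
y = X @ beta_true + rng.normal(scale=0.5, size=n)

beta_hat = ols_normal_equations(X, y)
residuals = y - X @ beta_hat

# Orthogonality check: X^T (y - X beta_hat) should be numerically zero.
orthogonality = X.T @ residuals
```

In practice a least squares routine such as `np.linalg.lstsq` is numerically safer than the normal equations when the predictors are nearly collinear; the explicit form is used here only because it mirrors the derivation.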

Hypothesis Testing
We wish to test whether the explanatory variables hold significant explanatory information. Our goal is to compare the full model,

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon,$$

with the reduced model,

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_q X_q + \varepsilon,$$

where $q < p$, and to test whether there is a significant difference between the two models. As one would expect, the null hypothesis is given by

$$H_0: \beta_{q+1} = \beta_{q+2} = \cdots = \beta_p = 0.$$

Rather than comparing the individual parameters one at a time, it may be simpler to compare the models instead. The size of the residuals indicates the suitability of a model: the smaller the residuals, the better the model fits the data. Let the sum of squared residuals be denoted by SSR. Then $SSR_{full}$ and $SSR_{reduced}$ can be used to compare the reduced model against the full model. The test statistic for the null hypothesis is given by

$$F = \frac{(SSR_{reduced} - SSR_{full})/(p - q)}{\hat\sigma^2},$$

where $\hat\sigma^2$ is an estimate of the variance of the random errors. In order for $\hat\sigma^2$ to be an unbiased estimate of $\sigma^2$, define

$$\hat\sigma^2 = \frac{SSR_{full}}{n - (p + 1)},$$

where $n - (p + 1)$ is used rather than $n - 1$. The use of $p + 1$ comes from the fact that we must estimate the $p + 1$ unknown parameters $\beta_0, \beta_1, \ldots, \beta_p$ in order to form the residuals $e_i$. If we assume the random errors are normally distributed, then under the null hypothesis the test statistic $F$ has an F distribution with $p - q$ and $n - (p + 1)$ degrees of freedom. Its critical value at significance level $\alpha$ is denoted by $F_\alpha(p - q,\, n - (p + 1))$.
The hypothesis test is then: reject $H_0$ if $F > F_\alpha(p - q,\, n - (p + 1))$. Rejecting the null hypothesis indicates that there is a significant relation between the response variable $y$ and the explanatory variables excluded from the reduced model.
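As an illustration of the full-versus-reduced comparison, the sketch below computes the F statistic and its degrees of freedom in Python. It is a hedged example on synthetic data: the function names `ssr` and `partial_f`, the sample size and the coefficient values are assumptions made for illustration, and the critical value $F_\alpha$ must still be looked up separately.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from a least squares fit of y on X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return float(resid @ resid)

def partial_f(X_full, X_reduced, y):
    """F statistic for a reduced model nested in a full model.

    Both design matrices are assumed to include the intercept column,
    so X_full has p + 1 columns and X_reduced has q + 1 columns.
    """
    n = len(y)
    p = X_full.shape[1] - 1
    q = X_reduced.shape[1] - 1
    ssr_full = ssr(X_full, y)
    ssr_reduced = ssr(X_reduced, y)
    sigma2_hat = ssr_full / (n - (p + 1))   # unbiased error-variance estimate
    f_stat = ((ssr_reduced - ssr_full) / (p - q)) / sigma2_hat
    return f_stat, (p - q, n - (p + 1))

# Hypothetical data: the second predictor matters, the third does not.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=(n, 3))
X_full = np.column_stack([np.ones(n), x])
X_reduced = X_full[:, :2]                   # intercept and first predictor only
y = 2.0 + 1.0 * x[:, 0] + 0.8 * x[:, 1] + rng.normal(size=n)

f_stat, dof = partial_f(X_full, X_reduced, y)
```

The statistic `f_stat` would then be compared with the tabulated critical value for `dof` degrees of freedom at the chosen significance level.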

Statistical Tests of Individual Predictors
If the null hypothesis in the analysis of variance is rejected, another question arises: which of the regression coefficients is or are responsible for the rejection? The explanatory variables corresponding to such regression coefficients are important for the model.
Adding such explanatory variables also increases the variance of the fitted values $\hat y$, so one needs to be cautious that only those regressors are added which are really important in explaining the response. Adding unimportant explanatory variables may increase the residual mean square, which may decrease the usefulness of the model.
To test the null hypothesis $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$, note that if $H_0$ is not rejected, the explanatory variable $X_j$ can be deleted from the model. The corresponding test statistic is

$$t_j = \frac{\hat\beta_j}{\sqrt{\hat\sigma^2\,[(X^T X)^{-1}]_{jj}}},$$

where $\hat\sigma^2$ is the estimated error variance and $[(X^T X)^{-1}]_{jj}$ is the diagonal element of $(X^T X)^{-1}$ corresponding to $\hat\beta_j$.

Confidence Interval
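We can compute a $100(1 - \alpha)\%$ confidence interval for each coefficient of a multiple linear regression model. For a given $\beta_j$, $j = 0, 1, \ldots, p$, the $100(1 - \alpha)\%$ confidence interval is

$$\hat\beta_j \pm t_{\alpha/2,\; n-(p+1)} \sqrt{C_{jj}},$$

where $C_{jj}$ is the $j$th diagonal element of $C$, the symmetric variance-covariance matrix of the estimated regression coefficients defined by $C = \hat\sigma^2 (X^T X)^{-1}$, with $\hat\sigma^2$ the estimated error variance, so that $C_{jj}$ represents the variance of $\hat\beta_j$. We use the quantile $t_{\alpha/2,\; n-(p+1)}$ for the confidence interval since we are finding the interval for a single $\beta_j$. In interval notation, we can express this as

$$\left( \hat\beta_j - t_{\alpha/2,\; n-(p+1)} \sqrt{C_{jj}},\;\; \hat\beta_j + t_{\alpha/2,\; n-(p+1)} \sqrt{C_{jj}} \right).$$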
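The individual t statistics and confidence intervals can be sketched in a few lines of Python. This is an illustrative example only: the data are synthetic, the helper name `coef_inference` is hypothetical, and the critical value is passed in by the caller (1.96 is used below as a large-sample approximation to the t quantile, which is an assumption rather than an exact value).

```python
import numpy as np

def coef_inference(X, y, t_crit):
    """t statistics and confidence limits for each regression coefficient.

    X is assumed to contain the intercept column, so it has p + 1 columns.
    t_crit is the t quantile chosen by the caller for n - (p + 1) d.f.
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - k)     # unbiased error-variance estimate
    C = sigma2_hat * XtX_inv                 # covariance matrix of beta_hat
    se = np.sqrt(np.diag(C))
    t_stats = beta_hat / se
    lower = beta_hat - t_crit * se
    upper = beta_hat + t_crit * se
    return beta_hat, t_stats, lower, upper

# Hypothetical data: one relevant predictor and one irrelevant predictor.
rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# 1.96 approximates the 97.5% t quantile when n is large (an assumption here).
beta_hat, t_stats, lower, upper = coef_inference(X, y, t_crit=1.96)
```

A coefficient whose interval excludes zero corresponds to rejecting $H_0: \beta_j = 0$ at the matching significance level.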

Coefficient of Determination ($R^2$) and Adjusted $R^2$
Let $R$ be the multiple correlation coefficient between $y$ and $X_1, X_2, \ldots, X_k$. Then the square of the multiple correlation coefficient ($R^2$) is called the coefficient of determination. The value of $R^2$ describes how well the sample regression line fits the observed data. It is also treated as a measure of goodness of fit of the model.
Assuming that the intercept term is present in the model,

$$R^2 = 1 - \frac{SS_{res}}{SS_T} = \frac{SS_{reg}}{SS_T},$$

where $SS_{res}$ is the sum of squares due to residuals, $SS_T$ is the total sum of squares, and $SS_{reg}$ is the sum of squares due to regression. $R^2$ measures the explanatory power of the model, which in turn reflects the goodness of fit of the model. It reflects model adequacy in the sense of how much explanatory power the explanatory variables have.
Adjusted $R^2$: If more explanatory variables are added to the model, then $R^2$ increases. Even if the added variables are irrelevant, $R^2$ will still increase and give an overly optimistic picture. To correct this overly optimistic picture, the adjusted $R^2$, denoted $\bar R^2$ or adj $R^2$, is used, which is defined as

$$\bar R^2 = 1 - \frac{SS_{res}/(n-k)}{SS_T/(n-1)} = 1 - \left(\frac{n-1}{n-k}\right)(1 - R^2),$$

where $(n-k)$ and $(n-1)$ are the degrees of freedom associated with the distributions of $SS_{res}$ and $SS_T$.
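The effect of adding an irrelevant predictor on the two measures can be illustrated with a short Python sketch. The data and the function name `r_squared` are hypothetical and serve only to show the computation; here k counts all columns of the design matrix, including the intercept.

```python
import numpy as np

def r_squared(X, y):
    """Return (R^2, adjusted R^2) for a least squares fit with intercept.

    X is assumed to include the intercept column; k is its column count.
    """
    n, k = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    ss_res = float(resid @ resid)                 # residual sum of squares
    ss_t = float(((y - y.mean()) ** 2).sum())     # total sum of squares
    r2 = 1.0 - ss_res / ss_t
    adj_r2 = 1.0 - (ss_res / (n - k)) / (ss_t / (n - 1))
    return r2, adj_r2

# Hypothetical data: y depends on x1 only; noise_col is irrelevant.
rng = np.random.default_rng(3)
n = 150
x1 = rng.normal(size=n)
noise_col = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, noise_col])

r2_small, adj_small = r_squared(X_small, y)
r2_big, adj_big = r_squared(X_big, y)
```

By construction `r2_big` cannot fall below `r2_small`, whereas the adjusted value is penalized for the extra parameter and may decrease.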

Bayesian Multiple Linear Regression Model
Bayesian inference is the method of statistical inference based on the collected data and additional information or prior information about the study populations. This approach treats the unknown parameters as random variables.
There are three main reasons for interpreting parameter estimation within a Bayesian framework.
1. Least squares estimation (maximum likelihood) is a simplified version of Bayesian learning.
2. Bayesian learning allows the incorporation of prior knowledge about the parameter values.
3. It can be used to motivate iterative/recurrent learning, where data is sequentially received and the parameters are updated after each time step.
To complete the Bayesian formulation of the model, we assume that $(\beta, \sigma^2)$ has the typical noninformative prior $\pi(\beta, \sigma^2) \propto 1/\sigma^2$.
Bayesian inference, which allows ready incorporation of prior beliefs and the combination of such beliefs with statistical data, is well suited for representing the uncertainties in the value of explanatory variables.
Bayesian inference for multiple linear regression analysis follows the usual pattern for all Bayesian analysis: 1. Write down the likelihood function of the data. 2. Form a prior distribution over all unknown parameters. 3. Use Bayes' theorem to find the posterior distribution over all parameters.
Mathematically, the conditional probability of the observed data $D$ given the parameters $\beta$ relates to the converse conditional probability of the parameters $\beta$ given the observed data $D$:

$$P(\beta, D) = P(D \mid \beta)\,P(\beta) = P(\beta \mid D)\,P(D),$$

where $P(\beta, D)$ is the joint probability distribution of $\beta$ and the observed data $D$; $P(\beta)$ is the prior probability of $\beta$; $P(\beta \mid D)$ is the posterior probability of the parameters $\beta$; $P(D \mid \beta)$ is the likelihood function; and $P(D)$ is the probability distribution of the observed data $D$. In the Bayesian framework, there are three key components associated with the parameter $\beta$: the prior distribution, the likelihood function, and the posterior distribution. These three components are formally combined by Bayes' rule:

$$P(\beta \mid D) = \frac{P(D \mid \beta)\,P(\beta)}{P(D)} \propto P(D \mid \beta)\,P(\beta).$$

The Likelihood Function
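The likelihood function expresses the probability of the observed data as a function of the parameters. In this concept, the parameters are treated as unknown but fixed quantities.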
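For the normal regression model $y \mid \beta, \sigma^2, X \sim N_n(X\beta, \sigma^2 I)$ described in the model section, the likelihood takes the standard form below. It is written out here as a worked expression under that normality assumption, not as a formula recovered from the original text.

```latex
L(\beta, \sigma^2 \mid y, X)
  = (2\pi\sigma^2)^{-n/2}
    \exp\left\{ -\frac{1}{2\sigma^2} (y - X\beta)^T (y - X\beta) \right\}
```

Maximizing this expression over $\beta$ for fixed $\sigma^2$ is equivalent to minimizing $(y - X\beta)^T (y - X\beta)$, which is why least squares coincides with maximum likelihood under normal errors.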

Prior Distribution
The prior distribution expresses information about the parameters that is available before the data are observed, for example from previous studies. There are two types of prior distributions: informative and non-informative. A non-informative prior is used when no prior information about the parameters is available or when the variance of the prior is large.

The Posterior Distribution
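The posterior analysis for the normal regression model has a form similar to the posterior analysis of a mean and variance for a normal sampling model. We represent the joint posterior density of $(\beta, \sigma^2)$ through the product of the likelihood and the prior, which gives:
1) $\pi(\sigma^2 \mid \beta, y) \propto \text{Likelihood} \times \text{prior} \sim IG(\alpha, \beta)$
2) $\pi(\beta \mid \sigma^2, y) \propto \text{Likelihood} \times \text{prior} \sim N(\mu_\beta^*, \Sigma_\beta^*)$
Here $IG(\alpha, \beta)$ denotes an inverse gamma distribution whose two arguments are its shape and scale (this $\beta$ is not the coefficient vector), and $N(\mu_\beta^*, \Sigma_\beta^*)$ denotes a multivariate normal distribution with posterior mean vector $\mu_\beta^*$ and covariance matrix $\Sigma_\beta^*$.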

Markov Chain Monte Carlo (MCMC) Methods
As the number of variables in our models increases, it becomes more difficult to evaluate and analyze the posterior distribution. This is where MCMC methods become useful. MCMC techniques simulate the posterior so that it can be analyzed. The results can then be used to draw inferences about the models and parameters. There are many MCMC algorithms from which to choose.
Gibbs Sampling: Gibbs Sampling is one such algorithm and is especially useful in applications of Bayesian analysis. The Gibbs sampler is a technique that generates random variables indirectly from a distribution without having to calculate the density. Thus, we are able to create a sequence of easier calculations while avoiding the much more difficult ones. The main idea of the Gibbs sampler is to fix all values of the random variables, save for one. In other words, we consider univariate conditional distributions.
Gibbs sampler algorithm: 1) Gibbs sampling requires decomposing the joint posterior distribution into the full conditional distribution of each parameter in the model and then sampling from each in turn. 2) Some researchers favor this algorithm because it does not require an instrumental proposal distribution, as Metropolis methods do.
$\pi(\beta \mid \sigma^2, y)$ and $\pi(\sigma^2 \mid \beta, y)$ are the full conditional distributions from which we simulate the posterior distribution. Estimating $\beta$ from the posterior distribution analytically may be difficult; for this reason we need a non-analytic method.
The most popular simulation technique is the family of Markov Chain Monte Carlo (MCMC) methods.
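A Gibbs sampler for the normal linear regression model can be sketched as follows. This is a minimal Python illustration, not the implementation used in the study. It assumes the noninformative prior proportional to $1/\sigma^2$, under which the full conditionals are $\sigma^2 \mid \beta, y \sim IG(n/2,\ (y - X\beta)^T(y - X\beta)/2)$ and $\beta \mid \sigma^2, y \sim N(\hat\beta,\ \sigma^2 (X^T X)^{-1})$. The function name `gibbs_linreg`, the chain length and the synthetic data are hypothetical.

```python
import numpy as np

def gibbs_linreg(X, y, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for (beta, sigma^2) under the prior 1 / sigma^2."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    chol = np.linalg.cholesky(XtX_inv)      # chol @ chol.T = (X^T X)^{-1}

    beta = beta_hat.copy()                  # starting value
    beta_draws = np.empty((n_iter, k))
    sigma2_draws = np.empty(n_iter)
    for t in range(n_iter):
        # sigma^2 | beta, y ~ Inverse-Gamma(n / 2, SSR(beta) / 2),
        # drawn as the reciprocal of a Gamma(shape, scale = 1 / rate) variate.
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(shape=n / 2, scale=2.0 / (resid @ resid))
        # beta | sigma^2, y ~ N(beta_hat, sigma^2 (X^T X)^{-1})
        beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(k))
        beta_draws[t] = beta
        sigma2_draws[t] = sigma2
    # Discard the burn-in portion of the chain.
    return beta_draws[burn_in:], sigma2_draws[burn_in:]

# Hypothetical synthetic data.
rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.5, size=n)

beta_draws, sigma2_draws = gibbs_linreg(X, y)
posterior_mean = beta_draws.mean(axis=0)
```

With this prior the posterior mean of $\beta$ coincides with the least squares estimate, which gives a simple sanity check on the chain.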

Prediction of Future Observations
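Suppose we are interested in predicting a future observation $\tilde y$ corresponding to a covariate vector $x$. From the regression sampling model we have that $\tilde y$, conditional on $\beta$ and $\sigma^2$, is normal with mean $x\beta$ and standard deviation $\sigma$. The posterior predictive density of $\tilde y$, $p(\tilde y \mid y)$, can be represented by a mixture of these sampling densities $p(\tilde y \mid \beta, \sigma^2)$, where they are averaged over the posterior distribution of the parameters $\beta$ and $\sigma^2$:

$$p(\tilde y \mid y) = \int p(\tilde y \mid \beta, \sigma^2)\, \pi(\beta, \sigma^2 \mid y)\, d\beta\, d\sigma^2.$$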
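The mixture representation suggests a simple simulation: for each posterior draw of the parameters, draw one future response from the sampling model. The Python sketch below illustrates this. The posterior draws are hypothetical stand-ins generated directly for the example rather than output from a fitted model, and the function name `posterior_predictive` is an assumption.

```python
import numpy as np

def posterior_predictive(x_new, beta_draws, sigma_draws, seed=0):
    """Simulate one future response per posterior draw of (beta, sigma)."""
    rng = np.random.default_rng(seed)
    mean = beta_draws @ x_new               # x_new * beta for every draw
    return rng.normal(loc=mean, scale=sigma_draws)

# Hypothetical posterior draws standing in for MCMC output.
rng = np.random.default_rng(5)
beta_draws = rng.normal(loc=[1.0, 0.5], scale=0.05, size=(5000, 2))
sigma_draws = np.abs(rng.normal(loc=0.5, scale=0.02, size=5000))

x_new = np.array([1.0, 2.0])                # intercept term and one covariate
y_tilde = posterior_predictive(x_new, beta_draws, sigma_draws)

# A 95% predictive interval from the simulated values.
interval = np.percentile(y_tilde, [2.5, 97.5])
```

Because each simulated value uses its own parameter draw, the resulting interval reflects both sampling variability and posterior uncertainty about the parameters.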

Computation
The expressions for the posterior and predictive distributions lead to efficient simulation algorithms. To simulate from the joint posterior distribution of the regression coefficient vector $\beta$ and the error variance $\sigma^2$, one 1) simulates a value of the error variance $\sigma^2$ from its marginal posterior density $\pi(\sigma^2 \mid y)$, and 2) simulates a value of $\beta$ from the conditional posterior density $\pi(\beta \mid \sigma^2, y)$. Since the two component distributions (inverse gamma and multivariate normal) have convenient functional forms, it is relatively easy to construct an algorithm in R, such as the one programmed in the function blinreg, to perform this simulation.
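The two-step scheme can be sketched in Python as follows. This is not the source of the R function blinreg; it is an illustrative re-implementation of the same idea under the assumption of the noninformative prior proportional to $1/\sigma^2$, for which $\sigma^2 \mid y \sim IG((n-k)/2,\ S/2)$ with $S$ the residual sum of squares at $\hat\beta$, and $\beta \mid \sigma^2, y \sim N(\hat\beta,\ \sigma^2 (X^T X)^{-1})$. The name `simulate_posterior` and the synthetic data are hypothetical.

```python
import numpy as np

def simulate_posterior(X, y, m=5000, seed=0):
    """Draw m values of (beta, sigma) by the two-step composition method."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    S = resid @ resid

    # Step 1: sigma^2 | y ~ Inverse-Gamma((n - k) / 2, S / 2).
    sigma2 = 1.0 / rng.gamma(shape=(n - k) / 2, scale=2.0 / S, size=m)

    # Step 2: beta | sigma^2, y ~ N(beta_hat, sigma^2 (X^T X)^{-1}).
    chol = np.linalg.cholesky(XtX_inv)
    z = rng.standard_normal((m, k))
    beta = beta_hat + np.sqrt(sigma2)[:, None] * (z @ chol.T)
    return beta, np.sqrt(sigma2)

# Hypothetical synthetic data.
rng = np.random.default_rng(6)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.5, size=n)

beta_draws, sigma_draws = simulate_posterior(X, y)
```

Unlike a Gibbs sampler, this composition method produces independent draws, so no burn-in period is needed.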

Model Checking
Residuals: One method of assessing the goodness of fit of the model uses the posterior predictive distribution $p(\tilde y \mid y)$.
Suppose one simulates many samples $\tilde y_1, \ldots, \tilde y_n$ from the posterior predictive distribution conditional on the same covariate vectors $x_1, \ldots, x_n$ used to simulate the data. To judge whether a particular response value $y_i$ is consistent with the fitted model, one looks at the position of $y_i$ relative to the histogram of simulated values of $\tilde y_i$ from the corresponding predictive distribution. If $y_i$ is in the tail of the distribution, the observation is a potential outlier. To check the adequacy of the model parameters or the model assumptions, we can use plots such as histograms for each parameter.
A second approach is based on the use of Bayesian residuals. In a traditional regression analysis, one judges the adequacy of the fitted model by inspection of the standardized residuals

$$r_i = \frac{y_i - x_i\hat\beta}{\hat\sigma\sqrt{1 - h_{ii}}},$$

where $\hat\beta$ and $\hat\sigma$ are the usual estimates of the regression vector and error standard deviation, and $h_{ii}$ is the $i$th diagonal element of the hat matrix. From a Bayesian perspective, one can consider the distribution of the parametric residuals $\varepsilon_i = y_i - x_i\beta$. Before any data are observed, the parametric residuals are a random sample from a normal distribution with mean 0 and standard deviation $\sigma$.
Suppose we say that the $i$th observation is an outlier if $|\varepsilon_i| > k\sigma$, where $k$ is a predetermined constant such as 2 or 3. The prior probability that a particular observation is an outlier is $2\Phi(-k)$, where $\Phi(z)$ is the standard normal cdf.
After the data $y$ are observed, we can compute the posterior probability that each observation is an outlier. Define the functions $z_1$ and $z_2$ as

$$z_1 = \frac{k - \hat\varepsilon_i/\sigma}{\sqrt{h_{ii}}}, \qquad z_2 = \frac{-k - \hat\varepsilon_i/\sigma}{\sqrt{h_{ii}}},$$

where $\hat\varepsilon_i = y_i - x_i\hat\beta$. Then the posterior probability that the $i$th observation is an outlier is

$$P(|\varepsilon_i| > k\sigma \mid y) = \int \left(1 - \Phi(z_1) + \Phi(z_2)\right) \pi(\sigma^2 \mid y)\, d\sigma^2.$$

In practice, this can be computed and compared to the prior probability $2\Phi(-k)$. The R function bayes.residuals can be used to compute the posterior outlying probabilities for a linear regression model.
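The posterior outlying probabilities can be approximated by averaging $1 - \Phi(z_1) + \Phi(z_2)$ over simulated values of $\sigma$. The Python sketch below illustrates the calculation. It is not the R routine referred to above: it assumes the noninformative prior proportional to $1/\sigma^2$, so that $\sigma^2 \mid y$ is inverse gamma, and the function names, the data and the injected outlier are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    """Standard normal cdf, built from the error function."""
    return 0.5 * (1.0 + _erf(np.asarray(z) / math.sqrt(2.0)))

def outlier_probs(X, y, k=2.0, m=2000, seed=0):
    """Posterior probability that |eps_i| > k * sigma for each observation."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    e_hat = y - X @ beta_hat
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # hat-matrix diagonal
    S = e_hat @ e_hat
    # sigma^2 | y ~ Inverse-Gamma((n - p) / 2, S / 2) under the assumed prior.
    sigma = np.sqrt(1.0 / rng.gamma((n - p) / 2, 2.0 / S, size=m))

    probs = np.empty(n)
    for i in range(n):
        z1 = (k - e_hat[i] / sigma) / math.sqrt(h[i])
        z2 = (-k - e_hat[i] / sigma) / math.sqrt(h[i])
        # Monte Carlo average over the draws of sigma.
        probs[i] = np.mean(1.0 - norm_cdf(z1) + norm_cdf(z2))
    return probs

# Hypothetical data with one artificially shifted response.
rng = np.random.default_rng(7)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(scale=0.5, size=n)
y[0] += 5.0                                       # injected outlier

probs = outlier_probs(X, y, k=2.0)
prior_prob = float(2.0 * norm_cdf(-2.0))
```

Observations whose posterior probability is well above `prior_prob` are the candidates for closer inspection.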

Results and Discussion
The primary objective of this study is to determine the major factors that affect food insecurity in rural Ethiopia. The data analyzed were obtained from the Household Consumption and Expenditure (HCE) and Welfare Monitoring (WM) Surveys conducted in 2011 by the CSA [4]. A Bayesian multiple linear regression model was employed to identify the determinants of food insecurity among rural households. The response variable, the diet quality (HDDS) indicator of food insecurity, is continuous.

Descriptive Results for Food Insecurity in Rural Ethiopia
Based on the HDDS measure, of the 10,309 sampled rural households in Ethiopia, 32% were food insecure and 68% were food secure. Past studies have reported even higher levels of food insecurity: 64% insecure and 37% secure [5], and 69.2% of the sampled households of the Woreda food insecure against 30.8% food secure [6].

Univariate Results for Food Insecurity in Rural Ethiopia
For each covariate we fitted a univariate linear regression model containing a single independent variable in order to get an idea of its individual effect. In the univariate analysis (Table 1), using the t test, participation in off-farm activities, educational level of the household head, annual per capita expenditure, farm land size, number of oxen owned, production stored, gender of the household head, distance to input source, age of the household head, and shocks such as a rise in food prices, drought, illness and flood were statistically significantly associated with HDDS (diet quality) at p < 0.05, whereas use of improved technology was not significant. Based on the results of the univariate analysis, 15 selected predictor variables were included in the multivariable analysis. Using the forward stepwise method, fourteen of the fifteen predictor variables were selected and have a significant joint effect on household food insecurity (Table 2).

Multivariable BMLR Results for Food Insecurity in Rural Ethiopia
Table 3 reports the posterior estimates of the BMLR parameters (intercept and slopes) together with their 95% credible intervals. The hypotheses tested are $H_0: \beta_j = 0$ versus $H_1: \beta_j \neq 0$ for some $j$. Decision rule: reject $H_0$ if the 95% credible interval does not contain zero.
Among the variables included in the analysis, the 95% credible intervals of the posterior estimates for the dietary diversity score (diet quality) in Table 3 do not contain zero for participation in off-farm activities, educational level of the household head, annual per capita expenditure, farm land size, number of oxen owned, production stored, gender of the household head, distance to input source, age of the household head, and shocks such as a rise in food prices, drought, illness and flood. Therefore we reject $H_0$ at the α = 5% level of significance, which implies that there is a significant linear relationship between these explanatory variables and the response variable, diet quality (HDDS).
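The decision rule based on credible intervals is straightforward to apply to posterior draws. The Python sketch below shows the idea on hypothetical draws for two coefficients. The numbers are invented for illustration and are not the estimates reported in Table 3, and the function name `credible_interval_test` is an assumption.

```python
import numpy as np

def credible_interval_test(draws, level=0.95):
    """Equal-tailed credible intervals and a flag for 'zero is excluded'."""
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(draws, [tail, 100.0 - tail], axis=0)
    excludes_zero = (lower > 0) | (upper < 0)
    return lower, upper, excludes_zero

# Hypothetical posterior draws: the first coefficient is clearly positive,
# the second is centered on zero.
rng = np.random.default_rng(8)
draws = np.column_stack([
    rng.normal(loc=0.3, scale=0.05, size=4000),
    rng.normal(loc=0.0, scale=0.05, size=4000),
])

lower, upper, excludes_zero = credible_interval_test(draws)
```

A coefficient flagged in `excludes_zero` is one for which $H_0: \beta_j = 0$ would be rejected under the stated rule.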
The posterior estimates of the coefficients in Table 3 show a significant positive association (p < 0.05) between participation in off-farm activities and the Dietary Diversity Score (diet quality). Holding all other predictors constant, the Dietary Diversity Score is higher by 0.332 for households participating in off-farm activities than for households that do not participate. Consequently, households not participating in off-farm activities are more likely to be food insecure. Similar findings have been reported in [7][8][9][10]. There is a significant negative association (p < 0.05) between a rise in food prices and the Dietary Diversity Score (diet quality). Holding all other factors constant, the Dietary Diversity Score is expected to decrease by 0.181 for households exposed to a rise in food prices compared with households that are not exposed. As a result, households exposed to a rise in food prices are more likely to be food insecure than households that are not.
There is a significant positive association (p < 0.05) between household size and the Dietary Diversity Score (diet quality).
Holding all other predictors constant, the Dietary Diversity Score is higher by 0.062 for each additional household member. Consequently, the smaller the household, the more likely it is to be food insecure. This result contradicts previous studies in South Africa, Nigeria and Ethiopia [11,12,13,6]. The implication is that as household size increases, the household can consume more food groups (higher diet quality). A possible explanation is that as family size increases, the number of food groups consumed in the household increases, with the additional household members sharing different food groups. There is also a significant positive association (p < 0.05) between gender of the household head and the Dietary Diversity Score (diet quality). Holding all other factors constant, the Dietary Diversity Score is expected to be higher by 0.131 for male-headed households. Hence, female-headed households are more likely to be food insecure than male-headed households. This result is in line with a previous study in Nigeria [13].
There is a significant positive association (p < 0.05) between the education level of the household head and the Dietary Diversity Score (diet quality). Holding the effect of other predictors constant, the Dietary Diversity Score is higher by 0.029 for each additional year of education of the household head. Consequently, the lower the educational level of the household head, the more likely the household is to be food insecure. The significant negative association (p < 0.05) between distance to input sources and the Dietary Diversity Score (diet quality) indicates that, holding all other predictors constant, the Dietary Diversity Score decreases by 0.0032 for each additional kilometer of distance to input sources. Hence, the farther the household resides from agricultural input sources, the more likely it is to be food insecure. Similar findings have been reported in [7,8,14,9].
There is a significant positive association (p < 0.05) between the number of oxen owned by the farm household and the Dietary Diversity Score (diet quality). Holding the effect of other predictors constant, the Dietary Diversity Score is higher by 0.063 for each additional ox owned. Consequently, the fewer oxen the farm household owns, the more likely it is to be food insecure. Keeping the other variables constant, the Dietary Diversity Score increases by 0.039 for each additional hectare of farm land. Hence, the smaller the household's farm land, the more likely it is to be food insecure. Similar findings have been reported in [7][8][9]. Holding all other factors constant, the Dietary Diversity Score is expected to decrease by 0.166 for households exposed to a flood shock compared with those not exposed. As a result, households exposed to a flood shock are more likely to be food insecure than households that are not. The Dietary Diversity Score is expected to decrease by 0.194 for households exposed to a drought shock compared with those not exposed. As a result, households exposed to a drought shock are more likely to be food insecure than households that are not. This result agrees with other findings [15]. The significant negative association with illness implies that the Dietary Diversity Score is expected to decrease by 0.271 for households exposed to illness compared with those not exposed. As a result, households exposed to an illness shock are more likely to be food insecure than households that are not. The results for shocks such as drought and illness agree with other findings in Ethiopia [16,13]; see also [17] on the Bambara people of Kala in Mali.
The significant negative association (p < 0.05) with the age of the household head implies that, keeping the effect of other predictors constant, the Dietary Diversity Score decreases by 0.0019 for each additional year of age of the household head. Hence, the older the household head, the more likely the household is to be food insecure. This is plausible because older household heads are less productive and live on remittances and gifts; they may be unable to participate in other income-generating activities. In addition, older household heads have larger families, and their resources are distributed among the members. This result agrees with other findings [13,18].
Holding the other predictors constant, the Dietary Diversity Score increases by 0.0368 for each additional month of production stored. Hence, the less production the household has stored, the more likely it is to be food insecure. Similar findings have been reported in South Africa and in Ethiopia [11,19,20]. Holding the other predictors constant, the Dietary Diversity Score increases by 9.178617e-05 for each additional Birr of annual per capita expenditure. Hence, the smaller the household's annual per capita expenditure, the more likely it is to be food insecure.
Comparison of the discriminant analysis and Bayesian multiple linear regression model reveals that the variables, educational level of head of households, annual per capita expenditure of a households, farm land size of a households and number of oxen owned by the farm households are significantly and negatively associated with households food insecurity. Moreover distance to input source, age of the households head and shocks such as: price rice of food items and flood had a significant positive effect for food insecurity irrespective of the two measure used. On the other hand, the variable Use of improved technology were insignificant in both models. And also variables, participating in Off-farm activity, Production storage and shock such as; drought and illness were found that insignificant in discriminant analysis.
However, the variables household size and gender of head of household while significant in both models assumed opposite signs. Though this could be due to the quite different aspects of household food security the two indicators measure, these results are nevertheless suggestive of the need for caution in the use of different indicators for the same purpose. As mentioned in the literature review part, dietary diversity score indicators have their own limitation in that, they are more subjective and less comprehensive. To this end, the diet quantity based may be a more comprehensive indicator, as it is an indirect measure that takes into account of the various dimensions of food security. Quite interestingly, participation in off-farm activity, which was insignificant in discriminant analysis turned out to assume a significant negative impact in BMLRM. In fact the negative impact of off farm activities on food insecurity has been well acknowledged in the theoretical as well as empirical literature.
For instance, [5,21] for Ethiopia and [22] for Ghana have reported a negative and significant effect of off-farm activity on household food insecurity in rural areas. Our findings are, therefore, consistent with the theory and past empirical findings.
Where * denotes significant (p<0.05).

Classical Multiple Linear Regression
Before the proposed multiple linear regression analysis is used, it is necessary to verify that the fitted equation satisfies the model assumptions. The assumptions tested in this study were: homoscedasticity (constant variance), non-autoregression (randomness of residuals), non-stochastic predictors (errors uncorrelated with the individual predictors) and normality of the error distribution. These were examined by plotting the residuals against the predicted values. Multicollinearity among the predictor variables was tested using the Variance Inflation Factor (VIF).
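The VIF check mentioned above can be sketched as follows. This is a minimal illustration, not the computation used in the study: it relies on the standard identity that the VIF of predictor j, 1 / (1 - R_j^2), equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The function names and the simulated predictors are assumptions introduced for the example.

```python
import random


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Simulated predictors (hypothetical): x3 is nearly a copy of x1.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [a + 0.1 * random.gauss(0, 1) for a in x1]
vifs = vif([x1, x2, x3])
print([round(v, 2) for v in vifs])
```

In this simulated example the VIFs of x1 and x3 are large because the two predictors are almost collinear, while the VIF of x2 stays close to 1. A commonly quoted rule of thumb treats values above about 10 as a sign of problematic multicollinearity.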

Checking Multivariate Normality and Residuals Plots
Figure 1 shows the normal Q-Q plot and the residuals versus fitted values plot. Both plots indicate that the normality and residual plot assumptions are not violated.
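A numerical companion to the visual Q-Q check is the correlation between the ordered standardized residuals and the corresponding theoretical normal quantiles: values close to 1 indicate that the points lie near a straight line. The sketch below is an illustration under stated assumptions. The residuals are simulated, not the study's, and the function names and the plotting positions (i + 0.5) / n are choices made for the example.

```python
import random
from statistics import NormalDist, fmean


def qq_points(residuals):
    """Pairs (theoretical normal quantile, ordered standardized residual)."""
    n = len(residuals)
    m = fmean(residuals)
    s = (sum((r - m) ** 2 for r in residuals) / (n - 1)) ** 0.5
    ordered = sorted((r - m) / s for r in residuals)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def pearson(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Simulated residuals (hypothetical): normal errors versus skewed errors.
random.seed(2)
normal_residuals = [random.gauss(0.0, 1.0) for _ in range(500)]
skewed_residuals = [random.expovariate(1.0) for _ in range(500)]

qq_corr_normal = pearson(qq_points(normal_residuals))
qq_corr_skewed = pearson(qq_points(skewed_residuals))
print(round(qq_corr_normal, 3), round(qq_corr_skewed, 3))
```

The normally distributed residuals give a correlation very close to 1, whereas the skewed residuals give a visibly lower value, which is the numerical counterpart of points bending away from the reference line in a Q-Q plot.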

Bayesian Multiple Linear Regression

Checking Multivariate Normality and Residuals Plots
To check the adequacy of the model parameters, that is, to check the model assumptions, we can use plots such as histograms and MC realization plots. Figure 2 shows the histograms and MC realization plots of the posterior parameters, which are consistent with the assumed normal distribution for the β coefficients and inverse gamma distribution for σ. Thus, the posterior normality and inverse gamma assumptions were not violated. In general, the Monte Carlo (MC) errors were very small relative to the classical standard deviations, which indicates good estimates of the posterior means and standard deviations. Thus, the model fitted well and good convergence was attained, as seen in the fifteen plots.
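The Monte Carlo error referred to above measures how precisely the simulation estimates a posterior mean, and it is a different quantity from the posterior standard deviation of the parameter. A minimal sketch of one common estimator, the batch-means method, is given below. The draws are simulated from a normal distribution as a stand-in for sampler output; the values, seed, batch count and function name are all hypothetical.

```python
import random
import statistics


def mc_error_batch_means(draws, n_batches=20):
    """Monte Carlo standard error of the mean of `draws` by batch means."""
    size = len(draws) // n_batches
    batch_means = [statistics.fmean(draws[i * size:(i + 1) * size])
                   for i in range(n_batches)]
    # Standard deviation of the batch means, scaled to the overall mean.
    return statistics.stdev(batch_means) / n_batches ** 0.5


random.seed(1)
# Stand-in for sampler output: 10,000 draws of one coefficient (hypothetical).
draws = [random.gauss(0.0368, 0.01) for _ in range(10_000)]

posterior_mean = statistics.fmean(draws)
posterior_sd = statistics.stdev(draws)
mc_error = mc_error_batch_means(draws)
print(posterior_mean, posterior_sd, mc_error)
```

A frequently used rule of thumb regards the simulation as long enough when the MC error of a parameter is below about 5% of its posterior standard deviation. With real sampler output the draws are autocorrelated, which is why batch means are preferred to the naive standard error.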

Overall Goodness of fit Test
As shown in Table 4, the residual standard errors for the classical and Bayesian multiple linear regression models are σ = 1.372 (OLSE) and σ = 1.371 (posterior), respectively. The posterior residual standard error is thus slightly smaller than that of the ordinary least squares (OLS) estimates, which implies that the Bayesian approach gives better parameter estimates than the ordinary least squares estimates (OLSE). Thus, for our study the appropriate parameter estimation method could be the Bayesian MLR approach [23].
The overall goodness of fit of the model is first assessed by the F-value. The results in Table 4 show that the F-statistic is significant (F = 98.81, p < 2.2e-16). Therefore, we reject the null hypothesis H0: β1 = β2 = ... = βk = 0, which implies that the model as a whole is significant. In addition, the adjusted R² = 0.37, which means that the predictor variables explain 37% of the variation in the response variable diet quality (dietary diversity score).
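For reference, the overall F-statistic and the adjusted R² are both simple functions of the ordinary R², the sample size n and the number of predictors k. The sketch below uses the standard textbook formulas. The input values are hypothetical and are not the study's, since the sample size is not restated here.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors (plus intercept), n cases."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """F-statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Hypothetical inputs, for illustration only.
r2, n, k = 0.40, 100, 3
print(round(adjusted_r2(r2, n, k), 5), round(overall_f(r2, n, k), 4))
```

With these assumed inputs the adjusted R² is about 0.381 and the F-statistic about 21.33 on (3, 96) degrees of freedom. The adjustment penalizes the ordinary R² more heavily as the number of predictors grows relative to the sample size.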

Conclusions
The major objective of this study was to examine the determinants of household food insecurity in rural Ethiopia using data obtained from the Households Consumption and Expenditure (HCE) and Welfare Monitoring (WM) Surveys conducted in 2011 by CSA. A Bayesian multiple linear regression model was employed to identify the determinants of rural households' diet quality.
According to the analysis, the independent variables household size, annual per capita expenditure of a household, age of the household head, educational level of the household head, farm land size of a household, distance to input source, gender of the household head, number of oxen owned by the farm household and shocks such as a rise in the price of food items, drought, illness and flood have a significant association with diet quality as a measure of food insecurity.
Rural households with a lower educational level of the household head, a female head, a smaller household size, lower annual per capita expenditure, a smaller farm land size, fewer oxen, no stored production, no participation in off-farm activities, a longer distance to the input source and exposure to shocks such as a rise in the price of food items, drought, illness and flood were found to consume a lower number of food groups, that is, to have lower diet quality. Generally, the food security indices estimated in this study were fair representations of the extent and dimension of food security and insecurity in rural households of Ethiopia. In order to achieve food security, strategies should be designed in a way that focuses on and addresses the identified determinants as well as other factors that are useful to achieve household food security.

Recommendations
Based on the findings and conclusions of the study, the researcher forwards the following points as recommendations to policy makers and planners: 1) These findings suggest that rural food security could be improved through a comprehensive and judicious combination of interventions aimed at enhancing income diversification opportunities in rural areas, such as off-farm activities, promoting education, making better use of farm land and improving agricultural input supply services. In particular, the longer the distance that farmers travel from their home to the agricultural input sources, the more food insecure they are likely to be. Thus, there is a need for the government to formulate intervention strategies that alleviate transportation problems and to build a corporate institute that can supply agricultural inputs and provide information about the market situation.
2) Rural households and the government should give attention to shocks such as a rise in the price of food items, illness, flood and drought, which make rural households food insecure. 3) It should be noted that large household size is known to be one of the leading causes of food insecurity in rural Ethiopia. This implies that policy measures directed towards the provision of better family planning to reduce household size should be given adequate attention and priority by the government. 4) Generally, food insecurity is a multifaceted concept, which cannot be treated in isolation from other causes of poverty. Therefore, efforts geared towards achieving food security should also address human and infrastructure development.

Limitation of the Study
1) Due to the lack of recent data on household food insecurity in rural Ethiopia, the study used data from the HCE and WM surveys, which were conducted in 2011 by CSA. 2) Since the HCE and WM surveys did not cover non-sedentary populations in Afar (three zones) and Somali (six zones), we could not fully address the food security situation in rural Ethiopia.