Bayesian Analysis of Retention and Graduation of Female Students of Higher Education Institution: The Case of Hawassa University (HU), Ethiopia

The study was conducted on female students who entered the University in 2005, 2006, 2007, and 2008 in the fields of Natural Science, Agriculture, and Social Science. From 1931 female students, a sample of 605 was taken using stratified random sampling. Primary data, collected by questionnaire, and secondary data were analyzed using Bayesian logistic regression. The results showed that the graduation rate among the 362 females who enrolled in 2005, 2006, and 2007 was 72.1%. Similarly, the retention rate among the 243 females of the 2008 entry was 75.7%. In the Bayesian logistic regression analyses, the significant predictors of both graduation and retention were choice of field, preparatory average result, entrance exam score, and first year cumulative GPA. Moreover, pregnancy, organizing study and leisure time, the habit of chewing Khat, satisfaction with instructors, parent income, the habit of smoking cigarettes and using drugs, and feeling safe to study at night in classrooms appeared as significant predictors of retention. The graduation and retention rates for students assigned to a field they did not choose were lower than those for students assigned to the field they chose. Those with a first year CGPA below 2.0 had lower rates of graduation and retention than those with a CGPA above 2.0. The graduation and retention rates were higher for students with higher preparatory average results and higher entrance exam scores. Students whose parents' income was less than 500 were less likely to be retained than those whose parents' income was greater than 1500. The retention rate for students who were not satisfied with their instructors was lower than for those who were satisfied. Students who could not easily organize their study and leisure time were less likely to be retained than those who could.
In conclusion, the factors that mainly affect female students' graduation and retention were largely academic variables; hence we recommend assigning female students to the field of their choice, which may improve their graduation and retention. Teaching methods at secondary and preparatory schools should be designed to challenge and motivate students and to prepare them adequately for Higher Education Institutions. Moreover, campus and department administrators, in collaboration with the students themselves and academic staff, need to work hard to bring about change in the behavioral, academic, and social aspects of female students' lives at the University.


Introduction
Education is the only essential measure that can guarantee women's full participation in development programs such as leadership, health, education, agriculture, nutrition, and socio-economic programs. Educated women contribute to society by improving quality of life and by enhancing national development through higher economic production, better sanitation and nutritional practices, and reduced child and maternal mortality. They are also mothers of better educated children. Women's participation in all fields has become significant worldwide [1,2].
Retention is continued enrolment in a program. It is important for both the student and the University: the student can complete the degree she is striving for, and the University achieves its goal of retaining the student [3]. Female students' retention is currently one of the major concerns in higher education [4]. Many researchers have found that retention rates for women in the higher education system are significantly lower than those for men [5][6][7][8][9].
Graduation is the completion of a program, and the graduation rate is the completion rate [10]. If a female student is retained through all consecutive semesters with high performance, then we can be sure that she will graduate within the final year of her program.
Increasing the retention and graduation rates of female students is vital to the mission of the University system [11]. The gender gap in educational participation is wide in developing countries like Ethiopia, where female participation is lower than that of males [12]. The Ministry of Education in Ethiopia is taking many measures to improve girls' enrollment in higher education. One of these measures is affirmative action. Under this strategy, the Ethiopian Higher Education Entrance Qualification Certificate (EHEEQC) grade requirements for enrollment favor girls: the entrance exam score required of girls is lower than that required of their male counterparts.
Women's success can be measured by the percentage of women graduates. Female graduates are more likely to avoid poverty and more likely to continue to graduate and PhD programs. They also face less economic risk than those who drop out [13]. Increasing the number of female graduates will benefit not only the individual but also the public more broadly, for example through the taxes they pay.
Seyoum [14] indicated that the number of female graduates in Ethiopia between 1963 and 1973 was only 6 11.7%, which is very low [15]. Furthermore, in the same area, on average about 1,451 female students were enrolled in the Agriculture, Natural Science, and Social Science faculties in the 2003/04 and 2004/05 academic years, of whom about 74% (1075) were retained [16].
The dropout rates and educational experiences of female students cannot be ignored [13]. A high rate of dismissal among female students indicates a failure on the part of an institution to achieve its goal [17]. Not being retained has negative psychological implications for female students, such as feelings of low self-esteem [4]. The many problems faced by female students who drop out lead them into socio-economic crisis [13].
Questions arise as to whether female students' retention or graduation is the result of characteristics of individual students or of factors inherent in the structure, process, and culture of education. Many studies have sought to identify models and sets of variables that explain what affects female students' retention and graduation in the higher education system [16,18,19].

Statement of the Problem
Female students are failing to graduate at alarmingly high rates [13]. Dropping out results in severe economic risks for the students themselves, their families, and society as a whole. Female students who drop out are not only at economic risk but are also more likely to be unemployed, earn lower salaries, and suffer from health problems [20].
Female students who are not retained, even though they are given a chance to return to the University after a one-year break at home, do not want to go home. They are more likely to become psychologically ill, financially dependent, pregnant, single mothers, prostitutes, or street girls. If they give birth, their children are also likely to drop out [13].
Many factors may influence the retention and graduation of female students in Higher Education. Therefore, the following research questions were addressed in this study: 1) What are the retention and graduation rates of female students? 2) What are the factors that mainly affect the retention and graduation of female students? 3) Do female students in different fields of study have the same rates of retention and graduation? 4) Does first year performance matter for graduation or retention?

Objective of the Study
The general objective of this study was to identify the factors that influence the retention and graduation of female students at a Higher Education Institution, HU, Ethiopia.
The specific objectives of this study were: 1) To estimate the retention and graduation rates of female students. 2) To identify the major factors that influence the retention and graduation of female students. 3) To provide information for policy makers, University administrators, and researchers.

Significance of the Study
This analysis of female students' retention and graduation can serve as an example for Higher Education Institutions, suggesting approaches for improving female students' success in different fields and helping them improve their academic competence by informing them how they can be retained and graduate. It will be useful to the University by providing information about the factors that lead female students to be retained and to graduate; to University administrators and researchers by guiding further work in this area and offering a better understanding of why female students are dismissed; and to the administration in developing counseling and advising services. It is also useful in informing society about the factors that may endanger female students in higher education.

Description of the Study Area and Population
The study was carried out at HU, which is located in the Southern Nations Nationalities People's Region (SNNPR). SNNPR is located in the south and south-western part of Ethiopia. Hawassa, the regional capital of SNNPR, is one of the fastest growing and most dynamic cities in the country, with high potential to attract researchers, residents, and investors. It is located in the Sidama Zone, 275 kilometers south of Addis Ababa via Debre Zeit, and has a population of 159,013 [21]. HU was established in April 2000 and today is a comprehensive University engaged in the provision of all-round education, research, training, and community service through its diverse academic units.

Data
The data for this study comprised primary and secondary data. The secondary data were collected from the HU registrar offices to identify factors that may predict female students' graduation. These data covered the 2005/06, 2006/07, and 2007/08 entries. The primary data were collected by questionnaire from 2008/09 entry female students at the HU Main Campus and the College of Agriculture to identify the factors that may affect female students' retention. The data were collected by 8 trained enumerators and the researcher. The data satisfied the following criteria.
Exclusion and Inclusion Criteria: This study excluded female students enrolled in programs of four years or longer. It also excluded female students who entered before 2005/06 or after 2008/09.

Sampling Design and Procedure
Stratified random sampling was used as the sampling design for selecting a representative sample of female students. The technique is explained by Cochran [22] and Al-Subaihi [23].
The stratification was based on 3 Faculties and 4 years of entry, which was made as follows:
Stratum 1: Female students who were 2005 entries in Agriculture with population size N_1 and sample size n_1.
Stratum 2: Female students who were 2006 entries in Agriculture with population size N_2 and sample size n_2.
Stratum 3: Female students who were 2007 entries in Agriculture with population size N_3 and sample size n_3.
Stratum 4: Female students who were 2008 entries in Agriculture with population size N_4 and sample size n_4.
Stratum 5: Female students who were 2005 entries in Natural Science with population size N_5 and sample size n_5.
Stratum 6: Female students who were 2006 entries in Natural Science with population size N_6 and sample size n_6.
Stratum 7: Female students who were 2007 entries in Natural Science with population size N_7 and sample size n_7.
Stratum 8: Female students who were 2008 entries in Natural Science with population size N_8 and sample size n_8.
Stratum 9: Female students who were 2005 entries in Social Science with population size N_9 and sample size n_9.
Stratum 10: Female students who were 2006 entries in Social Science with population size N_10 and sample size n_10.
Stratum 11: Female students who were 2007 entries in Social Science with population size N_11 and sample size n_11.
Stratum 12: Female students who were 2008 entries in Social Science with population size N_12 and sample size n_12.

Sample Size Determination
When stratified random sampling is used to estimate a population proportion, the sample size n is obtained from the following formula [22][23]:

$$n = \frac{\sum_{i=1}^{L} N_i^2\, p_i q_i / W_i}{N^2 d^2 / Z_{\alpha/2}^2 + \sum_{i=1}^{L} N_i\, p_i q_i} \qquad (1)$$

where $n$ is the sample size needed and $p_i$ is the subpopulation proportion for stratum $i$, that is, the probability that a female student is retained or graduates. It is obtained from a previous study conducted by Tesfaye [16]. Hence, for both retention and graduation, the estimate for $p_i$ was taken to be 0.74, so that $q_i = 1 - p_i = 1 - 0.74 = 0.26$. $d$ is the precision level, that is, the margin of error. The value of $d$ must be small to give good precision; in this study $d = 0.03$ was used to minimize cost. $L = 12$ is the total number of strata, $N_i$ is the population size in stratum $i$, $N$ is the total population size, $W_i$ is the estimated proportion of $N_i$ to $N$, and $Z_{\alpha/2}$ is the inverse of the standard normal cumulative distribution corresponding to the level of confidence, equal to the upper $\alpha/2$ point of the standard normal distribution; with $\alpha = 0.05$, $Z_{\alpha/2} = 1.96$. With a common $p_i$ and $W_i = N_i/N$, formula (1) reduces to $n = n_0 / (1 + n_0/N)$ with $n_0 = Z_{\alpha/2}^2\, p q / d^2 \approx 821$, so that for $N = 1931$ the calculated sample size was n = 576. To compensate for non-response, 10% of 576 = 57.6 ≈ 58 was added to the computed n. Thus, the required sample size for this study was n = 634. Applying the strata weights, the sample allocated to each stratum was proportional to the total number of units in the stratum. The sample size for the i-th stratum was $n_i = W_i \cdot n$, so that n_1 = 39, n_2 = 51, n_3 = 35, n_4 = 55, n_5 = 45, n_6 = 27, n_7 = 39, n_8 = 137, n_9 = 52, n_10 = 39, n_11 = 35, n_12 = 80. Further, using proportional allocation within each stratum, the sample size was also broken down by department for each respective year.
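The sample-size arithmetic reported above can be checked with a short sketch. It assumes proportional allocation and a single common proportion p = 0.74 in every stratum, in which case the stratified formula collapses to the familiar finite-population-corrected form; the function name `sample_size_proportion` is illustrative and not taken from the study.

```python
import math

def sample_size_proportion(N, p, d, z):
    """Sample size for estimating a proportion with a finite population correction.

    Assumption: proportional allocation and one common p for all strata,
    so the stratified formula reduces to n0 / (1 + n0 / N).
    """
    q = 1.0 - p
    n0 = (z ** 2) * p * q / (d ** 2)   # size needed for an infinite population
    return n0 / (1.0 + n0 / N)         # shrink for the finite population of size N

# Values quoted in the text: N = 1931, p = 0.74, d = 0.03, Z = 1.96
n_exact = sample_size_proportion(N=1931, p=0.74, d=0.03, z=1.96)
n = round(n_exact)                      # 576
non_response = math.ceil(0.10 * n)      # 10% allowance: 57.6 rounded up to 58
n_total = n + non_response              # 634

# Stratum allocations n_1 ... n_12 as reported in the text
allocation = [39, 51, 35, 55, 45, 27, 39, 137, 52, 39, 35, 80]
print(n, non_response, n_total, sum(allocation))
```

The reported stratum allocations sum to the same total of 634, which is consistent with the proportional allocation described in the text.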

Variables in the Study
Variables in this study were selected based on past studies and were expected to be factors that may affect female students' retention and graduation at HU. Some of these variables are continuous and others are categorical.

Graduation Variables
The dependent variable is female students' graduation status. It has two outcomes, graduated or not graduated and it is dichotomized as 1 if the student graduated and 0 if she did not (See Table 1).

Table 1: Independent variables and their value labels (first listed variable: year of entry).

Retention Variables
The dependent variable is female students' retention status. It also has two outcomes: retained, coded as 1, or not retained, coded as 0. The independent variables are classified as demographic and socio-economic variables, family background, and so on.

Statistical Models
In this study, descriptive statistics and Bayesian logistic regression were used. Logistic regression was used to test each predictor's significance while controlling for the other predictors.

Bayesian Logistic Regression
Statistical inference is usually based on maximum likelihood estimation (MLE), which chooses the parameter values that maximize the likelihood of the data. In MLE, parameters are assumed to be unknown but fixed, and are estimated with some confidence. In Bayesian analysis, the uncertainty about the unknown parameters is quantified using probability, so that the unknown parameters are regarded as random variables. Bayesian inference is the process of analyzing statistical models while incorporating prior knowledge about the model or its parameters. The approach originates with Thomas Bayes and proceeds by applying Bayes' theorem. The posterior distribution is written as:

$$f(\beta \mid \text{data}) = \frac{f(\text{data} \mid \beta)\, f(\beta)}{\int f(\text{data} \mid \beta)\, f(\beta)\, d\beta} \propto f(\text{data} \mid \beta)\, f(\beta)$$

where $\beta$ denotes the unknown parameters, $f(\text{data} \mid \beta)$ is the likelihood, and $f(\beta)$ is the prior distribution. The prior distribution expresses the information available to the researcher before any data are involved in the statistical analysis.
Logistic regression is appropriate for data where the response variable is dichotomous, and the Bayesian logistic regression procedure is used to make inferences about the parameters of a logistic regression model. Bayesian inference for logistic regression follows the usual pattern of all Bayesian analysis. The basic components to be considered are the likelihood function of the data, a prior distribution over all unknown parameters, and the posterior distribution over all parameters.

Likelihood Function
The joint distribution of n independent Bernoulli trials is the product of the individual Bernoulli distributions, and the sum of independent and identically distributed Bernoulli trials follows a Binomial distribution.
Let $Y_1, Y_2, \dots, Y_n$ be independent Bernoulli trials with success probabilities $p_1, p_2, \dots, p_n$ respectively. That is, $y_i = 1$ with probability $p_i$ or $y_i = 0$ with probability $1 - p_i$, for $i = 1, 2, \dots, n$. Since the trials are independent, the joint distribution of $Y_1, Y_2, \dots, Y_n$ is the product of n Bernoulli probabilities. That is:

$$f(y_1, \dots, y_n \mid \beta) = \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i}, \qquad p_i = \frac{\exp(\beta_0 + \beta_1 x_{i1} + \dots + \beta_m x_{im})}{1 + \exp(\beta_0 + \beta_1 x_{i1} + \dots + \beta_m x_{im})}$$

where $p_i$ represents the probability of being retained or graduating for female student $i$ with covariate vector $X_i = (x_{i1}, \dots, x_{im})$, $Y_i = 1$ denotes retained or graduated, and $Y_i = 0$ denotes not retained or not graduated.
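As an illustration of the Bernoulli likelihood described above, the following minimal sketch evaluates its logarithm for a logistic model. The toy data and the function names are hypothetical and are not taken from the study.

```python
import math

def success_probability(beta, x):
    """Logistic success probability p_i for covariates x; beta[0] is the intercept."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))

def log_likelihood(beta, X, y):
    """Log of the product of Bernoulli probabilities over all observations."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        p_i = success_probability(beta, x_i)
        total += y_i * math.log(p_i) + (1 - y_i) * math.log(1.0 - p_i)
    return total

# Hypothetical toy data: one binary covariate, four students
X = [[0.0], [1.0], [1.0], [0.0]]
y = [0, 1, 1, 0]

ll_null = log_likelihood([0.0, 0.0], X, y)    # every p_i equals 0.5
ll_fit = log_likelihood([-1.0, 2.0], X, y)    # coefficients that agree with the data
print(ll_null, ll_fit)
```

Coefficients that agree with the observed outcomes give a larger log-likelihood than the null coefficients, which is the property that both maximum likelihood and the Bayesian posterior build on.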

Prior Distribution
It is important to specify the prior distribution in Bayesian inference because it influences the posterior inference. Care must be taken in specifying the prior mean and variance. The prior mean provides a prior point estimate for the parameter of interest, while the variance expresses the uncertainty concerning this estimate.
In general, any prior distribution can be used, depending on the available prior information. When there is some information about the likely values of the unknown parameters $\beta_0, \beta_1, \beta_2, \dots, \beta_m$, informative priors are used; when no prior information is available, non-informative (vague) priors are used.
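Most often, normal priors with mean zero and large variance are used for logistic regression parameters. The assumed normal prior distribution for parameter $\beta_j$ is given by:

$$f(\beta_j) = \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left\{-\frac{(\beta_j - \mu_j)^2}{2\sigma_j^2}\right\}$$

where $\mu_j$ is the mean and $\sigma_j^2$ is the variance of the normal distribution.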

The Posterior Distribution
The posterior distribution is obtained from the product of the full likelihood function and the prior distribution over all parameters. The posterior distribution is written as:

$$f(\beta \mid \text{data}) \propto \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i} \times \prod_{j=0}^{m} \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left\{-\frac{(\beta_j - \mu_j)^2}{2\sigma_j^2}\right\}$$

where $f(\beta \mid \text{data})$ is the posterior distribution, which is the product of the logistic regression likelihood and the normal prior distributions for the $\beta$ parameters.
The mean of the posterior distribution can be used as a point estimate of $\beta$, but computing it analytically from the posterior distribution may be difficult. Therefore, non-analytical methods such as simulation techniques are used. Of these techniques, Markov Chain Monte Carlo (MCMC) methods are the most popular. The posterior mean of the parameters can then be obtained as the mean of the sampled values of the Markov chain after the burn-in period.
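The combination of likelihood and normal priors described in this section can be sketched as an unnormalised log-posterior function. This is a minimal illustration under assumed inputs: the toy data, the prior means of zero, and the prior variance of 1000 are hypothetical choices, not values reported by the study.

```python
import math

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood under a logistic link, in a numerically stable form."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x_i))
        # y*eta - log(1 + exp(eta))
        total += y_i * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def log_prior(beta, mu, var):
    """Sum of independent normal log-densities, one per coefficient."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (b - m) ** 2 / (2.0 * v)
        for b, m, v in zip(beta, mu, var)
    )

def log_posterior(beta, X, y, mu, var):
    """Log posterior up to an additive constant: log-likelihood plus log-prior."""
    return log_likelihood(beta, X, y) + log_prior(beta, mu, var)

# Hypothetical toy data and vague priors (mean 0, variance 1000)
X = [[0.0], [1.0], [1.0], [0.0]]
y = [0, 1, 1, 0]
mu, var = [0.0, 0.0], [1000.0, 1000.0]

lp_null = log_posterior([0.0, 0.0], X, y, mu, var)
lp_fit = log_posterior([-1.0, 2.0], X, y, mu, var)
print(lp_null, lp_fit)
```

With such a large prior variance the prior term is almost flat, so the ranking of candidate coefficient vectors is driven by the likelihood.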

Assessing the Bayesian Logistic Regression Model
i. Markov Chain Monte Carlo Methods (MCMC): MCMC methods attempt to simulate direct draws from some complex distribution of interest. The most widely used MCMC technique is the Gibbs sampler. The MCMC methods are explained in detail by Stamps [24].
ii. Convergence of the Algorithm: Before simulated parameters are summarized, we must ensure that the chains have converged. Several diagnostic tests have been developed to monitor the convergence of the algorithm. Among these, the most popular and straightforward convergence assessment methods are autocorrelation plots, time series plots, the Gelman-Rubin statistic, and density plots [25].
iii. The Burn-in Period: Burn-in is an informal term that describes the practice of throwing away some iterations at the beginning of an MCMC run. It is explained by Merkle et al. [26].
iv. Assessing Accuracy of the Bayesian Logistic Regression: Once convergence is achieved, the simulation must be run for a further number of iterations to obtain samples that can be used for posterior inference. The simplest way to assess the accuracy of the posterior estimates is to calculate the Monte Carlo error (MCE) for each parameter, which indicates how precisely the quantity of interest is calculated [24].
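The simulate, discard burn-in, and summarize workflow in items i to iv can be sketched as follows. Note the swap: the text names the Gibbs sampler (as used by WinBUGS), whereas this sketch uses a random-walk Metropolis sampler because it is short to write in plain Python. The simulated data, the prior variance of 100, the step size, and the iteration counts are all hypothetical.

```python
import math
import random

def log_posterior(beta, X, y, prior_var=100.0):
    """Unnormalised log posterior: Bernoulli log-likelihood plus N(0, prior_var) priors."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = beta[0] + beta[1] * x_i
        # y*eta - log(1 + exp(eta)), written in a numerically stable form
        total += y_i * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    total -= sum(b * b for b in beta) / (2.0 * prior_var)
    return total

def metropolis(X, y, n_iter, burn_in, step, rng):
    """Random-walk Metropolis; returns post-burn-in draws and the acceptance rate."""
    beta = [0.0, 0.0]
    current = log_posterior(beta, X, y)
    draws, accepted = [], 0
    for t in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
            accepted += 1
        if t >= burn_in:          # keep only iterations after the burn-in period
            draws.append(beta)
    return draws, accepted / n_iter

# Hypothetical simulated data: intercept -0.5, slope 2.0
rng = random.Random(2024)
X = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 2.0 * x))) else 0 for x in X]

n_iter, burn_in = 4000, 1000
draws, acceptance_rate = metropolis(X, y, n_iter, burn_in, step=0.25, rng=rng)
posterior_mean = [sum(d[j] for d in draws) / len(draws) for j in range(2)]
print(posterior_mean, acceptance_rate)
```

The posterior mean is taken over the retained draws only, mirroring the use of the post-burn-in sample for inference described above.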

Results and Discussion
The sample size determined for this study was 634. However there were 29 non-respondents and so the data analyzed in this study were based on 605 respondents. All computations were conducted in Statistical Package for Social Science (SPSS) version 13.0 and WinBUGS version 14.0.
The analysis has two parts: one for graduation and the other for retention. In the first case, 362 female students who enrolled in the academic years 2005/06-2007/08 were considered, with 37.6% from the 2005/06, 32.3% from the 2006/07, and 30.1% from the 2007/08 entries. In terms of field of study, 30.7% were from Natural Science, 34.5% from Agriculture, and 34.8% from Social Science.
For the second case, 243 female students who enrolled at the University in the 2008/09 academic year were considered. Of these, 47.7% were from Natural Science, 19.3% from Agriculture, and 32.9% from Social Science.

Results from Analysis of Graduation Data
From Table 3, among the 362 female students of the 2005/06-2007/08 entries, 72.1% graduated and the remaining 27.9% did not. Regarding their background, about 72.4% were assigned to the field of their choice, while 27.6% were not. During their first year, 58.3% scored a CGPA of 2.00 or above, while the remaining 41.7% scored a CGPA below 2.00. The mean first year CGPA was 2.10 with a standard deviation of 0.7198. The mean entrance exam score of the respondents was 238.31 with a standard deviation of 35.097, while the mean preparatory average result was 69.07 with a standard deviation of 6.2839. The mean age of the respondents was 19.09 with a standard deviation of 1.048. Of the respondents, 99.7% had never married. Table 4 displays the Wald tests for the continuous covariates age, entrance exam score, and preparatory average result. The tests are all significant at the 5% level.

Bivariate Analysis Results
As can be seen in Table 5, the graduation rate was estimated to be 84.7% among the respondents who enrolled in the field of their choice. The estimate was 76.6% for Natural Science, 71.2% for Agriculture, and 69.0% for Social Science. The graduation rate among those with a first year CGPA greater than or equal to 2.00 was 93.8%. By entry, it was 71.3% for 2005/06, 75.2% for 2006/07, and 69.7% for 2007/08. The chi-square tests showed that the graduation status of the female students considered was associated with their field choice and first year CGPA.
Three parameter chains were set up to be sampled for 300000 iterations each. The first 30000 iterations were discarded from each chain, leaving a total sample of 810000 to summarize. Four of the five predictor variables were significant; the exception was age at first year (b. A), because the 95.5% confidence interval for b. A contains zero (see Table 6). In the absence of conjugate prior distributions, WinBUGS uses sampling methods. These sampling methods typically guarantee that, under regularity conditions, the resulting sample converges to the posterior distribution of interest. Thus, before we summarize simulated parameters, we must ensure that the chains have converged.
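The size of the retained sample follows directly from the chain settings quoted above. The sketch below reproduces that arithmetic and shows burn-in removal on a small hypothetical set of chains; the helper name `discard_burn_in` is illustrative.

```python
def discard_burn_in(chains, burn_in):
    """Drop the first `burn_in` iterations from every chain."""
    return [chain[burn_in:] for chain in chains]

# Settings quoted in the text
n_chains, n_iter, burn_in = 3, 300_000, 30_000
retained_total = n_chains * (n_iter - burn_in)
print(retained_total)

# Hypothetical toy chains of 10 iterations each, discarding the first iteration
toy_chains = [list(range(10)) for _ in range(3)]
kept = discard_burn_in(toy_chains, burn_in=1)
print([len(chain) for chain in kept])
```

Each chain contributes 270000 retained iterations, which gives the total of 810000 reported in the text.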

Bayesian Logistic Regression Results for Graduation
a. Time Series Plots: If the plot looks like a horizontal band, with no long upward or downward trends, then we have evidence that the chain has converged. Here, the three independently generated chains overlapped, and the plots display good mixing of the chains.
b. Autocorrelation Plot: Autocorrelation plots can also be used to assess convergence. High autocorrelations in parameter chains indicate a model that is slow to converge. Here, the plots showed that the three independent chains overlapped one another and that the autocorrelations died out at higher lags, which is evidence of convergence (results are displayed in Figure 2).
c. Gelman-Rubin Statistic (GR): For a given parameter, this statistic compares the variability within parallel chains with the variability between parallel chains. The model is judged to have converged if the ratio of between to within variability is close to 1. In the plots of this statistic, the green line represents the between variability, the blue line represents the within variability, and the red line represents the ratio. Evidence for convergence comes from the red line being close to 1 on the y-axis and from the blue and green lines being stable across the width of the plot [26]. Hence, there is evidence that convergence has been reached (results are displayed in Figure 3).
d. Density Plot: This is another recommended technique for identifying convergence. It is a smoothed kernel density estimate for a continuous variable or a histogram for a discrete variable. Since the plots for most predictor variables indicated that the coefficient has a normal distribution, the simulated parameter values indicated convergence (see Figure 4).
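The between-chain versus within-chain comparison behind the Gelman-Rubin diagnostic can be sketched numerically. This is the classical variance-based potential scale reduction factor, which is an assumption on our part: WinBUGS plots an interval-width version of the statistic, so the numbers are not directly comparable with its output. The simulated chains are hypothetical.

```python
import random
import statistics

def gelman_rubin(chains):
    """Classical potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return (pooled / within) ** 0.5

rng = random.Random(0)

# Three hypothetical chains that sample the same distribution (well mixed)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]

# Three hypothetical chains stuck around different values (not converged)
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 3.0, 6.0)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed, r_stuck)
```

Values close to 1 indicate that the chains agree with one another, while values well above 1 indicate that the chains have not yet settled on a common distribution.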

ii. Assessing Accuracy of the Bayesian Logistic Regression Model of Graduation Data
Once convergence has been achieved, the simulation must be run for a further number of iterations to obtain samples that can be used for posterior inference. The more samples we have, the more accurate the posterior estimates will be. One way to assess the accuracy of the posterior estimates is to calculate the Monte Carlo error for each parameter. This is an estimate of the difference between the mean of the sampled values and the true posterior mean. As a rule of thumb, the simulation should be run until the Monte Carlo error for each parameter of interest is less than about 5% of the sample standard deviation. Table 7 contains the estimated coefficients: the mean, the standard deviation (sd), the Monte Carlo (MC) error, 5% of the standard deviation (0.05*sd), and the 95.5% confidence interval. The MC error for choice of field (CH) is 0.0012, which is less than 0.05*sd = 0.01902. The MC error for entrance exam score is 0.0001, which is less than 0.05*sd = 0.00034, and so on. This implies that the MC error for each significant predictor variable is less than 5% of its posterior standard deviation; hence convergence and accuracy of the posterior estimates are attained, and the model is appropriate for estimating posterior statistics.
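The 5% rule of thumb can be expressed as a small check. In the sketch below, the MC errors are the two values quoted in the text, and the standard deviations are back-calculated from the quoted 0.05*sd thresholds (0.01902 / 0.05 and 0.00034 / 0.05). The batch-means estimator is one common way to estimate a Monte Carlo error and is shown on hypothetical draws; it is not claimed to reproduce the Table 7 values.

```python
import math
import random
import statistics

def passes_rule_of_thumb(mc_error, sd, fraction=0.05):
    """True when the Monte Carlo error is below the given fraction of the posterior sd."""
    return mc_error < fraction * sd

def batch_means_mc_error(draws, n_batches=20):
    """Batch-means estimate of the Monte Carlo standard error of the sample mean."""
    size = len(draws) // n_batches
    means = [statistics.fmean(draws[i * size:(i + 1) * size]) for i in range(n_batches)]
    return statistics.stdev(means) / math.sqrt(n_batches)

# Values quoted in the text; sd back-calculated from the reported 0.05*sd
reported = {
    "b.CH": {"mc_error": 0.0012, "sd": 0.01902 / 0.05},
    "b.EN": {"mc_error": 0.0001, "sd": 0.00034 / 0.05},
}
checks = {name: passes_rule_of_thumb(v["mc_error"], v["sd"]) for name, v in reported.items()}
print(checks)

# Hypothetical posterior draws to illustrate the batch-means estimator
rng = random.Random(1)
toy_draws = [rng.gauss(0.0, 1.0) for _ in range(4000)]
toy_mc_error = batch_means_mc_error(toy_draws)
print(toy_mc_error)
```

Both quoted parameters satisfy the rule, in line with the conclusion stated above.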
The confidence intervals for the constant (alpha) and for the predictor variables choice of field (b. CH), entrance exam score (b. EN), first year CGPA (b. FGP), and preparatory average result (b. PAR) do not include zero. Therefore, these predictor variables were significant and hence are the main factors affecting female students' graduation. The mean values for the b. CH, b. EN, b. FGP, and b. PAR predictor variables were 1.448, 0.0304, 2.333, and 0.1806 respectively. These values are all positive, indicating a positive relationship between the dependent variable graduation and these predictor variables. For example, in the case of choice of field (b. CH), since 'No' is coded as 0 and 'Yes' is coded as 1, the positive value indicates that female students who were assigned to the field they chose were more likely to graduate than those who were not. In the case of entrance exam score (b. EN), female students with higher entrance exam scores were more likely to graduate than those with lower scores (results are displayed in Table 7).
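Because the model is a logistic regression, each coefficient acts on the log-odds scale, so exponentiating a coefficient gives an approximate odds ratio. The sketch below applies this standard interpretation to the posterior means quoted above. It is an illustrative reading, not a result reported by the study, and the exponential of a posterior mean is only a rough guide to the posterior distribution of the odds ratio itself.

```python
import math

# Posterior means quoted in the text
posterior_means = {"b.CH": 1.448, "b.EN": 0.0304, "b.FGP": 2.333, "b.PAR": 0.1806}

# exp(coefficient): multiplicative change in the odds of graduating
# for a one-unit increase in the predictor (or 'Yes' versus 'No' for b.CH)
odds_ratios = {name: math.exp(value) for name, value in posterior_means.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: exp(mean) = {ratio:.3f}")
```

All four values exceed 1, which matches the positive relationships described in the text.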

Interpretation and Discussion for the Graduation Analysis Data Results
This study estimated female students' graduation rate and provided information about the factors that determine female students' graduation at Hawassa University. According to the results, among the 362 female students of the academic years 2005/06-2007/08, with a mean age at first year of 19.09, 72.1% graduated and the remaining 27.9% did not. About 37.6% were 2005/06 entries, 32.3% were 2006/07 entries, and 30.1% were 2007/08 entries. Of these students, 30.7% were from the Natural Science Faculty, 34.8% from the Social Science Faculty, and 34.5% from the Agriculture Faculty. About 72.4% were assigned to the field of their choice, while 27.6% were not. During their first year, 58.3% scored a CGPA of 2.00 or above, while the remaining 41.7% scored a CGPA below 2.00.
From the bivariate and univariate results, choice of field, first year CGPA, entrance exam score, preparatory average result, and age at first year were found to be significant predictors of the dependent variable graduation.
From the Bayesian logistic regression analysis, the constant (alpha) and the predictor variables choice of field (b. CH), entrance exam score (b. EN), first year CGPA (b. FGP), and preparatory average result (b. PAR) were significant predictors of female students' graduation. The mean (β) values of all the significant predictors were positive, indicating a positive relationship between the dependent variable graduation and these predictor variables. One variable that affects female students' graduation was preparatory average result (b. PAR). Female students with higher preparatory average results were more likely to graduate than those with lower results; similar results were found in an earlier study by Zhang et al. [11].

Results from Analysis of Retention Data
From the descriptive results in Table 8, among the 243 female students of the 2008/09 entry, 75.7% (184) were retained and the remaining 24.3% were not.
The mean age of the participants was 18.576 with a standard deviation of 0.7856; the mean entrance exam score was 241.90 with a standard deviation of 32.429; the mean first year CGPA was 2.352 with a standard deviation of 0.4970; and the mean preparatory average result was 71.116 with a standard deviation of 6.2114 (results are displayed in Table 8).

Univariate Analysis Results for Retention Data
The results in Table 9 indicated that the Wald statistics for the predictor variables entrance exam score and preparatory average result are significant at the 5% level of significance. Hence, these predictor variables were selected for the logistic regression analysis.

Bivariate Analysis Results for Retention Data
About 95.9% (233) were single but 4.1% were married. About 20.2% were from the region Amhara, 17.3% were from the region SNNPR, 28.4% were from the region Oromia, 28.8% were from the region Addis Ababa, 2.9% were from the region Tigray and 2.5% were from other than these regions. In the case of choice of field, 'Yes' stands for the 1 st or 2 nd choices but 'No' stands for the student's 3 rd or above choices. Hence 79.4% were assigned to their 1 st or 2 nd choices and the rest 20.6% were assigned to their 3 rd or above choices. About 3.3% of female students were having a father with no job, 26.7% were having a father who is a farmer, 23.5% were having a father who runs his own business and 46.5% were having an employed father. The percent of female students having their parents' income per month less than 500, between 500 and 1000, between 1001 and 1500, and greater than 1500 was 22.6%, 17.7%, 19.8% and 39.9% respectively. The percent of female students who ever got pregnancy was 5.8%. The percent of female students who were having a habit of chewing Khat was 8.6% and that of having a habit of smoking cigarette and using drugs was 3.3%. During their first year stay, 77.8% of them were having a CGPA of greater than or equal to 2.0 but the rest 22.2% of them were having a CGPA of less than 2.0 (See Table 10).
From Table 11, about 53.5% of female students were satisfied with their instructors. About 77.0% of them were committed to their academic activities. 71.6% of them were able to organize their study and leisure time easily. 51.9% of female students were able to focus on exam. 40.7% of them were being nervous and forgot facts they know on exam. About 37.9% of female students felt that they never achieve academic respects even if they work hard. The sociability of 66.7% female students helped them to reach their educational goals. About 36.2% of them felt safe to study in classrooms at night. About 62.1% of them matched their expectation about the field with the reality they faced.
Among the 79.4% of female students who were assigned to their 1st or 2nd choice, 87.6% were retained, but among the 20.6% who were assigned to their 3rd or a later choice only 30.0% were retained. Among the 94.2% who had never been pregnant, 79.9% were retained, but among the 5.8% who had ever been pregnant only 7.1% were retained. Among the 53.5% who were satisfied with their instructors, 80.8% were retained, but among the 25.9% who were not satisfied with their instructors, 60.3% were retained. Among the 77.0% who were committed to their academic activities, 80.2% were retained. Among the 77.8% whose first year CGPA was greater than or equal to 2.0, 86.8% were retained, but among the 22.2% whose first year CGPA was below 2.0, 37.0% were retained. As we can see from Table 10 and Table 11, the asymptotic significance value for both the chi-square and likelihood ratio tests, and for both Kendall's tau-b and tau-c tests, is below 0.05 for the predictor variables choice of field, first year CGPA, pregnancy, parent income, habit of chewing Khat, habit of smoking cigarette and using drugs, focus on exam, commitment to academic activities, organize study and leisure time, satisfaction with instructors and never achieve academic respects. This indicates a significant relationship between the dependent variable retention and these predictor variables. The following predictor variables are also significant at the 25% level of significance: father occupation, being nervous and forget facts on exam, help of sociability in campus to reach educational goals, feel safe to study at night in classrooms, expectation about field and weak because of personal problems. Hence these predictor variables can be selected together with the other significant predictor variables for the logistic regression analysis.
**Significant at 5% level and *Significant at 25% level.
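To illustrate the kind of chi-square test of independence referred to above, the following minimal sketch computes the Pearson statistic for retention by choice of field. The cell counts are an assumption: they are reconstructed from the reported percentages (79.4% and 20.6% of 243 students, with 87.6% and 30.0% retained) rather than copied from Table 10, so the statistic is indicative only and need not match the tabulated value.

```python
# Counts reconstructed from the percentages reported in the text (an assumption;
# the exact cell counts of Table 10 are not reproduced here).
# Rows: assigned to 1st/2nd choice ('Yes'), assigned to 3rd or later choice ('No').
# Columns: retained, not retained.
observed = [
    [169, 24],  # 'Yes': 193 students, about 87.6% retained
    [15, 35],   # 'No': 50 students, 30.0% retained
]


def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of row and column factors.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


chi2 = pearson_chi_square(observed)
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value for 1 degree of freedom at the 5% level
print(f"chi-square = {chi2:.2f}; significant at 5%: {chi2 > CRITICAL_5PCT_DF1}")
```

With a 2 x 2 table the test has one degree of freedom, so a statistic above 3.841 corresponds to an asymptotic significance value below 0.05.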

Bayesian Logistic Regression Results for Retention Data
The predictor variables that were significant in the bivariate and univariate analyses were selected for the Bayesian logistic regression. These were choice of field, first year CGPA, entrance exam score, preparatory average result, parent income, habit of chewing Khat, habit of smoking cigarette and using drugs, pregnancy, focus on exam, commitment to academic activities, organize study and leisure time, satisfaction with instructors, father occupation, being nervous and forget facts on exam, help of sociability in campus to reach educational goals, never achieve academic respects, feel safe to study at night in classrooms, expectation about field, and weak because of personal problems.
Three chains were set up and each was sampled for 400000 iterations. The first 150000 iterations of each chain were discarded as burn-in, leaving a pooled sample of 750000 draws (3 × 250000) to summarize. Eleven predictor variables were significant in the Bayesian logistic regression (See Table 12).
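The sampling scheme described above (three chains, a discarded burn-in, pooled remaining draws) can be sketched on a small scale as follows. This is an illustration under stated assumptions, not the procedure used in the study: the data are synthetic with a single binary predictor, the Normal(0, 10^2) priors are assumed, the sampler is a plain random-walk Metropolis algorithm, and the chain length and burn-in (2000 and 750) are scaled-down stand-ins for 400000 and 150000.

```python
import math
import random


def log_posterior(beta, xs, ys, prior_sd=10.0):
    """Log posterior (up to a constant) for logit P(y=1) = b0 + b1*x
    with independent Normal(0, prior_sd^2) priors (an assumed prior)."""
    b0, b1 = beta
    lp = -(b0 ** 2 + b1 ** 2) / (2.0 * prior_sd ** 2)
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x
        # Bernoulli log-likelihood y*eta - log(1 + exp(eta)), in a stable form.
        lp += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return lp


def run_chain(xs, ys, start, n_iter, step, rng):
    """Random-walk Metropolis sampler; returns every draw, burn-in included."""
    current = list(start)
    current_lp = log_posterior(current, xs, ys)
    draws = []
    for _ in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in current]
        proposal_lp = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(tuple(current))
    return draws


rng = random.Random(2008)

# Synthetic data: hypothetical binary predictor with a positive true effect.
xs = [rng.choice([0, 1]) for _ in range(150)]
ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 2.0 * x))) else 0 for x in xs]

N_ITER, BURN_IN = 2000, 750                       # scaled-down stand-ins
starts = [(-2.0, -2.0), (0.0, 0.0), (2.0, 2.0)]   # dispersed starting values

pooled = []
for start in starts:
    chain = run_chain(xs, ys, start, N_ITER, step=0.3, rng=rng)
    pooled.extend(chain[BURN_IN:])                # discard burn-in, pool the rest

posterior_mean_b1 = sum(d[1] for d in pooled) / len(pooled)
print(f"pooled draws: {len(pooled)}, posterior mean of b1: {posterior_mean_b1:.2f}")
```

Starting the chains from dispersed values is what makes the later convergence checks informative: if all chains settle in the same region after burn-in, the pooled draws can reasonably be treated as a sample from the posterior.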
i. Assessing Convergence for Retention Data
Before we summarize the simulated parameters, we must ensure that the chains have converged.

a. Time series plots for retention data
The three independently generated chains overlapped one another, and the plots display good mixing of the chains (See Figure 5).
b. Autocorrelation plot for retention data
The plots showed that the three independent chains overlapped one another and that the autocorrelations died out at higher lags, which is evidence of convergence (See Figure 6).
c. Gelman-Rubin statistic (GR) for retention data
Evidence for convergence comes from the red line being close to 1 on the y-axis and from the blue and green lines being stable across the width of the plot [26]. Hence, there is evidence that convergence has been reached (See Figure 7).
d. Density plot for retention data
Since the plots for most predictor variables indicated that the coefficients are approximately normally distributed, the simulated parameter values indicated convergence (See Figure 8).
ii. Assessing Accuracy of Bayesian Logistic Regression Model for Retention Data
Once convergence has been achieved, the simulation must be run for a further number of iterations to obtain samples that can be used for posterior inference. The simulation should be run until the Monte Carlo error for each parameter of interest is less than about 5% of the sample standard deviation. Table 12 contains the estimated coefficients (posterior mean, β̂), the standard deviation (sd), the Monte Carlo (MC) error, 5% of the standard deviation (0.05*sd) and the 95.5% confidence interval.
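The Gelman-Rubin check and the 5% Monte Carlo error rule described above can be sketched numerically as follows. This is a hedged illustration: the three chains are simulated autocorrelated series standing in for post-burn-in draws of one coefficient, the batch-means estimator of the MC error is one common choice and is assumed here, and the function names are hypothetical.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter,
    computed from m chains of equal length n."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                       # B
    pooled_var = (n - 1) / n * within + between / n
    return math.sqrt(pooled_var / within)


def mc_error_batch_means(draws, n_batches=50):
    """Monte Carlo standard error of the posterior mean (batch-means method)."""
    batch_size = len(draws) // n_batches
    batch_means = [
        statistics.fmean(draws[i * batch_size:(i + 1) * batch_size])
        for i in range(n_batches)
    ]
    return statistics.stdev(batch_means) / math.sqrt(n_batches)


def simulate_chain(rng, n, mean=1.5, rho=0.5, noise_sd=0.4):
    """Autocorrelated AR(1) series used as a stand-in for MCMC output."""
    value, out = mean, []
    for _ in range(n):
        value = mean + rho * (value - mean) + rng.gauss(0.0, noise_sd)
        out.append(value)
    return out


rng = random.Random(12)
chains = [simulate_chain(rng, 5000) for _ in range(3)]   # hypothetical draws
pooled = [v for chain in chains for v in chain]

rhat = gelman_rubin(chains)
sd = statistics.stdev(pooled)
mc_error = mc_error_batch_means(pooled)

print(f"Gelman-Rubin statistic: {rhat:.3f} (values close to 1 suggest convergence)")
print(f"MC error {mc_error:.4f} vs 0.05*sd {0.05 * sd:.4f}: "
      f"{'accurate enough' if mc_error < 0.05 * sd else 'run longer'}")
```

The comparison in the last line mirrors the MC error and 0.05*sd columns of Table 12: a parameter passes when its MC error is the smaller of the two.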
The MC error for each significant predictor variable is less than 5% of its posterior standard deviation; hence convergence and accuracy of the posterior estimates are attained and the model is appropriate for estimating posterior statistics. The confidence intervals for the constant (alpha) and for the predictor variables choice of field (b. CH), habit of chewing Khat (b. Chew), entrance exam score (b. EN), first year CGPA (b. FGP), feel safe to study at night in classrooms (b. FeelSfniClas), organizing study and leisure time (b. OrganStu), preparatory average result (b. PAR), pregnancy (b. Pregn), satisfaction with instructors (b. SatInstr), habit of smoking cigarette and using drugs (b. SmokUsedrug) and parent income (b. parincom) do not include zero, indicating that they are significant. Therefore, these predictor variables mainly affect female students' retention. The posterior mean (β̂) values for habit of chewing Khat (b. Chew), feel safe to study at night in classrooms (b. FeelSfniClas), pregnancy (b. Pregn) and smoking cigarette and using drugs (b. SmokUsedrug) are negative, indicating a negative relationship between the dependent variable retention and these predictor variables.
The values for the other significant predictors were positive, indicating a positive relationship. For example, in the case of choice of field (b. CH), since 'No' is coded as 0 and 'Yes' is coded as 1, the positive value indicates that female students who were assigned to the field they chose were more likely to be retained than those who were not assigned to the field of their choice. In the case of pregnancy (b. Pregn), since 'No' is coded as 0 and 'Yes' is coded as 1, the negative value indicates that female students who had ever been pregnant were less likely to be retained than those who had not (Results are displayed in Table 12).
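The sign interpretation above can be made concrete by converting a logistic regression coefficient to an odds ratio, exp(β). The sketch below uses hypothetical coefficient values chosen only for illustration; they are not the Table 12 estimates, and the function names are assumptions.

```python
import math


def odds_ratio(beta):
    """Multiplicative change in the odds of retention when a 0/1 predictor goes from 0 to 1."""
    return math.exp(beta)


def retention_probability(alpha, beta, x):
    """P(retained) under logit(p) = alpha + beta * x for a single 0/1 predictor."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


# Hypothetical values for illustration only (NOT the estimates in Table 12).
alpha = 0.2
beta_choice = 1.0    # positive coefficient, as reported for choice of field
beta_pregn = -1.0    # negative coefficient, as reported for pregnancy

for name, beta in [("choice of field", beta_choice), ("pregnancy", beta_pregn)]:
    p_no = retention_probability(alpha, beta, 0)   # predictor coded 0 ('No')
    p_yes = retention_probability(alpha, beta, 1)  # predictor coded 1 ('Yes')
    print(f"{name}: odds ratio = {odds_ratio(beta):.2f}, "
          f"P(retain | No) = {p_no:.2f}, P(retain | Yes) = {p_yes:.2f}")
```

A positive coefficient gives an odds ratio above 1 (the 'Yes' group is more likely to be retained), while a negative coefficient gives an odds ratio below 1.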

Interpretation and Discussion for the Retention Analysis Data Results
This study has provided some information about the factors that determine female students' retention at HU. According to the results, among the 243 female students of the 2008/09 entry, whose mean age in the first year was 18.5761, 75.7% were retained and the remaining 24.3% were not. Of these students, 47.7% were from the Natural Science Faculty, 32.9% from the Social Science Faculty and 19.3% from the Agriculture Faculty. For choice of field, 'Yes' stands for assignment to the student's 1st or 2nd choice and 'No' stands for assignment to the 3rd or a later choice. Hence 79.4% were assigned to their 1st or 2nd choice and the remaining 20.6% to their 3rd or a later choice. About 77% were committed to their academic activities. About 71.6% were able to organize their study and leisure time easily. About 40.7% became nervous and forgot facts they knew during exams. About 77.8% had a first year CGPA greater than or equal to 2.0 and the remaining 22.2% had a CGPA below 2.0. About 53.5% of the female students were satisfied with their instructors.
From the bivariate and univariate results, choice of field, first year CGPA, pregnancy, parent income, habit of chewing Khat, habit of smoking cigarette and using drugs, focus on exam, commitment to academic activities, organize study and leisure time, satisfaction with instructors, never achieve academic respects, father occupation, being nervous and forget facts on exam, help of sociability in campus to reach educational goals, feel safe to study at night in classrooms, expectation about field and weak because of personal problems were found to be significant predictors of female students' retention.

Conclusions
The objective of this study was to identify the factors that affect female students' graduation and retention.
Graduation of female students depends significantly upon several factors. The bivariate and univariate analyses indicated that preparatory average result, choice of field, entrance exam score, first year CGPA and age significantly affect female students' graduation. The Bayesian logistic regression analyses showed that preparatory average result, choice of field, entrance exam score and first year cumulative GPA mainly affected female students' graduation. The graduation rate for female students who were assigned to the field they chose was higher than that for those assigned to a field they did not choose. Female students whose first year CGPA was below 2.0 had a lower graduation rate than those whose first year CGPA was above 2.0. The graduation rate for female students with a higher preparatory average result was higher than that for female students with a lower preparatory average result.
Retention of female students also depends significantly upon several factors. Here, the Bayesian logistic regression analyses showed that preparatory average result, choice of field, first year CGPA, pregnancy, satisfaction with instructors, habit of chewing Khat, parent income and organizing study and leisure time mainly affected female students' retention. Entrance exam score, feeling safe to study at night in classrooms and habit of smoking cigarettes and using drugs were also significant. The retention rate for female students who were assigned to their 1st or 2nd choice was higher than that for those assigned to their 3rd or a later choice. Female students whose first year CGPA was below 2.0 had a lower retention rate than those whose first year CGPA was above 2.0. The retention rate for female students with a higher preparatory average result was higher than that for those with a lower preparatory average result. Female students whose parents' income was less than 500 were less likely to be retained than those whose parents' income was greater than 1500. The retention rate for female students who were satisfied with their instructors was higher than that for those who were not satisfied. Female students who could organize their study and leisure time easily were more likely to be retained than those who could not. Female students who had a habit of chewing Khat, and those who had a habit of smoking cigarettes and using drugs, were less likely to be retained than those who had no such habits. The retention rate for female students who had ever been pregnant was lower than that for those who had not. The retention rate for female students who felt safe to study at night in classrooms was lower than that for those who did not. We recommend designing an appropriate method of enrolment and allocation of students to the different faculties and departments, as this may help female students' graduation and retention.
Campus and department officers, in collaboration with the students themselves and the academic staff, are expected to work hard to bring about change in the behavioural, academic and social aspects of female students' lives. Special attention must be given to advising female students in order to help integrate them both socially and academically into the university environment. We also recommend that teaching methods at secondary and preparatory schools be designed to challenge students and motivate them to prepare adequately for Higher Education Institutions.