Predictors of Human Death by Road Traffic Crashes in Bahir Dar City, North Western Ethiopia; A Count Data Analysis Regression Model

Road traffic crashes are a major socio-economic and public health problem worldwide, and Ethiopia is a country with a very large number of traffic crashes and a high fatality rate. The main objective of this study was to assess the predictors of road traffic accidents in Bahir Dar city, Ethiopia, and to identify the factors that contribute to road traffic crashes leading to human death. Data on the number of deaths per road traffic crash were obtained from the Bahir Dar city administration traffic police office for the two-year period from July 2015 to June 2017. Six count models were applied: Poisson, negative binomial, generalized Poisson, zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models. Based on several model comparison criteria (AIC, log-likelihood and the Vuong test), the ZIGP regression model provided the most appropriate fit to the number of human deaths per road traffic crash. Sex, age, driving under the influence of alcohol, fatigue, failure to give priority, day of the week, road condition, overloading, over-speeding and type of accident were found to be statistically significant predictors of human death due to road traffic crashes.


Background of the Study
An accident or crash is defined as anything that happens by chance, occurring unexpectedly and undesigned [1]. A road traffic crash is a collision or similar incident involving a moving vehicle, resulting in property damage, personal death or injury [2]. It is an unexpected phenomenon that occurs as a result of the use or operation of vehicles, including bicycles and handcarts, on public highways and roads. A road traffic crash is defined as any vehicle accident occurring on a public highway. It includes collisions of vehicles with other vehicles, animals, pedestrians or fixed obstacles. Single-vehicle accidents, which involve one vehicle and no other road user, are also included [3]. Accidents may be fatal, resulting in the death of road users (passengers, drivers or pedestrians), or minor, when they are not severe enough to cause substantial hardship [4]. Globally, over 1.2 million people are killed and between 20 and 50 million are injured in crashes every year. The global economic losses due to road traffic crashes exceed US$ 500 billion [5]. In Africa over 80% of goods and people are transported by road [6], and in Ethiopia road transport accounts for over 90% of all inter-urban freight and passenger movements in the country [7].
Road traffic injuries pose a significant burden in Ethiopia, as is the case for other developing countries. Currently, developing countries contribute over 90% of the world's road traffic fatalities [5]. Moreover, road traffic crashes and poverty are linked, because family breadwinners are highly represented among road traffic crash victims. Road traffic crashes are a major public health concern. Ethiopia is one of those developing countries with a low level of income accompanied by a high rate of population growth. As part of the developing world, Ethiopia is predominantly an agrarian country with a low level of urbanization. Transport is an important sector in facilitating different economic activities in the national economy. Ethiopia experiences the highest rate of deaths from such accidents in Sub-Saharan Africa. Road traffic injuries are growing as vehicle use in developing countries rises [8-10].
By 2020, road traffic accidents are expected to be the third leading cause of death and disability worldwide, by some calculations matching the toll of AIDS. Residents of developing countries are at much higher risk of road traffic injuries than residents of high-income countries. They are also at greater risk of death, injury and property damage when a crash occurs. Developing countries also have inadequate trauma systems and are often unable to care for crash victims. It has been indicated that unless action is taken to improve road safety systems, poor countries will continue to bear the heavy toll of road traffic fatalities [11]. According to WHO data published in April 2011, road traffic accident deaths in Ethiopia reached 22,786 per year (2.77% of total deaths). Road accidents appear to occur regularly at certain flash points, such as sharp bends, potholes and bad sections of the highways. At such points over-speeding drivers usually find it difficult to control their vehicles, which results in fatal traffic accidents, especially at night [12]. Accident rates in developing countries are often 10-70 times higher than in developed countries. Whereas the traffic crash situation is slowly improving in industrialized societies (e.g. Australia, USA, UK), most developing countries face a worsening situation. To develop measures aimed at reducing the rate of road traffic crashes and the consequent deaths, injuries and property damage, road traffic accidents need to be evaluated regularly.

Statement of the Problem
Road traffic crashes are a major obstacle to public safety and development. According to WHO [13], the current situation requires a high level of political commitment and immediate action to reduce road traffic crashes. Road safety tends not to receive due consideration because not all road accidents and casualties are reported to the police, and there is usually no other system for estimating road accidents and the corresponding casualties nationwide. Road accidents are too often accepted as inevitable negative side effects of motorization [14].
Although Ethiopia is one of the developing countries with a very low road network density and vehicle ownership level, it currently has a relatively high accident record [15]. The road traffic accident problem in Ethiopia, especially in the metropolitan cities, is increasing at an alarming rate. Bahir Dar is one of the metropolitan cities of Ethiopia and has a high road traffic accident record with many casualties [16]. In connection with the above facts, traffic volume is already large and keeps increasing; as a result of different factors, road traffic accidents have increased over the years and are becoming a common day-to-day phenomenon, resulting in loss of life, human suffering, and destruction of property and the environment. The number of victims treated in hospitals, health centers and clinics also shows an upward trend. The Bahir Dar special zone health department reported that in the previous years (2000-2002) only 3,188 road casualties received medical treatment as in-patients and out-patients [17]. Some researchers have investigated the suitability of binary logistic regression, Poisson regression and the negative binomial model for predicting accident frequencies at intersections or roadways [10,18-22]. These studies foreground the fact that, because accident occurrences are necessarily discrete, often discontinuous and more likely random events, Poisson regression (when the mean and variance are equal) and negative binomial models (under over-dispersion) are preferable to multiple linear regression models. However, there are many other count models that can be used to handle over-dispersion. The purpose of this study was therefore to determine the significant factors by applying GLMs to road traffic crash data.

General objective
The general objective of this study was to identify the predictors of human death related to road traffic crashes by using count data regression model analysis.
Specific objectives
a. To identify the factors that significantly affect the number of human deaths per road traffic crash.
b. To compare the candidate count models and select the most robust model for the count response, human deaths per traffic crash.

Significance of the Study
The findings of this study can be used to raise awareness among the concerned bodies of the causes of road traffic accidents so that appropriate measures can be taken; to show readers the severity of road traffic accidents so that they can protect their lives and livelihoods; to serve as information for researchers interested in conducting further studies in the area; and to help policy makers design appropriate strategies to reduce road traffic crashes that result in human death. In general, the results and recommendations of this study are intended to be of use to all members of the community of the city.

Description of Study Area
Bahir Dar is a special zone and the capital city of the Amhara National Regional State (ANRS). It is one of the leading tourist destinations in Ethiopia, with a variety of attractions on the nearby Lake Tana and Blue Nile River.

Sources of Data and Study Design
Secondary data were used in a cross-sectional study of road traffic crashes. The data were recorded on a daily basis from July 2015 to June 2017 by the traffic police at the Bahir Dar city administration traffic police office, and provide information on the road traffic crashes that occurred on consecutive days over these two years. The response variable is the number of human deaths per road traffic crash. The explanatory (predictor) variables are sex of driver, age of driver, driver-vehicle relationship, education level of driver, driving under the influence of alcohol, ownership of vehicle, driving under fatigue, failure to give priority, day of the week, accident time, type of road, road geometry, road condition, type of vehicle, overloading, over-speeding and type of accident.

Generalized Linear Models (GLMs)
GLMs represent a class of regression models that generalize the linear regression approach to accommodate many types of response variables, including count, binary, proportion and positive-valued continuous responses [24,25]. Because of their flexibility in addressing a variety of statistical problems and the availability of software to fit the models, GLMs are considered a valuable statistical tool and are widely used. In fact, the generalized linear model has been referred to as the most significant advance in regression analysis in the past twenty years [25]. GLMs extend ordinary regression models to encompass non-normal response distributions and modeling functions of the mean. Three components specify a generalized linear model: a random component identifies the response variable Y and its probability distribution; a systematic component specifies the explanatory variables used in a linear predictor function; and a link function specifies the function of E(Y) that the model equates to the systematic component. [24] introduced the class of GLMs, although many models in the class were well established by then.
A generalized linear model (GLM) consists of three components. The random component specifies the conditional distribution of the response variable $Y_i$ (for the $i$th of $n$ independently sampled observations), given the values of the explanatory variables in the model. In the initial formulation of GLMs, the distribution of $Y_i$ was a member of an exponential family, with probability density or mass function of the form

$$f(y_i; \theta_i) = a(\theta_i)\, b(y_i)\, \exp\left[\, y_i\, Q(\theta_i) \,\right].$$

Several important distributions are special cases, including the Poisson and binomial. The value of the parameter $\theta_i$ may vary for $i = 1, \dots, n$, depending on the values of the explanatory variables. The term $Q(\theta)$ is called the natural parameter. This formulation is sufficient for basic discrete data models.

The systematic component of a GLM relates a vector $(\eta_1, \dots, \eta_n)$ to the explanatory variables through a linear model. Let $x_{ij}$ denote the value of predictor $j$ ($j = 1, 2, \dots, p$) for subject $i$. Then

$$\eta_i = \sum_{j} \beta_j x_{ij}, \qquad i = 1, \dots, n.$$

This linear combination of explanatory variables is called the linear predictor. Usually one $x_{ij} = 1$ for all $i$, for the coefficient of an intercept (often denoted by $\alpha$) in the model.

The third component of a GLM is a link function that connects the random and systematic components. Let $\mu_i = E(Y_i)$. The model links $\mu_i$ to $\eta_i$ by $\eta_i = g(\mu_i)$, where the link function $g$ is a monotonic, differentiable function. Thus, $g$ links $E(Y_i)$ to the explanatory variables through the formula

$$g(\mu_i) = \sum_{j} \beta_j x_{ij}, \qquad i = 1, \dots, n.$$

The link function $g(\mu) = \mu$, called the identity link, has $\eta_i = \mu_i$. It specifies a linear model for the mean itself. This is the link function for ordinary regression with normally distributed Y. The link function that transforms the mean to the natural parameter is called the canonical link. In summary, a GLM is a linear model for a transformed mean of a response variable that has a distribution in the natural exponential family. We now illustrate the three components by introducing the key GLMs for discrete response variables. The following subsections show example models for count data. Count data are non-negative integers; they represent the number of occurrences of an event within a fixed period, e.g. the number of deaths per road traffic crash.
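As a worked check of the exponential-family form, the Poisson mass function with mean $\mu$ can be factored into the three pieces $a(\cdot)$, $b(\cdot)$ and $Q(\cdot)$. This is a sketch that assumes the standard form $f(y;\theta) = a(\theta)\,b(y)\,\exp[y\,Q(\theta)]$ and takes $\theta = \mu$:

```latex
f(y;\mu) = \frac{e^{-\mu}\mu^{y}}{y!}
         = \underbrace{e^{-\mu}}_{a(\mu)}\;
           \underbrace{\frac{1}{y!}}_{b(y)}\;
           \exp\!\left[\, y \log \mu \,\right],
\qquad Q(\mu) = \log \mu .
```

The natural parameter is therefore $\log\mu$, which is why the log link is the canonical link for Poisson-type count models.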

(i) Poisson Regression Model
The standard Poisson distribution is fundamental to understanding regression models for counts. According to [26], the apparent simplicity of the Poisson model comes with two restrictive assumptions. First, the variance and mean of the count variable are assumed to be equal. Second, occurrences of the event are assumed to be independent of each other. A regression model based on this distribution follows by conditioning the distribution of $y_i$ on a k-dimensional vector of covariates, $x_i = [x_{i1}, \dots, x_{ik}]$, and parameters $\beta$, through a continuous function $E(y_i \mid x_i) = \lambda_i$ [27]. The Poisson mass function is given by

$$f(y_i \mid x_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \dots$$

In the log-linear version of the model the mean parameter is parameterized as $\lambda_i = \exp(x_i'\beta)$ and $\log \lambda_i = x_i'\beta$. Given independent observations with this mass function, the log-likelihood function is

$$\ell(\beta) = \sum_{i=1}^{n} \left[\, y_i\, x_i'\beta - \exp(x_i'\beta) - \ln(y_i!) \,\right].$$

Over-dispersion Models: When the Poisson model fits the data reasonably, the residual deviance is expected to be roughly equal to the residual degrees of freedom. A much larger residual deviance implies that the conditional variance exceeds the mean and that the Poisson model does not fit [27]. This common occurrence in the analysis of discrete outcome data is referred to as over-dispersion ($Var(y_i) > E(y_i)$). If over-dispersion causes the variance to be larger than the mean, estimation using Poisson regression will be inefficient.
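The log-linear Poisson log-likelihood described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the toy inputs are assumptions made for the example and are not the study data or the authors' code.

```python
import math

def poisson_log_pmf(y, lam):
    """log f(y | lam) = -lam + y*log(lam) - log(y!)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def poisson_loglik(beta, X, y):
    """Log-likelihood of a log-linear Poisson model.

    beta: list of coefficients
    X:    list of covariate rows (first entry 1 for the intercept)
    y:    list of observed counts
    """
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))  # linear predictor x_i' beta
        lam = math.exp(eta)                          # log link: lambda_i = exp(eta)
        total += poisson_log_pmf(yi, lam)
    return total

# Toy, made-up inputs purely to exercise the function (not the study data).
X = [[1, 0], [1, 1], [1, 0], [1, 1]]
y = [0, 2, 1, 3]
print(poisson_loglik([0.0, 0.5], X, y))
```

Maximizing this function over `beta` (for example by Newton-Raphson) would give the maximum likelihood estimates; the sketch only evaluates the objective.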
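(ii) Negative Binomial Regression Model
The negative binomial regression model, a conjugate mixture distribution for count data, is applicable for modeling over-dispersion. When the Poisson model assumption fails, the negative binomial regression model may fit better and address the over-dispersion problem. In the form consistent with the mean and variance stated below, the probability mass function of the negative binomial distribution is

$$f(y_i \mid x_i) = \frac{\Gamma(y_i + 1/\delta)}{\Gamma(1/\delta)\, y_i!} \left( \frac{1}{1 + \delta\lambda_i} \right)^{1/\delta} \left( \frac{\delta\lambda_i}{1 + \delta\lambda_i} \right)^{y_i}, \qquad y_i = 0, 1, 2, \dots$$

The regression model is given by $\lambda_i = \exp(x_i'\beta)$ or $\log \lambda_i = x_i'\beta$, with mean $E(y_i \mid x_i) = \lambda_i = \exp(x_i'\beta)$ and variance $var(y_i \mid x_i) = \lambda_i(1 + \delta\lambda_i)$, where $\Gamma(\cdot)$ is the gamma function and $\delta$ (read as delta) is called the dispersion parameter. As $\delta$ approaches zero, the variance and mean become identical and the negative binomial model reduces to the classical Poisson model. If $\delta > 0$, the variance exceeds the mean, that is $var(y_i \mid x_i) > E(y_i \mid x_i)$, and the distribution allows for over-dispersion [28]. The negative binomial log-likelihood function is the sum of the log mass function over the observations:

$$\ell(\beta, \delta) = \sum_{i=1}^{n} \left[ \ln\Gamma(y_i + 1/\delta) - \ln\Gamma(1/\delta) - \ln(y_i!) - \frac{1}{\delta}\ln(1 + \delta\lambda_i) + y_i \ln\!\left( \frac{\delta\lambda_i}{1 + \delta\lambda_i} \right) \right].$$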
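The stated mean and variance of the negative binomial model, and its Poisson limit as the dispersion parameter goes to zero, can be checked numerically. The sketch below assumes the common NB2 parameterization (mean `lam`, variance `lam * (1 + delta * lam)`); the parameter values are illustrative and are not estimates from this study.

```python
import math

def nb_log_pmf(y, lam, delta):
    """Log pmf of the NB2 form with mean lam and variance lam*(1 + delta*lam)."""
    r = 1.0 / delta  # "size" parameter of the gamma mixing distribution
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + lam))      # (1 / (1 + delta*lam))^(1/delta)
            + y * math.log(lam / (r + lam)))   # (delta*lam / (1 + delta*lam))^y

lam, delta = 1.5, 0.8  # illustrative values only
probs = [math.exp(nb_log_pmf(k, lam, delta)) for k in range(2000)]
mass = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))

# Numerical moments next to the closed-form variance lam*(1 + delta*lam).
print(mass, mean, var, lam * (1 + delta * lam))
```

With a very small `delta`, `nb_log_pmf` approaches the Poisson log mass function, which mirrors the reduction to the Poisson model described in the text.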

(iii) The Generalized Poisson Regression Model
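The generalized Poisson regression model is another way of modeling over-dispersion and is a good competitor to the negative binomial model. Its advantage is that it can be fitted under over-dispersion, $Var(y_i) > E(y_i)$ [29]. Suppose $y_i$ is a count response variable that follows a generalized Poisson distribution. The probability mass function of $y_i$, $i = 1, 2, \dots, n$, is given as [29,30]

$$f(y_i; \mu_i, \delta) = \left( \frac{\mu_i}{1 + \delta\mu_i} \right)^{y_i} \frac{(1 + \delta y_i)^{y_i - 1}}{y_i!} \exp\!\left[ -\frac{\mu_i (1 + \delta y_i)}{1 + \delta\mu_i} \right], \qquad y_i = 0, 1, 2, \dots,$$

where $\mu_i = \exp(x_i'\beta)$ is the mean, $\delta$ is the dispersion parameter and the variance is $\mu_i(1 + \delta\mu_i)^2$.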
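A quick numerical check of the generalized Poisson mass function is sketched below. It assumes the mean-parameterized form with mean `mu`, dispersion `delta >= 0` and variance `mu * (1 + delta * mu) ** 2`; the parameter values are illustrative only.

```python
import math

def gp_log_pmf(y, mu, delta):
    """Log pmf of the generalized Poisson (mean mu, dispersion delta >= 0)."""
    scale = 1.0 + delta * mu
    return (y * math.log(mu / scale)
            + (y - 1) * math.log(1.0 + delta * y)
            - math.lgamma(y + 1)
            - mu * (1.0 + delta * y) / scale)

mu, delta = 1.5, 0.3  # illustrative values only
probs = [math.exp(gp_log_pmf(k, mu, delta)) for k in range(400)]
mass = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))

# Numerical moments next to the closed-form variance mu*(1 + delta*mu)^2.
print(mass, mean, var, mu * (1 + delta * mu) ** 2)
```

Setting `delta = 0` collapses the expression to the ordinary Poisson log mass function, so the Poisson model is nested within this family.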

Zero Inflated Regression Models
In some cases count data contain excess zeros, which are themselves a source of over-dispersion. In such cases the NB and GP regression models cannot adequately handle the over-dispersion caused by the large number of zeros, and zero-inflated (ZI) models can be used instead.
(i) Zero Inflated Poisson Regression Model
Zero-inflated (ZI) models can be used to account for excess zeros. ZIP models are less adequate than ZINB and ZIGP models when over-dispersion is due to both excess zeros and unobserved heterogeneity. The probability mass function of the ZIP model is given by

$$P(Y_i = y_i \mid \psi_i, \lambda_i) = \begin{cases} \psi_i + (1 - \psi_i)\, e^{-\lambda_i}, & y_i = 0, \\[4pt] (1 - \psi_i)\, \dfrac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}, & y_i = 1, 2, \dots, \end{cases}$$

where $\lambda_i$ is the mean of the Poisson count component, expressed in terms of the explanatory covariates through a natural logarithmic link, $\log \lambda_i = x_i'\beta$. Here $x_i = (1, x_{i1}, x_{i2}, \dots, x_{i,p-1})'$ is a $p \times 1$ vector of explanatory variables for the $i$th observation, $\beta$ is a $p \times 1$ vector of regression coefficients, and $\psi_i$ is the probability of an excess zero, which can be estimated by logistic regression [31,32]. That is,

$$\text{logit}(\psi_i) = \ln\!\left( \frac{\psi_i}{1 - \psi_i} \right) = z_i'\gamma, \qquad \psi_i = \frac{\exp(z_i'\gamma)}{1 + \exp(z_i'\gamma)}, \qquad i = 1, 2, \dots, n,$$

where $z_i = (1, z_{i1}, z_{i2}, \dots, z_{i,q-1})'$ is a $q \times 1$ vector of explanatory variables for the zero-inflation part of the model for the $i$th observation and $\gamma = (\gamma_0, \gamma_1, \gamma_2, \dots, \gamma_{q-1})'$ is a $q \times 1$ vector of regression coefficients. Unlike the Poisson distribution, which is determined by a single parameter, the ZIP distribution is determined by two parameters, $\lambda_i$ and $\psi_i$. The ZIP model has mean $\mu_i = (1 - \psi_i)\lambda_i$ and variance $\sigma_i^2 = (1 - \psi_i)(\lambda_i + \psi_i \lambda_i^2)$ [33].
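The two-part structure of the ZIP model (a logit for the excess-zero probability and a Poisson count part) can be sketched as follows. The function names and the values of the linear predictors are hypothetical and chosen only to illustrate the mean and variance formulas.

```python
import math

def inv_logit(eta):
    """psi = exp(eta) / (1 + exp(eta)), the inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def zip_pmf(y, lam, psi):
    """Zero-inflated Poisson pmf with Poisson mean lam and excess-zero probability psi."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        return psi + (1.0 - psi) * poisson
    return (1.0 - psi) * poisson

# Hypothetical linear predictors: z'gamma = 0 gives psi = 0.5; x'beta = log(2) gives lam = 2.
psi = inv_logit(0.0)
lam = math.exp(math.log(2.0))

probs = [zip_pmf(k, lam, psi) for k in range(200)]
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))

# Numerical moments next to (1 - psi)*lam and (1 - psi)*(lam + psi*lam^2).
print(mean, (1 - psi) * lam)
print(var, (1 - psi) * (lam + psi * lam ** 2))
```

Because the zero class is modeled separately, the variance exceeds the mean even though the count part is an ordinary Poisson.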
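(ii) Zero-Inflated Negative Binomial Regression Model
If the dependent variable contains a high proportion of zeros, which can create problems for negative binomial estimation, a modified count model that takes the excess zeros into account is the zero-inflated negative binomial (ZINB) model. The ZINB regression model is an extension of the NB regression model. In the form consistent with the mean and variance stated below, the probability mass function of a random variable $Y_i$ distributed as ZINB is

$$P(Y_i = y_i) = \begin{cases} \psi_i + (1 - \psi_i)(1 + \delta\lambda_i)^{-1/\delta}, & y_i = 0, \\[4pt] (1 - \psi_i)\, \dfrac{\Gamma(y_i + 1/\delta)}{\Gamma(1/\delta)\, y_i!} \left( \dfrac{1}{1 + \delta\lambda_i} \right)^{1/\delta} \left( \dfrac{\delta\lambda_i}{1 + \delta\lambda_i} \right)^{y_i}, & y_i = 1, 2, \dots \end{cases}$$

The ZINB model has mean $\mu_i = (1 - \psi_i)\lambda_i$ and variance $\sigma_i^2 = (1 - \psi_i)\lambda_i\left[1 + \lambda_i(\psi_i + \delta)\right]$ [34]. The parameter estimates of the ZINB regression model, $\hat\beta$, $\hat\gamma$ and $\hat\delta$, can be obtained using the Newton-Raphson method [27].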
(iii) Zero Inflated Generalized Poisson Regression Model
Besides ZINB, zero-inflated generalized Poisson (ZIGP) regression has been proposed as an alternative for handling zero-inflation and additional over-dispersion in count data with excess zeros. [35] used the ZIGP distribution to model domestic violence data. In the form consistent with the mean and variance stated below, a ZIGP regression model is defined as

$$P(Y_i = y_i \mid x_i, z_i) = \begin{cases} \psi_i + (1 - \psi_i)\, f(0; \mu_i, \delta), & y_i = 0, \\[4pt] (1 - \psi_i)\, f(y_i; \mu_i, \delta), & y_i = 1, 2, \dots, \end{cases}$$

where $f(y_i; \mu_i, \delta)$ is the generalized Poisson mass function with mean $\mu_i$ and dispersion parameter $\delta$, and $\psi_i$ is the probability of an excess zero.

The ZIGP model is a special case of a two-class finite mixture model, with mean $E(Y_i) = (1 - \psi_i)\mu_i$ and variance $\sigma_i^2 = (1 - \psi_i)\left[\mu_i^2 + \mu_i(1 + \delta\mu_i)^2\right] - (1 - \psi_i)^2 \mu_i^2$.
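The ZIGP mean and variance expressions can be verified numerically by mixing a point mass at zero with a generalized Poisson distribution. The sketch assumes the mean-parameterized generalized Poisson form (mean `mu`, variance `mu * (1 + delta * mu) ** 2`); all names and parameter values are illustrative.

```python
import math

def gp_pmf(y, mu, delta):
    """Generalized Poisson pmf with mean mu and dispersion delta >= 0."""
    scale = 1.0 + delta * mu
    log_p = (y * math.log(mu / scale)
             + (y - 1) * math.log(1.0 + delta * y)
             - math.lgamma(y + 1)
             - mu * (1.0 + delta * y) / scale)
    return math.exp(log_p)

def zigp_pmf(y, mu, delta, psi):
    """Zero-inflated generalized Poisson pmf; psi is the excess-zero probability."""
    base = (1.0 - psi) * gp_pmf(y, mu, delta)
    return psi + base if y == 0 else base

def zigp_mean_var(mu, delta, psi):
    """Closed-form ZIGP mean and variance as given in the text."""
    mean = (1.0 - psi) * mu
    var = ((1.0 - psi) * (mu ** 2 + mu * (1.0 + delta * mu) ** 2)
           - (1.0 - psi) ** 2 * mu ** 2)
    return mean, var

mu, delta, psi = 1.5, 0.3, 0.4  # illustrative values only
probs = [zigp_pmf(k, mu, delta, psi) for k in range(400)]
num_mean = sum(k * p for k, p in enumerate(probs))
num_var = sum((k - num_mean) ** 2 * p for k, p in enumerate(probs))
print((num_mean, num_var), zigp_mean_var(mu, delta, psi))
```

Setting `delta = 0` recovers the ZIP model and setting `psi = 0` recovers the generalized Poisson model, which is why the ZIGP can absorb both sources of over-dispersion.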

Goodness of Fit Tests
Over-dispersion Test: The Poisson model is a special case of the negative binomial and generalized Poisson models. To assess the adequacy of the negative binomial and generalized Poisson models over the Poisson regression model, we can test the hypothesis $H_0: \delta = 0$ vs $H_A: \delta > 0$. This is a test of the significance of the over-dispersion parameter $\delta$. The presence of the over-dispersion parameter $\delta$ in the NB and GP regression models is justified when the null hypothesis $H_0: \delta = 0$ is rejected. A likelihood-ratio (LR) test is used for the over-dispersion parameter $\delta$ in the negative binomial (NB) and generalized Poisson (GP) specifications against the Poisson specification [27]. The likelihood ratio test (LRT) statistic is given by

$$LRT_{\delta} = -2\left[\, \ell(\hat\beta) - \ell(\hat\beta, \hat\delta) \,\right],$$

where $\ell(\hat\beta)$ and $\ell(\hat\beta, \hat\delta)$ are the maximized log-likelihoods under the Poisson model and the NB or GP model, respectively.
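The over-dispersion likelihood ratio test can be sketched with the standard library alone, using the identity that a chi-square variable with 1 degree of freedom is a squared standard normal. The log-likelihood values below are hypothetical placeholders, not results from this study.

```python
import math

def chi2_sf_1df(x):
    """P(chi-square with 1 df > x), computed from the standard normal tail."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))

def lrt_overdispersion(loglik_poisson, loglik_alt):
    """LRT = -2 * [l(Poisson) - l(NB or GP)] with its chi-square(1) p-value."""
    stat = -2.0 * (loglik_poisson - loglik_alt)
    return stat, chi2_sf_1df(stat)

# Hypothetical maximized log-likelihoods, for illustration only.
stat, p_value = lrt_overdispersion(loglik_poisson=-520.0, loglik_alt=-480.0)
print(stat, p_value)
```

Because $\delta = 0$ lies on the boundary of the parameter space, some authors halve this p-value; the sketch reports the unadjusted chi-square(1) tail probability.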
Likelihood Ratio Test (LRT): The LRT is a test of the overall model, testing a null hypothesis $H_0$ against an alternative $H_A$ based on the ratio of two likelihood functions. The overall test statistic is $LRT = G^2 = -2(L_R - L_F) \sim \chi^2_{p-1}$, where $L_R$ is the log-likelihood of the null (reduced) model, $L_F$ is the log-likelihood of the model comprising k predictors, p is the number of parameters and $\chi^2_{p-1}$ is a chi-square distribution with $p - 1$ degrees of freedom. If the test statistic exceeds the critical value, the null hypothesis is rejected, which means the overall model is significant. The LRT statistic for $\delta$ is given by $LRT = -2(L_1 - L_2)$, where $L_1$ and $L_2$ are the log-likelihoods of model 1 and model 2. This statistic has a chi-square distribution. If the statistic is greater than the critical value, model 2 is better than model 1.
Information Criteria: The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) are goodness-of-fit criteria used for model selection. AIC and BIC are the most common means of identifying the model that fits well by comparing two or more candidate models; the model with the largest log-likelihood value can be chosen as the best model for describing the data under consideration. The formulas are

$$AIC = -2L + 2k, \qquad BIC = -2L + k \ln(n),$$

where L is the log-likelihood of the model being compared with the other models, n is the sample size and k is the number of parameters in the model, including the intercept. The comparison starts from the model without any independent variable and proceeds by adding the independent variables one by one up to the full model. The model with the minimum AIC and BIC values and the largest log-likelihood value is the most appropriate fitted model for the dataset.
Vuong's test: The Vuong test is a non-nested test based on a comparison of the predicted probabilities of two models that do not nest [38]. Vuong test statistics are needed to assess the appropriateness of zero-inflated models against the standard count models. Under the null hypothesis that the models are indistinguishable, the test statistic is asymptotically standard normal. Let $P_1(y_i \mid x_i)$ and $P_2(y_i \mid x_i)$ be the predicted probabilities of the zero-inflated model (model 1) and the ordinary model (model 2), respectively. The hypotheses of the Vuong test are $H_0$: the two models are equivalent versus $H_A$: the two models are not equivalent. Vuong showed that asymptotically V has a standard normal distribution. As Vuong notes, the test is directional [38]. If $V > Z_{\alpha/2}$, the first model is preferred; if $V < -Z_{\alpha/2}$, the second model is preferred; and if $|V| < Z_{\alpha/2}$, neither model is preferred (the two models are equivalent).
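The information criteria and the directional Vuong rule can be sketched as below. The Vuong statistic is taken in its usual form, $V = \sqrt{n}\,\bar{m}/s_m$ with $m_i = \ln[P_1(y_i \mid x_i)/P_2(y_i \mid x_i)]$; the sample standard deviation is used for $s_m$, which is one common convention. The function names are illustrative, and the three statistics passed to the decision rule are the Vuong values reported in this study.

```python
import math

def aic(loglik, k):
    """AIC = -2L + 2k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """BIC = -2L + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

def vuong(p1, p2):
    """Vuong statistic from per-observation predicted probabilities of two models."""
    m = [math.log(a / b) for a, b in zip(p1, p2)]
    n = len(m)
    m_bar = sum(m) / n
    s_m = math.sqrt(sum((mi - m_bar) ** 2 for mi in m) / (n - 1))
    return math.sqrt(n) * m_bar / s_m

def vuong_decision(v, z_crit=1.96):
    """Directional rule at the two-sided 5% level."""
    if v > z_crit:
        return "first model preferred"
    if v < -z_crit:
        return "second model preferred"
    return "models equivalent"

# Vuong statistics reported in the study: ZIP vs Poisson, ZINB vs NB, ZIGP vs GP.
for v in (2.67, 6.168, 4.15):
    print(v, vuong_decision(v))
```

In each of the three reported comparisons the statistic exceeds 1.96, so the rule favors the zero-inflated model listed first.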
The Vuong test statistic can be expressed as [38]

$$V = \frac{\sqrt{n}\, \bar{m}}{s_m}, \qquad m_i = \ln\!\left[ \frac{P_1(y_i \mid x_i)}{P_2(y_i \mid x_i)} \right],$$

where $P_1$ and $P_2$ are the predicted probabilities under the two models and $\bar{m}$ and $s_m$ are the mean and standard deviation of the $m_i$.

Test for individual predictors: Let $\beta$ denote an arbitrary parameter. Consider a significance test of $H_0: \beta = \beta_0$, here with $\beta_0 = 0$. The simplest test statistic uses the large-sample normality of the ML estimator $\hat\beta$. Let $SE(\hat\beta)$ denote the standard error of $\hat\beta$, evaluated by substituting the ML estimate for the unknown parameter in the expression for the true standard error. When $H_0$ is true, the test statistic

$$Z = \frac{\hat\beta - \beta_0}{SE(\hat\beta)}$$

has approximately a standard normal distribution. The significance test for each coefficient in the model is done using the Wald chi-square statistic, $Z^2 = \left[ \hat\beta / SE(\hat\beta) \right]^2$. Under $H_0$, $Z^2$ has a chi-square distribution with 1 degree of freedom. Likelihood-ratio tests are generally considered to be superior [39].

In this study the analysis aims to identify the important factors, and it is always a good idea to start with descriptive statistics. Figure 1 showed a large proportion of zero values (54%), with the distribution of human deaths per road traffic crash highly peaked at zero and positively (right) skewed. In this case zero-inflated count data models may fit the data better. The results also indicated that the maximum number of deaths recorded per accident was 13. As shown in Table 1, the variance of human deaths per road traffic accident (3.45) was greater than its mean (1.13). This indicated the possibility of over-dispersion, and hence the standard Poisson regression model was not appropriate for the road traffic accident data. Thus one might expect that ZIP, ZINB and ZIGP would be better models for the traffic crash dataset.
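The over-dispersion signal in Table 1 can be made explicit with a simple variance-to-mean ratio, using the reported variance (3.45) and mean (1.13). This is a rough descriptive diagnostic, not a formal test:

```latex
\frac{s^{2}}{\bar{y}} = \frac{3.45}{1.13} \approx 3.05 > 1
```

A ratio near 1 would be in line with the Poisson equal-dispersion assumption; here the variance is about three times the mean.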

Statistical Model Results
Goodness of fit and Comparison of Models: To select the model that best fits the data from among the six candidates, several model selection criteria were considered. In the tables, "*" denotes the significance codes.
Table 2 shows the model selection criteria for the candidate models. First, for the ZIP versus Poisson comparison, the calculated value of the Vuong test (2.67) was greater than the critical value (1.96), indicating that the ZIP model was preferred to the Poisson model. Second, for the ZINB versus NB comparison, the calculated value of the Vuong test was 6.168, indicating that the ZINB model was preferred to the NB regression model. Third, for the ZIGP versus GP comparison, the value of the Vuong statistic was 4.15, indicating that the ZIGP model was preferred to the GP regression model. Finally, the ZIP, ZINB and ZIGP models were compared using the log-likelihood, AIC and BIC. The ZIGP model had the largest log-likelihood value and the smallest AIC and BIC values among the count models. Therefore, it is possible to conclude that the ZIGP model was more appropriate than the ZIP and ZINB models for the road traffic crash dataset.

Interpretations of Count model coefficients of ZIGP regression model
As shown in Table 3, sex of driver had a significant impact on the number of human deaths per accident. The expected number of deaths per road traffic accident was 58.0% lower for female drivers than for male drivers, holding all other variables in the model constant. This result is consistent with the study [40].
Driver's age had a significant impact on the number of deaths per accident. The expected number of deaths per traffic accident was 70.9% lower for drivers aged above 50 years than for drivers aged 18-30 years, holding all other variables in the model constant. This result is similar to those of other studies [8,19,21,41-45]. The model also shows that driving under the influence of alcohol had a significant impact on the number of deaths per accident. The expected number of deaths per accident was 92.9% lower for drivers who had not been drinking alcohol than for drivers who had, holding all other variables in the model constant. This finding is in accordance with other studies [22,43,45]. Driving under fatigue also had a significant impact on the number of human deaths per accident. The expected number of deaths per accident was 85.4% lower for drivers driving without fatigue than for drivers driving with fatigue, holding all other variables in the model constant. Among the studies compared, fatigue was considered only in this study. The model also revealed that failure to give priority had a statistically significant impact on the number of deaths per traffic accident. The expected number of deaths per accident was 95.4% and 57.0% higher for drivers who did not give priority to pedestrians and to others, respectively, than for drivers who did not give priority to vehicles, controlling for the other variables in the model. This result is consistent with [45].
The findings of this study also revealed that the day of the week had a significant impact on the number of deaths per accident. The expected number of deaths per traffic accident was 64.5% and 99.6% higher on Monday and Saturday, respectively, than on Sunday, holding all other variables in the model constant; the remaining days had the same effect as Sunday.
The findings show that road condition had a statistically significant impact on the number of deaths per traffic accident. The expected number of deaths per road traffic accident was 86.6% higher for dry road conditions than for wet road conditions, controlling for the other variables in the model. This result contradicts [21,45].
Overloading had a significant impact on the number of deaths per accident. The expected number of deaths per accident was 69.5% lower for drivers driving without overloading than for drivers driving with overloading, holding all other variables in the model constant.
The model revealed that over-speeding had a significant impact on the number of deaths per traffic accident. The expected number of deaths per accident was 85.5% lower for drivers driving without over-speeding than for drivers driving with over-speeding, holding all other variables in the model constant. This result is consistent with [22,43]. Finally, the type of accident had a significant effect on the number of deaths per accident. The expected number of deaths per road traffic accident was 48.7%, 68.4% and 80.4% higher for vehicle-to-pedestrian, vehicle-to-others and vehicle-reversing accidents, respectively, than for vehicle-to-vehicle accidents, holding all other variables in the model constant. This result is similar to the studies [21,44].
Interpretations of zero-inflation part of the model
Zero-inflated models are interpreted as a mixture of structural and sampling zeros from two processes: the process that generates excess zeros from a binary distribution, which are the structural zeros, and the process that generates both positive and zero counts from the GP distribution, whose zeros are the sampling zeros. Table 3 above gives the parameter estimates of the zero-inflation (logit) part of the ZIGP regression model, which examines the impact of the explanatory variables on the odds of being in the always-zero group. Table 3 showed that driving under the influence of alcohol had a significant impact on the odds of being in the always-zero group. The odds of no human death occurring (being in the always-zero group) were multiplied by a factor of 0.614 (a 38.6% decrease) for drivers driving without drinking alcohol as compared to those driving after drinking alcohol, holding all other variables in the model constant. Over-speeding also had a significant impact on the odds of being in the always-zero group. The odds of no deaths occurring (being in the always-zero group) were multiplied by a factor of 0.248 (a 75.2% decrease) for drivers driving without over-speeding as compared to those driving with over-speeding, holding all other variables in the model constant.
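The percentage interpretations used in both parts of the model follow from exponentiating a coefficient: with a log link (count part) or a logit link (zero-inflation part), $\exp(\hat\beta)$ is a multiplicative ratio and $(\exp(\hat\beta) - 1) \times 100$ is the percent change. The sketch below applies this to the ratios and percentages reported in the text; the helper names are illustrative, and the back-computed coefficient is derived from the reported percentage rather than read from Table 3.

```python
import math

def percent_change(ratio):
    """Percent change implied by a multiplicative ratio (rate ratio or odds ratio)."""
    return (ratio - 1.0) * 100.0

def ratio_from_coef(coef):
    """Ratio implied by a coefficient on the log or logit scale."""
    return math.exp(coef)

def coef_from_percent(pct):
    """Coefficient implied by a reported percent change (inverse of the two steps above)."""
    return math.log(1.0 + pct / 100.0)

# Odds ratios reported for the zero-inflation part.
print(percent_change(0.614))  # alcohol: about -38.6
print(percent_change(0.248))  # over-speeding: about -75.2

# A 58.0% decrease (female vs male drivers) corresponds to a rate ratio of 0.42.
beta_sex = coef_from_percent(-58.0)
print(ratio_from_coef(beta_sex))
```

The same conversion applies to increases: for example, a reported 95.4% increase corresponds to a rate ratio of 1.954.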

Conclusions
Road traffic crashes are increasing at an alarming rate, causing loss of life and resources. This study identified the predictor variables that had a significant impact on the number of human deaths per road traffic crash, and also identified the best-fitting count model for analyzing human deaths due to road traffic crashes. For the selected ZIGP model, in the generalized Poisson part, sex of driver, age of driver, driving under the influence of alcohol, driving under fatigue, failure to give priority, day of the week, road condition, overloading, over-speeding and type of accident were statistically significant factors for the number of human deaths per road traffic crash. Giving more attention to these factors can reduce the number of human deaths due to road traffic crashes. Finally, we believe that road traffic crashes can be reduced if the significant factors are properly addressed.

Recommendations
a. The Bahir Dar city administration traffic police office and the police commission should prepare and implement appropriate policies and strategies targeting the statistically significant variables identified here, in order to reduce the number of human deaths due to road traffic injuries.
b. Further studies on road traffic crashes should use detailed and accurate information on the determinant variables, recorded in detail rather than in broad categories, so that the results can be more accurate and efficient.