Classification Analysis of Gender Among Diabetic Patients in a Nigerian Hospital

In medical research there are few records of scientific methods for statistically discriminating and classifying patients into gender groups. The purpose of this study is to use discriminant analysis and classification analysis to classify diabetic patients into gender groups; to estimate the proportion of observations in each prior group; and to estimate the probabilities of correct classification and misclassification. To this end, a sample of 152 cases (diabetic patients) was observed with the following measurements: age (x1), urea (x2), temperature (x3), fasting blood sugar (x4), body mass index (x5), and marital status (x6). Gender was classified as male or female. The estimated discriminant function is Z = 0.036x1 + 0.008x2 - 0.897x3 - 0.021x4 - 0.017x5 - 2.872x6. Overall, 64.5% of the original grouped cases were correctly classified, so the percentage of misclassification is 35.5%. In conclusion, the measure of predictive ability, which is the percentage of correct classification, shows that discriminant analysis can be used to classify diabetic patients into two gender classes and can also be used to predict group membership in other subject areas.


Introduction
Diabetes mellitus is recognized as being a syndrome, a collection of disorders that have hyperglycemia and glucose intolerance as their hallmark, due either to insulin deficiency or to the impaired effectiveness of insulin's action, or to a combination of these. In order to understand diabetes it is necessary to understand the normal physiological process occurring during and after a meal. Food passes through the digestive system, where nutrients, including proteins, fat and carbohydrates are absorbed into the bloodstream. The presence of sugar, a carbohydrate, signals to the endocrine pancreas to secrete the hormone insulin. Insulin causes the uptake and storage of sugar by almost all tissue types in the body, especially the liver, musculature and fat tissues [1].
Diabetes mellitus (DM) is a lifelong condition that causes a person's blood sugar level to become too high. There is no cure for diabetes, so people with diabetes need to manage their disease to stay healthy. It is a metabolic disorder of chronic hyperglycemia characterized by disturbances of carbohydrate, protein, and fat metabolism resulting from absolute or relative insulin deficiency, with dysfunction in organ systems [14]. Populations previously unaffected or minimally affected by DM are now reporting soaring prevalence figures, which poses a real challenge to health financing by governments and nongovernmental organizations. The latest prevalence figure published by the International Diabetes Federation (IDF) is 425 million persons living with DM worldwide, with nearly 50% of these undiagnosed. The developing economies of Africa and Asia contribute a significant fraction of this figure. There is also a rising burden from the complications of DM alongside the ever-increasing prevalence of the disease. We now see high rates of DM-related amputations, cerebrovascular disease, and heart-related problems. In Nigeria, the current prevalence of DM among adults aged 20-69 years is reported to be 1.7% [4].
Linear discriminant analysis easily handles the case where the within-class frequencies are unequal, and its performance has been examined on randomly generated test data. The method maximizes the ratio of between-class variance to within-class variance in any particular data set, thereby guaranteeing maximal separability [2]. Discriminant analysis is concerned with group membership: the technique is used to classify individuals or objects into one of the alternative groups on the basis of a set of existing predictor variables. The dependent variable in discriminant analysis has to be categorical and measured on a nominal scale. The independent variables, also called predictor variables, are measured on an interval or ratio scale [3,5,6,7].
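For the two-group case considered here, the variance-ratio criterion mentioned above can be written out explicitly. The notation below (w for the weight vector, S_B and S_W for the between-group and within-group scatter matrices, G_k for the set of cases in group k) is introduced only for this illustration and is not taken from the study; it is the standard textbook statement of Fisher's criterion.

```latex
% Fisher's criterion for two groups with mean vectors \bar{x}_1 and \bar{x}_2
J(w) = \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w},
\qquad
S_B = (\bar{x}_1 - \bar{x}_2)(\bar{x}_1 - \bar{x}_2)^{\top},
\qquad
S_W = \sum_{k=1}^{2} \sum_{i \in G_k} (x_i - \bar{x}_k)(x_i - \bar{x}_k)^{\top}

% The maximizing direction (unique up to scale) is
w \propto S_W^{-1} (\bar{x}_1 - \bar{x}_2)
```

The discriminant score of a case x is then Z = w'x, which is the form of the discriminant function reported in this paper.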

Methodology
Multivariate analysis consists of a collection of methods that can be used when several measurements are made on each individual or object in one or more samples. The measurements are referred to as variables, and the individuals or objects as units (research units, sampling units, or experimental units) or observations. In practice, multivariate data sets are common, although they are not always analyzed as such. The exclusive use of univariate procedures with such data is no longer excusable, given the availability of multivariate techniques and of inexpensive computing power to carry them out. Research in the behavioural sciences mostly involves developing prediction and classification models. Discriminant function analysis, also called discriminant analysis, is used to classify cases into the values of a categorical dependent variable, usually a dichotomy.
Discriminant analysis is a statistical technique used to classify cases into two or more categories of a dependent variable. It can also be used, in the manner of a regression technique, to predict the value of the categorical dependent variable. Discriminant analysis may be applied in a number of settings. In the armed forces, it is used in assigning new personnel to training programs. In industry, it is used in assigning new employees to a particular job category. In health, it is used in classifying a patient into one of several diagnostic categories. In education, it is a useful aid in both educational and vocational counseling [5,6,7]. Concern for the classification ability of the linear discriminant function has obscured, and even confused, the fact that two very distinct purposes and procedures for conducting discriminant analysis exist. The first procedure, discriminant predictive analysis, is used to optimize the predictive functions; that is, the objective is to develop an equation that maximally discriminates between the groups using P independent variables. The second procedure, discriminant classification analysis, uses the predictive functions derived in the first procedure either to classify fresh sets of data of known group membership, thereby validating the predictive function, or, if the function has previously been validated, to classify new sets of data of unknown group membership [8,9,12].
The goals of discriminant analysis include identifying the relative contribution of the P variables to the separation of the groups and finding the optimal plane onto which the points can be projected to best illustrate the configuration of the groups (see [2,12]).

Classification Functions of Ronald Aylmer Fisher
Rule: The observation unit is assigned to the group for which the value of its classification function is maximal.
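The classification functions themselves are not reproduced in this section, so the following is only a minimal sketch of how the maximum-value rule is commonly implemented. It assumes the usual linear form, in which each group has a constant and a coefficient vector computed from the group mean, the pooled within-group covariance matrix, and a prior probability. The two predictors, the toy data, the equal priors, and all function names are hypothetical and are not taken from the study.

```python
import math

# Sketch of Fisher-type linear classification functions for two predictors.
# All data values below are made up for illustration only.

def mean_vector(rows):
    """Column means of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter(rows, mean):
    """Sum of squares and cross-products about the group mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s

def invert_2x2(m):
    """Inverse of a 2x2 matrix (enough for this two-predictor sketch)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def fit_classification_functions(groups, priors):
    """Return {label: (constant, coefficients)} from the pooled covariance."""
    means = {g: mean_vector(rows) for g, rows in groups.items()}
    n_total = sum(len(rows) for rows in groups.values())
    dof = n_total - len(groups)  # pooled degrees of freedom
    pooled = [[0.0, 0.0], [0.0, 0.0]]
    for g, rows in groups.items():
        s = scatter(rows, means[g])
        for a in range(2):
            for b in range(2):
                pooled[a][b] += s[a][b] / dof
    inv = invert_2x2(pooled)
    functions = {}
    for g, m in means.items():
        coef = [inv[a][0] * m[0] + inv[a][1] * m[1] for a in range(2)]
        const = -0.5 * (coef[0] * m[0] + coef[1] * m[1]) + math.log(priors[g])
        functions[g] = (const, coef)
    return functions

def classify(x, functions):
    """Assign x to the group whose classification function value is largest."""
    scores = {g: c0 + sum(c * v for c, v in zip(coef, x))
              for g, (c0, coef) in functions.items()}
    return max(scores, key=scores.get)

# Hypothetical training data: (predictor 1, predictor 2) per case.
toy_groups = {
    "male": [(50.0, 6.0), (55.0, 6.5), (60.0, 7.2), (58.0, 6.8)],
    "female": [(40.0, 8.0), (45.0, 8.5), (42.0, 9.1), (48.0, 8.2)],
}
toy_priors = {"male": 0.5, "female": 0.5}  # assumed equal priors
functions = fit_classification_functions(toy_groups, toy_priors)
print(classify((57.0, 6.6), functions))
print(classify((43.0, 8.6), functions))
```

With more than two predictors, as in this study, the 2x2 inverse would be replaced by a general matrix inverse, and unequal priors would shift the constants through the logarithm term.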
The data used for this research comprise 152 diabetic patients, with the following measurements on each: age, fasting blood sugar, temperature, body mass index, marital status, and urea.
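As an illustration of how these six measurements enter the analysis, the sketch below evaluates the discriminant function reported in the abstract for a single patient. The coefficients are the reported ones; the measurement units, the numeric coding of marital status, and the example patient are assumptions made for this sketch. No constant term or cutoff score is reported, so the code computes the score only and does not assign a group.

```python
# Coefficients of Z as reported in the abstract (x1..x6 in the paper's order).
COEFFICIENTS = {
    "age": 0.036,                   # x1
    "urea": 0.008,                  # x2
    "temperature": -0.897,          # x3
    "fasting_blood_sugar": -0.021,  # x4
    "body_mass_index": -0.017,      # x5
    "marital_status": -2.872,       # x6 (numeric coding is an assumption)
}

def discriminant_score(patient):
    """Weighted sum Z of the six measurements for one patient."""
    return sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)

# Hypothetical patient; values and units are illustrative, not study data.
example_patient = {
    "age": 50,
    "urea": 5.0,
    "temperature": 37.0,
    "fasting_blood_sugar": 120,
    "body_mass_index": 25,
    "marital_status": 1,
}

z = discriminant_score(example_patient)
print(f"Z = {z:.3f}")
```

In a full analysis the score would be compared with a cutoff, for example the midpoint of the two group mean scores, to assign the patient to a group.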

Conclusion
A discriminant score was obtained for each patient and used to classify the patients into two groups. Overall, 64.5% of the original grouped cases were correctly classified, while 35.5% were misclassified.
The estimated probability of misclassification is therefore 0.355. That is to say, the level of accuracy is high and significant.
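The relationship between the two rates can be checked with a few lines of arithmetic. The case counts below are not reported in the paper; they are approximations implied by the sample size of 152 and the reported 64.5% rate, and the variable names are chosen for this sketch only.

```python
n_cases = 152          # sample size reported in the study
correct_rate = 0.645   # reported proportion correctly classified

# Misclassification is the complement of correct classification.
misclassification_rate = round(1 - correct_rate, 3)

# Approximate counts implied by the reported rate (not stated in the paper).
approx_correct = round(n_cases * correct_rate)
approx_misclassified = n_cases - approx_correct

print(misclassification_rate, approx_correct, approx_misclassified)
```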
In summary, the measure of predictive ability, which is the percentage of correct classification, shows that discriminant analysis can be used to classify diabetic patients into two gender classes and can also be used to predict group membership in other subject areas.