Automatic Machine Learning Classification of Alzheimer's Disease Based on Selected Slices from 3D Magnetic Resonance Imagining
Abdalla R. Gad1, N. M. Hussein Hassan2, Rania A. Abul Seoud2, Tamer M. Nassef3, *
1Electronic and Communication Department, Faculty of Engineering, October High Institute for Engineering and Technology, Giza, Egypt
2Electronic and Communication Department, Faculty of Engineering, Fayoum University, Fayoum, Egypt
3Computer and Software Department, Faculty of Engineering, Misr University for Science and Technology, Giza, Egypt
To cite this article:
Abdalla R. Gad, N. M. Hussein Hassan, Rania A. Abul Seoud, Tamer M. Nassef. Automatic Machine Learning Classification of Alzheimer's Disease Based on Selected Slices from 3D Magnetic Resonance Imagining. International Journal of Biomedical Science and Engineering. Vol. 4, No. 6, 2016, pp. 50-54. doi: 10.11648/j.ijbse.20160406.11
Received: October 31, 2016; Accepted: December 26, 2016; Published: February 15, 2017
Abstract: Alzheimer's disease (AD) is the most common form of dementia, a condition marked by memory loss. Imaging is important for monitoring, diagnosis, education, and prediction of Alzheimer's disease. Automated classification of subjects could provide support for clinicians. This study examined two classification methods for separating elderly persons with normal cognition (NC), Alzheimer's disease (AD), and mild cognitive impairment (MCI) using magnetic resonance imaging (MRI). The dataset consists of 120 subjects: 40 AD, 40 MCI, and 40 NC. The first technique was K-Nearest Neighbor (KNN) and the second was Support Vector Machine (SVM). First, the images of all subjects were filtered and normalized; second, twelve features were extracted. The two classifiers were then examined with permutations and combinations of all features in order to select the features that give the highest accuracy for identifying the classes. With random selection of the testing data, the best average accuracy was 97.92% using SVM with polynomial order three, and the best overall average accuracy for KNN was 95.833% with K=6 and K=7. The results show a relatively high classification accuracy between the three clinical categories. In summary, the proposed automatic classification technique can be used as a noninvasive diagnostic tool for Alzheimer's disease, with the capability of identifying early stages of the disease.
Keywords: Alzheimer's Disease, Magnetic Resonance Imaging, Feature Extraction, Classification, Support Vector Machine, K-nearest Neighbor
1. Introduction
Alzheimer's disease (AD) is a neurological disorder that affects people over 60 years of age. The number of cases multiplies over periods of 4 to 6 years. The disease was first described by Alois Alzheimer. As the number of people with Alzheimer's disease has increased, its symptoms and treatment have been intensively investigated. Risk factors have been identified only in recent years and, apart from a few exceptions, the factors that trigger the onset of AD remain unknown. It is one of the most common pathologies affecting this population. It worsens over time: it affects brain cells and causes the degeneration of the cells responsible for memory [3,4].
Feature extraction and classification are essential parts of the recognition procedure and have a significant impact on the performance of the system. Several approaches have been used to classify and diagnose AD, mostly using as features the voxel intensity (VI) of 3D MRI or PET images [6, 7], physical characteristics such as size and shape, histograms of the gradient, and textural analysis features [10, 11]. In the classification phase, these features have been supplied to k-nearest neighbours (KNN), support vector machines (SVMs), artificial neural networks (ANNs), and Naïve Bayes classifiers [12-15], using data sets such as the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the OASIS brain dataset (OASIS); the reported accuracies differ between classifiers and between data sets [16, 17]. ANN and KNN classifiers have achieved 99% accuracy for the classification of AD. Another study reported a classification accuracy of 98.87%, while a boosting classifier obtained 90.97%. Studies also differ in how they organize the classification steps: some separate the stages of the disease pairwise, such as AD versus NC and MCI versus AD, whereas others classify all stages of the disease in a single step using KNN, SVM, NN, and other classification techniques [21, 22].
2. Materials & Methods
The dataset used in this study was obtained from the National Alzheimer's Coordinating Center (NACC) and consists of 120 subjects aged 57 to 91. Table 1 summarizes the characteristics of the subjects included in this study.
|Number of Subjects||40||40||40|
|Age||67.3 ±10.5||71.1 ± 11||65.5 ± 9.5|
NORM COG: the subject has normal cognition (no MCI, dementia, or other neurological condition).
MRI dx: identifies whether the subject has NC, MCI, or AD.
Ten slices carrying the most information about the brain were selected from the 3D T1 MRI of each subject using image processing techniques. The selected slices were filtered to remove salt-and-pepper noise, and the whole data set was then normalized by cropping each slice to a region of interest that removes the black area outside the brain.
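As an illustration of this preprocessing step, the following minimal Python sketch removes impulse noise and crops away the black background. The text does not name the filter that was used; a 3x3 median filter is assumed here because it is a common choice for salt-and-pepper noise. The function names, the zero background threshold, and the toy 6x6 slice are all hypothetical.

```python
from statistics import median

def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighbourhood.
    Border pixels are copied unchanged (an assumption of this sketch)."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out

def crop_to_brain(image, threshold=0):
    """Crop to the bounding box of pixels brighter than `threshold`,
    discarding the black area that surrounds the brain."""
    keep_rows = [r for r, row in enumerate(image)
                 if any(v > threshold for v in row)]
    keep_cols = [c for c in range(len(image[0]))
                 if any(row[c] > threshold for row in image)]
    if not keep_rows:
        return []
    return [row[keep_cols[0]:keep_cols[-1] + 1]
            for row in image[keep_rows[0]:keep_rows[-1] + 1]]

# Toy slice: a bright block with one "salt" pixel (255) and one "pepper" pixel (0).
noisy_slice = [
    [0,   0,   0,   0,   0, 0],
    [0, 100, 100, 100, 100, 0],
    [0, 100, 255, 100, 100, 0],
    [0, 100, 100,   0, 100, 0],
    [0, 100, 100, 100, 100, 0],
    [0,   0,   0,   0,   0, 0],
]
denoised = median_filter_3x3(noisy_slice)   # filter first, as in the text
cropped = crop_to_brain(denoised)           # then remove the black border
```

On the toy slice both impulse pixels are restored to the surrounding intensity, and the cropped result keeps only the rows and columns that contain tissue. A median filter also erodes the corners of bright regions slightly, which is one reason the choice of filter and window size would need checking against real slices.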
After noise reduction and normalization of all subjects in the data set, features were extracted from the DICOM images as the first step toward the final classification stage. The feature extraction methodology develops features from the dataset that are related to the identification of normal cognition, Alzheimer's disease, and mild cognitive impairment. The chosen features include gray level co-occurrence matrix (GLCM) textural analysis features, physical characteristics, and asymmetry features. Table 2 shows the features extracted from the slices.
|9-Total Area Brain (Pixel)|
|10-Total Black Area (Pixel)|
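To make the GLCM-based textural features concrete, the sketch below builds a co-occurrence matrix for a single pixel offset and derives contrast, energy, homogeneity, and entropy from it. The offset (one pixel to the right), the symmetric counting, the number of gray levels, and the exact feature definitions are assumptions chosen from common conventions; the text does not specify them. The image is assumed to hold integer gray levels and to contain at least one pixel pair for the chosen offset.

```python
import math

def glcm(image, levels, dr=0, dc=1):
    """Normalised, symmetric gray level co-occurrence matrix for one offset.
    `image` holds integer gray levels in the range 0..levels-1."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Common GLCM statistics computed from a normalised matrix `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }

flat = texture_features(glcm([[1, 1], [1, 1]], levels=2))      # uniform patch
checker = texture_features(glcm([[0, 1], [1, 0]], levels=2))   # alternating patch
```

A uniform patch gives zero contrast and maximal energy, while the alternating patch gives higher contrast and entropy, which is the kind of textural difference these features are meant to capture.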
Pattern classification is very important in medical image processing: applied after feature extraction, it allows a disease to be detected before patients reach its dangerous stages. The flow chart in Fig. 1 shows the steps applied to the classification of NC, MCI, and AD.
First, a support vector machine (SVM) was applied because of the strong performance of SVMs in recent years. SVMs have attracted considerable attention and have been successfully applied to numerous applications, from computer vision to computational biology. An SVM with a polynomial kernel was used, according to the following equation [23,24].
$$K(x, y) = (x^{T} y + i)^{P}$$
where x and y are two feature vectors, i is a free parameter trading off the influence of higher-order versus lower-order terms in the polynomial, and P is the polynomial order.
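The polynomial kernel described above can be sketched in a few lines of Python. This assumes the standard form of the kernel, in which the dot product of the two feature vectors is shifted by the free parameter and raised to the polynomial order. The names `offset` and `degree` and their default values are illustrative, not taken from the study.

```python
def polynomial_kernel(x, y, offset=1.0, degree=3):
    """Standard polynomial kernel: (x . y + offset) ** degree.
    `offset` is the free parameter and `degree` the polynomial order."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + offset) ** degree

# Two hypothetical 2-element feature vectors.
k3 = polynomial_kernel([1, 2], [3, 4], offset=1, degree=3)  # (11 + 1) ** 3
k4 = polynomial_kernel([1, 2], [3, 4], offset=1, degree=4)  # (11 + 1) ** 4
```

A larger offset gives more weight to the lower-order terms of the expanded polynomial, and a smaller one emphasizes the highest-order interactions between features.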
The SVM procedure used in this study has two classification steps: the first step separates normal subjects from all abnormal subjects (AD and MCI), and the second step classifies each abnormal subject as either mild cognitive impairment or Alzheimer's disease.
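The two-step procedure amounts to a small cascade of binary decisions, sketched below. The two decision functions stand in for trained SVMs and are purely hypothetical thresholds on a toy two-element feature vector. Only the control flow reflects the description in the text.

```python
def two_stage_predict(features, is_abnormal, is_ad):
    """Stage 1: NC versus abnormal (AD or MCI).
    Stage 2: for abnormal subjects only, MCI versus AD."""
    if not is_abnormal(features):
        return "NC"
    return "AD" if is_ad(features) else "MCI"

# Hypothetical stand-ins for the two trained binary SVM decision functions.
def stage1(features):
    return features[0] > 0.5

def stage2(features):
    return features[1] > 0.5

labels = [two_stage_predict(f, stage1, stage2)
          for f in ([0.1, 0.9], [0.8, 0.2], [0.8, 0.9])]
```

One consequence of this design is that an error in the first stage cannot be corrected by the second: a subject wrongly labelled NC never reaches the MCI versus AD decision.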
Second, a k-nearest neighbor (KNN) classifier was applied. KNN assigns each test subject to the class that is nearest to it, based on the Euclidean distance between the test subject and the subjects of the three classes. KNN was used to classify all subjects into the three classes NC, MCI, and AD in one step, rather than the two steps used with the SVM classifier.
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^{2}}$$
where x_i and y_i are the i-th components of the two feature vectors x and y, and n is the number of features. The classification was repeated with different numbers of neighbors, k = 4, 5, 6, and 7.
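A minimal sketch of Euclidean-distance KNN is given below. It assumes the usual majority vote among the k nearest training subjects, which the text does not spell out. The training points are invented two-dimensional feature vectors, not data from the study.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train, query, k):
    """`train` is a list of (feature_vector, label) pairs.
    Returns the majority label among the k nearest training subjects."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: four subjects per class in a 2D feature space.
train = (
    [((x, y), "NC") for x in (0, 1) for y in (0, 1)]
    + [((x, y), "MCI") for x in (5, 6) for y in (5, 6)]
    + [((x, y), "AD") for x in (10, 11) for y in (10, 11)]
)
predictions = {k: knn_predict(train, (5.2, 5.4), k) for k in (4, 5, 6, 7)}
```

Because all three classes are handled by a single vote, no cascade of binary decisions is needed. The price is that the result can depend on k when the classes overlap, which is why several values of k are compared.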
3. Results and Discussion
Permutations and combinations of all extracted features were evaluated for the two classifiers in order to select the best average accuracy for each class. Twelve features were extracted from each of the ten selected slices; the features were then averaged over the slices, and the averages were supplied to the classifiers. Fig. 2 shows a sample of the features extracted from one slice.
The accuracy reported for each classifier is the highest accuracy obtained with the best-selected features, that is, the features that actually identify each class. The data set of 120 subjects was partitioned into 72 subjects for training the classifiers (24 NC, 24 MCI, and 24 AD) and 48 subjects for testing them (16 NC, 16 MCI, and 16 AD). The training and testing data were selected at random. This produced a large volume of results for the different parameters of the different classifiers. For KNN, the total accuracies obtained with different values of K are summarized in Table 3.
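The random partition described above, 24 training and 16 testing subjects per class, can be sketched as a class-wise shuffle. The subject identifiers, the function name, and the fixed seed are hypothetical; the study only states that the selection was random.

```python
import random

def stratified_split(subjects, n_train_per_class, seed=None):
    """`subjects` is a list of (subject_id, label) pairs.
    Shuffles each class separately and returns (train, test)."""
    rng = random.Random(seed)
    by_class = {}
    for subject in subjects:
        by_class.setdefault(subject[1], []).append(subject)
    train, test = [], []
    for members in by_class.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        train += shuffled[:n_train_per_class]
        test += shuffled[n_train_per_class:]
    return train, test

# 40 hypothetical subjects per class, as in the data set described in the text.
subjects = [(f"{label}{i:02d}", label)
            for label in ("NC", "MCI", "AD") for i in range(40)]
train_set, test_set = stratified_split(subjects, n_train_per_class=24, seed=0)
```

Splitting within each class keeps the three classes balanced in both sets, so that per-class accuracies are each based on the same number of test subjects.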
|Number of neighbors (K)||4||5||6||7|
The value of K was varied, and the data were tested on all 2^12 combinations of the twelve features; the set of features giving the highest accuracy for each class was then selected for each number of neighbors, as illustrated in Table 4.
|Number of neighbors (K)||Class||Selected features||Accuracy|
|4||NC||2, 4, 5, 6, 12||100%|
|4||AD||1, 5, 8, 10||68.75%|
|4||MCI||2, 6, 8, 9, 11, 12||87.5%|
|5||NC||8, 10, 11, 12||100%|
|5||AD||2, 7, 10, 12||100%|
|5||MCI||5, 6, 7, 8, 9, 10, 11, 12||75%|
|6||NC||2, 4, 5, 6, 12||100%|
|6||AD||1, 5, 8, 10||100%|
|6||MCI||2, 6, 8, 9, 11, 12||87.5%|
|7||NC||3, 8, 12||87.5%|
|7||AD||3, 6, 7, 8||100%|
|7||MCI||4, 6, 7, 10||100%|
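As a worked check, the per-class accuracies in Table 4 can be averaged for each value of K. The sketch assumes that the overall accuracy is the unweighted mean of the three per-class accuracies, which is reasonable here because each class contributes the same number of test subjects.

```python
# Per-class accuracies (%) copied from Table 4.
per_class_accuracy = {
    4: {"NC": 100.0, "AD": 68.75, "MCI": 87.5},
    5: {"NC": 100.0, "AD": 100.0, "MCI": 75.0},
    6: {"NC": 100.0, "AD": 100.0, "MCI": 87.5},
    7: {"NC": 87.5, "AD": 100.0, "MCI": 100.0},
}

# Unweighted mean over the three classes for each K.
average_accuracy = {k: sum(acc.values()) / len(acc)
                    for k, acc in per_class_accuracy.items()}

for k, avg in average_accuracy.items():
    print(f"K={k}: {avg:.2f}%")
# K=4: 85.42%
# K=5: 91.67%
# K=6: 95.83%
# K=7: 95.83%
```

Under this assumption the averages for K=6 and K=7 agree with the 95.83% reported for KNN.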
For the support vector machine, the polynomial order (P) was varied, and the data were tested on all 2^12 combinations of the features. The set of features giving the highest accuracy for each class (NC, MCI, AD) was then selected for SVM polynomial orders 3 and 4, as shown in Figs. 3 and 4.
The SVM accuracy was computed for each of the 2^12 feature combinations. The highest average accuracy was taken for the best-selected features that identify each class, and the total accuracy over all classes for the different SVM polynomial orders is summarized in Table 5.
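The exhaustive search over feature combinations can be sketched as follows. With twelve features there are 2^12 = 4096 subsets, of which 2^12 - 1 = 4095 are non-empty and can actually be passed to a classifier. The scoring function below is a hypothetical placeholder for "train the classifier on this subset and measure its test accuracy"; its target subset is purely illustrative.

```python
from itertools import combinations

def best_feature_subset(n_features, score):
    """Score every non-empty subset of feature numbers 1..n_features.
    Returns (best_subset, best_score, number_of_subsets_evaluated)."""
    best_subset, best_score, evaluated = None, float("-inf"), 0
    for size in range(1, n_features + 1):
        for subset in combinations(range(1, n_features + 1), size):
            evaluated += 1
            current = score(subset)
            if current > best_score:
                best_subset, best_score = subset, current
    return best_subset, best_score, evaluated

def toy_score(subset):
    """Hypothetical stand-in for classifier accuracy: rewards features in an
    arbitrary target set and penalises every extra feature."""
    target = {2, 7, 10, 12}
    chosen = set(subset)
    return len(target & chosen) - 0.1 * len(chosen - target)

subset, value, evaluated = best_feature_subset(12, toy_score)
```

An exhaustive search is feasible here only because the number of features is small; the number of subsets doubles with every added feature.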
This study presented two approaches to assist doctors in distinguishing and classifying NC, MCI, and AD. Both reduce the dimensionality of the feature vectors of the data set by selecting the features that give the highest accuracy, although the accuracies reported in earlier work are not high enough to completely eliminate the need for a doctor's approval. The two proposed approaches achieved higher accuracy than earlier studies. SVM reached 97.92% accuracy with polynomial order 3. The best-selected features were mean, contrast, kurtosis, and total brain area for identifying normal cognition; energy, homogeneity, and skewness for identifying Alzheimer's disease; and mean, entropy, contrast, homogeneity, kurtosis, and image symmetry for identifying mild cognitive impairment. This SVM accuracy is the best compared with the earlier accuracies for classifying AD mentioned above. KNN, using combinations of the features extracted from the images, reached 95.83% accuracy with K=6 and K=7; these accuracies were the best obtained with such traditional combinations of features and classical classifiers compared with previous studies.
The contributions of this paper are as follows. The classification is based on ten slices selected from the 3D T1 MRI rather than on the whole 3D volume. The classification uses the average of the selected features over the ten slices rather than all features extracted from every slice. Together, these choices reduce the running time needed to process the data. Finally, this study uses a new data set from NACC.
The dataset for this project was obtained from the National Alzheimer's Coordinating Center (NACC), Department of Epidemiology, School of Public Health and Community Medicine, University of Washington, grant number U01 AG016976. This work is partially supported and funded by the October High Institute for Engineering and Technology (OHI).