Diagnosis of Epilepsy Using Signal Time Domain Specifications and SVM Neural Network

Epilepsy is a central nervous system (neurological) disorder caused by abnormal, pathological oscillatory activity of a group of nerve cells in the brain. Electroencephalographic (EEG) signals, which record the electrical activity of the brain, are widely used for the diagnosis of neurological diseases. However, acquiring long-term EEG recordings that contain seizure activity is costly and burdensome, particularly in regions that lack medical centers and trained neurologists. In this article, a new method is proposed for the automatic detection of epilepsy from EEG signals. The aim is to provide a detection model in which a support vector machine (SVM), optimized using a genetic algorithm, classifies the EEG data. SVMs are among the most powerful machine learning techniques and are widely applicable in many fields. The training and testing data were obtained from the EEG signals of 367 healthy and ill individuals, recorded at Barekat Imam Khomeini (RAH) Hospital in Miyaneh city. Noise was removed from the data with an FIR filter, and a genetic algorithm was used to determine the filter coefficients and the optimal sample number. The method classifies the signals of healthy individuals and of those with epilepsy with an accuracy of 100%.


Introduction
Epilepsy is a central nervous system (neurological) disorder in which brain activity becomes abnormal, causing seizures during which unusual behaviors, sensations, and sometimes loss of awareness occur [1]. After CAV, epilepsy is the second most common central nervous system (CNS) illness, and about 0.5-0.1% of the world's population suffers from it [2].
The disease may cause physical or mental damage; in the worst case, it may lead to death due to genetic and physical brain damage during a seizure [3].
One of the simplest non-invasive ways to diagnose epilepsy is to record the changes in electrical potential along the surface of the scalp with an electroencephalograph, which offers a temporal resolution on the order of milliseconds [4]. Compared with other brain imaging techniques such as fMRI and PET, EEG is advantageous in terms of size, cost, and suitability for both laboratory and field conditions [5].
An EEG system includes an analog-to-digital converter (ADC), which receives analog samples from specific areas of the scalp surface through electrodes and converts them to digital values with a specific resolution.
The EEG signal is used for the diagnosis of epilepsy because the condition is related to the electrical activity of the brain [6]. Epilepsy is recognized by repeated seizures in the EEG signal. In most cases, the onset of a seizure cannot be predicted over a short period, so continuous EEG recording is required for diagnosis. Because conventional analysis methods are tedious and time consuming, many automatic EEG systems for epilepsy diagnosis have been developed [6]. Figure 1 illustrates samples of normal and epileptic EEG signals, respectively.
Neurologists can diagnose epilepsy through visual inspection of EEG signals. However, because of the complexity and the large number of channels, and the difficulty of interpreting EEG visually, researchers have proposed various methods for epilepsy diagnosis. The importance of epilepsy and the critical role of the diagnosis phase have driven the development of different automatic EEG-based diagnosis methods in recent years.
Studies on this subject have been conducted for several years; only recently published work is reviewed in this section. Sharmila et al. [8] proposed a framework for detecting epileptic seizures from EEG data recorded from normal subjects and epileptic patients. The framework is based on Discrete Wavelet Transform (DWT) analysis of EEG signals using linear and nonlinear classifiers. Statistical features derived from the DWT were classified with Naïve Bayes (NB) and k-Nearest Neighbor (k-NN) classifiers [8].
Zhang et al. [9] presented a hybrid feature extraction method that combines Variational Mode Decomposition (VMD) with second-order autoregressive (AR) features, and used a random forest classifier for the classification task [9].
Yol et al. [10] compared the performance of various classifiers, e.g. Linear Discriminant Analysis (LDA), k-Nearest Neighbor (k-NN) and Naïve Bayes, in classifying EEG signals for several feature extraction methods, such as Renyi entropy, Tsallis entropy and integrated relative entropies of EEG signals [10].
Raghu et al. [11] constructed a square matrix from EEG time intervals without artifact filtering and computed its determinant. They classified the extracted features using several classifiers and 5 layers on the Bonn EEG datasets within a cross-validation framework [11].
Li et al. presented a new method based on the Dual-Tree Complex Wavelet Transform (DT-CWT) with a Support Vector Machine (SVM) to detect epileptic seizures [12].
Zazzaro et al. proposed a new technique for automatically identifying EEG signals, designed as a comprehensive tool for extracting features from time series data. The tool includes signal processing, windowing, feature extraction and selection, and an SVM [13].
The EEG data used in this paper were obtained from Barekat Imam Khomeini (RAH) Hospital in Miyaneh city. The data comprise two sets: the first contains EEG signals from healthy subjects (249 cases) and the second contains EEG signals from patients (118 cases) under normal conditions. Each signal lasts 86.8 s. The data were collected using the standard 10-20 electrode placement system. The subjects were fully conscious, with eyes open and closed. Signals were recorded by an amplifier system (19-128 channels) and digitized at 16-bit resolution and 500 samples per second. The following figure shows 3 sample EEG signals extracted from the device for a healthy subject (a) and a patient.

Theory
Recent research can be grouped into four classes: time-domain, frequency-domain, time-frequency-domain and nonlinear techniques. Several frequency-domain studies are examined in this section.
Mingyiang et al. (2016) suggested an efficient technique using wavelet-based nonlinear analysis and a Genetic Algorithm-Support Vector Machine (GA-SVM). In that paper, DWT was used instead of DD-DWT to transfer the signal from the time domain to the frequency domain. The Hurst exponent and fuzzy entropy are extracted as input features, and a Radial Basis Function (RBF) kernel is adopted in the SVM for nonlinear mapping, which involves two parameters: C (the regularization parameter) and S (the kernel parameter). A Genetic Algorithm (GA) is therefore used to search for the optimal parameter values [14].
Meenakshi et al. [15] measured the frequency range of EEG signals for epilepsy. They divided the total range into five bands (α, β, γ, δ and θ) and extracted the frequency distribution of the EEG signals by FFT to compare epileptic patients with healthy subjects [15].
Martis et al. [16] isolated the various activities present in the EEG, e.g. delta, theta, lower alpha, upper alpha, lower beta, upper beta and lower gamma, using Wavelet Packet Decomposition (WPD). Nonlinear features such as the Largest Lyapunov Exponent (LLE), Higuchi Fractal Dimension (HFD), Hurst Exponent (HE) and Sample Entropy (SE) were calculated for each EEG band. The diagnostic potential of these parameters for separating the normal, ictal and interictal classes was tested with the ANOVA statistical test. The ranked features were then classified with a Support Vector Machine (SVM) with various kernel functions, a Decision Tree (DT) and k-Nearest Neighbor (k-NN) to select the best classifier. With a Radial Basis Function (RBF) kernel, the SVM gave the highest accuracy (98%), with sensitivity and specificity of 99.5% and 100%, respectively, using five features [16].
Because the development of epilepsy is usually a dynamic, non-stationary process and the signals are composed of several frequencies, visual, frequency-based and conventional techniques can be applied only with limitations.
In this paper, the EEG signals are first selected, then denoised with a noise filter, and finally classified using an SVM classifier optimized with a genetic algorithm.
Acquiring the EEG signal is the first step in processing brain signals. The signals are acquired by attaching electrodes to the scalp at various positions over the skull according to a standard, and are then prepared for the preprocessing phase.
The international 10-20 standard is usually used for recording brain signals. This standard specifies how the electrodes are placed on the various parts of the scalp [17]. The standard electrode positions on the scalp surface are shown in Figure 4. The standard makes it possible to cover almost all brain areas with 19 electrodes [17]. The electrode locations are selected with reference to specific landmarks of the skull. Electrodes are placed at the intersections defined by these landmarks, and the intermediate electrodes are arranged at intervals of 10 and 20 percent of the total distance [17].
Each electrode is assigned a letter that denotes the region over which it is placed: Frontal (F), Frontopolar (FP), Temporal (T), Occipital (O), Parietal (P) and Central (C). Each electrode name also carries a number: even numbers indicate the right hemisphere and odd numbers the left hemisphere. The subscript z denotes the zero line (midline), where the left and right sides meet. The farther an electrode is from the zero line (the line running from the nose to the back of the head), the larger its number [17].
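The naming convention can be illustrated with a small sketch. The helper `describe_electrode` and the region names in `REGIONS` are hypothetical and introduced only for illustration; they are not part of the recording software used in this study.

```python
# Hypothetical helper illustrating the 10-20 naming convention.
REGIONS = {"FP": "frontopolar", "F": "frontal", "T": "temporal",
           "O": "occipital", "P": "parietal", "C": "central"}

def describe_electrode(label):
    """Split a 10-20 label such as 'F8' or 'Cz' into (region, side)."""
    label = label.strip().upper()
    # Try the longest prefix first so that 'FP1' is not read as frontal 'F'.
    for prefix in sorted(REGIONS, key=len, reverse=True):
        if label.startswith(prefix):
            suffix = label[len(prefix):]
            break
    else:
        raise ValueError(f"unknown electrode label: {label}")
    if suffix == "Z":
        side = "midline"           # zero line between the two sides
    elif suffix.isdigit():
        side = "right" if int(suffix) % 2 == 0 else "left"
    else:
        raise ValueError(f"unknown electrode suffix: {suffix}")
    return REGIONS[prefix], side

for name in ("F8", "T5", "O1", "Cz"):
    print(name, describe_electrode(name))
```

Under these assumptions, an even-numbered label such as F8 resolves to the right side and an odd-numbered label such as T5 to the left side.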
The electrodes usually used on the scalp consist of Ag-AgCl (silver-silver chloride) plates with a diameter of 1-3 mm and a long, flexible lead wire that can be connected to an amplifier [19]. A piece of cotton cloth covers the silver part of each electrode and is fastened to it with elastic rings. The electrodes are kept in saline solution beforehand so that the cotton absorbs the solution and provides a suitable electrical connection between the electrode and the scalp. The electrodes are arranged on the head to acquire the EEG signal according to the international 10-20 system [18].
The scalp surface should be cleaned well to minimize the contact impedance of the surface electrodes.
The electrodes on the scalp are connected to an EEG head box. This box is a separate panel that also contains a preamplifier and is placed near the patient's head.
The EEG signal is usually recorded from a conscious, awake person lying on a bed in a relaxed, drowsy state with closed eyes. Artifacts caused by the motion of leads and electrodes are greatly reduced under such conditions.
A ground electrode is used as the common reference for all voltages in the system. It may be located anywhere on the forehead or on the ear [20].
Most EEG recordings are made on a bipolar basis: the potential difference is measured between a given active electrode and a relatively passive reference electrode. Conductive gel is used to reduce the impedance at the electrode site.
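Each recorded EEG signal lasts 86.8 s. The device delivers 500 samples per second, so each channel contains 86.8 × 500 = 43,400 samples, and 19 channels together give 824,600 values per recording. With 367 recorded signals, the size of the initial database is therefore 367*824600.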
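The size of the database can be checked with a few lines of arithmetic. The sketch below assumes that the 824600 columns are the samples of 19 channels placed side by side and that each value is stored in 2 bytes (16 bits); both are assumptions made for illustration, not statements about the actual file format.

```python
DURATION_S = 86.8        # length of each recording, from the text
SAMPLE_RATE_HZ = 500     # samples per second, from the text
N_CHANNELS = 19          # assumption: one column block per 10-20 electrode
N_RECORDINGS = 367       # healthy subjects plus patients
BYTES_PER_VALUE = 2      # assumption: raw 16-bit storage

samples_per_channel = round(DURATION_S * SAMPLE_RATE_HZ)
values_per_recording = samples_per_channel * N_CHANNELS
total_values = N_RECORDINGS * values_per_recording
raw_size_mb = total_values * BYTES_PER_VALUE / 1e6

print(samples_per_channel, values_per_recording, total_values)
print(f"approximate raw size: {raw_size_mb:.0f} MB")
```

Even at 2 bytes per value the matrix occupies several hundred megabytes, and a double-precision copy would be four times larger.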
These dimensions are very large and bring drawbacks such as a heavier computational load, slower software and a higher probability of error. The samples were therefore compressed to increase processing speed and reduce the response time of the system. The compression rate (n) may affect the CCR value; this effect is illustrated in Figure 5. In addition, the recorded EEG signal contains a lot of noise, which may originate from body movement, mains power, cable motion, etc. An FIR filter was used to remove the noise. Finite Impulse Response (FIR) smoothing filters (also called polynomial digital smoothing filters or least-squares smoothing filters) are usually used to smooth a noisy signal whose noise-free frequency span is large. In such applications they perform better than standard averaging FIR filters, which tend to filter out high-frequency signal content along with the noise. The filtering of an EEG signal is shown in Figure 6. Similarly, the effect of the filter degree on the CCR value is shown in Figure 7, and the effect of removing each of the 19 channels on the CCR value is illustrated in Figure 8. As can be seen, the diagnosis accuracy is 50%, 51% and 51% for channels 11, 14 and 15, respectively. These findings indicate that these channels, which correspond to electrodes T5, P4 and O1, have the highest impact on diagnosis accuracy. They are located in the temporal, parietal and occipital lobes, respectively. The processes for collecting and forming the preprocessed database are illustrated in Figure 9.
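The description of a least-squares polynomial smoothing filter matches what is commonly called a Savitzky-Golay filter; whether the filter used in this study is exactly of this form is an assumption. The following minimal sketch derives the FIR coefficients for a polynomial degree `order` and an odd frame length `frame` by solving the least-squares normal equations, and applies them to a signal. The function names are hypothetical.

```python
def smoothing_coefficients(order, frame):
    """FIR weights of a least-squares polynomial smoother.

    A polynomial of degree `order` is fitted to `frame` consecutive samples;
    the returned weights evaluate that fit at the centre of the frame.
    """
    if frame % 2 == 0 or frame <= order:
        raise ValueError("frame must be odd and larger than order")
    half = frame // 2
    m = order + 1
    # Design matrix A[k][j] = k**j for sample positions k = -half..half.
    A = [[float(k) ** j for j in range(m)] for k in range(-half, half + 1)]
    # Normal equations (A^T A) z = e0, stored as an augmented matrix.
    M = [[sum(row[i] * row[j] for row in A) for j in range(m)]
         + [1.0 if i == 0 else 0.0] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(m):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    z = [M[i][m] / M[i][i] for i in range(m)]
    # Weight of the sample at position k is sum_j z[j] * k**j.
    return [sum(zj * akj for zj, akj in zip(z, row)) for row in A]

def smooth(signal, order=3, frame=9):
    """Filter the interior samples; edge samples are copied unchanged."""
    h = smoothing_coefficients(order, frame)
    half = frame // 2
    out = list(signal)
    for n in range(half, len(signal) - half):
        out[n] = sum(c * signal[n - half + k] for k, c in enumerate(h))
    return out

noisy = [0.1 * t + (0.5 if t % 2 else -0.5) for t in range(30)]
print(smooth(noisy)[4:8])
```

A filter of this kind reproduces any polynomial up to the chosen degree exactly, which is why it distorts slowly varying signal content less than a plain moving average of the same length.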

Pattern Recognition
Pattern recognition is the classification and separation of specific patterns in an available dataset based on predefined features. Having machines perform human skills such as face recognition, speech recognition and reading handwritten letters, with high robustness to noise and to ambient conditions, is a problem that has been addressed by researchers in engineering fields such as Artificial Intelligence (AI) and machine vision.
Pattern recognition is widely applicable in various branches of science, including electrical engineering (medicine, computer and telecommunication), biology, machine vision, economics and psychology. It is divided into clustering (unsupervised) and classification (supervised), and includes KNN, SVM and Bayesian classifiers and neural networks such as RBF, perceptron and SOM.
Among these techniques, SVM is a binary classification method. The wide use of SVMs can be attributed to several factors, including their excellent generalization ability, fast computation, robustness to the external environment and good learning capacity for small samples. Nonetheless, the SVM parameters play an essential role in building an accurate and stable prediction model [21]. For these reasons, SVM classification was used for data classification in this study.

Support Vector Machine (SVM)
The SVM is a machine learning algorithm that extends linear models. It can be used for the classification and regression of datasets. By increasing the dimensionality of the problem with a kernel function, the algorithm becomes an effective method for data modeling [22].
The algorithm aims to find the boundary between the data that is as far as possible from all classes. An SVM usually separates two classes; for multiclass datasets, the classes are compared pairwise to find the class of a given sample [22].
If the classes are linearly separable, the hyperplanes with the maximum margin for separating the classes are obtained. If the data are not linearly separable, they are mapped into a higher-dimensional space so that they can be separated linearly in the new space [22]. The separation of two classes using the SVM classification technique, with the optimal hyperplane and the maximum margin, is shown in Figure 10.
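The best line (hyperplane) is found to separate these two classes. In the 2D case, this line is described by (1):
$$w_1 x_1 + w_2 x_2 + b = 0 \tag{1}$$
In (1), $w = (w_1, w_2)$ determines the gradient of the line and $b$ its distance from the origin. The N-dimensional case is given by (2):
$$w^{T}x + b = 0 \tag{2}$$
Equation (2) divides the space into two parts, which are described by (3):
$$w^{T}x + b > 0, \qquad w^{T}x + b < 0 \tag{3}$$
Instead of this single line, the SVM uses two parallel lines to create a more reliable boundary, as in (4):
$$w^{T}x + b = +1, \qquad w^{T}x + b = -1 \tag{4}$$
Suppose that we have a set of $n$ data points $\{x_i, y_i\}$, $i = 1, 2, \dots, n$, with $x_i \in \mathbb{R}^{d}$ and $y_i \in \{-1, 1\}$, such that $w^{T}x_i + b \ge 1$ if $y_i = 1$ and $w^{T}x_i + b \le -1$ if $y_i = -1$. Multiplying by $y_i$ combines both conditions into (5):
$$y_i\,(w^{T}x_i + b) \ge 1 \tag{5}$$
The larger the distance between the two parallel lines, the greater the confidence margin and the better the separation. Let $d_{+}$ and $d_{-}$ denote the distances of the two parallel lines from the middle line. The best case for separating the classes is $d_{+} = d_{-}$.
This distance $d$ is calculated by (6):
$$d = \frac{|w^{T}x + b|}{\|w\|} = \frac{1}{\|w\|} \tag{6}$$
Since $d_{+} = d_{-}$ is the optimal case, (7) follows:
$$d_{+} + d_{-} = \frac{2}{\|w\|} \tag{7}$$
To maximize $d_{+} + d_{-}$, $\|w\|$ should be minimized, or equivalently $\frac{1}{2}w^{T}w$ should be minimized.
The aim is therefore to find $\min \frac{1}{2}w^{T}w$ subject to $y_i\,(w^{T}x_i + b) \ge 1$. This is solved by quadratic programming, in which the solution lies at a saddle point of (8):
$$L(w, b, \alpha) = \frac{1}{2}w^{T}w - \sum_{i=1}^{n} \alpha_i \left[ y_i\,(w^{T}x_i + b) - 1 \right] \tag{8}$$
In (8), $\alpha_i$ are the Lagrange coefficients. $L$ is the saddle function and is minimized with respect to $w$ and $b$. To do so, the derivatives of $L$ with respect to $w$ and $b$ are set to zero, which gives (9) and (10):
$$\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{n} \alpha_i y_i x_i \tag{9}$$
$$\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0 \tag{10}$$
Substituting these values into $L$ changes (8) into (11):
$$L = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^{T} x_j \tag{11}$$
In (11), every $\alpha_i$ must be greater than or equal to zero, and $L$ is maximized with respect to $\alpha$, which is the same as minimizing $-L$.
For convenience, $y_i y_j x_i^{T} x_j$ is denoted by $h_{ij}$, which gives (12):
$$\min_{\alpha}\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i h_{ij} \alpha_j - \sum_{i=1}^{n} \alpha_i \quad \text{subject to} \quad \alpha_i \ge 0, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0 \tag{12}$$
The quadratic program is solved in MATLAB and returns $\alpha$, from which $w$ and $b$ are calculated using (17) and (18):
$$w = \sum_{i=1}^{n} \alpha_i y_i x_i \tag{17}$$
$$b = \frac{1}{|S|} \sum_{s \in S} \left( y_s - w^{T}x_s \right) \tag{18}$$
Here $S$ is the set of support vectors ($\alpha_s > 0$), so the bias coefficient $b$ is taken as the mean over all of them.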
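The recovery of the weight vector and the bias from the Lagrange coefficients can be illustrated on a toy problem small enough to solve by hand. The two data points below are invented for illustration and are not EEG data; the sketch computes w as the sum of alpha_i * y_i * x_i and b as the mean of y_s - w·x_s over the support vectors.

```python
import math

# Toy training set with one point per class (illustrative values only).
X = [(1.0, 1.0), (-1.0, -1.0)]
y = [1, -1]

# With two points, sum(alpha_i * y_i) = 0 forces alpha_1 = alpha_2 = a.
# The dual objective then reduces to 2a - 4a^2, which is largest at a = 1/4.
alpha = [0.25, 0.25]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Weight vector: w = sum_i alpha_i * y_i * x_i
w = [sum(a * yi * xi[d] for a, yi, xi in zip(alpha, y, X)) for d in range(2)]

# Bias: mean of (y_s - w.x_s) over the support vectors (both points here).
b = sum(yi - dot(w, xi) for yi, xi in zip(y, X)) / len(X)

# Each support vector should satisfy y_i * (w.x_i + b) = 1.
margins = [yi * (dot(w, xi) + b) for yi, xi in zip(y, X)]
margin_width = 2.0 / math.sqrt(dot(w, w))

print("w =", w, "b =", b)
print("margins =", margins, "width =", margin_width)
```

In this example the width of the margin, 2/||w||, equals the distance between the two points, as expected when both points lie exactly on the two parallel lines.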
What has been described so far is the hard-margin SVM, which is not generally applicable because it assumes that the classes are fully linearly separable. The method works well when this holds, but this is not always the case [22].
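To make the method practical, the soft-margin SVM is introduced. The soft margin allows a limited amount of error, and a curve serves as the separating boundary, as illustrated in Figure 11.
To convert the line into a curve, a kernel trick is defined that maps $x$ into a space $z$, according to (19):
$$z = \phi(x) \tag{19}$$
In other words, the inner product in the new space is given by (20):
$$k(x_i, x_j) = \phi(x_i)^{T} \phi(x_j) \tag{20}$$
Overall, the problem is given by (21), (22) and (23):
$$\max_{\alpha}\; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j\, k(x_i, x_j) \tag{21}$$
$$\alpha_i \ge 0, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 \tag{22}$$
$$b = \operatorname{mean}_{s \in S} \left( y_s - \sum_{i=1}^{n} \alpha_i y_i\, k(x_i, x_s) \right) \tag{23}$$
The kernel functions are introduced in (24), (25) and (26):
i. Polynomial:
$$k(x_i, x_j) = \left( x_i^{T} x_j + 1 \right)^{p} \tag{24}$$
ii. Gaussian:
$$k(x_i, x_j) = \exp\!\left( -\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}} \right) \tag{25}$$
iii. MLP:
$$k(x_i, x_j) = \tanh\!\left( \beta_0 + \beta_1\, x_i^{T} x_j \right) \tag{26}$$
The Gaussian function is usually chosen, where $\sigma^{2}$ is the variance of the Gaussian function. If a polynomial kernel is selected, $p$ is the degree of the kernel polynomial, and in the MLP function $\beta_0$ and $\beta_1$ are constant coefficients.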
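The three kernel functions and the resulting decision rule can be sketched as follows. The exact form of the polynomial kernel (with the added constant 1), the default parameter values and the function names are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, p=2):
    # Assumed form: (u.v + 1)^p, with p the polynomial degree.
    return (dot(u, v) + 1.0) ** p

def gaussian_kernel(u, v, sigma=1.0):
    # exp(-||u - v||^2 / (2 sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mlp_kernel(u, v, beta0=0.0, beta1=1.0):
    # tanh(beta0 + beta1 * u.v)
    return math.tanh(beta0 + beta1 * dot(u, v))

def decide(x, support, alphas, labels, b, kernel):
    """Sign of sum_i alpha_i * y_i * k(x_i, x) + b."""
    s = sum(a * yi * kernel(xi, x)
            for a, yi, xi in zip(alphas, labels, support))
    return 1 if s + b >= 0 else -1

# Toy support vectors and coefficients (illustrative values only).
support = [(1.0, 1.0), (-1.0, -1.0)]
alphas = [0.25, 0.25]
labels = [1, -1]
print(decide((2.0, 0.0), support, alphas, labels, 0.0, gaussian_kernel))
print(decide((-3.0, 1.0), support, alphas, labels, 0.0, gaussian_kernel))
```

Only the kernel values between a new sample and the support vectors are needed for a decision, so the mapping into the higher-dimensional space never has to be computed explicitly.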
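Let $\xi_i \ge 0$ denote the error of sample $i$:
$$\min_{w, b, \xi}\; L = \frac{1}{2}w^{T}w + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i\,(w^{T}x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0$$
In the formula above, $L$ should be minimized, and $C$ is the penalty coefficient that reduces or amplifies the weight of the error term.
The corresponding Lagrangian is
$$L = \frac{1}{2}w^{T}w + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left[ y_i\,(w^{T}x_i + b) - 1 + \xi_i \right] - \sum_{i=1}^{n} \mu_i \xi_i$$
where the terms $\frac{1}{2}w^{T}w$ and $C \sum \xi_i$ are minimized, the terms $\sum \mu_i \xi_i$ and $\sum \alpha_i \left[ y_i\,(w^{T}x_i + b) - 1 + \xi_i \right]$ are maximized, and $\mu_i$ is a Lagrange coefficient. Setting the derivatives to zero therefore gives three conditions (35):
$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{n} \alpha_i y_i x_i, \qquad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{n} \alpha_i y_i = 0, \qquad \frac{\partial L}{\partial \xi_i} = 0 \Rightarrow C - \alpha_i - \mu_i = 0 \tag{35}$$
Since $\alpha_i \ge 0$ and $\mu_i \ge 0$, the following result is obtained, which is known as the box constraint:
$$0 \le \alpha_i \le C$$
In fact, the box constraint is the only difference between the soft and hard margins; for the hard margin, the coefficient $C$ is infinite.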
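The role of the error terms and of the penalty coefficient C can be illustrated numerically. In this sketch the error of a sample is taken as the smallest non-negative value that satisfies the relaxed constraint, i.e. max(0, 1 - y(w·x + b)); the data, weights and function names are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def slack_values(w, b, X, y):
    """Smallest xi_i >= 0 with y_i * (w.x_i + b) >= 1 - xi_i."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]

def soft_margin_objective(w, b, X, y, C):
    """0.5 * w.w + C * sum of the error terms."""
    return 0.5 * dot(w, w) + C * sum(slack_values(w, b, X, y))

# Two well-separated points and one point on the wrong side of the boundary.
X = [(1.0, 1.0), (-1.0, -1.0), (0.2, 0.2)]
y = [1, -1, -1]
w, b = (0.5, 0.5), 0.0

print(slack_values(w, b, X, y))
for C in (0.1, 1.0, 10.0):
    print(C, soft_margin_objective(w, b, X, y, C))
```

A larger C makes the same misclassified point more expensive, which pushes the solution toward the hard-margin behaviour.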
SVM is basically a binary classification method, but many problems require multiclass classifiers. In such cases, a multiclass problem can be reduced to several binary problems and solved by comparing the classes pairwise and combining their outputs [22].
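The pairwise reduction can be sketched as a voting scheme: one binary classifier is built for each pair of classes, and the class that wins the most pairwise comparisons is selected. The class labels and the threshold-based toy classifiers below are hypothetical stand-ins for trained binary SVMs.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, binary_classifiers):
    """Majority vote over all pairwise classifiers.

    binary_classifiers[(a, b)] is a function that returns either a or b.
    """
    votes = Counter(binary_classifiers[(a, b)](x)
                    for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]

# Toy pairwise classifiers on a single scalar feature (illustrative only).
classes = ["A", "B", "C"]
binary_classifiers = {
    ("A", "B"): lambda x: "A" if x < 1.0 else "B",
    ("A", "C"): lambda x: "A" if x < 2.0 else "C",
    ("B", "C"): lambda x: "B" if x < 3.0 else "C",
}

for value in (0.0, 2.5, 5.0):
    print(value, one_vs_one_predict(value, classes, binary_classifiers))
```

For k classes this scheme needs k(k-1)/2 binary classifiers, which stays small for the two-class problem considered in this paper.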
Figure 12 shows a flowchart of the design of the SVM classifier and the use of the genetic algorithm to compute optimal values for the filter coefficients and the sample selection.

Definition of Problem
The EEG signal is fed to the SVM with time lag T, after sample selection with sampling rate N and FIR filtering with polynomial degree O and frame size F, and the CCR is optimized.
In this section, the diagnosis accuracy for various values of the compression rate is shown in Figure 13, and for different values of the filter degree in Figure 14. As illustrated, the filter parameters applied to the received EEG signals, as well as the compression rate, strongly affect the classification accuracy. Artificial intelligence algorithms can be used to select optimal values for these parameters. To this end, a genetic algorithm was used to minimize the classification error, with the above parameters defined as the optimization variables.
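The compression step is not specified in detail, so the following sketch assumes the simplest form of sample selection, in which every n-th sample of a channel is kept. The function name `compress` and the example rate are hypothetical.

```python
def compress(signal, n):
    """Keep every n-th sample of a channel (assumed compression scheme)."""
    if n < 1:
        raise ValueError("compression rate must be a positive integer")
    return signal[::n]

# One channel of 86.8 s at 500 samples per second, filled with dummy values.
channel = list(range(43400))
reduced = compress(channel, 4)   # example rate only
print(len(channel), "->", len(reduced))
```

Keeping every n-th sample lowers the effective sampling rate by the same factor, so content above half of the new rate can alias into the retained band unless the signal is smoothed beforehand.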

Genetic Algorithm (GA)
The genetic algorithm, modeled on the principles of natural genetics, is an adaptive, robust and effective optimization method. A GA starts with a group of individuals (a population) that is usually generated randomly. Each individual in the population is represented by a single chromosome, and each chromosome is composed of a vector of elements called genes [24,25].
A GA solves problems by optimizing a single criterion, expressed as a fitness function. The value of the fitness function indicates how good each individual (i.e. each vector of values for the optimized parameters) is. To obtain better individuals (a new population from the previous one), operations such as selection, crossover and mutation are applied to the chromosomes [24,25].
Crossover exchanges information between parent chromosomes by recombining parts of their genetic material. Mutation is an arbitrary change in the genetic structure of a chromosome and creates genetic diversity within the population [24,25].
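The operators described above can be combined into a minimal genetic algorithm over integer genes. This is a sketch under stated assumptions (tournament selection, one-point crossover, random-reset mutation and elitism); the operators, rates and the toy fitness function are illustrative and are not the settings used in this study.

```python
import random

def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Maximise `fitness` over integer vectors limited by `bounds`."""
    rng = random.Random(seed)
    population = [[rng.randint(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def select():
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]               # elitism: keep the best so far
        while len(children) < pop_size:
            p1, p2 = select(), select()
            child = p1[:]
            if len(bounds) > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, len(bounds) - 1)
                child = p1[:cut] + p2[cut:]   # one-point crossover
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[g] = rng.randint(lo, hi)   # random-reset mutation
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best

# Toy fitness with a known optimum; it stands in for a classifier score.
TARGET = (5, 2, 7)
def toy_fitness(individual):
    return -sum((g - t) ** 2 for g, t in zip(individual, TARGET))

BOUNDS = [(1, 10), (1, 5), (3, 15)]
print(genetic_search(toy_fitness, BOUNDS))
```

Because the best individual is always copied into the next generation, the best fitness value never decreases from one generation to the next.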
In this paper, data extracted from the EEG device are used to diagnose epilepsy and the related abnormalities in the brain signal. Because of the long recording time, the data collected from the device were voluminous, which increased the computational load and reduced speed. Data compression was therefore used to avoid this. In addition, an FIR filter with polynomial degree O and frame length F was used to remove noise from the EEG signal.
The fitness function is defined so that the CCR value obtained from the SVM classifier is as close as possible to 100. Plots of the best fitness value and the best individual are shown in Figure 15.
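A fitness of this kind can be sketched as the distance of the CCR from 100. CCR is read here as the percentage of correctly classified cases, which is an assumption, and the function names are hypothetical.

```python
def ccr(true_labels, predicted_labels):
    """Percentage of correctly classified cases (assumed meaning of CCR)."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

def fitness_error(true_labels, predicted_labels):
    """Distance of the CCR from 100; an optimiser would minimise this."""
    return 100.0 - ccr(true_labels, predicted_labels)

# Illustrative labels: +1 for one class, -1 for the other.
truth = [1, 1, -1, -1]
prediction = [1, -1, -1, -1]
print(ccr(truth, prediction), fitness_error(truth, prediction))
```

In a full pipeline the predictions would come from the classifier evaluated on data preprocessed with the candidate values of N, O and F.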
The genetic algorithm yielded a value of 4 for variable N, and values of 3 and 9 for variables O and F, respectively. Using these results, a diagnosis accuracy of 100% was obtained for healthy subjects and patients.

Final Data Classification by Optimized Parameters
The total dataset is a 367*824600 matrix containing the EEG signals of 367 cases (249 healthy subjects and 118 patients). Of these, the data of 267 cases were used as training data and the data of 100 cases as test data. The SVM results were obtained with the values found by the genetic algorithm for the variables of the fitness function, N, O and F, namely 4, 3 and 9, respectively.
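The division into 267 training cases and 100 test cases can be sketched as a split of case indices. How the cases were actually assigned is not stated, so the random shuffle, the fixed seed and the function name below are assumptions.

```python
import random

def split_cases(n_cases=367, n_train=267, seed=0):
    """Randomly assign case indices to training and test sets."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

train_idx, test_idx = split_cases()
print(len(train_idx), len(test_idx))
```

Each index selects one row of the data matrix, so no case appears in both sets.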

Conclusion
In this paper, a genetic algorithm was used to optimize the results of the SVM algorithm for the prediction and diagnosis of epilepsy. Because of the high dimensionality and the noise in the data extracted from the EEG device, the data were prepared by compression and filtering before being classified into healthy and epileptic groups. The results of the SVM classification were optimized using the genetic algorithm, and a diagnosis accuracy of 100% was achieved. The diagnosis accuracy after removing each channel was then obtained and is shown in Figure 16. As can be seen, the accuracy is lowest when channels 4 and 8, which correspond to electrodes F8 and T4, are removed: 76% and 73%, respectively. These numbers suggest that these channels have the highest impact on the diagnosis of epilepsy in the proposed method. Channels 4 and 8 are related to the frontal and temporal areas, respectively, which indicates that these two areas had the highest impact on distinguishing healthy subjects from patients.