COMPUTATIONAL INTELLIGENCE & PR - ROWAN UNIVERSITY - FALL 2013


Early Diagnosis of Alzheimer's Disease and Feature Selection

Belhassen Bayar
Department of Electrical and Computer Engineering
Rowan University, Glassboro, New Jersey, USA
Email: [email protected]

Abstract—Feature selection has been shown to be useful in many applications. In particular, it has proven to significantly improve the accuracy of machine learning algorithms on certain databases. Feature selection is also very efficient for high dimensional data: handling the data in a lower dimensional form is preferable in terms of computation and time consumption. More specifically, we are interested in predicting the presence of a disease within a population of patients by considering different sets of features. To this aim, in this project we test two of the top 10 algorithms in machine learning [1], namely the Multilayer Perceptron (MLP) and Naive Bayes, to predict the presence of Alzheimer's disease within a population of patients. Moreover, we use feature selection to improve the capacity of each classifier.

Index Terms—Computational intelligence; pattern recognition; machine learning; feature selection.

I. INTRODUCTION

The detection of a disease in a given database is an important application of machine learning, and its importance stems from the improvements it can bring to medical practice. Much progress has been achieved in machine learning algorithms, notably through the use of feature extraction approaches, which have proven useful in terms of both accuracy and computational cost. Many such approaches have been proposed, including Principal Component Analysis (PCA) [2], Linear Discriminant Analysis (LDA) [3] and Singular Value Decomposition (SVD) [4]. A related technique, feature selection, has also shown good efficiency and comes in several variants, e.g., the filter approach and the wrapper approach [5]. In this project we are interested in detecting the presence of Alzheimer's disease within a set of 71 patients, and we assess the performance of the MLP and Naive Bayes [1] classifiers using a filter based feature selection approach.

More specifically, we are given data in which each subject is represented by his or her EEG signal, collected through 19 electrodes placed at different head locations. The data are collected under different test conditions and filtered into 4 different frequency ranges. Our main task is to determine the sets of features that best represent each subject and hence increase the capacity of the classifiers. In the second part of this project we use a variance filter based approach to determine the most significant features in the data. To assess performance, we run a 10-fold cross-validation test for every considered algorithm, and we also compute the sensitivity, specificity and positive predictive value for each experiment. This report is organized as follows. In Section II we briefly define the measures used in this report and give an overview of the classifiers. In Section III we introduce Alzheimer's disease and the structure of the dataset, and then present our simulation results comparing the performance of the algorithms for different feature sets. Finally, in Section IV we briefly discuss the most important results of this project and the points that should be investigated in future work.

II. PRELIMINARIES AND DEFINITIONS

A. Sensitivity, Specificity and Related Measures

We present in this section a brief definition of the test statistics used in this report [6]. The considered approaches are assessed by computing the accuracy of disease recognition within a set of subjects, so we first give the definitions needed to compute the different quantitative measures in this work. The considered classification techniques are used here to recognize whether or not a subject has Alzheimer's disease. A diagnosis is positive when the patient has the disease; otherwise, it is negative. A diagnosis is a true positive (TP) when the test is positive and the disease is present, and a false positive (FP) when the test is positive and the disease is absent. A diagnosis is a false negative (FN) when the test is negative and the disease is present, and a true negative (TN) when the test is negative and the disease is absent. We denote by nP the total number of positive tests (TP + FP) and by nN the total number of negative tests (TN + FN); nD is the total number of subjects with the disease (TP + FN), nC is the total number without the disease (TN + FP), and n is the total sample size (TP + FP + FN + TN). Table I shows the fundamental table for these definitions; the matrix in this table is the confusion matrix, which summarizes the performance of a classifier.

The terms sensitivity (Se) and specificity (Sp) have their origins in screening tests for diseases. Sensitivity is the probability that the test says a person has the disease when in fact they do have it; it measures how likely a test is to pick up the presence of a disease in a person who has it. Its expression is Se = TP / (TP + FN) = TP / nD. Specificity is the probability that the test says a person does not have the disease when in fact they are disease free; its expression is Sp = TN / (FP + TN) = TN / nC. Ideally, a test should have both high sensitivity and high specificity, but there is sometimes a trade-off: making a test highly sensitive can lower its specificity. In general we can keep both sensitivity and specificity high in screening tests, but we still get false positives and false negatives. The population prevalence is defined as Pre = (TP + FN) / (TP + FP + FN + TN) = nD / n. One of the most important measures is the positive predictive value (PPV), the proportion of true positives among all positive tests, PPV = TP / (TP + FP) = TP / nP; it is meaningful only if the population prevalence is well estimated by nD / n.
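To make these definitions concrete, the short sketch below translates them directly into code. It is an illustration added for the reader, not part of the original experiments, and the function and variable names are ours.

```python
# Illustrative only: direct translation of the definitions above into code.

def diagnostic_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute Se, Sp, PPV and prevalence from confusion-matrix counts."""
    n_d = tp + fn          # subjects with the disease
    n_c = fp + tn          # subjects without the disease
    n_p = tp + fp          # positive test results
    n = tp + fp + fn + tn  # total sample size
    return {
        "sensitivity": tp / n_d,   # Se  = TP / (TP + FN)
        "specificity": tn / n_c,   # Sp  = TN / (FP + TN)
        "ppv": tp / n_p,           # PPV = TP / (TP + FP)
        "prevalence": n_d / n,     # Pre = (TP + FN) / n
    }

# Example with made-up counts: 20 TP, 5 FP, 14 FN, 32 TN.
print(diagnostic_measures(tp=20, fp=5, fn=14, tn=32))
```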

B. Classifiers

1) Naive Bayes: Bayesian classifiers are statistical classifiers that can predict class membership probabilities, such as the probability that a given sample belongs to a particular class. Bayesian classification is based on Bayes' theorem [7]. Studies comparing classification algorithms have found the simple Naive Bayesian classifier to be comparable in performance with the neural network classifier described next, and Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases. Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes [7]. This assumption, called class-conditional independence, is made to simplify the computations involved and is, in this sense, considered naive.

2) Multilayer Perceptron: The multilayer perceptron (MLP) is a network of perceptrons (artificial neurons) arranged in a feed-forward-only topology. The feed-forward topology is often a requirement imposed on a neural network due to limitations in the learning algorithms used [8], [1]. The feed-forward limitation leads to a topological layering of the neurons in the network, which in turn leads to hidden layers, another topological feature of MLPs. While natural brains do organize geometrically into layers, they also allow a great deal of topological feedback between layers and, through the outside world, between inputs and outputs.
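As an illustration of the class-conditional independence assumption described above, the following sketch implements a generic Gaussian Naive Bayes decision rule from scratch. This is not the implementation used in this project; the class name and the 71 x 257 toy data shapes are assumptions made for the example.

```python
import numpy as np

class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes: the class-conditional likelihood of a
    feature vector is the product of independent per-feature Gaussians."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {c: np.mean(y == c) for c in self.classes_}
        self.means_ = {c: X[y == c].mean(axis=0) for c in self.classes_}
        self.vars_ = {c: X[y == c].var(axis=0) + 1e-9 for c in self.classes_}
        return self

    def _log_likelihood(self, X, c):
        # Sum of per-feature log Gaussian densities == log of their product.
        var, mean = self.vars_[c], self.means_[c]
        return (-0.5 * np.log(2.0 * np.pi * var)
                - 0.5 * (X - mean) ** 2 / var).sum(axis=1)

    def predict(self, X):
        scores = np.column_stack(
            [np.log(self.priors_[c]) + self._log_likelihood(X, c)
             for c in self.classes_])
        return self.classes_[np.argmax(scores, axis=1)]

# Toy usage with random data shaped like one electrode dataset (71 subjects x 257 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(71, 257))
y = rng.integers(0, 2, size=71)
print(GaussianNaiveBayes().fit(X, y).predict(X[:5]))
```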

III. SIMULATION RESULTS

In this section we first introduce Alzheimer's disease and the structure of the considered dataset; we then present our main experimental results.

A. Alzheimer's Disease

Alzheimer's disease (AD) [9] is a slowly progressive disease of the brain characterized by impairment of memory and, eventually, by disturbances in reasoning, planning, language, and perception. Many scientists believe that Alzheimer's disease results from an increase in the production or accumulation of a specific protein (beta-amyloid protein) in the brain that leads to nerve cell death. The likelihood of having Alzheimer's disease increases substantially after the age of 70 and may affect around 50% of persons over the age of 85 [9]. Nonetheless, Alzheimer's disease is not a normal part of aging and is not something that inevitably happens in later life; many people live to over 100 years of age and never develop it. The main risk factor for Alzheimer's disease is increased age: as a population ages, the frequency of the disease continues to increase. There are also genetic risk factors. Most patients develop Alzheimer's disease after age 70, but 2%-5% of patients develop the disease in their 40s or 50s [9]. Other risk factors include high blood pressure (hypertension), coronary artery disease, diabetes, and possibly elevated blood cholesterol.

B. Alzheimer Dataset

The AD dataset [7] contains 71 patients, 34 of whom are diagnosed with Alzheimer's disease. The data consist of EEG signals acquired from every patient. Each raw record is 257 points long, originally sampled at 256 Hz, and is the average of 90 individual event-related potentials obtained through a protocol known as the oddball paradigm, in which the subjects listen to a series of tones: most at 1 kHz (the non-target tones), some at 2 kHz (the target or oddball tones), and some novel sounds, which are short sound clips from Disney movies. Of the stimuli, 65% are standard tones, 20% are targets and 15% are novel sounds.


TABLE I
FUNDAMENTAL TABLE

|                   | disease (D)  | no disease (C) | total        |
| test positive (P) | TP           | FP             | nP = TP + FP |
| test negative (N) | FN           | TN             | nN = FN + TN |
| total             | nD = TP + FN | nC = FP + TN   | n            |

TABLE II
FREQUENCY FILTER BASED APPROACH: CLASSIFICATION ACCURACY OF MLP CLASSIFIER FOR DWT DATA

| Electrode | Frequency (Hz) | Sound | Se (%) | Sp (%) | PPV (%) | Performance (%) |
| P3  | 1-2 | nov | 75.00 ± 11.28 | 74.66 ± 8.80  | 64.16 ± 13.18 | 79.82 ± 2.58 |
| FP1 | 1-2 | nov | 77.00 ± 12.37 | 72.66 ± 11.10 | 67.50 ± 12.02 | 76.48 ± 7.47 |
| P4  | 0-1 | nov | 81.33 ± 10.72 | 81.50 ± 10.80 | 71.66 ± 16.57 | 75.98 ± 6.16 |
| FZ  | 0-1 | trg | 65.16 ± 10.14 | 68.00 ± 10.60 | 71.66 ± 8.60  | 75.62 ± 5.79 |
| P8  | 0-1 | nov | 75.16 ± 11.38 | 68.38 ± 7.60  | 60.83 ± 11.29 | 75.47 ± 5.23 |
| P7  | 2-4 | trg | 62.16 ± 9.96  | 69.50 ± 8.90  | 61.66 ± 13.49 | 74.37 ± 4.75 |
| T8  | 1-2 | trg | 52.00 ± 11.97 | 63.66 ± 10.75 | 57.50 ± 16.36 | 73.84 ± 4.37 |
| O2  | 1-2 | nov | 67.83 ± 14.33 | 74.00 ± 13.06 | 64.16 ± 18.22 | 73.83 ± 9.85 |
| C4  | 1-2 | nov | 69.16 ± 10.94 | 75.16 ± 12.82 | 69.16 ± 14.70 | 73.55 ± 5.24 |
| F8  | 1-2 | nov | 67.33 ± 11.12 | 71.54 ± 8.87  | 67.50 ± 12.02 | 72.96 ± 4.01 |
| FP2 | 2-4 | nov | 67.33 ± 12.19 | 67.54 ± 9.50  | 63.33 ± 12.75 | 72.84 ± 3.29 |
| C3  | 2-4 | trg | 66.66 ± 13.94 | 65.00 ± 11.24 | 58.33 ± 13.88 | 72.74 ± 6.03 |
| CZ  | 1-2 | nov | 62.50 ± 14.33 | 64.00 ± 9.92  | 53.33 ± 14.19 | 72.34 ± 8.92 |
| FZ  | 2-4 | trg | 52.16 ± 15.50 | 70.33 ± 11.89 | 60.00 ± 19.22 | 72.25 ± 4.01 |
| F3  | 0-1 | nov | 57.50 ± 13.38 | 64.83 ± 10.55 | 57.50 ± 13.38 | 72.10 ± 6.22 |
| T8  | 0-1 | nov | 65.83 ± 10.37 | 64.38 ± 11.77 | 60.83 ± 12.43 | 72.00 ± 8.02 |
| PZ  | 0-1 | nov | 77.66 ± 10.50 | 68.66 ± 6.61  | 61.66 ± 10.24 | 71.43 ± 7.63 |
| O1  | 2-4 | trg | 71.00 ± 13.74 | 63.66 ± 14.72 | 63.33 ± 9.66  | 71.24 ± 5.93 |
| F4  | 1-2 | nov | 63.50 ± 19.65 | 64.38 ± 10.56 | 47.50 ± 17.57 | 71.20 ± 4.94 |
| P4  | 1-2 | trg | 57.50 ± 15.01 | 65.66 ± 13.61 | 60.00 ± 18.29 | 71.13 ± 6.61 |
| C3  | 0-1 | nov | 68.00 ± 12.60 | 64.83 ± 9.18  | 55.83 ± 11.11 | 71.07 ± 6.19 |
| F3  | 0-1 | trg | 54.00 ± 13.32 | 56.00 ± 9.93  | 56.66 ± 11.41 | 70.96 ± 5.45 |
| O1  | 0-1 | nov | 61.83 ± 8.13  | 61.88 ± 3.99  | 55.83 ± 9.43  | 70.34 ± 5.36 |
| P8  | 1-2 | trg | 68.66 ± 12.12 | 62.54 ± 18.45 | 61.66 ± 15.36 | 70.33 ± 4.88 |
| FP2 | 1-2 | trg | 62.50 ± 9.45  | 78.16 ± 13.76 | 72.50 ± 16.55 | 70.22 ± 9.58 |
| T7  | 1-2 | trg | 58.16 ± 10.47 | 58.00 ± 10.36 | 56.66 ± 8.28  | 70.05 ± 7.99 |
| F7  | 2-4 | nov | 57.50 ± 15.64 | 67.33 ± 12.59 | 58.33 ± 17.78 | 69.64 ± 4.13 |
| F8  | 4-8 | trg | 63.33 ± 12.45 | 59.66 ± 7.71  | 47.50 ± 9.83  | 69.40 ± 5.61 |
| T7  | 4-8 | nov | 61.50 ± 15.48 | 71.04 ± 11.60 | 57.50 ± 17.61 | 69.01 ± 5.65 |
| PZ  | 1-2 | trg | 58.50 ± 8.66  | 62.38 ± 8.04  | 60.00 ± 12.23 | 68.17 ± 7.57 |
| F4  | 1-2 | trg | 52.50 ± 13.24 | 60.66 ± 11.54 | 56.66 ± 15.93 | 67.85 ± 8.55 |
| F8  | 1-2 | trg | 53.33 ± 5.77  | 60.83 ± 15.59 | 64.16 ± 11.46 | 66.98 ± 4.59 |
| FZ  | 0-1 | nov | 63.16 ± 12.22 | 61.00 ± 8.88  | 51.66 ± 11.91 | 66.43 ± 10.83 |
| O2  | 0-1 | trg | 42.66 ± 16.53 | 58.33 ± 9.30  | 38.33 ± 17.25 | 65.90 ± 5.61 |
| FP1 | 2-4 | trg | 50.00 ± 8.09  | 58.33 ± 11.01 | 54.16 ± 12.46 | 65.43 ± 8.56 |
| C4  | 4-8 | trg | 44.16 ± 15.71 | 48.50 ± 10.21 | 35.83 ± 13.18 | 64.39 ± 5.11 |
| P7  | 2-4 | nov | 52.66 ± 17.08 | 41.16 ± 15.25 | 45.83 ± 10.98 | 61.10 ± 7.88 |
| CZ  | 2-4 | trg | 53.00 ± 12.65 | 64.16 ± 15.71 | 72.50 ± 14.57 | 60.50 ± 9.29 |

The experiments are as follows: the subjects are asked to press a button every time they hear a target tone, which creates the event-related potentials (ERPs). ERPs are known to have a positive peak around 300 ms after the beep that is fairly strong in normal individuals, but weak or much later in people with deteriorated mental ability; this occurs only in response to target tones and novel sounds. Data are collected from 19 electrodes placed at different head locations, e.g., Pz, Cz, C4, T2, F3, etc., where the letters indicate the region, such as 'F' for frontal, 'P' for parietal and 'C' for central. The numbers indicate the relative location: even numbers on the right side of the brain, odd numbers on the left, and 'z' the center line.

To test the performance of the considered classifiers we use a frequency filter based feature selection with 4 different frequency ranges, i.e., L01 within 0-1 Hz, L12 within 1-2 Hz, L24 within 2-4 Hz and L48 within 4-8 Hz. In this first experiment we only consider the discrete wavelet transform (DWT) coefficients of the data. For the second experiment we use a variance filter based feature selection approach [10] and test it on the raw average data (avg).
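The report does not state the mother wavelet or decomposition level used to obtain the DWT coefficients. The sketch below shows one plausible way to group DWT coefficients into the four bands with PyWavelets, assuming a 'db4' wavelet, a 7-level decomposition and the nominal 256 Hz sampling rate, so that the approximation and the three coarsest detail levels nominally cover 0-1, 1-2, 2-4 and 4-8 Hz; these choices are assumptions, not the report's actual settings.

```python
import numpy as np
import pywt

fs = 256.0
# Stand-in for one electrode's averaged ERP (257 samples); real data would be loaded here.
erp = np.random.default_rng(1).normal(size=257)

# With only 257 samples the deepest levels are short and boundary-affected;
# PyWavelets will warn about the level being high for this signal length.
coeffs = pywt.wavedec(erp, "db4", level=7)   # [cA7, cD7, cD6, cD5, cD4, cD3, cD2, cD1]
bands = {
    "0-1 Hz": coeffs[0],   # approximation at level 7: ~0 to fs / 2**8 = 1 Hz
    "1-2 Hz": coeffs[1],   # detail at level 7: ~1 to 2 Hz
    "2-4 Hz": coeffs[2],   # detail at level 6: ~2 to 4 Hz
    "4-8 Hz": coeffs[3],   # detail at level 5: ~4 to 8 Hz
}
for name, c in bands.items():
    print(name, "->", len(c), "coefficients")
```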


TABLE III
FREQUENCY FILTER BASED APPROACH: CLASSIFICATION ACCURACY OF NAIVE BAYES CLASSIFIER FOR DWT DATA

| Electrode | Frequency (Hz) | Sound | Se (%) | Sp (%) | PPV (%) | Performance (%) |
| PZ  | 2-4 | nov | 78.33 ± 11.81 | 76.00 ± 9.41  | 67.50 ± 13.09 | 76.27 ± 6.29 |
| F8  | 1-2 | nov | 63.66 ± 12.26 | 64.00 ± 12.74 | 60.00 ± 13.29 | 75.57 ± 4.42 |
| F4  | 0-1 | nov | 60.33 ± 13.13 | 66.00 ± 10.98 | 64.16 ± 12.88 | 75.35 ± 4.33 |
| P8  | 1-2 | trg | 57.66 ± 13.98 | 67.66 ± 17.91 | 68.33 ± 16.87 | 74.00 ± 4.45 |
| P4  | 0-1 | nov | 68.66 ± 16.23 | 70.33 ± 11.92 | 58.33 ± 17.78 | 73.81 ± 6.47 |
| PZ  | 0-1 | trg | 65.66 ± 15.11 | 70.71 ± 11.00 | 60.00 ± 17.32 | 73.11 ± 7.27 |
| C4  | 2-4 | nov | 50.33 ± 17.05 | 57.66 ± 8.50  | 50.00 ± 14.56 | 72.96 ± 3.88 |
| P3  | 1-2 | trg | 59.50 ± 14.88 | 62.83 ± 11.58 | 48.33 ± 17.32 | 72.84 ± 5.08 |
| O1  | 2-4 | trg | 64.50 ± 19.52 | 65.66 ± 11.02 | 48.33 ± 18.29 | 72.62 ± 7.24 |
| C3  | 0-1 | trg | 53.66 ± 11.01 | 63.00 ± 14.73 | 55.00 ± 17.14 | 72.50 ± 4.33 |
| FP1 | 0-1 | nov | 80.16 ± 11.05 | 71.66 ± 5.93  | 60.83 ± 11.29 | 72.10 ± 10.91 |
| O1  | 1-2 | nov | 66.00 ± 11.33 | 68.83 ± 12.64 | 66.66 ± 13.74 | 72.05 ± 5.27 |
| P3  | 0-1 | nov | 68.83 ± 12.86 | 64.16 ± 15.18 | 65.83 ± 12.64 | 71.87 ± 5.49 |
| P8  | 2-4 | nov | 66.66 ± 12.01 | 69.33 ± 10.16 | 68.33 ± 10.54 | 71.66 ± 6.05 |
| FZ  | 2-4 | nov | 62.50 ± 8.49  | 71.83 ± 10.55 | 67.50 ± 13.09 | 71.46 ± 5.92 |
| F7  | 0-1 | nov | 36.16 ± 18.01 | 59.00 ± 12.24 | 40.00 ± 20.95 | 71.46 ± 1.82 |
| CZ  | 1-2 | trg | 62.33 ± 16.92 | 67.16 ± 12.14 | 58.33 ± 15.95 | 71.43 ± 5.27 |
| F7  | 2-4 | trg | 60.16 ± 10.16 | 65.33 ± 10.48 | 62.50 ± 11.98 | 71.41 ± 6.09 |
| O2  | 2-4 | trg | 53.33 ± 10.69 | 51.83 ± 13.46 | 59.16 ± 11.69 | 71.28 ± 7.42 |
| F3  | 4-8 | nov | 55.16 ± 15.37 | 60.50 ± 12.88 | 56.66 ± 15.93 | 71.15 ± 5.56 |
| T7  | 1-2 | nov | 57.00 ± 16.12 | 57.00 ± 12.86 | 51.66 ± 15.44 | 71.12 ± 5.20 |
| FP1 | 1-2 | trg | 50.00 ± 15.21 | 43.33 ± 13.50 | 44.16 ± 12.27 | 70.87 ± 3.04 |
| P4  | 1-2 | trg | 68.50 ± 10.68 | 69.50 ± 8.90  | 62.50 ± 13.35 | 70.66 ± 6.49 |
| F8  | 2-4 | trg | 52.66 ± 10.61 | 63.00 ± 9.56  | 53.33 ± 15.36 | 70.48 ± 3.86 |
| O2  | 0-1 | nov | 52.50 ± 21.34 | 42.33 ± 12.57 | 37.50 ± 13.35 | 70.37 ± 4.82 |
| T7  | 0-1 | trg | 55.66 ± 15.64 | 63.16 ± 9.46  | 57.50 ± 14.36 | 70.06 ± 7.79 |
| C4  | 0-1 | trg | 65.16 ± 12.36 | 58.83 ± 18.14 | 66.66 ± 11.62 | 69.09 ± 7.68 |
| CZ  | 0-1 | nov | 63.16 ± 12.68 | 64.33 ± 8.08  | 58.33 ± 12.57 | 69.08 ± 9.52 |
| C3  | 2-4 | nov | 59.66 ± 11.90 | 72.66 ± 16.86 | 78.33 ± 12.45 | 69.03 ± 7.97 |
| P7  | 2-4 | trg | 54.33 ± 12.93 | 51.83 ± 10.93 | 49.16 ± 11.18 | 69.00 ± 4.62 |
| FP2 | 1-2 | nov | 72.66 ± 16.86 | 69.11 ± 10.24 | 55.83 ± 15.71 | 68.70 ± 10.26 |
| P7  | 1-2 | nov | 63.83 ± 11.13 | 64.50 ± 7.55  | 60.83 ± 10.02 | 68.55 ± 6.69 |
| F4  | 4-8 | trg | 50.50 ± 11.39 | 52.50 ± 13.24 | 50.83 ± 12.64 | 68.16 ± 6.09 |
| FP2 | 1-2 | trg | 58.33 ± 9.30  | 64.83 ± 10.64 | 66.66 ± 11.28 | 67.93 ± 13.13 |
| F3  | 1-2 | trg | 63.33 ± 14.72 | 62.71 ± 10.78 | 53.33 ± 14.19 | 67.28 ± 6.93 |
| T8  | 0-1 | trg | 47.00 ± 12.84 | 50.66 ± 12.44 | 45.00 ± 15.23 | 66.66 ± 6.48 |
| FZ  | 4-8 | trg | 50.83 ± 5.99  | 58.33 ± 14.29 | 70.83 ± 10.06 | 65.85 ± 4.78 |
| T8  | 1-2 | nov | 43.50 ± 13.05 | 55.83 ± 15.14 | 50.00 ± 18.10 | 61.61 ± 7.37 |

C. Results

Our simulation study is divided into two parts. First we use the DWT data to assess the performance of the underlying classifiers for the different filtered frequency ranges. Then we use a different approach in which we filter the data according to the variances of the features [10], discarding the lowest ones. All simulations are run using 10-fold cross-validation, and every statistic is reported as the mean over the 10 folds. The confidence interval (CI) is taken as half of the standard deviation, σ/2.

1) Frequency Filter based Feature Selection Approach: We test the MLP and Naive Bayes classifiers on the different filtered datasets for the 19 electrodes. The data contain only the DWT of the original recordings in which the subjects listened to the novel and target sounds. In total we have 152 different datasets, and we are interested in finding the sets of features that best represent the data for classification purposes. Recall that the main goal is to recognize the patients that have AD within a set of samples. To assess the performance of each classifier and the significance of the considered frequency ranges, we compute the test statistics defined above. Table II summarizes the results obtained with the MLP classifier on the DWT data. For every electrode we report the feature set that maximizes the performance of the classifier, out of the 4 sets corresponding to the different filtered frequency ranges.
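The following is a minimal sketch of this evaluation protocol using scikit-learn (our own code, not the project's original scripts; the classifier settings and random data are assumptions). It accumulates the confusion matrix per fold and reports each statistic as its mean over the ten folds ± σ/2, matching the CI convention above.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

def evaluate(model, X, y, n_splits=10, seed=0):
    """10-fold CV; returns (mean, sigma/2) for accuracy, Se, Sp and PPV."""
    stats = {"acc": [], "se": [], "sp": [], "ppv": []}
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in skf.split(X, y):
        model.fit(X[train_idx], y[train_idx])
        y_hat = model.predict(X[test_idx])
        tn, fp, fn, tp = confusion_matrix(y[test_idx], y_hat, labels=[0, 1]).ravel()
        stats["acc"].append((tp + tn) / (tp + tn + fp + fn))
        stats["se"].append(tp / max(tp + fn, 1))   # guard against empty classes
        stats["sp"].append(tn / max(tn + fp, 1))
        stats["ppv"].append(tp / max(tp + fp, 1))
    return {k: (np.mean(v), np.std(v) / 2) for k, v in stats.items()}  # CI = sigma / 2

# Toy usage on random data shaped like one electrode dataset (71 subjects, AD label in y).
rng = np.random.default_rng(0)
X, y = rng.normal(size=(71, 257)), rng.integers(0, 2, size=71)
for name, clf in [("Naive Bayes", GaussianNB()),
                  ("MLP", MLPClassifier(max_iter=2000, random_state=0))]:
    print(name, evaluate(clf, X, y))
```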


TABLE IV
VARIANCE FILTER BASED APPROACH: CLASSIFICATION ACCURACY OF MLP CLASSIFIER FOR RAW AVERAGE DATA

| Electrode | Nbr. of Features | Se (%) | Sp (%) | PPV (%) | Performance (%) |
| C4  | 64  | 65.71 ± 15.29 | 64.83 ± 10.55 | 50.83 ± 14.22 | 75.60 ± 3.20 |
| FP1 | 51  | 58.50 ± 17.12 | 60.29 ± 11.36 | 43.33 ± 16.17 | 72.91 ± 1.58 |
| O1  | 103 | 68.17 ± 16.83 | 59.55 ± 4.15  | 40.83 ± 11.69 | 75.24 ± 2.37 |
| P8  | 193 | 57.33 ± 16.02 | 58.67 ± 14.93 | 54.17 ± 16.11 | 69.64 ± 6.24 |
| P4  | 39  | 48.50 ± 14.10 | 50.67 ± 12.21 | 45.00 ± 14.05 | 70.35 ± 2.72 |
| P7  | 39  | 52.50 ± 15.72 | 47.50 ± 17.73 | 51.67 ± 16.29 | 72.25 ± 3.85 |
| F4  | 26  | 69.67 ± 15.99 | 69.33 ± 9.15  | 55.83 ± 15.47 | 77.48 ± 2.01 |
| FZ  | 64  | 65.83 ± 15.70 | 66.67 ± 8.98  | 59.17 ± 12.79 | 77.18 ± 3.36 |
| PZ  | 116 | 62.50 ± 11.57 | 66.33 ± 12.13 | 65.00 ± 14.54 | 72.13 ± 6.52 |
| P3  | 180 | 55.33 ± 8.88  | 73.33 ± 10.35 | 46.33 ± 13.43 | 72.42 ± 4.90 |
| F7  | 26  | 74.33 ± 12.59 | 64.67 ± 14.30 | 63.33 ± 13.77 | 75.64 ± 4.55 |
| T8  | 103 | 73.83 ± 10.41 | 77.83 ± 8.13  | 65.00 ± 14.80 | 76.18 ± 2.71 |
| CZ  | 77  | 66.67 ± 17.12 | 68.67 ± 9.49  | 51.67 ± 16.52 | 72.78 ± 5.58 |
| F3  | 39  | 54.50 ± 15.02 | 59.33 ± 10.41 | 52.50 ± 14.83 | 72.22 ± 4.72 |
| O2  | 26  | 61.67 ± 15.11 | 68.67 ± 9.49  | 55.00 ± 16.34 | 71.60 ± 4.73 |
| FP2 | 39  | 75.33 ± 10.91 | 73.83 ± 8.9   | 65.83 ± 12.33 | 79.37 ± 1.49 |
| F8  | 39  | 71.83 ± 8.76  | 74.33 ± 7.10  | 67.50 ± 11.69 | 74.38 ± 4.41 |
| C3  | 64  | 60.50 ± 16.43 | 54.33 ± 14.38 | 53.33 ± 12.71 | 72.02 ± 4.24 |
| T7  | 206 | 79.50 ± 9.06  | 83.38 ± 9.84  | 83.38 ± 9.84  | 77.43 ± 3.33 |

TABLE V
VARIANCE FILTER BASED APPROACH: CLASSIFICATION ACCURACY OF NAIVE BAYES CLASSIFIER FOR RAW AVERAGE DATA

| Electrode | Nbr. of Features | Se (%) | Sp (%) | PPV (%) | Performance (%) |
| C4  | 193 | 39.44 ± 4.90  | 90.00 ± 3.14 | 78.30 ± 6.84  | 66.51 ± 3.00 |
| FP1 | 180 | 83.65 ± 2.31  | 27.28 ± 4.09 | 52.21 ± 1.56  | 54.59 ± 1.72 |
| O1  | 218 | 21.2 ± 5.39   | 98.68 ± 0.86 | 91.50 ± 5.50  | 61.56 ± 3.46 |
| P8  | 257 | 41.18 ± 5.77  | 91.81 ± 3.12 | 84.29 ± 5.53  | 68.39 ± 1.62 |
| P4  | 206 | 54.47 ± 4.65  | 76.38 ± 4.93 | 68.23 ± 6.17  | 65.90 ± 3.74 |
| P7  | 257 | 52.23 ± 3.42  | 90.91 ± 2.64 | 84.16 ± 4.26  | 72.44 ± 2.72 |
| F4  | 231 | 39.70 ± 5.07  | 95.93 ± 2.02 | 90.22 ± 4.65  | 68.39 ± 2.95 |
| FZ  | 257 | 63.37 ± 2.71  | 86.03 ± 3.06 | 80.87 ± 3.92  | 75.44 ± 2.03 |
| PZ  | 103 | 50.58 ± 9.38  | 86.39 ± 5.06 | 77.44 ± 8.08  | 68.43 ± 6.59 |
| P3  | 206 | 59.70 ± 4.22  | 77.87 ± 1.80 | 70.91 ± 2.27  | 69.33 ± 2.13 |
| F7  | 26  | 58.46 ± 3.62  | 87.47 ± 2.48 | 80.52 ± 3.02  | 73.96 ± 2.74 |
| T8  | 231 | 48.50 ± 8.58  | 80.63 ± 2.72 | 68.08 ± 1.37  | 65.53 ± 2.80 |
| CZ  | 180 | 55.95 ± 4.19  | 91.22 ± 2.44 | 85.42 ± 3.92  | 74.11 ± 2.94 |
| F3  | 218 | 54.47 ± 3.64  | 83.05 ± 3.25 | 74.59 ± 4.41  | 69.50 ± 3.37 |
| O2  | 257 | 50.48 ± 4.20  | 64.93 ± 2.18 | 56.99 ± 3.21  | 58.02 ± 2.19 |
| FP2 | 231 | 38.27 ± 11.72 | 92.56 ± 2.04 | 80.21 ± 4.60  | 66.37 ± 6.20 |
| F8  | 193 | 46.74 ± 5.16  | 87.96 ± 2.48 | 77.68 ± 4.28  | 68.13 ± 3.50 |
| C3  | 206 | 63.58 ± 5.04  | 80.65 ± 5.42 | 77.68 ± 6.31  | 71.91 ± 3.39 |
| T7  | 206 | 22.55 ± 3.89  | 95.67 ± 2.05 | 72.78 ± 10.18 | 55.03 ± 2.32 |

We combine the results of both the novel and target experiments and list them in decreasing order of classifier performance. Our experimental results show that the low frequency ranges, i.e., 0-1 Hz and 1-2 Hz, yield good accuracy for the MLP classifier. Observe that the best recognition rate equals 79.82% for the P3 electrode and the worst equals 60.50% for the CZ electrode. We have also computed the Sp, Se and PPV for every experiment, together with their CIs. Similarly, Table III summarizes the results obtained with the Naive Bayes classifier; on average the two classifiers perform similarly. Observe that the best recognition rate equals 76.27% for the PZ electrode, whereas the worst equals 61.11% for the T8 electrode.
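The per-electrode selection and ordering used in Tables II and III can be sketched as follows. The result records here are illustrative dummy values, not taken from the tables: for each electrode only the frequency/sound combination with the highest mean performance is kept, and electrodes are then listed in decreasing order of that performance.

```python
# Hypothetical records: (electrode, frequency band, sound, mean performance %).
results = [
    ("P3", "0-1", "nov", 71.1), ("P3", "1-2", "nov", 79.8),
    ("CZ", "1-2", "trg", 65.2), ("CZ", "2-4", "nov", 60.9),
]

# Keep, for each electrode, the best-performing feature set.
best = {}
for electrode, band, sound, perf in results:
    if electrode not in best or perf > best[electrode][2]:
        best[electrode] = (band, sound, perf)

# One row per electrode, sorted by decreasing performance.
for electrode, (band, sound, perf) in sorted(best.items(), key=lambda kv: -kv[1][2]):
    print(f"{electrode}: {band} Hz, {sound}, {perf:.2f}%")
```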


Fig. 1. Number of Features versus Classification Rate of MLP for raw average data. (Plot of classification rate, in %, against the number of features, from 26 to 257, with one curve per electrode.)

Fig. 2. Number of Features versus Classification Rate of Naive Bayes for raw average data. (Plot of classification rate, in %, against the number of features, from 26 to 257, with one curve per electrode.)

2) Variance Filter based Feature Selection Approach: In this part of the project we consider a different dataset, the raw average data. For every electrode we filter the corresponding dataset by discarding the features whose variance falls below a given percentile threshold. We decrease this percentile from 90 to 0 in steps of 5, which yields 19 different datasets for every electrode; when the percentile is set to 0 the data is the original one with 257 features. In total we therefore have 19 × 19 = 361 datasets. To apply this approach we use the 'genevarfilter' function from the Bioinformatics Toolbox in MATLAB [10]. Figures 1 and 2 show the performance of the MLP and Naive Bayes classifiers, respectively, over the 19 different sets of features; we plot the classification rate versus the number of features for the 19 electrodes.
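The report performs this step with MATLAB's genevarfilter; the sketch below is a rough NumPy analogue of the same idea (our code, not a drop-in replacement for that function), sweeping the percentile from 90 down to 0 in steps of 5 as in the experiment above.

```python
import numpy as np

def variance_percentile_filter(X, percentile):
    """Keep only the columns of X whose variance is at or above the given
    percentile of all column variances (percentile=0 keeps every feature)."""
    variances = X.var(axis=0)
    threshold = np.percentile(variances, percentile)
    mask = variances >= threshold
    return X[:, mask], mask

# Toy usage on random data shaped like one electrode's raw average dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(71, 257))
for p in range(90, -1, -5):           # 90, 85, ..., 0: the 19 percentile settings
    X_kept, _ = variance_percentile_filter(X, p)
    print(f"percentile {p:2d}: {X_kept.shape[1]} features kept")
```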

Table IV summarizes the results of Fig. 1 for the MLP classifier, where we only show the best result for every electrode together with the corresponding number of features. Observe that the best classification rate equals 79.37% for the FP2 electrode, using only the 39 most significant features. Similarly, Table V summarizes the results of Fig. 2 for the Naive Bayes classifier. Note that this classifier requires more features in order to recognize AD: its best performance equals 75.44% for the FZ electrode with all 257 features.

IV. CONCLUSION AND DISCUSSION

In this project we have tested the capacity of the MLP and Naive Bayes classifiers on different sets of features extracted from a given Alzheimer's dataset. We first tested both algorithms on the feature sets generated by the novel and target sound experiments.


We have seen that MLP and Naive Bayes perform similarly and that both work well at the low filtered frequencies. In the second part of the project we filtered the raw average data using the variance filter based approach. We have seen that the MLP is able to recognize the disease within a population of patients even with a small number of selected features, whereas Naive Bayes requires more features to perform well. Future work could test a wider variety of classification approaches on the Alzheimer datasets, since MLP and Naive Bayes have given relatively good results.

REFERENCES

[1] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z.-H. Zhou, M. Steinbach, D. J. Hand, and D. Steinberg, "Top 10 algorithms in data mining," Knowledge and Information Systems, vol. 14, no. 1, pp. 1-37, 2007.
[2] I. Jolliffe, Principal Component Analysis. Wiley Online Library, 2005.
[3] S. Mika, G. Ratsch, J. Weston, B. Scholkopf, and K. Mullers, "Fisher discriminant analysis with kernels," in Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop. IEEE, 1999, pp. 41-48.
[4] G. H. Golub and C. Reinsch, "Singular value decomposition and least squares solutions," Numerische Mathematik, vol. 14, no. 5, pp. 403-420, 1970.
[5] M. A. Hall and L. A. Smith, "Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper," in FLAIRS Conference, 1999, pp. 235-239.
[6] D. G. Altman and J. M. Bland, "Diagnostic tests 1: Sensitivity and specificity," BMJ: British Medical Journal, vol. 308, no. 6943, p. 1552, 1994.
[7] R. Polikar, Lecture notes: Computational Intelligence, ECE Department, Rowan University.
[8] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, no. 6, pp. 386-408, 1958.
[9] G. McKhann, D. Drachman, M. Folstein, R. Katzman, D. Price, and E. M. Stadlan, "Clinical diagnosis of Alzheimer's disease: Report of the NINCDS-ADRDA Work Group under the auspices of the Department of Health and Human Services Task Force on Alzheimer's Disease," Neurology, vol. 34, no. 7, pp. 939-944, 1984.
[10] I. S. Kohane, A. J. Butte, and A. Kho, Microarrays for an Integrative Genomics. MIT Press, 2002.
