Diagnosis by Exclusion: Normal is the Absence of the Abnormal

Varun Mithal, Snigdha Chaturvedi, Mudit Jain and Dr. Harish Karnick
Indian Institute of Technology Kanpur, India
[email protected]

ABSTRACT

This paper presents a novel technique for automated analysis of Electrocardiogram (ECG) records for disease diagnosis, based on constructing a forest of binary decision trees. Each decision tree is trained to identify a single disease using medically important features pertaining to that disease. The predictions of the individual trees are then combined to obtain the final prediction for a test instance. We test our technique on the UCI Arrhythmia dataset and obtain a disease classification accuracy of 83.4%. We empirically demonstrate that medically relevant abnormalities in ECG are highly feature specific, and that a large palette of signal features adds confusion and adversely affects identification of other diseases. Hence, a single model comprising all ECG features does not perform optimally for disease classification; diagnosis by exclusion using a forest of decision trees works much better.

Categories and Subject Descriptors I.5.4 [Pattern Recognition Applications]: Miscellaneous

General Terms Design, Experiments, Evaluation

Keywords Heart Disease, Electrocardiogram, Data Mining, Decision Tree, Decision Forest

1. INTRODUCTION

Human cardiac activity is generated and regulated by a highly periodic electrical impulse. This impulse travels through the cardiac muscles and is responsible for their alternate contraction and expansion. Electrocardiograms (ECG or EKG) are bio-signals that represent this electrical activity. The electrodes of an ECG machine measure voltage differences along fixed planes and axes of the heart, and the measured signals form unique patterns corresponding to different regions of the heart. A healthy, correctly functioning heart produces a typical signature pattern. Any abnormality in heart function changes this signature and can be used to detect the abnormality and predict the corresponding disease or class of diseases. The increasing number of people with heart disease has created a need for automated systems for analyzing ECG signals. Automating this diagnostic procedure requires a system to understand the features relevant for ECG monitoring, extract them from ECG time-series data, learn a model capable of identifying heart diseases, and finally make a prediction for a previously unseen ECG signal. The aim of such a system is to identify abnormal patterns and predict the disease most likely responsible for each pattern. The identified abnormal patterns and their corresponding predictions can then be forwarded to a cardiologist for final diagnosis.

2. MEDICAL THEORY

The ECG signal is generated by electrical activity in the myocardium (heart muscle). The depolarization impulse originates in the sino-atrial (SA) node, travels through the atria, enters the ventricles through the atrio-ventricular (AV) node, and then travels through the ventricles. After depolarization completes, the repolarization phase starts; once the atria and ventricles are repolarized, the entire cycle starts over. Each phase of impulse propagation generates a distinctive pattern in the ECG recording: depolarization of the atria generates the P wave; the delay at the AV node is observed as the PR interval; depolarization of the ventricles generates the QRS wave; depolarization from the inner towards the outer myocardium generates the ST segment; and repolarization of the ventricles results in the T wave. The repolarization of the atria is hidden below the QRS wave and hence is not distinctly observed. A normal heart shows a characteristic pattern for each of these sub-waves, repeated every beat. Any abnormality in the sub-waves can be attributed to an abnormality in the corresponding region of the heart, so disease prediction is possible by detecting these abnormalities.

Figure 1: Schematic representation of normal ECG

Medically, heart diseases can be classified into the following four categories:

• Myocardial infarction (MI) - caused by blockage of the coronary arteries, leading to death of myocardial tissue due to lack of sufficient oxygen. MI is diagnosed by the presence of ST depression in the ECG and of certain cardiac tissue proteins in blood tests.
• Cardiac arrhythmia - any deviation from the highly regular rhythm of the heart. Arrhythmias are caused by electrolyte/hormone imbalances in the blood and are primarily diagnosed using the ECG. They can originate in any region of the heart, such as the atria, the AV node, the right bundle branch, the left bundle branch, or the atrio-ventricular septum.
• Myocardial hypertrophy - thickening of the cardiac muscle, mostly caused by hypertension and valvular stenosis. It is diagnosed by an increased amplitude of the ECG sub-wave corresponding to the hypertrophic region, and by observing cardiac wall thickening in an echo-cardiogram.
• Valvular disease - the presence of valvular stenosis or punctured valves. It can be indirectly diagnosed if hypertrophy is also observed, but is primarily diagnosed using echo-cardiography.

3. PREVIOUS WORK

Various machine learning and signal processing techniques have been used for ECG signal analysis. The existing work in this field can be divided into three categories:

1. Accurate sampling, noise filtering and transformations to obtain higher compression of ECG signals [3].
2. Extraction of the sub-waves (P, QRS, T) and characteristic values (PR interval, ST depression, etc.) from the ECG signal [8].
3. Identification and diagnosis of abnormalities in the ECG [6, 5].

In our study, we focus on the third category and propose a novel design for the identification of abnormalities in ECG. Most existing approaches in this category proceed as follows. First, extract relevant features that are the principal components in differentiating normal ECG signals from abnormal ones. These features can be local or global in nature. Local features are extracted for every beat and can vary over individual beats, e.g. the width of the QRS complex, the presence of the P wave [11], or depression of the ST segment. Global features are calculated from multiple beats and are essentially aggregate in nature, e.g. the RR interval (heart rate), Fourier coefficients of the signal, wavelet decompositions, deviations from the baseline, or average amplitudes of sub-waves. These local and global features are indicative of different aspects of cardiac health. The features are then processed and combined to obtain a maximally discriminating feature space in which good classification accuracy is expected. Feature processing methods explored earlier include Fourier transforms, wavelet decompositions [10, 11], and values of intervals, sub-wave durations and sub-wave amplitudes. Feature selection and combination is mostly performed manually; techniques used in this regard include genetic algorithms [6] and statistical selection [1]. The selected features are then used to learn the normal and the various abnormal classes. Algorithms used for learning a model include Markov Models [7], Artificial Neural Networks (ANN) [4, 11], Discord Discovery (AWDD) [3] and Linear Discriminants [2]. A comparative study of several machine learning algorithms on the UCI Arrhythmia dataset was presented by Gao et al. at IJCNN 2005 [5]. They compared Bayesian Probabilistic Artificial Neural Networks, Decision Trees, Naive Bayes, Logistic Regression and RBF networks; their experiments indicate that Neural Network and Decision Tree algorithms are best suited to abnormality detection, and that accuracy improves when only significant features are selected to build the neural network. They report a best performance of around 80%. Guvenir et al. [6], who contributed the UCI Arrhythmia dataset, used the Voting Feature Interval algorithm for normal/abnormal classification and report an accuracy of 71.7%.

4. OUR CONTRIBUTIONS

1. We present a Decision Forest model based on multiple Decision Trees built over subsets of the feature space, and show that it is superior to the traditional Decision Tree model on the UCI Arrhythmia data.
2. The most challenging aspect of the UCI Arrhythmia data is the limited training data and the large feature space. We show that selecting relevant features from the given 279 features using medical knowledge increases classification accuracy, since it reduces confusion due to irrelevant features.
3. We show that the normal class depends on a large number of features and can only be modeled by the absence of any abnormal pattern, i.e. by exclusion of all disease classes.
4. The proposed model generates simpler rules than traditional classifiers such as Artificial Neural Networks, Support Vector Machines and Nearest Neighbour models. These rules can also be easily interpreted by physicians.

Table 1: Class distribution in Dataset

Class  Disease Name                              Instances
01     Normal                                    245
02     Coronary Artery Disease                   44
03     Old Anterior Myocardial Infarction        15
04     Old Inferior Myocardial Infarction        15
05     Sinus Bradycardia                         13
06     Sinus Tachycardia                         25
07     Premature Ventricular Contraction         3
08     Supraventricular Premature Contraction    2
09     Left Bundle Branch Block                  9
10     Right Bundle Branch Block                 50
11     1st degree Atrio-Ventricular Block        0
12     2nd degree AV Block                       0
13     3rd degree AV Block                       0
14     Left Ventricle Hypertrophy                4
15     Atrial Fibrillation or Flutter            5
16     Others                                    22

5. DATASET

The UCI Cardiac Arrhythmia data is used to evaluate the proposed methodology and compare it with other state-of-the-art techniques. The database contains 452 recordings taken from all 12 leads of the ECG, from different patients who were diagnosed as normal or as suffering from some arrhythmia. The arrhythmias are further classified into 14 classes; the class distribution is given in Table 1. Twenty-two instances are unclassified, so we remove them from the dataset. Classes with fewer than 6 instances were also removed, as they contain too few instances for robust training and testing. The resulting data comprises instances of the normal class and of seven disease classes, listed in Table 2. Each recording has been analyzed and 279 features extracted: 206 are linear valued and 73 are nominal. The features include the durations and amplitudes of medically important sub-waves of the ECG signal, and the presence or absence of certain common ECG patterns. The dataset has 0.33% missing values.
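The preprocessing described above can be sketched as follows. This is a hypothetical illustration, not the authors' code: the file name "arrhythmia.data", the "?" missing-value marker and the assumption that the class label is the last of the 280 columns follow the UCI distribution of the dataset, but should be verified against a local copy.

```python
import pandas as pd

def filter_classes(df, label_col, min_instances=6, drop_labels=(16,)):
    """Drop the unclassified class ('Others', label 16) and any class
    with fewer than min_instances rows, as described in the text."""
    df = df[~df[label_col].isin(drop_labels)]
    counts = df[label_col].value_counts()
    keep = counts[counts >= min_instances].index
    return df[df[label_col].isin(keep)]

# On the real data this would look like (column 279 holds the label):
# data = pd.read_csv("arrhythmia.data", header=None, na_values="?")
# data = filter_classes(data, label_col=279)

# Small synthetic demonstration: class 7 (3 rows) and class 16 are dropped.
demo = pd.DataFrame({"cls": [1] * 10 + [2] * 8 + [7] * 3 + [16] * 2})
print(sorted(filter_classes(demo, "cls")["cls"].unique().tolist()))  # [1, 2]
```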

6. DECISION TREES

Decision trees are a widely used technique for classification tasks. They are popular because they have low computational complexity, provide a simple interpretation for each classification of a test instance, and yield explicit rules for the formation of classes. They can be trained over a combination of real-valued and nominal features, and when data instances have missing features, decision trees can continue classifying using techniques such as surrogate splits. Decision trees are particularly useful when the learning model has to incorporate domain knowledge from human expertise. We use the C4.5 algorithm [9] for training. C4.5 builds a decision tree by selecting, at each node, the attribute with the highest information gain (decrease in entropy) for splitting; this step is applied recursively at each node to build the classifier.

Table 2: Diseases and relevant features

Class  Disease Name                         Relevant Features
D1     Coronary Artery Disease              Amplitudes of T wave in leads DII, V2, V4, V5 and V6
D2     Old Anterior Myocardial Infarction   Average width of Q wave in lead V3
D3     Old Inferior Myocardial Infarction   Average width of Q wave in leads AVF and V4
D4     Sinus Tachycardia                    Heart Rate (F15)
D5     Sinus Bradycardia                    Heart Rate (F15)
D6     Left Bundle Branch Block             QRS duration (F5)
D7     Right Bundle Branch Block            Average width of S wave in lead V1; amplitude of R' wave in lead V1; number of intrinsic deflections in leads DII and V1
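The splitting criterion C4.5 inherits from ID3 can be sketched directly: information gain is the drop in label entropy after partitioning the data on an attribute. (C4.5 itself refines this with the gain ratio and handles continuous attributes via threshold splits; the minimal version below only illustrates the core quantity.)

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy decrease from splitting (rows, labels) on attribute attr."""
    parent = entropy(labels)
    partitions = {}
    for row, y in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(y)
    weighted = sum(len(p) / len(labels) * entropy(p)
                   for p in partitions.values())
    return parent - weighted

# Toy example: a binarized "heart rate > 100" attribute perfectly
# separates tachycardia from normal, so its gain equals the full entropy.
rows = [{"hr_gt_100": 1}, {"hr_gt_100": 1}, {"hr_gt_100": 0}, {"hr_gt_100": 0}]
labels = ["tachycardia", "tachycardia", "normal", "normal"]
print(information_gain(rows, labels, "hr_gt_100"))  # 1.0
```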

7. OUR APPROACH: DECISION FOREST

Often we have a very large feature space but only a few features relevant for classification, and the presence of many irrelevant features leads to chance correlations. We therefore propose a transition from the Decision Tree method to a Decision Forest, in which several individual Decision Trees are built using different, smaller subsets of the feature space. The decisions of all trees in the forest are combined to obtain the final prediction. A Decision Forest method was also adopted by Tong et al. to classify prostate cancer samples [12].

7.1 Building the decision trees

The individual decision trees in the forest each target a specific disease and improve recall for that disease. To train a decision tree for a particular disease, say D, we created a subset of the given feature space consisting of the features relevant to the diagnosis of D, using medical knowledge of the disease. Since we build a strictly binary tree for each disease, the tree was trained on instances of the normal class and instances of D over the chosen feature subset. The UCI dataset carries a single label per instance; however, an instance may actually qualify for multiple labels, so we train each tree only on normals and on the subset of instances carrying the chosen disease label, since only normals are guaranteed not to have the disease. The test set comprised normal instances and instances of all diseases (including D). The confusion matrices obtained for the individual trees are used when combining the trees (see section 7.2) to give a single label to instances that have multiple disease labels.

7.2 Combining the decision trees

Classification of a test instance is done by combining the results of the individual decision trees trained as described above. Let there be N decision trees in the forest, labeled DT_1, DT_2, ..., DT_N, trained for the diseases D_1, D_2, ..., D_N respectively. Each DT_i classifies the test instance as belonging either to disease class D_i or to non-D_i. During training DT_i learnt precisely the features for disease D_i, so it is expected to achieve a high recall for D_i; all other instances, i.e. normals and instances of diseases D_j (j ≠ i), are classified as non-D_i.

For a test instance, there are three possibilities once the predictions of all N trees are available:

• Case 1: No DT_i labels the test instance as D_i, i.e. it is not positive for any disease.
• Case 2: Exactly one tree, say DT_j, labels the instance as D_j, while all other DT_i (i ≠ j) label it non-D_i.
• Case 3: More than one tree, say the k trees DT_j1, DT_j2, ..., DT_jk, label the instance as D_j1, D_j2, ..., D_jk respectively, while the remaining trees label it non-D_i.

The final prediction is one of the D_i or 'Rest', where 'Rest' means the instance belongs either to the normal class or to a class for which no decision tree was built. The final label is assigned as follows:

• Case 1: the final prediction is 'Rest'.
• Case 2: the final prediction is D_j.
• Case 3: there is a conflict between the predictions of multiple trees, resolved by the following strategy. Consider the subset of the original dataset consisting of instances of only the classes D_j1, D_j2, ..., D_jk, and define

  FP_{jm,jl} = number of instances of D_jl classified as D_jm by DT_jm,

the class-wise false positives of DT_jm, read off the confusion matrix of DT_jm obtained in section 7.1;

  f_{jm,jl} = FP_{jm,jl} / (total number of instances of D_jl),

the fraction of instances of D_jl classified as D_jm by DT_jm; and

  e_{jm} = Σ_{l=1, l≠m}^{k} f_{jm,jl}.

Here e_{jm} measures the error of tree DT_jm with respect to the concerned classes. In case of conflict between DT_j1, DT_j2, ..., DT_jk, we choose the tree that is least confused between the classes D_j1, D_j2, ..., D_jk, i.e. the one with minimum error: the final prediction is the D_jm for which e_{jm} is minimum.

An example: Suppose that for a test instance the predictions of DT_1, DT_3 and DT_7 are D_1, D_3 and D_7 respectively, and every other DT_i (i ≠ 1, 3, 7) classifies it as non-D_i. We use the confusion matrices of DT_1, DT_3 and DT_7 to compute e_1, e_3 and e_7. The relevant parts of the three confusion matrices are:

            Predicted
Actual      D1   D3   D7
D1           -    6    3
D3           1    -    1
D7           1    4    -

For DT_1:  FP_{1,3} = 1, f_{1,3} = 0.067;  FP_{1,7} = 1, f_{1,7} = 0.020;  e_1 = f_{1,3} + f_{1,7} = 0.087
For DT_3:  FP_{3,1} = 6, f_{3,1} = 0.136;  FP_{3,7} = 4, f_{3,7} = 0.080;  e_3 = f_{3,1} + f_{3,7} = 0.216
For DT_7:  FP_{7,1} = 3, f_{7,1} = 0.068;  FP_{7,3} = 1, f_{7,3} = 0.067;  e_7 = f_{7,1} + f_{7,3} = 0.135

Since e_1 is the minimum of the three values, the final prediction of the algorithm for this test instance is D_1.

7.3 Evaluation

For evaluating the different classifiers we use classification accuracy, recall and precision as measures of performance. For a multi-class classifier whose data is distributed over n classes C_1, C_2, ..., C_n, we define TP_i = true positives for C_i, FP_i = false positives for C_i, and N_i = number of instances belonging to C_i. Then

  Classification Accuracy = (Σ_{i=1}^{n} TP_i) / (Σ_{i=1}^{n} N_i),

and for each class C_i,

  Recall_{C_i} = TP_i / N_i,    Precision_{C_i} = TP_i / (TP_i + FP_i).

Table 3: Confusion Matrix for Decision Forest

Actual           Predicted
                  N   D1   D2   D3   D4   D5   D6   D7
N  (Total=245)  217   10    0    1    3    5    1    8
D1 (Total=44)    12   27    1    2    0    1    0    1
D2 (Total=15)     0    0   13    0    2    0    0    0
D3 (Total=15)     2    1    0   12    0    0    0    0
D4 (Total=13)     3    0    1    0    9    0    0    0
D5 (Total=25)     1    0    1    0    0   22    0    1
D6 (Total=9)      0    0    0    0    0    0    9    0
D7 (Total=50)    10    0    0    1    2    1    0   36
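A minimal sketch of the combining rule of section 7.2, using the numbers of the worked example. The function and variable names are illustrative; in practice the false-positive counts come from the per-tree confusion matrices of section 7.1, and the class totals from Table 1 (D1 = 44, D3 = 15, D7 = 50 instances).

```python
def combine(predictions, fp, totals):
    """predictions: dict tree_id -> True if DT_i labeled the instance D_i.
    fp[(m, l)]: instances of D_l classified as D_m by DT_m.
    totals[l]: total number of instances of class D_l."""
    fired = [i for i, positive in predictions.items() if positive]
    if not fired:
        return "Rest"                    # Case 1: no tree fired
    if len(fired) == 1:
        return fired[0]                  # Case 2: a unique prediction
    # Case 3: choose the tree least confused among the conflicting classes,
    # i.e. the one minimizing e_m = sum of f_{m,l} over the other classes.
    def error(m):
        return sum(fp[(m, l)] / totals[l] for l in fired if l != m)
    return min(fired, key=error)

# Worked example from the text: DT1, DT3 and DT7 fire.
preds = {1: True, 3: True, 7: True}
fp = {(1, 3): 1, (1, 7): 1, (3, 1): 6, (3, 7): 4, (7, 1): 3, (7, 3): 1}
totals = {1: 44, 3: 15, 7: 50}
print(combine(preds, fp, totals))  # 1  (e_1 = 0.087 is the minimum)
```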

8. EXPERIMENTS AND RESULTS

We performed three different experiments on the dataset and compared their results.

Figure 2: Comparison of Precision for the three experiments

In the first experiment, we built a Decision Tree model using all 279 features for multi-class classification and obtained a classification accuracy of 73%. On investigation it was found that including all features during learning led to confusion between classes; moreover, the number of classes in a real-world scenario would be much larger, resulting in a very large decision tree. In the second experiment, we built the decision tree on a smaller feature space of 14 features, chosen for their importance in identifying one of the seven diseases; the selection was based on known medical information about their significance. In this experiment we obtained a classification accuracy of 80%. In the last experiment, a Decision Forest was built containing seven Decision Trees, each trained to identify only one disease using the features relevant to that disease. The chosen diseases and their respective feature sets are shown in Table 2. The final prediction was made by combining the predictions of all trees according to the algorithm described in section 7.2. Since we built Decision Trees for all seven diseases present in the dataset, the 'Rest' class of the classifier corresponds to the normal class of the dataset in this experiment. The resulting confusion matrix is shown in Table 3; we obtained a classification accuracy of 83.4%. We carried out ten-fold cross-validation to reduce dependence on the selection of the training sample. Figures 2 and 3 show the precision and recall for each class in the three experiments; the third experiment gives better recall and precision than the other two for most classes.
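The per-class figures can be recomputed from the reported Decision Forest confusion matrix (rows = actual class, columns = predicted class, both ordered N, D1..D7). This is a verification sketch, not part of the original pipeline; note that the single matrix shown yields an accuracy of 82.9%, while the reported 83.4% is presumably the ten-fold cross-validation figure.

```python
classes = ["N", "D1", "D2", "D3", "D4", "D5", "D6", "D7"]
cm = [
    [217, 10, 0, 1, 3, 5, 1, 8],   # N   (245)
    [12, 27, 1, 2, 0, 1, 0, 1],    # D1  (44)
    [0, 0, 13, 0, 2, 0, 0, 0],     # D2  (15)
    [2, 1, 0, 12, 0, 0, 0, 0],     # D3  (15)
    [3, 0, 1, 0, 9, 0, 0, 0],      # D4  (13)
    [1, 0, 1, 0, 0, 22, 0, 1],     # D5  (25)
    [0, 0, 0, 0, 0, 0, 9, 0],      # D6  (9)
    [10, 0, 0, 1, 2, 1, 0, 36],    # D7  (50)
]

def recall(cm, i):
    """Diagonal entry over the row total (true class size)."""
    return cm[i][i] / sum(cm[i])

def precision(cm, i):
    """Diagonal entry over the column total (all predicted as class i)."""
    return cm[i][i] / sum(row[i] for row in cm)

for i, name in enumerate(classes):
    print(f"{name}: recall={100 * recall(cm, i):.1f} "
          f"precision={100 * precision(cm, i):.1f}")
# The D6 row gives recall 100.0 and the N row 88.6, matching Table 4.
```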

9. DISCUSSION AND SUMMARY

Figure 3: Comparison of Recall for the three experiments

Table 4: Performance of Decision Forest

            N     D1    D2    D3    D4    D5    D6    D7
Recall     88.6  61.3  86.7  80.0  69.2  88.0  100   72.0
Precision  88.6  71.1  81.3  75.0  56.3  75.9  90.0  78.3

Classification Accuracy: 83.4%    Normal/Abnormal Accuracy: 86.5%

Figure 4: Decision Tree for Sinus Tachycardia (D4)

A. Identification of the normal class by exclusion. In predicting cardiac diseases one is often tempted to build a classifier that learns the normal pattern and regards everything else as abnormal; the aim is to filter out the patients suspected of having some heart abnormality. However, this turns out to be difficult in ECG analysis. The combination of many features, and the large range of values considered normal for each feature, gives rise to a very complex, disconnected manifold in feature space for the normal class, which is very hard for a single classifier to learn. Moreover, while ECG abnormalities have well-defined patterns (e.g. absence of P waves, T wave inversion, high variation in the RR interval), normal is generally marked by the absence of any abnormal pattern. Our attempt to build a classifier for the normal class made this clear. In our proposed design we therefore classify a test instance as normal if and only if it is not identified as diseased by any of the binary disease classifiers, i.e. by exclusion. The high recall (89%) we obtain for the normal class corroborates this hypothesis.

B. Appropriate feature selection. Medical literature suggests that only a few features of the ECG signal are relevant for identifying each disease. Hence, for any classification task, the feature set should consist only of features significant for diagnosing the concerned diseases, and we suggest that medical domain knowledge be exploited in choosing it. The results of experiments 1 and 2 (described in section 8) show that choosing an appropriate feature subset improves classification accuracy (from 73% to 80%). Furthermore, features relevant for identifying one disease add noise or chance correlation when building a predictor for another disease. An example is the rule the Decision Tree learns to identify Sinus Bradycardia: in addition to depending on the medically acknowledged predictor variable for Sinus Bradycardia, Heart Rate, it also depends on several other features, such as the amplitudes of the R and S waves in lead DII. The dependence on the latter features is merely a matter of chance arising from the limited number of training instances. It is for this reason that the Decision Forest design yields better results than the single big Decision Tree built in the second experiment of section 8.

C. Comparison with previous techniques. With the Decision Forest model we achieve a classification accuracy of 83.4% and an accuracy of 86.5% on the normal/abnormal classification task, a significant improvement over previous approaches [5], whose best results are no better than 80% for normal/abnormal classification.

D. Advantages of our design.

1. Our design is in accordance with how cardiologists interpret ECGs. While examining a patient, a cardiologist looks at the features responsible for the various diseases, continuing until the patient has been examined for all known diseases. The final diagnosis is that the patient either suffers from all the detected diseases or is normal, where the normal label only implies that the person is not suffering from any known disease. We follow a similar approach in our design. The technique can just as easily report multiple diseases, but we have no way to verify correctness since the data carries only single disease labels; we therefore select one disease label from the set produced by the Decision Forest, using the algorithm described in section 7.2.
2. The Decision Trees built are simple and their rules easy to interpret. This makes it straightforward to incorporate rules suggested by human experts into the trees, which is especially useful when the learning dataset is small. Since each Decision Tree is small, the design is expected to be scalable and should continue to perform well as the number of disease classes increases.
3. The rules learnt by the Decision Trees are in consonance with the rules cardiologists use during diagnosis. For example, the Decision Tree for Sinus Tachycardia (D4), shown in Figure 4, learns a boundary of Heart Rate > 101 bpm, and discussions with cardiologists confirm that the rule used to identify Sinus Tachycardia is Heart Rate > 100 bpm. Similar observations were made for other diseases, as is also demonstrated empirically by the high recall and precision values in Table 4.

E. Limitations of the UCI dataset. Our experimental results show that we do not obtain good recall on Coronary Artery Disease (61.4%) or Right Bundle Branch Block (72%). According to medical literature, ST segment depression or elevation is a critical feature in diagnosing these diseases, but the UCI dataset unfortunately does not contain these features. Accurate diagnosis of Coronary Artery Disease also requires analysis of a stress ECG, which is again absent from the dataset. When we removed the instances of these two classes, we obtained a classification accuracy of 93.5%. This suggests that the presence of the above-mentioned features in the dataset would significantly improve recall on these two classes.

10. FUTURE WORK

The design described above classifies a given ECG feature vector into one of the disease classes or the Rest class. In the case of the UCI dataset, these feature vectors have been extracted from the raw ECG signal; the ECG time series from which they were extracted is not provided. Availability of the time series would have allowed us to extract other important characteristics such as QRS morphology, ST segment depression and baseline deviations. An algorithm trained on these features, in addition to the features provided in the UCI dataset, is expected to perform better and to increase disease coverage. The task ahead is to extract all these features with high reliability and accuracy: a feature that is present should not remain unidentified, and a detected feature's value should be correct. This requires signal processing techniques such as noise removal and frequency filtering. Several tools (ecgpuwave, waveann) are available for this purpose; using them and further processing their output, we have been able to extract the amplitudes and durations of all sub-waves, the RR interval, ST deviation, etc. on a number of datasets provided by PhysioNet, which contain 2-lead ECG signals of arrhythmia patients. We also plan to extend this approach to handle data from stress tests and echo-cardiograms.

11. REFERENCES

[1] Chazal, P., and Celler, B. Selection of optimal parameters for ECG diagnostic classification. In Computers in Cardiology (1997).
[2] Chazal, P. D., and Reilly, R. B. Automatic classification of heartbeats using ECG morphology and heartbeat interval features. IEEE Transactions on Bio-medical Engineering 51 (2004), 1196-1206.
[3] Chuah, M. C., and Fu, F. ECG anomaly detection via time series analysis. Tech. rep., 2007.
[4] Foo, S., Stuart, G., Harvey, B., and Meyer-Baese, A. Neural network-based EKG pattern recognition. Engineering Applications of Artificial Intelligence 15 (2002), 253-260.
[5] Gao, D., Madden, M., Chambers, D., and Lyons, G. Bayesian ANN classifier for ECG arrhythmia diagnostic system: A comparison study. In Proceedings of the International Joint Conference on Neural Networks, Montreal, Canada (2005).
[6] Guvenir, H. A. Detection of abnormal ECG recordings using feature intervals. In Proceedings of the Tenth Turkish Symposium on Artificial Intelligence and Neural Networks (TAINN 2001) (2001), pp. 265-274.
[7] Hughes, N. P., Tarassenko, L., and Roberts, S. J. Markov models for automated ECG interval analysis. In Proceedings NIPS 16 (2004), MIT Press, pp. 611-618.
[8] Maglaveras, N., Stamkopoulos, T., Diamantaras, K., Pappas, C., and Strintzis, M. ECG pattern recognition and classification using nonlinear transformations and neural networks: A review. International Journal of Medical Informatics 52 (1998), 191-208.
[9] Quinlan, J. Induction of decision trees. Machine Learning 1 (1986), 81-106.
[10] Saxena, S., Kumar, V., and Hamde, S. Feature extraction from ECG signals using wavelet transforms for disease diagnostics. International Journal of Systems Science 33 (2002), 1073-1085.
[11] Sternickel, K. Automatic pattern recognition in ECG time series. Computer Methods and Programs in Biomedicine 68 (2002), 109-115.
[12] Tong, W., Xie, Q., Hong, H., Fang, H., Shi, L., Perkins, R., and Petricoin, E. F. Using decision forest to classify prostate cancer samples on the basis of SELDI-TOF MS data: Assessing chance correlation and prediction confidence. Environmental Health Perspectives 112 (2004), 1622-1627.

May 3, 2014 - information and tools, so that practices can be appropriately ..... pathways and offer varying degrees of robustness in their ... lessons from formal research on new or best- ... reduction, economic growth and food security.