Artificial Speciation of Neural Network Ensembles

Vineet Khare & Xin Yao
School of Computer Science

Overview

• Introduction
• Benchmark problems
• Overview of the methodology
• Encoding of neural networks
• Evolving ANNs with fitness sharing
• Results and discussion
• Conclusion and suggestions

Introduction

• Modular approach for complex problems: fitness sharing in evolving ANNs
• Making use of population information

Introduction

• Background
  – Automatic modularization using speciation (Darwen and Yao, 1996)
  – Use of population information in evolution (Yao and Liu, 1998)
  – Speciation in neural networks (Ahn and Cho, 2001)

Benchmark Problems

• Problems taken from the UCI machine learning benchmark repository
  – Wisconsin breast cancer dataset
    • 699 instances
    • 2 classes: malignant or benign
    • 9 integer-valued attributes
  – Heart disease dataset
    • 270 instances
    • 2 classes: presence or absence
    • 13 real-valued attributes

• Each dataset divided into training (1/2), validation (1/4) and testing (1/4) sets
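The 1/2 / 1/4 / 1/4 split above can be sketched as follows; the shuffling and seed handling here are illustrative choices, not the authors' exact procedure:

```python
import random

def split_dataset(data, seed=0):
    """Split a dataset into training (1/2), validation (1/4) and
    testing (1/4) subsets, as described in the slides.

    `data` is any list of examples; the shuffle and seed are
    illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = n // 2
    n_valid = n // 4
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]   # remainder goes to the test set
    return train, valid, test
```

For the 270-instance heart disease dataset this yields 135/67/68 points, matching the parameter table later in the deck.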

Overview of Methodology

1. Encode ANNs
2. Initialize ANNs
3. Train each ANN partially (on the training set)
4. Evaluate fitness (on the validation set) with sharing
5. Copy elites, apply variational operators
6. Termination criteria satisfied?
   No ⇒ go to step 3
   Yes ⇒ train ANNs fully, combine the outputs
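The loop above can be sketched in code; every callable argument below is a placeholder for the corresponding component in the slides, not the authors' actual implementation:

```python
def evolve(population, train_set, valid_set, generations,
           partial_train, evaluate_mse, shared_fitness,
           select_and_vary, full_train, combine):
    """Sketch of the evolutionary loop from the methodology slide."""
    for _ in range(generations):
        # Step 3: train each ANN partially on the training set.
        for net in population:
            partial_train(net, train_set)
        # Step 4: raw fitness f_raw = 1/MSE on the validation set,
        # then fitness sharing over the population.
        raw = [1.0 / evaluate_mse(net, valid_set) for net in population]
        fitness = shared_fitness(population, raw)
        # Step 5: copy elites and apply variational operators.
        population = select_and_vary(population, fitness)
    # Step 6: on termination, train fully and combine the outputs.
    for net in population:
        full_train(net, train_set)
    return combine(population)
```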

Encoding of Neural Networks

Evolving ANNs

• Partial training: Lamarckian evolution
• Fitness evaluation and fitness sharing at the phenotypic level

    f_raw = 1 / MSE

• Distance between two individuals: modified Kullback-Leibler entropy

    D(p, q) = (1/2) Σ_{j=1}^{n} [ p_j log(p_j / q_j) + q_j log(q_j / p_j) ]
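The symmetrised Kullback-Leibler distance above can be computed directly from two networks' output vectors; a minimal sketch, where the `eps` floor for zero outputs is an implementation choice, not part of the slides:

```python
import math

def kl_distance(p, q, eps=1e-12):
    """Modified (symmetrised) Kullback-Leibler distance between the
    output vectors p and q of two networks:
        D(p, q) = 1/2 * sum_j [ p_j*log(p_j/q_j) + q_j*log(q_j/p_j) ]
    `eps` guards against log(0) for zero-valued outputs.
    """
    total = 0.0
    for pj, qj in zip(p, q):
        pj = max(pj, eps)
        qj = max(qj, eps)
        total += pj * math.log(pj / qj) + qj * math.log(qj / pj)
    return 0.5 * total
```

Note that, unlike plain KL divergence, this form is symmetric: D(p, q) = D(q, p), which is what makes it usable as a phenotypic distance for sharing.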

• Sharing radius (σ_share = 0.5, 1 and 2)
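To illustrate how the sharing radius is used, here is a standard fitness-sharing sketch; the triangular sharing function is the common textbook choice, as the slides do not spell out the exact form used:

```python
def shared_fitness(raw_fitness, distances, sigma_share):
    """Divide each individual's raw fitness by its niche count,
    sum_j sh(d_ij), using the triangular sharing function
    sh(d) = 1 - d/sigma for d < sigma, else 0.

    `distances[i][j]` is the phenotypic distance (e.g. the KL-based
    distance from the slides) between individuals i and j.
    """
    n = len(raw_fitness)
    shared = []
    for i in range(n):
        niche_count = 0.0
        for j in range(n):
            d = distances[i][j]
            if d < sigma_share:
                niche_count += 1.0 - d / sigma_share
        shared.append(raw_fitness[i] / niche_count)
    return shared
```

Individuals crowded into one niche split their fitness (two identical individuals each keep half), while individuals farther apart than σ_share keep their raw fitness, which is what drives speciation.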

• Elitism

Evolving ANNs

  – Preserving best individual with raw fitness
  – Preserving best individual with shared fitness

• Genetic operators: mutation
  – Connection creation
  – Connection deletion
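The two connection-level mutations can be sketched on an adjacency-matrix encoding; the representation and the per-connection probabilities here are illustrative, not the authors' actual encoding:

```python
import random

def mutate_connections(adj, p_add, p_del, rng):
    """Apply connection creation and connection deletion to a
    feedforward adjacency matrix, where adj[i][j] = 1 means a
    connection from node i to node j (only i < j is considered,
    keeping the network feedforward).
    """
    n = len(adj)
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] == 0 and rng.random() < p_add:
                adj[i][j] = 1   # connection creation
            elif adj[i][j] == 1 and rng.random() < p_del:
                adj[i][j] = 0   # connection deletion
    return adj
```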

• Genetic operators: crossover
  – Sub-graph exchange

Evolving ANNs

• Genetic operators - crossover (contd.)

Evolving ANNs

• Full training
• Combination of outputs
  – Majority voting
  – Averaging
  – Recursive least squares (RLS) (Yao and Liu, 1998)
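The first two combination schemes are simple to state in code. A sketch for a two-class problem; the 0.5 threshold for averaging is an assumption, and the RLS method, which learns combination weights (Yao and Liu, 1998), is omitted:

```python
def majority_vote(predictions):
    """Combine binary class labels (0/1) from the ensemble members
    by majority voting."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0

def average_outputs(outputs, threshold=0.5):
    """Combine real-valued network outputs by averaging, then
    threshold to obtain a class label (the 0.5 threshold is an
    illustrative assumption)."""
    mean = sum(outputs) / len(outputs)
    return 1 if mean > threshold else 0
```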

• Parameter settings for experiments

Evolving ANNs

• Parameter settings for experiments (contd.)

  Parameter                        Breast Cancer   Heart Disease
  # of input nodes                 9               13
  Maximum # of hidden nodes        5               6
  # of output nodes                1               1
  Population size                  20              20
  Learning rate for training       0.1             0.1
  Data points in training set      349             135
  Data points in validation set    175             67
  Data points in testing set       175             68
  Seed                             System time     System time
  Crossover probability            0.3             0.3
  Mutation probability             0.1             0.1
  Sharing radius                   1               1
  # of generations                 200             350
  # of runs                        30              24

Results and Discussion

• Results for breast cancer problem

  Table 1: Final error rates (averaged over 30 runs) for the breast cancer problem

             Voting                          Averaging                       RLS
         Training  Validation  Testing  Training  Validation  Testing  Train+Valid  Testing
  Mean   0.0378    0.0189      0.0231   0.0374    0.0235      0.0237   0.0229       0.0167
  SD     0.0100    0.0153      0.0176   0.0102    0.0151      0.0137   0.0152       0.0122
  Min    0.0074    0.0000      0.0000   0.0078    0.0114      0.0000   0.0016       0.0000
  Max    0.0544    0.0514      0.0514   0.0544    0.0571      0.0514   0.0267       0.0343

• Results for heart disease problem

  Table 2: Final error rates (averaged over 24 runs) for the heart disease problem

             Voting                          Averaging                       RLS
         Training  Validation  Testing  Training  Validation  Testing  Train+Valid  Testing
  Mean   0.1960    0.1623      0.1733   0.1828    0.1462      0.1642   0.1612       0.1612
  SD     0.0282    0.0265      0.0404   0.0231    0.0293      0.0323   0.0243       0.0337
  Min    0.1333    0.1194      0.1176   0.1333    0.1194      0.1029   0.1188       0.1176
  Max    0.2667    0.2388      0.2794   0.2370    0.2388      0.2353   0.2129       0.2500

Results and Discussion

• Best individual performance vs. different combination methods

[Figure: error rates on the training set for the breast cancer problem over 200 generations, comparing the combination methods (Voting, Averaging, RLS) with the best individual in the population]

Results and Discussion

• Best individual performance vs. different combination methods

[Figure: error rates for the heart disease problem over 350 generations, comparing the combination methods (Voting, Averaging, RLS) with the best individual in the population]

Conclusion and Suggestions

• Comparison with known results: classification accuracy on test data

                   Known Best                    Our Result
  Breast Cancer    98.29% (Ahn and Cho, 2001)    98.33%
  Heart Disease    84.9% (Yao and Liu, 1998)     83.88%

• Combination of outputs produced much better results than the individual best.
• The RLS combination method produced the best results.
• Shortcomings

Conclusion and Suggestions

• Shortcomings

  – Too expensive
  – Choice of sharing radius

• Suggestions
  – Less expensive schemes for speciation
  – Simple Subpopulation Scheme (Spears, 1994)
  – Other schemes that require less problem knowledge
    • Multinational EA (Ursem, 1999)
    • Dynamic Niche Clustering (DNC) (Gan and Warwick, 1999)

References

Ahn, J. H. & Cho, S. B., 2001. Speciated Neural Networks Evolved with Fitness Sharing Technique. Proceedings of the 2001 Congress on Evolutionary Computation, 390-396, Seoul, Korea.

Darwen, P. & Yao, X., 1996. Automatic Modularization by Speciation. Proceedings of the 1996 IEEE International Conference on Evolutionary Computation (ICEC '96), 88-93, Nagoya, Japan: IEEE Computer Society Press.

Gan, J. & Warwick, K., 1999. A Genetic Algorithm with Dynamic Niche Clustering for Multimodal Function Optimisation. Proceedings of the 4th International Conference on Artificial Neural Networks and Genetic Algorithms, 248-255, Springer Wien New York.

Spears, W. M., 1994. Simple Subpopulation Schemes. Proceedings of the 3rd Annual Conference on Evolutionary Programming, 296-307, World Scientific.

Ursem, R., 1999. Multinational Evolutionary Algorithms. Proceedings of the Congress on Evolutionary Computation, 3, 1633-1640.

Yao, X. & Liu, Y., 1998. Making Use of Population Information in Evolutionary Artificial Neural Networks. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics, 28(3), 417-425.
