Int. J. Signal and Imaging Systems Engineering, Vol. 3, No. 1, 2010


A modified training scheme for SOFM to cluster multispectral images

T.N. Nagabhushan* and D.S. Vinod
Department of Information Science and Engineering, Sri Jayachamarajendra College of Engineering, Mysore, India
E-mail: [email protected]
E-mail: [email protected]
*Corresponding author

Abstract: In this paper, we propose modifications to Kohonen’s Self-Organising Feature Map (SOFM) to achieve faster convergence, specifically with respect to multispectral images. First, the raw image is pre-processed using a data reduction technique to obtain a reduced data set, and then the Condensed Nearest Neighbour (CNN) rule is applied to yield a standard subset of samples. The samples in the standard subset are used to find the Best Matching Unit (BMU), and the samples in the reduced data set are used to update the BMU and its neighbouring neurons. The SOFM is tested on a synthetic image data set and on the Harangi 1991 and 1992 image data sets. Results are compared with conventional SOFM.

Keywords: SOFM; self-organising feature map; CNN; condensed nearest neighbour; reduced data set; standard subset; BMU; best matching unit; multispectral image clustering.

Reference to this paper should be made as follows: Nagabhushan, T.N. and Vinod, D.S. (2010) ‘A modified training scheme for SOFM to cluster multispectral images’, Int. J. Signal and Imaging Systems Engineering, Vol. 3, No. 1, pp.31–39.

Biographical notes: T.N. Nagabhushan received his BE Degree in Electrical Engineering from the University of Mysore and Master’s in Electrical Engineering at Indian Institute of Science. He obtained his PhD Degree from Indian Institute of Science in the area of constructive learning RBF networks. He is the chairman of Information Science and Engineering at Sri Jayachamarajendra College of Engineering, Mysore, India. His research interests include machine learning algorithms and applications.

D.S. Vinod received his Bachelor’s Degree in Electronics and Communications Engineering and Master’s Degree in Computer Engineering from the University of Mysore, India. He did his PhD at Visvesvaraya Technological University. He did his research work on Multispectral Image Analysis. He is currently working as Faculty at the Department of Information Science and Engineering, Sri Jayachamarajendra College of Engineering, Mysore, India. His research interests include image processing, neural networks and algorithms.

1 Introduction

Clustering of multispectral image data is a complex process and requires consideration of many factors like size of the image, feature dimension, noise, overlapping clusters, number of clusters, unequal cluster density and unequal cluster size (Tran et al., 2005). A number of clustering algorithms have been applied to multispectral image data sets. Lu and Weng (2007) address the problem of clustering of remotely sensed images. They discuss the different methods in determination of clustering algorithm, selection of suitable training samples, feature extraction pre-processing, post-processing and cluster accuracy assessment. The advantage of neural networks in clustering

Copyright © 2010 Inderscience Enterprises Ltd.

of remotely sensed data over conventional clustering algorithms is suggested by Roli et al. (1996). Srinivas et al. (2008) applied SOFM to remotely sensed data for regional flood analysis. Kohonen’s SOFM is one of the widely used unsupervised neural networks; it partitions the input space into different clusters based on similarity among data samples (Kohonen, 1982a, 1982b). The conventional SOFM algorithm requires long training cycles when large image data sets are fed to it. With every presentation of a data pattern, SOFM updates the BMU and its neighbours. Multispectral images are multiband images taken from different bands of a geostationary satellite and occupy a large memory space. They are often rich in redundancy. Hence, clustering with conventional


SOFM requires more memory space besides longer training cycles. To improve the learning characteristics of conventional SOFM, we propose novel modifications to the learning algorithm. The modifications are introduced in the computation of BMU and updating its neighbours in the input space. The raw image is subjected to data reduction process to obtain a reduced data set. Further, CNN rule is applied on the reduced data set to generate standard subset of samples. The BMU is determined using standard subset of samples while the BMU and its topological neighbours are updated using reduced data samples. The modified procedure is discussed in the subsequent sections.

2 Unsupervised learning with SOFM

Kohonen’s (1982a, 1982b) SOFM performs unsupervised learning on the input data samples and creates clusters. Many variants of SOFM have appeared in the literature to deal with uniform and non-uniform data distributions (Kohonen, 1987, 1993; Kohonen et al., 1990). SOFM transforms input patterns of arbitrary dimension into a one- or two-dimensional discrete map in a topologically ordered manner. Each neuron is fully connected to all the patterns in the input. The SOFM algorithm proceeds by initialising the synaptic weights in the network. This is done by randomly selecting the required number of patterns from the input. Once the network is initialised, the winner, or BMU, is found for each input pattern. The winner neuron and its topological neighbours are then updated. The procedure is repeated for a finite number of iterations.
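The conventional training loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name train_sofm, the Gaussian neighbourhood function and the exponential decay of the learning rate and neighbourhood width are assumptions, and the default constants merely mirror the values listed later for the synthetic image.

```python
import numpy as np

def train_sofm(data, grid_shape=(3, 1), n_iter=200, eta0=0.9, sigma0=0.9,
               tau_eta=500.0, tau_sigma=5.0, seed=0):
    """Online SOFM training; returns one weight vector (row) per neuron."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Grid coordinates of every neuron, used to measure distances on the map.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the weights with randomly selected input patterns.
    weights = data[rng.choice(len(data), size=rows * cols, replace=False)].copy()
    for t in range(n_iter):
        # Assumed schedules: exponential decay of learning rate and neighbourhood width.
        eta = eta0 * np.exp(-t / tau_eta)
        sigma = max(sigma0 * np.exp(-t / tau_sigma), 1e-3)  # floor avoids division by zero
        for x in data[rng.permutation(len(data))]:
            # Best matching unit: the neuron whose weight vector is closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighbourhood centred on the BMU (assumed form).
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Move the BMU and its neighbours towards the presented pattern.
            weights += eta * h[:, None] * (x - weights)
    return weights
```

Note that the BMU search runs once per pattern per iteration, which is the cost that grows with image size.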

3 Proposed modifications

Since conventional SOFM is limited on large data sets with variable redundancy, specifically multispectral images, we propose the following modifications to the original SOFM algorithm:

• Generate reduced data from the original image data. The reduced data set is used to update the BMU and its neighbours. We have proposed modifications to Gowda’s (1984) data reduction technique to cover both single band images and multiband images. The proposed modification achieves data reduction without dimensionality reduction.

• A standard subset of samples is further obtained from the reduced data set. This standard subset of samples is used to determine the BMU.

The above-mentioned process requires offline computations on image data and two efficient procedures have been used to get reduced data set and standard data subset, respectively.

3.1 Generation of reduced data set

Since multispectral images often have a high degree of redundancy, using all data samples to train SOFM renders it inefficient. Therefore, it is essential to pre-process the image data to generate a reduced data set. Many data reduction procedures have been proposed in the literature. They include the works of Wehrens et al. (2004), Gowda (1984), Jagannathan et al. (1996), Tate (1994), Zhu and Po (1996), Vasilyev (1997), Chitroub et al. (2001), Mielikainen and Kaarna (2002) and De Backer et al. (1998). Most of them perform dimensionality reduction besides data reduction (De Backer et al., 1998; Gowda, 1984).

One of the most significant works on data reduction of multispectral images is due to Gowda (1984), who proposed a scheme of multi-stage isodata clustering incorporating dimensionality reduction and data reduction for multispectral images. Here, the d features are transformed to a three-dimensional primary colour space, namely blue, green and red coordinates. The three-dimensional data is reduced by the application of storage bin arrays. This is followed by a multi-stage isodata technique incorporating a novel seed point picking method to obtain the desired number of clusters. The disadvantage of this algorithm is that it cannot be applied to cluster single band images or to analyse any one selected band of a multispectral image data set.

It is generally desirable to hold all the features without dimensionality reduction and obtain a subset of data that can lead to efficient clustering. Efficient clustering has two characteristics: fast convergence and low storage requirements. Hence, we have proposed modifications to Gowda’s (1984) method so that it achieves data reduction without dimensionality reduction and works with both single band and multiband images.

The same algorithm is modified to retain all features and generate only the reduced data set. To reduce multiband image data with d spectral bands, d storage bin matrices of size nb × nb are required. The samples (pixel values) are stored in the corresponding storage bins depending on the feature values. The bins are updated as and when the samples are stored. The updated features in the bins form the reduced data set. The procedure to obtain the reduced data set is presented here.

Procedure to obtain reduced data set

Table 2 Reduced data set obtained in non-empty bins

Bin   Samples in bin        Normalised feature values
1     -                     -
2     -                     -
3     1, 5                  1.42647, 0.985294, 0.352941, 0
4     2                     1.38235, 0.823529, 0.352941, 0
5     3, 4, 6, 8, 9, 10     1.62255, 0.911766, 0.916666, 0.25
6     7                     2, 0.882353, 1.32353, 0.352941
7     -                     -
8     -                     -
9     -                     -
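The normalised values in Table 2 can be reproduced from the raw values in Table 1, which gives a useful check on what each bin stores. The sketch below takes the bin membership directly from Table 2 and does not implement the bin assignment rule itself. The scaling rule (a single global minimum and maximum mapped onto the range 0 to nb - 1) is an assumption inferred from the published numbers, and the function name reduce_bins is hypothetical.

```python
import numpy as np

# Raw feature values from Table 1 (samples 1..10).
table1 = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
    [5.4, 3.9, 1.7, 0.4],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.9, 3.1, 4.9, 1.5],
    [6.3, 3.3, 6.0, 2.5],
])

# Bin membership as reported in Table 2 (bin number -> sample numbers).
bins = {3: [1, 5], 4: [2], 5: [3, 4, 6, 8, 9, 10], 6: [7]}

nb = 3  # the illustration uses a 3 x 3 storage bin array


def reduce_bins(samples, membership, nb):
    """Average the normalised feature vectors of the samples stored in each bin."""
    lo, hi = samples.min(), samples.max()           # assumed: one global min and max
    scaled = (samples - lo) / (hi - lo) * (nb - 1)  # assumed: scale to [0, nb - 1]
    return {b: scaled[[s - 1 for s in members]].mean(axis=0)
            for b, members in membership.items()}


reduced = reduce_bins(table1, bins, nb)
for b, values in reduced.items():
    print(b, np.round(values, 6))
```

Each non-empty bin contributes exactly one d-dimensional vector, so ten samples collapse to four while all four features are retained.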





We have introduced modifications to Gowda’s (1984) procedure and proposed an approach that keeps the features of the image intact and removes the redundancy using storage bin arrays. The reduced data set is obtained without sacrificing the dimensionality of the images.

Illustration

Consider a data set with 10 samples shown in Table 1, where each sample has four features. Assigning a bin size of 3 × 3 with a threshold of 0.1, the algorithm generates the samples shown in Table 2. Only four bins are non-empty, and every sample falls into one of the bins with reference numbers 3, 4, 5 and 6. Bins 4 and 6 are assigned the single samples 2 and 7, respectively. Hence, these bins contain only the normalised feature values of the respective samples as the updated data. Table 2 shows the normalised average value of the features in the respective bins. Thus, the algorithm yields reduced data while preserving all features.

Table 1 Data set

Sample   Feature values
1        5.1, 3.5, 1.4, 0.2
2        4.9, 3.0, 1.4, 0.2
3        4.7, 3.2, 1.3, 0.2
4        4.6, 3.1, 1.5, 0.2
5        5.0, 3.6, 1.4, 0.2
6        5.4, 3.9, 1.7, 0.4
7        7.0, 3.2, 4.7, 1.4
8        6.4, 3.2, 4.5, 1.5
9        6.9, 3.1, 4.9, 1.5
10       6.3, 3.3, 6.0, 2.5

3.2 Generation of standard subset

In conventional SOFM training, a BMU is determined for every presented pattern. In the proposed approach, we determine a standard subset of samples from the reduced data set using the CNN rule. This greatly influences the speed of convergence. The concept of a standard subset of samples was introduced by Subba Reddy et al. (1996) and the CNN rule was introduced by Gowda and Krishna (1979). The CNN decision rule iteratively produces a consistent subset from the reduced data set. This subset, when used as a stored reference set for the nearest neighbour decision rule, correctly clusters all the samples belonging to the original reduced data set used in training the network. These concepts are used to narrow down the samples for determining the BMU. The procedure to determine the standard subset of samples is given here.

Procedure to obtain standard subset
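As a rough illustration of how a distance threshold can condense the reduced data set, the sketch below keeps a sample as a representative only when no existing representative lies within Tnear of it, and then attaches every sample to its nearest representative. This is a simple threshold-based stand-in and is not claimed to be the authors' modified CNN procedure; the function name condense and the use of Euclidean distance are assumptions. The input values are the four reduced samples of Table 2.

```python
import numpy as np

def condense(samples, t_near):
    """Pick representatives so that every sample lies within t_near of one of them."""
    samples = np.asarray(samples, dtype=float)
    reps = []
    for i, x in enumerate(samples):
        # Keep x only if it is farther than t_near from all current representatives.
        if not reps or np.min(np.linalg.norm(samples[reps] - x, axis=1)) > t_near:
            reps.append(i)
    # Attach every sample to its nearest representative.
    dists = np.linalg.norm(samples[:, None, :] - samples[reps][None, :, :], axis=2)
    owner = np.array(reps)[np.argmin(dists, axis=1)]
    return reps, owner

# Reduced data set of Table 2 (bins 3, 4, 5 and 6, in that order).
reduced = np.array([
    [1.42647, 0.985294, 0.352941, 0.0],
    [1.38235, 0.823529, 0.352941, 0.0],
    [1.62255, 0.911766, 0.916666, 0.25],
    [2.0, 0.882353, 1.32353, 0.352941],
])

reps, owner = condense(reduced, t_near=1.0)
print(reps)            # indices of the representatives
print(owner.tolist())  # representative index assigned to each reduced sample
```

With Tnear = 1 this sketch keeps two representatives, one covering bins 3 and 4 and one covering bins 5 and 6. The representative vectors here are simply members of the reduced set, so they need not equal the feature values the authors report for the standard subset.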


Illustration

Consider a data set with 10 samples shown in Table 1, where each sample has four features. The result of the application of the data reduction technique is shown in Table 2. To obtain the standard subset of samples for this reduced data set, CNN is applied with threshold value Tnear = 1. This resulted in two representative samples in the standard subset, as shown in Table 3. Sample 1 in the standard subset represents non-empty bins 3 and 4; similarly, sample 2 represents non-empty bins 5 and 6.

Table 3 An example to obtain standard subset of samples

Sample in standard subset   Feature values               Samples in reduced data set
1                           1.426, 1.0147, 0.352, 0      1.42647, 0.985294, 0.352941, 0
                                                         1.38235, 0.823529, 0.352941, 0
2                           1.620, 0.923, 0.914, 0.25    1.62255, 0.911766, 0.916666, 0.25
                                                         2, 0.882353, 1.32353, 0.352941

3.3 Farthest neighbour initialisation of neurons

To initialise the neurons, we have adopted the concept of farthest neighbour (Gowda, 1984) rather than choosing the weight vectors at random from the input space. This is done to ensure that all the samples in the input space are covered and identified in a proper sequence. The procedure to initialise the grid is presented here.

Procedure farthest neighbour
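One common way to realise a farthest-neighbour selection is a greedy max-min pick, sketched below under stated assumptions. The choice of the first seed (the sample farthest from the data mean), the Euclidean distance and the function name farthest_neighbour_init are all hypothetical, so this should be read as an illustration of the idea rather than the procedure of Gowda (1984).

```python
import numpy as np

def farthest_neighbour_init(samples, k):
    """Greedily pick k samples that are mutually far apart, for use as initial weights."""
    samples = np.asarray(samples, dtype=float)
    # Assumed seed: the sample farthest from the mean of the data.
    first = int(np.argmax(np.linalg.norm(samples - samples.mean(axis=0), axis=1)))
    chosen = [first]
    # min_dist[i] = distance from sample i to its nearest already-chosen sample.
    min_dist = np.linalg.norm(samples - samples[first], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))  # the sample farthest from all chosen ones
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(samples - samples[nxt], axis=1))
    return samples[chosen].copy()
```

Compared with random selection, every initial weight vector starts in a different region of the input space, which is the coverage property the text asks for.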

4 Modified training procedure for SOFM

The reduced data set Ired of d dimension with nred number of samples and standard subset Isub of d dimension with ncon number of samples obtained during pre-processing stage are presented to the SOFM. The initial weight vectors are selected by the farthest neighbour method from the reduced image data set. The algorithm is described here. Algorithm
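Since the algorithm listing is not reproduced in this text, the following sketch shows one possible reading of the scheme and should not be taken as the authors' algorithm. Assumptions: each reduced sample is tied to its nearest standard-subset sample; the BMU is searched once per standard-subset sample; the BMU and its neighbours are then updated with the reduced samples tied to that standard sample; learning rate and neighbourhood width decay exponentially with time constants named after Tle and Tne. The function name train_modified_sofm is hypothetical, and the demonstration data are the reduced samples of Table 2.

```python
import numpy as np

def train_modified_sofm(i_red, i_sub, weights, grid, n_iter,
                        eta0=0.9, sigma0=0.9, t_le=500.0, t_ne=5.0):
    """BMU search on the standard subset, weight updates with the reduced data set."""
    i_red = np.asarray(i_red, dtype=float)
    i_sub = np.asarray(i_sub, dtype=float)
    grid = np.asarray(grid, dtype=float)
    w = np.array(weights, dtype=float)
    # owner[j] = index of the standard-subset sample nearest to reduced sample j.
    owner = np.argmin(
        np.linalg.norm(i_red[:, None, :] - i_sub[None, :, :], axis=2), axis=1)
    for t in range(n_iter):
        eta = eta0 * np.exp(-t / t_le)                   # assumed decay of learning rate
        sigma = max(sigma0 * np.exp(-t / t_ne), 1e-3)    # assumed decay, floored
        for s, rep in enumerate(i_sub):
            # One BMU search per standard-subset sample.
            bmu = np.argmin(np.linalg.norm(w - rep, axis=1))
            h = np.exp(-np.sum((grid - grid[bmu]) ** 2, axis=1) / (2.0 * sigma ** 2))
            # Update the BMU and its neighbours with the reduced samples it represents.
            for x in i_red[owner == s]:
                w += eta * h[:, None] * (x - w)
    return w

# Demonstration on the four reduced samples of Table 2.
i_red = np.array([
    [1.42647, 0.985294, 0.352941, 0.0],
    [1.38235, 0.823529, 0.352941, 0.0],
    [1.62255, 0.911766, 0.916666, 0.25],
    [2.0, 0.882353, 1.32353, 0.352941],
])
i_sub = i_red[[0, 3]]          # stand-in for a two-sample standard subset
grid = np.array([[0.0], [1.0]])  # two neurons on a one-dimensional map
w = train_modified_sofm(i_red, i_sub, weights=i_red[[0, 3]], grid=grid, n_iter=100)
print(np.round(w, 3))
```

Under this reading, the number of BMU searches per iteration equals ncon rather than the number of pixels, which is where the saving in training time would come from.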

5 Experiments

To determine the number of clusters and check the quality of clusters, a cluster validity index (Jain and Dubes, 1988) has been used. To measure the performance of the clustering algorithm, we have used the Davies Bouldin Index (DBI) (Davies and Bouldin, 1979) and the Cluster Tendency Index (CTI) (Sudhanva and Chidananda Gowda, 1992). For good clusters, a minimum value of DBI and a maximum value of CTI are preferred. We have tested the modified training scheme on the following data sets:

• Four band synthetic image

• Harangi 1991 image

• Harangi 1992 image.

All the experimental simulations were carried out on a personal computer with an Intel P-4 3.3 GHz processor and 1 GB RAM.
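For reference, the Davies Bouldin Index mentioned above can be computed as in the following sketch. It uses the usual form of the index with Euclidean distance and the mean distance to the centroid as the within-cluster scatter; whether the experiments used exactly these choices is an assumption, and the function name davies_bouldin is hypothetical. At least two clusters with distinct centroids are required.

```python
import numpy as np

def davies_bouldin(data, labels):
    """Davies Bouldin Index: mean over clusters of the worst scatter-to-separation ratio."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centroids = np.array([data[labels == c].mean(axis=0) for c in ids])
    # Within-cluster scatter: mean distance of the members to their centroid.
    scatter = np.array([
        np.linalg.norm(data[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    k = len(ids)
    worst = np.zeros(k)
    for i in range(k):
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(k) if j != i
        ]
        worst[i] = max(ratios)
    return float(worst.mean())
```

Compact, well-separated clusters give small ratios, which is why a lower DBI is read as a better clustering.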

5.1 Experiment on synthetic image

This image has four spectral bands and the size of the image is 256 × 256. The image has three clusters and the features are derived from the Iris flower data set. The samples from three classes of the Iris flower data set are randomly picked and placed to form different clusters of the image. To obtain a reduced set of samples, we first apply the data reduction technique on the image data set. This yielded nine samples in the reduced data set. In the next stage, the CNN rule is applied to the reduced set of samples. This results in five samples in the standard subset, which are representative of the samples present in the reduced data set for threshold Tnear = 0.39. The reduced data set and standard subset of samples are presented to the modified SOFM. The simulation parameters are shown in Table 4. The results obtained are shown in Figure 1(c) and (d) and Table 5. The clusters obtained for the conventional SOFM are shown in Figure 1(a) and (b). The cluster composition for both conventional SOFM and modified SOFM is shown in Table 6.

Table 4 Simulation parameters for synthetic image

Number   Parameter                                      Value
1        Number of samples                              65536
2        Number of features                             4
3        Size of storage bin array                      5 × 5
4        Threshold for data reduction Tdr               0.1
5        Number of iterations                           1500
6        Number of weight vectors                       3
7        Threshold to obtain subset of samples Tnear    0.39
8        Initial learning constant η(0)                 0.9
9        Initial neighbourhood σ(0)                     0.9
10       Time constant for neighbourhood Tne            5.0
11       Time constant for learning rate Tle            500

Table 5 Result of synthetic image

Number   Parameter                                      Value
1        Samples in the reduced data set                9
2        Samples in the standard subset                 5
3        Number of clusters                             3
4        Time taken by conventional SOFM                137.32 seconds
5        Time taken for data reduction technique        0.71 second
6        Time taken by the modified training scheme     0.04 second
7        Total time taken by the proposed SOFM          0.75 second

Figure 1 Clusters produced by SOFM and modified SOFM on synthetic image: (a) clusters produced by SOFM; (b) pie chart for SOFM; (c) clusters produced by modified SOFM; (d) pie chart for modified SOFM (see online version for colours)

Table 6 Clusters obtained by conventional SOFM and modified SOFM on synthetic image

            Conventional SOFM                  Modified SOFM
Cluster     Number of samples   Percentage     Number of samples   Percentage
Cluster 1   43323               66.1056        43323               66.1057
Cluster 2   13945               21.2783        11295               17.2348
Cluster 3   8268                12.6159        10918               16.6595

The results are compared with ground truth information in Table 7. It is found that the clusters produced by conventional SOFM have an accuracy of 100%, 95.98% and 71.51% for clusters 1, 2 and 3, respectively. The overall accuracy of clusters obtained is 94.54%, whereas modified SOFM produced clusters with an overall accuracy of 100%. The time taken to obtain the reduced data set is 0.71 s and training time taken by modified SOFM is 0.04 s. The total execution time taken by modified SOFM is 0.75 s whereas the conventional SOFM took 137.32 s. The DBI and CTI values for conventional SOFM are 0.271626 and 34.1612, respectively. The DBI and CTI values for modified SOFM are 0.560911 and 43.9303, respectively.

Table 7 Comparison of clusters obtained by SOFM and modified SOFM on synthetic image

            Conventional SOFM                     Modified SOFM
Cluster     Matched   Mismatched   % error        Matched   Mismatched   % error
Cluster 1   43323     0            0              43323     0            0
Cluster 2   10834     461          4.0815         11295     0            0
Cluster 3   7807      3111         28.4942        10918     0            0
Total       61964     3572         5.4504         65536     0            0

Matched and mismatched are the numbers of samples matched and mismatched against the ground truth.
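The percentage errors in Table 7 follow directly from the matched and mismatched counts, and the speed-up follows from the reported timings. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only.

```python
# (matched, mismatched) sample counts for conventional SOFM, taken from Table 7.
conventional = {
    "Cluster 1": (43323, 0),
    "Cluster 2": (10834, 461),
    "Cluster 3": (7807, 3111),
}

def percent_error(matched, mismatched):
    """Share of mismatched samples among all samples of a cluster, in percent."""
    return 100.0 * mismatched / (matched + mismatched)

errors = {name: percent_error(m, mm) for name, (m, mm) in conventional.items()}

total_matched = sum(m for m, _ in conventional.values())
total_mismatched = sum(mm for _, mm in conventional.values())
overall_error = percent_error(total_matched, total_mismatched)

# Reported timings: 137.32 s for conventional SOFM, 0.75 s in total for the modified scheme.
speedup = 137.32 / 0.75

for name, err in errors.items():
    print(name, round(err, 4))
print("overall", round(overall_error, 4), "speed-up", round(speedup, 1))
```

The overall accuracy is 100 minus the overall error, and the timing ratio shows the modified scheme running roughly two orders of magnitude faster on this image.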

5.2 Harangi 1991 image

A registered satellite image acquired on 3 February 1991 covering Harangi reservoir and its adjacent area of Coorg district of Karnataka state in India has been considered. The image is acquired from the LISS-II B1 sensor of the IRS 1A satellite. The size of the image is 260 × 260 and it has four spectral bands. Initially, the data reduction procedure is applied to obtain a reduced data set of 270 samples. A standard subset containing 25 samples is derived using the modified CNN rule. The simulation parameters are shown in Table 8.

Table 8 Simulation parameters for Harangi 1991 image

Number   Parameter                                      Value
1        Number of samples                              67600
2        Number of features                             4
3        Size of storage bin array                      25 × 25
4        Threshold for data reduction Tdr               2.0
5        Number of iterations                           3600
6        Number of weight vectors                       7
7        Threshold to obtain subset of samples Tnear    3
8        Initial learning constant η(0)                 0.9
9        Initial neighbourhood σ(0)                     0.9
10       Time constant for neighbourhood Tne            500
11       Time constant for learning rate Tle            500

The results obtained are shown in Table 9. The clusters produced by modified SOFM are shown in Figure 2(c) and (d), and those produced by conventional SOFM in Figure 2(a) and (b). The cluster composition for both is shown in Table 10.

Table 9 Result of Harangi 1991 image

Number   Parameter                                      Value
1        Samples in the reduced data set                270
2        Samples in the standard subset                 25
3        Number of clusters                             7
4        Time taken by conventional SOFM                945.77 seconds
5        Time taken for data reduction technique        0.68 second
6        Time taken by the modified training scheme     3.3 seconds
7        Total time taken by the proposed SOFM          3.98 seconds

Figure 2 Clusters produced by SOFM and modified SOFM on Harangi 1991 image: (a) clusters produced by SOFM; (b) pie chart for SOFM; (c) clusters produced by modified SOFM; (d) pie chart for modified SOFM (see online version for colours)

Table 10 Cluster composition of Harangi 1991 image produced by SOFM and modified SOFM

            Conventional SOFM        Modified SOFM
Cluster     Samples   Percentage     Samples   Percentage
Cluster 1   7         0.0104         22654     33.51
Cluster 2   11097     16.4157        8539      12.6317
Cluster 3   18580     27.4852        223       0.3299
Cluster 4   11893     17.5932        5         0.0074
Cluster 5   4921      7.2796         5873      8.6879
Cluster 6   13627     20.1583        1735      2.5666
Cluster 7   7475      11.0577        28571     42.26

The four major ground covers generated by modified SOFM are water, forest, coffee and crop, as shown in Table 11. The water body showing Harangi reservoir is conspicuous among the ground covers. Water, forest and crop are represented by clusters 2, 7 and 5, respectively. Coffee is represented by clusters 1 and 6. Clusters 3 and 4 are insignificant and cannot be considered as major ground covers. The four major ground covers generated by conventional SOFM are also water, forest, coffee and crop, as shown in Table 11. Water and crop are represented by clusters 7 and 5, respectively. Forest is represented by clusters 4 and 6, and coffee is represented by clusters 2 and 3. Cluster 1 is not significant and hence cannot be identified as a major ground cover. From Table 11, the percentage of ground covers is 11.05%, 37.75%, 43.9% and 7.28% for water, forest, coffee and crop, respectively, for conventional SOFM. In the case of modified SOFM, the percentage of ground covers for water, forest, coffee and crop is 12.63%, 45.22%, 33.11% and 8.68%, respectively. Only a small variation in the number of samples per ground cover is observed. The execution time of conventional SOFM is 945.77 s. The time taken to obtain the reduced data set is 0.68 s and the training time taken by modified SOFM is 3.3 s; hence, the total execution time of modified SOFM is 3.98 s. The execution of modified SOFM is therefore much faster than that of conventional SOFM. The DBI and CTI values for the modified SOFM are 0.96427 and 90.7235, respectively. The DBI and CTI values for conventional SOFM are 0.81755 and 101.363, respectively.

Table 11 Major ground covers for Harangi 1991 image generated by SOFM and modified SOFM

               Conventional SOFM                  Modified SOFM
Ground cover   Cluster   Samples   Percentage     Cluster   Samples   Percentage
Water          7         7475      11.0576        2         8539      12.6317
Forest         4, 6      25520     37.7514        7         28571     42.26
Coffee         2, 3      29677     43.9008        1, 6      24389     36.07
Crop           5         4921      7.2795         5         5873      8.6879
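The ground-cover figures in Table 11 are obtained by summing the cluster counts of Table 10 over the clusters assigned to each cover. A small sketch of that aggregation is given below; the dictionary and function names are illustrative, and the cluster-to-cover assignment is copied from Table 11.

```python
N_SAMPLES = 67600  # 260 x 260 pixels

# Samples per cluster from Table 10.
conventional = {1: 7, 2: 11097, 3: 18580, 4: 11893, 5: 4921, 6: 13627, 7: 7475}
modified = {1: 22654, 2: 8539, 3: 223, 4: 5, 5: 5873, 6: 1735, 7: 28571}

# Cluster-to-cover assignment from Table 11.
cover_conventional = {"Water": [7], "Forest": [4, 6], "Coffee": [2, 3], "Crop": [5]}
cover_modified = {"Water": [2], "Forest": [7], "Coffee": [1, 6], "Crop": [5]}

def aggregate(counts, mapping):
    """Return {cover: (samples, percentage)} by summing the listed clusters."""
    result = {}
    for cover, clusters in mapping.items():
        samples = sum(counts[c] for c in clusters)
        result[cover] = (samples, 100.0 * samples / N_SAMPLES)
    return result

covers_conventional = aggregate(conventional, cover_conventional)
covers_modified = aggregate(modified, cover_modified)

for cover in cover_conventional:
    print(cover, covers_conventional[cover], covers_modified[cover])
```

Clusters left out of the mapping (cluster 1 for conventional SOFM, clusters 3 and 4 for modified SOFM) are the insignificant ones, so the percentages of the four covers do not quite sum to 100.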

5.3 Experiment on Harangi 1992 image

A registered satellite image acquired on 27 December 1992 covering Harangi reservoir and its adjacent area of Coorg district of Karnataka state in India has been considered. The image is acquired from the LISS-II B1 sensor of the IRS 1B satellite. The size of the image is 260 × 260 and it has four bands. The data reduction yielded 294 samples in the reduced data set and the CNN rule yielded 25 samples in the standard subset. Table 12 lists the simulation parameters for this data set. Figure 3(a) and (c) depict the clusters generated by SOFM and modified SOFM, respectively. The results obtained are shown in Table 13. Conventional SOFM was tried for 7 clusters with 3600 iterations, whereas modified SOFM produced 10 clusters in 5100 iterations. The cluster compositions are shown in Table 14.

The modified SOFM generated four major ground covers, namely forest, water, coffee and crop. Forest is represented by clusters 3, 6 and 9. Water, coffee and crop are represented by clusters 4, 2 and 7, respectively. The water body showing Harangi reservoir is prominent among the clusters obtained for all the three methods. Other clusters of insignificant size are not considered for identification of major ground covers. The four major ground covers generated by conventional SOFM are water, forest, coffee and crop, as depicted in Table 15. Water and crop correspond to clusters 1 and 2, respectively. Coffee is represented by clusters 4 and 5. Forest is represented by clusters 3 and 7. As the number of samples in cluster 6 is small, it cannot be considered as a major ground cover.

From Table 15, the percentage of ground covers is 13.5%, 25.75%, 49.97% and 10.06% for water, forest, coffee and crop, respectively, for modified SOFM. In conventional SOFM, the percentage of ground covers for water, forest, coffee and crop is 12.55%, 23.81%, 49.97% and 13.18%, respectively, as shown in Table 15. Only a small variation in the number of samples per ground cover is observed. The time taken to obtain the reduced data set is 0.65 s and the training time taken by modified SOFM is 6.3 s; hence, the total execution time of modified SOFM is 6.95 s. The execution time of conventional SOFM is 924.23 s; hence, modified SOFM is much faster than conventional SOFM. The DBI and CTI values obtained for the modified SOFM are 0.797923 and 280.181, respectively. For the conventional SOFM, the values are 0.814746 and 114.939, respectively.

Table 12 Simulation parameters for Harangi 1992 image

Number   Parameter                                      Value
1        Number of samples                              67600
2        Number of features                             4
3        Size of storage bin array                      25 × 25
4        Threshold for data reduction Tdr               2.0
5        Number of iterations                           3600 or 5100
6        Number of weight vectors                       7 or 10
7        Threshold to obtain subset of samples Tnear    3
8        Initial learning constant η(0)                 0.9
9        Initial neighbourhood σ(0)                     0.9
10       Time constant for neighbourhood Tne            1000
11       Time constant for learning rate Tle            1000


Figure 3 Clusters produced by SOFM and modified SOFM on Harangi 1992 image: (a) clusters produced by SOFM; (b) pie chart for SOFM; (c) clusters produced by modified SOFM; (d) pie chart for modified SOFM (see online version for colours)

Table 13 Result of Harangi 1992 image

Number   Parameter                                      Value
1        Samples in the reduced data set                294
2        Samples in the standard subset                 25
3        Number of clusters                             10
4        Time taken by conventional SOFM                923.23 seconds
5        Time taken for data reduction technique        0.65 second
6        Time taken by the modified training scheme     6.3 seconds
7        Total time taken by the proposed SOFM          6.95 seconds

Table 14 Cluster composition for Harangi 1992 image produced by SOFM and modified SOFM

             Conventional SOFM        Modified SOFM
Cluster      Samples   Percentage     Samples   Percentage
Cluster 1    8487      12.5547        4         0.0059
Cluster 2    8914      13.1864        33785     49.97
Cluster 3    12426     18.3817        11490     16.9970
Cluster 4    19194     28.3935        9131      13.5074
Cluster 5    11939     17.6612        1968      2.9112
Cluster 6    2966      4.3876         2744      4.0592
Cluster 7    3674      5.4349         4834      7.15
Cluster 8    -         -              453       0.6701
Cluster 9    -         -              3178      4.7012
Cluster 10   -         -              13        0.0192

Table 15 Major ground covers generated by SOFM and modified SOFM for Harangi 1992 image

               Conventional SOFM                  Modified SOFM
Ground cover   Cluster   Samples   Percentage     Cluster   Samples   Percentage
Water          1         8487      12.5547        4         9131      13.5074
Forest         3, 7      16100     23.8165        3, 6, 9   17412     25.7573
Coffee         4, 5, 6   34099     50.44          2         33785     49.97
Crop           2         8914      13.18          5, 7      6802      10.06

Comparing results of Harangi 1991 and Harangi 1992, a small rise in percentage of water in the year 1992 can be observed. A rise in percentage of coffee plantation is also seen. A drastic reduction in the percentage of forest area is also observed. From this, it can be inferred that the cultivation of coffee and other crops has led to the deforestation in a span of one year.

6 Conclusion

In conventional training scheme, the data set is directly applied to train SOFM. This presents a serious limitation in terms of time and memory requirements when clustering multispectral image data. From experiments, it is found that a large memory is required to store clusters of images in conventional SOFM.

We have proposed a modified training scheme for SOFM to overcome the limitations of conventional SOFM. The scheme initialises the weight vectors by the farthest neighbour method, selects the BMU using the standard subset obtained with the CNN rule, and updates the BMU and its neighbouring neurons using the reduced data set samples. In the modified SOFM, the original image data is drastically reduced using the data reduction technique. Hence, much less memory is required to obtain clusters of images. It is found that the time required to obtain clusters using modified SOFM is significantly less than that of conventional SOFM. Experiments have demonstrated that the modifications introduced to train SOFM result in faster convergence while maintaining improved accuracy among different clusters.

References

Chitroub, S., Houacine, A. and Sansal, B. (2001) ‘Principal component analysis of multispectral images using neural network’, ACS/IEEE International Conference on Computer Systems and Applications, Beirut, Lebanon, pp.89–95.

Davies, D.L. and Bouldin, D.W. (1979) ‘A cluster separation measure’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-1, No. 2, pp.224–227.

De Backer, S., Naud, A. and Scheunders, P. (1998) ‘Non-linear dimensionality reduction techniques for unsupervised feature extraction’, Pattern Recognition Letters, Vol. 19, pp.711–720.

Gowda, K.C. (1984) ‘A feature reduction and unsupervised classification algorithm for multispectral data’, Pattern Recognition, Vol. 17, No. 6, pp.667–676.

Gowda, K.C. and Krishna, G. (1979) ‘The condensed nearest neighbor rule using the concept of mutual nearest neighborhood’, IEEE Transactions on Information Theory, Vol. IT-25, No. 4, pp.488–490.

Jagannathan, S., Nagabhushan, P., Gowda, K.C. and Rajangam, R.K. (1996) ‘A number theory based image coding’, Australian and New Zealand Conference on Intelligent Information Systems, Adelaide, Australia, pp.147–150.

Jain, A.K. and Dubes, R.C. (1988) Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, New Jersey.

Kohonen, T. (1982a) ‘Analysis of a simple self-organizing process’, Biological Cybernetics, Vol. 44, pp.135–140.

Kohonen, T. (1982b) ‘Self-organized formation of topologically correct feature maps’, Biological Cybernetics, Vol. 43, pp.59–69.

Kohonen, T. (1987) ‘Adaptive, associative and self-organizing functions in neural computing’, Applied Optics, Vol. 26, No. 3, pp.4910–4918.

Kohonen, T.K. (1993) ‘Things you haven’t heard about the self-organizing map’, Proceeding of IEEE International Conference on Neural Networks, San Francisco, California, pp.1147–1156.

Kohonen, T.K., Kangas, J.A. and Laaksonen, T. (1990) ‘Variants of self-organizing maps’, IEEE Transactions on Neural Networks, Vol. 1, No. 1, pp.93–99.

Lu, D. and Weng, Q. (2007) ‘A survey of image classification methods and techniques for improving classification performance’, International Journal of Remote Sensing, Vol. 28, No. 5, pp.823–870.

Mielikainen, J. and Kaarna, A. (2002) ‘Improved back end for integer PCA and wavelet transforms for lossless compression of multispectral images’, Proceedings 16th International Conference on Pattern Recognition, Vol. 2, Quebec, Canada, pp.257–260.

Roli, F., Serpico, S.B. and Vernazza, G. (1996) Fuzzy Logic and Neural Network Handbook, McGraw-Hill, pp.1501–1528.

Srinivas, V.V., Shivam Rao, T., Rao, R.A. and Govindaraju, S. (2008) ‘Regional flood frequency analysis by combining self-organizing feature map and fuzzy clustering’, Journal of Hydrology, Vol. 348, pp.148–156.

Subba Reddy, N.V., Nagabhushan, P. and Gowda, K.C. (1996) ‘A neural network based expert system model for conflict resolution’, Australian New Zealand Conference on Intelligent System, Adelaide, Australia, pp.229–232.

Sudhanva, D. and Chidananda Gowda, K. (1992) ‘Dimensionality reduction using geometric projections: a new technique’, Pattern Recognition, Vol. 25, No. 8, pp.809–817.

Tate, S.R. (1994) ‘Band ordering in lossless compression of multispectral images’, Data Compression Conference, DCC ’94, Proceedings, pp.311–320.

Tran, T.N., Wehrens, R. and Buydens, L.M.C. (2005) ‘Clustering multispectral images: a tutorial’, Journal of Chemometrics and Intelligent Laboratory Systems, Vol. 77, pp.3–17.

Vasilyev, S.V. (1997) ‘An optimal data loss compression technique for remote surface multiwave mapping’, Astronomical Data Analysis Software and Systems VII ASP Conference Series, Vol. 145, pp.133–136.

Wehrens, R., Buydens, L.M.C., Fraley, C. and Raftery, A.E. (2004) ‘Model-based clustering for image segmentation and large datasets via sampling’, Journal of Classification, Vol. 21, No. 2, pp.231–253.

Zhu, C. and Po, L.M. (1996) ‘Partial distortion sensitive competitive learning algorithm for optimal codebook design’, Electronics Letters, Vol. 32, No. 19, pp.1757–1758.

allowing the model to be used to predict traffic properties as networks and traffic evolve. ... key properties as a function of meaningful network parameters. They .... at quite high utilisations and be a useful dimensioning tool for core networks. .