
MODIFIED KOHONEN LEARNING NETWORK AND APPLICATION IN CHINESE CHARACTER RECOGNITION

Hong Cao and Alex C. Kot
School of Electrical and Electronics Engineering, Nanyang Technological Univ., Singapore

ABSTRACT

Conventional multilayer neural networks are rarely used to solve large-scale pattern matching problems without grouping the classes and creating sub-networks. In this paper, a modified single-layer Kohonen learning network structure based on generalized learning vector quantization (GLVQ) theory is proposed. By cascading two of the proposed learning networks for handwritten Chinese character recognition, the training, pre-classification and final recognition processes are easily integrated. Experiments conducted with off-line handwritten samples show the efficiency of the network.

1. INTRODUCTION

Handwritten Chinese character recognition is a typical pattern recognition problem of large scale (PRPLS) due to the large number of character classes and the many similar character patterns. The Chinese dictionary Zhong Hua Zi Hai, published in 1994, contains over 86,000 characters, and the number of characters is still increasing. However, to design a decent recognition system for simplified Chinese characters, only the 3,755 most frequently used characters in the GB 2312-80 1st set need to be included. These characters reportedly cover 99.9% of the occurrences of Chinese characters in mainland China.

In recent years, numerous pattern matching approaches using high-dimensional statistical features have achieved reasonably stable results in off-line Chinese character recognition. However, among these approaches, multilayer neural networks (NN) have rarely been applied because of the capacity limit of a normal multilayer neural network. The few existing papers using NN [1] have to divide the Chinese characters into smaller subsets prior to NN training. Doing so raises another issue: a robust standard must be defined to pre-group the Chinese characters. Meeting this requirement may be cumbersome if a complicated algorithm and many extra features are used; otherwise, the pre-classification can hardly be robust.

In this paper, an adaptive handwritten Chinese character recognition system using modified Kohonen learning networks (MKLN) is presented to consistently tackle the PRPLS. Section 2 describes a Gabor feature extraction technique. Section 3 elaborates on the structure and functionality of the proposed network. Section 4 illustrates a cascaded MKLN network with two sub-networks used for pre-classification and recognition, respectively. Section 5 presents the experiments conducted and the corresponding results.

___________________________________________ 0-7803-8560-8/04/$20.00©2004IEEE

2. GABOR FEATURE EXTRACTION

Figure 1: (a) Histogram study of stroke width. (b) Average stroke width estimation model (L: stroke length, W: stroke width, A: stroke area).

A recent study [2] has shown the 4-orientation Gabor feature to be superior to the DEF feature in Chinese character recognition. In this paper, to obtain the Gabor features, we adopt a standard form of the Gabor filter expression as follows [4]:

$$f(x, y, \theta_k, \lambda, \sigma) = \exp\left\{-\frac{1}{2}\left[\frac{R_1^2 + R_2^2}{\sigma^2}\right]\right\} \times \exp\left\{i\,\frac{2\pi R_1}{\lambda}\right\}, \tag{1}$$

where $R_1 = x\cos\theta_k + y\sin\theta_k$ and $R_2 = -x\sin\theta_k + y\cos\theta_k$. $\lambda$ and $\theta_k$ are the wavelength and orientation of the sinusoidal 2-D plane wave, respectively. $\sigma$ is the standard deviation, which controls the size of the Gaussian envelope.
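As an illustration of Eq. (1), the filter can be sampled on a discrete grid. The following is a minimal NumPy sketch, not the authors' implementation: the function name `gabor_kernel`, the 17×17 kernel size and the four orientations 0, π/4, π/2 and 3π/4 are assumptions made for the example, while λ = 11.2 and σ = 5.6 are the settings adopted in this paper.

```python
import numpy as np

def gabor_kernel(size, theta, lam, sigma):
    """Sample Eq. (1) on a size x size grid centred at the origin."""
    half = (size - 1) / 2.0
    # y indexes rows and x indexes columns of the sampling grid.
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r1 = x * np.cos(theta) + y * np.sin(theta)
    r2 = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-0.5 * (r1 ** 2 + r2 ** 2) / sigma ** 2)  # Gaussian envelope
    carrier = np.exp(1j * 2.0 * np.pi * r1 / lam)               # complex plane wave
    return envelope * carrier

# Assumed orientation set for a 4-orientation filter bank.
thetas = [k * np.pi / 4 for k in range(4)]
bank = [gabor_kernel(size=17, theta=t, lam=11.2, sigma=5.6) for t in thetas]
```

Each kernel in `bank` would then be convolved with the normalized character image; how the responses are sampled into features is not covered by this sketch.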

Figure 2: Gabor filtered output image

To design a good Gabor filter for Chinese character recognition, we have to choose proper values for $\lambda$ and $\sigma$. Reference [4] has shown that the Gabor filter is most sensitive to lines of width $\lambda/2$ and orientations $\theta_k \pm \pi/2$. Normalized Chinese characters usually have a rather uniform stroke width. Figure 1(a) shows a histogram of the widths of the stroke cross-sections from 7,510 normalized samples of 64×64 pixels. A second statistical study estimates the stroke width W based on the model in Figure 1(b), where A is measured by counting the total number of stroke pixels in the normalized image and L is obtained by counting the number of pixels in the single-pixel skeleton of the normalized character. The estimated average stroke width is 5.6 pixels; hence we set $\lambda = 11.2$ and, similar to [4], $\sigma = 5.6$ to optimize the filter response.

3. A MODIFIED KOHONEN LEARNING NETWORK

To consistently perform training, pre-classification and recognition, we propose a modified Kohonen learning network (MKLN), which includes multiple prototypes of all C available characters. The Kohonen network was originally used for unsupervised competitive learning in data clustering. Here, we make use of its network architecture and modify the output layer to make it a supervised learning network. The remainder of this section elaborates on how it works.

Prior to Kohonen training, the training samples of each character are clustered into Q mutually exclusive clusters with the k-means clustering algorithm, where Q is the number of clusters to be created for each character. In total, J = Q × C clusters are created. After the clustering process is complete, each cluster is assigned a unique cluster identity j (j = 1, 2, …, J), and all training samples contained in the jth cluster are labeled with j. As shown in Figure 3, the Kohonen layer consists of J processing nodes, each receiving an input quasi-feature vector of N dimensions. Each processing node in the network corresponds to a character prototype built from a cluster of samples. To shorten the convergence time, each prototype $\mathbf{w}_j$ is initialized as the corresponding cluster center:

$$\mathbf{w}_j = \frac{1}{S_j}\sum_{i=1}^{S_j}\mathbf{x}_i^j, \tag{2}$$

where $\mathbf{x}_i^j$ is the ith feature vector of cluster j and $S_j$ is the size of the jth cluster.
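The initialization in Eq. (2) amounts to a per-cluster mean. The sketch below assumes that the k-means step has already produced a cluster identity for every training sample; the function name `init_prototypes`, the 0-based cluster identities and the toy data are illustrative assumptions rather than part of the paper.

```python
import numpy as np

def init_prototypes(features, cluster_ids, num_clusters):
    """Return a (J, N) array whose row j is the mean of cluster j, as in Eq. (2)."""
    dim = features.shape[1]
    sums = np.zeros((num_clusters, dim))
    np.add.at(sums, cluster_ids, features)                     # per-cluster sum of x_i^j
    sizes = np.bincount(cluster_ids, minlength=num_clusters)   # cluster sizes S_j
    if np.any(sizes == 0):
        raise ValueError("every cluster must contain at least one sample")
    return sums / sizes[:, None]

# Toy data: three 2-D samples, the first two in cluster 0, the third in cluster 1.
features = np.array([[0.0, 0.0], [2.0, 2.0], [10.0, 0.0]])
cluster_ids = np.array([0, 0, 1])
prototypes = init_prototypes(features, cluster_ids, num_clusters=2)
```

For the toy data, the first prototype is the midpoint of the two samples in cluster 0, and the second prototype coincides with the single sample in cluster 1.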

Figure 3: Architecture of a modified Kohonen learning network

After the initialization of the network, fine tuning is achieved iteratively in the following steps:

Step 1: Initialize the epoch number t = 1.

Step 2: Randomize the order of all training quasi-feature vectors.

Step 3: Set $\xi_g = 1$, where g is the cluster label of the input quasi-feature vector $\mathbf{q}$, and set $\xi_j = 0$ for all $j \neq g$.

Step 4: Set $\xi_i = 1$, $\xi_g = 0$ and $g = i$ if $d_i < d_g$, where the two clusters g and i are from the same character class.

Step 5: Initialize the vector of training flags $\mathbf{z}$ and an intermediate vector $\boldsymbol{\tau}$ using

$$z_j = 0, \quad \tau_j = P, \quad j = 1, 2, \ldots, J. \tag{3}$$

Step 6: Calculate $\mathbf{d} = [d_1, \ldots, d_j, \ldots, d_J]^T$, the distance vector from $\mathbf{q}$ to all the character prototypes $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_J$, using

$$d_j = \|\mathbf{q} - \mathbf{w}_j\|^2 - \|\mathbf{q}\|^2 = \|\mathbf{w}_j\|^2 - 2\mathbf{q}^T\mathbf{w}_j. \tag{4}$$

Step 7: Find the node whose input distance $d_m$ is the Pth smallest among all nodes, where m denotes the node's cluster identity.

Step 8: Use $d_m$ as a threshold to obtain the vector $\mathbf{y}$ at the output layer of the MKLN using

$$y_j = \begin{cases} 1, & \text{if } d_j \le d_m \\ 0, & \text{otherwise.} \end{cases} \tag{5}$$

Step 9: Set the vector $\boldsymbol{\tau}$ according to the distance ranking, e.g. $\tau_j$ is set to 1 if $d_j$ is the smallest among all distances and $y_j = 1$, and $\tau_j$ is set to 2 if $d_j$ is the second smallest and $y_j = 1$.

Step 10: Obtain the vector $\mathbf{z}$ using

$$z_j = \begin{cases} \xi_j - y_j, & \text{if } d_g > d_m \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$

Step 11: Update $\mathbf{w}_j$ if $z_j \neq 0$:

$$\mathbf{w}_j(t) = \mathbf{w}_j(t-1) + \alpha_j\,(\mathbf{q} - \mathbf{w}_j(t-1))\,z_j, \tag{7}$$

where $\alpha_j$ is the tuning strength in [0, 1]. To ensure convergence of the learning, the adaptation rate $\alpha_j$ is evaluated in compliance with the GLVQ algorithm [3, 5] as follows:

$$\alpha_j = \begin{cases} \dfrac{\alpha_0 v_j D_j}{\tau_j\, t\,(D_j + D_m)^2}, & \text{if } j \neq g \\[2ex] \dfrac{\alpha_0 v_l D_j}{t\,(D_l + D_j)^2}, & \text{otherwise,} \end{cases} \tag{8}$$

where $\alpha_0$ is a constant and $D_j$, $\mu_j$, $v_j$ and $l$ are derived using

$$\begin{aligned} D_j &= d_j + \|\mathbf{q}\|^2, \quad j = 1, \ldots, J \\ \mu_j &= \begin{cases} \dfrac{D_g - D_j}{D_g + D_j}, & \text{if } j \neq g \\[2ex] \dfrac{D_g - D_l}{D_g + D_l}, & \text{otherwise} \end{cases} \\ H_j(t) &= \frac{1}{1 + e^{-\mu_j t}} \\ v_j &= H_j(t) \times (1 - H_j(t)) \\ l &= \arg\min_j(\mathbf{D}), \quad \mathbf{D} = [D_1, \ldots, D_j, \ldots, D_J]. \end{aligned} \tag{9}$$

Step 12: If there are still unused training vectors left, feed the next vector $\mathbf{q}'$ into the network and go back to Step 3. Otherwise, increment t by 1 and go back to Step 2.

From Steps 4–10, the output $z_j$ of a node can only be one of −1, 0 or 1. When $z_j = 0$, $\mathbf{w}_j^{new} = \mathbf{w}_j^{old}$ and node j is not updated. When $z_j \neq 0$, node j is moved either towards $\mathbf{q}$ or away from $\mathbf{q}$.

The major advantage of the above network is its flexibility and stability. Because all the nodes are essentially independent, insignificant nodes are easily removed without affecting the rest of the network. By properly selecting the numbers P and N, the network can be used for either pre-classification or recognition.

4. CASCADED MKLN NETWORK

To speed up recognition in PRPLS applications, it is usually desirable to pre-select a number of candidates with a small portion of the extracted features prior to final recognition. Because nodes are easily added to or deleted from the proposed MKLN during runtime, multiple MKLN networks can be cascaded to perform system training or recognition.
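Each MKLN in such a cascade is tuned with the fine-tuning steps of Section 3. The following NumPy sketch processes a single training vector through Steps 3, 6, 7, 8, 10 and 11 under simplifying assumptions that are not part of the paper: cluster identities are 0-based, the adaptation rate is a fixed scalar `alpha` instead of the GLVQ-based rate of Eq. (8), the relabeling of Step 4 and the ranking vector of Step 9 are omitted, and the function name `mkln_step` is hypothetical.

```python
import numpy as np

def mkln_step(prototypes, q, g, P, alpha):
    """Process one training vector q whose cluster label is g (0-based).

    Returns the updated prototypes and the flag vector z of Step 10.
    """
    J = prototypes.shape[0]
    xi = np.zeros(J)
    xi[g] = 1.0                                    # Step 3: target indicator
    # Step 6, Eq. (4): squared distance with the constant ||q||^2 dropped.
    d = np.sum(prototypes ** 2, axis=1) - 2.0 * prototypes @ q
    d_m = np.partition(d, P - 1)[P - 1]            # Step 7: P-th smallest distance
    y = (d <= d_m).astype(float)                   # Step 8, Eq. (5)
    # Step 10, Eq. (6): flags are non-zero only when g misses the top P.
    z = xi - y if d[g] > d_m else np.zeros(J)
    # Step 11, Eq. (7): rows with z = 0 are left unchanged.
    updated = prototypes + alpha * z[:, None] * (q - prototypes)
    return updated, z

prototypes = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
q = np.array([1.0, 0.0])
new_prototypes, z = mkln_step(prototypes, q, g=1, P=1, alpha=0.5)
```

In this toy case the nearest node is node 0 while the label is node 1, so node 0 is pushed away from `q`, node 1 is pulled towards it, and node 2 is untouched. If the labeled node is already among the P nearest, all flags are zero and no prototype moves.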

Figure 4: Cascaded MKLN network

In our implementation, two MKLN networks are cascaded: one for pre-selection of candidates and the other for final recognition. The cascaded MKLN network is illustrated in Figure 4. MKLN_P is used for pre-classification, where $P_P$ and $N_P$ are empirical constants and $\mathbf{w}_i^P$ is a sub-vector of $\mathbf{w}_i$; MKLN_R is used for final recognition, where $P_R = 1$ and $N_R = N - N_P$. The vector $\mathbf{z}_P$ is used to control the tuning of the pre-classification representatives $\mathbf{w}_i^P$ when a pre-classification error occurs. If no pre-classification error occurs, the vector $\mathbf{y}_P$ is used to select the corresponding nodes in the recognition network MKLN_R, and the pre-classification distance vector $\mathbf{d}_P$ is exported to MKLN_R as well.

When a pre-classification error occurs, the vectors $\mathbf{d}_P$ and $\mathbf{z}_P$ in the pre-classification MKLN are used to train the pre-classification prototypes $\mathbf{w}_i^P$, i = 1, 2, …, J, which are sub-vectors of $\mathbf{w}_i$, i = 1, 2, …, J. When a final recognition error occurs, the overall Euclidean distance vector $\mathbf{d} = \mathbf{d}_P + \mathbf{d}_R$ and $\mathbf{z}_R$ are used to train the character prototypes $\mathbf{w}_i$, i = 1, 2, …, J.

Both final recognition and pre-classification require training to become more accurate. However, training for pre-classification is frequently neglected. The cascaded MKLN network automatically and consistently takes care of both. The network structure can be extended to solve other PRPLS problems.
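The recognition path through the cascade can be sketched as a two-stage distance search. The code below is an illustration under stated assumptions rather than the authors' implementation: the pre-classification sub-vector is taken to be the first `n_p` components of each prototype (the paper only states that it is a sub-vector), plain squared Euclidean distances are used in place of the offset form of Eq. (4), which leaves the ranking within each stage unchanged, and the names `cascade_recognize` and `node_class` are hypothetical.

```python
import numpy as np

def cascade_recognize(prototypes, node_class, q, n_p, p_p):
    """Return the predicted class of q using a two-stage distance search."""
    # Stage 1 (MKLN_P): distances on the first n_p feature components only.
    d_p = np.sum((prototypes[:, :n_p] - q[:n_p]) ** 2, axis=1)
    candidates = np.argsort(d_p)[:p_p]             # the p_p nearest nodes
    # Stage 2 (MKLN_R): remaining components, evaluated for candidates only.
    d_r = np.sum((prototypes[candidates, n_p:] - q[n_p:]) ** 2, axis=1)
    d = d_p[candidates] + d_r                      # overall distance d_P + d_R
    return node_class[candidates[np.argmin(d)]]

prototypes = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 5.0],
    [9.0, 9.0, 0.0],
])
node_class = np.array([10, 11, 12])                # hypothetical class labels
q = np.array([0.1, 0.0, 4.8])
predicted = cascade_recognize(prototypes, node_class, q, n_p=2, p_p=2)
```

Because the second stage reuses the first-stage distances and only evaluates the remaining components for the pre-selected candidates, the cost of the full-dimensional comparison is limited to `p_p` nodes instead of all J.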

5. EXPERIMENTAL RESULTS AND DISCUSSIONS

An off-line GB 2312-80 1st set Chinese character database containing 3,755 character classes is used to train and test the recognition system. The database contains 81 bitmap samples per character. To represent the character patterns efficiently, we extract 256 Gabor features together with 16 cross-count features to form a raw 272-dimensional feature vector. To make the features more discriminative and to reduce the dimensionality, Linear Discriminant Analysis (LDA) is applied to transform each raw feature vector into a new feature vector q of 128 dimensions, which is referred to as the quasi-feature vector.

The cascaded MKLN network in Fig. 4 was used for the overall system training. In the experiment, we set $\alpha_0 = 0.8$; 71 randomly selected samples per character class are used as training samples, and the remaining 10 samples per character are used for testing.

Figure 5: MKLN training for (a) pre-classification ($P_P = 100$, $N_P = 40$) and (b) recognition ($P_R = 1$, $N_R = 88$)

As shown in Fig. 5, the MKLN training improves both pre-classification and recognition accuracies significantly. For training samples, the error rate drops by 84.4% for pre-classification, from 0.064% to 0.01%, and by 97.7% for recognition, from 4.35% to 0.1%, within 10 epochs of training. For testing samples, both the pre-classification and recognition accuracies improve at the beginning but start dropping after 4 or 5 epochs. This is probably because some badly written or wrongly labeled training samples start dominating the system tuning after a few epochs. To prevent this, the bad samples need to be identified manually and quarantined during training. Overall, the results demonstrate the effectiveness of the proposed learning network.

6. REFERENCES

[1] H. J. Lin and S. H. Yen, “A scheme of on-line Chinese character recognition using neural networks”, IEEE Int. Conf. on Syst., Man and Cybern., vol. 4, pp. 3528-3533, Oct. 1997.

[2] Q. Huo, Z. D. Feng and Y. Ge, “A study on the use of Gabor features for Chinese OCR”, in Proc. Int. Symp. on Intell. Multimedia, Video & Speech Signal Process., pp. 389-392, 2001.

[3] A. Sato and K. Yamada, “A formulation of learning vector quantization using a new misclassification measure”, in Proc. Int. Conf. on Pattern Recog., vol. 1, pp. 322-325, 1998.

[4] X. Wang, X. Ding and C. Liu, “Optimized Gabor filter based feature extraction for character recognition”, Int. Conf. on Pattern Recog., vol. 4, pp. 223-226, 2002.

[5] M. K. Tsay, K. H. Shyu and P. C. Chang, “Feature transformation with generalized learning vector quantization for hand-written Chinese character recognition”, IEICE Trans. Inf. & Syst., vol. E82-D, 1999.

