IEEE TRANSACTIONS ON MAGNETICS, VOL. 45, NO. 10, OCTOBER 2009


A Generalized Data Detection Scheme Using Hyperplane for Magnetic Recording Channels With Pattern-Dependent Noise

Seiichi Mita and Vo Tam Van

Toyota Technological Institute, Hisakata, Tempaku, Nagoya 468-8511, Japan

We propose a novel data-detection scheme using support vector machine techniques in the presence of pattern-dependent noise on magnetic recording channels. First, the log-likelihood ratios (LLRs) of data series were generated using the Bahl–Cocke–Jelinek–Raviv algorithm. Second, these LLRs were mapped to a 3-D space, and hyperplanes for data discrimination were generated using the radial-basis-function kernel. Third, the LLR of each bit was rescaled on the basis of the distance from the hyperplanes and then fed to an LDPC decoder. We evaluated the performance of the proposed method by retrieving a real data series from a perpendicular magnetic recording channel, and obtained a bit-error rate of approximately $10^{-3}$. For projective geometry–low-density parity-check codes with a code rate of 0.93, the proposed method can reduce the iteration number for a sum-product algorithm using conventional LLRs by approximately half.

Index Terms: Hyperplane, partial response, projective geometry–low-density parity-check (PG–LDPC) codes, support vector machine (SVM).

I. INTRODUCTION

The choice of a data detection method is a key consideration in the implementation and development of hard-disk drives (HDDs) with higher recording density and higher data reliability. An increase in areal recording density has caused a severe deterioration in signal performance, such as from pattern-dependent noise and nonlinear distortion, and many methods have been proposed to overcome these problems [1]–[4]. In particular, considerable progress has been made in the theoretical study of maximum a posteriori (MAP) and maximum-likelihood detectors, which can be applied to intersymbol-interference (ISI) and pattern-dependent noise channels [5]. These detectors reduce data errors quite effectively at the cost of complexity. However, they can only be applied to a 1-D data series. In the near future, the signal retrieved from a magnetic disk with a very narrow track width will suffer significant deterioration from 2-D pattern dependency. In this environment, each combination of recording patterns needs its own appropriate threshold for data discrimination. Thus, we focused our attention on the search for a more general method, in order to reduce the errors due to multidimensional pattern-dependent noise.

The realization of a novel detection method requires a technique by which curved surfaces for thresholds can be arbitrarily generated. The introduction of the support vector machine (SVM) [6] to the data detection process may provide an answer to this requirement. We have already verified the effectiveness of the SVM by simulation [7]. This method can intuitively and directly reflect the pattern dependency of the noise included in a data series. In this study, we investigated the feasibility of a novel data detection method based on the radial basis function (RBF) kernel using a real perpendicular recording data series. Moreover, we evaluated the performance of the proposed

Manuscript received March 05, 2009; revised April 26, 2009. Current version published September 18, 2009. Corresponding author: S. Mita (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMAG.2009.2023236

Fig. 1. Signal flow for performance evaluation.

method by applying it to low-density parity-check (LDPC) decoding.

This paper is organized as follows. Section II outlines the proposed method. Section III shows the results of data detection using the proposed method. Section IV shows the results of its application to iterative detection. Finally, the paper is concluded in Section V.

II. OUTLINE OF MEASUREMENT PROCESS AND SVM

A. Measurement Process

The signal flow for the performance evaluation is shown in Fig. 1. Analog signals retrieved from a read channel are sampled at adequate timing instants, fed to a 6-b analog-to-digital converter (ADC), and then subjected to a simple dc restoration process that uses signal envelope detection. In the next step, these data series are equalized to a partial response class 1 (PR1) or class 3 (PR3) channel response using a nine-tap transversal filter. Then, log-likelihood ratios (LLRs) are calculated by applying the BCJR algorithm [8] to the PR1 or PR3 channel.

In the case of additive noise, LLRs are independent of each other. However, in the case of pattern-dependent noise, LLRs depend on the adjacent bits. Let $LR(nT)$, where $n$ is the bit index and $T$ is a bit period, denote the LLR series for each bit. For simplicity, we consider a combination of three-bit values $\{LR((n-1)T), LR(nT), LR((n+1)T)\}$. These combinations are mapped to the 3-D LLR space by assigning the first bit of the three-bit combination to the $x$ axis, the second bit to the $y$ axis, and the third bit to the $z$ axis. We evaluated two kinds of data series, Data1 and Data2, shown in Table I. These data

0018-9464/$26.00 © 2009 IEEE Authorized licensed use limited to: Toyota Technological Institute. Downloaded on October 18, 2009 at 22:15 from IEEE Xplore. Restrictions apply.


TABLE II STANDARD DEVIATION OF LLR’S DISTRIBUTION OF EACH PATTERN

Fig. 2. Distribution of LLRs in 3-D LLR space.

Fig. 3. Top view of LLRs’ distribution.

TABLE I BER OF THE MEASURED DATA SERIES

series, which had a bit-error rate (BER) of around $10^{-3}$, were obtained by increasing the linear recording density of a 2.5-in hard-disk drive (HDD). The data transfer rate for these data series was around 600 Mb/s, and the linear recording density was nearly 1000 kfci. The BERs obtained for Data1 with the PR1 and PR3 detectors are listed in Table I. Data2 showed approximately the same BER for both detectors.

An example of the LLRs' distribution is shown in Fig. 2. A top view of the 3-D space of the LLRs' distribution is shown in Fig. 3. It can be easily observed that the distributions of the “000” and “111” three-bit patterns were different from those of “010” and “101.” The former patterns were scattered uniformly. On the other hand, the latter patterns, which include at least two flux changes, have strong correlations between the axes. Table II shows the standard deviation of the LLR distribution of each pattern. Each deviation was normalized by the distance between the center of each distribution and the origin. Each pattern has a different standard deviation from the others. In particular, the “010” and “101” patterns have larger standard deviations than the other patterns.

B. Support Vector Machine (SVM)

The SVM (see WEB) is a classifier introduced by Vapnik [6], which can realize the same classification performance as the so-called artificial neural networks (ANNs). Generally, neural networks have the problem of local minima. On the other hand, the SVM is mathematically transparent, and can provide global and unique solutions. There are two types of SVMs: 1) linear and 2) nonlinear. We will focus mainly on the use of

Fig. 4. Basic concept of SVM.

the nonlinear SVM. With an appropriate nonlinear mapping to a sufficiently high dimension, data from the two categories can always be separated by a hyperplane. However, this requires several computations. The kernel trick invented by Vapnik makes it possible to work in high-dimensional feature spaces without actually having to perform explicit computations in this space, as shown in Fig. 4.

A linear SVM is a classifier having a discriminant function $f(x)$, a linear combination of the components of $x$ (i.e., a hyperplane for data separation), given as

$$f(x) = w^{T} x + b \tag{1}$$

where $x$ denotes the data vector, namely a three-bit combination in the LLR space, $w$ is the weight vector, and $b$ is the bias weight. Consider a given training set $\{x_i, y_i\}_{i=1}^{N}$, with input data $x_i$ and output data of class labels $y_i \in \{-1, +1\}$, where $N$ is the sample number. The weight vector $w$ is represented by a linear combination of this training data set

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i \tag{2}$$

In this case, the linear classifier is defined as

$$f(x) = \sum_{i=1}^{N} \alpha_i y_i x_i^{T} x + b \tag{3}$$

Several of the resulting $\alpha_i$ values are set to zero through quadratic programming. Therefore, the computation of the data classification can be remarkably reduced. Vectors of nonzero $\alpha_i$ values are known as support vectors. This implies that the decision boundary is uniquely determined by the support vectors, and this decision boundary is called a hyperplane. It should be noted that the distance between the hyperplane and each support vector is normalized to 1.

A nonlinear classifier can easily be achieved by replacing $x_i^{T} x$ with the RBF kernel $K(x_i, x)$, thanks to the kernel trick

$$K(x_i, x) = \exp\left(-\frac{\lVert x - x_i \rVert^{2}}{\sigma^{2}}\right) \tag{4}$$

where $\sigma$ is a constant.

Fig. 5. Example of hyperplane between “000” and “010.”
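As a rough illustration of how a kernel SVM of this kind assigns a signed value to a point in the 3-D LLR space, the sketch below evaluates a discriminant of the form "sum over support vectors of alpha_i * y_i * K(x_i, x), plus b". It assumes the common RBF form exp(-||x - x_i||^2 / sigma^2); the function names, the two support vectors, the weights, and the value of sigma are all hypothetical and are not taken from the measured data.

```python
import math


def rbf_kernel(x_i, x, sigma):
    """RBF kernel exp(-||x - x_i||^2 / sigma^2) (assumed parametrization)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-sq_dist / sigma ** 2)


def decision_value(x, support_vectors, labels, alphas, bias, sigma):
    """Kernel discriminant: sum_i alpha_i * y_i * K(x_i, x) + bias."""
    return sum(
        a * y * rbf_kernel(sv, x, sigma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + bias


# Hypothetical toy model: one support vector near a "000"-like cluster
# (label -1) and one near a "010"-like cluster (label +1).
support_vectors = [(-4.0, -4.0, -4.0), (-4.0, 4.0, -4.0)]
labels = [-1, +1]
alphas = [1.0, 1.0]
bias = 0.0
sigma = 4.0

model = (support_vectors, labels, alphas, bias, sigma)
f_pos = decision_value((-4.0, 3.0, -4.0), *model)   # close to the +1 vector
f_neg = decision_value((-4.0, -3.0, -4.0), *model)  # close to the -1 vector
f_mid = decision_value((-4.0, 0.0, -4.0), *model)   # equidistant from both

print(f_pos, f_neg, f_mid)
```

The sign of the returned value selects the class, and its magnitude grows as the point moves away from the decision boundary, which is the quantity the rescaling step relies on. In practice the support vectors, weights, and bias would come from a quadratic-programming solver rather than being set by hand.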

III. GENERATION OF HYPERPLANE USING SVM

In this section, we explain the generation of hyperplanes for data discrimination. In a three-bit combination, we use the center bit for data discrimination. We assume that only one pattern-dependent error event occurs, in one of the following forms.
1) One bit of data is slipped to an adjacent bit period.
2) One bit of data is expanded to an adjacent bit period.
3) One bit of data is split to both adjacent bit periods.
4) One bit of data is erased.

In these situations, the number of hyperplanes necessary for data discrimination is limited to nine surfaces (S1–S9), as follows. A “010” pattern has neighboring surfaces with the patterns “000” (S1), “001” (S2), “100” (S3), and “101” (S4). It should be noted that no error occurs for the center bit of a three-bit combination when the “010” pattern changes to “011,” “110,” or “111.” Similarly, a “101” pattern has neighboring surfaces with the patterns “011” (S5), “110” (S6), “111” (S7), and “010” (S4). Moreover, a “100” pattern has a neighboring surface with the “110” pattern (S8). Similarly, a “001” pattern has a neighboring surface with the “011” pattern (S9).

In order to generate adequate hyperplanes for data discrimination, we carefully selected training data samples that satisfied the following conditions:
1) the distance between the center of each distribution and the selected training sample was larger than the value of each standard deviation;
2) the training data samples did not include error bits;
3) the values of the training data samples were rescaled so that most of the error bits were included within a fixed distance from a hyperplane.

In particular, condition 1) was important for generating adequate hyperplanes that included training data samples with very large LLRs, because of the use of the nonlinear kernel SVM. An example of a hyperplane that separates the “000” and “010” patterns is shown in Fig. 5.

We can clearly see that an adequate surface for the discrimination of the two patterns differs from a plane parallel to an axis (i.e., the conventional one). The numbers of errors that occurred between various patterns in Data1 (132 000 bits) are presented in Table III. PR3 was used as the channel response. The total number of errors was slightly reduced by using adequate hyperplanes, even though the number of errors for some patterns increased. From these results, we can say that rescaling the LLRs by the distance from the hyperplanes can reduce the bias due to the pattern dependence included in the conventional LLRs. The frequency distributions of the conventional LLRs

Fig. 6. Comparison of LLRs’ distributions. (a) Conventional plane. (b) Hyperplanes.

TABLE III NUMBER OF ERRORS INCLUDED IN DATA1

and the rescaled LLRs are presented in Fig. 6(a) and (b). The conventional distribution is normalized by the average LLR. Moreover, Table IV shows the numbers of correct and incorrect LLRs for the conventional plane and for the hyperplanes. This comparison uses Data1 and PR3. The incorrect LLRs cause errors. We can see that, for nearly the same number of errors, the conventional LLRs included three times as many correct samples as the hyperplane-based LLRs. In other words, the conventional LLRs included numerous ambiguous samples. Therefore, we can expect the use of the new LLRs to reduce the number of iterations in the LDPC decoding process.
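The rescaling step can be sketched as follows, under stated assumptions. Each interior bit is represented by the three consecutive LLRs centered on it, the resulting 3-D point is passed to a decision function, and the returned signed value replaces the conventional LLR of the center bit. The names `three_bit_windows`, `rescale_center_llrs`, `count_ambiguous`, and `toy_decision` are hypothetical; `toy_decision` is a simple tilted plane standing in for a trained SVM, and the numbers are invented for illustration, not measured data.

```python
def three_bit_windows(llrs):
    """Return (LR((n-1)T), LR(nT), LR((n+1)T)) for every interior bit n."""
    return [tuple(llrs[n - 1:n + 2]) for n in range(1, len(llrs) - 1)]


def rescale_center_llrs(llrs, decision_fn):
    """Replace each interior LLR by decision_fn evaluated at its 3-D point.

    decision_fn is a stand-in for the signed distance from a trained
    hyperplane; the first and last bits have no full window and are skipped.
    """
    return [decision_fn(point) for point in three_bit_windows(llrs)]


def count_ambiguous(values, margin=1.0):
    """Count samples whose magnitude is below the given margin."""
    return sum(1 for v in values if abs(v) < margin)


def toy_decision(point):
    """Hypothetical tilted plane: the center LLR plus a small share of
    each neighbor, so that neighboring bits shift the threshold."""
    x, y, z = point
    return 0.2 * x + 1.0 * y + 0.2 * z


# Invented LLR series; positive values are assumed to favor bit 1.
llrs = [-6.0, 0.5, -5.0, 4.0, 6.0]

conventional = llrs[1:-1]
rescaled = rescale_center_llrs(llrs, toy_decision)

print(conventional, count_ambiguous(conventional))
print(rescaled, count_ambiguous(rescaled))
```

A real implementation would select, for each window, the hyperplane that matches the neighboring pattern (S1–S9) instead of a single fixed function; that selection logic is omitted here.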


TABLE IV COMPARISON OF NUMBERS OF CORRECT AND INCORRECT LLRS

Fig. 7. Reduction of the iteration number using LLRs rescaled by the distance from the hyperplane generated by the SVM. (a) PG–LDPC code with a code rate of 0.95. (b) PG–LDPC code with a code rate of 0.93.

IV. PERFORMANCE EVALUATION

A. PG–LDPC Code

In this evaluation, we used a type-2 PG–LDPC code [9] with a code length of 5797, redundant bit length of 296, and code rate of 0.949. Moreover, we evaluated another code, the shortened PG–LDPC code, which had a code length of 4296. The code rate of the shortened PG–LDPC code was set to 0.931. As is well known, these LDPC codes do not contain four-cycles. We used the sum-product algorithm (SPA) as the decoding algorithm.

B. Performance Evaluation of Error-Correcting Codes

We evaluated the performance of these codes for Data1. Fig. 7(a) and (b) shows examples of the decoding performance of both the PG–LDPC code and the shortened code. When we used conventional LLRs for Data1, the PG–LDPC code with a code rate of 0.949 could not correct all of the errors, even after nine iterations. On the other hand, when rescaled LLRs were applied to the PG–LDPC code with a code rate of 0.949, all of the errors could be corrected after seven or nine iterations. In these figures, we used two kinds of hyperplanes. HP1 (the best case) was generated from a data series that included several errors in one sector. On the other hand, HP2 (the worst case) was generated from a data series that included several tens of errors in one sector. The BER of HP2 was slightly worse than that of PR3 before the iteration and at the first iteration. Nevertheless, its error-correcting capability outperformed that of PR3 after several iterations. This result indicates the robustness of the proposed method. The PG–LDPC code with a code rate of 0.931 could correct all of the errors. The number of iterations for HP1 was reduced by approximately half, compared to that for PR3. The performance improvement for PR1 was nearly the same as that for PR3.

As a comparison, the performance improvement results of a four-state PR1 with branch-metric paths for pattern-dependent noise (PDN) reduction [3], using the same real data and LDPC codes, are shown in Fig. 8 [10]. The use of the four-state PR1 reduced the BER by more than one order of magnitude. However, this did not reduce the number of iterations for Data1.

Fig. 8. BER of a four-state PR1 with PDN reduction paths.

V. CONCLUSION

The proposed method using SVM was able to reduce the number of iterations of the sum-product algorithm using conventional LLRs by approximately half for a data series with a BER of approximately $10^{-3}$ when using PR3 and the shortened PG–LDPC code with a code rate of 0.93. This study only used a 3-D space. Due to the nature of the SVM, the dimensions could easily be expanded to more than three.

ACKNOWLEDGMENT

This work was supported in part by Kakenhi No. 19560398, in part by the Storage Research Consortium (SRC), and in part by the NEDO project. The authors would like to thank student F. Haga, who assisted with the data analysis.

REFERENCES

[1] J.-G. Zhu and H. Wang, “Noise characteristics of interacting transitions in longitudinal thin film media,” IEEE Trans. Magn., vol. 31, no. 2, pp. 1065–1070, Mar. 1995.
[2] J. Moon and J. Park, “Pattern-dependent noise prediction in signal dependent noise,” IEEE J. Sel. Areas Commun., vol. 19, no. 4, pp. 730–743, Apr. 2001.
[3] S. Mita, “A robust detector based on a combination of PR1 and EEPR4 for perpendicular magnetic recording,” IEEE Trans. Magn., vol. 42, no. 10, pp. 2567–2569, Oct. 2006.
[4] Z. Wu, P. H. Siegel, J. K. Wolf, and N. Bertram, “Mean-adjusted pattern-dependent noise prediction for perpendicular recording channels with nonlinear transition shift,” IEEE Trans. Magn., vol. 44, no. 11, pp. 3761–3764, Nov. 2008.
[5] A. Kavcic and J. M. Moura, “The Viterbi algorithm and Markov noise memory,” IEEE Trans. Inf. Theory, vol. 46, no. 1, pp. 291–301, Jan. 2000.
[6] V. Vapnik, The Nature of Statistical Learning Theory. Berlin, Germany: Springer-Verlag, 1995.
[7] S. Mita, “A generalized data discrimination scheme using kernel machines in the presence of pattern-dependent noise,” J. Magn. Magn. Mater., vol. 287, pp. 426–431, 2005.
[8] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inf. Theory, vol. IT-20, no. 2, pp. 284–287, Mar. 1974.
[9] Y. Kou, S. Lin, and M. Fossorier, “Low-density parity-check codes based on finite geometries: A rediscovery and new results,” IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 2711–2736, Nov. 2001.
[10] S. Mita and H. Matsui, “Performance comparison of various error correcting strategies using perpendicular magnetic recording data series,” InterMag Dig., p. 1511, 2008, (HT03).
