
Int. J. Biometrics, Vol. 2, No. 4, 2010

On the use of perceptual Line Spectral pairs Frequencies and higher-order residual moments for Speaker Identification

Md. Sahidullah*, Sandipan Chakroborty and Goutam Saha

Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721 302, India
E-mail: [email protected]
E-mail: [email protected]
E-mail: [email protected]
*Corresponding author

Abstract: Conventional Speaker Identification (SI) systems use spectral features such as Mel-Frequency Cepstral Coefficients (MFCC) or Perceptual Linear Prediction (PLP) as a front-end module. Line Spectral pairs Frequencies (LSF) are a popular alternative representation of Linear Prediction Coefficients (LPC). In this paper, an investigation is carried out to extract LSFs from perceptually modified speech. A new feature set extracted from the residual signal is also proposed. This residual feature carries information complementary to the spectral characteristics; an SI system based on it, when fused with the conventional spectral feature based system as well as with the proposed perceptual LSF, shows improved performance.

Keywords: SI; speaker identification; LSF; line spectral pairs frequencies; perceptual linear prediction; residual signal; higher order statistics.

Reference to this paper should be made as follows: Sahidullah, M., Chakroborty, S. and Saha, G. (2010) 'On the use of perceptual Line Spectral pairs Frequencies and higher-order residual moments for Speaker Identification', Int. J. Biometrics, Vol. 2, No. 4, pp.358–378.

Biographical notes: Md. Sahidullah graduated in 2004 from Vidyasagar University in Electronics and Communication Engineering and obtained his Masters in Computer Science and Engineering in 2006 from West Bengal University of Technology, Kolkata, with specialisation in Embedded Systems.
He is currently an Institute Research Scholar in the Department of Electronics and Electrical Communication Engineering at Indian Institute of Technology, Kharagpur 721 302, India. His research interests are speaker recognition, pattern classification and speech processing.

Copyright © 2010 Inderscience Enterprises Ltd.

Sandipan Chakroborty received his Bachelor of Engineering in Electronics from Nagpur University, India in 2001 and his Masters of Engineering with specialisation in Digital Systems and Instrumentation


with highest honours from Bengal Engineering and Science University, Shibpur, Howrah, India in 2003. He was a Research Scholar in the Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur, India. His current research areas include pattern recognition, neural networks, speech processing, speaker recognition and data fusion analysis. Presently, he is with Samsung India Software Operations Ltd, Bangalore, India.

Goutam Saha received his BTech and PhD Degrees from Indian Institute of Technology (IIT), Kharagpur and a short Management Training from XLRI, Jamshedpur, India. He served at Tata Steel and the Institute of Engineering and Management before joining the Faculty of Electronics and Communication Engineering at IIT Kharagpur in 2002, where he is currently serving as Associate Professor. He briefly served the Department of Biomedical Engineering, University of Southern California, USA, and Trento University, Italy. He is a winner of the DST-Lockheed Martin India Innovation Growth Program 2009. His research interests include characterisation of biomedical signals, neurosignal processing and audio surveillance. He is also coauthor of two popular engineering textbooks, Digital Principles and Applications and Principles of Communication Systems, published by Tata McGraw Hill.

1 Introduction

Speaker Identification (SI) (Campbell, 1997; Reynolds, 2002; Ramachandran et al., 2002; Kinnunen and Li, 2010) is the task of determining the identity of a subject from his or her voice. A robust acoustic feature extraction technique (Faundez-Zanuy and Monte-Moreno, 2005) followed by an efficient modelling scheme (Matsui and Tanabe, 2006) are the key requirements of an SI system. Feature extraction transforms (Kinnunen, 2004) the raw speech signal into a compact but effective representation that is more stable and discriminative than the original signal. The central idea behind feature extraction techniques for speaker recognition is to approximate the short term spectral characteristics of speech in order to characterise the vocal tract.

Most existing SI systems use Mel Frequency Cepstral Coefficients (MFCC) or Perceptual Linear Predictive Cepstral Coefficients (PLPCC) for parameterising speech (Kinnunen and Li, 2010). MFCC is based on filterbank analysis: the speech signal is passed through triangular bandpass filters equally spaced on the mel scale, and the de-correlated log energies of the filters are taken as the MFCC. PLPCC, on the other hand, is derived from the Linear Prediction Coefficients (LPC) of a perceptually modified speech signal (Hermansky, 1990). Perceptual modification of the speech signal yields significant improvement in speech recognition, and better performance is attained at lower model orders of linear prediction analysis. Though PLPCC is mainly used for speech recognition, it has also been successfully applied to speaker recognition (Reynolds, 1994). Various derivatives of LPC, such as Log Area Ratios (LAR), Line Spectral pairs Frequencies (LSF) and inverse sine coefficients (ARCSIN), are also used as front ends for the speaker recognition task (Campbell, 1997). Effective representation of


speaker specific information is still a challenging task in developing an SI system. In filterbank based approaches, some experiments are reported in Chakroborty (2008). Recently, an approach has been taken to use the perceptual version of the Log Area Ratio (LAR) coefficients as speech parametrisation for robust SI (Abdulla, 2007). The Perceptual Log Area Ratio (PLAR) feature outperforms conventional MFCC and PLP based speaker recognition systems. Experimentally, it has been found that various kinds of psychoacoustic processing have a significant effect on the performance of SI systems.

LSFs are popular for representing linear prediction coefficients in LPC based coders because of filter stability and representational efficiency (Itakura, 1975). They also have other robust properties, such as an ordering related to the spectral properties of the underlying data, and, unlike other representations of LPC, LSF parameters are uncorrelated (Erkelens and Broersen, 1995). The vocal tract resonance frequencies fall between a pair of LSFs (Bäckström and Magi, 2006; McLoughlin, 2008), and the bandwidth of a formant is directly related to the adjacent LSFs. These properties make LSFs popular for the analysis, classification and transmission of speech. A detailed study of the LSF feature as a parametrisation for speech recognition is available in Paliwal (1992). LSF parameters are integrated in speech compression schemes such as G.729, which are very popular especially in VoIP communication. LSFs have also been successfully introduced in the speaker recognition task (Liu et al., 1990; Lee et al., 2004).

All these features try to represent the vocal tract through short term spectral estimation. Vocal cord information, which can be obtained from the residual of LP analysis, also carries speaker related traits: pitch and other glottal information can be extracted from this residual signal. In Murty and Yegnanarayana (2006), Zheng et al. (2007) and Prasanna et al. (2006), attempts are made to parameterise the residual information and to combine this complementary information to improve speaker recognition performance.

Our contribution in this work is twofold. The first is to study the effectiveness of LSF after applying psychoacoustic operations to the original speech signal; the LSF coefficients so extracted are termed Perceptual LSF (PLSF). The second introduces a novel complementary feature based on the residual signal, obtained by applying higher order statistical moments, which we term HOSMR. While PLSF is spectrum based and captures vocal tract information, HOSMR is residual based and captures vocal cord information. Separate models are developed from the two feature sets and, finally, the log-likelihood scores of the two systems are linearly fused to exploit their complementarity and improve overall performance.

SI experiments are performed using the two newly proposed features with the Gaussian Mixture Model (GMM) (Reynolds and Rose, 1995; Reynolds, 1992) as a classifier. Both features are evaluated to observe their individual speaker discriminating ability. Then, experiments are conducted in dual stream mode by fusing the contributions of the two systems. A comparison is also made with SI systems based on other existing features. Two popular speaker recognition corpora, YOHO and POLYCOST, are used for the experiments. The PLSF based SI system outperforms conventional feature extraction techniques among the spectral feature based systems, and performance improves further in fused mode when the contribution of the residual information is included.


The rest of the paper is organised as follows. The theoretical concepts behind linear prediction analysis, line spectral pairs frequencies and perceptual analysis are presented in Section 2. The proposed schemes are elaborated in Section 3. The experimental setup and results are discussed in Section 4. Finally, the paper is concluded in Section 5.

2 Theoretical background

2.1 Linear prediction analysis and residual signal

In the LP model, the (n − 1)th to (n − p)th samples of the speech wave are used to predict the nth sample. The predicted value of the nth speech sample (Atal, 1974) is given by

$$\hat{s}(n) = \sum_{k=1}^{p} a(k)\, s(n-k) \qquad (1)$$

where $\{a(k)\}_{k=1}^{p}$ are the predictor coefficients and $s(n)$ is the nth speech sample. The value of p is chosen so that it can effectively capture the real and complex poles of the vocal tract in a frequency range equal to half the sampling frequency. The Prediction Coefficients (PC) are determined by minimising the mean square prediction error (Campbell, 1997), where the error is defined as

$$e(n) = s(n) - \sum_{k=1}^{p} a(k)\, s(n-k). \qquad (2)$$

The LP transfer function can be defined as

$$H(z) = \frac{G}{1 - \sum_{k=1}^{p} a(k)\, z^{-k}} = \frac{G}{A(z)} \qquad (3)$$

where G is the gain scaling factor for the present input and A(z) is the pth-order inverse filter. The LP coefficients themselves can be used for speaker recognition, as they contain speaker specific information such as the vocal tract resonance frequencies and their bandwidths. The prediction error e(n) is called the residual signal and contains the complementary information that is not captured by the PC. It is worth mentioning here that the residual signal conveys vocal source cues such as the fundamental frequency and pitch period. However, it is difficult to extract meaningful features directly from the residual signal, although there have been some attempts (Zheng et al., 2007; Murty and Yegnanarayana, 2006; Markov and Nakagawa, 1999) to do so.
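As an illustration, the LP analysis of equations (1)-(3) and the inverse filtering that yields the residual can be sketched in a few lines of NumPy. This is a minimal sketch with our own function names: it solves the normal equations of the autocorrelation method directly, whereas production code would typically use the Levinson-Durbin recursion.

```python
import numpy as np

def lp_coefficients(s, p):
    """Estimate the predictor coefficients {a(k)} of equation (1) by the
    autocorrelation method: minimising the mean square prediction error
    of equation (2) leads to the normal equations R a = r."""
    s = np.asarray(s, dtype=float)
    r = np.array([np.dot(s[:len(s) - k], s[k:]) for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])

def lp_residual(s, a):
    """Residual e(n) = s(n) - sum_k a(k) s(n-k), i.e., inverse filtering
    of s through A(z) as in equation (2)."""
    s = np.asarray(s, dtype=float)
    e = s.copy()
    for k, ak in enumerate(a, start=1):
        e[k:] -= ak * s[:-k]
    return e
```

For a signal well modelled by an order-p all-pole filter, the residual energy comes out substantially below the signal energy, which is exactly what the minimisation in equation (2) targets.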

2.2 Line Spectral pairs Frequencies (LSF)

The LSFs are a representation of the predictor coefficients of the inverse filter A(z). First, A(z) is decomposed into a pair of auxiliary (p + 1)th-order polynomials as follows:

$$A(z) = \frac{1}{2}\bigl(P(z) + Q(z)\bigr)$$
$$P(z) = A(z) - z^{-(p+1)} A(z^{-1}) \qquad (4)$$
$$Q(z) = A(z) + z^{-(p+1)} A(z^{-1}).$$

The LSFs are the frequencies of the zeros of P(z) and Q(z). They are determined by computing the complex roots of the polynomials and, consequently, their angles. This can be done in different ways, e.g., the complex root method, the real root method and the ratio filter method (Kondoz, 2004). The roots of P(z) and Q(z) occur in symmetrical pairs, hence the name Line Spectral pairs Frequencies. P(z) corresponds to the vocal tract with the glottis closed and Q(z) with the glottis open (Bäckström and Magi, 2006). However, speech production in general corresponds to neither of these extreme cases but to something in between, where the glottis is neither fully open nor fully closed; for analysis purposes, a linear combination of the two extreme cases is therefore considered. The inverse filter A(z) is a minimum phase filter, as all of its poles lie inside the unit circle in the z-plane, and any minimum phase polynomial can be mapped by this transform so that each of its roots is represented by a pair of frequencies with unit amplitude. Another benefit of the LSF representation is that the Power Spectral Density (PSD) at a particular frequency tends to depend only on the nearby LSFs, and vice versa: an LSF of a certain frequency value affects mainly the PSD at that frequency. This is known as the localisation property (Chu, 2003), whereby a modification to the PSD has only a local effect on the LSFs. This is an advantage over other representations such as LPCC, Reflection Coefficients (RC) and LAR, where a change in a particular parameter affects the whole spectrum. The LSF parameters are themselves frequency values directly linked to the signal's frequency description, and the power spectrum can be computed directly from the LSF values (McLoughlin, 2008). In Soong and Juang (1984), it is stated that LSF coefficients are sufficiently sensitive to speaker characteristics.

Though LSFs are most popular in low bit rate speech coding (Lepschy et al., 1988; Chu, 2003; Bishnu et al., 2003), they have also been successfully employed in speaker recognition (Campbell, 1997; Liu et al., 1990; Lee et al., 2004; Yuan et al., 1999). LSF coefficients are, moreover, better suited to pattern classification than other representations of LPCs (Tourneret, 1998).
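The complex root method mentioned above can be sketched directly from equation (4). This is an illustrative sketch (the helper name and the numerical tolerance `eps` are ours): it forms P(z) and Q(z) from the predictor coefficients, takes the root angles, and keeps one frequency per conjugate pair.

```python
import numpy as np

def lsf_from_lpc(a, eps=1e-4):
    """Line Spectral Frequencies from predictor coefficients a(1..p):
    build P(z) and Q(z) of equation (4), compute the angles of their
    roots, and keep one frequency per conjugate pair in (0, pi).
    The trivial roots at z = +/-1 are excluded via the tolerance eps."""
    A = np.concatenate(([1.0], -np.asarray(a, dtype=float)))   # A(z), degree p
    rev = np.concatenate(([0.0], A[::-1]))    # z^{-(p+1)} A(z^{-1})
    ext = np.concatenate((A, [0.0]))          # A(z), zero-padded to degree p+1
    P, Q = ext - rev, ext + rev
    lsf = [w for poly in (P, Q)
           for w in np.angle(np.roots(poly))
           if eps < w < np.pi - eps]
    return np.sort(lsf)
```

For a minimum phase A(z) the p frequencies come out strictly increasing, with the P and Q zeros interlaced, illustrating the ordering property discussed above.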

2.3 Perceptual Linear Prediction (PLP) analysis

The PLP technique transforms the speech signal in a perceptually meaningful way through several psychoacoustic processes (Hermansky, 1990) and improves speech recognition performance over the conventional LP analysis technique. Its various stages are based on the perceptual characteristics of human hearing. The significant blocks of PLP analysis are as follows:

2.3.1 Critical Band Integration (CBI)

In this step the power spectrum is warped along its frequency axis onto the Bark scale. In brief, the speech signal is passed through trapezoidal filters equally spaced on the Bark scale.


2.3.2 Equal Loudness Pre-emphasis (ELP)

Different frequency components of the speech spectrum are weighted by a simulated equal-loudness curve.

2.3.3 Intensity-loudness Power Law (IPL)

Cube-root compression of the modified speech spectrum is carried out according to the power law of hearing (Stevens, 1957).

In addition, RASTA processing (Hermansky and Morgan, 1994) is applied with PLP analysis as an initial spectral operation to make the speech signal robust against diverse communication channels and environmental variability. The integrated method is often referred to as RASTA-PLP.

3 Proposed framework

3.1 Perceptual Line Spectral pairs Frequency

One of the contributions of the present work is combining the strengths of PLP and LSF for automatic SI. Towards this, an alteration of the standard PLP scheme is addressed and a strategy is formulated to use the modified PLP coefficients for the generation of LSFs. A drawback of the PLP analysis technique is that the nonlinear frequency warping, or Critical Band Integration (CBI), stage introduces undesired spectral smoothing. This work analyses the scatter plot of training data including and excluding the CBI step. A pair of male and a pair of female speakers have been arbitrarily chosen from the POLYCOST database. The extracted training features are plotted in Figure 1 after reducing the 19-dimensional vectors to a 2-dimensional space using principal component analysis. It shows that the speakers' data are more separable if the critical band integration step is omitted.

In contrast with the work of Abdulla (2007), we include pre-emphasis in this part of the scheme and use LSFs. Perceptual weighting of the different frequency components enhances the speech signal in accordance with human listening characteristics. The pre-emphasis stage, on the other hand, emphasises the high frequency components of speech to compensate for the −6 dB/octave spectral roll-off due to human speaking characteristics. Hermansky (1990) also included this step in his work, and we have experimentally found that it improves recognition performance. The overall schematic diagram of the proposed Perceptual Line Spectral Pairs feature extraction technique, which is based on modified perceptual linear prediction analysis, is shown in Figure 2. The proposed perceptual operation represents the lower frequency region more precisely than the higher frequency zone. In Figure 3, comparative plots of the speech spectrum, LP-spectrum and LSFs of a speech frame and of its perceptual version are shown.

The spectral peaks that are sharply approximated by conventional LP are smoothed by the modified PLP. Note that the spectral tilt carries speaker related information (Yoma and Pegoraro, 2002). The perceptually modified spectrum retains speaker dependent information that is removed by conventional PLP [Section II-D in Hermansky (1990)]. LSFs reveal vocal tract spectral information, including mouth shape, tongue position and the contribution of the nasal cavity. Their perceptually motivated version represents those characteristics more effectively and is hence expected to improve speaker recognition performance. This is verified by rigorous experimentation as presented in Section 4.2.

Figure 1 Scatter plot of the first two principal components of two speakers' training data, shown for two cases: (i) with the critical band analysis step and (ii) without it, plotted in two different colours. Part (a) shows the scatter plot of two male speakers' data and Part (b) of two female speakers' data. The data are taken from the POLYCOST database (see online version for colours)
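Putting the stages together, the PLSF front end of Figure 2 might be sketched as below. This is our illustrative reconstruction under stated assumptions: the FFT size, the equal-loudness approximation (taken from Hermansky's 1990 formulation) and the helper names are ours, and critical band integration is deliberately omitted, as proposed above.

```python
import numpy as np

def _lsf(a, eps=1e-4):
    """LSFs via the roots of P(z) and Q(z) (see Section 2.2)."""
    A = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    rev = np.concatenate(([0.0], A[::-1]))
    ext = np.concatenate((A, [0.0]))
    return np.sort([w for poly in (ext - rev, ext + rev)
                    for w in np.angle(np.roots(poly))
                    if eps < w < np.pi - eps])

def plsf(frame, fs=8000, p=19, nfft=512):
    """PLSF sketch: pre-emphasis -> power spectrum -> equal-loudness
    weighting (ELP) -> cube-root compression (IPL) -> LP analysis on the
    modified autocorrelation -> LSF.  No critical band integration (CBI)."""
    x = np.append(frame[0], frame[1:] - 0.97 * frame[:-1])   # pre-emphasis
    x = x * np.hamming(len(x))
    pw = np.abs(np.fft.rfft(x, nfft)) ** 2                   # power spectrum
    w2 = (2 * np.pi * np.arange(pw.size) * fs / nfft) ** 2   # (rad/s)^2
    eql = (w2 + 56.8e6) * w2 ** 2 / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))
    pw = (pw * eql) ** (1.0 / 3.0)                           # ELP then IPL
    r = np.fft.irfft(pw)[:p + 1]                             # autocorrelation
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1:])                            # LP coefficients
    return _lsf(a)
```

Because the LP analysis is performed by the autocorrelation method on a non-negative modified spectrum, A(z) stays minimum phase and the 19 resulting LSFs are strictly ordered in (0, π).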

Figure 2 Block diagram showing the different stages of the Perceptual Line Spectral pairs Frequency (PLSF) based feature extraction technique

3.2 Statistical moments of residual signal

The residual signal, introduced in Section 2.1, generally has impulse-like (for voiced frames) or noise-like (for unvoiced frames) behaviour and a flat spectral response. Though it contains vocal source information, it is very difficult to characterise it completely. In the literature, Wavelet Octave Coefficients Of Residues (WOCOR)


Figure 3 Plot showing (a) Speech spectrum (light line), LP-spectrum (dark line) and LSF (Vertical Lines) and (b) Speech Spectrum (light line), PLP-spectrum (dark line) and PLSF (Vertical Lines). The odd LSFs are denoted using continuous lines and the even LSFs are denoted using dotted lines. Speech signal has been taken arbitrarily from a male speaker of YOHO database

(Zheng et al., 2007), Auto-Associative Neural Networks (AANN) (Prasanna et al., 2006) and the residual phase (Murty and Yegnanarayana, 2006) have been used to extract the residual information. It is worth mentioning here that higher-order statistics have been found significant in a number of signal processing applications (Nandi, 1994) when the nature of the signal is non-Gaussian. Higher order statistics have also attracted the attention of researchers for retrieving information from LP residual signals in voice activity detection (Nemer et al., 2001). Recently, higher order cumulants of the LP residual signal have been investigated (Chetouani et al., 2009) for improving the performance of SI systems.

The higher order statistical moments of a signal parameterise the shape of its distribution (Lo and Don, 1989). Let the distribution of a random signal x be denoted by P(x); the central moment of order k of x is then

$$M_k = \int_{-\infty}^{\infty} (x - \mu)^k \, dP \qquad (5)$$

for k = 1, 2, 3, …, where µ is the mean of x.


On the other hand, the characteristic function of the probability distribution of the random variable is given by

$$\varphi_X(t) = \int_{-\infty}^{\infty} e^{jtx} \, dP = \sum_{k=0}^{\infty} M_k \frac{(jt)^k}{k!}. \qquad (6)$$

From equation (6) it is clear that the moments M_k are the coefficients of the expansion of the characteristic function. Hence, they can be treated as a set of descriptive constants of a distribution. Moments and their various modifications have been successfully used in image analysis (Liao and Pawlak, 1996; Teh and Chin, 1988), and moments can also effectively capture the randomness of the residual signal in autoregressive modelling (Mattson and Pandit, 2006). In this work, we use the higher order statistical moments of the residual signal to parameterise the vocal source information. A further motivation for using moments of the residual signal is that the speech production system is a non-Gaussian process, for which higher order statistics exist and meaningful vocal source features can be extracted. The feature derived by the proposed technique is termed Higher Order Statistical Moments of Residual (HOSMR). The different blocks of the proposed residual feature extraction technique are shown in Figure 4. Note that these features are complementary to the PLSF features proposed in Section 3.1, as they are derived from the residual signal.

Figure 4 Block diagram of the residual moment based feature extraction technique

The steps for the computation of HOSMR are as follows:

1 Inverse filtering of the speech signal by the LP analysis filter generates the residual.

2 The residual signal is first normalised to the range [−1, +1].

3 The central moment of order k of a residual frame e is then computed as

$$m_k = \frac{1}{N} \sum_{n=0}^{N-1} \bigl(e(n) - \mu\bigr)^k \qquad (7)$$

where µ is the mean of the residual signal over the frame.


The residual signal is first scaled at frame level and each frame is normalised around its mean; therefore the first order central moment (i.e., the mean) is zero. The higher order moments (for k = 2, …, K) are taken as vocal source features, as they represent the shape of the distribution of the random signal. The lower order moments give a coarse parametrisation, whereas the higher orders give a finer representation of the residual signal. In Figure 5, the LP residual signals of a voiced and an unvoiced frame are shown, together with their higher order moments. Our experiments show that taking six moments gives good results and that still higher moments are not necessary.

Figure 5 Residual signals and higher order moments of two speech frames, one voiced (a) and the other unvoiced (b). The residual signals are shown in (c) and (d) and the moments in (e) and (f) correspondingly. The number of higher order moments shown here is 10 (see online version for colours)
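The three steps above reduce equation (7) to a few lines; the following is a minimal sketch (the function name is ours), applied to one residual frame.

```python
import numpy as np

def hosmr(e, K=6):
    """Higher Order Statistical Moments of Residual: normalise the
    residual frame to [-1, +1], then return the central moments of
    order 2..K of equation (7) (the first central moment is zero
    by construction)."""
    e = np.asarray(e, dtype=float)
    e = e / np.max(np.abs(e))                  # frame-level scaling
    mu = e.mean()
    return np.array([np.mean((e - mu) ** k) for k in range(2, K + 1)])
```

For a near-symmetric residual the odd moments stay close to zero, while the even moments grow for the impulse-like (heavy-tailed) residuals of voiced frames, which is the shape information the feature is meant to capture.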

4 Speaker Identification experiment

4.1 Experimental setup

4.1.1 Pre-processing stage

In this work, the pre-processing stage is kept the same for the proposed methods and for those with which they are compared. It is performed using the following steps:


•  silence removal and end-point detection are done using an energy threshold criterion

•  the speech signal is then pre-emphasised with a 0.97 pre-emphasis factor

•  the pre-emphasised speech signal is segmented into frames of 20 ms each with 50% overlap, i.e., the total number of samples in each frame is N = 160 (sampling frequency Fs = 8 kHz)

•  in the last step of pre-processing, each frame is windowed using a Hamming window.
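The framing steps above can be sketched as follows (a minimal sketch with our own function name; silence removal by energy threshold is omitted):

```python
import numpy as np

def preprocess(signal, fs=8000, frame_ms=20, overlap=0.5, alpha=0.97):
    """Pre-emphasis followed by segmentation into 20 ms frames with 50%
    overlap (N = 160 samples at 8 kHz), each frame Hamming-windowed."""
    x = np.asarray(signal, dtype=float)
    x = np.append(x[0], x[1:] - alpha * x[:-1])      # pre-emphasis
    n = int(fs * frame_ms / 1000)                    # samples per frame
    hop = int(n * (1 - overlap))                     # 80-sample hop
    win = np.hamming(n)
    return np.array([x[i:i + n] * win
                     for i in range(0, len(x) - n + 1, hop)])
```

With an 800-sample input this yields nine overlapping frames of 160 samples each.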

4.1.2 Classification and identification stage

The idea of the GMM is to use a weighted summation of multivariate Gaussian functions to represent the probability density of the feature vectors, given by

$$p(\mathbf{x}) = \sum_{i=1}^{M} p_i\, b_i(\mathbf{x}) \qquad (8)$$

where x is a d-dimensional feature vector, $b_i(\mathbf{x})$, i = 1, …, M, are the component densities and $p_i$, i = 1, …, M, are the mixture weights or priors of the individual Gaussians. Each component density is given by

$$b_i(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\, |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^t\, \Sigma_i^{-1}\, (\mathbf{x}-\boldsymbol{\mu}_i)\right) \qquad (9)$$

with mean vector $\boldsymbol{\mu}_i$ and covariance matrix $\Sigma_i$. A speaker model is denoted as $\lambda = \{p_i, \boldsymbol{\mu}_i, \Sigma_i\}_{i=1}^{M}$. The parameters of λ are optimised using the Expectation Maximisation (EM) algorithm (Dempster et al., 1977). In these experiments, the GMMs are trained with 10 iterations, with clusters initialised by the vector quantisation algorithm (Linde et al., 1980).

In the closed set SI task, an unknown utterance $X = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_T\}$ is identified as an utterance of the particular speaker whose model gives the maximum log-likelihood. This can be written as

$$\hat{S} = \arg\max_{1 \le k \le S} \log p(X|\lambda_k) = \arg\max_{1 \le k \le S} \sum_{t=1}^{T} \log p(\mathbf{x}_t|\lambda_k) \qquad (10)$$

where Ŝ is the identified speaker from the speaker model set $\Lambda = \{\lambda_1, \lambda_2, \ldots, \lambda_S\}$ and S is the total number of speakers.
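A minimal realisation of this classifier could use scikit-learn's `GaussianMixture`, standing in for the paper's VQ-initialised EM (a sketch under our own assumptions: diagonal covariances and k-means initialisation; the function names are ours):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_models(features_per_speaker, M=16, seed=0):
    """Fit one M-component diagonal-covariance GMM (equation (8))
    per enrolled speaker, via EM."""
    return [GaussianMixture(n_components=M, covariance_type='diag',
                            random_state=seed).fit(X)
            for X in features_per_speaker]

def identify(models, X):
    """Equation (10): return the index of the model maximising the
    total frame log-likelihood of the test utterance X."""
    return int(np.argmax([m.score_samples(X).sum() for m in models]))
```

The per-frame log-likelihoods are summed over the utterance, so longer utterances accumulate stronger evidence before the arg-max decision.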

4.1.3 Databases for experiments

YOHO database: The YOHO voice identification corpus (Campbell, 1997; Higgins et al., 1989) was collected while testing ITT's prototype speaker verification system in an office environment. Most subjects were from the New York City area, although there were many exceptions, including some non-native English speakers. A high-quality telephone handset (Shure XTH-383) was used to collect the speech; however, the speech was not passed through a telephone channel. There are 138 speakers (106 males and 32 females); for each speaker, there are 4 enrollment sessions of 24 utterances each and 10 test sessions of 4 utterances each. In this work, a closed set text-independent SI problem is attempted in which all 138 speakers are considered client speakers. For each speaker, all 96 (4 sessions × 24 utterances) utterances are used for developing the speaker model, while 40 (10 sessions × 4 utterances) utterances are put under test. Therefore, for 138 speakers we put 138 × 40 = 5520 utterances under test and evaluated the identification accuracies.

POLYCOST database: The POLYCOST database (Melin and Lindberg, 1996) was recorded as a common initiative within the COST 250 action during January–March 1996. It contains around 10 sessions recorded by 134 subjects from 14 countries. Each session consists of 14 items, two of which (the MOT01 and MOT02 files) contain speech in the subject's mother tongue. The database was collected through the European telephone network. The recording was performed with ISDN cards on two XTL SUN platforms with an 8 kHz sampling rate. In this work, a closed set text independent SI problem is addressed where only the mother tongue (MOT) files are used. The specified guideline (Melin and Lindberg, 1996) for conducting closed set SI experiments is adhered to, i.e., the 'MOT02' files from the first four sessions are used to build a speaker model, while the 'MOT01' files from session five onwards are taken for testing. As with the YOHO database, all speakers (131, after deletion of three speakers) in the database were registered as clients.

4.1.4 Score calculation

For the closed-set SI problem, the identification accuracy as defined in Reynolds and Rose (1995) is used:

$$\text{Percentage of Identification Accuracy (PIA)} = \frac{\text{No. of utterances correctly identified}}{\text{Total no. of utterances under test}} \times 100. \qquad (11)$$

4.2 Speaker Identification experiments and results

The work uses GMM based classifiers of different model orders that are powers of two, i.e., 2, 4, 8, 16, etc. The number of Gaussians is limited by the amount of available training data (the average training speech length per speaker from all sessions after silence removal is 40 s for POLYCOST and 150 s for YOHO). The number of mixtures is incremented up to 16 for POLYCOST and up to 64 for YOHO. We have conducted a series of experiments using the two databases. First, we observed the effect of the different perceptual operations on the performance of the LSF based SI system. The individual effects of CBI, ELP and IPL, as well as their combined effect, were evaluated. Including the ELP and IPL steps independently improves the identification performance over the baseline system.


On the other hand, including the CBI step degrades the SI performance. Hence, the ELP and IPL steps are included and the CBI step is excluded in the proposed Perceptual LSF (PLSF) based SI system. This combination is also used for PLAR for the same reason. Next, we conduct experiments using different baseline features to compare the PIA of the proposed technique. For LP based methods, comparison is made with LSF; for filterbank based methods, with MFCC. Two perceptually motivated features, PLPCC and PLAR, are also evaluated. The feature dimension is set at 19 for all features. In LP based systems, an all-pole model of order 19 is used for the speech signal. For the filterbank based system, 20 filters are used and 19 coefficients are taken for MFCC after discarding the first coefficient, which represents the dc component. Detailed descriptions are available in Chakroborty (2007, 2008). The formulation of the other features is available in Campbell (1997) and Rabiner and Juang (2003).

The results are shown in Tables 1 and 2 for the POLYCOST and YOHO databases respectively. The last column of each table corresponds to the proposed PLSF based SI system, while the rest are based on other baseline features. From the results it is clear that the proposed feature outperforms the other existing techniques. The proposed perceptual feature gives a 6.26% relative improvement over the traditional PLPCC based system and 6.08% over the recently proposed PLAR feature based system on the POLYCOST database. The corresponding improvements on YOHO are 2.77% and 0.224%. The POLYCOST database consists of speech signals collected over a telephone channel; the improvement for this database is more significant than for YOHO, which is microphonic.

Table 1 Comparative Speaker Identification results (PIA) using various spectral features for the POLYCOST database

Model order    LSF        MFCC       PLPCC      PLAR       PLSF
2              60.7427    63.9257    62.9973    64.9867    65.6499
4              66.8435    72.9443    72.2812    74.6684    74.4032
8              75.7294    78.2493    75.0663    78.6472    80.9019
16             78.1167    78.9125    78.3820    78.5146    83.2891

Table 2 Comparative Speaker Identification results (PIA) using various spectral features for the YOHO database

Model order    LSF        MFCC       PLPCC      PLAR       PLSF
2              70.7428    75.5797    66.5761    83.4420    78.1884
4              81.3768    86.1594    76.9203    90.1449    89.0036
8              90.4529    91.4855    85.3080    94.0761    94.1486
16             93.2246    94.5471    90.6341    95.6884    96.1413
32             95.5978    96.0688    93.5326    96.5036    96.9565
64             96.5761    97.0109    94.6920    97.1014    97.3188

The Speaker Identification (SI) performance of the residual moment based feature is also evaluated for both databases. The LP order used to extract the residual signal is kept between 10 and 18, as these orders are often used in speech processing applications; this range is also sufficient to capture effective speaker specific information (Prasanna et al., 2006). We have performed SI experiments for different orders of moments. Empirically, we observed that 4–6 higher order moments are sufficient to capture the vocal cord information. Tables 3 and 4 show the results of the SI experiments using HOSMR. The identification performance is low because the vocal cord parameters are the only cues used for identifying speakers; its importance lies in its complementarity to the vocal tract information.

Table 3 Speaker Identification accuracy for the POLYCOST database using the HOSMR feature (number of mixtures = 16)

             % of accuracy for different numbers of features
LP order    1         2         3         4         5         6         7
10          7.1618    15.6499   17.5066   21.4854   19.4960   19.3634   17.9045
11          7.1618    14.9867   16.4456   20.1592   16.9761   20.5570   18.5676
12          5.5703    15.5172   17.5066   21.4854   21.0875   20.9549   20.6897
13          5.5703    14.7215   17.5066   22.1485   19.7613   20.6897   20.1592
14          7.1618    13.9257   16.0477   21.8833   19.6286   23.3422   19.3634
15          6.7639    15.5172   18.9655   21.7507   20.5570   20.0265   18.0371
16          7.1618    15.5172   17.5066   22.2812   20.2918   21.2202   20.2918
17          7.0292    14.9867   16.8435   21.4854   19.4960   22.4138   18.8329
18          6.7639    16.0477   17.3740   21.6180   21.3528   22.1485   18.0371

Table 4  Speaker identification accuracy for the YOHO database using the HOSMR feature (number of mixtures = 64); % accuracy for different numbers of features

No. of                                   LP order
features     10       11       12       13       14       15       16       17       18
   1       6.7754   6.7754   7.3732   7.4819   7.6268   7.4819   7.8442   7.6993   7.4819
   2      15.3986  16.2862  16.1413  16.1775  16.1051  16.6486  16.1957  16.6486  16.1413
   3      19.9638  20.4891  20.0725  20.6341  20.9964  20.5797  19.8370  21.0688  20.2899
   4      24.5471  25.0181  25.5072  25.3804  25.9783  24.8913  23.8587  25.0725  24.9094
   5      20.3804  21.6486  22.3913  22.2826  22.6268  22.3370  22.0652  21.2500  22.1739
   6      21.6304  21.2862  22.5181  22.1558  22.9348  22.6993  21.9486  21.9565  22.4094
   7      19.9638  20.2355  20.8514  19.8732  21.1594  19.8913  20.7428  20.2717  19.9094
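A minimal sketch of the HOSMR idea behind Tables 3 and 4 is given below: inverse-filter each frame with its LP polynomial to obtain the residual, then take standardised higher-order central moments of the residual. The exact normalisation used in the paper is not reproduced here, and the helper names are ours.

```python
import numpy as np

def lp_residual(frame, a):
    """Inverse-filter the frame with A(z) = 1 + a1 z^-1 + ... + ap z^-p,
    giving the residual e[n] = s[n] + sum_k a_k s[n-k]."""
    return np.convolve(frame, a)[:len(frame)]

def hosmr(frame, a, n_moments=6):
    """Standardised central moments of orders 3..(n_moments + 2) of the
    LP residual -- a sketch of a higher-order residual-moment feature
    (orders 1 and 2 are trivial after mean/variance normalisation)."""
    e = lp_residual(frame, a)
    e = (e - e.mean()) / (e.std() + 1e-12)   # zero mean, unit variance
    return np.array([np.mean(e ** k) for k in range(3, n_moments + 3)])
```

Because the residual is normalised before the moments are taken, the resulting feature characterises the shape of the excitation (vocal cord) signal rather than its energy, which is what makes it complementary to the spectral (vocal tract) features.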

4.3 Fusion of vocal tract and vocal cord information

Here, vocal tract and vocal cord parameters are integrated for identifying speakers. PLSF and HOSMR represent the speech signal in complementary ways; hence, combining the advantages of both features is expected to improve the overall performance of the SI system (Kittler et al., 1998). The block diagram of the combined system is shown in Figure 6.

Figure 6  Block diagram of the fusion technique: score fusion of the vocal tract information based feature (short-term spectral) and the vocal cord information (residual) (see online version for colours)

Spectral features and residual features are extracted from the training data in two separate streams. Speaker modelling is then performed independently for the respective features, and the model parameters are stored in the model database. At testing time the same feature extraction process is adopted. The log-likelihoods of the two feature streams are computed with respect to their corresponding models, and finally the output scores are weighted and combined. To exploit the advantages of both systems and their complementarity, the score-level linear fusion is formulated as in equation (12):

    LLR_combined = η · LLR_spectral + (1 − η) · LLR_residual        (12)

where LLR_spectral and LLR_residual are the log-likelihood ratios calculated from the spectral and residual based systems, respectively, and the fusion weight is determined by the parameter η. In this experiment we take equal evidence from the two systems and set η to 0.5. Tables 5 and 6 show the SI performance of the fused system, which is better than either single-feature system. The LP order used is 17 and the number of HOSMR features is 6. We find that the HOSMR feature always improves the result due to its complementarity, and that the PLSF–HOSMR combination is better than the other combinations. Note that the improvement is larger at lower GMM model orders due to the base effect usually experienced in this type of performance evaluation: for a model order of 2 with the PLSF feature, the PIA of the dual-stream system improves by 6.46% (POLYCOST) and 5.61% (YOHO) over the single-stream approach, whereas for model order 64 the improvements are 1.6% and 0.52%, respectively. In a separate experiment we observed that even when the single-stream spectral feature dimension is increased to 25, the same as the combined dimension of the two-stream model, the two-stream system still performs better, owing to the complementarity of the residual. The 25-dimensional pooled system is also better than the single-stream 25-dimensional spectral feature based system.

Table 5  Speaker identification accuracy for the POLYCOST database using the HOSMR-based feature and the fused system (score-level linear fusion with η = 0.5; HOSMR configuration: LP order = 17, number of higher-order moments = 6)

                    No. of mixtures
HOSMR fused with     2        4        8       16
LSF               65.1194  70.9549  77.8515  80.3714
MFCC              69.3634  76.3926  80.6366  80.5040
PLPCC             65.7825  75.5968  77.3210  80.5040
PLAR              70.6897  77.7188  81.1671  81.5650
PLSF              69.8939  77.9841  81.9629  84.6154

Table 6  Speaker identification accuracy for the YOHO database using the HOSMR-based feature and the fused system (score-level linear fusion with η = 0.5; HOSMR configuration: LP order = 17, number of higher-order moments = 6)

                    No. of mixtures
HOSMR fused with     2        4        8       16       32       64
LSF               76.2319  84.2754  91.2862  94.0217  95.9058  96.7935
MFCC              79.6377  88.1159  92.9167  95.0543  96.5399  97.1558
PLPCC             72.5543  81.0507  87.7717  91.9022  94.3116  95.3986
PLAR              86.3768  91.5761  95.0000  96.4312  97.0290  97.5906
PLSF              82.5725  91.1232  95.1993  96.5036  97.2826  97.8261
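The score-level fusion of equation (12), together with an empirical sweep of the fusion weight, can be sketched as follows. The score-matrix shapes and the label vector are illustrative placeholders, not the paper's implementation.

```python
import numpy as np

def fuse_scores(llr_spectral, llr_residual, eta=0.5):
    """Equation (12): LLR_combined = eta*LLR_spectral + (1-eta)*LLR_residual."""
    return eta * np.asarray(llr_spectral) + (1.0 - eta) * np.asarray(llr_residual)

def sweep_eta(llr_spectral, llr_residual, labels, etas=None):
    """Identification accuracy of the fused system for each fusion weight.
    llr_* are (n_trials, n_speakers) log-likelihood score matrices and
    labels holds the true speaker index of each trial."""
    etas = np.linspace(0.0, 1.0, 101) if etas is None else etas
    accs = []
    for eta in etas:
        predicted = np.argmax(fuse_scores(llr_spectral, llr_residual, eta), axis=1)
        accs.append(float(np.mean(predicted == labels)))
    return etas, np.array(accs)
```

With η = 0 the system reduces to the residual stream and with η = 1 to the spectral stream, so the sweep directly exposes how much complementary evidence the residual contributes at each weight.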

We varied the value of the weight (η) empirically and observed that unequal weighting improves the identification result. Figure 7 shows the identification accuracy of the fused system versus the fusion weight; the PLSF feature gives higher accuracy over a range of fusion weights than the other spectral-feature-based combined systems. An exhaustive search yields a PIA of 84.7480% at η = 0.427 for the POLYCOST database and 97.8623% at η = 0.448 for YOHO.

Figure 7  Speaker identification accuracy on: (a) POLYCOST and (b) YOHO with the fused system for different values of fusion weight (η)

The residual information extracted through the higher-order statistical moments contains more speaker-specific information than is captured by pitch alone. In a different experiment we used a standard pitch detection algorithm to find the pitch of the speech frames and observed the SI performance of the combined system, i.e., pitch plus spectral features. Table 7 shows comparative results for the pitch-based and HOSMR-based fused systems. The performance of the HOSMR-based system is better when both voiced and unvoiced frames are used instead of voiced frames only.

Table 7  Comparative speaker identification results (fused system) on the two databases. The number of Gaussians is 16 for POLYCOST and 64 for YOHO; the fusion weight (η) is set at 0.5. HOSMR configuration: LP order = 17, number of higher-order moments = 6

                      POLYCOST                  YOHO
Spectral        Pitch      HOSMR         Pitch      HOSMR
feature         based      based         based      based
LSF            81.2997    80.3714       96.6123    96.7935
MFCC           78.9125    80.5040       95.7065    97.1558
PLPCC          78.5146    80.5040       93.2971    95.3986
PLAR           79.9735    81.5650       96.7572    97.5906
PLSF           84.0849    84.6154       97.1377    97.8261

The databases used in our experiments have the same training and testing conditions, with significant session variability. To check the performance of the system when the test data are corrupted with noise, we conducted a separate experiment: additive white Gaussian noise was added to the test signals at different SNR levels for the full POLYCOST database, with no speech enhancement. The results, shown in Table 8, indicate that the performance of the fused system remains significantly better than the single-feature systems even at low SNR. It is worth mentioning that the proposed HOSMR-based fused system with MFCC improves over the single-stream vocal tract features at higher noise levels, while the fused system with PLSF performs best at higher SNR.

Table 8  Performance of the speaker identification system in the presence of additive white Gaussian noise. Results are for the POLYCOST database (telephonic), with speakers modelled using 16 Gaussian components; the fusion weight (η) is set at 0.5. HOSMR configuration: LP order = 17, number of higher-order moments = 6

SNR (dB)    MFCC      PLSF     MFCC+HOSMR  PLSF+HOSMR
40         78.9125   82.8912    80.2387     83.4218
30         72.2812   71.4854    75.7294     75.8621
20         43.8992   34.4828    52.6525     35.4111
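The noise-corruption setup of this robustness experiment can be sketched with a generic additive-white-Gaussian-noise helper; the function name is ours.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise to `signal` so that the resulting
    signal-to-noise ratio is `snr_db` decibels."""
    rng = np.random.default_rng() if rng is None else rng
    p_signal = np.mean(np.asarray(signal, dtype=float) ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))       # SNR = 10 log10(Ps/Pn)
    noise = rng.normal(0.0, np.sqrt(p_noise), size=np.shape(signal))
    return signal + noise
```

Because the noise power is scaled to the measured signal power, the same helper produces each SNR condition (40, 30, 20 dB) from the clean test utterances without any per-file tuning.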

5 Conclusion

A novel scheme to improve the performance of speaker recognition systems has been presented, in which speaker information captured by the vocal tract and vocal cord parameters is fused to achieve the best performance. A novel spectral feature, Perceptual Line Spectral pairs Frequency (PLSF), is proposed that effectively exploits the advantages of LSF and perceptual analysis of the speech signal to capture the vocal tract parameters. PLSF outperforms the other spectral features in identification accuracy and, unlike the others, does not show a dip as the dimension increases; rather, it shows incremental improvement. In addition, a novel complementary feature is proposed based on Higher-Order Statistical Moments of the Residual (HOSMR) signal, which captures the vocal cord characteristics. Experiments on two standard databases show the superiority of the proposed PLSF feature and of its fusion with HOSMR.

References Abdulla, W.H. (2007) ‘Robust speaker modelling using perceptually motivated feature’, Pattern Recogn. Lett., Vol. 28, No. 11, pp.1333–1342. Atal, B.S. (1974) ‘Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification’, The Journal of the Acoustical Society of America, Vol. 55, No. 6, pp.1304–1312. Bäckström, T. and Magi, C. (2006) ‘Properties of line spectrum pair polynomials: a review’, Signal Process, Vol. 86, No. 11, pp.3286–3298. Bishnu, S.A., Cuperman, V. and Gersho, A. (2003) Advances in Speech Coding, Kluwer Academic Publishers, USA.


Campbell, J.P. (1997) ‘Speaker recognition: a tutorial’, Proceedings of the IEEE, Vol. 85, No. 9, pp.1437–1462. Chakroborty, S., Roy, A., Majumdar, S. and Saha, G. (2007) ‘Capturing complementary information via reversed filter bank and parallel implementation with mfcc for improved text-independent speaker identification’, International Conference on Computing: Theory and Applications, ICCTA’07, Kolkata, India, pp.463–467. Chakroborty, S. (2008) Some Studies on Acoustic Feature Extraction, Feature Selection and Multi-Level Fusion Strategies for Robust Text-Independent Speaker Identification, PhD Dissertation, Indian Institute of Technology Kharagpur, Kharagpur, India. Chetouani, M., Faundez-Zanuy, M., Gas, M.B. and Zarader, J. (2009) ‘Investigation on lp-residual representations for speaker identification’, Pattern Recognition, Vol. 42, No. 3, pp.487–494. Chu, W.C. (2003) Speech Coding Algorithms, John-Wiley, USA. Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) ‘Maximum likelihood from incomplete data via the em algorithm’, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 39, pp.1–38. Erkelens, J. and Broersen, P. (1995) ‘On the statistical properties of line spectrum pairs’, International Conference on Acoustics, Speech, and Signal Processing, ICASSP-95, Vol. 1, pp.768–771. Faundez-Zanuy, M. and Monte-Moreno, E. (2005) ‘State-of-the-art in speaker recognition’, Aerospace and Electronic Systems Magazine, IEEE, Vol. 20, No. 5, pp.7–12. Hermansky, H. (1990) ‘Perceptual linear predictive (plp) analysis of speech’, The Journal of the Acoustical Society of America, Vol. 87, No. 4, pp.1738–1752. Hermansky, H. and Morgan, N. (1994) ‘Rasta processing of speech’, IEEE Transactions on Speech and Audio Processing, Vol. 2, No. 4, pp.578–589. Higgins, A., Porter, J. and Bahler, L. (1989) Yoho Speaker Authentication Final Report, ITT Defense Communications Division, Tech. Rep. Itakura, F. 
(1975) ‘Line spectrum representation of linear predictor coefficients of speech signals’, The Journal of the Acoustical Society of America, Vol. 57, No. S1, p.S35. Kinnunen, T. (2004) Spectral Features for Automatic Textindependent Speaker Recognition, PhD Dissertation, University of Joensuu, Joensuu, Finland. Kinnunen, T. and Li, H. (2010) ‘An overview of text-independent speaker recognition: from features to supervectors’, Speech Communication, Vol. 52, No. 1, pp.12–40. Kittler, J., Hatef, M., Duin, R. and Matas, J. (1998) ‘On combining classifiers’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 3, pp.226–239. Kondoz, A.M. (2004) Digital Speech Coding for Low Bit Rate Communication Systems, 2nd ed., John Wiley & Sons Ltd., England. Lee, B.J., Kim, S. and Kang, H.-G. (2004) ‘Speaker recognition based on transformed line spectral frequencies’, International Symposium on Intelligent Signal Processing and Communication Systems, ISPACS 2004, Proceedings of 2004, pp.177–180. Lepschy, A., Mian, G. and Viaro, U. (1988) ‘A note on line spectral frequencies [speech coding]’, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 36, No. 8, pp.1355–1357. Liao, S. and Pawlak, M. (1996) ‘On image analysis by moments’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 3, pp.254–266. Linde, Y., Buzo, A. and Gray, R. (1980) ‘An algorithm for vector quanisation design’, IEEE Transactions on Communications, Vol. COM-28, No. 4, pp.84–95.


Liu, C-S., Wang, W-J., Lin, M-T. and Wang, H-C. (1990) ‘Study of line spectrum pair frequencies for speaker recognition’, International Conference on Acoustics, Speech, and Signal Processing, ICASSP-90, Albuquerque, New Mexico, USA, Vol. 1, pp.277–280. Lo, C.H. and Don, H.S. (1989) ‘3-d moment forms: their construction and application to object identification and positioning’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 10, pp.1053–1064. Markov, K.P. and Nakagawa, S. (1999) ‘Integrating pitch and lpc-residual information with lpc-cepstrum for text-independent speaker recognition’, J. Acoust. Soc. Jpn. E, Vol. 20, No. 4, pp.281–291. Matsui, T. and Tanabe, K. (2006) ‘Comparative study of speaker identification methods: dplrm, svm and gmm’, IEICE – Trans. Inf. Syst., Vol. E89-D, No. 3, pp.1066–1073. Mattson, S.G. and Pandit, S.M. (2006) ‘Statistical moments of autoregressive model residuals for damage localisation’, Mechanical Systems and Signal Processing, Vol. 20, No. 3, pp.627–645. McLoughlin, I.V. (2008) ‘Review: line spectral pairs’, Signal Process, Vol. 88, No. 3, pp.448–467. Melin, H. and Lindberg, J. (1996) ‘Guidelines for experiments on the polycost database’, Proceedings of a COST 250 Workshop on Application of Speaker Recognition Techniques in Telephony, Vigo, Spain, pp.59–69. Murty, K. and Yegnanarayana, B. (2006) ‘Combining evidence from residual phase and mfcc features for speaker recognition’, Signal Processing Letters, IEEE, Vol. 13, No. 1, pp.52–55. Nandi, A. (1994) ‘Higher order statistics for digital signal processing’, IEE Colloquium on Mathematical Aspects of Digital Signal Processing, London, England, pp.6/1–6/4. Nemer, E., Goubran, R. and Mahmoud, S. (2001) ‘Robust voice activity detection using higher-order statistics in the lpc residual domain’, IEEE Transactions on Speech and Audio Processing, Vol. 9, No. 3, pp.217–231. Paliwal, K. 
(1992) ‘On the use of line spectral frequency parameters for speech recognition’, Digital Signal Processing, Vol. 2, No. 2, pp.80–87. Prasanna, S.M., Gupta, C.S. and Yegnanarayana, B. (2006) ‘Extraction of speakerspecific excitation information from linear prediction residual of speech’, Speech Communication, Vol. 48, No. 10, pp.1243–1261. Rabiner L. and Juang, B.H. (2003) Fundamental of Speech Recognition, Pearson Education, First Indian Reprint, India. Ramachandran, R.P., Farrell, K.R., Ramachandran, R. and Mammone, R.J. (2002) ‘Speaker recognition–general classifier approaches and data fusion methods’, Pattern Recognition, Vol. 35, No. 12, pp.2801–2821. Reynolds, D.A. (1992) A Gaussian Mixture Modeling Approach to Text-independent Speaker Identification, PhD Dissertation, Georgia Institute of Technology, Georgia, USA. Reynolds, D. (1994) ‘Experimental evaluation of features for robust speaker identification’, IEEE Transactions on Speech and Audio Processing, Vol. 2, No. 4, pp.639–643. Reynolds, D. and Rose, R. (1995) ‘Robust text-independent speaker identification using gaussian mixture speaker models’, IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, pp.72–83. Reynolds, D. (2002) ‘An overview of automatic speaker recognition technology’, IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings (ICASSP’02), Vol. 4, pp.IV–4072–IV–4075.


Soong, F. and Juang, B. (1984) ‘Line spectrum pair (lsp) and speech data compression’, Vol. 9, pp.37–40. Stevens, S.S. (1957) ‘On the psychophysical law’, Psychological Review, Vol. 64, No. 3, pp.153–181. Teh, C.H. and Chin, R. (1988) ‘On image analysis by the methods of moments’, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10, No. 4, pp.496–513. Tourneret, J.Y. (1998) ‘Statistical properties of line spectrum pairs’, Signal Processing, Vol. 65, No. 2, pp.239–255. Yoma, N.B. and Pegoraro, T.F. (2002) ‘Robust speaker verification with state duration modelling’, Speech Communication, Vol. 38, Nos. 1–2, pp.77–88. Yuan, Z.X., Xu, B.L. and Yu, C.Z. (1999) ‘Binary quantisation of feature vectors for robust text-independent speaker identification’, IEEE Transactions on Speech and Audio Processing, Vol. 7, No. 1, pp.70–78. Zheng, N., Lee, T. and Ching, P.C. (2007) ‘Integration of complementary acoustic features for speaker recognition’, Signal Processing Letters, IEEE, Vol. 14, No. 3, pp.181–184.
