2010 International Conference on Pattern Recognition

A Gradient Descent Approach for Multi-Modal Biometric Identification

Nalini Ratha IBM TJ Watson Research Centre, USA [email protected]

Jayanta Basak, Kiran Kate, Vivek Tyagi IBM Research - India, India bjayanta, kirankate, [email protected]

1051-4651/10 $26.00 © 2010 IEEE DOI 10.1109/ICPR.2010.329

Abstract: Biometrics-based identification is a key technology in many critical applications, such as searching for an identity in a watch list or checking for duplicates in a citizen ID card system. Building such a solution is technically challenging because the database can be very large (often hundreds of millions of records) and because the underlying biometrics engines have intrinsic errors. Multi-modal biometrics is often proposed as a way to improve the accuracy of the underlying biometrics. In this paper, we propose a score-based fusion scheme tailored for identification applications. The proposed algorithm uses a gradient descent method to learn a weight for each modality such that the weighted sum of genuine scores is larger than the weighted sum of the impostor scores. During the identification phase, the top K candidates from each modality are retrieved and a superset of identities is constructed. Using the learnt weights, we compute the weighted score for all the candidates in the superset. The highest scoring candidate is declared the top candidate for identification. The proposed algorithm has been tested on the NIST BSSR1 dataset, and the results, in terms of both accuracy and speed (execution time), are shown to be far superior to the published results on this dataset.

I. INTRODUCTION
Biometrics-based recognition systems have already proven useful in many high-security applications. A key step in building a trusted biometrics system is a secure enrolment process in which no duplicates are allowed into the system. For example, citizen ID card systems, passport issuing systems and voter ID systems cannot function with duplicates in the system. Such 1:N matching systems are also needed for watch-list matching. It is often believed that one can use a 1:1 biometrics system and iterate over the database to pick the best matching cases. Such an approach can be non-scalable and error prone. Multi-biometrics systems have been proposed to improve accuracy for both verification and identification systems. While multi-biometrics for verification systems has received considerable interest in the research community, very little research has been reported for identification systems involving biometrics fusion [1]. It is assumed that identification based on such approaches would improve as the core biometrics authentication performance improves using multi-biometrics.

Biometrics fusion can happen at several levels [2]: decision level, score level, feature level and signal level. In decision-level fusion, only decisions from the biometrics engines are assumed to be available [3], [4]. At score level, the matching score (similarity or dissimilarity) is available for fusion purposes [5], [6]. These two methods tend to be highly popular as they do not require any prior knowledge about the underlying matching algorithm. The next level of fusion involves integrating features from different biometrics engines. At the lowest level of integration, the signals acquired for a modality can be fused to improve the basic signal or to supplement the signal with additional information. Feature-level and signal-level fusion, while promising, have not been the focus of research in many cases. Similarly, as pointed out earlier, most of the current research is focused on the verification problem. For identification systems, rank-based fusion is often also considered, where the candidate list generated by the biometrics classifier is used as the input to the fusion stage [7], [3].

In this paper, we focus on score-level fusion for identification problems. There are two stages in the proposed algorithm. In the training phase, we learn a weight for each modality in such a way that the weighted score of the genuine candidate is greater than the weighted score of the impostors. The weights are learnt using a gradient descent method. In the testing stage, these weights are applied to the members of the candidate list from each modality. Our method has been tested using the NIST BSSR1 dataset. We compare our results with published results on this dataset and demonstrate the superiority of our algorithm in terms of both accuracy and speed.

The rest of the paper is organized as follows. In Section II, we formulate the weight learning problem and present the gradient descent algorithm for learning these weights. The database used in testing the proposed algorithm and the results are presented in Section III. We present conclusions from our work in Section IV.

II. GRADIENT DESCENT APPROACH
Let $x$ be a query sample for an individual, and $r(x, y)$ be the matching score when an individual $y$ is matched against $x$. We can have $M$ different modalities. For example, we can have two different fingerprint matchers and two different face matchers; in that case, $M = 4$. Let us represent the matching score of the individual $x$ against $y$ for the $i$th modality as $r_i(x, y)$. We view score fusion in such a way that the total matching score for an individual $x$ matched against $x$ should be the highest when compared to the matching scores with other individuals. We assign a certain weight to each modality. The task is to determine these weights such that, for any $x$, the weighted sum of the matching scores against that particular $x$ is maximum. In other words,

$$ \sum_{i} w_i \, r_i(x, x) > \sum_{i} w_i \, r_i(x, y) \tag{1} $$

for all $y \neq x$. In order to have a margin of separation between the maximum and the next maximum value, we modify the condition as

$$ \sum_{i} w_i \, r_i(x, x) > \lambda \sum_{i} w_i \, r_i(x, y) \tag{2} $$

where $\lambda = 1 + \epsilon$, $\epsilon > 0$ being a small constant.

In order to determine the weights, we first define an error measure for a pair of samples $(x, y)$. If the condition in Equation 2 is satisfied for a pair $(x, y)$, then there is no error; otherwise the pair contributes to the error measure. The error measure is given as

$$ e_{xy} = \begin{cases} 0 & \text{if } \sum_{i=1}^{M} w_i \left( r_i(x, x) - \lambda r_i(x, y) \right) > 0 \\ \sum_{i=1}^{M} w_i \left( \lambda r_i(x, y) - r_i(x, x) \right) & \text{otherwise} \end{cases} \tag{3} $$

The objective of the gradient descent based approach is to adapt the weights such that the total error thus defined is minimized. We define the total error as

$$ E = \frac{1}{2} \sum_{x,y} e_{xy}^2 \tag{4} $$

Since $e_{xy}$ is not symmetric, it is not possible to perform linear regression over $w_i$ to determine the optimal weights. We determine the optimal weights by gradient descent, given as

$$ \Delta w_i = -\eta \frac{\partial E}{\partial w_i} \tag{5} $$

where $\eta$ is an adaptation or learning rate. Evaluating the partial, we obtain

$$ \Delta w_i = \eta \sum_{x,y} e_{xy} \left( r_i(x, x) - \lambda r_i(x, y) \right) \tag{6} $$

We can observe that the pairs of samples $(x, y)$ for which $e_{xy} = 0$ do not contribute to the weight adaptation. For a pair with $e_{xy} > 0$, a modality whose own scores violate the margin, $r_i(x, x) < \lambda r_i(x, y)$, receives a negative contribution to $\Delta w_i$, so the method reduces the weight of that particular modality, and vice versa.

The convergence depends on the selection of $\eta$. For a very small $\eta$, the convergence can be slow, whereas for a relatively large value of $\eta$, the weights may not converge at all. We determine $\eta$ automatically using a line search method. We adapt the weights in each iteration such that

$$ E + \Delta E = 0 \tag{7} $$

In other words, assuming the local error surface to be linear, we change the weights such that the total error reduces to zero. Further, we have

$$ \Delta E = \sum_{i} \frac{\partial E}{\partial w_i} \Delta w_i \tag{8} $$

From Equation 5, we have

$$ \Delta E = -\eta \sum_{i=1}^{M} \left( \frac{\partial E}{\partial w_i} \right)^2 \tag{9} $$

Using the linear error surface, from Equations 7 and 9, we have

$$ \eta = \frac{E}{\sum_{i=1}^{M} \left( \frac{\partial E}{\partial w_i} \right)^2} \tag{10} $$

Expanding the partial, we obtain the learning (adaptation) rate $\eta$ as

$$ \eta = \frac{\sum_{x,y} e_{xy}^2}{2 \sum_{i=1}^{M} \left( \sum_{x,y} e_{xy} \left( r_i(x, x) - \lambda r_i(x, y) \right) \right)^2} \tag{11} $$

In the vicinity of the minima of the error surface, the denominator of Equation 11 becomes very small, and that can cause instability in the value of $\eta$. We therefore modify the value of $\eta$ as

$$ \eta = \frac{\sum_{x,y} e_{xy}^2}{1 + 2 \sum_{i=1}^{M} \left( \sum_{x,y} e_{xy} \left( r_i(x, x) - \lambda r_i(x, y) \right) \right)^2} \tag{12} $$

During the training phase, we adapt the weights iteratively until the change in the weights falls below a specified threshold or a specified number of iterations is reached. Once the weights are obtained from the training data, we use these weights for the test data.

For each query, we retrieve the top K samples for each modality and form the superset of these samples. In computing the superset, some of the samples may not have scores for all modalities. We use the minimum score that has been retrieved for the respective modality to fill the blank scores of these samples. For example, let the retrieved samples in the fingerprint modality be A, B, and C, and let the retrieved samples in the face modality be B, C, and D. Let the order be A > B > C and B > C > D respectively. We do not fill the face score of A as zero. Similarly, we do not fill the finger score of D as zero. We fill the face score of A with that of D and the finger score of D with that of C (the minimum scores). Here the size of the superset is 4.

Once we have the superset with the respective scores, we compute the weighted aggregated scores using the weights learned during the training phase. We then take the sample with the maximum aggregated score as the matched identity. We also obtain the top k identities according to the aggregated score and check whether the query sample is contained in these top k samples, for a specific value of k.
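The weight learning defined by Equations 3 to 6 and the stabilised learning rate of Equation 12 can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the data layout (`genuine`, `impostors`), the equal-weight starting point, the default `lam=1.05` and the stopping constants are all assumptions made here, since the paper does not specify them.

```python
def pair_differences(genuine, impostors, lam):
    """Return d[p][i] = r_i(x, x) - lam * r_i(x, y) for every (x, y) pair p.

    genuine[n][i]      : r_i(x_n, x_n), genuine score of query n in modality i
    impostors[n][j][i] : r_i(x_n, y_j), score of query n against impostor j
    (this layout is an assumption made for the sketch)
    """
    return [
        [g_i - lam * r_i for g_i, r_i in zip(g, imp)]
        for g, imps in zip(genuine, impostors)
        for imp in imps
    ]


def pair_errors(w, d):
    """Equation 3: zero when the margin holds, else the size of the violation."""
    return [max(0.0, -sum(w_i * d_i for w_i, d_i in zip(w, row))) for row in d]


def total_error(w, d):
    """Equation 4: E = 1/2 * sum of squared pair errors."""
    return 0.5 * sum(e * e for e in pair_errors(w, d))


def learn_weights(genuine, impostors, lam=1.05, max_iter=200, tol=1e-8):
    """Gradient descent on E with the adaptive rate of Equation 12."""
    d = pair_differences(genuine, impostors, lam)
    m = len(genuine[0])
    w = [1.0 / m] * m  # assumed starting point: equal weights
    for _ in range(max_iter):
        e = pair_errors(w, d)
        # g[i] = sum_{x,y} e_xy * (r_i(x, x) - lam * r_i(x, y)) = -dE/dw_i
        g = [sum(e_p * row[i] for e_p, row in zip(e, d)) for i in range(m)]
        # Equation 12: the added 1 keeps eta bounded near a minimum
        eta = sum(e_p * e_p for e_p in e) / (1.0 + 2.0 * sum(g_i * g_i for g_i in g))
        dw = [eta * g_i for g_i in g]  # Equation 6
        w = [w_i + dw_i for w_i, dw_i in zip(w, dw)]
        if max(abs(dw_i) for dw_i in dw) < tol:
            break
    return w
```

With toy scores in which the first modality separates genuine from impostor pairs and the second one misleads, the update moves weight from the second modality to the first, and pairs that already satisfy the margin leave the weights unchanged.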

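The test-phase fusion of Section II (superset of the top-K candidates, filling of missing scores with the per-modality minimum, ranking by weighted sum) can be sketched as below. The function name, the dictionary-based data layout and the numeric scores and weights used in the usage note are hypothetical; only the A/B/C/D candidate structure comes from the text.

```python
def fuse_candidates(top_k_lists, weights):
    """Rank the superset of retrieved candidates by weighted aggregated score.

    top_k_lists[i] maps candidate id -> score for the top K candidates
    retrieved by modality i; weights[i] is the learnt weight of modality i.
    """
    superset = set()
    for scores in top_k_lists:
        superset.update(scores)
    # A candidate missing from a modality gets that modality's minimum
    # retrieved score (not zero), as described in the text.
    floors = [min(scores.values()) for scores in top_k_lists]
    fused = {
        cand: sum(
            w * scores.get(cand, floor)
            for w, scores, floor in zip(weights, top_k_lists, floors)
        )
        for cand in superset
    }
    # Highest aggregated score first; ties broken by candidate id.
    return sorted(fused, key=lambda c: (-fused[c], c))
```

For instance, with hypothetical finger scores `{"A": 0.9, "B": 0.7, "C": 0.5}`, face scores `{"B": 0.8, "C": 0.6, "D": 0.4}` and weights `[0.6, 0.4]`, candidate A receives the face score of D and candidate D receives the finger score of C before the weighted sums are ranked.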

III. EXPERIMENTAL RESULTS
We have evaluated the identification performance of the gradient descent method on the public-domain NIST-BSSR1 dataset [8]. This dataset contains multi-modal (two fingerprint and two face) scores for 517 users (NIST-517). It also contains scores from two fingerprint matchers (left and right index fingerprint) for 6000 users (NIST-Fingerprint, referred to below as NIST-Finger) and scores from two face matchers for 3000 users (NIST-Face). We have compared the gradient descent approach for fusion with the individual modalities, a score fusion method and a rank fusion method. The score fusion technique is the likelihood ratio based identification proposed in [1], which is referred to as LRT-GMM in this section. The rank fusion is the well-known highest rank method [9], which assigns to each user the minimum of the ranks assigned by the different matchers. In accordance with [1], we observed that this rank fusion technique gives better performance than other methods such as the sum rule, Borda count and logistic regression for ranks greater than the number of matchers.

The cumulative match characteristic (CMC) curves in Fig. 1 and Fig. 2 show the results for the NIST-Face and NIST-Finger datasets respectively. Since the NIST-517 dataset is very small and both our method and LRT-GMM resulted in 100% rank-1 accuracy, we have not reported results on that dataset. The identification accuracies are average values over 20 trials, where each trial was conducted by randomly splitting the users in the dataset into half for training and half for testing. The value of K is 50 for all these experiments. While the performance of our method was good on the original dataset, we observed that it improved after a simple preprocessing step. The preprocessing involved raising the data points to a positive power. We have chosen the values of the power empirically and have reported them in the results.

The CMC curves and Table I indicate that our method outperforms the existing methods for all values of rank. For the NIST-Face dataset, the rank-1 to rank-10 accuracies of our method are approximately 0.7-1% better than those of the LRT-GMM technique. The rank-1 accuracy of our method is around 4% better than that of the highest rank method, and the rank-2 to rank-10 accuracies are also better by 0.3-1%. Clearly, the fusion achieves a large improvement in the identification accuracy over the individual matchers for all values of rank. Similar results are observed for the NIST-Finger dataset as well. Our method achieves an improvement of 0.5-0.7% in accuracy over LRT-GMM consistently from rank 1 to 10. The rank-1 accuracy is around 9% greater than that of the highest rank fusion.

Table II gives the execution time in seconds for the different methods. We have measured the execution time for the calculation of rank-1 accuracy for a single trial of the experiment with a half-half training-test split of the dataset. The table shows that our method is much faster than the LRT-GMM method: testing takes 13.76 s versus 62.46 s on NIST-Face and 11.54 s versus 593.18 s on NIST-Finger. The large difference in the execution time for the NIST-Finger dataset shows that our method is more scalable than LRT-GMM. It is to be noted that the highest rank and gradient descent algorithms were implemented in MATLAB and LRT-GMM in C++. The higher execution time of the highest rank method may be attributed to its sub-optimal implementation in MATLAB. In real-life identification scenarios, the training is performed off-line, and hence only the efficiency of the testing phase affects the identification performance. The testing phase of our method involves ranking according to the weighted sum of the scores of the individual modalities and hence is efficient and scalable.

Figure 1. Comparison of CMC curves on NIST-Face database. [Plot: rank-k identification accuracy (%) versus rank (k), k = 1 to 10, for Face Matcher 1, Face Matcher 2, Highest Rank Fusion, LRT-GMM, and Gradient Descent (power = 1.2).]

Figure 2. Comparison of CMC curves on NIST-Finger database. [Plot: rank-k identification accuracy (%) versus rank (k), k = 1 to 10, for Left Index Finger, Right Index Finger, Highest Rank Fusion, LRT-GMM, and Gradient Descent (power = 0.5).]
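The quantities behind these CMC curves can be illustrated with a short Python sketch: the power preprocessing, the highest rank baseline [9], and the rank-k identification accuracy. All function names and data layouts here are assumptions made for illustration; the paper does not give its evaluation code. The power transform assumes non-negative scores, since a negative float raised to a fractional power yields a complex number in Python.

```python
def power_transform(scores, power):
    """Preprocessing step: raise each (non-negative) score to a positive power."""
    return [s ** power for s in scores]


def highest_rank_fusion(rank_lists):
    """Highest rank baseline: each identity keeps its minimum (best) rank.

    rank_lists[i] maps identity -> rank assigned by matcher i (1 is best).
    """
    fused = {}
    for ranks in rank_lists:
        for identity, rank in ranks.items():
            fused[identity] = min(rank, fused.get(identity, rank))
    return fused


def rank_of_genuine(fused_scores, genuine_id):
    """1-based position of the genuine identity when sorted by descending score."""
    ordered = sorted(fused_scores, key=lambda c: (-fused_scores[c], c))
    return ordered.index(genuine_id) + 1


def cmc(genuine_ranks, max_rank=10):
    """Rank-k identification accuracy (%) for k = 1..max_rank.

    genuine_ranks holds, for each query, the rank at which the genuine
    identity was returned.
    """
    n = len(genuine_ranks)
    return [
        100.0 * sum(r <= k for r in genuine_ranks) / n
        for k in range(1, max_rank + 1)
    ]
```

Each point on a CMC curve is then the fraction of test queries whose genuine identity appears within the top k fused candidates, averaged over the trials.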

Table I. COMPARISON OF RANK-K IDENTIFICATION ACCURACY (%).

Dataset  Method            k=1    k=2    k=3
Face     Matcher 1         84.47  87.25  88.57
Face     Matcher 2         81.13  84.57  86.08
Face     Highest Rank      83.06  89.23  90.68
Face     LRT-GMM           87.24  89.48  90.64
Face     Gradient Descent  87.91  90.51  91.65
Finger   Matcher 1         81.80  83.73  84.56
Finger   Matcher 2         88.71  90.18  90.87
Finger   Highest Rank      85.71  95.00  95.49
Finger   LRT-GMM           94.33  95.09  95.40
Finger   Gradient Descent  94.94  95.65  95.97

Table II. COMPARISON OF EXECUTION TIME (IN SECONDS) FOR DIFFERENT FUSION METHODS.

Method            NIST-Face Train  NIST-Face Test  NIST-Finger Train  NIST-Finger Test
LRT-GMM           338.15           62.46           1809.05            593.18
Gradient Descent  277.14           13.76           194.58             11.54
Highest Rank      N/A              168.88          N/A                671.67

IV. CONCLUSION
Designing effective and scalable biometric fusion algorithms for identification is an important and challenging problem. At the level of score fusion, the weighted sum of scores is a well-known method. In this paper, we proposed a gradient descent technique to learn these weights for the individual modalities. We considered a linear combination of the modalities to perform the identification; however, the method can be extended to incorporate non-linear combinations as well. We conducted experiments on the NIST-BSSR1 dataset and demonstrated that the gradient descent approach is able to perform better than the state-of-the-art identification algorithms in terms of accuracy and speed. As a future study, we will perform the experiments on other multi-modal biometric datasets.

REFERENCES
[1] K. Nandakumar, A. K. Jain, and A. Ross, “Fusion in multibiometric identification systems: What about the missing data?,” in ICB ’09: Proceedings of the Third International Conference on Biometrics, pp. 743–752, 2009.
[2] A. K. Jain, K. Nandakumar, and A. Ross, “Score normalization in multimodal biometric systems,” Pattern Recognition, vol. 38, no. 12, pp. 2270–2285, 2005.
[3] M. L. Gavrilova and M. M. Monwar, “Fusing multiple matcher’s outputs for secure human identification,” International Journal of Biometrics, vol. 1, no. 3, pp. 329–348, 2009.
[4] S. Prabhakar and A. K. Jain, “Decision-level fusion in fingerprint verification,” Pattern Recognition, vol. 35, no. 4, pp. 861–874, 2002.
[5] M. Villegas and R. Paredes, “Score fusion by maximizing the area under the ROC curve,” in Pattern Recognition and Image Analysis: 4th Iberian Conference, IbPRIA, 2009.
[6] K. Nandakumar, Y. Chen, S. C. Dass, and A. K. Jain, “Likelihood ratio based biometric score fusion,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 342–347, 2008.
[7] A. Abaza and A. Ross, “Quality based rank-level fusion in multibiometric systems,” in Proc. of IEEE BTAS, 2009.
[8] “National Institute of Standards and Technology, NIST Biometric Scores Set Release 1, http://www.itl.nist.gov/iad/894.03/biometricscores/,” 2004.
[9] T. K. Ho, J. J. Hull, and S. N. Srihari, “Decision combination in multiple classifier systems,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 66–75, 1994.
