Neurocomputing 216 (2016) 393–401

Contents lists available at ScienceDirect

Neurocomputing journal homepage: www.elsevier.com/locate/neucom

Learning coherent vector fields for robust point matching under manifold regularization

Gang Wang a,b,*, Zhicheng Wang b, Yufei Chen b, Xianhui Liu b, Yingchun Ren b, Lei Peng c

a School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China
b CAD Research Center, College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
c College of Information Engineering, Taishan Medical University, Taian, Shandong, China

Article info

Article history: Received 9 January 2016; received in revised form 3 June 2016; accepted 1 August 2016; communicated by Jiwen Lu; available online 8 August 2016.

Keywords: Point matching; Mismatch removal; Vector field learning; Manifold regularization; Kernel

Abstract

In this paper, we propose a robust method for coherent vector field learning with outliers (mismatches) using manifold regularization, called manifold regularized coherent vector field (MRCVF). The method separates outliers from inliers (correct matches) and learns coherent vector fields that fit the inliers under a graph Laplacian constraint. In the proposed method, we first formulate the point matching problem as learning a corresponding vector field based on a mixture model (MM). A manifold regularization term is added to preserve the intrinsic geometry of the mapped point set of vector fields. More specifically, the optimal mapping function is obtained by solving a weighted Laplacian regularized least squares (LapRLS) problem in a reproducing kernel Hilbert space (RKHS) with a matrix-valued kernel. Moreover, we use the Expectation Maximization (EM) optimization algorithm to update the unknown parameters in each iteration. The experimental results on the synthetic data set, real image data sets, and non-rigid images quantitatively demonstrate that our proposed method is robust to outliers, and that it outperforms several state-of-the-art methods in most scenarios. © 2016 Elsevier B.V. All rights reserved.

* Corresponding author at: School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China. E-mail address: [email protected] (G. Wang).

http://dx.doi.org/10.1016/j.neucom.2016.08.009
0925-2312/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Point matching is a fundamental problem that plays a significant role in computer vision, signal processing, and pattern recognition [1–6], and it arises in many applications, such as image registration, medical imaging, 3D reconstruction, image stitching, and object recognition. The goal of the matching task is to distinguish inliers from outliers between two given point sets, where each point set is extracted from an image by a local feature extractor (e.g., SIFT [7], SURF [8,9]). However, the matching problem poses several challenges: (1) the initial correspondence set is usually contaminated by outliers (false matches or mismatches) after feature point pairs are matched with a similarity-based method such as Best Bin First (BBF) [7]; (2) the matching problem is ill-posed and needs a constraint that preserves the intrinsic geometry of the point set; (3) the transformation between point sets can be linear (e.g., translation, similarity, affine) or non-linear (e.g., quadratic, non-rigid), and the latter is hard to solve.

Many algorithms exist for point matching and try to address the above challenges. The most popular algorithm in the field is RANdom SAmple Consensus (RANSAC) [10], which repeatedly generates a hypothetical model from a small correspondence set and then verifies each model on the whole set to select the best one. However, it is limited when facing non-linear transformations. To overcome this limitation, many RANSAC variants have been developed, such as maximum likelihood estimation sample consensus (MLESAC) [11], progressive sample consensus (PROSAC) [12], and non-rigid RANSAC [13]. It is worth noting that Sunglok et al. [14] have evaluated the performance of the RANSAC algorithm family. Correct matches can also be identified by iterative point matching methods [15,16].

Further, from the perspective of motion coherence (i.e., spatial coherence), the floating point set is moved as close as possible to the target point set by a set of smooth mapping functions. Some state-of-the-art methods are based on this motion field coherence theory (MCT) [17], such as coherent point drift (CPD) [18], Gaussian mixture model and thin-plate spline (GMM–TPS) [19], vector field consensus (VFC) [20,3,21], the mixture of asymmetric Gaussian (MoAG) model [22,23], robust L2E estimation [24], and the context-aware Gaussian fields criterion (CA-LapGF) [25]. More specifically, the non-rigid transformation is parameterized by a radial basis function (RBF), such as the thin-plate spline (TPS) or the Gaussian RBF (GRBF). Finally, outliers are rejected as well as possible after a coherent motion field has been learned from the point set pairs.


G. Wang et al. / Neurocomputing 216 (2016) 393–401

Moreover, a topological clustering algorithm [26] was proposed to filter out mismatches. With this method, outliers are identified and rejected by checking the consistency of topological relationships between matched regions in the image pair. Support vector machine regression has also been used to identify point correspondences and remove outliers (ICF) [27].

In this paper, we focus on identifying and removing outliers from point set matching based on vector field learning (VFL). More specifically, we first formulate point matching as learning a coherent vector field mapping function, and then use manifold regularization to constrain the vector field so that it preserves the intrinsic geometry. Our contribution includes the following two aspects. (1) We introduce the well-known manifold regularization framework for learning coherent vector fields with outliers. (2) Based on the MCT point matching model, we propose a manifold regularized coherent vector field learning method (manifold regularized coherent vector field, MRCVF) for robust point matching, which can improve the matching accuracy compared to state-of-the-art methods. It is worth noting that MRCVF builds on VFL methods such as VFC [20], and the motivation derives from (1) the initial correspondence set being contaminated by outliers, and (2) the natural property of manifold regularization.

The remainder of the paper is organized as follows. In Section 2 we present the coherent vector field learning algorithm more formally using the manifold regularization constraint. In Section 3 we evaluate the proposed algorithm with experiments on public data sets. In Section 4 we give a brief discussion and conclusion.

2. Methods

2.1. Vector field learning

Let us briefly recall vector field learning. Let the input point set be $X$ and the output point set be $Y$, and suppose we are given a finite training set of labeled correspondences with some unknown outliers, $S = \{(x_i, y_i)\}_{i=1}^{N}$. We define a mapping function $f$ from a structured input space $\mathcal{X} \in A$ to a structured output space $\mathcal{Y} \in B$. Our task is to learn $f: \mathcal{X} \mapsto \mathcal{Y}$ from the labeled examples $S$, i.e., $y_i = f(x_i)$, and to identify the inliers (namely, remove the outliers), where $f \in \mathcal{H}$ and $\mathcal{H}$ is a reproducing kernel Hilbert space. Let $k: \mathcal{X} \times \mathcal{X} \mapsto \mathbb{R}$ be a standard Mercer kernel with an associated RKHS family of functions $\mathcal{H}_K: \mathcal{X} \mapsto \mathbb{R}$ with the corresponding norm $\|\cdot\|_{\mathcal{H}}$. Then the optimal mapping function $f$ can be obtained by solving the following Tikhonov regularized [28] optimization problem,

$$ f^{\star} = \arg\min_{f \in \mathcal{H}_K} \frac{1}{N} \sum_{i=1}^{N} \| y_i - f(x_i) \|^2 + \lambda_1 \| f \|_{\mathcal{H}}^2 \tag{1} $$

where the solution $f$ can be expressed by the classical Representer theorem [29] with finite dimensional coefficients $\alpha = [\alpha_1, \ldots, \alpha_N]^T$, i.e., $f^{\star}(x) = \sum_{i=1}^{N} \alpha_i k(x, x_i)$, with a linear system $(K + \lambda_1 N I)\alpha = Y$, where $K$ is a positive semi-definite Gram matrix with $K(i, j) = k(x_i, x_j)$, $\lambda_1 > 0$ is a trade-off parameter, and $I$ denotes the identity matrix.

2.2. Manifold regularized coherent vector field

In the Manifold Regularization framework [30,31], an additional penalty term $\| f \|_I^2$ is used to penalize $f$ along a low dimensional manifold. Thus we can learn coherent vector fields under manifold regularization by minimizing the following extension of Eq. (1),

$$ f^{\star} = \arg\min_{f \in \mathcal{H}_K} \frac{1}{N} \sum_{i=1}^{N} \| y_i - f(x_i) \|^2 + \lambda_1 \| f \|_{\mathcal{H}}^2 + \lambda_2 \| f \|_I^2 \tag{2} $$

where $\lambda_1$ controls the complexity of the mapping function in the ambient space while $\lambda_2$ controls the complexity of the mapping function in the intrinsic geometry. More specifically, let $W$ be a nearest-neighbor graph which serves as a discrete probe for the geometric structure of the data; then the graph Laplacian $L = D - W$ provides a natural intrinsic measure of data-dependent smoothness,

$$ \| f \|_I^2 = \mathbf{f}^T L \mathbf{f} = \frac{1}{2} \sum_{i,j=1}^{N} W_{ij} \left( f(x_i) - f(x_j) \right)^2 \tag{3} $$

where $\mathbf{f} = [f(x_1), \ldots, f(x_N)]$, and $D$ is a diagonal matrix with elements $D_{ii} = \sum_{j=1}^{N} W_{ij}$. The solution of the coherent vector fields is discussed later.

2.3. Learning coherent vector fields

Motivated by sample consensus, the inliers can be fitted by a coherent vector field mapping. Thus we assume that the error between $Y$ and $f(X)$ satisfies the following distributions,

$$ Y - f(X) = \begin{cases} \epsilon_i \sim \mathcal{N}(0, \sigma^2 I) & \text{if inliers,} \\ \epsilon_o \sim \dfrac{1}{u} & \text{if outliers.} \end{cases} \tag{4} $$

where the error for inliers follows a Gaussian distribution with zero mean and uniform standard deviation $\sigma$, while the error for outliers follows a uniform distribution $\frac{1}{u}$ with a positive constant $u$.

Thus the error between observed input-output pairs is modeled as a mixture model of the Gaussian and uniform distributions [11,18,20,32],

$$ p(S_i \mid \theta) = \gamma \frac{1}{(2\pi\sigma^2)^{D/2}} \exp\left( -\frac{\| y_i - f(x_i) \|^2}{2\sigma^2} \right) + (1 - \gamma) \frac{1}{u} \tag{5} $$

where $0 \le \gamma \le 1$ is a mixing coefficient denoting the percentage of inliers, $\theta = \{f, \gamma, \sigma^2\}$ is the set of unknown parameters, and $D$ denotes the dimension of the data. Moreover, to reduce over-fitting and preserve the smoothness constraint, the prior of the coherent mapping function $f$ under manifold regularization can be expressed as follows,

$$ p(f) \propto \exp\left( -\lambda_1 \| f \|_{\mathcal{H}}^2 - \lambda_2 \| f \|_I^2 \right) \tag{6} $$

According to Bayes' theorem, the posterior distribution $p(\theta \mid S)$ can be estimated from the mixture model (5) and the prior (6),

$$ p(\theta \mid S) \propto \mathcal{L}(\theta \mid S)\, p(f) \tag{7} $$

where the likelihood is $\mathcal{L}(\theta \mid S) = \prod_{i=1}^{N} p(S_i \mid \theta)$, and the optimal solution of $\theta$ is the maximum a posteriori (MAP) estimate. Considering the complete data with a latent variable $z_i$, where $z_i = 0$ for outliers and $z_i = 1$ for inliers, the objective function is an upper bound of the negative log-likelihood function of (7),

$$ \mathcal{Q}(\theta) = \frac{1}{2\sigma^2} \sum_{i=1}^{N} p_i \| y_i - f(x_i) \|^2 + \frac{N_p D}{2} \log \sigma^2 - N_p \log \gamma - M_p \log (1 - \gamma) + \lambda_1 \| f \|_{\mathcal{H}}^2 + \lambda_2 \| f \|_I^2 \tag{8} $$

where $p_i = P(z_i = 1 \mid S_i, \theta)$, $N_p = \sum_{i=1}^{N} p_i$, and $M_p = \sum_{i=1}^{N} P(z_i = 0 \mid S_i, \theta)$; some $\theta$-independent constants are omitted. The EM algorithm is used to estimate the optimal parameters of Eq. (8). In the E-step, inliers are identified under a fixed coherent vector field. The weight matrix $P$ collects the responsibilities (posterior probabilities); it is a diagonal matrix with elements $P_{ii} = p_i$, which follow from Bayes' rule,

$$ p_i = \frac{\gamma \exp\left( -\frac{\| y_i - f(x_i) \|^2}{2\sigma^2} \right)}{\gamma \exp\left( -\frac{\| y_i - f(x_i) \|^2}{2\sigma^2} \right) + (1 - \gamma) \frac{(2\pi\sigma^2)^{D/2}}{u}} \tag{9} $$

where the larger the probability $p_i$ is, the more reliable the inlier is. We define a threshold $\zeta \in [0, 1]$ to identify the inlier set $C = \{p_i > \zeta\}_{i=1}^{N}$ once the EM iteration converges or a stopping condition is reached. In the M-step, the unknown parameters $\theta = \{f, \gamma, \sigma^2\}$ are updated by taking the derivative of $\mathcal{Q}(\theta)$ with respect to each parameter. First, we obtain $\gamma$ and $\sigma^2$ as follows,

$$ \gamma = \frac{N_p}{N} \tag{10} $$

$$ \sigma^2 = \frac{\mathrm{tr}\left[ (Y - f(X))^T P\, (Y - f(X)) \right]}{N_p D} \tag{11} $$

where $\mathrm{tr}(\cdot)$ denotes the trace operator. We now discuss the solution of the coherent vector field under manifold regularization. Ignoring the $f$-independent terms of the objective function (8), we obtain a Laplacian regularized least squares (LapRLS) problem with a weighting matrix $P$,

$$ \mathcal{Q}(f) = \frac{1}{2\sigma^2} \sum_{i=1}^{N} p_i \| y_i - f(x_i) \|^2 + \lambda_1 \| f \|_{\mathcal{H}}^2 + \lambda_2 \mathbf{f}^T L \mathbf{f} \tag{12} $$

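where $\mathbf{f}$ can be written as $K\alpha$ with a squared exponential kernel $k(x_i, x_j) = \exp\left( -\beta \| x_i - x_j \|^2 \right)$. The optimal parameter $\alpha$ is therefore obtained by minimizing the function (12),

$$ \alpha^{\star} = \arg\min_{\alpha \in \mathbb{R}^{N}} \frac{1}{2\sigma^2} (Y - K\alpha)^T P\, (Y - K\alpha) + \lambda_1 \alpha^T K \alpha + \lambda_2 \alpha^T K^T L K \alpha \tag{13} $$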

Taking the derivative of the objective function with respect to $\alpha$ and setting it to zero, we obtain the optimal form,

$$ \alpha = \left( K + 2\sigma^2 P^{-1} (\lambda_1 I + \lambda_2 L K) \right)^{-1} Y \tag{14} $$

where $I$ is an $N \times N$ identity matrix and $L$ is computed as $D - W$, with the adjacency graph $W$ constructed by $k$ nearest neighbors or a graph kernel $W_{ij}$. Finally, we obtain the coherent vector field after solving for the optimal parameter $\alpha$, $\mathbf{f} = K\alpha$.
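The step from Eq. (13) to the closed form for $\alpha$ can be traced in a few lines. The derivation below is a sketch under two assumptions that the text does not state explicitly: the Gram matrix $K$ is symmetric and invertible, and every weight $p_i$ is strictly positive so that $P^{-1}$ exists.

```latex
% Objective of Eq. (13); K = K^T, L = L^T, P diagonal.
J(\alpha) = \frac{1}{2\sigma^2} (Y - K\alpha)^T P (Y - K\alpha)
          + \lambda_1 \alpha^T K \alpha
          + \lambda_2 \alpha^T K^T L K \alpha

% Gradient with respect to \alpha, set to zero.
\nabla_\alpha J = -\frac{1}{\sigma^2} K P (Y - K\alpha)
                + 2 \lambda_1 K \alpha
                + 2 \lambda_2 K L K \alpha = 0

% Cancel the common left factor K and multiply by \sigma^2.
\left( P K + 2\sigma^2 (\lambda_1 I + \lambda_2 L K) \right) \alpha = P Y

% Left-multiply by P^{-1}.
\alpha = \left( K + 2\sigma^2 P^{-1} (\lambda_1 I + \lambda_2 L K) \right)^{-1} Y
```

The third line may be the more convenient one numerically, since it needs no explicit inverse of $P$. Because $P$ is diagonal while $LK$ generally is not, the position of $P^{-1}$ matters unless $\lambda_2 = 0$, in which case the system reduces to $(K + 2\lambda_1 \sigma^2 P^{-1})\alpha = Y$.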

2.4. Analysis

The proposed MRCVF algorithm includes three main parts: (1) construct a data adjacency graph using $k$ nearest neighbors and choose edge weights using heat kernel weights $W_{ij} = \exp\left( -\frac{\| x_i - x_j \|^2}{2\tau^2} \right)$, (2) compute the graph Laplacian matrix as $D - W$, and (3) learn coherent vector fields under the manifold regularization constraint in a reproducing kernel Hilbert space with the matrix-valued kernel $K_{i,j} = \exp\left( -\beta \| x_i - x_j \|^2 \right)$.

The MRCVF algorithm has several parameters: $k$, $\tau$, $\sigma^2$, $u$, $\gamma$, $\beta$, $\lambda_1$, $\lambda_2$, and $\zeta$. Following the suggestion of the manifold regularization framework [30] and its source code (http://manifold.cs.uchicago.edu/manifold_regularization/manifold.html), we set $k = 6$, and we set $\tau = \sigma$ so that neighboring data keep the same scale in each EM iteration. Following the vector field consensus algorithm [20], we use the same parameter values, i.e., $u = 10$, $\gamma = 0.90$, $\beta = 0.10$, and $\lambda_1 = 3$, for a fair comparison. The EM algorithm needs a relatively large initial scale between the point sets, so we set $\sigma^2 = \frac{\mathrm{tr}\left[ (Y - X)^T (Y - X) \right]}{D \times N}$; the iterative procedure is similar to deterministic annealing [33]. The manifold regularization parameter $\lambda_2$ is set to 0.10, and it controls the complexity of the vector field mapping in the intrinsic geometry. The inlier set is determined by the threshold $\zeta = 0.50$. In the linear system (14), a lower bound $\varepsilon = 1\mathrm{e}{-5}$ is imposed on the weights in $P$, which avoids problems when $P$ is singular.

The computational complexity of the MRCVF algorithm is $O(DN^3)$ for 2D image point matching. A low-rank matrix approximation that keeps several principal components of the Gram matrix, as discussed in [18], can reduce the complexity to $O(DN)$ at best. We briefly summarize the proposed manifold regularized coherent vector field method (MRCVF) in Algorithm 1.

Algorithm 1. The MRCVF algorithm.
Input: The labeled training set with outliers $S = \{(x_i, y_i)\}_{i=1}^{N}$
Output: The coherent vector fields $f$, and inliers set $C$.
1: Begin
2: Initialize parameters.
3: Initialize $\sigma^2 = \frac{\mathrm{tr}\left[ (Y - X)^T (Y - X) \right]}{D \times N}$.
4: Repeat
5:   Construct data adjacency graph with $N$ nodes by $k$ nearest neighbors.
6:   Choose edge weights of the adjacency graph by heat kernel with bandwidth $\tau = \sigma$.
7:   Compute graph Laplacian matrix by $L = D - W$.
8:   Compute the Gram matrix $K$ with bandwidth $\beta$.
9:   E-step: Update $P$ by (9).
10:  M-step: Update parameters $\gamma$, $\sigma^2$, and $\alpha$ by (10), (11) and (14).
11: Until objective function converges.
12: The coherent vector fields can be determined by $\mathbf{f} = K\alpha$.
13: The inliers can be determined by $C = \{(x_i, y_i) : p_i > \zeta\}_{i=1}^{N}$.
14: End
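The loop of Algorithm 1 can be sketched in a few dozen lines of NumPy. This is an illustrative reading rather than the authors' Matlab implementation, and it rests on several assumptions that are not in the text: the field is fitted to the displacement vectors $Y - X$ and starts at zero (one reading of the $\sigma^2$ initialisation), the k-NN graph is symmetrised with an element-wise maximum, $\gamma$ is clipped to $[0.05, 0.95]$ and $\sigma^2$ is floored as numerical safeguards, the loop stops when the responsibilities stop changing, and $\alpha$ is obtained from the stationarity condition of Eq. (13) in the form $(PK + 2\sigma^2(\lambda_1 I + \lambda_2 LK))\alpha = PY$. The names `sq_dists`, `knn_heat_laplacian` and `mrcvf` are hypothetical.

```python
import numpy as np


def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding


def knn_heat_laplacian(X, k, tau2):
    """Steps 5-7: k-NN adjacency graph, heat-kernel weights, L = D - W."""
    n = X.shape[0]
    d2 = sq_dists(X, X)
    np.fill_diagonal(d2, -1.0)                  # make each point sort first in its own row
    nbrs = np.argsort(d2, axis=1)[:, 1:k + 1]   # k nearest neighbours, self excluded
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * tau2))
    W = np.maximum(W, W.T)                      # symmetrise the graph (assumption)
    return np.diag(W.sum(axis=1)) - W


def mrcvf(X, Y, k=6, u=10.0, gamma=0.9, beta=0.1, lam1=3.0, lam2=0.1,
          zeta=0.5, eps=1e-5, max_iter=50, tol=1e-4):
    """EM loop of Algorithm 1; returns field values, responsibilities, inlier mask."""
    N, D = X.shape
    T = Y - X                                   # assumption: targets are displacements
    F = np.zeros_like(T)                        # assumption: the field starts at zero
    sigma2 = np.sum(T ** 2) / (D * N)           # step 3: tr[(Y - X)^T (Y - X)] / (D N)
    K = np.exp(-beta * sq_dists(X, X))          # step 8: Gram matrix
    p = np.ones(N)                              # all matches start as inliers
    for _ in range(max_iter):
        L = knn_heat_laplacian(X, k, sigma2)    # steps 5-7 with tau = sigma
        # E-step, Eq. (9)
        r2 = np.sum((T - F) ** 2, axis=1)
        num = gamma * np.exp(-r2 / (2.0 * sigma2))
        den = num + (1.0 - gamma) * (2.0 * np.pi * sigma2) ** (D / 2.0) / u
        p_new = num / den
        change = np.mean(np.abs(p_new - p))
        p = p_new
        # M-step, Eqs. (10) and (11)
        Np = max(p.sum(), eps)
        gamma = float(np.clip(Np / N, 0.05, 0.95))      # clipping is an added safeguard
        sigma2 = max(np.sum(p * r2) / (Np * D), 1e-10)  # floor is an added safeguard
        # alpha update: (P K + 2 sigma^2 (lam1 I + lam2 L K)) alpha = P T
        pw = np.maximum(p, eps)                 # lower bound on the weights in P
        A = pw[:, None] * K + 2.0 * sigma2 * (lam1 * np.eye(N) + lam2 * (L @ K))
        alpha = np.linalg.solve(A, pw[:, None] * T)
        F = K @ alpha                           # step 12: f = K alpha
        if change < tol:
            break
    return F, p, p > zeta                       # step 13: inliers have p_i > zeta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 2))
    Y = X + np.array([0.5, -0.3]) + 0.02 * rng.normal(size=(120, 2))
    Y[80:] = X[80:] + rng.uniform(-3.0, 3.0, size=(40, 2))  # last 40 are outliers
    F, p, inliers = mrcvf(X, Y)
    print("matches kept as inliers:", int(inliers.sum()))
```

With this displacement reading, the mapped points are `X + F`. Solving the system multiplied through by $P$ avoids forming $P^{-1}$, so near-zero responsibilities only need the lower bound `eps` to keep the system well behaved.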


3. Experiments

3.1. Experimental setup

In this section, we perform experiments on a synthetic data set [33], a real image data set [34], and a non-rigid image set for robust point matching. Comparisons are made with RANSAC [10], ICF [27], CPD [18], GMM–TPS [19], VFC [20], and Non-rigid RANSAC [13]. We implemented the MRCVF algorithm in Matlab R2015a, and the experiments were performed on a 2.5 GHz Intel Core CPU with 8 GB RAM. We mainly use accuracy, precision, recall, and F-score as the quantitative evaluation criteria. Let TP be the number of true positives, TN of true negatives, FP of false positives, and FN of false negatives. Accuracy is the proportion of true results among the total number of cases examined, and it is defined as

$$ \mathrm{acc} = \frac{TP + TN}{TP + TN + FP + FN}, $$

and precision, recall, and F-score are defined, respectively, as

$$ pr = \frac{TP}{TP + FP}, \qquad re = \frac{TP}{TP + FN}, \qquad F = \frac{2\, pr \times re}{pr + re}. $$
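The four criteria follow directly from the confusion counts. The helper below is a minimal sketch in plain Python; the function name `matching_scores` and the convention of returning 0.0 when a denominator is empty are assumptions made for illustration, not part of the paper.

```python
from typing import Dict, Sequence


def matching_scores(predicted: Sequence[bool], truth: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-score for inlier/outlier decisions.

    predicted[i] is True when match i is kept as an inlier,
    truth[i] is True when match i is a correct match.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    total = tp + tn + fp + fn
    # Empty denominators map to 0.0 (an assumed convention).
    acc = (tp + tn) / total if total else 0.0
    pr = tp / (tp + fp) if tp + fp else 0.0
    re = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * pr * re / (pr + re) if pr + re else 0.0
    return {"acc": acc, "pr": pr, "re": re, "F": f}


if __name__ == "__main__":
    kept = [True, True, True, False, False, True]
    correct = [True, True, False, False, True, True]
    print(matching_scores(kept, correct))
```

In the example call there are three true positives, one true negative, one false positive and one false negative, so precision and recall are both 3/4 and accuracy is 4/6.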

3.2. Synthetic data set

The 2D Chinese character and fish data sets are widely used in point set matching and registration. Here we choose two outlier groups (outlier ratio: 0.33 and 0.50), where each group includes 100 point sets (50 pairs) with outliers. In this experiment, we show an example of the qualitative results obtained by our MRCVF algorithm in Fig. 1. The initial point set pairs are contaminated by outliers that are uniformly distributed around the true shape points. Blue arrows denote inliers that are identified and fitted as a coherent vector field, while black arrows denote outliers rejected by the MRCVF algorithm. Note that the initial matches are labeled by the author of each data set [33].

Fig. 2 shows the quantitative comparison results. For these non-rigid point sets with structure information, the performance of our proposed method based on manifold regularization is better than that of the other methods in both accuracy and precision-recall pair. ICF uses SVM regression to learn a mapping function and then identifies the inliers that satisfy the learned mapping function. CPD uses a Gaussian mixture model to estimate a transformation function between two point sets and then determines inliers by a predefined threshold (0.50 in this paper), like the MRCVF algorithm, but the weight of inliers is fixed and we set γ = 0.50. Non-rigid RANSAC improves the classical RANSAC for the deformable registration problem and overcomes the limitations of RANSAC when facing non-linear transformations. VFC uses a robust method to learn vector fields and applies it to mismatch removal. More specifically, in terms of the regularization, MRCVF with λ2 = 0 can be regarded as the VFC algorithm.

The accuracy values of all matching results in Fig. 2 illustrate that MRCVF gives the best performance. With the manifold regularization constraint, the accuracy improves over VFC: MRCVF has lower accuracy than VFC in 19 of the 200 cases, but higher accuracy in 88 of the 200 cases. Moreover, precision-recall values are shown in Table 1; the averages for ICF, CPD, VFC, and Non-rigid RANSAC are (95.79%, 63.83%), (62.12%, 94.38%), (98.76%, 94.61%), and (84.19%, 97.85%), respectively, while our proposed MRCVF algorithm with λ2 = 0.1 obtains (98.44%, 97.20%). The accuracy and precision-recall of MRCVF remain high as the outlier ratio increases (from 0.33 to 0.50) in this experiment.

3.3. Real image data set

Fig. 3 shows the image sets used to evaluate the matching methods, known as the Oxford affine covariant regions datasets. Six different changes in imaging conditions are evaluated: rotation (bark and boat), viewpoint change (graf and wall), scale change (bark and boat), image blur (bikes and trees), illumination (leuven), and JPEG compression (ubc). The datasets provide ground truth (the underlying transformation matrix), which makes quantitative evaluation straightforward. The leftmost image of each set is used as the reference image and the others serve as object images, which yields 40 image pairs (five pairs in each set). In this experiment, we first use the VLFEAT toolbox [35] to detect SIFT [7] keypoints in each image pair, and then use the Best Bin First (BBF) matching method to construct an initial labeled training set S with matching threshold 1.50. Owing to the limitations of nearest-neighbor matching methods, many

Fig. 1. Experimental result examples of the MRCVF algorithm on 2D non-rigid synthetic point set: fish and Chinese character. The outlier ratio in the first row is 0.33, and 0.50 for the second row. Inliers are matched (blue arrows), and outliers are rejected (black arrows). Best viewed in color. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).

[Fig. 2 plots: accuracy (vertical axis, 0.5 to 1) against image pairs 1 to 10 (horizontal axis) in four panels. Legend values are average (precision, recall) pairs.
Chinese Character (outliers: 33%): ICF (93.35%, 70.87%), CPD (70.14%, 92.55%), VFC (99.05%, 94.38%), Non-rigid RANSAC (77.64%, 98.35%), Ours (98.84%, 96.53%).
Fish (outliers: 33%): ICF (97.04%, 74.27%), CPD (75.65%, 91.24%), VFC (99.32%, 97.51%), Non-rigid RANSAC (92.12%, 98.85%), Ours (99.33%, 98.88%).
Chinese Character (outliers: 50%): ICF (94.63%, 53.83%), CPD (50.59%, 96.86%), VFC (98.06%, 93.75%), Non-rigid RANSAC (77.28%, 97.07%), Ours (97.32%, 96.40%).
Fish (outliers: 50%): ICF (98.14%, 56.34%), CPD (52.10%, 94.86%), VFC (98.59%, 92.78%), Non-rigid RANSAC (89.71%, 97.11%), Ours (98.25%, 97.00%).]

Fig. 2. Performance comparison on 2D non-rigid synthetic point set pairs using accuracy, and precision-recall pair. The outlier ratio is 0.33 and 0.50 for the first and the second row respectively. ICF, CPD, VFC, Non-rigid RANSAC, and MRCVF are tested for comparing the point matching performance on 200 different point set pairs.

Table 1. Average precision and recall pairs (%) on each data set. The top four rows are synthesized data sets with outliers, and the others are real data sets. ICF, CPD, VFC, Non-rigid RANSAC, and MRCVF are tested. The larger both the precision and recall are, the better the performance is.

Image set        ICF [27]         CPD [18]         VFC [20]         Non-rigid RANSAC [13]   Ours
character(0.33)  (93.35, 70.87)   (70.14, 92.55)   (99.05, 94.38)   (77.64, 98.35)          (98.84, 96.53)
character(0.50)  (94.63, 53.83)   (50.59, 96.86)   (98.06, 93.75)   (77.28, 97.07)          (97.32, 96.40)
fish(0.33)       (97.04, 74.27)   (75.65, 91.24)   (99.32, 97.51)   (92.12, 98.85)          (99.33, 98.88)
fish(0.50)       (98.14, 56.34)   (52.10, 94.86)   (98.59, 92.78)   (89.71, 97.11)          (98.25, 97.00)
bark             (100.0, 95.02)   (93.18, 97.51)   (100.0, 98.63)   (100.0, 99.95)          (99.89, 100.0)
bikes            (100.0, 97.96)   (99.63, 94.42)   (100.0, 97.79)   (100.0, 98.10)          (99.23, 99.90)
boat             (100.0, 79.26)   (78.62, 94.72)   (80.93, 98.22)   (86.80, 84.25)          (80.50, 100.0)
graf             (92.22, 84.80)   (83.26, 86.84)   (98.52, 98.76)   (100.0, 94.40)          (95.24, 100.0)
leuven           (100.0, 98.08)   (99.73, 92.60)   (100.0, 96.22)   (100.0, 97.85)          (99.31, 99.92)
trees            (100.0, 96.47)   (99.19, 94.48)   (99.89, 97.93)   (100.0, 96.60)          (98.16, 99.80)
ubc              (100.0, 98.73)   (99.67, 94.50)   (100.0, 97.92)   (100.0, 98.62)          (99.51, 99.88)
wall             (100.0, 87.22)   (84.01, 84.69)   (96.00, 99.16)   (100.0, 94.97)          (95.71, 98.11)

outliers might be falsely included in the initial correspondence set. Fig. 4 shows examples of the matching results obtained by MRCVF. Outliers are well rejected (TN, black arrows), and the inliers are also well identified (TP, blue arrows). However, because of the challenging imaging conditions, some inliers might not be identified (FN, cyan arrows), and some outliers might be falsely identified as inliers (FP, green arrows). The quantitative evaluation is shown in Fig. 5; it is clear that our MRCVF algorithm gives better accuracy than the other four approaches, ICF, CPD, VFC, and Non-rigid RANSAC, in most cases. Because the fifth image pair in each set has the largest change in conditions, the corresponding accuracy may degrade, while our MRCVF still gives relatively high accuracy. Considering precision and recall pairs, as shown in Table 1, the average pairs are (99.03%, 92.19%), (92.16%, 92.47%), (96.92%, 98.08%), and (98.35%, 95.60%) for ICF, CPD, VFC, and Non-rigid RANSAC, respectively, while our MRCVF obtains (95.95%, 99.70%). Moreover, the F-score measure is used to illustrate the trade-off between precision and recall, as shown in Table 2, to compare matching methods on the different imaging conditions in


Fig. 3. Real image data (Oxford Affine Covariant Regions Datasets). From a to h: bark (zoom + rotation), bikes (image blur), boat (zoom + rotation), graf (viewpoint change), leuven (light change), trees (blur), ubc (JPEG compression), and wall (viewpoint change). The data sets are available at http://www.robots.ox.ac.uk/vgg/data/data-aff.html.

Fig. 4. Point matching results on Oxford real image data sets in Fig. 3 by MRCVF. Here we display only the matching results of the first two image pairs in each set. The blue arrows are identified inliers (TP), black arrows are rejected outliers (TN), cyan arrows are missed inliers (FN), and green arrows are falsely identified inliers (FP). Best viewed in color. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).

Fig. 5. Performance comparison on Oxford real image datasets in Fig. 3. Accuracy for different methods: ICF, CPD, VFC, Non-rigid RANSAC, and our MRCVF.


Table 2. F-score (%) for different methods on Oxford real image datasets. RANSAC, ICF, CPD, GMM–TPS, VFC, Non-rigid RANSAC, and MRCVF are tested. The larger the F-score is, the better the performance is. The largest F-score of each image set is marked with an asterisk. The last row denotes the average F-score on whole Oxford image data sets.

Image set  RANSAC [10]  ICF [27]  CPD [18]  GMM–TPS [19]  VFC [20]  Non-rigid RANSAC [13]  Ours
bark       93.64        97.44     95.29     41.66         99.30     99.97*                 99.94
bikes      95.47        98.97     96.95     98.94         98.88     99.04                  99.57*
boat       77.14        88.43     85.92     39.39         88.74     85.51                  89.20*
graf       80.42        88.36     85.01     51.73         98.64*    97.12                  97.56
leuven     96.19        99.03     96.03     99.43         98.07     98.91                  99.62*
trees      92.53        98.20     96.78     88.66         98.90     98.27                  98.98*
ubc        96.86        99.36     97.02     95.96         98.95     99.31                  99.69*
wall       91.39        93.17     84.35     72.85         97.56*    97.42                  96.90

Fig. 3. Because the F-score reflects the trade-off between precision and recall, it gives a fair view of the matching performance. On the Oxford data sets, MRCVF gives the best F-score in most scenarios.

3.4. Non-rigid images

Non-rigid transformation is still a challenge in image matching, medical image registration, and shape recognition. Because the true non-rigid transformation model is always unknown and hard to model, and the number of unknown transformation parameters is large, point matching methods tend to be sensitive to outliers. The experiment on the synthetic data set gives a preliminary picture of the performance of point matching methods; in this subsection, we first construct two non-rigid image data sets and then use them to evaluate the point matching methods.

Fig. 6 shows the non-rigid image data set. In the data set, we collect four images with different non-rigid transformations in each image set (Poster and T-shirt). The leftmost image of each set in Fig. 6 is used as the reference image, so each case contains three image pairs. Note that we construct the ground truth manually; more precisely, all mismatches are carefully removed one by one from the initial matches in Matlab.

The qualitative experimental results obtained with our MRCVF algorithm are shown in Fig. 7. We test MRCVF on all non-rigid image pairs, and the EM iterative procedure (the middle four columns in Fig. 7) is used to illustrate the convergence speed of MRCVF. At the beginning, we obtain the initial matches (the first column) using the VLFEAT toolbox [35] in Matlab, where the SIFT feature matching ratio is set to 1.50. From top to bottom, the inlier ratio for each image pair is 71.75% (127/177), 82.80% (130/157), 69.29% (97/140), 85.65% (191/223), 79.53% (136/171), and 80.14% (113/141), respectively. Note that all initial matches are assumed to be inliers at the beginning of the first iterative step. We cannot get any inliers after the first run, as shown in the second column, while almost all inliers are identified after the fifth iteration, as shown in the third column. Comparing with the results in the fourth and fifth columns, we see that MRCVF reaches convergence after five iterative steps in most cases. The SIFT matches are classified as inliers and outliers as well as possible by MRCVF in the rightmost column. Owing to the uncertain non-rigid transformation, several missed inliers (green lines) and falsely identified inliers (black lines) still exist in the results.

We test our MRCVF against the ICF, CPD, VFC, and Non-rigid RANSAC methods on the non-rigid image pairs. The accuracy results show that all matching methods reach accuracy values above 0.8, and MRCVF achieves the best performance. Moreover, the average precision-recall pairs for ICF, CPD, VFC, and Non-rigid RANSAC on the Poster image set are (100.0%, 85.11%), (93.35%, 86.30%), (97.68%, 97.76%), and (99.48%, 94.91%), respectively, and on the T-shirt image set are (100.0%, 91.24%), (96.35%, 91.37%), (99.40%, 98.64%), and (99.46%, 94.99%), respectively, while our MRCVF obtains (98.19%, 97.76%) and (99.57%, 98.64%), respectively. In addition, the average elapsed times of MRCVF on the Poster and T-shirt image sets are 0.31 and 0.33 s, respectively.

In conclusion, MRCVF demonstrates its capability of handling non-rigid image pairs for robust feature point matching. Because the initial correspondences of the non-rigid image data set probably make the matching process ill-posed, the manifold regularization penalty term plays an important role in solving the problem while preserving the intrinsic geometry of the feature point pairs.

Fig. 6. Non-rigid image data set: Poster (the top row) and T-shirt (the bottom row). The left most image of each set is used as the reference image.


[Fig. 7 column headings, left to right: Initial Matches, Iteration 1, Iteration 5, Iteration 10, Iteration 50, Final Matches]

Fig. 7. Qualitative matching results on the non-rigid image data set (six image pairs). The EM iteration process is shown from initial matches to final matches. In the middle four columns, the blue arrows are identified inliers (TP), black arrows are rejected outliers (TN), cyan arrows are missed inliers (FN), and green arrows are falsely identified inliers (FP). Moreover, in the rightmost column, yellow lines are identified matches, black ones are falsely identified matches, and green lines are missed matches. Best viewed in color.

4. Discussion and conclusion

This work is motivated by the manifold regularization framework, which preserves the intrinsic geometry of the training data. We also found that the vector field learning problem is equivalent to a weighted Laplacian regularized least squares (LapRLS) problem. Among the state-of-the-art point matching methods we compared, namely RANSAC, ICF, CPD, GMM–TPS, VFC, and Non-rigid RANSAC, no single method achieves the best performance in every scenario. In our experiments, VFC and Non-rigid RANSAC perform well in most scenarios, so we focus on the VFC algorithm and improve it with a manifold regularization constraint. Following the same idea of an intrinsic geometry constraint, graph Laplacian regularization can also be applied to other methods based on motion coherence theory, such as CPD and RPM–L2E [24]. Here, we mainly offer an idea for further improving the accuracy of motion field based methods in a reproducing kernel Hilbert space (RKHS) with a squared exponential kernel.

The proposed method, called manifold regularized coherent vector field (MRCVF), uses graph Laplacian regularization to constrain the intrinsic geometry of the data. Coherent vector fields are learned by minimizing the formulated objective function in an RKHS with a matrix-valued kernel, and the EM algorithm is used to estimate the unknown parameters iteratively. Experimental results on the synthetic data set and the real image data sets demonstrate that MRCVF outperforms the tested state-of-the-art methods in most scenarios and that it is robust to outliers and to non-linear transformations. We will provide the Matlab code of the MRCVF algorithm free for academic research (https://sites.google.com/site/2013gwang/). Our future work will focus on applying the MRCVF algorithm to image registration and on fast implementations, for example using sparse approximation.
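The weighted LapRLS step summarized above can be illustrated with a minimal sketch. It is not the authors' Matlab implementation. It assumes a decomposable kernel K(x, z) = exp(-beta * ||x - z||^2) I, a symmetrized k-nearest-neighbor graph Laplacian L, and fixed inlier weights p (in the method these would be posterior probabilities updated by EM). Under those assumptions, minimizing sum_i p_i ||y_i - f(x_i)||^2 + lam1 ||f||_K^2 + lam2 tr(F^T L F) with F = K C leads to the linear system (P K + lam1 I + lam2 L K) C = P Y. All function names, parameter values, and the toy data are hypothetical, and the scaling of the regularization weights may differ from the paper's equations.

```python
import numpy as np


def gaussian_kernel(X, Z, beta):
    """Scalar Gaussian kernel exp(-beta * ||x - z||^2) between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-beta * d2)


def knn_graph_laplacian(X, k):
    """Unnormalized Laplacian L = D - W of a symmetrized k-nearest-neighbor graph."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the adjacency symmetric
    return np.diag(W.sum(axis=1)) - W


def weighted_laprls(X, Y, p, beta=0.1, lam1=1e-3, lam2=1e-3, k=5):
    """Solve (P K + lam1 I + lam2 L K) C = P Y for the coefficients C.

    X: (n, d) source points, Y: (n, d) displacement vectors,
    p: (n,) weights in [0, 1] (assumed given here, e.g. from an E-step).
    Returns C and the kernel matrix K, so the fitted field is K @ C.
    """
    n = X.shape[0]
    K = gaussian_kernel(X, X, beta)
    L = knn_graph_laplacian(X, k)
    P = np.diag(p)
    A = P @ K + lam1 * np.eye(n) + lam2 * (L @ K)
    C = np.linalg.solve(A, P @ Y)
    return C, K


# Toy data (hypothetical): 45 inliers share one displacement, 15 outliers do not.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(60, 2))
Y = np.tile([1.0, -0.5], (60, 1))
angles = rng.uniform(0.0, 2.0 * np.pi, size=15)
Y[45:] += 5.0 * np.column_stack([np.cos(angles), np.sin(angles)])
p = np.r_[np.ones(45), np.zeros(15)]  # hard weights stand in for EM posteriors

C, K = weighted_laprls(X, Y, p)
residual = np.linalg.norm(Y - K @ C, axis=1)
print("mean inlier residual:", residual[:45].mean())
print("mean outlier residual:", residual[45:].mean())
```

In a full EM loop, the weights p would be recomputed from the residuals after each solve; the sketch keeps them fixed to isolate the regularized least squares step.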

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61103070) and the Fundamental Research Funds for the Central Universities.

References

[1] R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2003.
[2] H. Neemuchwala, A. Hero, P. Carson, Image matching using alpha-entropy measures and entropic graphs, Signal Process. 85 (2) (2005) 277–296.
[3] J. Ma, J. Zhao, J. Tian, X. Bai, Z. Tu, Regularized vector field learning with sparse approximation for mismatch removal, Pattern Recognit. 46 (12) (2013) 3519–3532.
[4] G. Wang, Z. Wang, Y. Chen, W. Zhao, Robust point matching method for multimodal retinal image registration, Biomed. Signal Process. Control 19 (2015) 68–76.
[5] J. Ma, J. Zhao, Y. Ma, J. Tian, Non-rigid visible and infrared face registration via regularized Gaussian fields criterion, Pattern Recognit. 48 (3) (2015) 772–784.
[6] S. Banerjee, D.D. Majumdar, Shape matching in multimodal medical images using point landmarks with Hopfield net, Neurocomputing 30 (1) (2000) 103–116.
[7] D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. 60 (2) (2004) 91–110.
[8] H. Bay, T. Tuytelaars, L. Van Gool, SURF: Speeded up robust features, in: Computer Vision–ECCV 2006, Springer, 2006, pp. 404–417.
[9] Y. Pang, W. Li, Y. Yuan, J. Pan, Fully affine invariant SURF for image matching, Neurocomputing 85 (2012) 6–10.
[10] M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24 (6) (1981) 381–395.
[11] P.H. Torr, A. Zisserman, MLESAC: a new robust estimator with application to estimating image geometry, Comput. Vis. Image Underst. 78 (1) (2000) 138–156.
[12] O. Chum, J. Matas, Matching with PROSAC-progressive sample consensus, in: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, Vol. 1, IEEE, 2005, pp. 220–226.
[13] Q.-H. Tran, T.-J. Chin, G. Carneiro, M.S. Brown, D. Suter, In defence of RANSAC for outlier rejection in deformable registration, in: European Conference on Computer Vision (ECCV), Springer, 2012, pp. 274–287.
[14] C. Sunglok, K. Taemin, Y. Wonpil, Performance evaluation of RANSAC family, in: Proceedings of the British Machine Vision Conference (BMVC), 2009.
[15] P.J. Besl, N.D. McKay, A method for registration of 3-d shapes, IEEE Trans. Pattern Anal. Mach. Intell. 14 (2) (1992) 239–256.
[16] D. Qian, T. Chen, H. Qiao, T. Tang, Iterative point matching via multi-direction geometric serialization and reliable correspondence selection, Neurocomputing 197 (2016) 171–183.
[17] A.L. Yuille, N.M. Grzywacz, A mathematical analysis of the motion coherence theory, Int. J. Comput. Vis. 3 (2) (1989) 155–175.
[18] A. Myronenko, X. Song, Point set registration: coherent point drift, IEEE Trans. Pattern Anal. Mach. Intell. 32 (12) (2010) 2262–2275.
[19] B. Jian, B.C. Vemuri, Robust point set registration using Gaussian mixture models, IEEE Trans. Pattern Anal. Mach. Intell. 33 (8) (2011) 1633–1645.
[20] J. Zhao, J. Ma, J. Tian, J. Ma, D. Zhang, A robust method for vector field learning with application to mismatch removing, in: Computer Vision and Pattern Recognition (CVPR), IEEE Conference on, IEEE, 2011, pp. 2977–2984.
[21] J. Ma, J. Zhao, J. Tian, A.L. Yuille, Z. Tu, Robust point matching via vector field consensus, IEEE Trans. Image Process. 23 (4) (2014) 1706–1721.
[22] G. Wang, Z. Wang, W. Zhao, Q. Zhou, Robust point matching using mixture of asymmetric Gaussians for nonrigid transformation, in: Computer Vision–ACCV 2014, Springer, 2015, pp. 433–444.
[23] G. Wang, Z. Wang, Y. Chen, W. Zhao, A robust non-rigid point set registration method based on asymmetric Gaussian representation, Comput. Vis. Image Underst. 141 (2015) 67–80.
[24] J. Ma, W. Qiu, J. Zhao, Y. Ma, A.L. Yuille, Z. Tu, Robust L2E estimation of transformation for non-rigid registration, IEEE Trans. Signal Process. 63 (5) (2015) 1115–1129.
[25] G. Wang, Z. Wang, Y. Chen, Q. Zhou, W. Zhao, Context-aware Gaussian field for non-rigid point set registration, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 5811–5819.
[26] Y. Wang, D. Zhang, J. Tian, Topological clustering and its application for discarding wide-baseline mismatches, Opt. Eng. 47 (5) (2008) 057202.
[27] X. Li, Z. Hu, Rejecting mismatches by correspondence function, Int. J. Comput. Vis. 89 (1) (2010) 1–17.
[28] A.N. Tikhonov, V.Y. Arsenin, Solutions of ill-posed problems.
[29] B. Schölkopf, A.J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, Cambridge MA, US, 2002.
[30] M. Belkin, P. Niyogi, V. Sindhwani, Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res. 7 (2006) 2399–2434.
[31] H.Q. Minh, V. Sindhwani, Vector-valued manifold regularization, in: Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 57–64.
[32] G. Wang, Z. Wang, Y. Chen, W. Zhao, X. Liu, Fuzzy correspondences and kernel density estimation for contaminated point set registration, in: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE, 2015, pp. 1936–1941.
[33] H. Chui, A. Rangarajan, A new point matching algorithm for non-rigid registration, Comput. Vis. Image Underst. 89 (2) (2003) 114–141.
[34] K. Mikolajczyk, C. Schmid, A performance evaluation of local descriptors, IEEE Trans. Pattern Anal. Mach. Intell. 27 (10) (2005) 1615–1630.
[35] A. Vedaldi, B. Fulkerson, VLFeat: an open and portable library of computer vision algorithms, in: Proceedings of the International Conference on Multimedia, ACM, 2010, pp. 1469–1472.

Gang Wang is currently an Assistant Professor with the School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China. He received the Ph.D. degree in Computer Science and Technology from Tongji University in 2016. His current research interests cover computer vision, pattern recognition, and data analysis.

Zhicheng Wang received the Ph.D. degree from Huazhong University of Science and Technology (HUST) in 2006. He is currently an Associate Researcher with the CAD Research Center of Tongji University. His research topics include machine learning and image processing.

Yufei Chen is presently a Senior Lecturer in the CAD Research Center of Tongji University. She was a Postdoctoral Researcher in Control Science and Engineering of Tongji University from 2010 to 2012. She received the Ph.D. degree from Tongji University in 2010. She was also a Guest Researcher in Fraunhofer Institute for Computer Graphics Research, Germany from 2008 to 2009. Her research topics include image processing and data analysis.

Xianhui Liu is presently a Lecturer in the CAD Research Center of Tongji University. He received the Ph.D. degree from Tongji University in 2015. He is also an Associate Director in the CAD Research Center of Tongji University. His research topics include image processing, enterprise informatization, and data analysis.

Yingchun Ren received his M.Sc. degree from University of Shanghai for Science and Technology. He is currently working toward the Ph.D. degree in the CAD Research Center of Tongji University. His current research interest covers mathematics, image processing, pattern recognition, and data dimension reduction.

Lei Peng is presently an Associate Professor with the College of Information Engineering, Taishan Medical University, Taian, Shandong, China. He is currently working toward the Ph.D. degree in the Department of Computer Science and Technology of Tongji University. He received his M.Sc. degree from Beijing Institute of Technology. His research topics include image processing, pattern recognition, and medical image analysis.
