Approximate Asymmetric kNN Search for Binary Features

Chih-Yi Chiu and Yu-Cyuan Liou
Department of Computer Science and Information Engineering, National Chiayi University, Chiayi City, Taiwan 60004
E-mail: [email protected]; [email protected]

INTRODUCTION

A number of binary embedding algorithms have been proposed in the computer vision community recently. By transforming the real-number vector of a visual descriptor into a corresponding binary pattern, these algorithms enable fast nearest neighbor search and compact storage. Although search in the binary space can be carried out by efficient machine instructions (e.g., the POPCNT instruction), a linear search that matches all binary features exhaustively is still inefficient for a large-scale dataset. Some studies [1-2] presented approximate nearest neighbor search methods in the binary space; they are fast but can be inaccurate. Norouzi et al. [3] proposed an exact search through efficient multi-index hashing, but it consumes too much memory: for example, a dataset of one billion 64-bit binary features requires 86 GB of memory. Moreover, kNN ranking in the binary space is less discriminating than ranking in the real-number space.

On the other hand, asymmetric distance matching for binary features has been shown to be more accurate than Hamming distance matching [4]. "Asymmetric" means the distance is computed between two different spaces, e.g., the query is a real-number vector while the reference data are binary patterns. Since asymmetric matching does not transform the query into a binary feature, it suffers less information loss on the query side, and hence the kNN ranking is more accurate and discriminating.

We propose an approximate asymmetric nearest neighbor search for binary features. Based on [4], we assume that the binary embedding function h(x_i) = q(f(x_i)) can be decomposed into two functions f and q, where f: {R}^s → {I}^t transforms x_i to a t-dimensional intermediate vector y_i, and q: {I}^t → {B}^t transforms y_i to a t-dimensional binary vector z_i.^1 In [4], the authors adopted the linear search scheme, which is inefficient for a large-scale dataset. In this paper, we present an approximate version that yields accuracy comparable to [4] but runs much faster, and that consumes less memory than [3].
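As a concrete illustration of the decomposition h(x) = q(f(x)), the following sketch uses a random linear projection for f and element-wise sign thresholding for q. These specific choices are assumptions for illustration only, not the embedding used in [4]:

```python
import numpy as np

# Hypothetical binary embedding h(x) = q(f(x)), decomposed as in the text:
#   f: {R}^s -> {I}^t   (here, a random linear projection to an intermediate vector y)
#   q: {I}^t -> {B}^t   (here, element-wise sign thresholding to a binary pattern z)
rng = np.random.default_rng(0)
s, t = 128, 64                        # illustrative dimensions
W = rng.standard_normal((t, s))       # projection matrix (an arbitrary choice of f)

def f(x):
    return W @ x                      # t-dimensional intermediate real-valued vector y

def q(y):
    return (y > 0).astype(np.uint8)   # t-dimensional binary vector z

x = rng.standard_normal(s)            # a query in the real-number space
y = f(x)                              # its intermediate representation
z = q(y)                              # its binary pattern
assert z.shape == (t,)
```

The key point is that y = f(x) is available for any query at search time, so distances can be computed in the intermediate space instead of the lossier binary space.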
THE PROPOSED METHOD

Suppose we have a reference dataset of n s-dimensional real-number vectors {x_i | i = 1, 2, ..., n} and its t-dimensional binary features {z_i | i = 1, 2, ..., n}. Every binary feature z_i is divided into m subvectors, each comprising g = t/m bits (assume t is divisible by m). Denote the jth subvector as z_i^(j), j = 1, 2, ..., m. For a g-bit binary pattern β ∈ {B}^g, we build the group

Z_β^(j) = {i | z_i^(j) = β, i = 1, 2, ..., n}

and compute its mean in the intermediate space:

y_β^(j) = (1 / |Z_β^(j)|) Σ_{i ∈ Z_β^(j)} y_i^(j).

If a subvector z_i^(j) = β, its correspondence in the intermediate space can be considered to be y_β^(j). We pre-compute y_β^(j) for all j and β offline.

Given a query x_q ∈ {R}^s, the online search algorithm is listed below. We first compute the distances between y_q^(j) and y_β^(j) in the intermediate space for the jth subvector (Step 2) and sort them to obtain an index list {β_l} (Step 3). Next, we sequentially accumulate the intermediate distances of reference features, starting from the nearest group Z_{β_l}^(j) (Step 5).

Input: A query feature x_q, a reference binary dataset {z_i}, a set of groups {Z_β^(j)}, and the corresponding intermediate means {y_β^(j)}.
Output: k nearest neighbors.
Step 1: Generate the t-dimensional intermediate feature y_q = f(x_q).
Step 2: Divide y_q into m subvectors. For each j and β, compute the 2-norm distance d(y_q^(j), y_β^(j)) = ||y_q^(j) − y_β^(j)||_2 in the intermediate space.
Step 3: For each j, sort the intermediate distances in ascending order. Denote the sorted indexes as {β_l, l = 1, 2, ..., 2^g}, where d(y_q^(j), y_{β_l}^(j)) is the lth nearest intermediate distance.
Step 4: Initialize x_i.distance = 0 and x_i.vote = 0 for all i. Initialize an empty nearest-neighbor list D.
Step 5: Do the following loop:
  for l = 1 to 2^g
    for j = 1 to m
      for i ∈ Z_{β_l}^(j)
        x_i.distance = x_i.distance + d(y_q^(j), y_{β_l}^(j))
        x_i.vote = x_i.vote + 1
        if x_i.vote == m
          Add x_i to D and keep D sorted in ascending order of distance.
        end
      end
    end
    if the size of D is equal to or greater than k
      return D
    end
  end

COMPLEXITY ANALYSIS

The time complexity of the online search algorithm is about m · 2^g · g + l · m · n / 2^g, i.e., the sorting time (m lists of 2^g distances, each sorted in 2^g · log 2^g = 2^g · g comparisons) plus the number of computations for updating the reference distances (each group contains n / 2^g features on average, over l iterations and m subvectors). The space complexity is about m · n · 4 bytes + n · 4 bytes, i.e., the size of the m index tables (each of 4n bytes) plus the distance/voting table.

^1 R, I, and B denote the real-number, intermediate, and binary spaces, respectively.
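The offline grouping and the online search procedure (Steps 1-5) can be sketched as follows. The intermediate vectors, the sign binarization, and all sizes (n, t, m, k) are illustrative assumptions; only the grouping, mean precomputation, and vote-and-accumulate loop follow the method described above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, t, m, k = 1000, 32, 4, 5        # illustrative sizes
g = t // m                          # bits per subvector

# Toy intermediate vectors Y (stand-ins for f(x_i)) and binary features Z (stand-ins for q(y_i)).
Y = rng.standard_normal((n, t))
Z = (Y > 0).astype(np.uint8)

def pack(bits):
    """Encode a g-bit pattern beta as an integer group index."""
    return int("".join(map(str, bits)), 2)

# Offline: for each subvector j and pattern beta, collect member ids Z_beta^(j)
# and the mean y_beta^(j) of their intermediate subvectors.
groups = [[[] for _ in range(2 ** g)] for _ in range(m)]
for i in range(n):
    for j in range(m):
        groups[j][pack(Z[i, j*g:(j+1)*g])].append(i)
means = np.zeros((m, 2 ** g, g))
for j in range(m):
    for b in range(2 ** g):
        if groups[j][b]:
            means[j, b] = Y[groups[j][b], j*g:(j+1)*g].mean(axis=0)

def search(yq, k):
    # Steps 2-3: per-subvector 2-norm distances to the group means, sorted ascending.
    dist_to_mean = np.zeros((m, 2 ** g))
    order = []
    for j in range(m):
        dist_to_mean[j] = np.linalg.norm(means[j] - yq[j*g:(j+1)*g], axis=1)
        order.append(np.argsort(dist_to_mean[j]))
    # Steps 4-5: accumulate approximate distances group by group, nearest groups first.
    distance = np.zeros(n)
    vote = np.zeros(n, dtype=int)
    D = []
    for l in range(2 ** g):
        for j in range(m):
            b = order[j][l]
            for i in groups[j][b]:
                distance[i] += dist_to_mean[j, b]
                vote[i] += 1
                if vote[i] == m:          # all m subvector distances accumulated
                    D.append((distance[i], i))
        if len(D) >= k:
            return sorted(D)[:k]
    return sorted(D)[:k]

yq = rng.standard_normal(t)               # intermediate feature of a query (Step 1 assumed done)
nn = search(yq, k)
```

Early termination in the loop over l is what makes the search approximate: a feature whose groups all rank near the front is reported without ever visiting the remaining groups.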
REFERENCES

[1] M. Muja and D. G. Lowe, "Fast matching of binary features," in Proceedings of the International Conference on Computer and Robot Vision, 2012.
[2] M. M. Esmaeili, R. K. Ward, and M. Fatourechi, "A fast approximate nearest neighbor search algorithm in the Hamming space," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 12, pp. 2481-2488, 2012.
[3] M. Norouzi, A. Punjani, and D. J. Fleet, "Fast search in Hamming space with multi-index hashing," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2012.
[4] A. Gordo, F. Perronnin, Y. Gong, and S. Lazebnik, "Asymmetric distances for binary embeddings," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 1, pp. 33-47, 2014.