Semantic Hashing
Presented by: Ali Vashaee
IFT725 Neural Networks, with Hugo Larochelle
Hinton, Salakhutdinov 2009

Outline • What is semantic hashing • Unlabeled data • Multilevel autoencoder • Results • Conclusion

Semantic hashing

Document retrieval • Documents are represented as word-count vectors. • Computing document similarity directly in the word-count space can be slow for large vocabularies.
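
To make the word-count representation concrete, here is a minimal sketch of comparing two documents directly in word-count space. The slide does not name a similarity measure; cosine similarity, the tiny vocabulary, and the helper names `word_count_vector` and `cosine_similarity` are assumptions made only for illustration.

```python
from collections import Counter
import math

def word_count_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical four-word vocabulary; the experiments in the slides use 2000 words.
vocabulary = ["hash", "code", "document", "image"]
doc_a = word_count_vector("document hash code hash", vocabulary)
doc_b = word_count_vector("image code image", vocabulary)
similarity = cosine_similarity(doc_a, doc_b)
```

Every comparison touches all vocabulary entries, so the cost of one comparison grows with the vocabulary size, which is the slowdown the slide refers to.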

Document Retrieval • N = size of the document corpus • V = latent variable size • LSA: O(V log N) with a k-d tree • Semantic hashing: retrieval time does not depend on the size of the document corpus; it is linear in the size of the shortlist that it produces.

Deep auto-encoder

Word count vector

First layer RBM • A "constrained Poisson model" is used for modeling word-count vectors.

The constrained Poisson model

λ_i is the bias of the conditional Poisson model for word i, and b_j is the bias of feature j.
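
The equations for this slide did not survive in the text. As a reference sketch, the conditional distributions of the constrained Poisson model, as given in the Salakhutdinov and Hinton paper the talk is based on, can be written as below. The symbols $w_{ij}$ (weight between word $i$ and feature $j$), $N$ (document length), and $\mathrm{Ps}$ (Poisson probability) do not appear in the slide text itself, so check them against the original paper.

```latex
\begin{align}
p(v_i = n \mid \mathbf{h}) &= \mathrm{Ps}\!\left(n,\;
  \frac{\exp\!\big(\lambda_i + \sum_j h_j w_{ij}\big)}
       {\sum_k \exp\!\big(\lambda_k + \sum_j h_j w_{kj}\big)}\, N\right), \\
p(h_j = 1 \mid \mathbf{v}) &= \sigma\!\Big(b_j + \sum_i w_{ij} v_i\Big),
\end{align}
\qquad\text{where}\quad
\mathrm{Ps}(n, \lambda) = \frac{e^{-\lambda}\lambda^{n}}{n!},\quad
\sigma(x) = \frac{1}{1 + e^{-x}},\quad
N = \sum_i v_i .
```

The "constraint" is the softmax normalization: the Poisson rates over all words sum to the document length $N$.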

RBM

Unrolling Word count vector

Word count vector

Fine Tuning • Conjugate gradient is used for fine-tuning. • Reconstructing the code words. • The stochastic binary values of the hidden units are replaced with their probabilities. • The code layer should be close to binary. • Deterministic noise forces fine-tuning to find binary codes in the top layer.

Activities of 128 bit code

• After fine-tuning, the codes are thresholded to get binary values.
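
The thresholding step can be sketched as follows. The slides do not state the threshold value, so the 0.5 used here is an assumption, and `binarize` and `pack_bits` are hypothetical helper names. Packing the bits into a single integer is one convenient way to store a code; a 128-bit code then fits in 16 bytes.

```python
def binarize(code_activities, threshold=0.5):
    """Turn real-valued code-unit activities into bits (threshold is assumed)."""
    return [1 if activity > threshold else 0 for activity in code_activities]

def pack_bits(bits):
    """Pack a list of bits into one integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

# Hypothetical activities of a 4-unit code layer after fine-tuning.
activities = [0.97, 0.02, 0.88, 0.10]
bits = binarize(activities)
code = pack_bits(bits)
```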

Training • Pretraining: mini-batches of 100 cases. • Greedy pretraining for 50 epochs. • The weights were initialized with small random values sampled from a zero-mean normal distribution with variance 0.01. • Fine-tuning: conjugate gradients. • After fine-tuning, the codes were thresholded to produce binary code vectors.
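
A minimal sketch of the two setup details above, weight initialization and mini-batching, using only the Python standard library. The function names, the fixed seed, and the toy layer sizes are assumptions for illustration; the RBM updates themselves are not shown.

```python
import random

def init_weights(n_visible, n_hidden, variance=0.01, seed=0):
    """Sample weights from a zero-mean normal distribution with the given variance."""
    rng = random.Random(seed)
    std = variance ** 0.5  # variance 0.01 corresponds to a standard deviation of 0.1
    return [[rng.gauss(0.0, std) for _ in range(n_hidden)]
            for _ in range(n_visible)]

def mini_batches(cases, batch_size=100):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(cases), batch_size):
        yield cases[start:start + batch_size]

# Toy sizes only; the slides use a 2000-500-500 network.
weights = init_weights(n_visible=20, n_hidden=5)
batches = list(mini_batches(list(range(250))))
batch_sizes = [len(batch) for batch in batches]
```

Note that many libraries parameterize the normal distribution by its standard deviation, so a variance of 0.01 is passed as 0.1.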

Hashing • Hashing produces a shortlist of similar documents in a time that is independent of the size of the document collection and linear in the size of the shortlist.
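
One way to picture this lookup is a table keyed by the binary code, probed at every address within a small Hamming radius of the query code. The sketch below assumes codes are stored as Python integers; `build_table`, `hamming_ball`, and `shortlist` are hypothetical names, and the 4-bit codes are toy data.

```python
from collections import defaultdict
from itertools import combinations

def build_table(codes):
    """Map each binary code (an int) to the ids of the documents stored there."""
    table = defaultdict(list)
    for doc_id, code in codes.items():
        table[code].append(doc_id)
    return table

def hamming_ball(code, n_bits, radius):
    """Yield every n_bits-wide code within `radius` bit flips of `code`."""
    for r in range(radius + 1):
        for positions in combinations(range(n_bits), r):
            flipped = code
            for p in positions:
                flipped ^= 1 << p
            yield flipped

def shortlist(table, query_code, n_bits, radius):
    """Collect the documents stored at all addresses near the query code."""
    hits = []
    for neighbour in hamming_ball(query_code, n_bits, radius):
        hits.extend(table.get(neighbour, []))
    return hits

codes = {"doc_a": 0b1010, "doc_b": 0b1011, "doc_c": 0b0101}
table = build_table(codes)
result = shortlist(table, 0b1010, n_bits=4, radius=1)
```

The number of addresses probed depends only on the code length and the radius, not on how many documents are stored, which is why the lookup time does not grow with the collection.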

Semantic Hashing Gaurman et al


Hamming distance • Very little memory is needed for codes. • Fast to find the hamming distance between binary codes.
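
As an illustration of why this is fast, the Hamming distance between two codes stored as integers is one XOR followed by a bit count. This is a sketch under that storage assumption; `hamming_distance` and `nearest_codes` are hypothetical names.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which the two codes differ."""
    return bin(code_a ^ code_b).count("1")

def nearest_codes(query, codes, k):
    """Return the k codes closest to the query in Hamming distance."""
    return sorted(codes, key=lambda c: hamming_distance(query, c))[:k]

distance = hamming_distance(0b1010, 0b0110)
closest = nearest_codes(0b1010, [0b0000, 0b1011, 0b1110], k=2)
```

On the memory side, a 128-bit code occupies 16 bytes, so one million codes take about 16 MB.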

Experimental Results

• Input: a 2000-dimensional word-count vector. • Two text datasets: 20 Newsgroups and Reuters Corpus Volume I (RCV1-v2). • Architecture: 2000-500-500, followed by a code layer of 128 or 30 units.

Class structure of documents • Semantic hashing with 128 bit code:

20 Newsgroups

• 3.6 ms to search through 1 million documents using 128-bit codes. • The same search takes 72 ms for 128-dimensional LSA.

Reuters RCV1-v2

Image retrieval Torralba et al

• Nearest neighbors retrieved from a database of 12,900,000 images.

Thank you
