Making Sense of Word Embeddings

Maria Pelevina¹, Nikolay Arefiev², Chris Biemann¹, Alexander Panchenko¹

¹ TU Darmstadt, Germany

² Moscow State University, Russia

Introduction



We present a simple yet effective approach for learning word sense embeddings. Existing techniques either learn sense representations directly from corpora or rely on sense inventories from lexical resources. In contrast, our approach induces a sense inventory from existing word embeddings by clustering ego-networks of related words. An integrated WSD mechanism labels words in context with the learned sense vectors, which enables downstream applications. Experiments show that the performance of our method is comparable to state-of-the-art unsupervised WSD systems.

Learning Sense Embeddings from Word Embeddings

Text Corpus → Learning Word Vectors → Word Vectors → Calculate Word Similarity Graph → Word Similarity Graph → Word Sense Induction → Sense Inventory → Pooling of Word Vectors → Sense Vectors

Schema of the word sense embeddings learning method.

Word Sense Induction

Visualization of the ego-network of the word "table" with the "furniture" and the "data" sense clusters.

Word sense clusters from inventories derived from the Wikipedia corpus via crowdsourcing (TWSI), JoBimText (JBT) and word embeddings (w2v).
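The induction step can be illustrated with a minimal sketch. It assumes a toy similarity graph (`TOY_GRAPH`, hypothetical data) and uses connected components of the ego-network as a simple stand-in for the graph clustering. The function names `ego_network` and `induce_senses` are illustrative and are not taken from the sensegram code.

```python
def ego_network(graph, ego):
    """Subgraph induced by the neighbours of `ego`; the ego itself is left out."""
    nodes = set(graph.get(ego, ()))
    return {
        n: {m for m in graph.get(n, ()) if m in nodes and m != n}
        for n in nodes
    }


def induce_senses(graph, ego):
    """Cluster the ego-network; each connected component is one induced sense.

    Assumption: connected components replace the actual graph clustering
    algorithm, which is not specified in the text above.
    """
    net = ego_network(graph, ego)
    # Make the edges symmetric so that traversal works in both directions.
    sym = {n: set(nbrs) for n, nbrs in net.items()}
    for n, nbrs in net.items():
        for m in nbrs:
            sym[m].add(n)

    seen, clusters = set(), []
    for start in sorted(sym):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(sym[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical nearest-neighbour lists, as they might be read off word vectors.
TOY_GRAPH = {
    "table": ["chair", "desk", "sofa", "row", "column", "spreadsheet"],
    "chair": ["desk", "sofa", "table"],
    "desk": ["chair", "table"],
    "sofa": ["chair", "table"],
    "row": ["column", "table"],
    "column": ["row", "spreadsheet", "table"],
    "spreadsheet": ["column", "table"],
}

if __name__ == "__main__":
    for cluster in induce_senses(TOY_GRAPH, "table"):
        print(cluster)
```

Removing the ego word is what separates the senses: "table" is linked to every neighbour, so keeping it would merge the "furniture" and "data" groups into a single component.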

Word Sense Disambiguation with Sense Embeddings

Context representation based on k context or word vectors:

Similarity- and probability-based disambiguation in context:

Sense vector:

Filtering the k context words:

Neighbours of the word "table" and its senses.
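A minimal sketch of the similarity-based variant follows, under stated assumptions: a sense vector is taken as the unweighted mean of the word vectors in its cluster, the context vector as the mean of the kept context word vectors, and the filter keeps the k context words whose similarity differs most between senses. These choices, the toy two-dimensional vectors and all names (`sense_vectors`, `disambiguate`, `WORD_VECTORS`, `INVENTORY`) are illustrative and not the method's exact formulas.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sense_vectors(word_vectors, inventory):
    """Pool the word vectors of each sense cluster (assumed: unweighted mean)."""
    return {
        sense: mean_vector([word_vectors[w] for w in cluster])
        for sense, cluster in inventory.items()
    }


def disambiguate(context_words, word_vectors, senses, k=3):
    """Return the sense whose vector is most similar to the context vector."""
    known = [w for w in context_words if w in word_vectors]

    def discrimination(word):
        # Assumed filter score: spread of the word's similarity across senses.
        sims = [cosine(word_vectors[word], s) for s in senses.values()]
        return max(sims) - min(sims)

    kept = sorted(known, key=discrimination, reverse=True)[:k]
    if not kept:
        return None  # no usable context words
    context = mean_vector([word_vectors[w] for w in kept])
    return max(senses, key=lambda sense: cosine(context, senses[sense]))


# Hypothetical 2-d vectors: axis 0 ~ "furniture", axis 1 ~ "data".
WORD_VECTORS = {
    "chair": [1.0, 0.1], "desk": [0.9, 0.2],
    "row": [0.1, 1.0], "column": [0.2, 0.9],
    "wooden": [0.8, 0.1], "data": [0.1, 0.8], "the": [0.5, 0.5],
}
INVENTORY = {
    "table#furniture": ["chair", "desk"],
    "table#data": ["row", "column"],
}
SENSES = sense_vectors(WORD_VECTORS, INVENTORY)

if __name__ == "__main__":
    print(disambiguate(["the", "wooden"], WORD_VECTORS, SENSES, k=1))
    print(disambiguate(["the", "data"], WORD_VECTORS, SENSES, k=2))
```

With `k=1`, the uninformative word "the" is filtered out because it is equally similar to both senses, so only "wooden" shapes the context vector.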

Results: WSD Evaluation on the TWSI and the SemEval 2013 Task 13 Datasets

Performance of our method, trained on the Wikipedia corpus, on the full (left) and the sense-balanced (right) TWSI dataset.

The best configurations of our method on the SemEval 2013 Task 13 dataset. All systems were trained on the ukWaC corpus.

Code & Data: https://github.com/tudarmstadt-lt/sensegram

Data & Code: http://github.com/cental/stc
