Embedding Probabilistic Logic for Machine Reading
Sebastian Riedel (University College London)

Overview

Machine Reading & Reasoning …
… with Probabilistic Logics and Embeddings
Challenges
  Injecting Explanations
  Extracting Explanations

Machine Reading

Question: “Who works in London and is interested in NLP?”
Query: interest(x,NLP), worksFor(x,y), in(y,London)
Relational DB: topic(Seb,NLP), worksFor(Seb,UCL), in(UCL,London)
Narrow domain-specific schema [Kwiatkowski et al., 2013] [Mintz et al., 2009]
Statistical NLP: Semantics, Syntax, Coreference
Text: “Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Machine Reading [Riedel et al., 2013]

Question: “Who works in London and is interested in NLP?”
Query: interest(x,NLP), worksFor(x,y), in(y,London)
Relational DB: works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)
Wide universal schema
Statistical NLP: Semantics, Syntax, Coreference
Text: “Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Semantics as Reasoning [Riedel et al., 2013]

Question: “Who works in London and is interested in NLP?”
Query: interest(x,NLP), worksFor(x,y), in(y,London)
Facts: works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)
Statistical Relational Learner and Reasoner, with rules:
  worksFor(x,y): faculty-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]
  faculty-at(x,y): lecturer-at(x,y)
Wide universal schema
Statistical NLP: Syntax, Coreference
Text: “Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Benefit: Transitive Reasoning

Question: “Who works in London and is interested in NLP?”
Query: interest(x,NLP), worksFor(x,y), in(y,London)
Facts: works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)
Statistical Relational Learner and Reasoner, with rules:
  worksFor(x,y): faculty-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]
  faculty-at(x,y): lecturer-at(x,y)
Wide universal schema
Statistical NLP: Syntax, Coreference
Text: “Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”
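
The transitive chain on this slide (lecturer-at gives faculty-at, which gives worksFor) can be illustrated with a small forward-chaining sketch. This is only an illustration: it treats the rules as hard implications and ignores weights such as [0.9], whereas the talk is about a statistical learner and reasoner. The function names `forward_chain` and `answer` are hypothetical.

```python
# Observed facts in the wide universal schema (taken from the slides).
facts = {
    ("works-in-area-of", "Seb", "NLP"),
    ("lecturer-at", "Seb", "UCL"),
    ("in", "UCL", "London"),
}

# Rules "head(x,y): body(x,y)" from the slides, stored as (body, head).
# Assumption: rules are hard implications; the weight [0.9] is ignored.
rules = [
    ("lecturer-at", "faculty-at"),
    ("faculty-at", "worksFor"),
    ("works-in-area-of", "interest"),
]

def forward_chain(facts, rules):
    """Apply the implication rules until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for rel, x, y in list(derived):
                if rel == body and (head, x, y) not in derived:
                    derived.add((head, x, y))
                    changed = True
    return derived

def answer(derived, topic, city):
    """Answer the query interest(x, topic), worksFor(x, y), in(y, city)."""
    return sorted(
        x
        for (r1, x, t) in derived if r1 == "interest" and t == topic
        for (r2, x2, y) in derived if r2 == "worksFor" and x2 == x
        if ("in", y, city) in derived
    )

closure = forward_chain(facts, rules)
print(answer(closure, "NLP", "London"))
```

Without the rules, the query finds nothing, because the text only yields lecturer-at and works-in-area-of; with them, worksFor(Seb,UCL) is derived in two steps and the query succeeds.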

Benefit: More Coverage

Question: “Who is faculty in London and interested in NLP?”
Query: interest(x,NLP), worksFor(x,y), in(y,London)
Facts: works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)
Statistical Relational Learner and Reasoner, with rules:
  worksFor(x,y): faculty-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]
  faculty-at(x,y): lecturer-at(x,y)
Wide universal schema
Statistical NLP: Syntax, Coreference
Text: “Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Benefit: Code Reuse

Question: “Who lives in London and is interested in NLP?”
Query: interest(x,NLP), worksFor(x,y), in(y,London)
Facts: works-in-area-of(Seb,NLP), lecturer-at(Seb,UCL), in(UCL,London)
Statistical Relational Learner and Reasoner [Lao et al., 2011], with rules:
  worksFor(x,y): faculty-at(x,y)
  interest(x,y): works-in-area-of(x,y) [0.9]
  livesIn(x,z): worksFor(x,y), locatedIn(y,z) [0.6]
Wide universal schema
Statistical NLP: Syntax, Coreference
Text: “Sebastian Riedel works in the area of NLP and is now Lecturer at UCL”

Reasoner and Learner

Statistical Relational Learner and Reasoner: ?

Probabilistic Logics

Use (weighted) logics to define graphical models
Figure: graphical model over lecturer-at, prof-at, works-for
Examples:
  Markov Logic [Richardson and Domingos, 2006]
  Bayesian Logic Programs [Kersting, 2007]
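
For the Markov Logic example, the way weighted formulae define a graphical model can be written out as follows. This is the standard Markov Logic Network distribution, not something spelled out on the slide: each formula F_i has a weight w_i, n_i(x) counts the true groundings of F_i in a possible world x, and Z normalizes over all worlds.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_i w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_i w_i \, n_i(x') \Big)
```

A formula such as lecturer-at(x,y) ⇒ works-for(x,y) with a large weight makes worlds that violate it for many pairs (x,y) much less probable, without ruling them out.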

Probabilistic Logics

Use (weighted) logics to define graphical models
Figure: graphical model over lecturer-at, prof-at, works-for
Problems:
  Inference
  Rule Learning

Matrix Factorization

Think of database as a matrix or tensor
Figure: binary matrix with columns lecturer-at, prof-at, works-for; observed facts are marked 1.

Matrix Factorization

Embed entity (pairs) in low dimensional vector spaces
Figure: the matrix with columns lecturer-at, prof-at, works-for, plus unknown embedding vectors (??) for the entity pairs.

Matrix Factorization

Embed relations in low dimensional vector spaces
Figure: unknown embedding vectors (? ?) for lecturer-at, prof-at and works-for, alongside the entity-pair embeddings (??).

Matrix Factorization

Find a matrix-matrix product that approximates observed DB
Figure: the entity-pair embeddings (??) multiplied by the relation embeddings (? ?) for lecturer-at, prof-at and works-for, next to the observed matrix.

Matrix Factorization

Or a non-linear function of this product
Figure: a sigmoid applied to the matrix-matrix product.

Matrix Factorization

Low rank forces some 0 cells to become non-zero => prediction
Figure: after the sigmoid, some cells that were empty receive scores such as .9 next to the observed 1s.
[Nickel, Bordes, …]
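
A minimal NumPy sketch of the idea on these slides: embed entity pairs and relations, pass their dot product through a sigmoid, and fit the observed cells so that unobserved cells get a score. Everything except the three relation names is an assumption made for illustration: the toy data, the rank of 1, the explicit negative rows, the plain gradient descent and the hyperparameters are not taken from the talk.

```python
import numpy as np

relations = ["lecturer-at", "prof-at", "works-for"]  # column names from the slides

# Observed cells as (entity-pair row, relation column, label); hypothetical data.
observed = [
    (0, 0, 1), (0, 2, 1),             # pair 0: lecturer-at and works-for
    (1, 0, 1), (1, 2, 1),             # pair 1: lecturer-at and works-for
    (2, 1, 1), (2, 2, 1),             # pair 2: prof-at and works-for
    (3, 1, 1), (3, 2, 1),             # pair 3: prof-at and works-for
    (4, 0, 1),                        # pair 4: lecturer-at only, works-for unobserved
    (5, 0, 0), (5, 1, 0), (5, 2, 0),  # pair 5: explicit negatives
    (6, 0, 0), (6, 1, 0), (6, 2, 0),  # pair 6: explicit negatives
]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(observed, n_pairs, n_rels, rank=1, steps=3000, lr=0.1, l2=1e-2, seed=0):
    """Fit sigmoid(A @ V.T) to the observed cells with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(0.0, 0.1, size=(n_pairs, rank))  # entity-pair embeddings
    V = rng.uniform(0.0, 0.1, size=(n_rels, rank))   # relation embeddings
    rows, cols, y = (np.array(c) for c in zip(*observed))
    for _ in range(steps):
        # Gradient of the logistic loss with respect to each observed logit.
        err = sigmoid(np.sum(A[rows] * V[cols], axis=1)) - y
        grad_A = l2 * A
        grad_V = l2 * V
        np.add.at(grad_A, rows, err[:, None] * V[cols])
        np.add.at(grad_V, cols, err[:, None] * A[rows])
        A -= lr * grad_A
        V -= lr * grad_V
    return A, V

A, V = train(observed, n_pairs=7, n_rels=3)
p_heldout = sigmoid(A[4] @ V[2])  # works-for for pair 4, never seen in training
print(f"P(works-for | pair 4) = {p_heldout:.2f}")
```

Pair 4 has no works-for cell in the training data, but because it shares lecturer-at with pairs 0 and 1, the low-rank model places it near them and scores its works-for cell above 0.5. Rank 1 is the most extreme low-rank setting and is used here only to keep the example small.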

Results for Relation Extraction [Riedel et al. 2013, NAACL]

Figure: averaged 11-point precision/recall curves (x-axis: Recall, y-axis: Precision, both from 0 to 1); legend: SU12, N, F, NF, NFE.

Challenge 1: Injecting Symbolic Rules

“lecturers are employees!”
?

Figure 1: Injecting Logic into Matrix Factorization: G… entity-pairs P and predicates/relations R, matrix factori… embeddings that approximate the observed matrix. In… entities and relations to learn the embeddings such that th…
Figure labels: Facts, |P|, First-order Formulae, KB, sigmoid
  ∀x, y : #2-unit-of-#1(x, y) ⇒ organi… Example: “Boeing and the Sikorsky Aircraf…
  ∀x, y : #1-city-in-#2(x, y) ⇒ locati… Example: “With 900,000 people, San Jose#1…

Challenge 1: Injecting Symbolic Rules

“a liquid turns into a solid when its temperature is lowered below its freezing point”
?

Figure: sigmoid of the factorization, as on the previous slides.

Some Experiments

“Zero-shot” learning
  Given: a lot of relational data, but none for worksFor
  Goal: given a few rules for worksFor, learn to predict worksFor

Results (in MAP for several relations)
  Only rules: 0.23
  Apply rules after factorization: 0.34
  Apply rules before factorization: 0.43
  Incorporate rules into training objective: 0.52

[Rocktaeschel et al. 2014, SP14]
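
The last setting, incorporating rules into the training objective, can be sketched as an extra loss term next to the usual fact loss. The sketch below is a toy illustration under stated assumptions, not the setup behind the reported numbers: it scores the implication lecturer-at(x,y) ⇒ works-for(x,y) with the product-style relaxation 1 - p_body * (1 - p_head), which is one common choice, and the data, rank, hyperparameters and the helper `rule_loss_grads` are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

LECTURER_AT, WORKS_FOR = 0, 1  # relation indices; names taken from the slides

def rule_loss_grads(z_body, z_head):
    """Loss -log(1 - p_body * (1 - p_head)) for body => head, with logit gradients."""
    p_b, p_h = sigmoid(z_body), sigmoid(z_head)
    truth = 1.0 - p_b * (1.0 - p_h)  # soft truth value of the implication
    loss = -np.log(truth)
    d_body = (1.0 - p_h) * p_b * (1.0 - p_b) / truth
    d_head = -p_b * p_h * (1.0 - p_h) / truth
    return loss, d_body, d_head

def train(steps=2000, lr=0.1, l2=1e-2, rank=2, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.uniform(0.0, 0.1, size=(3, rank))  # entity-pair embeddings
    V = rng.uniform(0.0, 0.1, size=(2, rank))  # relation embeddings
    # Hypothetical zero-shot data: lecturer-at is observed, works-for never is.
    y_lecturer = np.array([1.0, 1.0, 0.0])
    for _ in range(steps):
        Z = A @ V.T               # logits, shape (pairs, relations)
        G = np.zeros_like(Z)      # gradient of the objective w.r.t. the logits
        G[:, LECTURER_AT] += sigmoid(Z[:, LECTURER_AT]) - y_lecturer  # fact loss
        _, d_body, d_head = rule_loss_grads(Z[:, LECTURER_AT], Z[:, WORKS_FOR])
        G[:, LECTURER_AT] += d_body  # rule loss, body side
        G[:, WORKS_FOR] += d_head    # rule loss, head side
        A, V = A - lr * (G @ V + l2 * A), V - lr * (G.T @ A + l2 * V)
    return sigmoid(A @ V.T)

probs = train()
print(np.round(probs[:, WORKS_FOR], 2))
```

Although no works-for fact is ever observed, the rule term pushes the works-for score up for exactly those pairs whose lecturer-at score is high, which is the zero-shot behaviour the slide describes.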

Challenge 2: Extracting Explanations

“lecturers are employees!”
?

Figure 1: Injecting Logic into Matrix Factorization (as above).

Challenge 2: Extracting Explanations

“I returned Sebastian because we know he is a lecturer at UCL, which is in London, so he most likely lives in London …”
?

[Thrun 1995, NIPS] [Craven 1996, NIPS]

Figure 1: Injecting Logic into Matrix Factorization (as above).

Summary

Do semantics in a probabilistic relational reasoner
Reasoner: matrix/tensor factorization (or other LV models)
Challenges: inject explanations, extract explanations

Do this for: deeper downstream tasks such as question answering, fact checking, machine comprehension
We are hiring (thanks to the Paul G. Allen Foundation)

Thanks


NIPS Learning Semantics 2014
