MARKOV LOGIC
NASSLLI 2010
Mathias Niepert

MAP INFERENCE IN MARKOV LOGIC NETWORKS
- We tried Alchemy (MaxWalkSAT) with poor results
- Better results with integer linear programming (ILP)
- ILP performs exact inference
- Works very well on the problems we are concerned with
- Originated in the field of operations research

LINEAR PROGRAMMING
- A linear programming problem is the problem of maximizing (or minimizing) a linear function subject to a finite number of linear constraints
- Standard form of linear programming:

maximize    ∑_{j=1}^n c_j x_j

subject to  ∑_{j=1}^n a_{ij} x_j ≤ b_i    (i = 1, 2, ..., m)

            x_j ≥ 0                       (j = 1, 2, ..., n)

INTEGER LINEAR PROGRAMMING
- An integer linear programming problem is the problem of maximizing (or minimizing) a linear function subject to a finite number of linear constraints
- Difference from LP: variables are only allowed to take integer values

maximize    ∑_{j=1}^n c_j x_j

subject to  ∑_{j=1}^n a_{ij} x_j ≤ b_i    (i = 1, 2, ..., m)

            x_j ≥ 0                       (j = 1, 2, ..., n)

            x_j ∈ {..., −1, 0, 1, ...}
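Again as an illustration (assuming the PuLP library; any ILP solver would do), a tiny integer program whose LP relaxation has optimum 1.5 at x1 = x2 = 0.75 while the integer optimum is 1, showing that integrality genuinely changes the problem:

import pulp

prob = pulp.LpProblem("tiny_ilp", pulp.LpMaximize)
x1 = pulp.LpVariable("x1", lowBound=0, cat=pulp.LpInteger)
x2 = pulp.LpVariable("x2", lowBound=0, cat=pulp.LpInteger)
prob += x1 + x2                    # objective: maximize x1 + x2
prob += 2 * x1 + 2 * x2 <= 3       # single linear constraint
prob.solve()
print(pulp.value(x1), pulp.value(x2), pulp.value(prob.objective))  # e.g. 1, 0, 1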

MAP INFERENCE

1.5   ∀x Smokes(x) ⇒ Cancer(x)
1.1   ∀x,y Friends(x, y) ⇒ (Smokes(x) ⇔ Smokes(y))

Two constants: Anna (A) and Bob (B)
Evidence: Friends(A,B), Friends(B,A), Smokes(B)

¬Smokes(A) ∨ Cancer(A)                   1.5
¬Smokes(B) ∨ Cancer(B)                   1.5
¬Smokes(A) ∨ Cancer(B)                   1.5
¬Smokes(B) ∨ Cancer(A)                   1.5
¬Friends(A,B) ∨ ¬Smokes(A) ∨ Smokes(B)   0.55
¬Friends(A,B) ∨ ¬Smokes(B) ∨ Smokes(A)   0.55
…
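A minimal sketch of how weighted first-order formulas are instantiated over the constants (a hypothetical helper, not Alchemy's actual grounding code); note how the weight 1.1 of the biconditional is split over its two implication clauses, 0.55 each:

from itertools import product

constants = ["A", "B"]
ground_clauses = []                          # (weight, clause) pairs; "!" marks negation

# 1.5  forall x: Smokes(x) => Cancer(x)
for x in constants:
    ground_clauses.append((1.5, [f"!Smokes({x})", f"Cancer({x})"]))

# 1.1  forall x,y: Friends(x,y) => (Smokes(x) <=> Smokes(y)), split into two clauses
for x, y in product(constants, repeat=2):
    ground_clauses.append((0.55, [f"!Friends({x},{y})", f"!Smokes({x})", f"Smokes({y})"]))
    ground_clauses.append((0.55, [f"!Friends({x},{y})", f"!Smokes({y})", f"Smokes({x})"]))

for w, clause in ground_clauses:
    print(w, " v ".join(clause))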

MAP INFERENCE - EXAMPLE

¬Smokes(A) ∨ Cancer(A)   1.5

Introduce one variable for each ground atom: s_a, c_a
Introduce one variable for each formula: x_j
Add the following three constraints:
- −s_a − x_j ≤ −1
- c_a − x_j ≤ 0
- x_j + s_a − c_a ≤ 1
Add 1.5·x_j to the objective function

maximize    ∑_{j=1}^n c_j x_j

subject to  ∑_{j=1}^n a_{ij} x_j ≤ b_i    (i = 1, 2, ..., m)

            x_j ∈ {0, 1}

ONTOLOGY MATCHING

[Figure: Ontology O1 (concepts Person, Author, Reviewer, PCMember, Document, Paper; roles writes, reviews) and Ontology O2 (concepts People, Author, CommitteeMember, Doc, Review, Paper; roles writes, reviews)]

MARKOV LOGIC & ONTOLOGY MATCHING
- Markov logic supports hard and soft constraints
- Ontology alignment involves both types of constraints

- Hard constraints
  - To ensure a 1-1 and functional alignment
  - To mitigate incoherence in the merged ontology

- Soft constraints
  - "A-priori" confidence for correspondences of concepts and roles, based on lexical similarity measures
  - "Stability constraints": the alignment should use structural information of the two ontologies during the alignment process

DESCRIPTION LOGICS
- Logic-based knowledge representation formalisms
  - Descendants of semantic networks and KL-ONE
  - Describe a domain in terms of concepts (classes), roles (properties, relationships) and individuals (instances)

- Typical properties of a DL
  - Formal model-theoretic semantics
  - Decidable fragments of FOL
  - Closely related to propositional modal and dynamic logics
  - Availability of inference algorithms
  - Decision procedures for key problems (satisfiability, subsumption, etc.)
  - Highly optimized implemented systems

DESCRIPTION LOGIC BASICS
- Concept names are equivalent to unary predicates
  - In general, concepts are equivalent to formulas with one free variable
- Role names are equivalent to binary predicates
  - In general, roles are equivalent to formulas with two free variables
- Individual names are equivalent to constants
- Operators are restricted so that:
  - the language is decidable and, if possible, of low complexity
  - there is no need for explicit use of variables
    - restricted form of ∃ and ∀
  - features such as counting can be succinctly expressed

DL SYSTEM ARCHITECTURE

[Figure (Horrocks 2005): a knowledge base, consisting of a Tbox (schema) and an Abox (data), connected to an inference system and an interface]

Tbox (schema):
  Man ≡ Human ⊓ Male
  Happy-Father ≡ Man ⊓ ∃has-child
  Woman ⊑ ¬Man

Abox (data):
  John : Happy-Father
  ⟨John, Mary⟩ : has-child
  John : ≤1 has-child

SYNTAX OF DESCRIPTIONS (ALC)

A description C or D can be:
  A        an atomic concept
  ⊤        the universal concept (top)
  ⊥        the null concept (bottom)
  ¬C       a negated concept (C an atomic concept)
  C ⊓ D    the intersection of two concepts
  C ⊔ D    the union of two concepts
  ∀R.C     value restriction
  ∃R.C     existential quantification

DL SEMANTICS
- Semantics defined by interpretations
- An interpretation I = (Δ^I, ·^I), where
  - Δ^I is the domain (a non-empty set of individuals)
  - ·^I is an interpretation function that maps:
    - each concept (class) name A to a subset A^I of Δ^I
    - each role (property) name R to a binary relation R^I over Δ^I
    - each individual name i to an element i^I of Δ^I

DL SEMANTICS (CONT.)
The interpretation function ·^I extends to concept (and role) expressions.
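For concreteness, the standard ALC extension (not spelled out on this slide) is:

  (¬C)^I    = Δ^I \ C^I
  (C ⊓ D)^I = C^I ∩ D^I
  (C ⊔ D)^I = C^I ∪ D^I
  (∃R.C)^I  = { x ∈ Δ^I | ∃y. (x, y) ∈ R^I ∧ y ∈ C^I }
  (∀R.C)^I  = { x ∈ Δ^I | ∀y. (x, y) ∈ R^I → y ∈ C^I }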

DL KNOWLEDGE BASE
- A DL knowledge base K is a pair ⟨T, A⟩ where
  - T is a set of "terminological" axioms (the Tbox)
  - A is a set of "assertional" axioms (the Abox)

- Tbox axioms are of the form:
  C ⊑ D,  C ≡ D,  R ⊑ S,  R ≡ S  and  R⁺ ⊑ R
  where C, D are concepts, R, S are roles, and R⁺ ranges over the set of transitive roles

- Abox axioms are of the form:
  x : D,  ⟨x, y⟩ : R
  where x, y are individual names, D is a concept and R is a role

DL KNOWLEDGE BASE

Knowledge base
Tbox (schema):
  Man ≡ Human ⊓ Male
  Happy-Father ≡ Man ⊓ ∃has-child
  Woman ⊑ ¬Man
  has-child.Female   isEmployedBy.Farmer

Abox (data):
  John : Happy-Father
  ⟨John, Mary⟩ : has-child
  John : ≤1 has-child

DL KNOWLEDGE BASE SEMANTICS
- An interpretation I satisfies (models) a Tbox axiom A (I ⊨ A):
  - I ⊨ C ⊑ D  iff  C^I ⊆ D^I
  - I ⊨ C ≡ D  iff  C^I = D^I
  - I ⊨ R ⊑ S  iff  R^I ⊆ S^I
  - I ⊨ R ≡ S  iff  R^I = S^I
- I satisfies a Tbox T (I ⊨ T) if and only if I satisfies every axiom A in T
- An interpretation I satisfies an Abox axiom A (I ⊨ A):
  - I ⊨ x : D       iff  x^I ∈ D^I
  - I ⊨ ⟨x, y⟩ : R  iff  (x^I, y^I) ∈ R^I
- I satisfies an Abox A (I ⊨ A) if and only if I satisfies every axiom A in A
- I satisfies a KB K (I ⊨ K) if and only if I satisfies both T and A

ONTOLOGY MATCHING

[Figure: Ontology O1 (Person, Author, Reviewer, PCMember, Document, Paper; roles writes, reviews) and Ontology O2 (People, Author, CommitteeMember, Doc, Review, Paper; roles writes, reviews), with candidate correspondences:]

< Author, Author, =, 0.97 >
< Paper, Paper, =, 0.94 >
< reviews, reviews, =, 0.91 >
< writes, writes, =, 0.7 >
< Person, People, =, 0.8 >
< Document, Doc, =, 0.7 >
< Reviewer, Review, ≤, 0.6 >
…

ONTOLOGY MATCHING WITH ML
- We can compute confidence values for matching correspondences (equivalence):

  < Author, Author, =, 0.97 >
  < Paper, Paper, =, 0.94 >
  < reviews, reviews, =, 0.91 >
  < writes, writes, =, 0.7 >
  < Person, People, =, 0.8 >
  < Document, Doc, =, 0.7 >
  < Reviewer, Review, ≤, 0.6 >
  …

- We used the Levenshtein distance between the labels of properties and concepts
- More sophisticated approaches are possible
- The goal is a 1-1 and functional alignment
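For illustration, a minimal sketch of such a Levenshtein-based a-priori confidence (our own normalization; the values on the slides may come from a different preprocessing of the labels):

# Classic edit-distance dynamic program over prefixes of a and b
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[len(b)]

def similarity(a: str, b: str) -> float:
    # normalize to [0, 1]: 1.0 means identical labels
    m = max(len(a), len(b))
    return 1.0 if m == 0 else 1.0 - levenshtein(a.lower(), b.lower()) / m

print(similarity("Document", "Doc"))   # 0.375 before any label preprocessing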

ONTOLOGY MATCHING AND ML
- Typed predicates cmap and pmap model the sought-after matching correspondences between concepts and properties
- The confidence values s_{X,Y} are incorporated as weights
- The alignment is constrained to be 1-1 and functional (cardinality constraints), as sketched below
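A sketch of how the cardinality constraints look at the ILP level (hypothetical names; the real system derives them from hard ML formulas, and the confidence values are those listed on the previous slide):

import pulp

concepts1 = ["Person", "Document", "Reviewer"]    # from O1
concepts2 = ["People", "Doc", "Review"]           # from O2

prob = pulp.LpProblem("alignment", pulp.LpMaximize)
map_vars = {(x, y): pulp.LpVariable(f"map_{x}_{y}", cat=pulp.LpBinary)
            for x in concepts1 for y in concepts2}

# a-priori confidences s_{X,Y}; unlisted pairs get 0
conf = {p: 0.0 for p in map_vars}
conf[("Person", "People")] = 0.8
conf[("Document", "Doc")] = 0.7
conf[("Reviewer", "Review")] = 0.6
prob += pulp.lpSum(conf[p] * v for p, v in map_vars.items())

for x in concepts1:   # each O1 concept matches at most one O2 concept
    prob += pulp.lpSum(map_vars[(x, y)] for y in concepts2) <= 1
for y in concepts2:   # and vice versa, so the alignment is 1-1 and functional
    prob += pulp.lpSum(map_vars[(x, y)] for x in concepts1) <= 1

prob.solve()
print([p for p, v in map_vars.items() if pulp.value(v) == 1])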

ONTOLOGY MATCHING AND ML
- We want the merged ontology to be "as coherent as possible" (no unsatisfiable concepts)
- First, we need to introduce additional predicates into the ML formulation
  - Predicates for subsumption, disjointness, domain restrictions, range restrictions, …
- We used the tableau reasoner Pellet for this preprocessing step

COHERENCE CONSTRAINTS
- Hard constraints (formulas with "infinite weight") that ensure incoherences cannot occur in the merged ontology; one possible shape is sketched below

[Figure: an alignment that would introduce an incoherence]
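One plausible shape for such a hard constraint (our rendering, not necessarily the exact formulas on the omitted slide; sub_1 and dis_2 denote subsumption in O1 and disjointness in O2 as computed by the reasoner):

  ∞ :  sub_1(x, x′) ∧ dis_2(y, y′) ⇒ ¬(cmap(x, y) ∧ cmap(x′, y′))

i.e., a pair of correspondences is forbidden if it would map a concept and its superclass onto two disjoint concepts.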

EXAMPLE

EXAMPLE – GROUND MARKOV LOGIC NETWORK

EXAMPLE – INTEGER LINEAR PROGRAM

STABILITY CONSTRAINTS
- Stability formulas (soft!) propagate evidence derived from structural properties
- Example: a formula that reduces the probability of alignments that map concept X to Y and X′ to Y′ if X′ subsumes X but Y′ does not subsume Y, since such an alignment would introduce new structural knowledge; a sketch follows below
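A plausible rendering of such a soft formula (ours, not verbatim from the slide), with a negative weight w so that every violation lowers the MAP score rather than forbidding the alignment outright:

  w < 0 :  cmap(x, y) ∧ cmap(x′, y′) ∧ sub_1(x, x′) ∧ ¬sub_2(y, y′)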

MANUALLY SET WEIGHTS
- We set the weights for the first group to −0.5 and the weights for the second group to −0.25
- Subsumption axioms between concepts are specified by ontology engineers more often than domain and range restrictions of properties (Ding and Finin 2006)
- A pair of correct correspondences will therefore violate constraints of the first type less often than constraints of the second type

LEARNING WEIGHTS
An online learner roughly works as follows:
1. Set the number of epochs to 0.
2. For each instance (x_j, y_j) of the training corpus:
   - run inference to calculate ŷ = argmax_y s(x_j; y)
   - update the current weights w_i by comparing ŷ to y_j
3. If the number of epochs is larger than some predefined value, go to 4; otherwise increase the number of epochs and go to 2.
4. Return the last solution.

LEARNING WEIGHTS
- Run inference to calculate ŷ = argmax_y s(x_j; y)
- Update the current weights w_i by comparing ŷ to y_j:
  w_i ← w_i + η [count_i(y_j) − count_i(ŷ)]
- This is the voted perceptron rule
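A minimal sketch of one such update step in Python (names are ours; count_i(y) is the number of true groundings of formula i in state y):

def perceptron_update(weights, counts_gold, counts_map, eta=0.1):
    # w_i <- w_i + eta * (count_i(y_j) - count_i(y_hat))
    return [w + eta * (g - m)
            for w, g, m in zip(weights, counts_gold, counts_map)]

# one step: formula counts in the gold alignment y_j vs. in the MAP prediction y_hat
weights = perceptron_update([0.0, 0.0], counts_gold=[3, 1], counts_map=[2, 2])
print(weights)   # [0.1, -0.1]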

RESULTS

RESULTS – MANUAL VS. LEARNED

THANKS

Thank you for attending the course!
