Automorphism Groups of Graphical Models and Lifted Variational Inference

Hung Hai Bui (1), Tuyen N. Huynh (2), Sebastian Riedel (3)

(1) Laboratory for Natural Language Understanding, Nuance Communications
(2) AI Center, SRI International
(3) Department of Computer Science, University College London

July 14, 2013

UAI 2013


Motivations

• Probabilistic inference: exploit low tree-width, local graph sparsity
• What about the above graphical models?
  • Inference can be done efficiently using lifted inference.
• Why? "Symmetry" is essential.
  • What is symmetry?
  • How do we exploit symmetry in variational inference?


Motivations (cont.)

• Lifted inference methods are mostly derived procedurally:
  • identify duplicated computational steps that can be performed once.
  • Not clear what form of symmetry is being exploited.
  • Hard to generalize.
• We propose a declarative approach:
  • Formalize symmetry in graphical models
  • Lift variational optimization formulations rather than inference algorithms
  • Lifted problems can be solved with the usual optimization toolbox (LP solvers, cutting plane, dual decomposition, etc.)
  • Connect lifted inference and mainstream variational approximations


Outline

1 Motivation
2 Symmetry of Graphical Models
3 A General Framework for Lifted Convex Variational Inference
4 Lifted MAP with Local and Cycle constraints
5 Experiments


Symmetry in Graphs

• Symmetry is formalized as the set of transformations that preserve the object.
• The set of all such transformations (permutations) π forms the automorphism group.

[Figure: four example graphs labeled with their automorphism groups: (a) S5, (b) S4 x S3, (c) D5, (d) 1 (the trivial group).]

• The automorphism group partitions the set of nodes into orbits.
  • A node orbit is a set of nodes equivalent up to symmetry.
  • Similarly for edge orbits and configuration (node-value assignment) orbits.
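
As a rough illustration of node orbits, the sketch below enumerates the automorphisms of a small graph by brute force and groups nodes that are mapped onto each other. The function names and the two example graphs are illustrative assumptions, not taken from the slides; brute force over all permutations is only viable for a handful of nodes, which is why a dedicated tool is needed in practice.

```python
from itertools import permutations


def automorphisms(nodes, edges):
    """Return every permutation of `nodes` (as a dict) that maps edges to edges."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for image in permutations(nodes):
        pi = dict(zip(nodes, image))
        # A bijection on nodes that sends every edge to an edge is an automorphism.
        if all(frozenset((pi[u], pi[v])) in edge_set for u, v in edges):
            result.append(pi)
    return result


def node_orbits(nodes, edges):
    """Group nodes that some automorphism maps onto each other."""
    autos = automorphisms(nodes, edges)
    orbits = {frozenset(pi[v] for pi in autos) for v in nodes}
    return sorted(sorted(o) for o in orbits)


# Hypothetical examples: a 5-cycle (automorphism group D5) and a 3-node path.
cycle_nodes = [0, 1, 2, 3, 4]
cycle_edges = [(i, (i + 1) % 5) for i in range(5)]
path_nodes = [0, 1, 2]
path_edges = [(0, 1), (1, 2)]

print(len(automorphisms(cycle_nodes, cycle_edges)))
print(node_orbits(cycle_nodes, cycle_edges))
print(node_orbits(path_nodes, path_edges))
```

In the 5-cycle all nodes fall into one orbit, while in the path the two endpoints form one orbit and the middle node another.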


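Symmetry of Graphical Models

• Exponential family

$$F(x \mid \theta) = \exp\left(\langle \Phi(x), \theta \rangle - A(\theta)\right), \qquad x \in \mathbb{R}^n,\ \theta \in \mathbb{R}^m$$

• Automorphisms are permutations of variables and features preserving $\Phi$.
• Formally, a pair $(\pi, \gamma) \in S_n \times S_m$ such that

$$\Phi^{\gamma^{-1}}(x^{\pi}) \equiv \Phi(x), \quad \text{i.e.,} \quad \Phi_i(x^{\pi}) \equiv \Phi_{\gamma(i)}(x)$$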
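
The condition $\Phi_i(x^{\pi}) \equiv \Phi_{\gamma(i)}(x)$ can be checked exhaustively on a toy family. The sketch below assumes a hypothetical model with two binary variables and three features (two unary, one pairwise), uses 0-based indices, and assumes the convention $(x^{\pi})_j = x_{\pi(j)}$; the permutations tested are involutions, so the result does not depend on that choice of convention.

```python
from itertools import product


def features(x):
    """Phi(x) for two binary variables: two unary features and one pairwise feature."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def permute(vec, perm):
    """Assumed convention: (vec^perm)[j] = vec[perm[j]], with 0-based indices."""
    return tuple(vec[perm[j]] for j in range(len(vec)))


def is_automorphism(pi, gamma, n=2):
    """Check Phi_i(x^pi) == Phi_{gamma(i)}(x) for every binary configuration x."""
    for x in product((0, 1), repeat=n):
        lhs = features(permute(x, pi))
        rhs = features(x)
        if any(lhs[i] != rhs[gamma[i]] for i in range(len(lhs))):
            return False
    return True


swap_vars = (1, 0)           # pi: exchange the two variables
swap_unary = (1, 0, 2)       # gamma: exchange the two unary features
identity_feats = (0, 1, 2)   # gamma: leave the features in place

print(is_automorphism(swap_vars, swap_unary))
print(is_automorphism(swap_vars, identity_feats))
```

Swapping the variables is only an automorphism when the unary features are swapped along with them, which is why the definition pairs a variable permutation with a feature permutation.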


Symmetry of Graphical Models (cont.)

[Figure: factor graph G with variable nodes 1 to 4 and feature nodes f1 to f5, shown in three panels: G, Colored G, and Orbits of G.]

• Automorphisms of Colored G are automorphisms of F
• So far we've ignored parameters. If parameters are tied:
  • Refine the colors of features to be consistent with the parameter tying
  • Similarly, compute the automorphism group consistent with the parameter tying


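Properties of Exponential Family Automorphisms

Theorem. If $(\pi, \gamma) \in \mathrm{Aut}[F]$ then
• $\Pr(x^{\pi} \mid \theta^{\gamma}) = \Pr(x \mid \theta)$
• $\pi$ is an automorphism of the structure graph $G[F]$.
• $\Theta^{\gamma} = \Theta$ and $A(\theta^{\gamma}) = A(\theta)$
• $\mathcal{M}^{\gamma} = \mathcal{M}$ and $A^*(\mu^{\gamma}) = A^*(\mu)$
• $m^{\gamma}(\theta) = m(\theta^{\gamma})$

Proof tools: theory of group actions, orbit-stabilizer theorem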


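Partitions and Symmetrized Subspaces

A partition $\Delta$ of $\{1, \ldots, m\}$ induces a symmetrized subspace $\mathbb{R}^m_\Delta$ of points $r \in \mathbb{R}^m$ such that if $j$ and $j'$ are in the same cell of $\Delta$, then $r_j = r_{j'}$.

[Figure: the symmetrized subspace $\mathbb{R}^3_\Delta$ for $\Delta = \{\{1\}, \{2, 3\}\}$, drawn in $\mathbb{R}^3$ with axes 1, 2, 3.]

$S_\Delta$ denotes the intersection of a set $S$ with the subspace: $S_\Delta = S \cap \mathbb{R}^m_\Delta$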
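
A minimal sketch of the symmetrized subspace, assuming vectors are plain Python lists and partitions are lists of 0-based index cells (both are illustrative choices). Membership means every cell is constant; averaging within each cell is the orthogonal projection onto the subspace.

```python
def in_symmetrized_subspace(r, partition):
    """True if r takes a single value on every cell of the partition."""
    return all(len({r[j] for j in cell}) == 1 for cell in partition)


def project(r, partition):
    """Orthogonal projection onto the symmetrized subspace: average within each cell."""
    out = list(r)
    for cell in partition:
        mean = sum(r[j] for j in cell) / len(cell)
        for j in cell:
            out[j] = mean
    return out


# The partition {{1}, {2, 3}} written with 0-based indices.
delta = [[0], [1, 2]]
r = [1.0, 2.0, 4.0]

print(in_symmetrized_subspace(r, delta))
print(project(r, delta))
```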


Variational Inference with Parameter Tying

Given a parameter-tying partition $\Delta$ and $\theta \in \Theta_\Delta$, solve

$$\sup_{\mu \in \mathcal{M}} \langle \theta, \mu \rangle - A^*(\mu)$$

How does symmetry help? Intuitions:
• some features have the same expectations, so some $\mu_i$'s are the same.
• the optimal $\mu$ is trapped in a symmetrized subspace $\mathbb{R}^m_\varphi$ for some lifting partition $\varphi$
• this generalizes to general convex optimization problems
• How to find $\varphi$?


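Lifted Variational Inference Framework

[Diagram: the parameter-tying partition $\Delta$ and the symmetry of the family $\mathrm{Aut}[F]$ determine the lifting group $\mathrm{Aut}_\Delta[F]$, whose feature orbit partition is the lifting partition $\varphi$. Intersecting the mean parameter space $\mathcal{M}$ with the symmetrized subspace $\mathbb{R}^m_\varphi$ gives the lifted mean parameter space $\mathcal{M}_\varphi$.]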


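Lifted Variational Inference Framework (cont.)

Theorem

Marginal inference:

$$\sup_{\mu \in \mathcal{M}} \langle \theta, \mu \rangle - A^*(\mu) = \sup_{\mu \in \mathcal{M}_\varphi} \langle \theta, \mu \rangle - A^*(\mu)$$

MAP inference:

$$\sup_{\mu \in \mathcal{M}} \langle \theta, \mu \rangle = \sup_{\mu \in \mathcal{M}_\varphi} \langle \theta, \mu \rangle$$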

Lifted Marginal Polytope $\mathcal{M}_\varphi$

[Figure: a configuration orbit and its centroid.]

• $\mathcal{M}$ projected onto $\mathbb{R}^m_\varphi$ = $\mathcal{M}$ intersected with $\mathbb{R}^m_\varphi$
• Projection view:
  • Ground configurations in the same orbit project to the same point, which is the centroid of the orbit
  • num. extreme points ≤ num. configuration orbits (typically still exponential)
  • For complete symmetric graphical models with N binary variables: num. configuration orbits = N + 1
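
The N + 1 count can be confirmed by brute force for small N. The sketch below assumes the symmetry group is all of S_N acting on binary configurations, and computes orbit centroids for the special case of unary features Φ(x) = x; the function names are illustrative.

```python
from itertools import permutations, product


def configuration_orbits(n):
    """Orbits of {0,1}^n under every permutation of the n variables (brute force)."""
    seen = set()
    orbits = []
    for x in product((0, 1), repeat=n):
        if x in seen:
            continue
        orbit = {tuple(x[p[j]] for j in range(n)) for p in permutations(range(n))}
        seen |= orbit
        orbits.append(orbit)
    return orbits


def centroid(orbit):
    """Coordinate-wise mean of the configurations in one orbit."""
    size = len(orbit)
    n = len(next(iter(orbit)))
    return [sum(x[j] for x in orbit) / size for j in range(n)]


orbits = configuration_orbits(4)
print(len(orbits))  # one orbit per possible number of ones
one_hot = next(o for o in orbits if (1, 0, 0, 0) in o)
print(centroid(one_hot))
```

Each orbit collects the configurations with a fixed number of ones, so the 2^N ground configurations collapse to N + 1 orbit centroids.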


Lifted Marginal Polytope $\mathcal{M}_\varphi$

• Intersection view
  • Let $\rho$ be the orbit mapping function $\rho : i \mapsto \mathrm{orb}(i)$
  • $\mathcal{M}$ is described by a system of constraints $T_1(\mu), \ldots, T_K(\mu)$
  • Intersecting $T_k$ with $\mathbb{R}^m_\varphi$ means substituting $\mu_i$ by $\bar{\mu}_{\rho(i)}$
  • num. variables reduced to num. feature orbits
  • num. constraints unchanged
  • However, many constraints become redundant and can be removed
• This view is useful in practice for lifting outer bounds of $\mathcal{M}$
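
The substitution step can be sketched as follows, assuming linear constraints stored as (coefficients, sense, right-hand side) triples and an orbit map given as a dictionary; this representation and the names `lift_constraints`, `ground`, and `rho` are hypothetical. The sketch only removes constraints that become literally identical after substitution, which is the simplest form of the redundancy mentioned above.

```python
def lift_constraints(constraints, rho):
    """Substitute each ground variable i by its orbit rho[i] and drop exact duplicates.

    Each constraint is (coeffs, sense, rhs), where coeffs maps variable -> coefficient.
    """
    lifted = set()
    for coeffs, sense, rhs in constraints:
        merged = {}
        for var, c in coeffs.items():
            merged[rho[var]] = merged.get(rho[var], 0) + c
        lifted.add((tuple(sorted(merged.items())), sense, rhs))
    return sorted(lifted)


# Three interchangeable binary nodes, each with a normalization constraint.
names = ("a", "b", "c")
ground = [({(v, 0): 1, (v, 1): 1}, "==", 1) for v in names]
# Assumed orbit map: all nodes lie in a single node orbit "n".
rho = {(v, s): f"n:{s}" for v in names for s in (0, 1)}

lifted = lift_constraints(ground, rho)
print(len(ground), len(lifted))
print(lifted)
```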


Lifted Approximate MAP Inference

Theorem (Lifted Relaxed MAP). If OUTER = OUTER(G) depends only on the graphical model structure G, then

$$\sup_{\mu \in \mathrm{OUTER}} \langle \theta, \mu \rangle = \sup_{\mu \in \mathrm{OUTER}_\varphi} \langle \theta, \mu \rangle$$

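Lifted MAP on LOCAL

$$
\mathrm{LOCAL} = \left\{ \tau \ge 0 \;\middle|\;
\begin{array}{ll}
\tau_{v:0} + \tau_{v:1} = 1 & \forall v \in V(G) \\
\tau_{\{u:0,v:0\}} + \tau_{\{u:0,v:1\}} = \tau_{u:0} & \\
\tau_{\{u:0,v:0\}} + \tau_{\{v:0,u:1\}} = \tau_{v:0} & \\
\tau_{\{u:1,v:1\}} + \tau_{\{u:0,v:1\}} = \tau_{v:1} & \\
\tau_{\{u:1,v:1\}} + \tau_{\{v:0,u:1\}} = \tau_{u:1} & \forall \{u,v\} \in E(G)
\end{array}
\right\}
$$

$$
\mathrm{LOCAL}_\varphi = \left\{ \bar{\tau} \ge 0 \;\middle|\;
\begin{array}{ll}
\bar{\tau}_{\mathbf{v}:0} + \bar{\tau}_{\mathbf{v}:1} = 1 & \forall \text{ node orbit } \mathbf{v} \\
\bar{\tau}_{\mathbf{e}:00} + \bar{\tau}_{(u,v):01} = \bar{\tau}_{\bar{u}:0} & \\
\bar{\tau}_{\mathbf{e}:00} + \bar{\tau}_{(v,u):01} = \bar{\tau}_{\bar{v}:0} & \\
\bar{\tau}_{\mathbf{e}:11} + \bar{\tau}_{(u,v):01} = \bar{\tau}_{\bar{v}:1} & \\
\bar{\tau}_{\mathbf{e}:11} + \bar{\tau}_{(v,u):01} = \bar{\tau}_{\bar{u}:1} & \forall \text{ edge orbit } \mathbf{e} \text{ with } \{u,v\} \text{ a representative of } \mathbf{e}
\end{array}
\right\}
$$

• num. constraints is O(#node orbits + #edge orbits).
• The lifted LOCAL LP can be solved by a generic solver, or by a message-passing variant, e.g., MPLP.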


Tightening Bound: Cycle Polytope

Cycle constraints (Sontag & Jaakkola 07):

$$\sum_{\{u,v\} \in F} \mathrm{nocut}(\{u,v\}, \tau) + \sum_{\{u,v\} \in C \setminus F} \mathrm{cut}(\{u,v\}, \tau) \ge 1$$

where C is a cycle in G, F ⊂ C, |F| is odd, and

$$\mathrm{cut}(\{u,v\}, \tau) = \tau_{u:0,v:1} + \tau_{u:1,v:0}$$

$$\mathrm{nocut}(\{u,v\}, \tau) = \tau_{u:0,v:0} + \tau_{u:1,v:1}$$

• num. constraints is large
• Iteratively add violated cycle constraints via cutting plane
• Run shortest path on the mirror graph of G to find a violated constraint
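
One way to sketch the separation step on the ground graph, under stated assumptions: each node v gets two copies (v, 0) and (v, 1); an edge kept on the same side carries weight cut (an edge in C \ F), and an edge crossing sides carries weight nocut (an edge in F). A path from (i, 0) to (i, 1) then crosses an odd number of times, and a total weight below 1 signals a violated cycle inequality through i. This weighting is read off the inequality above rather than taken from the slides, and the data layout (`tau` as a dict of edge tables) is hypothetical.

```python
import heapq


def cut(table):
    return table[(0, 1)] + table[(1, 0)]


def nocut(table):
    return table[(0, 0)] + table[(1, 1)]


def shortest_odd_cycle_weight(start, edges, tau):
    """Dijkstra from (start, 0) to (start, 1) on the mirror graph."""
    adj = {}
    for u, v in edges:
        c, n = cut(tau[(u, v)]), nocut(tau[(u, v)])
        for s in (0, 1):
            # Same side: the edge is counted in C \ F with weight cut.
            adj.setdefault((u, s), []).append(((v, s), c))
            adj.setdefault((v, s), []).append(((u, s), c))
            # Crossing sides: the edge is counted in F with weight nocut.
            adj.setdefault((u, s), []).append(((v, 1 - s), n))
            adj.setdefault((v, s), []).append(((u, 1 - s), n))
    dist = {(start, 0): 0.0}
    heap = [(0.0, (start, 0))]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node == (start, 1):
            return d
        for nxt, w in adj.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return float("inf")


triangle = [("a", "b"), ("b", "c"), ("a", "c")]
# Every edge always disagrees: locally consistent, but impossible on an odd cycle.
disagree = {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.0}
# Every edge always agrees: a valid joint distribution exists.
agree = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

frustrated = shortest_odd_cycle_weight("a", triangle, {e: disagree for e in triangle})
consistent = shortest_odd_cycle_weight("a", triangle, {e: agree for e in triangle})
print(frustrated < 1, consistent < 1)
```

The frustrated triangle yields a path of weight 0, so the inequality with F equal to the whole cycle is violated; the agreeing triangle yields weight 1 and no violation.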


Lifted Cycle Polytope

• Substituting ground variables $\tau$ by lifted variables $\bar{\tau}$ yields lifted cycle constraints

$$\sum_{\{u,v\} \in F} \mathrm{nocut}(\{u,v\}, \bar{\tau}) + \sum_{\{u,v\} \in C \setminus F} \mathrm{cut}(\{u,v\}, \bar{\tau}) \ge 1$$

• However
  • these constraints are still defined on cycles of the ground graphical model G.
  • many constraints are duplicates
• Can we characterize cycle constraints using the lifted graph $\bar{G}$?
  • not straightforward, since cycles in $\bar{G}$ might not correspond to cycles in G


Lifted Cycle Polytope (cont.)

• Fix a node i (mark i as distinct), and let $\bar{G}[i]$ be the new lifted graph

[Figure: node orbits of G; node orbits of G fixing i, where {i} forms its own orbit; the lifted graph $\bar{G}[i]$.]

• A cycle through {i} on $\bar{G}[i]$ always corresponds to a set of cycles through i on G.
• Violated constraints on ground cycles through i can be found by
  • forming the mirror graph of $\bar{G}[i]$
  • running shortest path from {i} to its mirror node


Lifted MAP with Cycle Constraints

Algorithm

1 constraints ← $\mathrm{LOCAL}_\varphi$
2 $\bar{\tau}$ ← solve $\max_{\bar{\tau}} \langle \bar{\tau}, \bar{\theta} \rangle$ given the current constraints
3 ∀ node orbit v, pick a representative i ∈ v. Find the shortest path from {i} to {i}-mirror on the mirror graph of $\bar{G}[i]$.
4 If shortest-path < 1, add the violated constraint to the current set of constraints.
5 If there are no more violated constraints, exit; else go to 2.


Markov Logic Networks (MLNs)

• "Lovers & Smokers" Markov Logic Network

  Weight   Formula
  100      Male(x) ⇔ ¬Female(x)
  2        Male(x) ∧ Smokes(x)
  2        Female(x) ∧ ¬Smokes(x)
  0.5      x ≠ y ∧ Male(x) ∧ Female(y) ∧ Loves(x, y)
  0.5      x ≠ y ∧ Loves(x, y) ⇒ (Smokes(x) ⇔ Smokes(y))
  −100     x ≠ y ∧ y ≠ z ∧ z ≠ x ∧ Loves(x, y) ∧ Loves(y, z) ∧ Loves(x, z)

• How to find the lifting partition?
  • Ground the model, construct the colored factor graph, and run a graph automorphism tool (e.g. nauty)
  • Can we do this without grounding?


Renaming Symmetry of MLNs

• Let $D^*$ be the set of individuals that do not appear as constants in the MLN.
• Intuition: every permutation (renaming) of individuals in $D^*$ preserves the probabilistic model.

Theorem. The renaming group $S(D^*)$ is (isomorphic to) a subgroup of the MLN's lifting group.

• The number of orbits induced by this group does not depend on the domain size |D|.
• Example: one observed constant a
  • The renaming group induces 5 orbits for the Loves predicate: {L(a, a)}, {L(x, x)}, {L(a, x)}, {L(x, a)}, {L(x, y)}, with x, y ≠ a and x ≠ y
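
The five-orbit example can be reproduced by brute force for small domains. The sketch below assumes ground atoms of a binary predicate are represented as pairs of individuals and that renamings are all permutations of the non-constant individuals; the function name and the individual names are hypothetical.

```python
from itertools import permutations, product


def renaming_orbits(domain, constants):
    """Orbits of ground atoms L(x, y) under renamings that fix every constant."""
    free = [d for d in domain if d not in constants]
    seen = set()
    orbits = []
    for atom in product(domain, repeat=2):
        if atom in seen:
            continue
        orbit = set()
        for image in permutations(free):
            rename = dict(zip(free, image))
            # Constants are absent from `rename`, so they map to themselves.
            orbit.add(tuple(rename.get(t, t) for t in atom))
        seen |= orbit
        orbits.append(sorted(orbit))
    return orbits


# One observed constant "a" and a growing number of anonymous individuals.
for size in (3, 4, 5):
    domain = ["a"] + [f"p{k}" for k in range(size - 1)]
    print(size, len(renaming_orbits(domain, {"a"})))
```

Once the domain holds at least two individuals besides a, the orbit count stays at five however many further individuals are added.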


Experiment

Finding symmetry: renaming group versus graph automorphism (nauty)

               Nauty                  Renaming
Domain size    #Orbits   Time (s)     #Orbits   Time (s)
10             12        0.49         12        0.08
20             23        1.77         23        0.09
50             25        172.79       80        0.221
100            27        9680.48      255       0.4
200            *         *            905       0.84
1000           *         *            20505     2.19


Experiment (cont.)

Lifted MAP (LOCAL, CYCLE) inference versus ground counterparts

[Figure: (a) Runtime vs. domain size, with time in seconds on a log scale; (b) Objective over time for domain size 5. Legend: True Optimal, Local, Cycle, Renaming-Local, Renaming-Cycle.]

"Lovers & Smokers" MLN without evidence, varying domain size


Experiment (cont.)

Lifted CYCLE versus lifted LOCAL

[Figure: two panels plotting time in seconds (log scale) and objective against the number of observed constants. Legend: Renaming-Local, Renaming-Cycle.]

"Lovers & Smokers" MLN with varying amount of soft evidence, fixing domain size to 100

Conclusion and Future Direction

• Summary:
  • First rigorous formalization of symmetry in graphical models
  • A general framework for lifting convex outer bound variational approximations
  • First lifted algorithm that works with a bound tighter than the local polytope
• Related and future work:
  • Exploit symmetry of the log-partition function A (Bui et al., AAAI'12)
  • Exploit symmetry in sampling methods (Niepert UAI'12, AAAI'13)
  • Lift convex variational marginal inference
