Automorphism Groups of Graphical Models and Lifted Variational Inference
Hung Hai Bui (1), Tuyen N. Huynh (2), Sebastian Riedel (3)
(1) Laboratory for Natural Language Understanding, Nuance Communications
(2) AI Center, SRI International
(3) Department of Computer Science, University College London
July 14, 2013
UAI 2013
Motivations
• Probabilistic inference: exploit low tree-width and local graph sparsity
• What about the above graphical models?
• Inference can be done efficiently using lifted inference.
• Why? “Symmetry” is essential.
• What is symmetry?
• How do we exploit symmetry in variational inference?
Motivations (cont.)
• Lifted inference methods are mostly derived procedurally:
  • Identify duplicated computational steps that can be performed once.
  • Not clear what form of symmetry is being exploited.
  • Hard to generalize.
• We propose a declarative approach:
  • Formalize symmetry in graphical models
  • Lift variational optimization formulations rather than inference algorithms
  • Lifted problems can be solved with the usual optimization toolbox (LP solvers, cutting plane, dual decomposition, etc.)
  • Connect lifted inference and mainstream variational approximations
Outline
1. Motivation
2. Symmetry of Graphical Models
3. A General Framework for Lifted Convex Variational Inference
4. Lifted MAP with Local and Cycle Constraints
5. Experiments
Symmetry in Graphs
• Symmetry is formalized as the set of transformations that preserve the object.
• The set of all such transformations (permutations) π forms the automorphism group.
[Figure: four example graphs labeled with their automorphism groups: (a) S5, (b) S4 × S3, (c) D5, (d) 1 (the trivial group)]
• The automorphism group partitions the set of nodes into orbits.
• A node orbit is a set of nodes that are equivalent up to symmetry.
• Similarly for edge orbits and configuration (node-value assignment) orbits.
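As a minimal sketch of these definitions (not taken from the slides), the automorphism group of a small graph can be enumerated by brute force and the node orbits read off from it. The function names `automorphisms` and `node_orbits` and the two example graphs are illustrative assumptions; trying all n! permutations is only feasible for tiny graphs.

```python
from itertools import permutations


def automorphisms(n, edges):
    """Return every permutation of range(n) that maps the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    group = []
    for perm in permutations(range(n)):
        image = {frozenset((perm[u], perm[v])) for u, v in edges}
        if image == edge_set:
            group.append(perm)
    return group


def node_orbits(n, group):
    """Nodes u and v share an orbit if some group element maps u to v."""
    orbits = {frozenset(perm[v] for perm in group) for v in range(n)}
    return sorted(sorted(orbit) for orbit in orbits)


# Hypothetical examples: a 5-cycle and a path on 3 nodes.
cycle5 = [(i, (i + 1) % 5) for i in range(5)]
path3 = [(0, 1), (1, 2)]

cycle_group = automorphisms(5, cycle5)
path_group = automorphisms(3, path3)
print(len(cycle_group), node_orbits(5, cycle_group))
print(len(path_group), node_orbits(3, path_group))
```

For the 5-cycle the enumeration finds the 10 elements of the dihedral group D5 and a single node orbit; the 3-node path has two node orbits, the endpoints and the middle node.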
Symmetry of Graphical Models
• Exponential family
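  $F(x \mid \theta) = \exp\left(\langle \Phi(x), \theta \rangle - A(\theta)\right), \quad x \in \mathbb{R}^n,\ \theta \in \mathbb{R}^m$
• Automorphisms are permutations of variables and features preserving Φ.
• Formally, a pair $(\pi, \gamma) \in S_n \times S_m$ such that
  $\Phi^{\gamma^{-1}}(x^{\pi}) \equiv \Phi(x)$, i.e., $\Phi_i(x^{\pi}) \equiv \Phi_{\gamma(i)}(x)$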
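The componentwise condition can be checked exhaustively on a toy model. The sketch below is an illustration under stated assumptions, not the authors' code: it uses binary variables, two hypothetical pairwise features on a 3-node chain, and the convention that x^π has entries (x^π)_j = x_{π(j)}. The helper name `is_automorphism` is made up for this example.

```python
from itertools import product


def is_automorphism(features, n, pi, gamma):
    """Check Phi_i(x^pi) == Phi_{gamma(i)}(x) for every binary x and every feature i.

    Assumed convention: (x^pi)_j = x_{pi(j)}.
    """
    for x in product((0, 1), repeat=n):
        x_pi = tuple(x[pi[j]] for j in range(n))
        for i, phi in enumerate(features):
            if phi(x_pi) != features[gamma[i]](x):
                return False
    return True


# Hypothetical chain 0 - 1 - 2 with one feature per edge.
features = [lambda x: x[0] * x[1], lambda x: x[1] * x[2]]
swap_ends = (2, 1, 0)  # pi exchanges variables 0 and 2

ok = is_automorphism(features, 3, swap_ends, (1, 0))   # gamma swaps the two features
bad = is_automorphism(features, 3, swap_ends, (0, 1))  # gamma left as the identity
print(ok, bad)
```

Swapping the end variables is an automorphism only when the two edge features are swapped along with it, which is why the variable permutation π and the feature permutation γ are paired.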
Symmetry of Graphical Models (cont.)
[Figure: a factor graph G with variable nodes 1 to 4 and factor nodes f1 to f5, shown in three panels: G, Colored G, and Orbits of G]
• Automorphisms of Colored G are automorphisms of F
• So far we've ignored parameters. If parameters are tied:
  • Refine the colors of features to be consistent with the parameter tying
  • Similarly, compute the automorphism group consistent with the parameter tying
Properties of Exponential Family Automorphisms
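Theorem
If $(\pi, \gamma) \in \mathrm{Aut}[F]$ then
• $\Pr(x^{\pi} \mid \theta^{\gamma}) = \Pr(x \mid \theta)$
• π is an automorphism of the structure graph $G[F]$.
• $\Theta^{\gamma} = \Theta$ and $A(\theta^{\gamma}) = A(\theta)$
• $M^{\gamma} = M$ and $A^*(\mu^{\gamma}) = A^*(\mu)$
• $m^{\gamma}(\theta) = m(\theta^{\gamma})$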
Proof tools: theory of group actions, orbit-stabilizer theorem
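Partitions and Symmetrized Subspaces
A partition Δ of $\{1, \dots, m\}$ induces a symmetrized subspace $\mathbb{R}^m_\Delta$ of points $r \in \mathbb{R}^m$ such that if $j$ and $j'$ are in the same cell of Δ, then $r_j = r_{j'}$.
[Figure: the subspace $\mathbb{R}^3_\Delta$ for $\Delta = \{\{1\}, \{2, 3\}\}$, drawn against coordinate axes 1, 2, 3]
$S_\Delta$ denotes the intersection of a set $S$ with the subspace: $S_\Delta = S \cap \mathbb{R}^m_\Delta$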
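A small sketch may make the symmetrized subspace concrete. It assumes 0-based indices (so the slide's example partition becomes `[[0], [1, 2]]`), and the helper names are hypothetical. Averaging the coordinates within each cell is the orthogonal projection onto the subspace.

```python
def in_symmetrized_subspace(r, partition, tol=1e-12):
    """True if all coordinates of r within each cell of the partition are equal."""
    return all(abs(r[j] - r[cell[0]]) <= tol for cell in partition for j in cell)


def symmetrize(r, partition):
    """Project r onto the symmetrized subspace by averaging within each cell."""
    out = list(r)
    for cell in partition:
        mean = sum(r[j] for j in cell) / len(cell)
        for j in cell:
            out[j] = mean
    return out


delta = [[0], [1, 2]]  # the partition {{1}, {2, 3}} written with 0-based indices
point = [1.0, 2.0, 4.0]
projected = symmetrize(point, delta)
print(in_symmetrized_subspace(point, delta), projected)
```

Here the point (1, 2, 4) lies outside the subspace, and its projection ties the second and third coordinates to their average.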
Variational Inference with Parameter Tying
Given a parameter-tying partition Δ and $\theta \in \Theta_\Delta$, solve
$$\sup_{\mu \in M} \ \langle \theta, \mu \rangle - A^*(\mu)$$
How does symmetry help? Intuitions:
• Some features have the same expectations, so some $\mu_i$'s are the same.
• The optimal µ is trapped in a symmetrized subspace $\mathbb{R}^m_\varphi$ for some lifting partition ϕ.
• This generalizes to general convex optimization problems.
• How to find ϕ?
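Lifted Variational Inference Framework
[Diagram: the quantities involved in lifting and their symbols]
• Parameter-tying: Δ
• Symmetry of the family: $\mathrm{Aut}[F]$
• Lifting group: $\mathrm{Aut}_\Delta[F]$
• Lifting partition (feature orbit partition): ϕ
• Mean parameter space: $M$
• Symmetrized subspace: $\mathbb{R}^m_\varphi$
• Lifted mean parameter space: $M_\varphi$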
Lifted Variational Inference Framework (cont.)
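Theorem
Marginal inference
$$\sup_{\mu \in M} \ \langle \theta, \mu \rangle - A^*(\mu) \;=\; \sup_{\mu \in M_\varphi} \ \langle \theta, \mu \rangle - A^*(\mu)$$
MAP inference
$$\sup_{\mu \in M} \ \langle \theta, \mu \rangle \;=\; \sup_{\mu \in M_\varphi} \ \langle \theta, \mu \rangle$$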
Lifted Marginal Polytope $M_\varphi$
[Figure: the polytope $M_\varphi$, with a configuration orbit and its centroid marked]
• $M$ projected onto $\mathbb{R}^m_\varphi$ = $M$ intersected with $\mathbb{R}^m_\varphi$
• Projection view:
  • Ground configurations in the same orbit project to the same point, which is the centroid of the orbit
  • num. extreme points ≤ num. of configuration orbits (typically still exponential)
  • For complete symmetric graphical models with N binary variables: num. configuration orbits = N + 1
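The N + 1 count can be reproduced by brute force for small N. The sketch below reads "complete symmetric" as invariance under every permutation of the N variables, which is an assumption about the slide's intent. It also takes the centroid over the configurations themselves rather than over a full feature vector. Both function names are illustrative.

```python
from itertools import permutations, product


def configuration_orbits(n):
    """Orbits of {0,1}^n under all permutations of the n variables."""
    perms = list(permutations(range(n)))
    orbits = set()
    for x in product((0, 1), repeat=n):
        orbits.add(frozenset(tuple(x[p[j]] for j in range(n)) for p in perms))
    return orbits


def centroid(orbit):
    """Coordinate-wise mean of the configurations in one orbit."""
    size = len(orbit)
    n = len(next(iter(orbit)))
    return [sum(x[j] for x in orbit) / size for j in range(n)]


orbits4 = configuration_orbits(4)
one_hot_orbit = next(o for o in orbits4 if (1, 0, 0, 0) in o)
print(len(orbits4), centroid(one_hot_orbit))
```

Each orbit collects the configurations with a given number of ones, so 16 ground configurations collapse to 5 orbits for N = 4, and every configuration in an orbit shares the same centroid.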
Lifted Marginal Polytope Mϕ
• Intersection view
  • Let ρ be the orbit mapping function $\rho : i \mapsto \mathrm{orb}(i)$
  • $M$ is described by a system of constraints $T_1(\mu), \dots, T_K(\mu)$
  • Intersecting $T_k$ with $\mathbb{R}^m_\varphi$ means substituting $\mu_i$ by $\bar{\mu}_{\rho(i)}$
  • num. variables reduced to num. feature orbits
  • num. constraints unchanged
  • However, many constraints become redundant and can be removed
• This view is useful in practice for lifting outer bounds of $M$
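The substitution step can be sketched for linear inequality constraints as follows. This is a simplified illustration under assumptions, not the authors' implementation: constraints are stored as hypothetical `(coeffs, rhs)` pairs meaning sum_i coeffs[i] * mu[i] <= rhs, `rho` maps each ground index to an orbit index, and only exact duplicates are removed (other redundant constraints would need an LP-based check).

```python
def lift_constraints(constraints, rho, num_orbits):
    """Substitute mu_i by mubar_{rho(i)} in each constraint and drop exact duplicates."""
    lifted = []
    seen = set()
    for coeffs, rhs in constraints:
        row = [0.0] * num_orbits
        for i, c in enumerate(coeffs):
            row[rho[i]] += c  # coefficients of variables in one orbit add up
        key = (tuple(row), rhs)
        if key not in seen:
            seen.add(key)
            lifted.append((row, rhs))
    return lifted


# Hypothetical ground system over three variables that all lie in one orbit.
ground = [
    ([1, 0, 0], 1),  # mu_0 <= 1
    ([0, 1, 0], 1),  # mu_1 <= 1
    ([0, 0, 1], 1),  # mu_2 <= 1
    ([1, 1, 1], 2),  # mu_0 + mu_1 + mu_2 <= 2
]
rho = [0, 0, 0]
lifted = lift_constraints(ground, rho, num_orbits=1)
print(lifted)
```

The three per-variable bounds collapse into a single lifted constraint, while the sum constraint becomes a bound on three times the shared lifted variable.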
Lifted Approximate MAP Inference
Theorem (Lifted Relaxed MAP)
If OUTER = OUTER(G) depends only on the graphical model structure G, then
$$\sup_{\mu \in \mathrm{OUTER}} \langle \theta, \mu \rangle \;=\; \sup_{\mu \in \mathrm{OUTER}_\varphi} \langle \theta, \mu \rangle$$
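Lifted MAP on LOCAL
$$\mathrm{LOCAL} = \left\{ \tau \ge 0 \;\middle|\; \begin{array}{ll}
\tau_{v:0} + \tau_{v:1} = 1 & \forall v \in V(G) \\
\tau_{\{u:0,v:0\}} + \tau_{\{u:0,v:1\}} = \tau_{u:0} & \\
\tau_{\{u:0,v:0\}} + \tau_{\{v:0,u:1\}} = \tau_{v:0} & \\
\tau_{\{u:1,v:1\}} + \tau_{\{u:0,v:1\}} = \tau_{v:1} & \\
\tau_{\{u:1,v:1\}} + \tau_{\{v:0,u:1\}} = \tau_{u:1} & \forall \{u, v\} \in E(G)
\end{array} \right\}$$
$$\mathrm{LOCAL}_\varphi = \left\{ \bar{\tau} \ge 0 \;\middle|\; \begin{array}{ll}
\bar{\tau}_{\mathbf{v}:0} + \bar{\tau}_{\mathbf{v}:1} = 1 & \forall \text{ node orbit } \mathbf{v} \\
\bar{\tau}_{\mathbf{e}:00} + \bar{\tau}_{(u,v):01} = \bar{\tau}_{\bar{u}:0} & \\
\bar{\tau}_{\mathbf{e}:00} + \bar{\tau}_{(v,u):01} = \bar{\tau}_{\bar{v}:0} & \\
\bar{\tau}_{\mathbf{e}:11} + \bar{\tau}_{(u,v):01} = \bar{\tau}_{\bar{v}:1} & \\
\bar{\tau}_{\mathbf{e}:11} + \bar{\tau}_{(v,u):01} = \bar{\tau}_{\bar{u}:1} & \forall \text{ edge orbit } \mathbf{e} \text{ with } \{u, v\} \text{ a representative of } \mathbf{e}
\end{array} \right\}$$
• num. constraints is O(#node orbits + #edge orbits).
• The lifted LOCAL LP can be solved by a generic solver, or by a message-passing variant, e.g., MPLP.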
Tightening Bound: Cycle Polytope
Cycle constraints (Sontag & Jaakkola 07):
$$\sum_{\{u,v\} \in F} \mathrm{nocut}(\{u,v\}, \tau) + \sum_{\{u,v\} \in C \setminus F} \mathrm{cut}(\{u,v\}, \tau) \ge 1$$
where $C$ is a cycle in $G$, $F \subset C$, $|F|$ is odd, and
$$\mathrm{cut}(\{u,v\}, \tau) = \tau_{u:0,v:1} + \tau_{u:1,v:0}, \qquad \mathrm{nocut}(\{u,v\}, \tau) = \tau_{u:0,v:0} + \tau_{u:1,v:1}$$
• num. constraints is large
• Iteratively add violated cycle constraints via cutting plane
• Run shortest path on the mirror graph of G to find a violated constraint
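To illustrate what a violated cycle constraint looks like, the sketch below evaluates the left-hand side of the inequality for a given cycle C and odd subset F. It is a direct evaluation only, not the shortest-path separation routine; the data layout (`tau[(u, v)][(x_u, x_v)]`) and the function names are assumptions made for this example.

```python
def cut(tau, u, v):
    """Pseudo-marginal mass on u and v taking different values."""
    return tau[(u, v)][(0, 1)] + tau[(u, v)][(1, 0)]


def nocut(tau, u, v):
    """Pseudo-marginal mass on u and v taking the same value."""
    return tau[(u, v)][(0, 0)] + tau[(u, v)][(1, 1)]


def cycle_lhs(tau, cycle, F):
    """Left-hand side of the cycle inequality for edge list `cycle` and odd subset F."""
    F = set(F)
    if len(F) % 2 != 1 or not F <= set(cycle):
        raise ValueError("F must be an odd-sized subset of the cycle's edges")
    return sum(nocut(tau, u, v) if (u, v) in F else cut(tau, u, v) for u, v in cycle)


# Hypothetical triangle whose edges all put their mass on disagreement.
# With node marginals of 0.5 this is locally consistent, yet no joint
# distribution can make all three edges of an odd cycle disagree.
triangle = [(0, 1), (1, 2), (0, 2)]
always_cut = {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.0}
frustrated = {edge: always_cut for edge in triangle}

lhs = cycle_lhs(frustrated, triangle, triangle)  # F = all three edges
print(lhs, lhs >= 1)
```

Taking F to be the whole triangle gives a left-hand side of 0, so this point of the local polytope is cut off by the cycle constraint.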
Lifted Cycle Polytope
• Substituting ground variables $\tau$ by lifted variables $\bar{\tau}$ yields lifted cycle constraints
$$\sum_{\{u,v\} \in F} \mathrm{nocut}(\{u,v\}, \bar{\tau}) + \sum_{\{u,v\} \in C \setminus F} \mathrm{cut}(\{u,v\}, \bar{\tau}) \ge 1$$
• However
  • these constraints are still defined on cycles of the ground graphical model $G$.
  • many constraints are duplicates
• Can we characterize cycle constraints using the lifted graph $\bar{G}$?
  • not straightforward, since cycles in $\bar{G}$ might not correspond to cycles in $G$
Lifted Cycle Polytope (cont.)
• Fix a node i (mark i as distinct), and let $\bar{G}[i]$ be the new lifted graph
[Figure: node orbits of G, node orbits of G fixing i (with {i} as its own orbit), and the lifted graph $\bar{G}[i]$]
• A cycle through {i} on $\bar{G}[i]$ always corresponds to a set of cycles through i on G.
• Violated constraints on ground cycles through i can be found by
  • forming the mirror graph of $\bar{G}[i]$
  • running shortest path from {i} to its mirror node
Lifted MAP with Cycle Constraints
Algorithm
1. constraints ← $\mathrm{LOCAL}_\varphi$
2. $\bar{\tau}$ ← solve $\max_{\bar{\tau}} \langle \bar{\tau}, \bar{\theta} \rangle$ given the current constraints
3. ∀ node orbit v, pick a representative i ∈ v. Find the shortest path from {i} to {i}-mirror on the mirror graph of $\bar{G}[i]$.
4. If shortest-path < 1, add the violated constraint to the current set of constraints.
5. If there are no more violated constraints, exit; else go to 2.
Markov Logic Networks (MLNs)
• “Lovers & Smokers” Markov Logic Network

  Weight   Formula
  100      Male(x) ⇔ ¬Female(x)
  2        Male(x) ∧ Smokes(x)
  2        Female(x) ∧ ¬Smokes(x)
  0.5      x ≠ y ∧ Male(x) ∧ Female(y) ∧ Loves(x, y)
  0.5      x ≠ y ∧ Loves(x, y) ⇒ (Smokes(x) ⇔ Smokes(y))
  −100     x ≠ y ∧ y ≠ z ∧ z ≠ x ∧ Loves(x, y) ∧ Loves(y, z) ∧ Loves(x, z)

• How to find the lifting partition?
  • Ground the model, construct the colored factor graph, and run a graph automorphism tool (e.g. nauty)
  • Can we do this without grounding?
Renaming Symmetry of MLNs
• Let $D^*$ be the set of individuals that do not appear as constants in the MLN.
• Intuition: every permutation (renaming) of individuals in $D^*$ preserves the probabilistic model.
Theorem
The renaming group $S(D^*)$ is (isomorphic to) a subgroup of the MLN's lifting group
• The number of orbits induced by this group does not depend on domain size |D|.
• Example: one observed constant a
  • The renaming group induces 5 orbits for the Loves predicate: {L(a, a)}, {L(x, x)}, {L(a, x)}, {L(x, a)}, {L(x, y)}, where x, y ≠ a and x ≠ y
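The five-orbit example can be reproduced by brute force on small domains. In this sketch the function name `loves_orbits` and the individual names are illustrative assumptions. It renames the non-constant individuals in every possible way and collects the orbits of the ground atoms Loves(s, t), represented as pairs (s, t).

```python
from itertools import permutations, product


def loves_orbits(constants, others):
    """Orbits of ground Loves(s, t) atoms under all renamings of `others`.

    Constants are held fixed; every permutation of the remaining individuals is applied.
    """
    domain = list(constants) + list(others)
    renamings = []
    for image in permutations(others):
        rename = {c: c for c in constants}
        rename.update(zip(others, image))
        renamings.append(rename)
    orbits = set()
    for s, t in product(domain, repeat=2):
        orbits.add(frozenset((r[s], r[t]) for r in renamings))
    return orbits


small = loves_orbits(["a"], ["b", "c"])
larger = loves_orbits(["a"], ["b", "c", "d", "e"])
print(len(small), len(larger))
```

With one constant and at least two other individuals, the count stays at five as the domain grows, matching the orbits listed on the slide.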
Experiment
Finding symmetry: renaming group versus graph automorphism (nauty)
                 Nauty                  Renaming
Domain size   #Orbits   Time (s)     #Orbits   Time (s)
10            12        0.49         12        0.08
20            23        1.77         23        0.09
50            25        172.79       80        0.221
100           27        9680.48      255       0.4
200           *         *            905       0.84
1000          *         *            20505     2.19
Experiment (cont.)
Lifted MAP (LOCAL, CYCLE) inference versus ground counterparts
[Figure: (a) Runtime vs. domain size, plotting Time (s) on a log scale against Domain size; (b) Objective over time for domain size 5, plotting Objective against Time (s). Legend: True Optimal, Local, Cycle, Renaming-Local, Renaming-Cycle]
“Lovers & Smokers” MLN without evidence, varying domain size
Experiment (cont.)
Lifted CYCLE versus lifted LOCAL
[Figure: two panels plotting Time (s) on a log scale and Objective against Number of observed constants. Legend: Renaming-Local, Renaming-Cycle]
“Lovers & Smokers” MLN with varying amount of soft evidence, fixing domain size to 100
Conclusion and Future Direction
• Summary:
  • First rigorous formalization of symmetry in graphical models
  • A general framework for lifting convex outer bound variational approximations
  • First lifted algorithm that works with a bound tighter than the local polytope
• Related and Future work:
  • Exploit symmetry of the log-partition function A (Bui et al., AAAI’12)
  • Exploit symmetry in sampling methods (Niepert UAI’12, AAAI’13)
  • Lift convex variational marginal inference