Compact Part-Based Image Representations

Marc Goessling, Yali Amit
Department of Statistics

Introduction

Geometric component

Learning compact, interpretable part-based image representations is still an unsolved task. We review various existing composition rules for binary data and introduce the max-minus-min rule. We also propose a novel sequential initialization procedure based on a process of oversimplification and correction. The experiments show that our approach leads to very intuitive models.

Composition rules

The (binary) image data $I$ is modeled through a Bernoulli distribution $P(I \mid \mu)$, where the global template $\mu(x) = \gamma(\mu_1(x), \ldots, \mu_K(x))$ is a composition of part templates $\mu_k$ which are defined on the entire image grid. Different composition rules $\gamma : [0, 1]^K \to [0, 1]$ can be considered [1-4]. We propose to use the max-minus-min rule (where $q$ specifies 'no opinion')

$$\gamma(p_1, \ldots, p_K) = q + \Bigl(\max_k p_k - q\Bigr)^{+} - \Bigl(\min_k p_k - q\Bigr)^{-}$$
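The max-minus-min rule can be sketched for a single pixel as follows. This is a minimal illustration, not the authors' code: the function name `max_minus_min` and the default `q = 0.5` are assumptions, and the positive and negative parts are read in the usual way, $(a)^+ = \max(a, 0)$ and $(a)^- = \max(-a, 0)$.

```python
def max_minus_min(probs, q=0.5):
    """Compose the part probabilities of one pixel with the max-minus-min rule.

    probs: iterable of part template values p_1, ..., p_K in [0, 1].
    q:     the 'no opinion' value (0.5 is an assumption, not from the source).
    """
    probs = list(probs)
    if not probs:
        return q  # no part present, so nobody has an opinion
    above = max(max(probs) - q, 0.0)  # positive part of (max_k p_k - q)
    below = max(q - min(probs), 0.0)  # negative part of (min_k p_k - q)
    return q + above - below
```

When all parts agree on the direction (all above or all below q), only the most extreme value survives. When they disagree, the two extreme votes partially cancel, for example `max_minus_min([0.7, 0.2])` pulls the result back toward q.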


$$\ell^\star(x) = \arg\min_k \mu_{k,t_k}(x)$$

In the M-step we update the parts by computing

$$\mu_k(x) = \frac{\sum_n \mathbf{1}\{k = k_n^\star(x) \text{ or } k = \ell_n^\star(x)\}\, \Phi_{t_{nk}}^{-1}(I_n)(x)}{\sum_n \mathbf{1}\{k = k_n^\star(x) \text{ or } k = \ell_n^\star(x)\}},$$

which is simply the average of all (back-transformed) images for which the part was responsible.
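The M-step average can be sketched per pixel as below. The sketch assumes the images have already been back-transformed into the frame of part k and flattened to pixel lists, and that the responsibility indicators are given as booleans. The function name `m_step_update` and the `fallback` argument (used where the part was never responsible, a case the poster does not discuss) are hypothetical.

```python
def m_step_update(back_images, responsible, fallback):
    """Average the back-transformed images over the pixels a part was responsible for.

    back_images[n][x]: image n, back-transformed into the part frame (assumed given).
    responsible[n][x]: True if this part was the max or the min part at pixel x.
    fallback[x]:       value kept when the part was never responsible (assumption).
    """
    updated = []
    for x in range(len(fallback)):
        total, count = 0.0, 0
        for image, mask in zip(back_images, responsible):
            if mask[x]:
                total += image[x]
                count += 1
        updated.append(total / count if count else fallback[x])
    return updated
```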

Figure 3: Parts learned from 20 examples per class. Each part is plotted at its mean location with mean orientation. For each pixel the color (red, green, blue, magenta) indicates the maximum part and the intensity visualizes the template value (white corresponding to 0, color corresponding to 1).

Synthetic experiment

We synthesize data by independently sampling each image quadrant. A quadrant is either entirely white (w.p. 1/4), entirely black (w.p. 1/4) or drawn from a symmetric Bernoulli distribution (w.p. 1/2). For a fair comparison with other models we omit the sequential initialization.
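The sampling scheme for the synthetic data can be sketched as follows. The quadrant size, the encoding (white as 0, black as 1) and the function names are assumptions made for illustration; the poster does not state them.

```python
import random


def sample_quadrant(size, rng):
    """Sample one size x size quadrant: white, black, or i.i.d. fair coin flips."""
    u = rng.random()
    if u < 0.25:
        return [[0] * size for _ in range(size)]  # entirely white (assumed 0)
    if u < 0.5:
        return [[1] * size for _ in range(size)]  # entirely black (assumed 1)
    # Symmetric Bernoulli: every pixel is 0 or 1 with probability 1/2.
    return [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]


def sample_image(size, rng):
    """Assemble a (2*size) x (2*size) binary image from four independent quadrants."""
    top_left, top_right, bottom_left, bottom_right = (
        sample_quadrant(size, rng) for _ in range(4)
    )
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom
```

A call such as `sample_image(4, random.Random(0))` returns an 8 x 8 list of 0/1 rows.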

[Plot legend, model comparison: denoising autoencoder, restricted Boltzmann machine, max-minus-min model]

[Plot legend, composition rules: log-odds sum, normalized sum, max minus min, average; x-axis: p2]

Good initializations are crucial because the learning problem is non-convex. We start with oversimplified models which try to explain the data using only very few parts. These models are then ‘corrected’ by appending residual images (difference between a training example and the model explanation) as additional parts.


Define the parts with the most extreme opinion for pixel x:


EM Learning

Sequential initialization


Given current templates $\mu_k$, the task is to find the part configuration $(\mu_{1,t_1}, \ldots, \mu_{K,t_K})$ which maximizes the likelihood of the image data $I$:

set $\mu(x) = q$
REPEAT
    find $k^\star, t^\star = \arg\max_{k,t} P(I \mid \gamma(\mu(x), \mu_{k,t}(x)))$
    update $\mu(x) = \gamma(\mu(x), \mu_{k^\star,t^\star}(x))$
UNTIL no improvement is possible anymore
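The greedy loop above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the transformed templates are assumed to be precomputed and passed in as `candidates`, each part is placed at most once, the pairwise composition uses the max-minus-min rule with `q = 0.5`, and probabilities are clipped by a hypothetical `eps` to keep the logarithm finite.

```python
import math


def compose(a, b, q=0.5):
    """Max-minus-min composition of two probabilities."""
    return q + max(max(a, b) - q, 0.0) - max(q - min(a, b), 0.0)


def log_likelihood(image, template, eps=1e-6):
    """Bernoulli log-likelihood of a flat binary image under a flat template."""
    total = 0.0
    for pixel, prob in zip(image, template):
        prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
        total += math.log(prob) if pixel else math.log(1.0 - prob)
    return total


def matching_pursuit(image, candidates, q=0.5):
    """Greedily add the transformed part that most improves the likelihood.

    candidates[k] is a list of transformed versions of part k (flat pixel lists).
    Returns the composed template, the chosen {part: transformation index} map,
    and the final log-likelihood.
    """
    current = [q] * len(image)
    best_ll = log_likelihood(image, current)
    chosen = {}
    while True:
        best = None
        for k, versions in candidates.items():
            if k in chosen:
                continue  # assumption: each part is used at most once
            for t, template in enumerate(versions):
                proposal = [compose(m, p, q) for m, p in zip(current, template)]
                ll = log_likelihood(image, proposal)
                if ll > best_ll:
                    best_ll, best = ll, (k, t, proposal)
        if best is None:
            return current, chosen, best_ll  # no improvement is possible anymore
        k, t, current = best
        chosen[k] = t
```

Because every accepted step strictly increases the likelihood and removes one part from the pool, the loop stops after at most K iterations.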


We train models with up to 4 parts on the letter classes from the TiCC handwritten characters dataset [5].

Inference: Likelihood matching pursuit

$$k^\star(x) = \arg\max_k \mu_{k,t_k}(x),$$

The max-minus-min rule reduces redundancy (since only the most extreme template votes) and encourages vote abstention (because opposing opinions are penalized strongly).

[Plot legend, composition rules: noisy OR, odds sum, maximum]

The spatial arrangement of the parts is modeled as a joint Gaussian distribution on locations and orientations.
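A joint Gaussian on the part arrangement can be fitted and sampled as sketched below. The layout of a configuration vector (for instance x, y and orientation of each part, concatenated), the treatment of orientations as ordinary real numbers, and the small `jitter` added for numerical stability are all assumptions of this sketch; the function names are hypothetical.

```python
import math
import random


def fit_gaussian(configs):
    """Estimate mean and (maximum-likelihood) covariance of configuration vectors."""
    n, d = len(configs), len(configs[0])
    mean = [sum(c[j] for c in configs) / n for j in range(d)]
    cov = [
        [sum((c[i] - mean[i]) * (c[j] - mean[j]) for c in configs) / n for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


def cholesky(cov, jitter=1e-9):
    """Lower-triangular factor of cov; jitter guards against singular estimates."""
    d = len(cov)
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(max(cov[i][i] - s, 0.0) + jitter)
            else:
                lower[i][j] = (cov[i][j] - s) / lower[j][j]
    return lower


def sample_configuration(mean, cov, rng):
    """Draw one configuration: mean + L z with z standard normal."""
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(lower[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]
```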


Handwritten letters

[Plot x-axis: training samples, 20 to 200]

Figure 4: Left: Initialization (1st row) and learned parts after 1, 2 and 5 EM iterations (2nd-4th row) for the max-minus-min model trained on 100 examples. Right: Cross-entropy reconstruction error for different models and various training sizes (lower is better). The dashed black line is the cross-entropy of the ground-truth model.

Figure 1: Top: Asymmetric and symmetric composition rules, as a function of p2 for p1 = 0.7. Bottom: Compositions of two parts using the different rules (dark means higher probability). The probabilities in the first template are 0.5 and 0.7, the probabilities in the second template are 0.7 and 0.01.

Part transformations

Explicitly modeling shifts and rotations allows parameters to be shared among all transformed versions $\mu_{k,t} = \Phi_t(\mu_k)$ of the part template $\mu_k$.
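A transformation $\Phi_t$ combining a rotation and a shift can be sketched with nearest-neighbour resampling. The parameterization `(dy, dx, angle)`, the order (rotate about the centre, then shift) and the use of `fill = 0.5` as the 'no opinion' value outside the template are assumptions for illustration only.

```python
import math


def transform(template, dy, dx, angle, fill=0.5):
    """Rotate `template` by `angle` (radians) about its centre, then shift by (dy, dx).

    template: list of rows of probabilities. Pixels that map outside the
    template receive `fill`.
    """
    height, width = len(template), len(template[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    result = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map the output pixel back into the template frame (inverse transform).
            u, v = y - dy - cy, x - dx - cx
            src_y = round(cos_a * u + sin_a * v + cy)
            src_x = round(-sin_a * u + cos_a * v + cx)
            if 0 <= src_y < height and 0 <= src_x < width:
                result[y][x] = template[src_y][src_x]
    return result
```

For a pure shift, the back-transformation used in learning is the shift by the negated offsets, `transform(image, -dy, -dx, 0.0)`. All transformed versions are generated from the same stored template, which is what allows the parameters to be shared.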

http://galton.uchicago.edu/~goessling/

Figure 2: Learning a part model for the letter T. 1st row: The 10 examples used for training. 2nd & 3rd row: Online learning of two parts. Shown are the two templates at step i = 1, . . . , 10 (blue corresponding to 0, yellow corresponding to 1). 4th row: Sampled part configurations using a multivariate Gaussian distribution on the spatial arrangement of the parts.

References

[1] E. Saund. A multiple cause mixture model for unsupervised learning. Neural Computation, 1995.
[2] P. Dayan and R. S. Zemel. Competition and multiple cause models. Neural Computation, 1995.
[3] Y. Amit and A. Trouvé. Pop: Patchwork of parts models for object recognition. IJCV, 2007.
[4] J. Lücke and M. Sahani. Maximal causes for non-linear component extraction. JMLR, 2008.
[5] L. van der Maaten. A new benchmark dataset for handwritten character recognition. TiCC, 2009.

http://galton.uchicago.edu/~amit/
