See the Difference: Direct Pre-Image Reconstruction and Pose Estimation by Differentiating HOG

Wei-Chen Chiu and Mario Fritz
Max Planck Institute for Informatics, Saarbrücken, Germany
{walon, mfritz}@mpi-inf.mpg.de

Motivation
• The HOG descriptor [3] has led to many advances in computer vision and is still part of many state-of-the-art methods, e.g. [2, 4].
• HOG is only defined as a feed-forward computation and introduces an information bottleneck; approximations of HOG and sampling approaches have been proposed to circumvent its non-invertibility [1, 8, 9].
• We realize that the feature computation of HOG is piecewise differentiable, which facilitates differentiable vision pipelines that include HOG descriptors.

Contributions
• Exploitation of the piecewise differentiability of the HOG feature representation.
• Enable inverting vision pipelines which build on HOG by optimizing the input given a desired output.
• We exemplify two use cases:
  – end-to-end optimization of pre-image reconstruction
  – end-to-end optimization of pose estimation

Proposed Method

[Figure: the HOG pipeline as differentiable operations — gray-level image Igray (h×w) → gradient magnitude ‖∇‖ and orientation Θ → orientation filters {f_b^o}_{b=1···B} (histogram binning as spatial filtering) → per-bin maps F_b^o → spatial filter f^s → F_b^s → HOG vector v = [v1, v2, v3, ···] → contrast normalization v/(‖v‖+ε).]

Gradients Computation:
• If a color image I ∈ R^{w×h×3} is given, transform it into gray-level:
  Igray = I(:,:,0)·0.299 + I(:,:,1)·0.587 + I(:,:,2)·0.114                      (1)
• With gradient maps Gx and Gy in the horizontal and vertical directions, we compute the magnitude ‖∇‖ and the direction Θ of the gradients by:
  ‖∇‖ = √(Gx² + Gy²)                                                            (2)
  Θ = arctan(Gy, Gx)                                                            (3)
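Equations (1)–(3) map directly onto array primitives that automatic differentiation handles out of the box. Below is a minimal JAX sketch of this step (our illustration, not the authors' code); the helper names and the central-difference gradient via jnp.gradient are our own choices.

```python
import jax.numpy as jnp

def to_gray(img):
    """Eq. (1): weighted sum of the R, G, B channels."""
    return img[..., 0] * 0.299 + img[..., 1] * 0.587 + img[..., 2] * 0.114

def gradient_magnitude_orientation(gray, eps=1e-12):
    """Eqs. (2)-(3): gradient maps, then magnitude and unsigned orientation in degrees."""
    gy, gx = jnp.gradient(gray)                        # vertical / horizontal central differences
    mag = jnp.sqrt(gx ** 2 + gy ** 2 + eps)            # eps keeps the gradient finite at zero (our addition)
    theta = jnp.degrees(jnp.arctan2(gy, gx)) % 180.0   # fold into [0, 180) for unsigned bins
    return mag, theta
```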

Weighted Vote into Spatial and Orientation Cells:
• The original histogramming and voting steps of the HOG computation lose the positional information of pixels, so we rewrite them as linear filtering operations.
  Orientational:
  f_b^o(Θ) = clip_{min=0}^{max=1}(1 − |Θ − µ_b| · B/180)                        (4)
  F_b^o = ‖∇‖ · f_b^o(Θ),  ∀b ∈ 1···B                                           (5)
  Spatial:
  F_b^s = F_b^o ∗ f^s,  ∀b ∈ 1···B                                              (6)
• With the cell centers (X, Y), we concatenate to get the HOG vector v:
  v = {F_b^s(x, y | x ∈ X, y ∈ Y)}_{b=1···B}                                    (7)
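The voting of eqs. (4)–(7) then becomes elementwise clipping, multiplication, and a convolution. A minimal JAX sketch follows (again ours, not the authors' implementation); the number of bins B=9, the box filter used as f^s, and the cell size are illustrative assumptions, and the circular wrap at the 0°/180° boundary is an addition that eq. (4) does not spell out.

```python
import jax.numpy as jnp
from jax.scipy.signal import convolve2d

def orientation_binning(mag, theta, B=9):
    """Eqs. (4)-(5): per-bin gradient maps F_b^o from a clipped linear vote."""
    centers = (jnp.arange(B) + 0.5) * (180.0 / B)               # bin centers mu_b
    diff = jnp.abs(theta[None, :, :] - centers[:, None, None])  # |theta - mu_b|
    diff = jnp.minimum(diff, 180.0 - diff)                      # wrap at the 0/180 boundary (our addition)
    f_o = jnp.clip(1.0 - diff * B / 180.0, 0.0, 1.0)            # eq. (4)
    return mag[None, :, :] * f_o                                # eq. (5), shape (B, h, w)

def spatial_filtering(F_o, cell=8):
    """Eq. (6): F_b^s = F_b^o * f^s, here with a box filter over one cell."""
    f_s = jnp.ones((cell, cell)) / (cell * cell)
    return jnp.stack([convolve2d(F_o[b], f_s, mode="same") for b in range(F_o.shape[0])])

def hog_vector(F_s, cell=8):
    """Eq. (7): read the filtered maps out at the cell centers and concatenate."""
    return F_s[:, cell // 2::cell, cell // 2::cell].reshape(-1)
```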

Contrast Normalization:
• We use the L2-norm for global contrast normalization:
  v_normalized = v / (‖v‖ + ε)                                                  (8)

⇒ All the operations are (piecewise) differentiable (summation, multiplication, division, square, square root, arc-tangent, clip), so with the chain rule our HOG implementation is differentiable at each pixel position.
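Chaining the pieces above with the normalization of eq. (8) gives the full descriptor φ(I) as one differentiable function, so jax.grad delivers the per-pixel gradient of any loss defined on top of it. The sketch below reuses the helper functions from the previous code blocks and only illustrates the chain-rule argument; it is not the authors' implementation.

```python
import jax
import jax.numpy as jnp

def hog(img, B=9, cell=8, eps=1e-5):
    """phi(I): eqs. (1)-(8) chained together (helpers defined in the sketches above)."""
    mag, theta = gradient_magnitude_orientation(to_gray(img))
    v = hog_vector(spatial_filtering(orientation_binning(mag, theta, B), cell), cell)
    return v / (jnp.linalg.norm(v) + eps)      # eq. (8): global L2 contrast normalization

# Every step is (piecewise) differentiable, so any loss built on phi(I) is too:
def feature_loss(img, target_feat):
    return jnp.sum((hog(img) - target_feat) ** 2)

grad_wrt_image = jax.grad(feature_loss)        # dE/dI, one value per pixel and channel
```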

Application 1: Pre-Image Reconstruction

Model:
Given an image I and its HOG vector φ(I), we optimize to reconstruct an image Î whose HOG features φ(Î) have the minimum L2 distance E to φ(I):
  E = ‖φ(I) − φ(Î)‖₂,   Î = argmin_{Î ∈ R^{X×Y}} E
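A minimal sketch of the reconstruction loop, reusing the hog and feature_loss definitions above: start from a near-constant image and follow ∂E/∂Î with plain gradient descent. The step size, iteration count, initialization, and the clipping to [0, 1] are our illustrative choices, not the authors' settings.

```python
import jax
import jax.numpy as jnp

def reconstruct(target_img, steps=500, lr=1e-2, seed=0):
    """Pre-image reconstruction: I_hat = argmin ||phi(I) - phi(I_hat)||^2."""
    target_feat = hog(target_img)                                    # phi(I), held fixed
    key = jax.random.PRNGKey(seed)
    I_hat = 0.5 + 0.01 * jax.random.normal(key, target_img.shape)    # initial guess

    grad_E = jax.jit(jax.grad(lambda x: feature_loss(x, target_feat)))  # dE/dI_hat
    for _ in range(steps):
        I_hat = jnp.clip(I_hat - lr * grad_E(I_hat), 0.0, 1.0)       # descend, keep a valid image
    return I_hat
```

In practice a more sophisticated optimizer (e.g. L-BFGS or Adam) can replace this fixed-step loop.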

Results:
We first evaluate the proposed ∇HOG on the image reconstruction task based on feature descriptors. We evaluate on the dataset from [6] and outperform several state-of-the-art baselines: BoVW [6], HOGgles [9], and CNN-HOG [8].

Method | Feature | Cross correlation | Mutual information | Structural similarity
BoVW [6] | Bag-of-Words | 0.287 | 1.182 | 0.252
HOGgles [9] | UoCTTI HOG [5] | 0.409 | 1.497 | 0.271
CNN-HOG [8] | UoCTTI HOG [5] | 0.632 | 1.211 | 0.381
our ∇HOG (single-scale) | UoCTTI HOG [5] | 0.760 | 1.908 | 0.433
our ∇HOG (single-scale) | Dalal HOG | 0.170 | 1.464 | 0.301
our ∇HOG (multi-scale) | Dalal HOG | 0.147 | 1.478 | 0.293

[Figure: example images, their HOG visualizations, and the reconstructions produced by each method.]

Application 2: Pose Estimation

We also apply our ∇HOG approach to a pose estimation task where 3D CAD models have to be aligned to objects in 2D images.

Model:
• Build on an approximate differentiable renderer, OpenDR [7], and parameterize the pose of the CAD models by azimuth θ, elevation ψ, and distance to camera γ.
• Use Exemplar LDA to extract visually discriminative patches on both the rendered image and the observation; the matching is addressed by the similarities E between the HOG vectors of the patches, described by our ∇HOG.
• The similarity can be traced back to the pose parameters via ∂E/∂Î and ∂E/∂θ, yielding an end-to-end optimization; see the sketch below.

[Figure: pose estimation pipeline — CAD model → differentiable renderer → rendered image Î; observation I → our diff. HOG; exemplar-LDA discriminative patches matched by similarity E, with gradients ∂E/∂Î and ∂E/∂θ flowing back to the pose.]
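The same gradients drive the pose fit: the rendered image is a differentiable function of (θ, ψ, γ), so the patch similarity E can be pushed back to the pose parameters. The sketch below is a simplified illustration only — render stands in for the OpenDR renderer, the matched patch locations are assumed to be given, and the exemplar-LDA patch mining is omitted.

```python
import jax
import jax.numpy as jnp

def patch_similarity(pose, observation, patches, render):
    """E: summed L2 distance between HOG vectors of matched patches on the rendered
    image and on the observation (patches = [((ry, rx), (oy, ox), size), ...])."""
    rendered = render(pose)                    # hypothetical differentiable renderer
    E = 0.0
    for (ry, rx), (oy, ox), s in patches:
        E += jnp.sum((hog(rendered[ry:ry + s, rx:rx + s])
                      - hog(observation[oy:oy + s, ox:ox + s])) ** 2)
    return E

def fit_pose(pose_init, observation, patches, render, steps=100, lr=1e-2):
    """Gradient descent on pose = (azimuth, elevation, distance) via dE/dpose."""
    grad_E = jax.grad(patch_similarity)        # differentiates w.r.t. the first argument
    pose = jnp.asarray(pose_init, dtype=jnp.float32)
    for _ in range(steps):
        pose = pose - lr * grad_E(pose, observation, patches, render)
    return pose
```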



Results:
We test on the chairs validation set of the PASCAL VOC 2012 dataset and use the continuous 3D pose annotations from the PASCAL3D+ benchmark [10]. We compare our method with the baseline from Aubry et al. [1].

Method | 4 views / 90° | 8 views / 45° | 16 views / 22.5° | 24 views / 15°
Aubry et al. [1] | 47.33 | 35.39 | 20.16 | 15.23
our method | 58.85 | 40.74 | 22.22 | 16.87
*A viewpoint estimate counts as correct if its distance to the ground truth is lower than the given threshold.

[Figure: examples of the optimization procedure on test images — initial guess, iter-0, iter-1, iter-2, iter-final — compared to Aubry et al. [1].]

References
[1] M. Aubry, D. Maturana, A. Efros, B. Russell, and J. Sivic. Seeing 3D chairs: Exemplar part-based 2D-3D alignment using a large dataset of CAD models. In CVPR, 2014.
[2] C. B. Choy, M. Stark, S. Corbett-Davies, and S. Savarese. Enriching object detection with 2D-3D registration and continuous viewpoint estimation. In CVPR, 2015.
[3] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.
[4] J. Dong and S. Soatto. Domain-size pooling in local descriptors: DSP-SIFT. In CVPR, 2015.
[5] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 2010.
[6] H. Kato and T. Harada. Image reconstruction from bag-of-visual-words. In CVPR, 2014.
[7] M. M. Loper and M. J. Black. OpenDR: An approximate differentiable renderer. In ECCV, 2014.
[8] A. Mahendran and A. Vedaldi. Understanding deep image representations by inverting them. In CVPR, 2015.
[9] C. Vondrick, A. Khosla, T. Malisiewicz, and A. Torralba. HOGgles: Visualizing object detection features. In CVPR, 2013.
[10] Y. Xiang, R. Mottaghi, and S. Savarese. Beyond PASCAL: A benchmark for 3D object detection in the wild. In WACV, 2014.
