July 14th, 2016

With One Look: 3D Face Shape Estimation from a Single Snapshot
Chia‐Po Wei and Yu‐Chiang Frank Wang
Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan

IEEE International Conference on Multimedia & Expo 2016

Outline
• Introduction
  - 3D Face Shape Estimation from a Single Snapshot
• Related Work
  - 3D Morphable Model, Shape from Shading, Single 3D Reference Model, and Deep Learning
• Our Proposed Method
  - Pose Estimation with Pose Regularization
  - Shape Estimation via Non‐Negative Least Squares (NNLS)
• Experiments
• Conclusions

Why 3D Face Shape Estimation?
• Frontalization
• Face Recognition
• 3D Printing
• 3D Avatar

Challenges
• 3D Face Shape Estimation from a Single Snapshot
  - Only a single image is available for estimating the 3D face shape.
  - Face images may exhibit pose, illumination, and expression variations.
  - Face images may contain disguise or makeup.
  - Face images may be of low resolution.

3D Morphable Model
• References
  - (Blanz and Vetter, SIGGRAPH’99), (Blanz and Vetter, PAMI’03), (Matthews et al., IJCV’07)
• Method
  - The 3D morphable model method optimizes the shape and texture coefficients of a set of 3D reference models such that the rendered image is close to the input image.
• Issues
  - The cost function is nonlinear and difficult to optimize (it easily gets stuck in local minima).
  - Additional effort for model initialization is typically required.
  - Not well suited to unconstrained images.

Shape from Shading
• References
  - (Dovgard and Basri, ECCV’04), (Kemelmacher and Basri, PAMI’11), (Barron and Malik, CVPR’12)
• Method
  - The shape from shading method uses shading cues to estimate 3D shape information from a single image.
• Issues
  - Proper image segmentation is required, because the extraction of shading cues is sensitive to background clutter.
  - Reconstruction performance degrades when input faces wear makeup or are occluded/disguised.

Single 3D Reference Model
• References
  - T. Hassner, “Viewing Real‐World Faces in 3D,” ICCV’13.
  - Y. Taigman et al., “Deepface…,” CVPR’14.
  - T. Hassner et al., “Effective Face Frontalization in Unconstrained Images,” CVPR’15.
• Method
  - The single 3D reference (S3DR) model method performs face frontalization based on a single 3D reference model.
• Issues
  - No model fitting is required for frontalization. However, the outputs of S3DR do not contain sufficient person‐specific information.
  - If the 3D shape of the input face is very different from that of the reference model, the resulting performance is unsatisfactory.

Deep Learning Methods
• References
  - Z. Zhu et al., “Deep Learning Identity‐Preserving Face Space,” ICCV’13.
  - J. Yim et al., “Rotating Your Face Using Multi‐task Deep Neural Network,” CVPR’15.
• Method
  - Deep learning methods train deep neural networks to predict the frontal view of the input face.
• Issues
  - Deep learning methods require substantial training effort.
  - Frontalization performance degrades if the input image exhibits variations (e.g., pose or illumination) that are not present in the training data.

Our Contributions
• 3D Face Shape Estimation
  - Only a single snapshot is needed for 3D face shape estimation.
  - Our method is able to handle images taken under different scenarios: low resolution, extreme lighting, disguise/occlusion, and makeup.
  - Our proposed algorithm uses non‐negative least squares (NNLS) with additional pose regularization and does not require careful initialization or segmentation.

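Our Framework
[Figure: pipeline from Input Image with Detected Landmarks, through Pose and Shape Estimation, to Estimated 3D Face Shape.]
• Given
  - $S_k$ denotes the kth 3D reference model ($k = 1, \dots, K$), and $s_{k,n} \in \mathbb{R}^3$ is the nth landmark of $S_k$.
• Input
  - $X$ denotes the detected landmarks, and $x_n \in \mathbb{R}^2$ is the nth landmark of $X$.
• Output
  - $P$ is the projection matrix, and $\alpha = [\alpha_1, \alpha_2, \dots, \alpha_K]$ is the shape parameter.
  - The estimated 3D face shape will be $\alpha_1 S_1 + \alpha_2 S_2 + \cdots + \alpha_K S_K$.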
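As a minimal sketch of the output representation on this slide, the estimated shape can be formed as a weighted sum of the reference models. The array layout `(K, 3, N)` and the name `blend_shapes` are assumptions made for illustration; the slides do not specify an implementation.

```python
import numpy as np

def blend_shapes(alpha, refs):
    """Return sum_k alpha[k] * refs[k].

    alpha : (K,) shape parameter
    refs  : (K, 3, N) stack of 3D reference models (assumed layout)
    """
    alpha = np.asarray(alpha, dtype=float)
    refs = np.asarray(refs, dtype=float)
    # Contract the K axis of alpha against the first axis of refs -> (3, N).
    return np.tensordot(alpha, refs, axes=1)

# Toy data: K = 2 reference models with N = 3 landmarks each.
refs = np.stack([np.zeros((3, 3)), np.ones((3, 3))])
shape = blend_shapes([0.25, 0.75], refs)
```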

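Pose Estimation with Pose Regularization
• Our formulation for learning the projection matrix $P \in \mathbb{R}^{2 \times 4}$ (w/ fixed shape parameter $\alpha$):

  $$\min_{P} \; \sum_{n} \left\| x_n - P \hat{s}_n \right\|_2^2 + \lambda \sum_{(i,j) \in \Omega} \left\| (x_i - x_j) - P (\hat{s}_i - \hat{s}_j) \right\|_2^2$$

  where $x_n$ is the nth detected 2D landmark.
  - Homogeneous coordinates of $s_{k,n}$, the nth landmark of the kth reference model: $\tilde{s}_{k,n} \equiv [s_{k,n}; 1] \in \mathbb{R}^4$
  - Linear combination of $s_{1,n}, \dots, s_{K,n}$: $\hat{s}_n \equiv [\sum_{k=1}^{K} \alpha_k s_{k,n}; 1] \in \mathbb{R}^4$
  - $i$ and $j$ with $(i,j) \in \Omega$ are selected landmark indices for pose regularization, and $\lambda$ is the pose regularization parameter.
  - Our formulation is convex and admits a closed‐form solution for $P$.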
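The pose step can be sketched as one linear least-squares solve. The sketch assumes the objective is a sum of squared landmark errors plus a weighted sum of squared errors on pairwise landmark differences; that is a reading of the slide, not the authors' code. The function name, argument layout, and the landmark pairs in the toy check are hypothetical.

```python
import numpy as np

def estimate_pose(X, S_hat, pairs, lam):
    """Least-squares estimate of a 2x4 projection matrix P.

    X     : (2, N) detected 2D landmarks
    S_hat : (3, N) landmarks of the current 3D shape estimate
    pairs : non-empty list of (i, j) landmark index pairs (hypothetical choice)
    lam   : non-negative weight of the pose regularization term

    Assumed objective (not confirmed by the slides):
      sum_n ||x_n - P s_n||^2 + lam * sum_(i,j) ||(x_i - x_j) - P (s_i - s_j)||^2
    with s_n in homogeneous coordinates.
    """
    N = X.shape[1]
    S_h = np.vstack([S_hat, np.ones((1, N))])            # homogeneous, (4, N)
    i, j = np.array(pairs).T
    w = np.sqrt(lam)
    # Stack the data term and the weighted pairwise-difference term column-wise.
    Z = np.hstack([S_h, w * (S_h[:, i] - S_h[:, j])])    # (4, N + len(pairs))
    Y = np.hstack([X, w * (X[:, i] - X[:, j])])          # (2, N + len(pairs))
    # Solve min_P ||Y - P Z||_F^2 through the transposed system Z^T P^T ~= Y^T.
    P_T, *_ = np.linalg.lstsq(Z.T, Y.T, rcond=None)
    return P_T.T

# Noise-free toy check: the projection used to generate X should be recovered.
rng = np.random.default_rng(0)
S_hat = rng.normal(size=(3, 8))
P_true = rng.normal(size=(2, 4))
X = P_true @ np.vstack([S_hat, np.ones((1, 8))])
P = estimate_pose(X, S_hat, pairs=[(0, 1), (2, 3)], lam=0.5)
```

Folding the regularizer into extra columns keeps the problem an ordinary least-squares fit, which is one way to see why a quadratic objective of this kind has a closed-form solution.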

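Pose Regularization
[Figure: detected landmarks (left) and estimated landmarks (right).]
• Pose regularization enforces the pose similarity between the input and the estimated shapes by minimizing $\sum_{(i,j) \in \Omega} \left\| (x_i - x_j) - P (\hat{s}_i - \hat{s}_j) \right\|_2^2$, where $x_i$ is the ith detected landmark and $P \hat{s}_i$ is the corresponding estimated landmark.
• That is, each of the 7 red arrows in the left figure is similar to the corresponding magenta arrow in the right figure.
• Pose regularization can help avoid local minima and yield better shape estimation results.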

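Shape Estimation via NNLS
• Our formulation for learning the shape parameter $\alpha \in \mathbb{R}^{K}$ (w/ fixed projection matrix $P$):

  $$\min_{\alpha} \; \sum_{n} \left\| x_n - P \hat{s}_n \right\|_2^2 + \lambda \sum_{(i,j) \in \Omega} \left\| (x_i - x_j) - P (\hat{s}_i - \hat{s}_j) \right\|_2^2 \quad \text{s.t.} \quad \alpha \ge 0$$

  - Projected 2D landmarks: $P \hat{s}_n = \sum_{k=1}^{K} \alpha_k P_{:,1:3}\, s_{k,n} + p_4 \in \mathbb{R}^2$, where $P_{:,1:3}$ denotes the first three columns of $P$ and $p_4$ its fourth column.
  - The constraint $\alpha \ge 0$ enforces that all entries of $\alpha$ are non‐negative.
  - Our formulation can be reduced to the standard form of Non‐Negative Least Squares (NNLS).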
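A sketch of how the shape step can be reduced to a standard NNLS problem and passed to `scipy.optimize.nnls`. As in the pose step, the objective (landmark errors plus weighted pairwise-difference errors) is an assumed reading of the slide. The function name, array layouts, and toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

def estimate_shape(X, refs, P, pairs, lam):
    """Non-negative shape parameter alpha for a fixed 2x4 projection P.

    X     : (2, N) detected 2D landmarks
    refs  : (K, 3, N) landmarks of the K reference models (assumed layout)
    pairs : non-empty list of (i, j) landmark index pairs (hypothetical choice)
    lam   : non-negative weight of the pose regularization term
    """
    K = refs.shape[0]
    A, t = P[:, :3], P[:, 3:]            # linear part (2, 3), translation (2, 1)
    i, j = np.array(pairs).T
    w = np.sqrt(lam)
    proj = A @ refs                      # projected reference landmarks, (K, 2, N)
    cols = np.concatenate([proj, w * (proj[:, :, i] - proj[:, :, j])], axis=2)
    M = cols.reshape(K, -1).T            # one column per reference model
    rhs = np.concatenate([X - t, w * (X[:, i] - X[:, j])], axis=1)
    alpha, _ = nnls(M, rhs.reshape(-1))  # min ||M a - rhs||_2 subject to a >= 0
    return alpha

# Noise-free toy check with a known non-negative shape parameter.
rng = np.random.default_rng(1)
refs = rng.normal(size=(4, 3, 12))
alpha_true = np.array([0.6, 0.0, 0.3, 0.1])
P = rng.normal(size=(2, 4))
shape = np.tensordot(alpha_true, refs, axes=1)
X = P @ np.vstack([shape, np.ones((1, 12))])
alpha = estimate_shape(X, refs, P, pairs=[(0, 1), (2, 3)], lam=0.5)
```

The translation column of `P` does not depend on `alpha`, so it moves to the right-hand side and cancels in the pairwise differences. That leaves a system that is linear in `alpha`.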

Algorithm for 3D Face Shape Estimation
• Input: Test image with detected 2D landmarks $X$, 3D reference models with landmarks $s_{k,n}$, and pose regularization parameter $\lambda$.
    while not converged do
      Pose Estimation: Update $P$ with fixed $\alpha$.
      Shape Estimation: Update $\alpha$ with fixed $P$.
    end while
• Output: The projection matrix $P$ and the shape parameter $\alpha$.
• Remark:
  - Our algorithm typically converges within 20 iterations.
  - We use Chehra [1] for detecting 49 facial landmarks. (Other software such as Dlib or SDM can also be used.)
[1] Asthana et al., “Incremental Face Alignment in the Wild,” CVPR, 2014.
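The alternating loop can be sketched as follows. Several choices here are assumptions rather than details from the slides: the uniform initialization of the shape parameter, the stopping rule based on the decrease of the objective, the iteration cap, and the exact form of the objective (landmark errors plus weighted pairwise-difference errors). All names are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

def _homog(S):
    """Append a row of ones: (3, N) -> (4, N)."""
    return np.vstack([S, np.ones((1, S.shape[1]))])

def _with_pairs(V, i, j, w):
    """Append weighted pairwise differences V[..., i] - V[..., j] as extra columns."""
    return np.concatenate([V, w * (V[..., i] - V[..., j])], axis=-1)

def fit_pose_and_shape(X, refs, pairs, lam, max_iter=50, tol=1e-10):
    """Alternate a least-squares pose step and an NNLS shape step."""
    K = refs.shape[0]
    i, j = np.array(pairs).T
    w = np.sqrt(lam)
    Y = _with_pairs(X, i, j, w)                  # stacked 2D targets
    alpha = np.full(K, 1.0 / K)                  # assumed initialization
    history = []
    for _ in range(max_iter):
        # Pose step: least squares for P with alpha fixed.
        Z = _with_pairs(_homog(np.tensordot(alpha, refs, axes=1)), i, j, w)
        P = np.linalg.lstsq(Z.T, Y.T, rcond=None)[0].T
        # Shape step: NNLS for alpha with P fixed.
        A, t = P[:, :3], P[:, 3:]
        M = _with_pairs(A @ refs, i, j, w).reshape(K, -1).T
        b = _with_pairs(X - t, i, j, w).reshape(-1)
        alpha, rnorm = nnls(M, b)
        history.append(rnorm ** 2)               # objective value after this sweep
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break
    return P, alpha, history

# Toy run on synthetic landmarks with a small amount of noise.
rng = np.random.default_rng(2)
refs = rng.normal(size=(4, 3, 12))
shape_true = np.tensordot([0.5, 0.2, 0.0, 0.3], refs, axes=1)
X = rng.normal(size=(2, 4)) @ _homog(shape_true) + 0.01 * rng.normal(size=(2, 12))
P, alpha, history = fit_pose_and_shape(X, refs, pairs=[(0, 1), (2, 3), (4, 5)], lam=0.5)
```

Each step minimizes the same objective over one block of variables, so the recorded objective values should not increase from one sweep to the next.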

BU‐3DFE (3D Face Database)
• 100 subjects available. The neutral expression of each subject has a texture image and a corresponding 3D shape model.
• The input texture image can be a frontal‐view image or a side‐view image at a 45° yaw angle.
• We conduct a leave‐one‐out test and report the average reconstruction error over the 100 subjects.

Error Metrics
[Figure: test image and its ground truth depth.]
• RMSE: $\sqrt{\frac{1}{M} \sum_{m=1}^{M} (z_m - z_m^*)^2}$
• REL: $\frac{1}{M} \sum_{m=1}^{M} |z_m - z_m^*| / z_m^*$
• Log10: $\frac{1}{M} \sum_{m=1}^{M} |\log_{10} z_m - \log_{10} z_m^*|$
• $z_m$ and $z_m^*$ are the estimated 3D depth and the ground truth depth of the test image, resp.
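A small sketch of the three error metrics, assuming the definitions commonly used for depth evaluation: root mean squared error, mean relative error, and mean absolute log10 error. The exact normalization on the slide is not recoverable from the transcript, and the function name and sample depths are hypothetical.

```python
import numpy as np

def depth_errors(z_est, z_gt):
    """Return (RMSE, REL, Log10) between estimated and ground truth depths.

    Assumes all depths are strictly positive, which REL and Log10 require.
    """
    z_est = np.asarray(z_est, dtype=float)
    z_gt = np.asarray(z_gt, dtype=float)
    rmse = np.sqrt(np.mean((z_est - z_gt) ** 2))
    rel = np.mean(np.abs(z_est - z_gt) / z_gt)
    log10 = np.mean(np.abs(np.log10(z_est) - np.log10(z_gt)))
    return rmse, rel, log10

# Made-up depth values, only to exercise the function.
rmse, rel, log10 = depth_errors([2.0, 10.0, 50.0], [1.0, 10.0, 100.0])
```

RMSE is in the units of the depth values, while REL and Log10 are scale-normalized, which may explain why the three metrics in the results table sit on different numeric ranges.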

BU‐3DFE – Quantitative Evaluation

Frontal‐view images
Method   Log10    REL      RMSE
S3DR     0.3246   0.9937   17.316
L1LS     0.2926   0.4623   5.5326
Ours*    0.2705   0.3723   4.2298
Ours     0.2641   0.3619   4.1364

Side‐view images (at a 45° yaw angle)
Method   Log10    REL      RMSE
S3DR     0.3246   0.9937   17.316
L1LS     0.3245   0.6665   8.2484
Ours*    0.2677   0.3964   4.9663
Ours     0.2644   0.3795   4.6284

• Ours* denotes our method without pose regularization.
• S3DR (T. Hassner et al., CVPR 2015)
• L1LS (X. Zhou et al., CVPR 2015)

BU‐3DFE – Depth Error & Frontalization
[Figure: RMSE depth error and frontalization result for S3DR, L1LS, Ours*, and Ours.]
• The method for frontalization is based on (T. Hassner et al., CVPR 2015).

Computation Time (in seconds)

Method   Time
S3DR     0.0016
L1LS     0.0605
Ours*    0.0177
Ours     0.0216

• Ours* denotes our method without pose regularization.
• S3DR (T. Hassner et al., CVPR 2015)
• L1LS (X. Zhou et al., CVPR 2015)
• The runtime estimates were obtained on a PC with Intel Quad Core 2.33 GHz processors and 4 GB RAM.

LFW Images
[Figure: input images with the estimated 3D model and frontalization result for S3DR, L1LS, and Ours.]

Unconstrained Images
• The top row contains the input images, while the bottom row contains the corresponding 3D outputs recovered by our method.

Conclusions
• We present a framework for 3D face shape estimation from a single unconstrained image, which can be taken under different poses and illumination conditions, or even with occlusion.
• Our proposed algorithm uses non‐negative least squares (NNLS) with additional pose regularization, which produces satisfactory estimation outputs without the need for careful initialization or segmentation.
• In addition to quantitative evaluations and comparisons on the BU‐3DFE 3D face database, we also consider LFW and other practical face images (e.g., paintings) for qualitative evaluations.

Thank You!
