July 14th, 2016
With One Look: 3D Face Shape Estimation from a Single Snapshot
Chia‐Po Wei and Yu‐Chiang Frank Wang
Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan
IEEE International Conference on Multimedia & Expo 2016
Outline
• Introduction: 3D Face Shape Estimation from a Single Snapshot
• Related Work: 3D Morphable Model, Shape from Shading, Single 3D Reference Model, and Deep Learning
• Our Proposed Method: Pose Estimation with Pose Regularization; Shape Estimation via Non‐Negative Least Squares (NNLS)
• Experiments
• Conclusions
Why 3D Face Shape Estimation?
• Frontalization
• Face Recognition
• 3D Printing
• 3D Avatar
Challenges
• 3D Face Shape Estimation from a Single Snapshot
  Only a single image is available for estimating the 3D face shape.
  Face images may exhibit pose, illumination, and expression variations.
  Faces may be disguised or wearing makeup.
  Face images may be of low resolution.
3D Morphable Model
• References: (Blanz and Vetter, SIGGRAPH’99), (Blanz and Vetter, PAMI’03), (Matthews et al., IJCV’07)
• Method: The 3D morphable model method optimizes the shape and the texture coefficients of a set of 3D reference models such that the rendered image is close to the input image.
• Issues: The cost function is nonlinear and difficult to optimize (it easily gets stuck in local minima). Additional effort for model initialization is typically required. Not well suited to unconstrained images.
Shape from Shading
• References: (Dovgard and Basri, ECCV’04), (Kemelmacher and Basri, PAMI’11), (Barron and Malik, CVPR’12)
• Method: The shape from shading method utilizes shading cues for estimating the 3D shape information from a single image.
• Issues: Proper image segmentation is required, because the extraction of shading cues is sensitive to background clutter. The reconstruction performance degrades when input faces wear makeup or are occluded/disguised.
Single 3D Reference Model
• References: T. Hassner, “View Real‐World Faces in 3D,” ICCV’13. Y. Taigman et al., “Deepface…,” CVPR’14. T. Hassner et al., “Effective Face Frontalization in Unconstrained Images,” CVPR’15.
• Method: The single 3D reference (S3DR) model method performs face frontalization based on a single 3D reference model.
• Issues: Although no model fitting is required for frontalization, the outputs of S3DR do not contain sufficient person‐specific information. If the 3D shape of the input face is very different from that of the reference model, the resulting performance will not be satisfactory.
Deep Learning Methods
• References: Z. Zhu et al., “Deep Learning Identity‐Preserving Face Space,” ICCV’13. J. Yim et al., “Rotating Your Face Using Multi‐task Deep Neural Network,” CVPR’15.
• Method: Deep learning methods train deep neural networks to predict the frontal view of the input face.
• Issues: Deep learning methods require substantial training effort. The frontalization performance degrades if the input image exhibits variations (e.g., pose or illumination) that are not present in the training data.
Our Contributions
• 3D Face Shape Estimation
  Only a single snapshot is needed for 3D face shape estimation.
  Our method is able to tackle images taken under different scenarios: low resolution, extreme lighting, disguise/occlusion, and makeup.
  Our proposed algorithm utilizes non‐negative least squares (NNLS) with additional pose regularization, which does not require careful initialization or segmentation.
Our Framework
[Figure: input image with detected landmarks → pose and shape estimation → estimated 3D face shape]
• Given: $S_k$ denotes the $k$th 3D reference model, and $s_k^n \in \mathbb{R}^3$ is the $n$th landmark of $S_k$.
• Input: $X$ denotes the detected landmarks, and $x^n \in \mathbb{R}^2$ is the $n$th landmark of $X$.
• Output: $P$ is the projection matrix, and $\alpha = [\alpha_1, \alpha_2, \dots, \alpha_K]$ is the shape parameter.
• The estimated 3D face shape will be $\alpha_1 S_1 + \cdots + \alpha_K S_K$.
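Pose Estimation with Pose Regularization
• Our formulation for learning the projection matrix $P \in \mathbb{R}^{2 \times 4}$ (w/ fixed shape parameter $\alpha$):
  $\min_{P} \sum_{n} \| x^n - P \tilde{s}^n \|_2^2 + \lambda \sum_{(i,j) \in \Omega} \| (x^i - x^j) - P (\tilde{s}^i - \tilde{s}^j) \|_2^2$
  where $x^n \in \mathbb{R}^2$ is the $n$th detected landmark and $\lambda$ is the pose regularization parameter.
• Homogeneous coordinates of $s^n$: $\tilde{s}^n \equiv [s^n; 1] \in \mathbb{R}^4$
• Linear combination of the $n$th landmarks $s_1^n, \dots, s_K^n$ of the 3D reference models: $s^n \equiv \sum_{k=1}^{K} \alpha_k s_k^n$
• $i$ and $j$ are selected landmark indices for pose regularization, with $(i,j) \in \Omega$.
• Our formulation is convex and admits a closed‐form solution for $P$.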
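As a sketch of why a closed form exists, assume the objective is a sum of squared reprojection errors plus a weighted sum of squared pairwise-difference errors (the exact objective is not legible on the slide, so this form and the matrix names below are assumptions). Stack the detected landmarks as columns of $X \in \mathbb{R}^{2 \times N}$, the homogeneous estimated 3D landmarks as columns of $\tilde{S} \in \mathbb{R}^{4 \times N}$, and the pairwise differences over the $M$ selected index pairs as columns of $D_X \in \mathbb{R}^{2 \times M}$ and $D_S \in \mathbb{R}^{4 \times M}$. Setting the gradient with respect to $P$ to zero then gives a linear system:

```latex
\begin{aligned}
f(P) &= \| X - P\tilde{S} \|_F^2 + \lambda \, \| D_X - P D_S \|_F^2, \\
\nabla_P f &= -2\,(X - P\tilde{S})\,\tilde{S}^\top - 2\lambda\,(D_X - P D_S)\,D_S^\top = 0 \\
\Longrightarrow\quad
P\,\bigl(\tilde{S}\tilde{S}^\top + \lambda D_S D_S^\top\bigr) &= X\tilde{S}^\top + \lambda D_X D_S^\top, \\
P &= \bigl(X\tilde{S}^\top + \lambda D_X D_S^\top\bigr)\bigl(\tilde{S}\tilde{S}^\top + \lambda D_S D_S^\top\bigr)^{-1}.
\end{aligned}
```

The last step requires the $4 \times 4$ matrix $\tilde{S}\tilde{S}^\top + \lambda D_S D_S^\top$ to be invertible, which holds when the landmarks are not all coplanar. The last row of $D_S$ is zero, so the translation column of $P$ is determined by the reprojection term alone.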
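Pose Regularization
[Figure: detected landmarks (left) and estimated landmarks (right), each with 7 arrows drawn between selected landmark pairs]
• Pose regularization enforces the pose similarity between the input and the estimated shapes by minimizing $\sum_{(i,j) \in \Omega} \| (x^i - x^j) - P (\tilde{s}^i - \tilde{s}^j) \|_2^2$, where $x^i$ is the $i$th detected landmark and $P \tilde{s}^i$ is the corresponding estimated landmark. That is, each of the 7 red arrows in the left figure is similar to the corresponding magenta arrow in the right figure.
• Pose regularization can help avoid local minima and yield better shape estimation results.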
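Shape Estimation via NNLS
• Our formulation for learning the shape parameter $\alpha \in \mathbb{R}^K$ (w/ fixed projection matrix $P$):
  $\min_{\alpha \ge 0} \sum_{n} \| x^n - y^n \|_2^2 + \lambda \sum_{(i,j) \in \Omega} \| (x^i - x^j) - (y^i - y^j) \|_2^2$
  where $x^n \in \mathbb{R}^2$ is the $n$th detected landmark and $\lambda$ is the pose regularization parameter.
• Projected 2D landmarks: $y^n \equiv P \, [\sum_{k=1}^{K} \alpha_k s_k^n; 1] = P_{:,1:3} \sum_{k=1}^{K} \alpha_k s_k^n + P_{:,4} \in \mathbb{R}^2$, where $s_k^n \in \mathbb{R}^3$ is the $n$th landmark of the $k$th 3D reference model.
• The constraint $\alpha \ge 0$ enforces that all entries of $\alpha$ are non‐negative.
• Our formulation can be reduced to the standard form of Non‐Negative Least Squares (NNLS).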
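One way to carry out this reduction, as a sketch that assumes the objective is a squared reprojection error plus a $\lambda$-weighted squared pairwise-difference error (the symbols $B$, $t$, $q_k$, $r_k$, $A$, and $b$ are introduced here for illustration and are not from the slides): with $P$ fixed, write $B = P_{:,1:3}$ and $t = P_{:,4}$, so each projected landmark is linear in $\alpha$ up to the constant offset $t$.

```latex
\begin{aligned}
q_k &= \operatorname{vec}\bigl[\, B s_k^1, \dots, B s_k^N \,\bigr] \in \mathbb{R}^{2N},
&
r_k &= \operatorname{vec}\bigl[\, B (s_k^i - s_k^j) \,\bigr]_{(i,j) \in \Omega} \in \mathbb{R}^{2M}, \\
A &= \begin{bmatrix} q_1 & \cdots & q_K \\ \sqrt{\lambda}\, r_1 & \cdots & \sqrt{\lambda}\, r_K \end{bmatrix},
&
b &= \begin{bmatrix} \operatorname{vec}\bigl[\, x^1 - t, \dots, x^N - t \,\bigr] \\ \sqrt{\lambda}\, \operatorname{vec}\bigl[\, x^i - x^j \,\bigr]_{(i,j) \in \Omega} \end{bmatrix}, \\
&\min_{\alpha \ge 0} \; \| A \alpha - b \|_2^2 .
\end{aligned}
```

Here $N$ is the number of landmarks and $M$ the number of selected index pairs. The translation $t$ cancels in the pairwise differences, so it appears only in the upper block of $b$. Any off-the-shelf NNLS solver can then be applied to $(A, b)$.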
Algorithm for 3D Face Shape Estimation
• Input: Test image with detected 2D landmarks $x^n$, 3D reference models with landmarks $s_k^n$, and pose regularization parameter $\lambda$.
    while not converged do
        Pose Estimation: Update the projection matrix $P$ with the shape parameter $\alpha$ fixed.
        Shape Estimation: Update $\alpha$ with $P$ fixed.
    end while
• Output: The projection matrix $P$ and the shape parameter $\alpha$.
• Remark: Our algorithm typically converges within 20 iterations. We use Chehra [1] for detecting 49 facial landmarks. (Other software such as Dlib or SDM can also be used.)
[1] Asthana et al., “Incremental Face Alignment in the Wild,” CVPR, 2014.
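The alternating loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the objective is assumed to be a squared reprojection error plus a `lam`-weighted squared pairwise-difference penalty, landmarks are stored as `X` with shape `(2, N)` and `S` with shape `(K, 3, N)`, the shape parameter is initialized to the mean of the reference models, and the stopping rule is a small decrease of the objective. All function and variable names are hypothetical. The pose step uses `numpy.linalg.lstsq` and the shape step uses `scipy.optimize.nnls`.

```python
import numpy as np
from scipy.optimize import nnls


def objective(X, S, P, alpha, i, j, lam):
    """Assumed objective: reprojection error + lam * pairwise-difference error."""
    Y = P[:, :3] @ np.tensordot(alpha, S, axes=1) + P[:, 3:4]  # projected landmarks (2, N)
    fit = np.sum((X - Y) ** 2)
    reg = np.sum(((X[:, i] - X[:, j]) - (Y[:, i] - Y[:, j])) ** 2)
    return fit + lam * reg


def update_pose(X, S, alpha, i, j, lam):
    """Least-squares update of the 2x4 projection matrix with alpha fixed."""
    shape = np.tensordot(alpha, S, axes=1)                   # combined landmarks (3, N)
    Sh = np.vstack([shape, np.ones(shape.shape[1])])         # homogeneous coords (4, N)
    A = np.hstack([Sh, np.sqrt(lam) * (Sh[:, i] - Sh[:, j])])
    B = np.hstack([X, np.sqrt(lam) * (X[:, i] - X[:, j])])
    return np.linalg.lstsq(A.T, B.T, rcond=None)[0].T        # solves P @ A ~= B


def update_shape(X, S, P, i, j, lam):
    """NNLS update of the shape parameter with the projection matrix fixed."""
    K = S.shape[0]
    Q = np.einsum("ab,kbn->kan", P[:, :3], S)                # per-model projections (K, 2, N)
    A = np.hstack([Q.reshape(K, -1),
                   np.sqrt(lam) * (Q[:, :, i] - Q[:, :, j]).reshape(K, -1)]).T
    b = np.concatenate([(X - P[:, 3:4]).ravel(),
                        np.sqrt(lam) * (X[:, i] - X[:, j]).ravel()])
    return nnls(A, b)[0]


def estimate_pose_and_shape(X, S, pairs, lam=1.0, max_iter=20, tol=1e-8):
    """Alternate pose and shape updates; returns P, alpha, and the objective history."""
    i, j = np.asarray(pairs).T
    alpha = np.full(S.shape[0], 1.0 / S.shape[0])            # assumed init: mean reference shape
    history = []
    for _ in range(max_iter):
        P = update_pose(X, S, alpha, i, j, lam)
        alpha = update_shape(X, S, P, i, j, lam)
        history.append(objective(X, S, P, alpha, i, j, lam))
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break
    return P, alpha, history
```

Because each step minimizes the same objective over one block of variables, the recorded objective values should not increase from one iteration to the next. Note that `P` and `alpha` are only determined up to a common scale in this sketch, since scaling the first three columns of `P` by `c` and `alpha` by `1/c` leaves the projections unchanged.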
BU‐3DFE (3D Face Database)
• 100 subjects available. The neutral expression of each subject has a texture image and a corresponding 3D shape model.
• The input texture image can be a frontal‐view image or a side‐view image at a 45° yaw angle.
• We conduct a leave‐one‐out test and report the average reconstruction error over the 100 subjects.
Error Metrics
[Figure: test image and its ground‐truth depth $d^*$]
• RMSE: $\sqrt{\sum_{p=1}^{N} (d_p - d_p^*)^2 / N}$
• REL: $\sum_{p=1}^{N} \bigl( |d_p - d_p^*| / d_p^* \bigr) / N$
• Log10: $\sum_{p=1}^{N} |\log_{10} d_p - \log_{10} d_p^*| / N$
• $d$ and $d^*$ are the estimated 3D depth and the ground‐truth depth of the test image, respectively, compared at $N$ points.
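The three error metrics can be computed with a few lines of NumPy. The sketch below assumes the standard definitions of RMSE, mean relative error (REL), and mean absolute log10 error over all compared depth values, and that both depth arrays are strictly positive; the function name `depth_errors` is hypothetical.

```python
import numpy as np


def depth_errors(d, d_true):
    """Return RMSE, REL, and Log10 errors between estimated and ground-truth depth.

    Assumes both inputs have the same shape and contain strictly positive values
    (required by the division in REL and by the logarithm in Log10).
    """
    d = np.asarray(d, dtype=float).ravel()
    d_true = np.asarray(d_true, dtype=float).ravel()
    rmse = np.sqrt(np.mean((d - d_true) ** 2))
    rel = np.mean(np.abs(d - d_true) / d_true)
    log10 = np.mean(np.abs(np.log10(d) - np.log10(d_true)))
    return {"RMSE": rmse, "REL": rel, "Log10": log10}
```

For example, `depth_errors([10, 100], [100, 10])` compares two depth values that are each off by a factor of ten, which gives a Log10 error of 1.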
BU‐3DFE – Quantitative Evaluation

Frontal‐view images
Method   Log10    REL      RMSE
S3DR     0.3246   0.9937   17.316
L1LS     0.2926   0.4623   5.5326
Ours*    0.2705   0.3723   4.2298
Ours     0.2641   0.3619   4.1364

Side‐view images (at a 45° yaw angle)
Method   Log10    REL      RMSE
S3DR     0.3246   0.9937   17.316
L1LS     0.3245   0.6665   8.2484
Ours*    0.2677   0.3964   4.9663
Ours     0.2644   0.3795   4.6284

• Ours* denotes our method without pose regularization.
• S3DR (T. Hassner et al., CVPR 2015)
• L1LS (X. Zhou et al., CVPR 2015)
BU‐3DFE – Depth Error & Frontalization
[Figure: RMSE depth error maps and frontalization results for S3DR, L1LS, Ours*, and Ours]
The method for frontalization is based on (T. Hassner et al., CVPR 2015).
Computation Time (in seconds)

Method   S3DR     L1LS     Ours*    Ours
Time     0.0016   0.0605   0.0177   0.0216

• Ours* denotes our method without pose regularization.
• S3DR (T. Hassner et al., CVPR 2015)
• L1LS (X. Zhou et al., CVPR 2015)
• The runtime estimates were obtained on a PC with Intel Quad Core 2.33 GHz processors and 4 GB RAM.
LFW Images
[Figure: input images with the estimated 3D models and frontalization results of S3DR, L1LS, and Ours]
Unconstrained Images
• The top row contains the input images, while the bottom row contains the corresponding 3D outputs recovered by our method.
Conclusions
• We present a framework for 3D face shape estimation from a single unconstrained image, which can be taken under different poses and illumination, or even with occlusion.
• Our proposed algorithm utilizes non‐negative least squares (NNLS) with additional pose regularization, which produces satisfactory estimation outputs without the need for careful initialization or segmentation.
• In addition to quantitative evaluations and comparisons on the BU‐3DFE 3D face database, we also consider LFW and other practical face images (e.g., paintings) for qualitative evaluations.
Thank You!