Nonrigid Image Deformation Using Moving Regularized Least Squares Jiayi Ma, Ji Zhao, and Jinwen Tian

This letter presents an image deformation method based on Moving Regularized Least Squares optimization. The user controls the deformation by simply choosing a set of point handles in the input image, and also the target positions that the source point handles should be deformed to. The deformation function in our method is nonrigid and specified in a functional space, more specifically a reproducing kernel Hilbert space. The proposed method possesses three characteristics: 1) it is able to create detail-preserving and intuitive deformations; 2) the solution of the deformation function has a simple closed-form; 3) it is extremely computationally efficient which can be performed in real-time (less than 0.1 milliseconds per frame for an image of size 500 × 500). We compare our method to a state-of-the-art method which is modeled by rigid transformations; the qualitative and quantitative results demonstrate the benefits of using the nonrigid formulation in aspects of both accuracy and efficiency. Moreover, the proposed method is general and it can be applied to other applications for interpolation. Index Terms—Deformation, nonrigid, Moving Least Squares, regularization, interpolation.

I. I NTRODUCTION Image deformation has long been an active research area in image processing and has a number of useful applications ranging from animation and morphing [1], [2], [3] to medical imaging [4]. The goal of image deformation is to deform objects into desired shapes or poses. The deformations are typically built based on a set of user-controlled distinct handles, including points [5], lines [6], and even polygon grids [7]. As the user modifies the position and orientation of these handles, the image should deform in an intuitive way. To illustrate, consider Fig. 1 where we are given an image of Burning Candle and we aim to deform its flame. To this end, we first choose a set of control points, as marked by the red points in the left image, and then adjust the control point located on the flame to a new position, as shown in the right image. Given this user input, we would like to compute a natural deformation of the flame which conforms to the new positions of the control points (see Fig. 1 (right)). Another important feature to be considered in a deformation method is the computational efficiency, since a user might want to interactively manipulate an image in real-time live performance. Many image deformation methods have been proposed in the literature to satisfy the requirements such as intuitive user interaction, realistic deformation results, and interactive The authors are with the Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, China. J. Ma is also with the Department of Statistics, University of California, Los Angeles, CA, 90095. E-mail: {jyma2010, zhaoji84}@gmail.com, [email protected]

Fig. 1. Schematic illustration of image deformation. Left: the original image (Burning Candle) with the control points marked by red points; right: a deformed image with the deformed control points marked by red points.

performance. Among them the techniques that avoid unnatural local scaling and shearing are of special interest [2], [3]. To produce such deformations, Schaefer et al. [3] proposed to deform images based on Moving Least Squares (MLS) [8] using linear functions such as the rigid transformation. The use of MLS and rigid formulation makes the deformation “as-rigid-as-possible” [2], and tends to preserve global shape and local details. Due to the as-rigid-as-possible principle, the image deformation methods are typically modeled by rigid transformations. This is specially beneficial in situations where individual rigid parts of an object move independently of one another, e.g., changing the pose of a person or an object in a photo. However, for motions of coherent objects such as the dancing flame of a candle, the motion of a heart, or the waving of a cloth, where the shape of the object deforms under certain nonrigid models, the principle of as-rigid-as-possible deformation then may not work well. In this case, a nonrigid transformation is desirable to provide realistic deformation results, which is the very goal of our work in this letter. The technique of nonrigid transformation has been considered in plane deformations [9] and facial expression editing [10]. Here we consider the deformation on general objects. We propose a new algorithm based on MLS and a nonrigid transformation which is specified in a reproducing kernel Hilbert space (RKHS). On the one hand, this MLS-based algorithm keeps the property of avoiding superfluous global or local deformations, and, it is more promising in handling the motion of coherent objects due to the nonrigid formulation. One the other hand, the transformation specified in an RKHS leads to a simple closed-form solution which is extremely computationally efficient and can be performed in real-time. Our contribution in this paper includes the following two aspects. (1) We propose an efficient image deformation method using nonrigid deformations; it can produce more realistic deformation results than rigid model-based methods in handling the motion of coherent objects, and hence can be seen as a complement of the existing rigid model-based methods. (2)

2

We generalize the Moving Least Squares technique to the nonlinear case, which can be broadly used in the scattered data interpolation problem, i.e., interpolating a nonlinear function from a discrete set of sample points. II. M ETHOD In this section, we describe our nonrigid image deformation algorithm. The deformations are builded based on a set of user-controlled distinct points. Let {xi }ni=1 be a set of control points and {yi }ni=1 their corresponding deformed positions, where xi and yi are column vectors. Typically, the control points can be chosen to locate on the boundary or articulations, though not necessarily. We view the deformation as a function f that maps points in the original image to the deformed image. For f to be meaningful to deformations it should satisfy the following properties [3]: (i) Interpolation: the handles {xi }ni=1 should map directly to {yi }ni=1 under deformation (i.e., f (xi ) = yi ); (ii) Smoothness: f should produce smooth deformations; (iii) Identity: f should be the identity function if the deformed handles {yi }ni=1 are the same as {xi }ni=1 (i.e., ∀i, xi = yi ⇒ f (p) = p with p being an arbitrary point in the image). These properties are very similar to those used in scattered data interpolation. Next, we construct a nonrigid deformation function f satisfying these three properties with a closed-form solution. A. Problem Formulation The mathematical formulation is based on Moving Least Squares (MLS) [8], [3]. Given an arbitrary point p in the image, MLS solves for a rigid-body transformation fp (x) that minimizes a weighted least squares error functional n X

wi (p)kfp (xi ) − yi k2 ,

(1)

i=1

where wi (p) is a non-negative weight function defined as −2α

wi (p) = kp − xi k

wi (p)kxi + gp (xi ) − yi k2 + λΦ(gp ),

i=1

where the coefficient ci is a 2 × 1 dimensional vector (to be determined). Hence, the minimization over the infinite dimensional Hilbert space reduces to finding a finite set of n coefficients ci . Now we define our deformation function f as the initial position plus the displacement function: f (p) = p + gp (p).

(2)

(3)

i=1

where Φ(gp ) is a regularization term with λ > 0 controlling the trade-off between the two terms. Since the weights {wi } are dependent on the evaluation point p and the regularization technique is used to impose smoothness, we call this a Moving Regularized Least Squares (MRLS) minimization. Therefore, we obtain a different displacement function gp (x) for each image point p. The displacement function gp is nonrigid, and we model it by requiring it to lie within a specific functional space,

(5)

Note that this deformation function f is smooth, and as p approaches xi , wi (p) approaches infinity and then the function f interpolates, i.e., f (xi ) = yi . Moreover, if ∀i, xi = yi , then gp (x) ≡ 0 and, therefore, f is the identity transformation, i.e. f (p) = p. B. A Closed-form Solution As we model the displacement function gp in an RKHS, the regularization term then has the form Φ(gp ) = kgp k2Γ , where k · kΓ is the Hilbertian norm of H defined by an inner product, e.g., kgp k2Γ = hgp , gp iΓ . Now we consider the regularized least squares error, i.e. equation (3). Substituting equation (4) into it and choosing a diagonal decomposable kernel [17]: 2 2 Γ(xi , xj ) = e−kxi −xj k /β I with β determining the width of the range of interaction between points and I being an identity matrix, the regularized least squares error can be conveniently expressed in the following matrix form: kW1/2 (X + ΓC − Y)k2F + λtr(CT ΓC),

with α controlling the weight of each control point and k · k being the Euclidean distance metric. The deformation function f is then defined as f (p) = fp (p). Here we introduce a regularization term [11] and generalize this formulation to the nonrigid case. For a point p in the image, we solve for the best displacement function gp (x) that minimizes a weighted regularized least squares error functional n X

namely a reproducing kernel Hilbert space [12], [13]. Note that other parameterized transformation models, for example, thin-plate splines (TPS) [14], can also be easily incorporated into our formulation. We define an RKHS H by a positive definite matrix-valued kernel Γ : IR2 × IR2 → IR2×2 [15], [16]. The optimal displacement function gp which minimizes the regularized least squares error (3) then takes the form [13] n X gp (x) = Γ(x, xi )ci , (4)

(6)

where the kernel matrix Γ ∈ IRn×n is called the Gram matrix 2 2 with Γij = e−kxi −xj k /β , the weight matrix W is a diagonal matrix with the i-th entry wi (p) determined by equation (2), X and Y are the control points and deformed control points respectively, in which the i-th rows represent xi and yi , C = (c1 , · · · , cn )T is the coefficient matrix of size n×2, and k·kF denotes the Frobenius norm. Equation (6) is quadratic in C. Taking the derivative of it with respect to C and setting it to zero, we obtain a closedform solution C = (Γ + λW−1 )−1 (Y − X).

(7)

With this closed-form solution for C we can write a simple expression for the deformation function f (p): f (p) = p + [Γp (Γ + λW−1 )−1 (Y − X)]T ,

(8)

where Γp is an 1 × n row vector with the i-th entry 2 2 e−kp−xi k /β . To create a new deformed image, we approximate the image with a grid and apply the deformation function (8) to each vertex, followed by a bilinear interpolation in each quad. We summarize our method in algorithm 1.

3

Algorithm 1: The Nonrigid Image Deformation Algorithm Input: An image, a kernel Γ, parameters α, β, λ Output: A deformed image n 1 Extract a set of control points: {xi }i=1 ; n 2 Determine the deformed control points: {yi }i=1 ; n 3 Construct the Gram matrix Γ based on {xi }i=1 ; 4 Approximate the image with a grid; 5 repeat 6 Choose a vertex p on the grid; 7 Compute the weight W and vector Γp ; 8 Compute f at vertex p by using equation (8); 9 until all the vertexes have been computed; 10 The deformed image is generated by a bilinear interpolation of {f (p)}.

C. Computational Complexity Due to that the solution (8) involves the inversion of a matrix of size n × n at each image point, the time complexity of our MRLS algorithm is O(n3 l), where l is the total number of evaluation points in an image. In the image deformation problem, the number n of control points is typically small, e.g., in the order of 101 , so the matrix is fast to invert. Moreover, the user in general creates the deformations by manipulating the point set Y, and the points X are fixed. Therefore, much of equation (8) can be precomputed yielding very fast deformations. In particular, we can rewrite equation (8) in the form f (p) = T + (SY)T ,

(9)

where S = Γp (Γ + λW−1 )−1 and T = p − (SX)T can be precomputed leading to very fast implementation. In this case, the time complexity of our MRLS algorithm is reduced to O(nl). Therefore, our MRLS with nonrigid model can perform more efficiently (see Table I). Parameter Initialization. There are mainly three parameters in our deformation algorithm: α, β, and λ. Parameter α controls the weight of each control point. Parameters β and λ both reflect the amount of the smoothness constraint. Parameter β determines how wide the range of interaction between points. Parameter λ controls the trade-off between the closeness to the data and the smoothness of the solution. In our evaluation, we found our method was very robust to parameter tuning, and we set α = 2, β = 2, and λ = 1 throughout our experiments. Besides, the performance depends, typically, on the coordinate system in which points are expressed. We use data normalization to control for this. More specifically, we perform a linear re-scaling of the correspondences so that the control points and the deformed control points both have zero mean and unit variance. III. E XPERIMENTAL R ESULTS We present a few representative deformation results obtained with our technique in Fig. 2. For comparison, we also provide the results of MLS [3], a state-of-the-art algorithm which operates on the rigid transformation. In the figures, the first row presents the original images with the control points

TABLE I RUNTIMES OF MLS AND MRLS ON FOUR IMAGES IN F IG . 2.

Woody1 Woody2 Tux Pregnant

grid

#ctrl pt

MLS [3]

55 × 48 55 × 48 151 × 128 128 × 113

9 6 6 11

3.6 ms 2.6 ms 12.8 ms 15.5 ms

MRLS 0.08 0.08 0.32 0.31

ms ms ms ms

marked by red points, while the middle and bottom rows are the corresponding deformation results of MLS and MRLS respectively with the deformed control points marked by red points. We first consider the first column (Woody1), in which the control points are located on the junctions between body parts; we see that both algorithms have their advantages and can produce natural deformations. For the MLS algorithm with rigid transformation, it preserves the global shape and local details quite well, especially the shape of the head. While our MRLS algorithm with nonrigid transformation can produce a streamlined shape, for example, the contours of the arms and torso are smooth; nevertheless, the head becomes a little flat. To further investigate performance differences between the two methods, we subsequently choose a set of control points distributed uniformly on the body as shown in the second column (Woody2), where the body should deform coherently. We see that in this case the performance of MLS begins to degenerate, however, our MRLS is much more robust and still produces a natural deformation. Another undesirable property of MLS is fold-over, as shown in the third column (Tux) highlighted with a blue box; this can be eliminated in our MRLS method. Finally, the example in the last column (Pregnant) shows our method’s capability of reshaping human body. Note that in the result of MLS, i.e., the middle right figure, the deformation area on the clothes and stomach becomes waves. Table I summarizes the runtimes of MLS and MRLS on an Intel Core 2.5 GHz PC with M ATLAB code. We see that both algorithms perform quite fast and can be performed in real-time. Due to the closed-form solution (9) only involves simple matrix operations such as addition and multiplication, the use of nonrigid model in our MRLS algorithm does not lead to efficiency decreases; on the contrary, it performs even faster than the MLS algorithm with rigid model. IV. C ONCLUSION In conclusion, this letter presents a new technique for image deformation based on Moving Regularized Least Squares (MRLS) minimization. We model the deformation function by requiring it to lie within an RKHS, and provide a simple closed-form solution of it. The proposed method is able to produce natural nonrigid deformations, especially for motions of coherent objects. Moreover, it is very fast and can be performed in real-time. ACKNOWLEDGMENT The authors gratefully acknowledge the financial support from the China Scholarship Council (No. 201206160008) and National Natural Science Foundation of China (No. 61273279).

4

4

4

5

Original

MLS

MRLS

Fig. 2. Deformation results. Top: original images; middle: deformation results of MLS (rigid) [3]; bottom: deformation results of MRLS (nonrigid). From Fig. 2. results. Deformation results. Top: original images; middle: deformation of MLS[3]; (rigid) [3]; bottom: deformation results of MRLS (nonrigid). From Fig.right: 2. Deformation Top: images; middle: deformation ofresults MLS (rigid) bottom: deformation results ofdeformed MRLS (nonrigid). From left to Woody1, Woody2, Tux, andoriginal Pregnant. marked redThe points inresults the original are control points, andpoints, those in the images are left to right: Woody1, Woody2, Tux,The andThe Pregnant. marked red points in images the original images are control and those indeformed the deformed images are left to right: Woody1, Woody2, Tux, and Pregnant. marked red points in the original images are control points, and those in the images are deformed control points. deformed deformed control points.control points. [2] T. Igarashi, T.[2]Moscovich, J. F. Hughes, [12] G. Wahba, Spline Models for Observational Data. Philadelphia, American Mathematical Society, vol. 68, SIAM, no. 3, pp. 337–404, 1950. T. Igarashi,and T. Moscovich, and “As-Rigid-as-Possible J. F. Hughes, “As-Rigid-as-Possible R EFERENCES [10] Yang, L.A.Bourdev, Shechtman, J. Wang, andVector-Valued D. Metaxas, Functions,” “Facial Shape Manipulation,” ACM on Graphics, 3, F. [11] C. MicchelliE.and M. Pontil, “On Learning Shape Manipulation,” ACM Transactions on Transactions Graphics, vol. 24, no. 3,vol. 24, no. PA, 1990. Expression in Video Temporally-Smooth Factorization,” pp. 1134–1141, 2005. vol. 17,aD. no. 1, pp. 177–204, 2005. pp. 1134–1141, J. Neural Ma,Editing J. Computation, Tian, J. Ma,Using and Zhang, “A Robust Method [1] M. Alexa, D.2005. Cohen-Or, and D. Levin, “As-Rigid-as-Possible Shape In-[13] J. Zhao, in Proceedings of Spline IEEEwith conference on Computer Vision andPhiladelphia, Pattern [12] Field G. Wahba, Models for Observational Data. SIAM, Schaefer, J. Warren, “Image Deformation Using for Vector Learning Application to Mismatch Removing,” [3] S. Schaefer, T.[3]McPhail, and T.J.ofMcPhail, Warren, “Image Deformation terpolation,” inS.Proceedings the 27thand annual conference on Using Computer Recognition, 2012, pp.conference 861–868. on Computer Vision and Pattern PA, of 1990. MovingACM LeastTransactions Squares,” ACM Transactions on 25, Graphics, vol. 25, no. 3,Proceedings in IEEE Moving Least Squares,” on Graphics, vol. no. 3, graphics and interactive techniques, 2000, pp. 157–164. [11] V. [13] Devlaminck, “A2977–2984. Functional Compressible or “A Incompressible J.2011, Zhao,pp.J. Ma, J. Tian, J.forMa, and D. Zhang, Robust Method pp. 533–540, 2006. Recognition, pp. 533–540, 2006. [2] T. Igarashi, T. Moscovich, and J. F. Hughes, “As-Rigid-as-Possible for Vector Field Learning withYuille, Application to Estimation Mismatch Elastic Deformation Estimation,” Signal Processing Letters,Removing,” vol. [4] T. Ju, J. Warren, G. Eichele, C. Thaller, W. Chiu, and J. Carson, [14] J. Ma, J. Zhao, J. Tian, Z. Tu, and IEEE A. “Robust of 6, Shape Manipulation,” ACM Transactions on Graphics, vol. 24, no. 3, [4] T. Ju, J. Warren, G. Eichele, C. Thaller, W. Chiu, and J. Carson, in 162–164, Proceedings of IEEE conference on Computer Vision and Pattern “A Geometric Database for Gene Expression Data,” in Proceedings no. 7,Transformation pp. 1999. Nonrigid for Point Set Registration,” in Proceedings of pp. 1134–1141, 2005. “A Geometric Database for Gene Expression Data,” in Proceedings Recognition, 2011, 2977–2984. Kernels,” Transactions of the of the 2003 Eurographics/ACM SIGGRAPH symposium on Geometry [12] Aronszajn, “Theory ofpp. Reproducing IEEEN.conference Vision and Pattern Recognition, [3] S. 2003 Schaefer, T. McPhail,2003, and J. Warren, “Image Deformation Using of the Eurographics/ACM SIGGRAPH symposium on Geometry [14] J. Mathematical Ma,onJ.Computer Zhao, J.Society, Tian, Z.vol. Tu,68, andno. A. 3,Yuille, “Robust2013. Estimation of processing, pp. 166–176. American pp. 337–404, 1950. Moving2003, Least Squares,” ACM “Principal Transactions on Graphics, 25, and no. the 3, Decomprocessing, Nonrigid Transformation for Point Set Registration,” in Functions,” Proceedings of [5] pp. F. L.166–176. Bookstein, Warps: Thin-Plate vol. Splines [13] C. A. Micchelli and M. Pontil, “On Learning Vector-Valued 533–540,“Principal 2006. IEEE conference Computer Vision and Pattern position ofWarps: Deformations,” IEEE Transactions Pattern Analysis and Neural Computation, [5] F. L.pp. Bookstein, Thin-Plate Splines and theonDecomvol.on17, no. 1, pp. 177–204, 2005.Recognition, 2013. [4] T. Ju, J. Warren, G. Intelligence, Eichele, C.vol. Thaller, Chiu, and J.1989. Carson, [14] G. Wahba, Spline Models for Observational Data. SIAM, Philadelphia, Machine 11, no. 6, pp. 567–585, position of Deformations,” IEEE Transactions onW. Pattern Analysis and “A Geometric Gene Expression Data,” Proceedings in ACM PA, 1990. [6] T.Database Beier S. Imagein Metamorphosis,” Machine Intelligence, vol. and 11, for no.Neely, 6, pp.“Feature-Based 567–585, 1989. of the Eurographics/ACM SIGGRAPH symposium on SIGGRAPH Computer Graphics, 1992, pp. 35–42. [6] T. Beier and2003 S. Neely, “Feature-Based Image Metamorphosis,” in Geometry ACM [15] J. Zhao, J. Ma, J. Tian, J. Ma, and D. Zhang, “A Robust Method [7]2003, R. MacCracken and K. pp. I. Joy, “Free-Form Deformations with Lattices processing, pp. 166–176. SIGGRAPH Computer Graphics, 1992, 35–42. for Vector Field Learning with Application to Mismatch Removing,” of Arbitrary Topology,” in ACM SIGGRAPH Computer Graphics, 1996, [5] F. L. Bookstein, “Principal Warps: Thin-Plate Splines and the Decomin Proceedings of IEEE conference on Computer Vision and Pattern [7] R. MacCracken and I. Joy, “Free-Form Deformations with Lattices pp.K. 181–188. position of Deformations,” IEEE Transactions on Pattern Analysis and Recognition, 2011, pp. 2977–2984. of Arbitrary Topology,” in ACM SIGGRAPH Computer Graphics, 1996, [8] D. Levin, “The Approximation Power of Moving Least-Squares,” MathMachine Intelligence, vol. 11, no. 6, pp. 567–585, 1989. [16] J. Ma, J. Zhao, J. Tian, X. Bai, and Z. Tu, “Regularized Vector Field pp. 181–188. ematics of Computation, vol. 67, no. 224, pp. 1517–1531, 1998. [6] T. Beier and S. Neely, “Feature-Based Image Metamorphosis,” in ACM [8] D. Levin, “The[9] Approximation Power of Moving Least-Squares,” V. Devlaminck, “A 1992, Functional for CompressibleMathor Incompressible Learning with Sparse Approximation for Mismatch Removal,” Pattern SIGGRAPH Computer Graphics, pp. 35–42. ematics of Computation, vol. 224, pp. 1517–1531, 1998. Elastic Estimation,” IEEE Signal Processing Letters, vol. 6, Recognition, 2013. [7] R. MacCracken and Deformation K. I.67, Joy,no.“Free-Form Deformations with Lattices [17] J. Ma, J. Zhao, J. Tian, Z. Tu, and A. Yuille, “Robust Estimation of no. 7, pp. 162–164, 1999. [9] V. Devlaminck, “A Functional for Compressible or Incompressible of Arbitrary Topology,” in ACM SIGGRAPH Computer Graphics, 1996, [10] N. Aronszajn, “Theory of Reproducing Kernels,” Transactions of the Nonrigid Transformation for Point Set Registration,” in Proceedings of Elastic Deformation Estimation,” IEEE Signal Processing Letters, vol. 6, pp. 181–188. IEEE conference on Computer Vision and Pattern Recognition, 2013. no. 7, 162–164, 1999. [8] D.pp. Levin, “The Approximation Power of Moving Least-Squares,” Math[10] N. Aronszajn, of Reproducing Transactions1998. of the ematics of “Theory Computation, vol. 67, no. Kernels,” 224, pp. 1517–1531, American Mathematical Society, 68,Funkhouser, no. 3, pp. 337–404, [9] Y. Lipman, V. G. Kim, and vol. T. A. “Simple 1950. Formulas for Plane Deformations,” ACM Transactions on Graphics, [11] C. A.Quasiconformal Micchelli and M. Pontil, “On Learning Vector-Valued Functions,” vol.Computation, 31, no. 5, p. vol. 124, 17, 2012. Neural no. 1, pp. 177–204, 2005.