Realtime Performance-Based Facial Animation

Kinect-based face reconstruction: the texture is extracted and noise is removed, and the resulting model is used for subsequent point-to-point realtime communication.

Figure 3: Acquisition of user expressions for offline model building. Aggregating multiple scans under slight head rotation reduces noise and fills in missing data. (Panels: Kinect raw depth maps, Kinect raw image, accumulated scans, accumulated texture, 3D model.)

Figure 5: Offline pre-processing for building the user-specific expression model. Pre-defined example poses of the user with known blendshape weights are scanned and registered to a template mesh to yield a set of user-specific expressions. An optimization then solves for the user-specific blendshapes that maintain the semantics of the generic blendshape model. The inset shows how manually selected correspondences guide the reconstruction of user-specific expressions. (Pipeline stages: accumulated scans, morphable model with manual markup, generic template, non-rigid ICP, example-based facial rigging with generic blendshapes, user-specific blendshapes.)

We can use existing blendshape animations, which are ubiquitous in movie and game production, to define the dynamic expression priors. The underlying hypothesis is that the blendshape weights of a human facial animation sequence provide a sufficient level of abstraction to enable expression transfer between different characters. Finally, the output generated by our algorithm, a temporal sequence of blendshape weights, can be directly imported into commercial animation tools, thus facilitating integration into existing production workflows.

2 Facial Expression Model

A central component of our tracking algorithm is a facial expression model that provides a low-dimensional representation of the user's expression space. We build this model in an offline preprocessing step by adapting a generic blendshape model with a small set of expressions performed by the user. These expressions are captured with the Kinect prior to online tracking and reconstructed using a morphable model combined with non-rigid alignment methods. Figure 5 summarizes the different steps of our algorithm for building the facial expression model. We omit a detailed description of the previous methods that are integrated into our algorithm; please refer to the cited papers for parameter settings and implementation details.

Acquisition Hardware. All input data is acquired using the Kinect system, i.e. no other hardware such as laser scanners is required for user-specific model building. The Kinect supports simultaneous capture of a 2D color image and a 3D depth map at 30 frames per second, based on invisible infrared projection (Figure 4). Essential benefits of this low-cost acquisition device include ease of deployment and sustained operability in a natural environment. The user is neither required to wear any physical markers or specialized makeup, nor is the performance adversely affected by intrusive light projections or clumsy hardware contraptions. However, these key advantages come at the price of a substantial degradation in data quality compared to state-of-the-art performance capture systems based on markers and/or active lighting. Ensuring robust processing given the low resolution and high noise levels of the input data is the primary challenge that we address in this paper.

Figure 4: The Kinect simultaneously captures a 640 x 480 color image and a corresponding depth map at 30 Hertz, computed via triangulation of an infrared projector and camera.
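As a side note for readers who want to experiment with the raw sensor stream, the short sketch below back-projects a Kinect depth map into a camera-space point cloud with a standard pinhole model. This is not part of the paper's pipeline description; the intrinsic parameters fx, fy, cx, cy are illustrative placeholders that would normally come from calibration.

    import numpy as np

    def depth_to_points(depth, fx=575.0, fy=575.0, cx=319.5, cy=239.5):
        """Back-project a depth map (in meters) to 3D points in the depth camera frame."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        points = np.stack([x, y, depth], axis=-1)   # shape (h, w, 3)
        return points[depth > 0]                    # discard invalid zero-depth pixels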
Data Capture. To customize the generic blendshape rig, we record a pre-defined sequence of example expressions performed by the user. Since single depth maps acquired with the Kinect exhibit high noise levels, we aggregate multiple scans over time using the method described in [Weise et al. 2008] (see Figure 3). The user is asked to perform a slight head rotation while keeping the expression fixed (see video). Besides exposing the entire face to the scanner, this rotational motion has the additional benefit of alleviating the reconstruction bias introduced by the spatially fixed infrared dot pattern projected by the Kinect. We use the method of [Viola and Jones 2001] to detect the face in the first frame of the acquisition and accumulate the acquired color images to obtain the skin texture using Poisson reconstruction [Perez et al. 2003].

Expression Reconstruction. We use the morphable model of Blanz and Vetter [1999] to represent the variations of different human faces in neutral expression. This linear PCA model is first registered towards the recorded neutral pose to obtain a high-quality template mesh that roughly matches the geometry of the user's face. We then warp this template to each of the recorded expressions using the non-rigid registration approach of [Li et al. 2009]. To improve registration accuracy, we incorporate additional texture constraints in the mouth and eye regions. For this purpose, we manually mark features as illustrated in Figure 5. The integration of these constraints is straightforward and easily extends the framework of [Li et al. 2009] with positional constraints.

Blendshape Reconstruction. We represent the dynamics of facial expressions using a generic blendshape rig based on Ekman's Facial Action Coding System (FACS) [1978]. To generate the full set of blendshapes of the user, we employ example-based facial rigging as proposed by Li et al. [2010]. This method takes as input a generic blendshape model, the reconstructed example expressions, and approximate blendshape weights that specify the appropriate linear combination of blendshapes for each expression. Since the user is asked to perform a fixed set of expressions, these weights are manually determined once and kept constant for all users. Given this data, example-based facial rigging performs a gradient-space optimization to reconstruct the set of user-specific blendshapes that best reproduce the example expressions (Figure 5). We use the same generic blendshape model with m = 39 blendshapes in all our examples.

3 Realtime Tracking

The user-specific blendshape model defines a compact parameter space suitable for realtime tracking. We decouple the rigid from the non-rigid motion and directly estimate the rigid transform of the user's face before performing the optimization of the blendshape weights. We found that this decoupling not only simplifies the formulation of the optimization, but also leads to improved robustness of the tracking.

Figure 6: The colored region on the left indicates the portion of the face (the rigid tracking mask) used for rigid tracking. The graph on the right illustrates how temporal filtering adapts to the speed of motion. (Graph axes: normalized weight vs. frame.)

Figure 7: Robustly tracking the rigid motion of the face is crucial for expression reconstruction. Even with large occlusions and fast motion, we can reliably track the user's global pose.

Rigid Tracking. We align the reconstructed mesh of the previous frame with the acquired depth map of the current frame using ICP with point-plane constraints. To stabilize the alignment we use a pre-segmented template (Figure 6, left) that excludes the chin region from the registration, as this part of the face typically exhibits the strongest deformations. As illustrated in Figure 7, this results in robust tracking even for large occlusions and extreme facial expressions. We also incorporate a temporal filter to account for the high-frequency flickering of the Kinect depth maps. The filter is based on a sliding window that dynamically adapts the smoothing coefficients in the spirit of the exponentially weighted moving average method [Roberts 1959] to reduce high-frequency noise while avoiding disturbing temporal lags. We independently filter the translation vector and the quaternion representation of the rotation. For a translation or quaternion vector t_i at the current frame i, we compute the smoothed vector as a weighted average in a window of size k as

\bar{t}_i = \frac{\sum_{j=0}^{k-1} w_j \, t_{i-j}}{\sum_{j=0}^{k-1} w_j},    (1)

where t_{i-j} denotes the vector at frame i-j. The weights w_j are defined as

w_j = e^{-j \cdot H \cdot \max_{l \in [1, k-1]} \| t_i - t_{i-l} \|},    (2)

with a constant H that we empirically determine independently for rotation and translation based on the noise level of a static pose. We use a window size of k = 5 for all our experiments. Scaling the exponent with the maximum variation in the temporal window ensures that high-frequency jitter is effectively removed from the estimated rigid pose (Figure 6, right). As shown in the video, this leads to a stable reconstruction when the user is perfectly still, while fast and jerky motion can still be recovered accurately.
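A minimal numpy sketch of this adaptive filter is given below, assuming the weights follow Equations 1 and 2 as reconstructed above. The function and variable names are our own; for quaternions one would additionally flip signs to a consistent hemisphere and renormalize the result, which is omitted here.

    import numpy as np

    def smooth_pose_vector(history, H, k=5):
        """Weighted average of the last k pose vectors (Eq. 1), with weights from Eq. 2.

        history: list of translation or quaternion vectors, history[-1] is the current frame t_i.
        H:       empirically chosen constant (separate values for rotation and translation).
        """
        window = history[-k:]
        t_i = window[-1]
        # The maximum variation inside the window scales the exponential decay:
        # large motion -> weights fall off quickly -> little smoothing, low latency.
        max_var = max((np.linalg.norm(t_i - t_l) for t_l in window[:-1]), default=0.0)
        samples = np.array(window[::-1])                      # index j corresponds to t_{i-j}
        weights = np.exp(-np.arange(len(samples)) * H * max_var)
        return weights @ samples / weights.sum()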
Non-rigid Tracking. Given the rigid pose, we now need to estimate the blendshape weights that capture the dynamics of the facial expression of the recorded user. Our goal is to reproduce the user's performance as closely as possible, while ensuring that the reconstructed animation lies in the space of realistic human facial expressions. Since blendshape parameters are agnostic to realism and can easily produce nonsensical shapes, parameter fitting using geometry and texture constraints alone will typically not produce satisfactory results, in particular if the input data is corrupted by noise (see Figure 8). Since human visual interpretation of facial imagery is highly sophisticated, even small tracking errors can quickly lead to visually disturbing artifacts.

3.1 Statistical Model

We prevent unrealistic face poses by regularizing the blendshape weights with a dynamic expression prior computed from a set of existing blendshape animations A = {A_1, ..., A_L}. Each animation A_j is a sequence of blendshape weight vectors a_i \in R^m that sample a continuous path in the m-dimensional blendshape space. We exploit the temporal coherence of these paths by considering a window of n consecutive frames, yielding an effective prior for both the geometry and the motion of the tracked user.

MAP Estimation. Let D_i = (G_i, I_i) be the input data at the current frame i, consisting of a depth map G_i and a color image I_i. We want to infer from D_i the most probable blendshape weights x_i \in R^m for the current frame, given the sequence X_n = x_{i-1}, ..., x_{i-n} of the n previously reconstructed blendshape vectors. Dropping the index i for notational brevity, we formulate this inference problem as a maximum a posteriori (MAP) estimation

x^* = \arg\max_x \; p(x \mid D, X_n),    (3)

where p(\cdot \mid \cdot) denotes the conditional probability. Using Bayes' rule we obtain

x^* = \arg\max_x \; p(D \mid x, X_n) \, p(x, X_n).    (4)

Assuming that D is conditionally independent of X_n given x, we can write

x^* \approx \arg\max_x \; p(D \mid x) \, p(x, X_n).    (5)
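The prior is evaluated on the stacked vector (x, X_n). A small sketch of how such a history might be maintained at runtime is shown below; the deque-based buffer and the choice to pad with copies of x before n frames are available are our own assumptions, not details given in the paper.

    from collections import deque
    import numpy as np

    m, n = 39, 3                              # blendshape count and temporal window size
    history = deque(maxlen=n)                 # X_n: the n previously reconstructed weight vectors

    def stacked_vector(x, history, n=n):
        """Concatenate the current weights x with the n previous vectors, most recent first."""
        prev = list(history)
        while len(prev) < n:                  # at the start of tracking, pad with the current frame
            prev.append(x)
        return np.concatenate([x, *prev])

    # after solving frame i:  history.appendleft(x_solved)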
Prior Distribution. To adequately capture the nonlinear structure of the dynamic expression space while still enabling realtime performance, we represent the prior term p(x, X_n) as a Mixture of Probabilistic Principal Component Analyzers (MPPCA) [Tipping and Bishop 1999b]. Probabilistic principal component analysis (PPCA) (see [Tipping and Bishop 1999a]) defines the probability density function of some observed data x \in R^s by assuming that x is a linear function of a latent variable z \in R^t with s > t, i.e.

x = Cz + \mu + \epsilon,    (6)

where z \sim N(0, I) is distributed according to a unit Gaussian, C \in R^{s \times t} is the matrix of principal components, \mu is the mean vector, and \epsilon \sim N(0, \sigma^2 I) is a Gaussian-distributed noise variable. The probability density of x can then be written as

p(x) = N(x \mid \mu, \, CC^T + \sigma^2 I).    (7)

Figure 8: Without the animation prior, tracking inaccuracies lead to visually disturbing self-intersections. Our solution significantly reduces these artifacts. Even when tracking is not fully accurate, as in the bottom row, a plausible pose is reconstructed. (Columns: input data, without prior, with prior.)

Using this formulation, we define the prior in Equation 5 as a weighted combination of K Gaussians

p(x, X_n) = \sum_{k=1}^{K} \pi_k \, N(x, X_n \mid \mu_k, \, C_k C_k^T + \sigma_k^2 I),    (8)

with weights \pi_k. This representation can be interpreted as a reduced-dimension Gaussian mixture model that attempts to model the high-dimensional animation data with locally linear manifolds modeled with PPCA.

Learning the Prior. The unknown parameters in Equation 8 are the means \mu_k, the covariance matrices C_k C_k^T, the noise parameters \sigma_k, and the relative weights \pi_k of each PPCA in the mixture model. We learn these parameters using the Expectation Maximization (EM) algorithm based on the given blendshape animation sequences A. To increase the robustness of these computations, we estimate the MPPCA in a latent space of the animation sequences A, computed using principal component analysis. By keeping 99% of the total variance we can reduce the dimensionality of the training data by two-thirds, allowing a more stable learning phase with the EM algorithm. Equation 8 can thus be rewritten as

p(x, X_n) = \sum_{k=1}^{K} \pi_k \, N(x, X_n \mid P \mu_k + \mu, \, P M_k P^T),    (9)

where M_k = C_k C_k^T + \sigma_k^2 I is the covariance matrix in the latent space, P is the principal component matrix, and \mu the mean vector. Since the EM algorithm converges to local minima, we run the algorithm 50 times with random initialization to improve the learning accuracy. We use 20 Gaussians to model the prior distribution, and we use one third of the latent space dimension for the PPCA dimension. More details on the implementation of the EM algorithm can be found in [McLachlan and Krishnan 1996].
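The exact MPPCA/EM procedure is described in [Tipping and Bishop 1999b] and [McLachlan and Krishnan 1996]. As a rough, hedged stand-in, the sketch below prepares stacked training windows from blendshape animations, reduces them with PCA keeping 99% of the variance, and fits a full-covariance Gaussian mixture with 20 components and multiple restarts; this approximates the structure of Equations 8 and 9 but is not the authors' implementation, and all names are our own.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.mixture import GaussianMixture

    def learn_prior(animations, n=3, n_components=20, variance=0.99, restarts=10):
        """Fit a mixture prior on windows of n+1 consecutive blendshape weight vectors.

        animations: list of arrays of shape (num_frames, m) with blendshape weights.
        Returns the PCA projection and the mixture model fitted in the latent space.
        """
        windows = []
        for anim in animations:
            for i in range(n, len(anim)):
                # stacked vector (x_i, x_{i-1}, ..., x_{i-n}), most recent first
                windows.append(anim[i - n:i + 1][::-1].reshape(-1))
        X = np.asarray(windows)

        pca = PCA(n_components=variance)              # keep 99% of the total variance
        Z = pca.fit_transform(X)
        gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                              n_init=restarts, random_state=0).fit(Z)
        return pca, gmm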
Likelihood Distribution. By assuming conditional independence, we can model the likelihood distribution in Equation 5 as the product p(D|x) = p(G|x) p(I|x). The two factors capture the alignment of the blendshape model with the acquired depth map and texture image, respectively. We represent the distribution of each likelihood term as a product of Gaussians, treating each vertex of the blendshape model independently.

Let v be the number of vertices in the template mesh and B \in R^{3v \times m} the blendshape matrix. Each column of B defines a blendshape base mesh such that Bx generates the blendshape representation of the current pose. We denote with v_i = (Bx)_i the i-th vertex of the reconstructed mesh. The likelihood term p(G|x) models a geometric registration in the spirit of non-rigid ICP by assuming a Gaussian distribution of the per-vertex point-plane distances:

p(G|x) = \prod_{i=1}^{v} \frac{1}{(2\pi\sigma_{geo}^2)^{1/2}} \exp\!\left( -\frac{\| n_i^T (v_i - v_i^*) \|^2}{2\sigma_{geo}^2} \right),    (10)

where n_i is the surface normal at the corresponding closest point v_i^* in the depth map G.

The likelihood term p(I|x) models texture registration. Since we acquire the user's face texture when building the facial expression model (Figure 3), we can integrate model-based optical flow constraints [DeCarlo and Metaxas 2000] by formulating the likelihood function using per-vertex Gaussian distributions as

p(I|x) = \prod_{i=1}^{v} \frac{1}{(2\pi\sigma_{im}^2)^{1/2}} \exp\!\left( -\frac{\| \nabla I_{p_i}^T (p_i - p_i^*) \|^2}{2\sigma_{im}^2} \right),    (11)

where p_i is the projection of v_i into the image I, \nabla I_{p_i} is the gradient of I at p_i, and p_i^* is the corresponding point in the rendered texture image.

3.2 Optimization

In order to solve the MAP problem as defined by Equation 5, we minimize the negative logarithm, i.e.

x^* = \arg\min_x \; -\ln p(G|x) - \ln p(I|x) - \ln p(x, X_n).    (12)

Discarding constants, we write

x^* = \arg\min_x \; E_{geo} + E_{im} + E_{prior},    (13)

where

E_{geo} = \frac{1}{2\sigma_{geo}^2} \sum_{i=1}^{v} \| n_i^T (v_i - v_i^*) \|^2,    (14)

E_{im} = \frac{1}{2\sigma_{im}^2} \sum_{i=1}^{v} \| \nabla I_{p_i}^T (p_i - p_i^*) \|^2,    (15)

E_{prior} = -\ln p(x, X_n).    (16)

The parameters \sigma_{geo} and \sigma_{im} model the noise level of the data and control the emphasis of the geometry and image likelihood terms relative to the prior term. Since our system provides realtime feedback, we can experimentally determine suitable values that achieve stable tracking performance. For all our results we use the same settings \sigma_{geo} = 1 and \sigma_{im} = 0.45.

The optimization of Equation 13 can be performed efficiently using an iterative gradient solver, since the gradients can be computed analytically (see the derivations in the appendix). In addition, we compute the inverse covariance matrices and the determinants of the MPPCA during the offline learning phase. We use a gradient projection algorithm based on the limited-memory BFGS solver [Lu et al. 1994] in order to enforce that the blendshape weights stay between 0 and 1. The algorithm converges in less than 6 iterations, as we can use efficient warm starting with the previous solution. We then update the closest-point correspondences in E_geo and E_im and re-compute the MAP estimation. We found that 3 iterations of this outer loop are sufficient for convergence.
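The sketch below mirrors the structure of this optimization under several assumptions: the correspondence search is hidden behind placeholder callbacks fit_geo and fit_im that return (A, b) so that each data term has the least-squares form of Equation 17, and neg_log_prior returns the value and x-gradient of E_prior. SciPy's L-BFGS-B is used as a stand-in for the bound-constrained solver of [Lu et al. 1994]; none of these names come from the paper.

    import numpy as np
    from scipy.optimize import minimize

    def solve_frame(x_prev, fit_geo, fit_im, neg_log_prior,
                    sigma_geo=1.0, sigma_im=0.45, outer_iters=3):
        """Minimize E_geo + E_im + E_prior over blendshape weights constrained to [0, 1]."""
        x = x_prev.copy()                                  # warm start with the previous solution
        for _ in range(outer_iters):                       # outer loop: refresh correspondences
            A_g, b_g = fit_geo(x)                          # point-plane constraints (Eq. 14)
            A_i, b_i = fit_im(x)                           # optical flow constraints (Eq. 15)

            def energy(x):
                r_g, r_i = A_g @ x - b_g, A_i @ x - b_i
                e_p, g_p = neg_log_prior(x)
                e = r_g @ r_g / (2 * sigma_geo**2) + r_i @ r_i / (2 * sigma_im**2) + e_p
                g = A_g.T @ r_g / sigma_geo**2 + A_i.T @ r_i / sigma_im**2 + g_p
                return e, g

            res = minimize(energy, x, jac=True, method="L-BFGS-B",
                           bounds=[(0.0, 1.0)] * x.size, options={"maxiter": 6})
            x = res.x
        return x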
4 Results

We present results of our realtime performance capture and animation system and illustrate potential applications. The output of the tracking optimization is a continuous stream of blendshape weight vectors x_i that drive the digital character. Please refer to the accompanying video to better appreciate the facial dynamics of the animated characters and the robustness of the tracking. Figures 1 and 9 illustrate how our system can be applied in interactive applications, where the user controls a digital avatar in realtime. Blendshape weights can be transmitted in realtime to enable virtual encounters in cyberspace. Since the blendshape representation facilitates animation transfer, the avatar can either be a digital representation of the user himself or a different humanoid character, assuming compatible expression spaces.

Figure 9: The user's facial expressions are reconstructed and mapped to different target characters in realtime, enabling interactive animations and virtual conversations controlled by the performance of the tracked user. The smile on the green character's base mesh gives it a happy countenance for the entire animation. (Columns: input data, tracked expression model, virtual avatars; bottom row: blendshape base meshes.)

While we build the user-specific blendshape model primarily for realtime tracking, our technique offers a simple way to create personalized blendshape rigs that can be used in traditional animation tools. Since the Kinect is the only acquisition device required, generating facial rigs becomes accessible for non-professional users.

Statistics. We use 15 user-specific expressions to reconstruct 39 blendshapes for the facial expression model. Manual markup of texture constraints for the initial offline model building requires approximately 2 minutes per expression. We compute the Gaussian mixture model that defines the dynamic expression prior from a total of 9,500 animation frames generated on the generic template model by an animation artist. Depending on the size of the temporal window, these computations take between 10 and 20 minutes.

Our online system achieves sustained framerates of 20 Hertz with a latency below 150 ms. Data acquisition, preprocessing, rigid registration, and display take less than 5 ms. Non-rigid registration, including constraint setup and gradient optimization, requires 45 ms per frame. All timing measurements have been done on an Intel Core i7 2.8 GHz with 8 GBytes of main memory and an ATI Radeon HD 4850 graphics card.
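Driving an avatar amounts to evaluating its rig with the streamed weights. The two-line sketch below uses the notation of Section 3.1, where the columns of B are blendshape base meshes; clipping the weights to [0, 1] is our own safeguard, and "compatible expression spaces" simply means that the avatar's matrix has columns with the same semantics.

    import numpy as np

    def reconstruct_mesh(B, x):
        """Vertices of the current pose: B has shape (3*v, m), x holds the m blendshape weights."""
        return (B @ np.clip(x, 0.0, 1.0)).reshape(-1, 3)

    # retargeting: apply the same weight vector to a different character's blendshape matrix
    # avatar_vertices = reconstruct_mesh(B_avatar, x_frame)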
5 Evaluation

We focus our evaluation on the integration of 2D and 3D input data and the effect of the animation training data. We also comment on limitations and drawbacks of our approach.

Figure 10: The combination of geometric and texture-based registration is essential for realtime tracking. To isolate the effects of the individual components, no animation prior is used in this example. (Columns: input data, geometry only, texture only, geometry + texture.)

Figure 11: Difficult tracking configurations. Right: despite the occlusions by the hands, our algorithm successfully tracks the rigid motion and the expression of the user. Left: with more occlusion [...]

Geometry and Texture. Figure 10 evaluates the interplay between the geometry and texture information acquired with the Kinect. Tracking purely based on geometry, as proposed in [Weise et al. 2009], is not successful due to the high noise level of the Kinect data. Integrating model-based optical flow constraints reduces temporal jitter and stabilizes the reconstruction. In our experiments, only the combination of both modalities yielded satisfactory results. Compared to purely image-based tracking, as e.g. in [Chai et al. 2003], direct access to 3D geometry offers two main benefits: we can significantly improve the robustness of the rigid pose estimation, in particular for non-frontal views (see also Figure 7). In addition, the expression template mesh generated during preprocessing much more closely matches the geometry of the user, which further improves tracking accuracy. Figure 11 shows difficult tracking configurations and provides an indication of the limits of our algorithm.

Figure 12: Effect of different amounts of training data on the performance of the tracking algorithm. We successively delete blendshapes from the input animation sequences, which removes entire portions of the expression space. With only 25% of the blendshapes in the training data, the expressions are not reconstructed correctly.

Animation Prior. Figure 12 studies the effectiveness of our probabilistic tracking algorithm when varying the amount of training data used for the reconstruction. The figure illustrates that if the training data does not contain any sequences that are sufficiently close to the captured performance, the reconstruction can differ substantially from the acquired data. With more training data, the tracked model more closely matches the performing user. What the prior achieves in any case is that the reconstructed pose is plausible, even if not necessarily close to the input geometrically (see also Figure 8). We argue that this is typically much more tolerable than generating unnatural or even physically impossible poses that could severely degrade the visual perception of the avatar. In addition, our approach is scalable in the sense that if the reconstructed animation does not represent certain expressions of the user well, we can manually correct the sequence using standard blendshape animation tools and add the corrected sequence to the training data set. This allows us to successively improve the animation prior in a bootstrapping manner. For the temporal window X_n used in the animation prior, we found a window size of 3 <= n <= 5 to yield good results in general. Longer temporal spans raise the dimensionality and lead to increased temporal smoothing. If the window is too small, temporal coherence is reduced and discontinuities in the tracking data can lead to artifacts.
Limitations. The resolution of the acquisition system limits the geometric and motion detail that can be tracked for each user, hence slight differences in expressions will not be captured adequately. This limitation is aggravated by the wide-angle lens of the Kinect, installed to enable full-body capture, which confines the face region to about 160 x 160 pixels, or less than 10% of the total image area. As a result, our system cannot recover small-scale wrinkles or very subtle movements. We also currently do not model eyes, teeth, tongue, or hair.

In our current implementation, we require user support during preprocessing in the form of manual markup of lip and eye features to register the generic template with the recorded training poses (see Figure 5). In future work, we want to explore the potential of generic active appearance models similar to [Cootes et al. 2001] to automate this step of the offline processing pipeline as well.

While offering many advantages as discussed in Section 1.2, the blendshape representation also has an inherent limitation: the number of blendshapes is a tradeoff between expressiveness of the model and suitability for tracking. Too few blendshapes may result in user expressions that cannot be represented adequately by the pose space of the model. Introducing additional blendshapes to the rig can circumvent this problem, but too many blendshapes may result in a different issue: since blendshapes may become approximately linearly dependent, there might not be a unique set of blendshape weights for a given expression. This can potentially result in unstable tracking due to overfitting of the noisy data. While the prior prevents this instability, a larger number of blendshapes requires a larger training database and negatively affects performance.

6 Conclusion

We have demonstrated that high-quality performance-driven facial animation in realtime is possible even with a low-cost, non-intrusive, markerless acquisition system. We show the potential of our system for applications in human interaction, live virtual TV shows, and computer gaming.

Robust realtime tracking is achieved by building suitable user-specific blendshape models and exploiting the different characteristics of the acquired 2D image and 3D depth map data for registration. We found that learning the dynamic expression space from existing animations is essential. Combining these animation priors with effective geometry and texture registration in a single MAP estimation is our key contribution to achieve robust tracking even for highly noisy input data. While foreseeable technical advances in acquisition hardware will certainly improve data quality in coming years, numerous future applications, e.g. in multi-people tracking, acquisition with mobile devices, or performance capture in difficult lighting conditions, will produce even worse data and will thus put even higher demands on robustness. Our algorithm provides a systematic framework for addressing these challenging problems.

We believe that our system enables a variety of new applications and can be the basis for substantial follow-up research. We currently focus on facial acquisition and ignore other important aspects of human communication, such as hand gestures, which pose interesting technical challenges due to complex occlusion patterns. Enhancing the tracking performance using realtime speech analysis, or integrating secondary effects such as the simulation of hair, are further areas of future research that could help increase the realism of the generated virtual performances. More fundamentally, being able to deploy our system at a massive scale can enable interesting new research in human communication and paves the way for new interaction metaphors in performance-based game play.

Acknowledgements. We are grateful to Lee Perry-Smith for providing the face model for our generic template, Dan Burke for sculpting the CG characters, and Cesar Bravo, Steven McLellan, David Rodrigues, and Volker Helzle for the animations. We thank Gabriele Fanelli for valuable discussions, Duygu Ceylan and Mario Deuss for being actors, and Yuliy Schwarzburg for proofreading the paper. This research is supported by Swiss National Science Foundation grant 20PA21L-129607.

Appendix

We derive the gradients for the optimization of Equation 13. The energy terms for geometry registration E_geo and optical flow E_im can both be written in the form

f(x) = \| A x - b \|^2,    (17)

hence the gradients can easily be computed analytically as

\frac{\partial f(x)}{\partial x} = 2 A^T (A x - b).    (18)

The prior term is of the form

p(x, X_n) = \sum_{k=1}^{K} \pi_k \, N(x, X_n \mid \mu_k, \Sigma_k),    (19)

where \Sigma_k is the covariance matrix. The Gaussians N(x, X_n \mid \mu_k, \Sigma_k) model the combined distribution of the current blendshape vector x \in R^m and the n previous vectors X_n, hence the \Sigma_k are matrices of dimension (n+1)m x (n+1)m. Since we are only interested in the gradient with respect to x, we can discard all components that do not depend on this variable. We split the mean vectors as \mu_k = (\mu_k^x, \mu_k^{X_n}), corresponding to x and X_n respectively. We can write the inverse of \Sigma_k as

\Sigma_k^{-1} = \begin{pmatrix} A_k & B_k \\ C_k & D_k \end{pmatrix}, \quad A_k \in R^{m \times m},\; B_k \in R^{m \times nm},\; C_k \in R^{nm \times m},\; D_k \in R^{nm \times nm},    (20)

with B_k = C_k^T. We then obtain for the gradient of the prior energy term

\frac{\partial E_{prior}}{\partial x} = \frac{\sum_{k=1}^{K} \pi_k \, N(x, X_n \mid \mu_k, \Sigma_k) \left( (x - \mu_k^x)^T A_k + (X_n - \mu_k^{X_n})^T C_k \right)}{\sum_{k=1}^{K} \pi_k \, N(x, X_n \mid \mu_k, \Sigma_k)}.    (21)

The complete gradient is the sum of the three energy gradients derived above:

\frac{\partial E}{\partial x} = \frac{\partial E_{geo}}{\partial x} + \frac{\partial E_{im}}{\partial x} + \frac{\partial E_{prior}}{\partial x}.    (22)
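For completeness, a small numpy sketch of Equations 19-21 is given below. It evaluates E_prior and its gradient with respect to x from precomputed inverse covariances and log-determinants, using a log-sum-exp for numerical stability; the data layout (full inverse covariances rather than the latent-space factors of Equation 9) and all names are our own simplifications, not the authors' implementation.

    import numpy as np

    def prior_energy_and_grad(x, Xn, pis, means, Sigma_invs, log_dets):
        """E_prior = -log sum_k pi_k N((x, Xn) | mu_k, Sigma_k) and its x-gradient (Eq. 21)."""
        m = x.size
        y = np.concatenate([x, Xn])
        dim = y.size
        log_terms, grads = [], []
        for pi_k, mu_k, S_inv, log_det in zip(pis, means, Sigma_invs, log_dets):
            d = y - mu_k
            log_N = -0.5 * (dim * np.log(2 * np.pi) + log_det + d @ S_inv @ d)
            A_k = S_inv[:m, :m]                    # upper-left block of Eq. 20
            C_k = S_inv[m:, :m]                    # lower-left block, with B_k = C_k^T
            log_terms.append(np.log(pi_k) + log_N)
            grads.append(A_k @ d[:m] + C_k.T @ d[m:])
        log_terms = np.array(log_terms)
        shift = log_terms.max()
        w = np.exp(log_terms - shift)              # proportional to pi_k * N_k
        energy = -(shift + np.log(w.sum()))
        grad = (w[:, None] * np.array(grads)).sum(axis=0) / w.sum()
        return energy, grad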
References

ALEXANDER, O., ROGERS, M., LAMBETH, W., CHANG, M., AND DEBEVEC, P. 2009. The Digital Emily project: photoreal facial modeling and animation. In ACM SIGGRAPH 2009 Courses.

BEELER, T., BICKEL, B., BEARDSLEY, P., SUMNER, B., AND GROSS, M. 2010. High-quality single-shot capture of facial geometry. ACM Trans. Graph. 29, 40:1-40:9.

BLACK, M. J., AND YACOOB, Y. 1995. Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion. In ICCV, 374-381.

BLANZ, V., AND VETTER, T. 1999. A morphable model for the synthesis of 3D faces. In Proc. SIGGRAPH 99.

BORSHUKOV, G., PIPONI, D., LARSEN, O., LEWIS, J. P., AND TEMPELAAR-LIETZ, C. 2005. Universal Capture - image-based facial animation for "The Matrix Reloaded". In ACM SIGGRAPH 2005 Courses.

BRADLEY, D., HEIDRICH, W., POPA, T., AND SHEFFER, A. 2010. High resolution passive facial performance capture. ACM Trans. Graph. 29, 41:1-41:10.

CHAI, J.-X., XIAO, J., AND HODGINS, J. 2003. Vision-based control of 3D facial animation. In SCA.

CHUANG, E., AND BREGLER, C. 2002. Performance driven facial animation using blendshape interpolation. Tech. rep., Stanford University.

COOTES, T., EDWARDS, G., AND TAYLOR, C. 2001. Active appearance models. PAMI 23, 681-685.

COVELL, M. 1996. Eigen-points: Control-point location using principal component analyses. In FG '96.

DECARLO, D., AND METAXAS, D. 1996. The integration of optical flow and deformable models with applications to human face shape and motion estimation. In CVPR.

DECARLO, D., AND METAXAS, D. 2000. Optical flow constraints on deformable models with applications to face tracking. IJCV 38, 99-127.

EKMAN, P., AND FRIESEN, W. 1978. Facial Action Coding System: A Technique for the Measurement of Facial Movement. Consulting Psychologists Press.

ESSA, I., BASU, S., DARRELL, T., AND PENTLAND, A. 1996. Modeling, tracking and interactive animation of faces and heads using input from video. In Proc. Computer Animation.

FURUKAWA, Y., AND PONCE, J. 2009. Dense 3D motion capture for human faces. In CVPR.

GROCHOW, K., MARTIN, S. L., HERTZMANN, A., AND POPOVIC, Z. 2004. Style-based inverse kinematics. ACM Trans. Graph. 23, 522-531.

GUENTER, B., GRIMM, C., WOOD, D., MALVAR, H., AND PIGHIN, F. 1998. Making faces. In Proc. SIGGRAPH 98.

IKEMOTO, L., ARIKAN, O., AND FORSYTH, D. 2009. Generalizing motion edits with Gaussian processes. ACM Trans. Graph. 28, 1:1-1:12.

LAU, M., CHAI, J., XU, Y.-Q., AND SHUM, H.-Y. 2007. Face Poser: interactive modeling of 3D facial expressions using model priors. In SCA.

LI, H., ROIVAINEN, P., AND FORCHHEIMER, R. 1993. 3-D motion estimation in model-based facial image coding. PAMI 15, 545-555.

LI, H., ADAMS, B., GUIBAS, L., AND PAULY, M. 2009. Robust single-view geometry and motion reconstruction. ACM Trans. Graph. 28, 175:1-175:10.

LI, H., WEISE, T., AND PAULY, M. 2010. Example-based facial rigging. ACM Trans. Graph. 29, 32:1-32:6.

LIN, I.-C., AND OUHYOUNG, M. 2005. Mirror MoCap: Automatic and efficient capture of dense 3D facial motion parameters from video. The Visual Computer 21, 6, 355-372.

LOU, H., AND CHAI, J. 2010. Example-based human motion denoising. IEEE Trans. on Visualization and Computer Graphics 16, 870-879.

LU, P., NOCEDAL, J., ZHU, C., AND BYRD, R. H. 1994. A limited-memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing.
MA, W.-C., HAWKINS, T., PEERS, P., CHABERT, C.-F., WEISS, M., AND DEBEVEC, P. 2007. Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination. In EUROGRAPHICS Symposium on Rendering.

MCLACHLAN, G. J., AND KRISHNAN, T. 1996. The EM Algorithm and Extensions. Wiley-Interscience.

PEREZ, P., GANGNET, M., AND BLAKE, A. 2003. Poisson image editing. ACM Trans. Graph. 22, 313-318.

PIGHIN, F., AND LEWIS, J. P. 2006. Performance-driven facial animation. In ACM SIGGRAPH 2006 Courses.

PIGHIN, F., SZELISKI, R., AND SALESIN, D. 1999. Resynthesizing facial animation through 3D model-based tracking. ICCV 1, 143-150.

ROBERTS, S. 1959. Control chart tests based on geometric moving averages. Technometrics, 239-250.

TIPPING, M. E., AND BISHOP, C. M. 1999a. Probabilistic principal component analysis. Journal of the Royal Statistical Society, Series B.

TIPPING, M. E., AND BISHOP, C. M. 1999b. Mixtures of probabilistic principal component analyzers. Neural Computation 11.

VIOLA, P., AND JONES, M. 2001. Rapid object detection using a boosted cascade of simple features. In CVPR.

WEISE, T., LEIBE, B., AND VAN GOOL, L. 2008. Accurate and robust registration for in-hand modeling. In CVPR.

WEISE, T., LI, H., VAN GOOL, L., AND PAULY, M. 2009. Face/Off: Live facial puppetry. In SCA.

WILLIAMS, L. 1990. Performance-driven facial animation. In Comp. Graph. (Proc. SIGGRAPH 90).

WILSON, C. A., GHOSH, A., PEERS, P., CHIANG, J.-Y., BUSCH, J., AND DEBEVEC, P. 2010. Temporal upsampling of performance geometry using photometric alignment. ACM Trans. Graph. 29, 17:1-17:11.

ZHANG, L., SNAVELY, N., CURLESS, B., AND SEITZ, S. M. 2004. Spacetime faces: high resolution capture for modeling and animation. ACM Trans. Graph. 23, 548-558.

ZHANG, S., AND HUANG, P. 2004. High-resolution, real-time 3D shape acquisition. In CVPR Workshop.
