Teaser figure: From multi-view captures, our method NeuralFur reconstructs detailed geometries of animals with a mesh-based body and strand-based fur. The reconstructions can be integrated into computer graphics frameworks and rendered with artist-defined colors.

Abstract

We present DynHair, a novel method for tracking and modeling dynamic hair for human head avatars. From video input, we reconstruct a dynamic head avatar with an explicit strand-based hair representation using structured 3D Gaussian Splatting. The face region of a human head avatar can be modeled with 3D Gaussians that are attached to, or generated with respect to, an expressive 3D head model. Hair, in contrast, is particularly challenging because it exhibits dynamic motion effects. We therefore model the dynamic deformations of the hair strands with a temporal network that is conditioned on the angular velocity and acceleration of the head, as well as relative gravity. Specifically, an LSTM encodes the motion history and modulates per-point strand features via FiLM conditioning; an MLP then maps the modulated features to physically plausible displacements of the canonical hairstyle. We jointly optimize this motion and appearance representation of the hair together with a 3DGS-based representation of the face region, via differentiable Gaussian splatting with photometric, geometric, and physics-based supervision. Our method yields hair tracking for the training video and an animatable head avatar with controllable hair dynamics. In our experiments, we demonstrate state-of-the-art performance in terms of hair dynamics, temporal consistency, and generalization across subjects.
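To make the LSTM, FiLM, and MLP chain in the abstract concrete, here is a minimal NumPy sketch of one forward pass. It is an illustration under stated assumptions, not the DynHair implementation: the 9-dimensional motion input (angular velocity, acceleration, relative gravity, 3 values each), all layer sizes, the random weights, and the helper names `lstm_encode`, `film`, and `mlp_displacement` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_encode(history, W, U, b):
    """Run a single-layer LSTM over a (T, D) motion history.

    W: (4H, D), U: (4H, H), b: (4H,). Returns the final hidden state (H,).
    """
    hidden_size = U.shape[1]
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x in history:
        z = W @ x + U @ h + b
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output gates
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def film(features, h, W_gamma, b_gamma, W_beta, b_beta):
    """FiLM: scale and shift per-point features (P, F) with a shared motion code h."""
    gamma = W_gamma @ h + b_gamma  # (F,)
    beta = W_beta @ h + b_beta     # (F,)
    return gamma * features + beta  # broadcast over the P strand points

def mlp_displacement(x, W1, b1, W2, b2):
    """Two-layer ReLU MLP mapping modulated features (P, F) to 3D offsets (P, 3)."""
    hidden = np.maximum(x @ W1 + b1, 0.0)
    return hidden @ W2 + b2

# Assumed sizes: T history frames, D motion channels, H hidden units,
# P strand points, F feature channels per point.
T, D, H, P, F = 8, 9, 16, 100, 32
rng = np.random.default_rng(0)

history = rng.normal(size=(T, D))    # stand-in for head motion history
features = rng.normal(size=(P, F))   # stand-in for per-point strand features
canonical = rng.normal(size=(P, 3))  # stand-in for canonical strand points

W, U, b = 0.1 * rng.normal(size=(4 * H, D)), 0.1 * rng.normal(size=(4 * H, H)), np.zeros(4 * H)
W_gamma, b_gamma = 0.1 * rng.normal(size=(F, H)), np.ones(F)
W_beta, b_beta = 0.1 * rng.normal(size=(F, H)), np.zeros(F)
W1, b1 = 0.1 * rng.normal(size=(F, 64)), np.zeros(64)
W2, b2 = 0.1 * rng.normal(size=(64, 3)), np.zeros(3)

h = lstm_encode(history, W, U, b)
modulated = film(features, h, W_gamma, b_gamma, W_beta, b_beta)
deformed = canonical + mlp_displacement(modulated, W1, b1, W2, b2)
```

Initializing the FiLM scale bias to one and the shift bias to zero makes the modulation start close to the identity, a common choice when the displacements are meant to be small corrections to a canonical shape.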

Video Presentation

Main idea

Given multi-view video input, we first initialize a canonical hybrid model for the upper body and hair. Hair strands are initialized using the Im2Haircut prior model, and the upper body is represented with unstructured 3D Gaussians. We then jointly optimize hair and head deformations with two separate motion models. Hair deformation is conditioned on BFM motion history with an additional rigid transformation, while head deformation is conditioned on facial expressions and pose parameters. Finally, the hair and upper-body Gaussians are composited and rendered via differentiable splatting to produce the output images.
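As a rough illustration of how a rigid transformation and a gravity cue might enter the hair model, the sketch below expresses world gravity in the head's local frame and places displaced canonical strand points with a rigid head pose. The interpretation of "relative gravity" as gravity rotated into the head frame, the order of operations (displace first, then apply the rigid transform), the y-down gravity convention, and the helper names are all assumptions made for this example.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def relative_gravity(R_head, g_world=(0.0, -1.0, 0.0)):
    """Express the world gravity direction in the head's local frame (assumed definition)."""
    return R_head.T @ np.asarray(g_world)

def pose_strands(canonical, displacement, R, t):
    """Apply non-rigid offsets in canonical space, then the rigid head transform.

    canonical, displacement: (P, 3); R: (3, 3); t: (3,). Returns (P, 3).
    """
    return (canonical + displacement) @ R.T + t

# Head turned 90 degrees about z and lifted by one unit along z.
R = rot_z(np.pi / 2)
t = np.array([0.0, 0.0, 1.0])

g_local = relative_gravity(R)  # gravity as seen from the rotated head
points = pose_strands(np.array([[1.0, 0.0, 0.0]]), np.zeros((1, 3)), R, t)
```

Feeding gravity in the head frame lets a network that operates in canonical space reason about which way the hair should hang, regardless of the head's orientation in the world.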

Comparisons

Acknowledgements and Disclosure

Vanessa Sklyarova is supported by the Max Planck ETH Center for Learning Systems. Berna Kabadayi is supported by the International Max Planck Research School for Intelligent Systems (IMPRS-IS). Justus Thies is supported by the ERC Starting Grant 101162081 "LeMo" and the DFG Excellence Strategy (EXC-3057). The authors would like to thank Peter Kulits and Silvia Zuffi for their discussions on the project, Tomasz Niewiadomski for providing results on GenZoo, and Benjamin Pellkofer for IT support.


MJB has received research gift funds from Adobe, Intel, Nvidia, Meta/Facebook, and Amazon. MJB has financial interests in Amazon, Datagen Technologies, and Meshcapade GmbH. While MJB is a consultant for Meshcapade, his research in this project was performed solely at, and funded solely by, the Max Planck Society.