Yuan Gong, Yong Zhang*, Xiaodong Cun, Fei Yin, Yanbo Fan, Xuan Wang,
Baoyuan Wu, Yujiu Yang*
(* Corresponding Authors)
Teaser video: `teaser.mp4`
We target cross-domain face reenactment, i.e., driving a cartoon image with the video of a real person and vice versa. Recently, many works have focused on one-shot talking face generation that drives a portrait with a real video, i.e., within-domain reenactment. Directly applying those methods to cross-domain animation causes inaccurate expression transfer, blurring, and even obvious artifacts due to the domain shift between cartoon and real faces. Only a few works attempt to address cross-domain face reenactment. The most related work, AnimeCeleb, requires constructing a dataset of pose-vector and cartoon-image pairs by animating 3D characters, which makes it inapplicable when no paired data is available. In this paper, we propose a novel method for cross-domain reenactment without paired data. Specifically, we propose a transformer-based framework that aligns motions from different domains into a common latent space, where motion transfer is conducted via latent code addition. Two domain-specific motion encoders and two learnable motion base memories capture domain properties. A source query transformer and a driving one project domain-specific motion into the canonical space, and the edited motion is projected back to the source domain with a transformer. Moreover, since no paired data is provided, we propose a novel cross-domain training scheme that uses data from both domains with a designed analogy constraint. In addition, we contribute a Disney-style cartoon dataset. Extensive evaluations demonstrate the superiority of our method over competing methods.
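The core idea of the framework — encoding motions from each domain into a common latent space and transferring motion by latent code addition — can be sketched as follows. This is a minimal NumPy illustration, not the paper's transformer implementation: the linear encoder stand-ins, the neutral-reference offset, and all names and dimensions here are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # dimension of the common motion latent space (illustrative)

# Stand-ins for the two learned domain-specific motion encoders:
# fixed random linear projections from a 128-dim feature to the latent space.
W_real = rng.standard_normal((128, DIM)) * 0.1
W_toon = rng.standard_normal((128, DIM)) * 0.1

def encode(features, W):
    """Project domain-specific features into the common motion space."""
    return features @ W

def transfer_motion(src_code, drv_code, drv_neutral_code):
    """Motion transfer via latent code addition: add the driving frame's
    motion offset (relative to a neutral reference) to the source code."""
    return src_code + (drv_code - drv_neutral_code)

# Toy run: a cartoon source image driven by three real-domain video frames.
src = encode(rng.standard_normal(128), W_toon)
drv_neutral = encode(rng.standard_normal(128), W_real)
drv_frames = [encode(rng.standard_normal(128), W_real) for _ in range(3)]
edited = [transfer_motion(src, f, drv_neutral) for f in drv_frames]
```

In the actual method, the projections into and out of the common space are performed by the query transformers rather than linear maps, and the edited codes are decoded back into images in the source domain.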
You can manually download our pre-trained models and put them in `./checkpoints`.
| Model | Description |
|---|---|
| checkpoints/in-domain440000.pth | Pre-trained ToonTalker checkpoint for in-domain reenactment. |
| checkpoints/cross-domain.pth | Pre-trained ToonTalker checkpoint for cross-domain reenactment. |
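For reference, the expected layout after downloading (the directory name comes from the instructions above; the listing is only a sanity check):

```shell
mkdir -p checkpoints
# After downloading, the directory should contain:
#   checkpoints/in-domain440000.pth
#   checkpoints/cross-domain.pth
ls checkpoints
```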
- In-Domain Reenactment with a single image and a video:

```shell
python run_demo_indomain.py \
  --source_path source.jpg \
  --driving_path input.mp4 \
  --output_dir output.mp4
```
- Cross-Domain Reenactment with a single real-domain image and a cartoon-domain video:

```shell
python run_demo_crossdomain.py \
  --type c2r \
  --source_path source.jpg \
  --driving_path input.mp4 \
  --output_dir output.mp4
```
- Cross-Domain Reenactment with a single cartoon-domain image and a real-domain video:

```shell
python run_demo_crossdomain.py \
  --type r2c \
  --source_path source.jpg \
  --driving_path input.mp4 \
  --output_dir output.mp4
```
Demo videos: `R2C1.mp4`, `R2C2.mp4`, `C2R_1.mp4`, `C2R_2.mp4` (cross-domain), `indomain1.mp4`, `indomain2.mp4` (in-domain).
@misc{gong2023toontalker,
title={ToonTalker: Cross-Domain Face Reenactment},
author={Yuan Gong and Yong Zhang and Xiaodong Cun and Fei Yin and Yanbo Fan and Xuan Wang and Baoyuan Wu and Yujiu Yang},
year={2023},
eprint={2308.12866},
archivePrefix={arXiv},
primaryClass={cs.CV}
}