Flow2Flow: Audio-visual cross-modality generation for talking face videos with rhythmic head

Z Wang, W He, Y Wei, Y Luo - Displays, 2023 - Elsevier

Audio-visual cross-modality generation refers to the generation of audio or visual content
based on input from another modality. One of the key tasks in this field is the generation of
realistic talking facial videos from audio and head pose information, which has significant
applications in human–computer interaction, virtual reality, and video production. However,
previous work has limitations such as the inability to generate natural head poses or interact
with audio, which compromises the realism and expressive power of the generated videos …
