The animation of facial speech and expressions has received increasing
attention in recent years. Most current
research focuses on techniques for capturing, synthesizing, and retargeting
facial expressions. Little attention has been paid to the problem of
controlling and modifying the expression itself. If an actor
is originally recorded in a happy mood, all output animations are
generated with that same happy expression. Ultimately, lifelike and
realistic facial animation needs to cover the entire range of human
expression and mood change. Of course, one could capture every
facial motion in every possible mood and then apply existing
techniques, but this would require a prohibitively large amount of data.
Instead, we present techniques that separate video data into expressive
features and underlying content. This allows, for example, a sequence
originally recorded with a happy expression to be modified so that
the speaker appears to be speaking with an angry or neutral expression.
Although the expression has been modified, the new sequences maintain
the same visual speech content as the original sequence. The facial
expression space that allows these transformations is learned with
the aid of a factorization model.
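To make the factorization concrete, the sketch below illustrates one way such a separation can be learned: an asymmetric bilinear ("style and content") model fitted with a truncated SVD, in the spirit of Tenenbaum and Freeman's style/content separation. The dimensions, the synthetic data, and the synthesize helper are all hypothetical stand-ins; the paper's actual model and facial features may differ.

```python
import numpy as np

# Hypothetical setup: S expressions (styles), C speech elements (content),
# each observation a d-dimensional facial feature vector (assumption).
S, C, d = 3, 10, 50
rng = np.random.default_rng(0)

# Y[s, c] holds the mean feature vector for expression s and content c.
Y = rng.standard_normal((S, C, d))

# Stack style-major: rows group (style, feature dim), columns are content.
Y_stacked = Y.transpose(0, 2, 1).reshape(S * d, C)

# Asymmetric bilinear factorization via truncated SVD:
# Y_stacked ~= A @ B, where A holds per-style basis vectors and B the
# style-independent content codes.
J = 4  # model dimensionality (assumption)
U, sv, Vt = np.linalg.svd(Y_stacked, full_matrices=False)
A = U[:, :J] * sv[:J]   # (S*d, J): style-specific maps, stacked by style
B = Vt[:J, :]           # (J, C):  shared content representation

def synthesize(s_new, c):
    """Render content c under a different expression s_new by combining
    the new style's map with the shared content code (illustrative only)."""
    A_s = A.reshape(S, d, J)[s_new]   # style map for expression s_new
    return A_s @ B[:, c]              # d-dimensional synthesized feature

happy_to_angry = synthesize(s_new=1, c=0)
```

Because the content codes in B are shared across styles, swapping the style map changes the apparent expression while leaving the visual speech content intact, which is the transformation described above.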