The animation of facial speech and expressions has received increasing
attention in recent years. Most current
research focuses on techniques for capturing, synthesizing, and retargeting
facial expressions. Little attention has been paid to the problem of
controlling and modifying the expression itself. If an actor
is originally recorded in a happy mood, all output animations are
generated with that same happy expression. Ultimately, lifelike and
realistic facial animation needs to cover the entire range of human
expression and mood change. Of course, one could capture every
facial motion in every possible mood and then apply existing
techniques, but this would require a prohibitively large amount of data.
Instead, we present techniques that separate video data into expressive
features and underlying content. This allows, for example, a sequence
originally recorded with a happy expression to be modified so that
the speaker appears to be speaking with an angry or neutral expression.
Although the expression has been modified, the new sequences maintain
the same visual speech content as the original sequence. The facial
expression space that allows these transformations is learned with
the aid of a factorization model.
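To make the factorization concrete, the sketch below illustrates one way such a separation can be learned: an asymmetric bilinear ("style and content") model fitted with a truncated SVD, in the spirit of Tenenbaum and Freeman's style/content separation. The dimensions, the synthetic data, and the synthesize helper are all hypothetical stand-ins; the paper's actual model and facial features may differ.

```python
import numpy as np

# Hypothetical setup: S expressions (styles), C speech elements (content),
# each observation a d-dimensional facial feature vector (assumption).
S, C, d = 3, 10, 50
rng = np.random.default_rng(0)

# Y[s, c] holds the mean feature vector for expression s and content c.
Y = rng.standard_normal((S, C, d))

# Stack style-major: rows group (style, feature dim), columns are content.
Y_stacked = Y.transpose(0, 2, 1).reshape(S * d, C)

# Asymmetric bilinear factorization via truncated SVD:
# Y_stacked ~= A @ B, where A holds per-style basis vectors and B the
# style-independent content codes.
J = 4  # model dimensionality (assumption)
U, sv, Vt = np.linalg.svd(Y_stacked, full_matrices=False)
A = U[:, :J] * sv[:J]   # (S*d, J): style-specific maps, stacked by style
B = Vt[:J, :]           # (J, C):  shared content representation

def synthesize(s_new, c):
    """Render content c under a different expression s_new by combining
    the new style's map with the shared content code (illustrative only)."""
    A_s = A.reshape(S, d, J)[s_new]   # style map for expression s_new
    return A_s @ B[:, c]              # d-dimensional synthesized feature

happy_to_angry = synthesize(s_new=1, c=0)
```

Because the content codes in B are shared across styles, swapping the style map changes the apparent expression while leaving the visual speech content intact, which is the transformation described above.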