## Model-based Face Tracking

It's happened to all of us. You've wound up in a hole-in-the-wall restaurant you already regret entering, and you're worried about creepy-crawlies. Then, there! On the floor! A really big, scary bug... Oh wait, it's just a piece of fuzz. Model-based computer vision takes computational inspiration from this story. (No, we're not going to see bugs everywhere!) A computer system can use what it expects to see to help determine what it does see. In particular, using a model which describes the objects in a scene makes scene understanding easier. Of course, using a model in the wrong situation can produce the wrong interpretation. (OK, so we might see a few bugs.)

Now consider the problem of tracking the motion of an observed human face. We could use a general-purpose object tracker for this application (one which simply sees the face as a deforming surface), but doing so would ignore everything we know about faces, especially their highly constrained appearance and motion.

### Face models and model-based estimation

Instead, a model-based approach to this problem uses a model describing the appearance, shape, and motion of faces to aid estimation. This model has a number of parameters (basically, "knobs" of control), some of which describe the shape of the resulting face, while others describe its motion.

In the picture below, the default model (top, center) can be made to look like specific individuals by changing shape parameters (the 4 faces on the right). The model can also display facial motions (the 4 faces on the left showing eyebrow frowns, raises, a smile, and an open mouth) by changing motion parameters. And of course, we can simultaneously change both shape and motion parameters (bottom, center).

A model of face shape and motion

Now you might be asking yourself: how can this model be detailed enough to represent any person making any expression? The answer is: it can't. But it can represent any of these faces to an acceptable degree of accuracy. The benefit of this simplifying assumption is that a fairly small set of parameters (about 100) suffices to describe a face, which results in a more efficient and more robust system.
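To make the "knobs" idea concrete, here is a minimal sketch (not the actual face model, whose details are in the publications below) of a parameterized model in which each shape or motion parameter scales a displacement of the geometry. All names and dimensions here are hypothetical, and the tiny random "basis" stands in for the hand-designed deformations of a real face model.

```python
import numpy as np

# Hypothetical dimensions: a tiny "face" of 5 vertices in 3D, with
# 2 shape parameters and 2 motion parameters (the real model uses
# on the order of 100 parameters in total).
n_vertices = 5
base_face = np.zeros((n_vertices, 3))  # neutral face geometry

# Each parameter is a "knob": a displacement field scaled by its value.
shape_basis = np.random.default_rng(0).normal(size=(2, n_vertices, 3))
motion_basis = np.random.default_rng(1).normal(size=(2, n_vertices, 3))

def apply_parameters(shape_params, motion_params):
    """Deform the base face by a weighted sum of basis displacements."""
    face = base_face.copy()
    face += np.tensordot(shape_params, shape_basis, axes=1)
    face += np.tensordot(motion_params, motion_basis, axes=1)
    return face

# Turning shape knobs changes *who* the face looks like; turning
# motion knobs changes *what expression* it makes.
deformed = apply_parameters(np.array([0.8, -0.2]), np.array([0.0, 1.0]))
```

Because the whole face is driven by a handful of numbers, estimation only has to recover those numbers rather than the position of every surface point.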

Estimating parameters from images using a model calls for new processing methods (which are still related to their model-free counterparts). For example, computing the motion of an object using optical flow without a model produces a field of arrows. When a model is used, a set of "parameter velocities" is extracted instead.

### Results

Below are some of our face tracking results (this work is joint with Dimitris Metaxas). In each experiment, the tracked subject moves their head and makes some facial expressions (currently, the system is not real-time). The initial placement of the model on the face in each sequence is done by hand (by identifying a small set of feature points).

In each case, the estimated model (the black wireframe) is superimposed on the captured images. In each of the following sequences, the subjects make complex facial and head motions, which are successfully tracked by our framework.

(I would like to thank Yaser Yacoob and the Center for Automation Research at the University of Maryland at College Park for providing the second and third image sequences).

### Validation

We have also performed validation studies in which we mark a subject with dots (top), extract the positions of these markers (middle: marker locations are shown as white dots), and compare them with the model-predicted locations of these markers. Our results show that our method maintains track to within about half a centimeter (in the image plane) over long sequences (many hundreds of frames) without drifting or losing track.

Results from validation study

### Why track a face?

Well, being able to track a face from images contributes toward the ability to monitor a user's attention and reactions automatically and without intrusion, and has obvious benefits in human-machine interaction. This is the subject of interaction research underway at THE VILLAGE.

### Publications

A number of on-line publications describing this work are available (in chronological order):

• Douglas DeCarlo and Dimitris Metaxas.
The Integration of Optical Flow and Deformable Models with Applications to Human Face Shape and Motion Estimation.
In Proceedings CVPR '96, pp. 231-238, 1996.
[PDF (559K)]

This paper describes a model-based approach (using deformable models) to face tracking where a 3D face model is used to estimate the shape and motion of a face in a sequence of images using a combination of optical flow information (as a constraint), and edge information.

• Douglas DeCarlo and Dimitris Metaxas.
Optical Flow Constraints on Deformable Models with Applications to Face Tracking.
Univ. of Pennsylvania CIS technical report MS-CIS-97-23, 1997.

(A revised version of this paper appears in IJCV, combined with the CVPR '99 paper, and is listed below).

• Douglas DeCarlo and Dimitris Metaxas.
Deformable Model-Based Shape and Motion Analysis from Images using Motion Residual Error.
In Proceedings ICCV '98, pp. 113-119, 1998.
[PDF (373K)]

This paper presents a method for updating the shape of a model based on an analysis of the least-squares residuals from the model-based optical flow computation. This analysis allows the shape of a tracked face to be computed from much less data (fewer frames) than the method above, which uses only edge information.

(A longer version of this paper has been submitted to PAMI; ask me for a copy if you're interested.)

• Douglas DeCarlo and Dimitris Metaxas.
Combining Information using Hard Constraints.
In Proceedings CVPR '99, pp. 132-138, 1999.
[PDF (443K)]

The framework described in the CVPR '96 paper worked as well as it did because the edge-to-model alignment optimization was constrained by the model-based optical flow solution. This constraint projected away part of the search space, making the edge-model alignment easier (faster, and more likely to converge to the correct answer). Experiments are presented which suggest that it was indeed the constraint that improved performance.
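The general idea of such a hard constraint can be sketched as follows (this is the standard null-space technique for linear equality constraints, not the paper's exact formulation): a linear constraint `A q = b` derived from the flow solution restricts the search to an affine subspace, so the edge-fitting problem is solved over a smaller variable. All matrices here are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem: fit 6 parameters q by minimizing an "edge
# energy" ||E q - e||^2, subject to hard linear constraints A q = b
# supplied by the optical flow solution.
n = 6
E, e = rng.normal(size=(20, n)), rng.normal(size=20)  # stand-in edge term
A, b = rng.normal(size=(2, n)), rng.normal(size=2)    # stand-in flow constraint

# Particular solution of A q = b, plus a basis N for the null space
# of A: every q = q0 + N z then satisfies the constraint exactly.
q0, *_ = np.linalg.lstsq(A, b, rcond=None)
_, _, Vt = np.linalg.svd(A)
N = Vt[A.shape[0]:].T  # (6, 4) null-space basis

# The search is now over the smaller variable z: the constraint has
# "projected away" part of the search space, leaving an easier
# unconstrained least-squares problem.
z, *_ = np.linalg.lstsq(E @ N, e - E @ q0, rcond=None)
q = q0 + N @ z
```

Because the reduced problem has fewer unknowns, it is both cheaper to solve and less prone to landing in a wrong solution, which matches the intuition given above for why the constrained framework performed better.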

• Douglas DeCarlo and Dimitris Metaxas.
Optical Flow Constraints on Deformable Models with Applications to Face Tracking.
In IJCV, July 2000, 38(2), pp. 99-127. [PDF (695K)]

This paper provides a more detailed look at the framework from the CVPR '96 paper. It also describes how a Kalman filter is incorporated, and how it is used to combine the optical flow and edge information efficiently (as described in the CVPR '99 paper). It also contains a number of validation experiments.
