Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
(CS 528)

Shape-From-Silhouette Across Time

Simon Baker
Research Scientist
The Robotics Institute
Carnegie Mellon University
Monday, Nov. 17, 2003, 4:15PM
TCSeq 200


Shape-From-Silhouette (SFS) is a well-studied and commonly used method of reconstructing the 3D shape of an object from multiple cameras. First, the silhouette of the object is extracted in each input image. Then, the shape of the object is approximated by intersecting the volumes created by projecting the silhouettes into the scene using the known camera geometry. The result of SFS, known as the Visual Hull, is an upper bound on the shape of the object. Although SFS has a number of advantageous properties, its main limitation is that the shape estimate can be very coarse when only a small number of cameras (10-20) are used.
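The intersection step described above can be illustrated with a minimal voxel-carving sketch. This is not the speaker's implementation: the three axis-aligned orthographic cameras, the voxel grid, the spherical test object, and all function names below are assumptions chosen only to keep the example self-contained. It keeps a voxel only if its projection falls inside every silhouette, and the result contains the true object, which is the sense in which the Visual Hull is an upper bound.

```python
from itertools import product


def make_grid(n):
    """Voxel centres of an n x n x n grid covering the cube [-1, 1]^3."""
    step = 2.0 / n
    coords = [-1.0 + (i + 0.5) * step for i in range(n)]
    return list(product(coords, repeat=3))


# Hypothetical orthographic cameras: each one simply drops one coordinate.
CAMERAS = {
    "x": lambda p: (p[1], p[2]),
    "y": lambda p: (p[0], p[2]),
    "z": lambda p: (p[0], p[1]),
}


def disk_silhouette(radius):
    """Silhouette of a centred sphere under an orthographic view: a disk."""
    return lambda uv: uv[0] ** 2 + uv[1] ** 2 <= radius ** 2


def visual_hull(voxels, cameras, silhouettes):
    """Keep the voxels whose projection lies inside every camera's silhouette."""
    return [
        p for p in voxels
        if all(silhouettes[name](project(p)) for name, project in cameras.items())
    ]


def in_sphere(p, radius):
    """Ground-truth membership test for the assumed spherical object."""
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2


radius = 0.8
voxels = make_grid(20)
silhouettes = {name: disk_silhouette(radius) for name in CAMERAS}

hull = visual_hull(voxels, CAMERAS, silhouettes)
true_shape = [p for p in voxels if in_sphere(p, radius)]

print("voxels in true shape: ", len(true_shape))
print("voxels in visual hull:", len(hull))
```

With these three views the hull is the intersection of three cylinders, which is strictly larger than the sphere. Passing fewer cameras to `visual_hull` gives an even looser estimate, which mirrors the coarseness problem noted above.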

In this talk I will present Shape-From-Silhouette Across Time (SFS-AT), a way of combining multiple silhouettes captured from multiple cameras as the object moves in time to obtain an improved estimate of the Visual Hull. I will first show that the problem is inherently ambiguous given only the silhouette information. I will then show how the geometric silhouette information can be combined in a natural way with color information to yield two SFS-AT algorithms: (1) one for a single rigidly moving object, and (2) one for an articulated object, or a collection of rigidly moving objects.

I will proceed to show how these two algorithms can be combined to build a system to estimate a 3D kinematic model of a human consisting of: (1) 3D shape, (2) 3D joint locations, and (3) a segmentation of the model into the various body parts. I will also describe an extension of the articulated SFS-AT algorithm to track the motion of the human in a novel set of video sequences using the 3D kinematic model, thereby creating a marker-less motion capture system. Finally, I will show how these two systems can be combined to perform marker-less motion transfer.

This research is joint work with German Cheung and Takeo Kanade.

About the Speaker

Simon Baker is a Research Scientist in the Robotics Institute at Carnegie Mellon University, where he conducts research in Computer Vision. Before joining the Robotics Institute in September 1998 as a Postdoc, he was a Graduate Research Assistant at Columbia University, where he obtained his Ph.D. in the Department of Computer Science. He also spent a summer visiting the Vision Technology Group at Microsoft Research. He received a B.A. in Mathematics from Trinity College, Cambridge University in 1991, an M.Sc. in Computer Science from the University of Edinburgh in 1992, and an M.A. in Mathematics from Trinity College, Cambridge University in 1995. His current research interests include face analysis (recognition, tracking, model building, and resolution enhancement), 3D reconstruction and vision for graphics, vision theory, vision for automotive applications, and projector-camera systems. For more details of his research, see his webpage:

