ALGORITHM

Image Segmentation

  1. Blob tracking

    If we wish to track the entire body as a single blob, we only need to segment the diver's body from each frame. For this initial segmentation of the diver from the background, we plan to use our skin-color segmentation code from Homework 3, since it is readily available and segmentation is not the main focus of our project. In addition, the divers will be wearing swimsuits, so most of their bodies will be skin-colored (especially on a male diver).

    We will therefore use maximum likelihood estimation and expectation maximization to segment divers from the background. First, we will train our system to recognize skin color. Then we will feed our footage of divers into the system, which will give us, for every pixel of every frame, the probability that the pixel corresponds to skin. From these probabilities, we compute a silhouette of the figure.
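
    As a rough illustration of the per-pixel skin probability, here is a minimal sketch. It assumes a single Gaussian color model per class (skin and background) fit by maximum likelihood. The function names, the background model, and the prior are our own placeholders, and the EM refinement step is left out.

```python
import numpy as np

def fit_gaussian(pixels):
    """Maximum-likelihood mean and covariance of an (N, 3) array of color samples."""
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    cov = centered.T @ centered / len(pixels)  # the ML estimate divides by N
    return mean, cov

def gaussian_likelihood(frame, mean, cov):
    """Evaluate the Gaussian density at every pixel of an (H, W, 3) frame."""
    diff = frame.reshape(-1, 3) - mean
    inv = np.linalg.inv(cov)
    mahal = np.einsum("ni,ij,nj->n", diff, inv, diff)  # squared Mahalanobis distance
    norm = np.sqrt((2 * np.pi) ** 3 * np.linalg.det(cov))
    return (np.exp(-0.5 * mahal) / norm).reshape(frame.shape[:2])

def skin_probability(frame, skin_model, bg_model, prior_skin=0.5):
    """Posterior probability of skin at each pixel via Bayes' rule.

    skin_model and bg_model are (mean, cov) pairs; prior_skin is an assumed prior.
    """
    p_skin = gaussian_likelihood(frame, *skin_model) * prior_skin
    p_bg = gaussian_likelihood(frame, *bg_model) * (1 - prior_skin)
    return p_skin / (p_skin + p_bg + 1e-300)  # tiny constant avoids 0/0
```

    Thresholding the returned probability map (say at 0.5) would give one possible silhouette for each frame.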

  2. Cardboard people

    This algorithm assumes that each part of the body can be approximated by a planar patch, so we now need to segment each part of the body separately. To do this, we define a 2D model of the diver on the first frame, in which a rectangle encloses each part of the diver's body and each rectangle shares at least one edge with another rectangle.

  3. Exponential Maps and Twists

    In this algorithm, we use a 3D model to approximate the shape of body segments. First, the 3D model is projected onto the image plane of the first frame in the pose and angular configuration specified by the viewer.

    Then we use Expectation-Maximization to refine our segmentation. For each body segment in the model, we define a matte consisting of zeros and ones that corresponds to the projection of that portion of the model onto the image plane. We compute an estimate of how the body moves to the next frame, and then, for each pixel and for each body segment (and the background), we compute the probability that the pixel complies with the motion estimate for that segment. The mattes are then refined by normalizing the probabilities at each pixel location so that they sum to one.
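
    The normalization step can be sketched as follows. This is only an illustration: it assumes each layer's compliance with the motion estimate is scored by a Gaussian on a per-pixel residual, and `residuals` and `sigma` are hypothetical names of our own.

```python
import numpy as np

def refine_mattes(residuals, sigma=1.0):
    """Turn motion residuals into soft mattes that sum to one per pixel.

    residuals: (K, H, W) array of motion-prediction errors, one layer per
               body segment plus one for the background (assumed input).
    sigma:     assumed noise scale of the Gaussian error model.
    """
    log_lik = -0.5 * (residuals / sigma) ** 2
    # Subtract the per-pixel maximum so the best layer has weight exp(0) = 1;
    # this keeps the normalization well defined even for large residuals.
    log_lik = log_lik - log_lik.max(axis=0, keepdims=True)
    weights = np.exp(log_lik)
    return weights / weights.sum(axis=0, keepdims=True)
```

    The layer whose motion estimate best explains a pixel receives the largest share of that pixel's matte.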

Tracking

  1. Blob tracking

    In a simple blob tracker, we track the silhouette of the diver by superimposing silhouettes taken at each frame onto a single image. These silhouettes are time-stamped by different intensities. We start with a black image (i.e., intensity everywhere equal to 0). The silhouette from the current frame is given the highest intensity and is superimposed onto the black image. The silhouette immediately previous to that is then given a slightly lower intensity and superimposed onto the image. We continue in this manner until we have an image with several silhouettes. The gradient of this image defines the global motion vector.
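
    A minimal sketch of this superposition, assuming the silhouettes arrive as boolean masks ordered oldest first. The function names are ours, and averaging the gradient over interior pixels is just one simple way to get a single global vector.

```python
import numpy as np

def motion_history(silhouettes):
    """Superimpose boolean (H, W) silhouettes, oldest first, onto a black image.

    The most recent silhouette gets intensity 1.0; older ones get lower values.
    """
    n = len(silhouettes)
    mhi = np.zeros(silhouettes[0].shape, dtype=float)
    for i, mask in enumerate(silhouettes):
        mhi[mask] = (i + 1) / n  # newer silhouettes overwrite older ones
    return mhi

def global_motion(mhi):
    """Average image gradient (dx, dy) over pixels well inside the marked region."""
    gy, gx = np.gradient(mhi)
    fg = mhi > 0
    # Ignore pixels that touch the black background: the jump to zero there
    # reflects the silhouette outline, not the direction of motion.
    padded = np.pad(fg, 1, constant_values=True)
    interior = (fg & padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    valid = interior & ((gx != 0) | (gy != 0))
    if not valid.any():
        return np.zeros(2)
    return np.array([gx[valid].mean(), gy[valid].mean()])
```

    Because intensity increases with recency, the gradient points from older silhouettes toward newer ones, i.e. along the direction of motion.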

    This technique can be refined to track motion of regions of interest. We find a boundary pixel on the most recent silhouette and travel along the boundary looking outside for a recent unmarked silhouette. If we find one, we mark it by performing a floodfill. The algorithm produces segmented motion masks which describe the motion of a particular portion of the footage.

  2. Cardboard people

    We use the articulated motion between two frames to predict the location of each body segment patch in the next frame. The location of each patch is then updated by applying its planar motion to it.

    The articulated motion is computed by minimizing the total energy of the motions of all patches simultaneously, with an added articulation constraint that each patch must retain its original connectivity to its neighboring patches.
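
    One way such an objective could be written, as a sketch only: all symbols here are our own assumptions. $\mathbf{a}_s$ denotes the planar motion parameters of patch $s$, $E_s$ its motion energy, $\mathbf{u}(\mathbf{x}; \mathbf{a}_s)$ the displacement that motion induces at point $\mathbf{x}$, $A_{st}$ the set of points shared by connected patches $s$ and $t$, and $\lambda$ a weight on the articulation term.

```latex
E(\mathbf{a}_1, \dots, \mathbf{a}_n)
  = \sum_{s} E_s(\mathbf{a}_s)
  + \lambda \sum_{(s,t)\ \text{connected}} \; \sum_{\mathbf{x} \in A_{st}}
      \left\| \mathbf{u}(\mathbf{x}; \mathbf{a}_s) - \mathbf{u}(\mathbf{x}; \mathbf{a}_t) \right\|^2
```

    The second term is zero exactly when neighboring patches move their shared points identically, which is what keeps the patches attached to one another.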

  3. Exponential Maps and Twists

    We compute the spatiotemporal gradients between frames and then estimate the motion using the mattes derived in the segmentation portion plus an equation based on the properties of kinematic chains given by [Bregler, Malik].
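
    For reference, the building block behind this formulation is the exponential map of a twist, which turns a joint axis and an angle into a rigid transform. The sketch below uses the standard closed form from robotics for a unit rotation axis and composes joints as a product of exponentials. The function names are ours, and this is not necessarily the exact equation we will take from [Bregler, Malik].

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix such that hat(w) @ x equals np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def twist_exp(v, w, theta):
    """4x4 rigid transform exp(xi_hat * theta) for the twist xi = (v, w).

    Assumes w is either a unit rotation axis or zero (pure translation along v).
    """
    v = np.asarray(v, dtype=float)
    w = np.asarray(w, dtype=float)
    g = np.eye(4)
    if np.allclose(w, 0):
        g[:3, 3] = v * theta
        return g
    W = hat(w)
    # Rodrigues' formula for the rotation part.
    R = np.eye(3) + np.sin(theta) * W + (1 - np.cos(theta)) * (W @ W)
    g[:3, :3] = R
    g[:3, 3] = (np.eye(3) - R) @ np.cross(w, v) + np.outer(w, w) @ v * theta
    return g

def chain_pose(twists, thetas):
    """Pose at the end of a kinematic chain: the product of each joint's exponential."""
    g = np.eye(4)
    for (v, w), theta in zip(twists, thetas):
        g = g @ twist_exp(v, w, theta)
    return g
```

    For a revolute joint with unit axis w passing through a point q, the twist uses v = -w x q. For example, `twist_exp([0, -1, 0], [0, 0, 1], np.pi / 2)` rotates by 90 degrees about a vertical axis through (1, 0, 0).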

Disparity Calculation

    Now we get to the heart of the matter: given two divers, how do we describe how synchronized their motion is? To be honest, we're still unsure as to how we're going to do this part. Given a single global orientation vector, one obvious way of comparing synchronization might be to compare their semi-parabolic trajectories by deriving a mapping from the first curve to the second curve and comparing that mapping to the identity matrix.
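
    To make the trajectory-mapping idea concrete, here is one possible sketch. It assumes both trajectories are sampled at the same frames (so points correspond one to one), models the mapping as a least-squares affine transform, and scores it by its Frobenius distance from the identity. These choices and the function names are placeholders of our own, not a settled design.

```python
import numpy as np

def fit_affine(curve_a, curve_b):
    """Least-squares 2x3 affine map taking the (N, 2) points of curve_a onto curve_b."""
    ones = np.ones((len(curve_a), 1))
    A = np.hstack([curve_a, ones])                    # (N, 3) homogeneous points
    M, *_ = np.linalg.lstsq(A, curve_b, rcond=None)   # (3, 2) solution
    return M.T                                        # (2, 3) affine matrix

def desync_score(curve_a, curve_b):
    """Distance of the fitted map from the identity; 0 means identical trajectories."""
    identity = np.hstack([np.eye(2), np.zeros((2, 1))])
    return np.linalg.norm(fit_affine(curve_a, curve_b) - identity)
```

    Note that this score mixes scale and rotation terms with translation terms, so the two would probably need to be weighted or normalized (for example by diver height in pixels) before scores could be compared across videos.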

WORST-CASE/BEST-CASE PLANS