Jitendra Malik
Computer Science Division
University of California at Berkeley


I shall argue that early and intermediate-level visual processing should be modeled as a three-stage process. The first stage is a measurement stage carried out with spatiotemporal receptive fields tuned to orientation, spatial frequency, opponent color, and short-range motion. The second stage is a grouping stage resulting in the formation of regions of coherent brightness, color, and texture; call these 'proto-surfaces'. The third stage results in the formation of surfaces/objects with attached properties such as lightness, object motion, occlusion relationships (figure-ground), depth, and slant-tilt. It is based on the combined operation of Gestalt grouping factors and shape cues, and can be partially influenced by knowledge of familiar configurations.
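
As a rough illustration of the measurement stage, the sketch below builds one orientation- and frequency-tuned linear filter and applies it to an image. The Gabor form, the function names (gabor_kernel, filter_response) and all parameter values are assumptions made for this example; the abstract does not specify the receptive-field model, and color and motion tuning are omitted.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Even-symmetric Gabor kernel (an assumed stand-in for a receptive field).

    theta is the direction along which the sinusoidal carrier varies.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)      # coordinate along the carrier
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()                   # remove the DC response

def filter_response(image, kernel):
    """Circular convolution of image with kernel, computed via the FFT."""
    padded = np.zeros_like(image, dtype=float)
    k = kernel.shape[0]
    padded[:k, :k] = kernel
    # Shift so the kernel centre sits at the origin of the periodic grid.
    padded = np.roll(padded, (-(k // 2), -(k // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

# A synthetic grating whose intensity varies along x with period 8 pixels.
cols = np.arange(32)
grating = np.tile(np.cos(2.0 * np.pi * cols / 8.0), (32, 1))

matched = filter_response(grating, gabor_kernel(15, 8.0, 0.0, 3.0))
orthogonal = filter_response(grating, gabor_kernel(15, 8.0, np.pi / 2, 3.0))
energy_matched = float(np.sum(matched ** 2))
energy_orthogonal = float(np.sum(orthogonal ** 2))
print(energy_matched, energy_orthogonal)
```

The filter whose carrier matches the grating should respond far more strongly than the one rotated by 90 degrees, which is the sense in which such a filter is "tuned" to orientation and spatial frequency.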

The first stage can be, and has been, modeled by many researchers using the tools of linear systems analysis. We offer a novel approach to the second stage by modeling it as the process of finding a partition of the image into regions such that there is high similarity within a region and low similarity across regions. This is made precise as the 'Normalized cut' criterion, which can be optimized by solving an eigenvalue problem. The resulting eigenvectors provide a hierarchical partitioning of the image into regions ordered according to salience. Brightness, color, texture, motion similarity, proximity, and good continuation can all be encoded into this framework. We show results on complex images of natural scenes which demonstrate the significant superiority of this technique over classical approaches such as those based on edge detection or Markov random fields (MRFs). Phenomena such as subjective contours emerge as side consequences.
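
A minimal sketch of one bipartition step may make the eigenvalue formulation concrete. It assumes the generalized eigensystem (D - W) y = lambda D y usually associated with the normalized cut, where W is a symmetric affinity matrix and D its diagonal degree matrix; the abstract itself does not spell this out. The Gaussian affinity, the threshold at zero, and the names gaussian_affinity and normalized_cut_bipartition are illustrative assumptions, and the hierarchical ordering by salience is not reproduced here.

```python
import numpy as np

def gaussian_affinity(features, sigma):
    """Pairwise similarity exp(-||f_i - f_j||^2 / (2 sigma^2)) (assumed form)."""
    diff = features[:, None, :] - features[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def normalized_cut_bipartition(W):
    """Split the nodes of a graph in two using the second generalized eigenvector.

    Solves (D - W) y = lambda D y through the equivalent symmetric problem
    D^{-1/2} (D - W) D^{-1/2} z = lambda z, with y = D^{-1/2} z.
    """
    d = W.sum(axis=1)                       # node degrees (must be positive)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]            # second-smallest eigenvector
    return y > 0, eigvals                     # thresholding at 0 is an assumption

# Toy "pixels": six one-dimensional brightness values forming two groups.
features = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
W = gaussian_affinity(features, sigma=1.0)
labels, eigvals = normalized_cut_bipartition(W)
print(labels)
```

In an image setting, W would combine the cues listed above (brightness, color, texture, proximity, and so on) for pairs of pixels, and the same step could be reapplied to each resulting region.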

Our work on the third stage is preliminary; I shall argue on computational and psychophysical grounds that modular shape processing should be abandoned, and that grouping driven by ecological statistics is as crucial as shape cues driven by ecological optics.

This is joint work with Jianbo Shi, Serge Belongie and Thomas Leung.

Eyal Amir
Last modified: Wed Apr 1 01:00:34 PST 1998