[Figure: four panels (a)-(d)]
Using synthetic aperture photography and confocal imaging to see through partially occluded environments. (a) shows one view of a figurine partially obscured by a plant. Summing 16 different views produces an image (b) with a wide synthetic aperture, hence a shallow depth of field, blurring out the plant.

Alternatively, applying confocal imaging, by capturing a sequence of images under patterned illumination as described in the paper, produces (c), in which the plant has become dark. Combining these two techniques yields (d), in which the plant is both blurry and dark, effectively disappearing.
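The wide-synthetic-aperture image in (b) is conceptually just an average of the registered views: points on the focal plane project to the same pixel in every view and reinforce, while the occluding plant projects to different pixels in each view and blurs away. A minimal numpy sketch, with hypothetical pre-registered view arrays:

```python
import numpy as np

def synthetic_aperture(views):
    """Average a stack of views that have been registered (warped)
    to a common focal plane; occluders off that plane blur out.

    views: array of shape (n_views, H, W), already aligned so the
    focal plane is pixel-registered across views.
    """
    return np.mean(views, axis=0)

# 16 hypothetical pre-registered views, as in panel (b)
views = np.random.rand(16, 480, 640)
wide_aperture_image = synthetic_aperture(views)
```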

Using confocal imaging to enhance visibility in weakly scattering environments. An AT&T calling card is placed in a tank filled with dilute milk. By reflecting a video projector through an array of 16 mirrors, we create a virtual projector having a synthetic aperture 1 meter wide. We use this arrangement to illuminate the calling card with 16 converging beams (visible at the right side of this photograph). We scan these beams across the card, and for each beam position extract only those pixels inside a tile (bright rectangle at left) where the beams intersect the card. Assembling these tiles yields a composite image exhibiting less backscatter - hence better contrast - than if the scene were floodlit. The results are in the paper.
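The tile-extraction step described in this caption can be sketched as follows; the tile bounds here are hypothetical placeholders for whatever the projector-camera calibration actually yields:

```python
import numpy as np

def assemble_confocal(frames, tiles, shape):
    """frames[i] is the camera image captured under beam position i;
    tiles[i] = (y0, y1, x0, x1) bounds the region where the beams
    intersect the focal plane for that position. Keeping only those
    pixels rejects backscatter from the rest of the volume."""
    composite = np.zeros(shape)
    for img, (y0, y1, x0, x1) in zip(frames, tiles):
        composite[y0:y1, x0:x1] = img[y0:y1, x0:x1]
    return composite

# Two hypothetical beam positions covering the top and bottom halves
frames = [np.full((4, 4), 1.0), np.full((4, 4), 2.0)]
tiles = [(0, 2, 0, 4), (2, 4, 0, 4)]
composite = assemble_confocal(frames, tiles, (4, 4))
```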

Synthetic aperture confocal imaging

Marc Levoy

Stanford (CS)

Billy Chen

Stanford (CS)

Vaibhav Vaish

Stanford (CS)

Mark Horowitz

Stanford (EE)

Ian McDowall

Fakespace Labs

Mark Bolas

Fakespace Labs

Proc. SIGGRAPH 2004

Abstract:

Confocal microscopy is a family of imaging techniques that employ focused patterned illumination and synchronized imaging to create cross-sectional views of 3D biological specimens. In this paper, we adapt confocal imaging to large-scale scenes by replacing the optical apertures used in microscopy with arrays of real or virtual video projectors and cameras. Our prototype implementation uses a video projector, a camera, and an array of mirrors. Using this implementation, we explore confocal imaging of partially occluded environments, such as foliage, and weakly scattering environments, such as murky water. We demonstrate the ability to selectively image any plane in a partially occluded environment, and to see further through murky water than is otherwise possible. By thresholding the confocal images, we extract mattes that can be used to selectively illuminate any plane in the scene.

Confocal imaging versus separation of direct and global reflections in 3D scenes

written by Marc Levoy
October 13, 2006

Many people have commented on the similarity between the enhanced underwater visibility reported in our SIGGRAPH 2004 paper (linked to this web page) and the removal of volumetric scattering effects reported in Nayar et al.'s SIGGRAPH 2006 paper [1]. Even the discussions of illumination patterns are similar in the two papers. However, the techniques proposed in the two papers are different, as are their capabilities, and an analysis of these differences is instructive.

The techniques described in our paper are based on confocal microscopy. In particular, we describe two confocal imaging protocols. The first is based on illumination of the scene by a scanned sequence of focused spots [2], and the second on illumination of the scene by a sequence of focused random binary patterns [3] followed by capture of one image under full illumination. A third protocol that has been proposed in the microscopy literature is illumination of the scene by a sequence of three sinusoidal patterns, shifted with respect to one another by 1/3 of the pattern's wavelength [4,5]. In this protocol a confocal image is formed by simple additions and subtractions of these three images, without the necessity of extracting specific tiles.
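For concreteness, the third protocol's arithmetic has a standard closed form (Neil et al. [4]): the sectioned image is the square root of the summed squared differences of the three phase-shifted images, and their average recovers the conventional floodlit image. A numpy sketch:

```python
import numpy as np

def sectioned_image(i1, i2, i3):
    """Optical sectioning from three images taken under a sinusoidal
    pattern shifted by 0, 1/3, and 2/3 of its period (Neil et al. [4]).
    In-focus light, which sees the pattern sharply, survives the
    differences; out-of-focus light sees a washed-out pattern and cancels."""
    return np.sqrt((i1 - i2)**2 + (i2 - i3)**2 + (i3 - i1)**2)

def widefield_image(i1, i2, i3):
    # Averaging the three shifted patterns reconstructs uniform illumination.
    return (i1 + i2 + i3) / 3.0
```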

We now make a number of observations about these protocols - observations that we unfortunately overlooked when writing our SIGGRAPH paper:

  1. The lack of a tile extraction step in the third protocol means that, although the illumination source and camera are assumed to be focused at the same depth, no particular lateral alignment needs to be assumed between the camera's lines of sight and the projected patterns. This protocol is the basis for modern structured-light microscopy systems [6].

  2. Restricting ourselves to synthetic aperture confocal systems, in which focusing is accomplished by an array of video projectors and cameras rather than by a single optical aperture, and considering only the second protocol (random binary patterns): although a minimum of two video projectors is required to obtain confocal imaging when the camera is coaxial with one of the projectors (figure 3f in our paper), only one projector is required in the non-coaxial case (figure 3e). Indeed, in experiments conducted in a large water tank at the Woods Hole Oceanographic Institution after the paper was written (see below), we occasionally used only one projector - still obtaining an improvement in visibility.

  3. All three of these protocols perform optical sectioning, meaning that they remove light reflected directly to the camera by points off the focal plane. However, the third protocol (subtractions of shifted sinusoids) also performs descattering. In particular, it removes light reflected to the camera, whether by points on or off the focal plane, after multiple bounces. By contrast, the scanning protocol removes none of this multiply-scattered light, and the protocol based on random binary patterns removes some but not all of it. A full analysis of the descattering ability of these various protocols is beyond the scope of this technical note. Interestingly, this author has been unable to find such an analysis in the confocal imaging literature, where the assumption is usually made that the medium exhibits minimal multiple scattering.

Now let's put these pieces together. Observation #1 says that confocal imaging can be performed without calibration of the camera to the projector. Combining this with observation #2, which says that one video projector suffices, means that synthetic aperture confocal imaging can be applied to non-planar scenes, in which no fixed relationship exists between the camera's lines of sight and those of the projector. In those scenes the system will perform descattering (as defined in observation #3) but not optical sectioning. In other words, it can be used to remove global illumination effects from 3D scenes. Unfortunately, we did not realize this in 2004. Nayar et al. did realize it, although they approached the problem from the perspective of structured illumination rangefinding rather than confocal imaging, and they therefore expressed these observations differently than we would have. Applying these ideas, they employed one projector and a protocol of three shifted sinusoids to remove volumetric scattering effects from opaque non-planar objects immersed in a weakly scattering medium (like the kitchen sink example in figure 6e of their paper).
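Nayar et al.'s separation [1] reduces, for high-frequency patterns that light half the pixels, to a per-pixel max/min estimator: direct is approximately I_max - I_min and global is approximately 2*I_min. A sketch of that estimator, as we understand their paper:

```python
import numpy as np

def separate_direct_global(frames):
    """frames: stack (n, H, W) captured under shifted high-frequency
    binary patterns with half the pixels lit (Nayar et al. [1]).
    A directly lit point is bright in some frames and dark in others,
    while global (multiply scattered) light is nearly constant, so the
    per-pixel max and min separate the two components."""
    i_max = frames.max(axis=0)
    i_min = frames.min(axis=0)
    direct = i_max - i_min
    global_ = 2.0 * i_min
    return direct, global_

# One hypothetical pixel: direct = 0.6, global = 0.4, so a lit frame
# reads 0.6 + 0.4/2 = 0.8 and an unlit frame reads 0.4/2 = 0.2
frames = np.array([[[0.8]], [[0.2]]])
direct, global_ = separate_direct_global(frames)
```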

In summary, although both of these papers can be described in the language of confocal imaging, the different choices of illumination protocols made by the two papers led to very different capabilities. In particular, Nayar et al.'s protocol based on a single projector and shifted sinusoids performs descattering but no optical sectioning, whereas our protocol based on multiple projectors and scanning or random binary patterns performs optical sectioning but little or no descattering. If in our system we had used shifted sinusoids, we could have performed both optical sectioning and descattering.

Note added June 16, 2008: In a subsequent paper [7], we modified Nayar et al.'s technique to employ multiple video projectors and cameras. This effectively combines their 2006 technique [1] and our 2004 technique (this web page), thereby performing both optical sectioning and descattering of macroscopic scenes. Interested readers are referred to the paper for details.

References

[1] Nayar, S.K., Krishnan, G., Grossberg, M.D., Raskar, R., Fast Separation of Direct and Global Components of a Scene using High Frequency Illumination, Proc. SIGGRAPH 2006.

[2] Corle, T.R., Kino, G.S., Confocal Scanning Optical Microscopy and Related Imaging Systems, Academic Press, 1996.

[3] Wilson, T., Juskaitis, R., Neil, M., Kozubek, M., Confocal Microscopy by Aperture Correlation, Optics Letters, Vol. 21, No. 3, 1996.

[4] Neil, M.A.A., Juskaitis, R., Wilson, T., Method of obtaining optical sectioning by using structured light in a conventional microscope, Optics Letters, Vol. 22, No. 24, 1997.

[5] Wilson, T., Neil, M.A.A., Juskaitis, R., Real-time three-dimensional imaging of macroscopic structures, Journal of Microscopy, Vol. 191, Issue 2, August 1998, pp. 113-220.

[6] Mitic, J., Anhut, T., Serov, A., Lasser, T., Real-time optically sectioned wide-field microscopy employing structured light illumination and a CMOS detector, Three-Dimensional and Multidimensional Microscopy: Image Acquisition and Processing X, Proc. SPIE, Vol. 4964, January 2003.

[7] Fuchs, C., Heinz, M., Levoy, M., Seidel, H.-P., Lensch, H.P., Combining Confocal Imaging and Descattering, Eurographics Symposium on Rendering (EGSR) 2008.


Experiment in a large water tank at the Woods Hole Oceanographic Institution

In our SIGGRAPH paper, we applied synthetic aperture confocal imaging to two problems: seeing through partially occluded environments such as foliage and crowds, and seeing through turbid water. However, in the latter application we tested our techniques only in a 10-gallon tank, with illumination and viewing distances of 10-30 cm. At that scale, backscatter of projected light to the camera dominated over attenuation. In most practical applications of underwater imaging, e.g. remotely operated vehicles (ROVs), illumination and imaging distances are an order of magnitude larger, making attenuation more important. Moreover, at the turbidities we employed in the 10-gallon tank, multiple scattering was a significant factor, and as we point out in our paper, confocal imaging performs poorly in the presence of multiple scattering. Finally, as we explain in the note earlier on this web page about confocal imaging versus separation of direct and global reflections, one projector suffices to achieve confocal imaging if the projector is not coaxial with the camera - a point we failed to emphasize in our paper.

To address these problems, in the Summer of 2004 (and hence after we published our SIGGRAPH paper), we repeated our underwater confocal imaging experiment in a large water tank at the Woods Hole Oceanographic Institution (WHOI). In this experiment, we used from 1 to 5 projectors. The "beauty shot" reproduced above shows five projectors focusing a narrow beam (in green) through turbid water (white haze) at a test target (not visible). Our most important result from this experiment was that for oblique illumination from a single projector, one obtains roughly an order of magnitude improvement in SNR when imaging through turbid water if scanned confocal imaging is employed instead of floodlighting. The experiment is described, and its results analyzed, in this technical memo, co-authored with Hanumant Singh of the WHOI Deep Submergence Laboratory. Although not described in our SIGGRAPH paper, these results were presented during our SIGGRAPH 2004 talk, and they are included in the PowerPoint slide set linked above.


Copyright © 2004-2009 by Marc Levoy