This web page contains links to all my papers back to 1990, and selected ones beyond that. The list is sorted by topic, and then in reverse chronological order within each topic. A complete list may be found in my CV. For some of the older papers, PDFs have been created from optical scans of the original publications. The entries for some papers include links to software, data, other papers, or historical notes about the paper. The visualization at right below was created from the words on this page (with minor editing) using http://www.wordle.net.

Computational microscopy  
Enhancing the performance of the light field microscope using wavefront coding
Noy Cohen, Samuel Yang, Aaron Andalman, Michael Broxton, Logan Grosenick, Karl Deisseroth, Mark Horowitz, Marc Levoy Optics Express, Vol. 22, Issue 20 (2014). 
Light field microscopy has been proposed as a new high-speed volumetric computational imaging method that enables reconstruction of 3D volumes from captured projections of the 4D light field. Recently, a detailed physical optics model of the light field microscope has been derived, which led to the development of a deconvolution algorithm that reconstructs 3D volumes with high spatial resolution. However, the spatial resolution of the reconstructions has been shown to be nonuniform across depth, with some z planes showing high resolution and others, particularly at the center of the imaged volume, showing very low resolution. In this paper, we enhance the performance of the light field microscope using wavefront coding techniques. By including phase masks in the optical path of the microscope we are able to address this nonuniform resolution limitation. We have also found that superior control over the performance of the light field microscope can be achieved by using two phase masks rather than one, placed at the objective's back focal plane and at the microscope's native image plane. We present an extended optical model for our wavefront coded light field microscope and develop a performance metric based on Fisher information, which we use to choose adequate phase mask parameters. We validate our approach using both simulated data and experimental resolution measurements of a USAF 1951 resolution target, and demonstrate the utility for biological applications with in vivo volumetric calcium imaging of larval zebrafish brain.  
Wave Optics Theory and 3D Deconvolution for the Light Field Microscope
Michael Broxton, Logan Grosenick, Samuel Yang, Noy Cohen, Aaron Andalman, Karl Deisseroth, Marc Levoy Optics Express, Vol. 21, Issue 21, pp. 25418-25439 (2013). 
Light field microscopy is a new technique for high-speed volumetric imaging of weakly scattering or fluorescent specimens. It employs an array of microlenses to trade off spatial resolution against angular resolution, thereby allowing a 4D light field to be captured using a single photographic exposure without the need for scanning. The recorded light field can then be used to computationally reconstruct a full volume. In this paper, we present an optical model for light field microscopy based on wave optics, instead of previously reported ray optics models. We also present a 3D deconvolution method for light field microscopy that is able to reconstruct volumes at higher spatial resolution, and with better optical sectioning, than previously reported. To accomplish this, we take advantage of the dense spatio-angular sampling provided by a microlens array at axial positions away from the native object plane. This dense sampling permits us to decode aliasing present in the light field to reconstruct high-frequency information. We formulate our method as an inverse problem for reconstructing the 3D volume, which we solve using a GPU-accelerated iterative algorithm. Theoretical limits on the depth-dependent lateral resolution of the reconstructed volumes are derived. We show that these limits are in good agreement with experimental results on a standard USAF 1951 resolution target. Finally, we present 3D reconstructions of pollen grains that demonstrate the improvements in fidelity made possible by our method.  
Recording and controlling the 4D light field in a microscope
Marc Levoy, Zhengyun Zhang, Ian McDowall Journal of Microscopy, Volume 235, Part 2, 2009, pp. 144-162. Cover article. 
By inserting a microlens array at the intermediate image plane of an optical microscope, one can record 4D light fields of biological specimens in a single snapshot. Unlike a conventional photograph, light fields permit manipulation of viewpoint and focus after the snapshot has been taken, subject to the resolution of the camera and the diffraction limit of the optical system. By inserting a second microlens array and video projector into the microscope's illumination path, one can control the incident light field falling on the specimen in a similar way. In this paper we describe a prototype system we have built that implements these ideas, and we demonstrate two applications for it: simulating exotic microscope illumination modalities and correcting for optical aberrations digitally.  
Light Field Microscopy
Marc Levoy, Ren Ng, Andrew Adams, Matthew Footer, Mark Horowitz ACM Transactions on Graphics 25(3), Proc. SIGGRAPH 2006. An additional technical memo containing optical recipes and an extension to microscopes with infinity-corrected optics. 
By inserting a microlens array into the optical train of a conventional microscope, one can capture light fields of biological specimens in a single photograph. Although diffraction places a limit on the product of spatial and angular resolution in these light fields, we can nevertheless produce useful perspective views and focal stacks from them. Since microscopes are inherently orthographic devices, perspective views represent a new way to look at microscopic specimens. The ability to create focal stacks from a single photograph allows moving or light-sensitive specimens to be recorded. Applying 3D deconvolution to these focal stacks, we can produce a set of cross sections, which can be visualized using volume rendering. In this paper, we demonstrate a prototype light field microscope (LFM), analyze its optical performance, and show perspective views, focal stacks, and reconstructed volumes for a variety of biological specimens. We also show that synthetic focusing followed by 3D deconvolution is equivalent to applying limited-angle tomography directly to the 4D light field. 
Computational photography 
(except papers on light fields) 

Simulating the Visual Experience of Very Bright and Very Dark Scenes
David E. Jacobs, Orazio Gallo, Emily A. Cooper, Kari Pulli, Marc Levoy ACM Transactions on Graphics 34(3), April 2015. 
The human visual system can operate in a wide range of illumination levels, due to several adaptation processes working in concert. For the most part, these adaptation mechanisms are transparent, leaving the observer unaware of his or her absolute adaptation state. At extreme illumination levels, however, some of these mechanisms produce perceivable secondary effects, or epiphenomena. In bright light, these include bleaching afterimages and adaptation afterimages, while in dark conditions these include desaturation, loss of acuity, mesopic hue shift, and the Purkinje effect. In this work we examine whether displaying these effects explicitly can be used to extend the apparent dynamic range of a conventional computer display. We present phenomenological models for each effect, we describe efficient computer graphics methods for rendering our models, and we propose a gaze-adaptive display that injects the effects into imagery on a standard computer monitor. Finally, we report the results of psychophysical experiments, which reveal that while mesopic epiphenomena are a strong cue that a stimulus is very dark, afterimages have little impact on perception that a stimulus is very bright.  
Gyro-Based Multi-Image Deconvolution for Removing Handshake Blur
Sung Hee Park, Marc Levoy Proc. CVPR 2014. Click here for the associated tech report on handling moving objects and overexposed regions. 
Image deblurring to remove blur caused by camera shake has been intensively studied. Nevertheless, most methods are brittle and computationally expensive. In this paper we analyze multi-image approaches, which capture and combine multiple frames in order to make deblurring more robust and tractable. In particular, we compare the performance of two approaches: align-and-average and multi-image deconvolution. Our deconvolution is non-blind, using a blur model obtained from real camera motion as measured by a gyroscope. We show that in most situations such deconvolution outperforms align-and-average. We also show, perhaps surprisingly, that deconvolution does not benefit from increasing exposure time beyond a certain threshold. To demonstrate the effectiveness and efficiency of our method, we apply it to still-resolution imagery of natural scenes captured using a mobile camera with flexible camera control and an attached gyroscope.  
WYSIWYG Computational Photography via Viewfinder Editing
Jongmin Baek, Dawid Pająk, Kihwan Kim, Kari Pulli, Marc Levoy ACM Transactions on Graphics (Proc. SIGGRAPH Asia 2013) 
Digital cameras with electronic viewfinders provide a relatively faithful depiction of the final image, providing a WYSIWYG experience. If, however, the image is created from a burst of differently captured images, or nonlinear interactive edits significantly alter the final outcome, then the photographer cannot directly see the results, but instead must imagine the post-processing effects. This paper explores the notion of viewfinder editing, which makes the viewfinder more accurately reflect the final image the user intends to create. We allow the user to alter the local or global appearance (tone, color, saturation, or focus) via stroke-based input, and propagate the edits spatio-temporally. The system then delivers a real-time visualization of these modifications to the user, and drives the camera control routines to select better capture parameters.  
Applications of Multi-Bucket Sensors to Computational Photography
Gordon Wan, Mark Horowitz, Marc Levoy Stanford Computer Graphics Laboratory Technical Report 2012-2 
Many computational photography techniques take the form, "Capture a burst of images varying camera setting X (exposure, gain, focus, lighting), then align and combine them to produce a single photograph exhibiting better Y (dynamic range, signal-to-noise, depth of field)." Unfortunately, these techniques may fail on moving scenes because the images are captured sequentially, so objects are in different positions in each image, and robust local alignment is difficult to achieve. To overcome this limitation, we propose using multi-bucket sensors, which allow the images to be captured in time-slice-interleaved fashion. This interleaving produces images with nearly identical positions for moving objects, making alignment unnecessary. To test our proposal, we have designed and fabricated a 4-bucket, VGA-resolution CMOS image sensor, and we have applied it to high dynamic range (HDR) photography. Our sensor permits 4 different exposures to be captured at once with no motion difference between the exposures. Also, since our protocol employs non-destructive analog addition of time slices, it requires less total capture time than capturing a burst of images, thereby reducing total motion blur. Finally, we apply our multi-bucket sensor to several other computational photography applications, including flash/no-flash, multi-flash, and flash matting.  
Focal stack compositing for depth of field control
David E. Jacobs, Jongmin Baek, Marc Levoy Stanford Computer Graphics Laboratory Technical Report 2012-1 
Many cameras provide insufficient control over depth of field. Some have a fixed aperture; others have a variable aperture that is either too small or too large to produce the desired amount of blur. To overcome this limitation, one can capture a focal stack, which is a collection of images each focused at a different depth, then combine these slices to form a single composite that exhibits the desired depth of field. In this paper, we present a theory of focal stack compositing, and algorithms for computing images with extended depth of field, shallower depth of field than the lens aperture naturally provides, or even freeform (non-physical) depth of field. We show that while these composites are subject to halo artifacts, there is a principled methodology for avoiding these artifacts: feathering a slice selection map according to certain rules before computing the composite image.  
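As a rough illustration of the slice selection map mentioned above, here is a minimal sketch of a naive extended-depth-of-field composite on 1D "images". It picks, per pixel, the slice with the highest local contrast. The sharpness measure, the function names, and the toy data are assumptions made for illustration; the feathering rules that the paper uses to avoid halo artifacts are deliberately omitted.

```python
def sharpness(row, i):
    """Local contrast at pixel i: absolute 1D Laplacian, clamped at the borders."""
    left = row[max(i - 1, 0)]
    right = row[min(i + 1, len(row) - 1)]
    return abs(left - 2 * row[i] + right)


def selection_map(stack):
    """For each pixel, the index of the slice with the highest local sharpness."""
    n = len(stack[0])
    return [
        max(range(len(stack)), key=lambda s: sharpness(stack[s], i))
        for i in range(n)
    ]


def composite(stack, sel):
    """Assemble the output by copying each pixel from its selected slice."""
    return [stack[s][i] for i, s in enumerate(sel)]


# Toy focal stack (hypothetical data): slice 0 is sharp on the left half,
# slice 1 is sharp on the right half.
slice0 = [0, 9, 0, 9, 4, 5, 4, 5]
slice1 = [4, 5, 4, 5, 0, 9, 0, 9]
stack = [slice0, slice1]

sel = selection_map(stack)
edof = composite(stack, sel)
print(sel)
print(edof)
```

A hard per-pixel selection like this is exactly the kind of map that can produce halos at depth discontinuities in real images, which is why the abstract calls for feathering the map before compositing.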
Decoupling algorithms from schedules for easy optimization of image processing pipelines
Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, Fredo Durand ACM Transactions on Graphics 31(4) (Proc. SIGGRAPH 2012). Click here for more information on the Halide language. Its compiler is open source and actively supported. You might also be interested in our SIGGRAPH 2014 paper on Darkroom: compiling a Halide-like language into hardware pipelines. 
Using existing programming tools, writing high-performance image processing code requires sacrificing readability, portability, and modularity. We argue that this is a consequence of conflating what computations define the algorithm, with decisions about storage and the order of computation. We refer to these latter two concerns as the schedule, including choices of tiling, fusion, recomputation vs. storage, vectorization, and parallelism. We propose a representation for feed-forward imaging pipelines that separates the algorithm from its schedule, enabling high-performance without sacrificing code clarity. This decoupling simplifies the algorithm specification: images and intermediate buffers become functions over an infinite integer domain, with no explicit storage or boundary conditions. Imaging pipelines are compositions of functions. Programmers separately specify scheduling strategies for the various functions composing the algorithm, which allows them to efficiently explore different optimizations without changing the algorithmic code. We demonstrate the power of this representation by expressing a range of recent image processing applications in an embedded domain specific language called Halide, and compiling them for ARM, x86, and GPUs. Our compiler targets SIMD units, multiple cores, and complex memory hierarchies. We demonstrate that it can handle algorithms such as a camera raw pipeline, the bilateral grid, fast local Laplacian filtering, and image segmentation. The algorithms expressed in our language are both shorter and faster than state-of-the-art implementations.  
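To make the algorithm/schedule split concrete, here is a small sketch in plain Python. It is not Halide code and does not use the Halide API; the names `make_pipeline`, `realize`, and the "recompute"/"store" schedule labels are hypothetical. The two-stage blur is written once as pure functions over integer coordinates, and the only thing that changes between the two runs is whether the intermediate stage is recomputed at every use or computed once and stored.

```python
from functools import lru_cache

# Toy input image (hypothetical data), exposed as a function over integer
# coordinates with clamped lookups so it can be sampled anywhere.
W, H = 6, 4
data = [[(x * 7 + y * 3) % 10 for x in range(W)] for y in range(H)]


def inp(x, y):
    return data[min(max(y, 0), H - 1)][min(max(x, 0), W - 1)]


def make_pipeline(schedule):
    """Build a two-stage box blur; `schedule` only affects how blur_x is evaluated."""

    # Algorithm, stage 1: horizontal 3-tap blur.
    def blur_x(x, y):
        return (inp(x - 1, y) + inp(x, y) + inp(x + 1, y)) / 3.0

    if schedule == "store":
        # Schedule choice: compute each blur_x value once and keep it.
        blur_x = lru_cache(maxsize=None)(blur_x)
    # With "recompute", blur_x is re-evaluated at each of its three uses.

    # Algorithm, stage 2: vertical 3-tap blur of stage 1.
    def blur_y(x, y):
        return (blur_x(x, y - 1) + blur_x(x, y) + blur_x(x, y + 1)) / 3.0

    return blur_y


def realize(f):
    """Evaluate a pipeline function over the output domain."""
    return [[f(x, y) for x in range(W)] for y in range(H)]


out_recompute = realize(make_pipeline("recompute"))
out_store = realize(make_pipeline("store"))
print(out_recompute == out_store)
```

Both schedules produce the same image; they differ only in the trade between redundant work and storage, which is the kind of choice the abstract assigns to the schedule rather than to the algorithm.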
CMOS Image Sensors With Multi-Bucket Pixels for Computational Photography
Gordon Wan, Xiangli Li, Gennadiy Agranov, Marc Levoy, Mark Horowitz IEEE Journal of Solid-State Circuits, Vol. 47, No. 4, April, 2012, pp. 1031-1042. 
This paper presents new image sensors with multi-bucket pixels that enable time-multiplexed exposure, an alternative imaging approach. This approach deals nicely with scene motion, and greatly improves high dynamic range imaging, structured light illumination, motion corrected photography, etc. To implement an in-pixel memory or a bucket, the new image sensors incorporate the virtual phase CCD concept into a standard 4-transistor CMOS imager pixel. This design allows us to create a multi-bucket pixel which is compact, scalable, and supports true correlated double sampling to cancel kTC noise. Two image sensors with dual- and quad-bucket pixels have been designed and fabricated. The dual-bucket sensor consists of an array of 5.0 µm pixels in 0.11 µm CMOS technology while the quad-bucket sensor comprises an array of 5.6 µm pixels in 0.13 µm CMOS technology. Some computational photography applications were implemented using the two sensors to demonstrate their value in eliminating artifacts that currently plague computational photography.  
Digital Video Stabilization and Rolling Shutter Correction using Gyroscopes
Alexandre Karpenko, David E. Jacobs, Jongmin Baek, Marc Levoy Stanford Computer Science Tech Report CSTR 2011-03, September, 2011. Click here for the source code. 
In this paper we present a robust, real-time video stabilization and rolling shutter correction technique based on commodity gyroscopes. First, we develop a unified algorithm for modeling camera motion and rolling shutter warping. We then present a novel framework for automatically calibrating the gyroscope and camera outputs from a single video capture. This calibration allows us to use only gyroscope data to effectively correct rolling shutter warping and to stabilize the video. Using our algorithm, we show results for videos featuring large moving foreground objects, parallax, and low illumination. We also compare our method with commercial image-based stabilization algorithms. We find that our solution is more robust and computationally inexpensive. Finally, we implement our algorithm directly on a mobile phone. We demonstrate that by using the phone's built-in gyroscope and GPU, we can remove camera shake and rolling shutter artifacts in real-time.  
Experimental Platforms for Computational Photography
Marc Levoy IEEE Computer Graphics and Applications, Vol. 30, No. 5, September/October, 2010, pp. 81-87. If you're looking for our SIGGRAPH 2010 paper on the Frankencamera, it's the next paper on this web page. 
Although interest in computational photography has steadily increased among graphics and vision researchers, few of these techniques have found their way into commercial cameras. In this article I offer several possible explanations, including barriers to entry that arise from the current structure of the photography industry, and an incompleteness and lack of robustness in current computational photography techniques. To begin addressing these problems, my laboratory has designed an open architecture for programmable cameras (called Frankencamera), an API (called FCam) with bindings for C++, and two reference implementations: a Nokia N900 smartphone with a modified software stack and a custom camera called the Frankencamera F2. Our short-term goal is to standardize this architecture and distribute our reference platforms to researchers and students worldwide. Our long-term goal is to help create an open-source camera community, leading eventually to commercial cameras that accept plugins and apps. I discuss the steps that might be needed to bootstrap this community, including scaling up the world's educational programs in photographic technology. Finally, I talk about some of the future research challenges in computational photography.  
The Frankencamera: An Experimental Platform for Computational Photography
Andrew Adams, Eino-Ville (Eddy) Talvala, Sung Hee Park, David E. Jacobs, Boris Ajdin, Natasha Gelfand, Jennifer Dolson, Daniel Vaquero, Jongmin Baek, Marius Tico, Henrik P.A. Lensch, Wojciech Matusik, Kari Pulli, Mark Horowitz, Marc Levoy Proc. SIGGRAPH 2010. Reprinted in CACM, November 2012, with an introductory technical perspective by Richard Szeliski. If you're looking for our release of the FCam API for the camera on the Nokia N900 smartphone, click here. 
Although there has been much interest in computational photography within the research and photography communities, progress has been hampered by the lack of a portable, programmable camera with sufficient image quality and computing power. To address this problem, we have designed and implemented an open architecture and API for such cameras: the Frankencamera. It consists of a base hardware specification, a software stack based on Linux, and an API for C++. Our architecture permits control and synchronization of the sensor and image processing pipeline at the microsecond time scale, as well as the ability to incorporate and synchronize external hardware like lenses and flashes. This paper specifies our architecture and API, and it describes two reference implementations we have built. Using these implementations we demonstrate six computational photography applications: HDR viewfinding and capture, low-light viewfinding and capture, automated acquisition of extended dynamic range panoramas, foveal imaging, IMU-based hand shake detection, and rephotography. Our goal is to standardize the architecture and distribute Frankencameras to researchers and students, as a step towards creating a community of photographer-programmers who develop algorithms, applications, and hardware for computational cameras.  
Gaussian KD-Trees for Fast High-Dimensional Filtering
Andrew Adams, Natasha Gelfand, Jennifer Dolson, Marc Levoy ACM Transactions on Graphics 28(3), Proc. SIGGRAPH 2009. A follow-on paper, which filters in high-D using a permutohedral lattice, was runner-up for best paper at Eurographics 2010. 
We propose a method for accelerating a broad class of nonlinear filters that includes the bilateral, non-local means, and other related filters. These filters can all be expressed in a similar way: First, assign each value to be filtered a position in some vector space. Then, replace every value with a weighted linear combination of all values, with weights determined by a Gaussian function of distance between the positions. If the values are pixel colors and the positions are (x, y) coordinates, this describes a Gaussian blur. If the positions are instead (x, y, r, g, b) coordinates in a five-dimensional space-color volume, this describes a bilateral filter. If we instead set the positions to local patches of color around the associated pixel, this describes non-local means. We describe a Monte-Carlo kd-tree sampling algorithm that efficiently computes any filter that can be expressed in this way, along with a GPU implementation of this technique. We use this algorithm to implement an accelerated bilateral filter that respects full 3D color distance; accelerated non-local means on single images, volumes, and unaligned bursts of images for denoising; and a fast adaptation of non-local means to geometry. If we have n values to filter, and each is assigned a position in a d-dimensional space, then our space complexity is O(dn) and our time complexity is O(dn log n), whereas existing methods are typically either exponential in d or quadratic in n.  
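The filter family described above can be written down directly as a brute-force reference, which is the quadratic-in-n baseline that the kd-tree method is designed to avoid. The sketch below is not the paper's algorithm; the function name, the normalization by the sum of weights, the 1D test signal, and the sigma values are all assumptions chosen for illustration.

```python
import math


def gauss_filter(positions, values, sigma=1.0):
    """Brute-force O(d * n^2) filter: each output is a Gaussian-weighted
    average of all values, weighted by distance between positions."""
    out = []
    for p in positions:
        total_w = 0.0
        acc = 0.0
        for q, v in zip(positions, values):
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            total_w += w
            acc += w * v
        # Normalizing by the total weight is an assumption of this sketch.
        out.append(acc / total_w)
    return out


# 1D signal with a step edge; the values are scalar intensities.
signal = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

# Positions = (x): an ordinary Gaussian blur, which smears the edge.
blur = gauss_filter([(float(x),) for x in range(len(signal))], signal)

# Positions = (x, intensity / sigma_r): a bilateral filter, which keeps
# the edge because pixels across the step are far apart in position space.
sigma_r = 0.1
bilateral = gauss_filter(
    [(float(x), v / sigma_r) for x, v in enumerate(signal)], signal
)

print(blur)
print(bilateral)
```

Only the choice of positions differs between the two calls, which mirrors the abstract's point that Gaussian blur, bilateral filtering, and non-local means are the same computation over different position spaces.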
Spatially Adaptive Photographic Flash
Rolf Adelsberger, Remo Ziegler, Marc Levoy, Markus Gross Technical Report 612, ETH Zurich, Institute of Visual Computing, December 2008. 
Using photographic flash for candid shots often results in an unevenly lit scene, in which objects in the back appear dark. We describe a spatially adaptive photographic flash system, in which the intensity of illumination varies depending on the depth and reflectivity of features in the scene. We adapt to changes in depth using a single-shot method, and to changes in reflectivity using a multi-shot method. The single-shot method requires only a depth image, whereas the multi-shot method requires at least one color image in addition to the depth data. To reduce noise in our depth images, we present a novel filter that takes into account the amplitude-dependent noise distribution of observed depth values. To demonstrate our ideas, we have built a prototype consisting of a depth camera, a flash light, an LCD and a lens. By attenuating the flash using the LCD, a variety of illumination effects can be achieved.  
Veiling Glare in High Dynamic Range Imaging
Eino-Ville (Eddy) Talvala, Andrew Adams, Mark Horowitz, Marc Levoy ACM Transactions on Graphics 26(3), Proc. SIGGRAPH 2007 
The ability of a camera to record a high dynamic range image, whether by taking one snapshot or a sequence, is limited by the presence of veiling glare, the tendency of bright objects in the scene to reduce the contrast everywhere within the field of view. Veiling glare is a global illumination effect that arises from multiple scattering of light inside the camera's optics, body, and sensor. By measuring separately the direct and indirect components of the intra-camera light transport, one can increase the maximum dynamic range a particular camera is capable of recording. In this paper, we quantify the presence of veiling glare and related optical artifacts for several types of digital cameras, and we describe two methods for removing them: deconvolution by a measured glare spread function, and a novel direct-indirect separation of the lens transport using a structured occlusion mask. By physically blocking the light that contributes to veiling glare, we attain significantly higher signal-to-noise ratios than with deconvolution. Finally, we demonstrate our separation method for several combinations of cameras and realistic scenes.  
Interactive Design of Multi-Perspective Images for Visualizing Urban Landscapes
Augusto Román, Gaurav Garg, Marc Levoy Proc. Visualization 2004. This project was the genesis of Google's StreetView; see this historical note for details. In a follow-on paper in EGSR 2006, Augusto Román and Hendrik Lensch describe an automatic way to compute these multi-perspective panoramas. 
Multi-perspective images are a useful representation of extended, roughly planar scenes such as landscapes or city blocks. However, constructing effective multi-perspective images is something of an art. In this paper, we describe an interactive system for creating multi-perspective images composed of serially blended cross-slits mosaics. Beginning with a sideways-looking video of the scene as might be captured from a moving vehicle, we allow the user to interactively specify a set of cross-slits cameras, possibly with gaps between them. In each camera, one of the slits is defined to be the camera path, which is typically horizontal, and the user is left to choose the second slit, which is typically vertical. The system then generates intermediate views between these cameras using a novel interpolation scheme, thereby producing a multi-perspective image with no seams. The user can also choose the picture surface in space onto which viewing rays are projected, thereby establishing a parameterization for the image. We show how the choice of this surface can be used to create interesting visual effects. We demonstrate our system by constructing multi-perspective images that summarize city blocks, including corners, blocks with deep plazas and other challenging urban situations. 
Light fields 
(except papers on camera arrays) 

Unstructured Light Fields
Abe Davis, Fredo Durand, Marc Levoy Computer Graphics Forum (Proc. Eurographics), Volume 31, Number 2, 2012. 
We present a system for interactively acquiring and rendering light fields using a handheld commodity camera. The main challenge we address is assisting a user in achieving good coverage of the 4D domain despite the challenges of handheld acquisition. We define coverage by bounding reprojection error between viewpoints, which accounts for all 4 dimensions of the light field. We use this criterion together with a recent Simultaneous Localization and Mapping technique to compute a coverage map on the space of viewpoints. We provide users with real-time feedback and direct them toward undersampled parts of the light field. Our system is lightweight and has allowed us to capture hundreds of light fields. We further present a new rendering algorithm that is tailored to the unstructured yet dense data we capture. Our method can achieve piecewise-bicubic reconstruction using a triangulation of the captured viewpoints and subdivision rules applied to reconstruction weights.  
Wigner Distributions and How They Relate to the Light Field
Zhengyun Zhang, Marc Levoy IEEE International Conference on Computational Photography (ICCP) 2009. Best Paper award. 
In wave optics, the Wigner distribution and its Fourier dual, the ambiguity function, are important tools in optical system simulation and analysis. The light field fulfills a similar role in the computer graphics community. In this paper, we establish that the light field as it is used in computer graphics is equivalent to a smoothed Wigner distribution and that these are equivalent to the raw Wigner distribution under a geometric optics approximation. Using this insight, we then explore two recent contributions: Fourier slice photography in computer graphics and wavefront coding in optics, and we examine the similarity between explanations of them using Wigner distributions and explanations of them using light fields. Understanding this long-suspected equivalence may lead to additional insights and the productive exchange of ideas between the two fields.  
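For readers unfamiliar with the two wave-optics tools named above, the standard textbook definitions for a coherent scalar field f(x) in one spatial dimension are given below. The notation (f, x, u, ξ, ν) is an assumption made here for illustration and may differ from the notation and conventions used in the paper.

```latex
% Wigner distribution: a joint function of position x and spatial frequency u
W_f(x, u) = \int f\!\left(x + \tfrac{\xi}{2}\right)
            f^{*}\!\left(x - \tfrac{\xi}{2}\right)
            e^{-i 2\pi u \xi} \, d\xi

% Ambiguity function: the same product, transformed over x instead of \xi
A_f(\xi, \nu) = \int f\!\left(x + \tfrac{\xi}{2}\right)
                f^{*}\!\left(x - \tfrac{\xi}{2}\right)
                e^{-i 2\pi \nu x} \, dx
```

Both are built from the same shifted product of the field with its conjugate, and they are related to each other by a two-dimensional Fourier transform, which is the sense in which the abstract calls the ambiguity function the Fourier dual of the Wigner distribution.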
Flexible Multimodal Camera Using a Light Field Architecture
Roarke Horstmeyer, Gary Euliss, Ravindra Athale, Marc Levoy IEEE International Conference on Computational Photography (ICCP) 2009 
We present a modified conventional camera that is able to collect multimodal images in a single exposure. Utilizing a light field architecture in conjunction with multiple filters placed in the pupil plane of a main lens, we are able to digitally reconstruct synthetic images containing specific spectral, polarimetric, and other optically filtered data. The ease with which these filters can be exchanged and reconfigured provides a high degree of flexibility in the type of information that can be collected with each image. This paper explores the various tradeoffs involved in implementing a pinhole array in parallel with a pupil-plane filter array to measure multi-dimensional optical data from a scene. It also examines the design space of a pupil-plane filter array layout. Images are shown from different multimodal filter layouts, and techniques to maximize resolution and minimize error in the synthetic images are proposed.  
Combining Confocal Imaging and Descattering
Christian Fuchs, Michael Heinz, Marc Levoy, Hendrik P.A. Lensch Eurographics Symposium on Rendering (EGSR) 2008 
In translucent objects, light paths are affected by multiple scattering, which pollutes any observation. Confocal imaging reduces the influence of such global illumination effects by carefully focusing illumination and viewing rays from a large aperture to a specific location within the object volume. The selected light paths still contain some global scattering contributions, though. Descattering based on high frequency illumination serves the same purpose. It removes the global component from observed light paths. We demonstrate that confocal imaging and descattering are orthogonal and propose a novel descattering protocol that analyzes the light transport in a neighborhood of light transport paths. In combination with confocal imaging, our descattering method achieves optical sectioning in translucent media with higher contrast and better resolution.  
General Linear Cameras with Finite Aperture
Andrew Adams and Marc Levoy Eurographics Symposium on Rendering (EGSR) 2007 
A pinhole camera selects a two-dimensional set of rays from the four-dimensional light field. Pinhole cameras are a type of general linear camera, defined as planar 2D slices of the 4D light field. Cameras with finite apertures can be considered as the summation of a collection of pinhole cameras. In the limit they evaluate a two-dimensional integral of the four-dimensional light field. Hence a general linear camera with finite aperture factors the 4D light field into two integrated dimensions and two imaged dimensions. We present a simple framework for representing these slices and integral projections, based on certain eigenspaces in a two-plane parameterization of the light field. Our framework allows for easy analysis of focus and perspective, and it demonstrates their dual nature. Using our framework, we present analogous taxonomies of perspective and focus, placing within them the familiar perspective, orthographic, cross-slit, and bilinear cameras; astigmatic and anastigmatic focus; and several other varieties of perspective and focus.  
Light Fields and Computational Imaging
Marc Levoy IEEE Computer, August 2006 Includes links to the other four feature articles in that issue, which was devoted to computational photography 
A survey of the theory and practice of light field imaging, emphasizing the devices researchers in computer graphics and computer vision have built to capture light fields photographically and the techniques they have developed to compute novel images from them.  
Symmetric Photography: Exploiting Data-sparseness in Reflectance Fields
Gaurav Garg, Eino-Ville (Eddy) Talvala, Marc Levoy, Hendrik P.A. Lensch Proc. 2006 Eurographics Symposium on Rendering 
We present a novel technique called symmetric photography to capture real world reflectance fields. The technique models the 8D reflectance field as a transport matrix between the 4D incident light field and the 4D exitant light field. It is a challenging task to acquire this transport matrix due to its large size. Fortunately, the transport matrix is symmetric and often data-sparse. Symmetry enables us to measure the light transport from two sides simultaneously, from the illumination directions and the view directions. Data-sparseness refers to the fact that sub-blocks of the matrix can be well approximated using low-rank representations. We introduce the use of hierarchical tensors as the underlying data structure to capture this data-sparseness, specifically through local rank-1 factorizations of the transport matrix. Besides providing an efficient representation for storage, it enables fast acquisition of the approximated transport matrix and fast rendering of images from the captured matrix. Our prototype acquisition system consists of an array of mirrors and a pair of coaxial projector and camera. We demonstrate the effectiveness of our system with scenes rendered from reflectance fields that were captured by our system. In these renderings we can change the viewpoint as well as relight using arbitrary incident light fields.  
Dual Photography
Pradeep Sen, Billy Chen, Gaurav Garg, Steve Marschner, Mark Horowitz, Marc Levoy, Hendrik Lensch ACM Transactions on Graphics 24(3), Proc. SIGGRAPH 2005 
We present a novel photographic technique called dual photography, which exploits Helmholtz reciprocity to interchange the lights and cameras in a scene. With a video projector providing structured illumination, reciprocity permits us to generate pictures from the viewpoint of the projector, even though no camera was present at that location. The technique is completely image-based, requiring no knowledge of scene geometry or surface properties, and by its nature automatically includes all transport paths, including shadows, interreflections and caustics. In its simplest form, the technique can be used to take photographs without a camera; we demonstrate this by capturing a photograph using a projector and a photoresistor. If the photoresistor is replaced by a camera, we can produce a 4D dataset that allows for relighting with 2D incident illumination. Using an array of cameras we can produce a 6D slice of the 8D reflectance field that allows for relighting with arbitrary light fields. Since an array of cameras can operate in parallel without interference, whereas an array of light sources cannot, dual photography is fundamentally a more efficient way to capture such a 6D dataset than a system based on multiple projectors and one camera. As an example, we show how dual photography can be used to capture and relight scenes.  
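The reciprocity idea in this abstract can be illustrated with a toy light transport matrix. The sketch below is not code from the paper: it assumes a hypothetical 2-pixel projector and 3-pixel camera, and the matrix values and helper names (`mat_vec`, `transpose`) are invented for illustration. The primal image is the transport matrix applied to a projector pattern; the dual image, as seen from the projector, comes from applying the transpose.

```python
# Toy illustration of dual photography with a light transport matrix.
# All numbers are made up; a real transport matrix is measured, and far larger.

def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Swap the rows and columns of a matrix."""
    return [list(column) for column in zip(*matrix)]

# T[i][j]: fraction of light leaving projector pixel j that reaches camera pixel i.
T = [
    [0.5, 0.0],
    [0.2, 0.3],
    [0.0, 0.4],
]

# Primal configuration: light projector pixel 0 only and record the camera image.
projector_pattern = [1.0, 0.0]
camera_image = mat_vec(T, projector_pattern)

# Dual configuration: by Helmholtz reciprocity the transposed matrix describes
# transport from (virtual) lights at the camera pixels to the projector pixels.
virtual_light = [1.0, 1.0, 1.0]
dual_image = mat_vec(transpose(T), virtual_light)

print(camera_image)
print(dual_image)
```

Here `dual_image` has one value per projector pixel, which is what makes it a picture from the projector's viewpoint even though nothing was recorded there.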

Light Field Photography with a Hand-Held Plenoptic Camera
Ren Ng, Marc Levoy, Mathieu Brédif, Gene Duval, Mark Horowitz, Pat Hanrahan Stanford University Computer Science Tech Report CSTR 2005-02, April 2005 The refocusing performance of this camera is analyzed in Ren Ng's SIGGRAPH 2005 paper, Fourier Slice Photography. Ren's PhD dissertation, "Digital Light Field Photography," won the 2006 ACM Doctoral Dissertation Award. See also his startup company, Lytro. 
This paper presents a camera that samples the 4D light field on its sensor in a single photographic exposure. This is achieved by inserting a microlens array between the sensor and main lens, creating a plenoptic camera. Each microlens measures not just the total amount of light deposited at that location, but how much light arrives along each ray. By resorting the measured rays of light to where they would have terminated in slightly different, synthetic cameras, we can compute sharp photographs focused at different depths. We show that a linear increase in the resolution of images under each microlens results in a linear increase in the sharpness of the refocused photographs. This property allows us to extend the depth of field of the camera without reducing the aperture, enabling shorter exposures and lower image noise. Especially in the macrophotography regime, we demonstrate that we can also compute synthetic photographs from a range of different viewpoints. These capabilities argue for a different strategy in designing photographic imaging systems. To the photographer, the plenoptic camera operates exactly like an ordinary hand-held camera. We have used our prototype to take hundreds of light field photographs, and we present examples of portraits, high-speed action and macro close-ups.  
Interactive Deformation of Light Fields
Billy Chen, Eyal Ofek, Harry Shum, Marc Levoy Proc. Symposium on Interactive 3D Graphics and Games (I3D) 2005 
We present a software pipeline that enables an animator to deform light fields. The pipeline can be used to deform complex objects, such as furry toys, while maintaining photorealistic quality. Our pipeline consists of three stages. First, we split the light field into sub-light fields. To facilitate splitting of complex objects, we employ a novel technique based on projected light patterns. Second, we deform each sub-light field. To do this, we provide the animator with controls similar to volumetric free-form deformation. Third, we recombine and render each sub-light field. Our rendering technique properly handles visibility changes due to occlusion among sub-light fields. To ensure consistent illumination of objects after they have been deformed, our light fields are captured with the light source fixed to the camera, rather than being fixed to the object. We demonstrate our deformation pipeline using synthetic and photographically acquired light fields. Potential applications include animation, interior design, and interactive gaming.  

Synthetic aperture confocal imaging
Marc Levoy, Billy Chen, Vaibhav Vaish, Mark Horowitz, Ian McDowall, Mark Bolas ACM Transactions on Graphics 23(3), Proc. SIGGRAPH 2004 About the relationship between confocal imaging and separation of direct and global reflections in 3D scenes. An additional test of underwater confocal imaging performed in a large water tank at the Woods Hole Oceanographic Institution.

Confocal microscopy is a family of imaging techniques that employ focused patterned illumination and synchronized imaging to create cross-sectional views of 3D biological specimens. In this paper, we adapt confocal imaging to large-scale scenes by replacing the optical apertures used in microscopy with arrays of real or virtual video projectors and cameras. Our prototype implementation uses a video projector, a camera, and an array of mirrors. Using this implementation, we explore confocal imaging of partially occluded environments, such as foliage, and weakly scattering environments, such as murky water. We demonstrate the ability to selectively image any plane in a partially occluded environment, and to see further through murky water than is otherwise possible. By thresholding the confocal images, we extract mattes that can be used to selectively illuminate any plane in the scene.  
Light Field Rendering
Marc Levoy and Pat Hanrahan Proc. SIGGRAPH 1996 About the similarity between this paper and the Lumigraph paper.

A number of techniques have been proposed for flying through scenes by redisplaying previously rendered or digitized views. Techniques have also been proposed for interpolating between views by warping input images, using depth information or correspondences between multiple images. In this paper, we describe a simple and robust method for generating new views from arbitrary camera positions without depth information or feature matching, simply by combining and resampling the available images. The key to this technique lies in interpreting the input images as 2D slices of a 4D function - the light field. This function completely characterizes the flow of light through unobstructed space in a static scene with fixed illumination. We describe a sampled representation for light fields that allows for both efficient creation and display of inward and outward looking views. We have created light fields from large arrays of both rendered and digitized images. The latter are acquired using a video camera mounted on a computer-controlled gantry. Once a light field has been created, new views may be constructed in real time by extracting slices in appropriate directions. Since the success of the method depends on having a high sample rate, we describe a compression system that is able to compress the light fields we have generated by more than a factor of 100:1 with very little loss of fidelity. We also address the issues of antialiasing during creation, and resampling during slice extraction. 
Camera arrays  
Reconstructing Occluded Surfaces using Synthetic Apertures: Stereo, Focus and Robust Measures
Vaibhav Vaish, Richard Szeliski, C.L. Zitnick, Sing Bing Kang, Marc Levoy Proc. CVPR 2006. 
Most algorithms for 3D reconstruction from images use cost functions based on SSD, which assume that the surfaces being reconstructed are visible to all cameras. This makes it difficult to reconstruct objects which are partially occluded. Recently, researchers working with large camera arrays have shown it is possible to see through occlusions using a technique called synthetic aperture focusing. This suggests that we can design alternative cost functions that are robust to occlusions using synthetic apertures. Our paper explores this design space. We compare classical shape from stereo with shape from synthetic aperture focus. We also describe two variants of multi-view stereo based on color medians and entropy that increase robustness to occlusions. We present an experimental comparison of these cost functions on complex light fields, measuring their accuracy against the amount of occlusion.  
High Performance Imaging Using Large Camera Arrays
Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville (Eddy) Talvala, Emilio Antunez, Adam Barth, Andrew Adams, Marc Levoy, Mark Horowitz ACM Transactions on Graphics 24(3), Proc. SIGGRAPH 2005 
The advent of inexpensive digital image sensors, and the ability to create photographs that combine information from a number of sensed images, is changing the way we think about photography. In this paper, we describe a unique array of 100 custom video cameras that we have built, and we summarize our experiences using this array in a range of imaging applications. Our goal was to explore the capabilities of a system that would be inexpensive to produce in the future. With this in mind, we used simple cameras, lenses, and mountings, and we assumed that processing large numbers of images would eventually be easy and cheap. The applications we have explored include approximating a conventional single center of projection video camera with high performance along one or more axes, such as resolution, dynamic range, frame rate, and/or large aperture, and using multiple cameras to approximate a video camera with a large synthetic aperture. This permits us to capture a video light field, to which we can apply spatiotemporal view interpolation algorithms in order to digitally simulate time dilation and camera motion. It also permits us to create video sequences using custom non-uniform synthetic apertures.  

Synthetic Aperture Focusing using a Shear-Warp Factorization of the Viewing Transform
Vaibhav Vaish, Gaurav Garg, Eino-Ville (Eddy) Talvala, Emilio Antunez, Bennett Wilburn, Mark Horowitz, Marc Levoy Proc. Workshop on Advanced 3D Imaging for Safety and Security (A3DISS) 2005 (in conjunction with CVPR 2005) 
Synthetic aperture focusing consists of warping and adding together the images in a 4D light field so that objects lying on a specified surface are aligned and thus in focus, while objects lying off this surface are misaligned and hence blurred. This provides the ability to see through partial occluders such as foliage and crowds, making it a potentially powerful tool for surveillance. If the cameras lie on a plane, it has been previously shown that after an initial homography, one can move the focus through a family of planes that are parallel to the camera plane by merely shifting and adding the images. In this paper, we analyze the warps required for tilted focal planes and arbitrary camera configurations. We characterize the warps using a new rank-1 constraint that lets us focus on any plane, without having to perform a metric calibration of the cameras. We also show that there are camera configurations and families of tilted focal planes for which the warps can be factorized into an initial homography followed by shifts. This homography factorization permits these tilted focal planes to be synthesized as efficiently as fronto-parallel planes. Being able to vary the focus by simply shifting and adding images is relatively simple to implement in hardware and facilitates a real-time implementation. We demonstrate this using an array of 30 video-resolution cameras; initial homographies and shifts are performed on per-camera FPGAs, and additions and a final warp are performed on 3 PCs.  
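The shift-and-add operation mentioned in this abstract can be sketched in a few lines. The example below is only an illustration of the simplest case (cameras on a line, focal planes parallel to the camera line, integer pixel shifts); it does not implement the tilted-plane warps or the homography factorization that the paper contributes. The scene, the one-dimensional "images", and the function names are all invented for the example.

```python
# Shift-and-add synthetic aperture focusing on 1D images (illustrative only).

def shift(row, offset):
    """Shift a 1D image right by `offset` pixels (left if negative), zero-padded."""
    out = [0.0] * len(row)
    for x, value in enumerate(row):
        target = x + offset
        if 0 <= target < len(row):
            out[target] = value
    return out

def synthetic_aperture_focus(images, camera_positions, disparity):
    """Align every image for the plane with the given disparity, then average."""
    total = [0.0] * len(images[0])
    for image, position in zip(images, camera_positions):
        shifted = shift(image, -position * disparity)
        total = [t + s for t, s in zip(total, shifted)]
    return [t / len(images) for t in total]

def render(width, points, position):
    """Hypothetical scene: each point is (pixel in the central camera, disparity)."""
    row = [0.0] * width
    for base_pixel, disparity in points:
        row[base_pixel + position * disparity] = 1.0
    return row

camera_positions = [-1, 0, 1]
points = [(4, 1), (6, 2)]  # a far point and a nearer point
images = [render(9, points, p) for p in camera_positions]

focused_far = synthetic_aperture_focus(images, camera_positions, 1)
focused_near = synthetic_aperture_focus(images, camera_positions, 2)
print(focused_far)
print(focused_near)
```

In `focused_far` the far point adds up coherently at pixel 4 while the near point is smeared over three pixels; `focused_near` shows the reverse.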
High Speed Video Using a Dense Camera Array
Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Marc Levoy, Mark Horowitz Proc. CVPR 2004 
We demonstrate a system for capturing multi-thousand frame-per-second (fps) video using a dense array of cheap 30fps CMOS image sensors. A benefit of using a camera array to capture high speed video is that we can scale to higher speeds by simply adding more cameras. Even at extremely high frame rates, our array architecture supports continuous streaming to disk from all of the cameras. This allows us to record unpredictable events, in which nothing occurs before the event of interest that could be used to trigger the beginning of recording. Synthesizing one high speed video sequence using images from an array of cameras requires methods to calibrate and correct those cameras' varying radiometric and geometric properties. We assume that our scene is either relatively planar or is very far away from the camera and that the images can therefore be aligned using projective transforms. We analyze the errors from this assumption and present methods to make them less visually objectionable. We also present a method to automatically color match our sensors. Finally, we demonstrate how to compensate for spatial and temporal distortions caused by the electronic rolling shutter, a common feature of low-end CMOS sensors.  
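One way to see why adding cameras raises the frame rate is to interleave frames from cameras whose triggers are evenly staggered. The sketch below assumes such staggering (the abstract does not spell out the trigger scheme) and ignores calibration, alignment, and rolling-shutter correction entirely; the function names and the choice of 4 cameras are hypothetical.

```python
# Interleaving frames from evenly staggered 30 fps cameras (illustrative only).
from fractions import Fraction

def capture_times(camera_index, num_cameras, base_fps, num_frames):
    """Exact capture times for one camera, assuming evenly staggered triggers."""
    period = Fraction(1, base_fps)
    offset = period * camera_index / num_cameras
    return [offset + i * period for i in range(num_frames)]

def interleave(num_cameras, base_fps, frames_per_camera):
    """Merge all cameras' frames into one list sorted by capture time."""
    tagged = []
    for camera in range(num_cameras):
        times = capture_times(camera, num_cameras, base_fps, frames_per_camera)
        for frame_index, time in enumerate(times):
            tagged.append((time, camera, frame_index))
    tagged.sort()
    return tagged

sequence = interleave(num_cameras=4, base_fps=30, frames_per_camera=2)
effective_fps = 1 / (sequence[1][0] - sequence[0][0])
print([camera for _, camera, _ in sequence])
print(effective_fps)
```

Under these assumptions the merged sequence has a frame spacing of 1/(30 N) seconds for N cameras, so the effective rate grows linearly with the number of cameras.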
Using Plane + Parallax for Calibrating Dense Camera Arrays
Vaibhav Vaish, Bennett Wilburn, Neel Joshi, Marc Levoy Proc. CVPR 2004 
A light field consists of images of a scene taken from different viewpoints. Light fields are used in computer graphics for image-based rendering and synthetic aperture photography, and in vision for recovering shape. In this paper, we describe a simple procedure to calibrate camera arrays used to capture light fields using a plane + parallax framework. Specifically, for the case when the cameras lie on a plane, we show (i) how to estimate camera positions up to an affine ambiguity, and (ii) how to reproject light field images onto a family of planes using only knowledge of planar parallax for one point in the scene. While planar parallax does not completely describe the geometry of the light field, it is adequate for the first two applications which, it turns out, do not depend on having a metric calibration of the light field. Experiments on acquired light fields indicate that our method yields better results than full metric calibration. 
Polygon meshes  
Geometrically Stable Sampling for the ICP Algorithm
Natasha Gelfand, Leslie Ikemoto, Szymon Rusinkiewicz, and Marc Levoy Proc. 3DIM 2003 
The Iterative Closest Point (ICP) algorithm is a widely used method for aligning three-dimensional point sets. The quality of alignment obtained by this algorithm depends heavily on choosing good pairs of corresponding points in the two datasets. If too many points are chosen from featureless regions of the data, the algorithm converges slowly, finds the wrong pose, or even diverges, especially in the presence of noise or miscalibration in the input data. In this paper, we describe a method for detecting uncertainty in pose, and we propose a point selection strategy for ICP that minimizes this uncertainty by choosing samples that constrain potentially unstable transformations.  
A Hierarchical Method for Aligning Warped Meshes
Leslie Ikemoto, Natasha Gelfand, and Marc Levoy Proc. 3DIM 2003 
Current alignment algorithms for registering range data captured from a 3D scanner assume that the range data depicts identical geometry taken from different views. However, in the presence of scanner calibration errors, the data will be slightly warped. These warps often cause current alignment algorithms to converge slowly, find the wrong alignment, or even diverge. In this paper, we present a method for aligning warped range data represented by polygon meshes. Our strategy can be characterized as a coarse-to-fine hierarchical approach, where we assume that since the warp is global, we can compensate for it by treating each mesh as a collection of smaller piecewise rigid sections, which can translate and rotate with respect to each other. We split the meshes subject to several constraints, in order to ensure that the resulting sections converge reliably.  
Filling holes in complex surfaces using volumetric diffusion
James Davis, Steve Marschner, Matt Garr, and Marc Levoy First International Symposium on 3D Data Processing, Visualization, Transmission, June, 2002. Download our Volfill software. 
We address the problem of building watertight 3D models from surfaces that contain holes - for example, sets of range scans that observe most but not all of a surface. We specifically address situations in which the holes are too geometrically and topologically complex to fill using triangulation algorithms. Our solution begins by constructing a signed distance function, the zero set of which defines the surface. Initially, this function is defined only in the vicinity of observed surfaces. We then apply a diffusion process to extend this function through the volume until its zero set bridges whatever holes may be present. If additional information is available, such as known-empty regions of space inferred from the lines of sight to a 3D scanner, it can be incorporated into the diffusion process. Our algorithm is simple to implement, is guaranteed to produce manifold non-interpenetrating surfaces, and is efficient to run on large datasets because computation is limited to areas near holes.  
Efficient Variants of the ICP Algorithm
Szymon Rusinkiewicz and Marc Levoy Proc. 3DIM 2001 
The ICP (Iterative Closest Point) algorithm is widely used for geometric alignment of three-dimensional models when an initial estimate of the relative pose is known. Many variants of ICP have been proposed, affecting all phases of the algorithm from the selection and matching of points to the minimization strategy. We enumerate and classify many of these variants, and evaluate their effect on the speed with which the correct alignment is reached. In order to improve convergence for nearly-flat meshes with small features, such as inscribed surfaces, we introduce a new variant based on uniform sampling of the space of normals. We conclude by proposing a combination of ICP variants optimized for high speed. We demonstrate an implementation that is able to align two range images in a few tens of milliseconds, assuming a good initial guess. This capability has potential application to real-time 3D model acquisition and model-based tracking.  
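For readers unfamiliar with ICP, the basic loop (match each point to its closest counterpart, solve for a motion, apply it, repeat) can be sketched as below. This is a deliberately reduced version that estimates only a 2D translation using brute-force matching; real ICP also solves for rotation, and none of the variants evaluated in the paper (such as normal-space sampling) are implemented. The point sets and function names are invented for the example.

```python
# Translation-only ICP in 2D (a simplification; full ICP also estimates rotation).

def closest_point(point, targets):
    """Return the target point nearest to `point` (brute-force search)."""
    return min(
        targets,
        key=lambda t: (t[0] - point[0]) ** 2 + (t[1] - point[1]) ** 2,
    )

def icp_translation(source, target, iterations=20, tolerance=1e-12):
    """Estimate the translation that moves `source` onto `target`."""
    tx, ty = 0.0, 0.0
    for _ in range(iterations):
        moved = [(x + tx, y + ty) for x, y in source]
        pairs = [(p, closest_point(p, target)) for p in moved]
        # For a pure translation, the least-squares update is the mean offset.
        dx = sum(q[0] - p[0] for p, q in pairs) / len(pairs)
        dy = sum(q[1] - p[1] for p, q in pairs) / len(pairs)
        tx += dx
        ty += dy
        if abs(dx) < tolerance and abs(dy) < tolerance:
            break
    return tx, ty

target = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
source = [(x - 0.5, y + 0.25) for x, y in target]  # a displaced copy of target
translation = icp_translation(source, target)
print(translation)
```

As the abstract notes, convergence depends on a good initial guess: here the displacement is small relative to the spacing of the points, so every closest-point match is the correct one.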
Fitting Smooth Surfaces to Dense Polygon Meshes
Venkat Krishnamurthy and Marc Levoy Proc. SIGGRAPH 1996 Winner of a 2001 Technical Academy Award. 
Recent progress in acquiring shape from range data permits the acquisition of seamless million-polygon meshes from physical models. In this paper, we present an algorithm and system for converting dense irregular polygon meshes of arbitrary topology into tensor product B-spline surface patches with accompanying displacement maps. This choice of representation yields a coarse but efficient model suitable for animation and a fine but more expensive model suitable for rendering. The first step in our process consists of interactively painting patch boundaries over a rendering of the mesh. In many applications, interactive placement of patch boundaries is considered part of the creative process and is not amenable to automation. The next step is gridded resampling of each bounded section of the mesh. Our resampling algorithm lays a grid of springs across the polygon mesh, then iterates between relaxing this grid and subdividing it. This grid provides a parameterization for the mesh section, which is initially unparameterized. Finally, we fit a tensor product B-spline surface to the grid. We also output a displacement map for each mesh section, which represents the error between our fitted surface and the spring grid. These displacement maps are images; hence this representation facilitates the use of image processing operators for manipulating the geometric detail of an object. They are also compatible with modern photorealistic rendering systems. Our resampling and fitting steps are fast enough to surface a million polygon mesh in under 10 minutes - important for an interactive system.  
A Volumetric Method for Building Complex Models from Range Images
Brian Curless and Marc Levoy Proc. SIGGRAPH 1996 Download the VripPack library.

A number of techniques have been developed for reconstructing surfaces by integrating groups of aligned range images. A desirable set of properties for such algorithms includes: incremental updating, representation of directional uncertainty, the ability to fill gaps in the reconstruction, and robustness in the presence of outliers. Prior algorithms possess subsets of these properties. In this paper, we present a volumetric method for integrating range images that possesses all of these properties. Our volumetric representation consists of a cumulative weighted signed distance function. Working with one range image at a time, we first scan-convert it to a distance function, then combine this with the data already acquired using a simple additive scheme. To achieve space efficiency, we employ a run-length encoding of the volume. To achieve time efficiency, we resample the range image to align with the voxel grid and traverse the range and voxel scanlines synchronously. We generate the final manifold by extracting an isosurface from the volumetric grid. We show that under certain assumptions, this isosurface is optimal in the least squares sense. To fill gaps in the model, we tessellate over the boundaries between regions seen to be empty and regions never observed. Using this method, we are able to integrate a large number of range images (as many as 70) yielding seamless, high-detail models of up to 2.6 million triangles.  
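The "cumulative weighted signed distance function" and "simple additive scheme" in this abstract amount to a running weighted average per voxel. The sketch below illustrates that update on a single row of five voxels and then locates the zero crossing by linear interpolation. It is not the VripPack code: the one-dimensional setting, the sample distances and weights, and the function names are assumptions made for the example, and run-length encoding and hole filling are omitted.

```python
# Running weighted average of signed distances along one row of voxels.

def integrate(distances, weights, new_distances, new_weights):
    """Fold one scan's signed distances into the cumulative volume, in place."""
    for i, (d, w) in enumerate(zip(new_distances, new_weights)):
        if w == 0:
            continue  # this voxel was not observed by the scan
        total = weights[i] + w
        distances[i] = (weights[i] * distances[i] + w * d) / total
        weights[i] = total

def zero_crossing(distances, weights):
    """Position of the first zero crossing between observed voxels, or None."""
    for i in range(len(distances) - 1):
        if weights[i] > 0 and weights[i + 1] > 0:
            a, b = distances[i], distances[i + 1]
            if a == 0.0:
                return float(i)
            if b == 0.0:
                return float(i + 1)
            if a * b < 0:
                return i + a / (a - b)  # linear interpolation
    return None

# Voxels at x = 0..4. Scan 1 places the surface at x = 2, scan 2 at x = 3.
distances = [0.0] * 5
weights = [0.0] * 5
integrate(distances, weights, [2.0, 1.0, 0.0, -1.0, -2.0], [1.0] * 5)
integrate(distances, weights, [3.0, 2.0, 1.0, 0.0, -1.0], [1.0] * 5)

surface = zero_crossing(distances, weights)
print(distances)
print(surface)
```

With equal weights the fused surface lands midway between the two scans; giving one scan a larger weight pulls the zero crossing toward that scan's estimate.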
Zippered Polygon Meshes from Range Images
Greg Turk and Marc Levoy Proc. SIGGRAPH 1994 Download the ZipPack library.

Range imaging offers an inexpensive and accurate means for digitizing the shape of three-dimensional objects. Because most objects self-occlude, no single range image suffices to describe the entire object. We present a method for combining a collection of range images into a single polygonal mesh that completely describes an object to the extent that it is visible from the outside. The steps in our method are: 1) align the meshes with each other using a modified iterated closest-point algorithm, 2) zipper together adjacent meshes to form a continuous surface that correctly captures the topology of the object, and 3) compute local weighted averages of surface positions on all meshes to form a consensus surface geometry. Our system differs from previous approaches in that it is incremental; scans are acquired and combined one at a time. This approach allows us to acquire and combine large numbers of scans with minimal storage overhead. Our largest models contain up to 360,000 triangles. All the steps needed to digitize an object that requires up to 10 range scans can be performed using our system with five minutes of user interaction and a few hours of compute time. We show two models created using our method with range data from a commercial rangefinder that employs laser stripe technology. 
3D scanning  
Real-Time 3D Model Acquisition
Szymon Rusinkiewicz, Olaf Hall-Holt, and Marc Levoy ACM Transactions on Graphics 21(3), Proc. SIGGRAPH 2002 
The digitization of the 3D shape of real objects is a rapidly expanding field, with applications in entertainment, design, and archaeology. We propose a new 3D model acquisition system that permits the user to rotate an object by hand and see a continuously-updated model as the object is scanned. This tight feedback loop allows the user to find and fill holes in the model in real time, and determine when the object has been completely covered. Our system is based on a 60 Hz structured-light rangefinder, a real-time variant of ICP (iterative closest points) for alignment, and point-based merging and rendering algorithms. We demonstrate the ability of our prototype to scan objects faster and with greater ease than conventional model acquisition pipelines.  
An Assessment of Laser Range Measurement of Marble Surfaces
Guy Godin, J.-Angelo Beraldin, Marc Rioux, Marc Levoy, Luc Cournoyer, and Francois Blais Fifth Conference on optical 3D measurement techniques, 2001. 
An important application of laser range sensing is found in the 3D scanning and modelling of heritage collections, and of sculptures in particular. Since a significant proportion of the statues in the world's museums is composed of marble, the optical properties of this material under laser range sensing need to be understood. Marble's translucency and heterogeneous structure produce significant bias and increased noise in the geometric measurements. Experiments on a sample of Carrara Statuario marble highlight the relationship between the laser spot diameter and the estimated noise levels in the geometric measurements. A bias in the depth measurement is also observed. These phenomena are believed to result from scattering on the surface of small crystals at or near the surface.  
Better Optical Triangulation Through Spacetime Analysis
Brian Curless and Marc Levoy Proc. ICCV 1995 
Optical triangulation range scanners are finding wide usage in industrial inspection, metrology, medicine, and computer graphics. The standard methods for extracting range data from structured light reflecting off of an object are accurate only for planar surfaces of uniform reflectance illuminated by an incoherent source. Using these methods, curved surfaces, discontinuous surfaces, and surfaces of varying reflectance cause systematic distortions of the range data. Coherent light sources such as lasers introduce speckle artifacts that further degrade the data. We present a new ranging method based on analyzing the time evolution of the structured light reflections. Using our spacetime analysis, we can correct for each of these artifacts, thereby attaining significantly higher accuracy using existing technology. We present results that demonstrate the validity of our method using a commercial laser stripe triangulation scanner. 
Cultural heritage  

Fragments of the City: Stanford's Digital Forma Urbis Romae Project
David Koller, Jennifer Trimble, Tina Najbjerg, Natasha Gelfand, Marc Levoy Proc. Third Williams Symposium on Classical Architecture, Journal of Roman Archaeology supplement, 2006. Here's a web site about the project.

In this article, we summarize the Stanford Digital Forma Urbis Project work since it began in 1999 and discuss its implications for representing and imaging Rome. First, we digitized the shape and surface of every known fragment of the Severan Marble Plan using laser range scanners and digital color cameras; the raw data collected consists of 8 billion polygons and 6 thousand color images, occupying 40 gigabytes. These range and color data have been assembled into a set of 3D computer models and high-resolution photographs - one for each of the 1,186 marble fragments. Second, this data has served in the development of fragment matching algorithms; to date, these have resulted in over a dozen highly probable, new matches. Third, we have gathered the Project's 3D models and color photographs into a relational database and supported them with archaeological documentation and an up-to-date scholarly apparatus for each fragment. This database is intended to be a public, web-based, research and study tool for scholars, students and interested members of the general public alike; as of this writing, 400 of the surviving fragments are publicly available, and the full database is scheduled for release in 2005. Fourth, these digital and archaeological data, and their availability in a hypertext format, have the potential to broaden the scope and type of research done on this ancient map by facilitating a range of typological, representational and urbanistic analyses of the map, some of which are proposed here. In these several ways, we hope that this Project will contribute to new ways of imaging Rome.  
Computer-aided Reconstruction and New Matches in the Forma Urbis Romae
David Koller and Marc Levoy Proc. Formae Urbis Romae - Nuove Scoperte, Bullettino Della Commissione Archeologica Comunale di Roma, 2006. See the extra links listed in the next paper up. 
In this paper, we describe our efforts to apply computer-aided reconstruction algorithms to find new matches and positionings among the fragments of the Forma Urbis Romae. First, we review the attributes of the fragments that may be useful clues for automated reconstruction. Then, we describe several different specific methods that we have developed which make use of geometric computation capabilities and digital fragment representations to suggest new matches. These methods are illustrated with a number of new proposed fragment joins and placements that have been generated from our computer-aided reconstruction process.  
Unwrapping and Visualizing Cuneiform Tablets
Sean Anderson and Marc Levoy IEEE Computer Graphics and Applications, Vol. 22, No. 6, November/December, 2002, pp. 82-88. 
Thousands of historically revealing cuneiform clay tablets, which were inscribed in Mesopotamia millennia ago, still exist today. Visualizing cuneiform writing is important when deciphering what is written on the tablets. It is also important when reproducing the tablets in papers and books. Unfortunately, scholars have found photographs to be an inadequate visualization tool, for two reasons. First, the text wraps around the sides of some tablets, so a single viewpoint is insufficient. Second, a raking light will illuminate some textual features, but will leave others shadowed or invisible because they are either obscured by features on the tablet or are nearly aligned with the lighting direction. We present solutions to these problems by first creating a high-resolution 3D computer model from laser range data, then unwrapping and flattening the inscriptions on the model to a plane, allowing us to represent them as a scalar displacement map, and finally, rendering this map non-photorealistically using accessibility and curvature coloring. The output of this semi-automatic process enables all of a tablet's text to be perceived in a single concise image. Our technique can also be applied to other types of inscribed surfaces, including bas-reliefs.  
The Digital Michelangelo Project: 3D scanning of large statues
Marc Levoy, Kari Pulli, Brian Curless, Szymon Rusinkiewicz, David Koller, Lucas Pereira, Matt Ginzton, Sean Anderson, James Davis, Jeremy Ginsberg, Jonathan Shade, and Duane Fulk
Proc. SIGGRAPH 2000 Other papers on this project were published in 3DIM '99, Eurographics '99, EVA '99, and as a chapter in the book Exploring David, Giunti Press, March 2004. See also this web page about the book.
Here's a web site about the project.

We describe a hardware and software system for digitizing the shape and color of large fragile objects under non-laboratory conditions. Our system employs laser triangulation rangefinders, laser time-of-flight rangefinders, digital still cameras, and a suite of software for acquiring, aligning, merging, and viewing scanned data. As a demonstration of this system, we digitized 10 statues by Michelangelo, including the well-known figure of David, two building interiors, and all 1,163 extant fragments of the Forma Urbis Romae, a giant marble map of ancient Rome. Our largest single dataset is of the David: 2 billion polygons and 7,000 color images. In this paper, we discuss the challenges we faced in building this system, the solutions we employed, and the lessons we learned. We focus in particular on the unusual design of our laser triangulation scanner and on the algorithms and software we developed for handling very large scanned models.
Digitizing the Forma Urbis Romae
Marc Levoy Siggraph Digital Campfire on Computers and Archeology, April, 2000.
Here's a web site about the project.

Recent improvements in laser rangefinder technology, together with algorithms for combining multiple range and color images, allow us to reliably and accurately digitize the external shape and surface characteristics of many physical objects. Examples include machine parts, design models, toys, and artistic and cultural artifacts. As an application of this technology, I and a team of 30 faculty, staff, and students from Stanford University and the University of Washington spent the 1998-99 academic year in Italy scanning the sculptures and architecture of Michelangelo. During our year abroad, we also became involved in several side projects. One of these was the digitization of the Forma Urbis Romae...
Texture synthesis  
Order-Independent Texture Synthesis
Li-Yi Wei and Marc Levoy Technical Report TR-2002-01, Computer Science Department, Stanford University, April, 2002
Search-based texture synthesis algorithms are sensitive to the order in which texture samples are generated; different synthesis orders yield different textures. Unfortunately, most polygon rasterizers and ray tracers do not guarantee the order with which surfaces are sampled. To circumvent this problem, textures are synthesized beforehand at some maximum resolution and rendered using texture mapping. We describe a search-based texture synthesis algorithm in which samples can be generated in arbitrary order, yet the resulting texture remains identical. The key to our algorithm is a pyramidal representation in which each texture sample depends only on a fixed number of neighboring samples at each level of the pyramid. The bottom (coarsest) level of the pyramid consists of a noise image, which is small and predetermined. When a sample is requested by the renderer, all samples on which it depends are generated at once. Using this approach, samples can be generated in any order. To make the algorithm efficient, we propose storing texture samples and their dependents in a pyramidal cache. Although the first few samples are expensive to generate, there is substantial reuse, so subsequent samples cost less. Fortunately, most rendering algorithms exhibit good coherence, so cache reuse is high.
Texture Synthesis over Arbitrary Manifold Surfaces
Li-Yi Wei and Marc Levoy Proc. SIGGRAPH 2001
Algorithms exist for synthesizing a wide variety of textures over rectangular domains. However, it remains difficult to synthesize general textures over arbitrary manifold surfaces. In this paper, we present a solution to this problem for surfaces defined by dense polygon meshes. Our solution extends Wei and Levoy's texture synthesis method by generalizing their definition of search neighborhoods. For each mesh vertex, we establish a local parameterization surrounding the vertex, use this parameterization to create a small rectangular neighborhood with the vertex at its center, and search a sample texture for similar neighborhoods. Our algorithm requires as input only a sample texture and a target model. Notably, it does not require specification of a global tangent vector field; it computes one as it goes, either randomly or via a relaxation process. Despite this, the synthesized texture contains no discontinuities, exhibits low distortion, and is perceived to be similar to the sample texture. We demonstrate that our solution is robust and is applicable to a wide range of textures.
Fast Texture Synthesis using Tree-structured Vector Quantization
Li-Yi Wei and Marc Levoy Proc. SIGGRAPH 2000
Texture synthesis is important for many applications in computer graphics, vision, and image processing. However, it remains difficult to design an algorithm that is both efficient and capable of generating high quality results. In this paper, we present an efficient algorithm for realistic texture synthesis. The algorithm is easy to use and requires only a sample texture as input. It generates textures with perceived quality equal to or better than those produced by previous techniques, but runs two orders of magnitude faster. This permits us to apply texture synthesis to problems where it has traditionally been considered impractical. In particular, we have applied it to constrained synthesis for image editing and temporal texture generation. Our algorithm is derived from Markov Random Field texture models and generates textures through a deterministic searching process. We accelerate this synthesis process using tree-structured vector quantization.
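As a rough illustration of a deterministic neighborhood search of this kind, here is a minimal single-resolution sketch. It is not the paper's implementation: it uses an exhaustive search instead of tree-structured vector quantization, it has no multiresolution pyramid, and the function name synthesize, the L-shaped causal neighborhood, and the toroidal wrap-around are assumptions made for this example.

```python
import random

def synthesize(sample, out_h, out_w, half=1, seed=0):
    """Toy texture synthesis by exhaustive causal-neighborhood search.

    sample is a 2D list of grey values; the output starts as noise and is
    overwritten pixel by pixel in scanline order.
    """
    rng = random.Random(seed)
    sh, sw = len(sample), len(sample[0])
    flat = [v for row in sample for v in row]
    out = [[rng.choice(flat) for _ in range(out_w)] for _ in range(out_h)]

    # L-shaped causal neighborhood: the rows above plus pixels to the left.
    offsets = [(dy, dx)
               for dy in range(-half, 1)
               for dx in range(-half, half + 1)
               if dy < 0 or dx < 0]

    for y in range(out_h):
        for x in range(out_w):
            # Output neighborhood, wrapping around the image edges.
            target = [out[(y + dy) % out_h][(x + dx) % out_w]
                      for dy, dx in offsets]
            best, best_dist = None, float("inf")
            # Compare against every sample pixel whose neighborhood fits.
            for sy in range(half, sh):
                for sx in range(half, sw - half):
                    dist = sum((sample[sy + dy][sx + dx] - t) ** 2
                               for (dy, dx), t in zip(offsets, target))
                    if dist < best_dist:
                        best, best_dist = sample[sy][sx], dist
            out[y][x] = best
    return out

# Hypothetical 4x4 checkerboard used as the sample texture.
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
result = synthesize(checker, 6, 6)
```

The inner double loop over the sample is the step that an acceleration structure such as a vector quantization tree would replace.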
Image synthesis  
A Practical Model for Subsurface Light Transport
Henrik Wann Jensen, Steve Marschner, Marc Levoy, and Pat Hanrahan Proc. SIGGRAPH 2001 Winner of a 2004 Technical Academy Award. Although I helped initiate this research, I contributed less to the final paper than Henrik, Steve, and Pat, so I asked the academy not to cite me in the award. 
This paper introduces a simple model for subsurface light transport in translucent materials. The model enables efficient simulation of effects that BRDF models cannot capture, such as color bleeding within materials and diffusion of light across shadow boundaries. The technique is efficient even for anisotropic, highly scattering media that are expensive to simulate using existing methods. The model combines an exact solution for single scattering with a dipole point source diffusion approximation for multiple scattering. We also have designed a new, rapid image-based measurement technique for determining the optical properties of translucent materials. We validate the model by comparing predicted and measured values and show how the technique can be used to recover the optical properties of a variety of materials, including milk, marble, and skin. Finally, we describe sampling techniques that allow the model to be used within a conventional ray tracer.
Synthetic Texturing Using Digital Filters
Eliot Feibush, Marc Levoy, Robert Cook Proc. SIGGRAPH 1980 
Aliasing artifacts are eliminated from computer generated images of textured polygons by equivalently filtering both the texture and the edges of the polygons. Different filters can be easily compared because the weighting functions that define the shape of the filters are precomputed and stored in lookup tables. A polygon subdivision algorithm removes the hidden surfaces so that the polygons are rendered sequentially to minimize accessing the texture definition files. An implementation of the texture rendering procedure is described. 
Volume rendering and medical imaging 

Application of Zernike polynomials towards accelerated adaptive focusing of transcranial high intensity focused ultrasound
Elena A. Kaye, Yoni Hertzberg, Michael Marx, Beat Werner, Gil Navon, Marc Levoy, and Kim Butts Pauly, Journal of Medical Physics, Vol. 39, No. 6254 (2012). 
Purpose: To study the phase aberrations produced by human skulls during transcranial magnetic resonance imaging guided focused ultrasound surgery (MRgFUS), to demonstrate the potential of Zernike polynomials (ZPs) to accelerate the adaptive focusing process, and to investigate the benefits of using phase corrections obtained in previous studies to provide the initial guess for correction of a new data set.
Conclusions: The application of ZPs to phase aberration correction was shown to be beneficial for adaptive focusing of transcranial ultrasound. The skull-based phase aberrations were found to be well approximated by the number of ZP modes representing only a fraction of the number of elements in the hemispherical transducer. Implementing the initial phase aberration estimate together with Zernike-based algorithm can be used to improve the robustness and can potentially greatly increase the viability of MR-ARFI-based focusing for a clinical transcranial MRgFUS therapy.
Feature-Based Volume Metamorphosis
Apostolos Lerios, Chase D. Garfinkle, and Marc Levoy Proc. SIGGRAPH 1995 
Image metamorphosis, or image morphing, is a popular technique for creating a smooth transition between two images. For synthetic images, transforming and rendering the underlying three-dimensional (3D) models has a number of advantages over morphing between two pre-rendered images. In this paper we consider 3D metamorphosis applied to volume-based representations of objects. We discuss the issues which arise in volume morphing and present a method for creating morphs. Our morphing method has two components: first a warping of the two input volumes, then a blending of the resulting warped volumes. The warping component, an extension of Beier and Neely's image warping technique to 3D, is feature-based and allows fine user control, thus ensuring realistic looking intermediate objects. In addition, our warping method is amenable to an efficient approximation which gives a 50 times speedup and is computable to arbitrary accuracy. Also, our technique corrects the ghosting problem present in Beier and Neely's technique. The second component of the morphing process, blending, is also under user control; this guarantees smooth transitions in the renderings.
Fast Volume Rendering Using a Shear-Warp Factorization of the Viewing Transformation
Philippe Lacroute and Marc Levoy Proc. SIGGRAPH 1994
Download the VolPack library.

Several existing volume rendering algorithms operate by factoring the viewing transformation into a 3D shear parallel to the data slices, a projection to form an intermediate but distorted image, and a 2D warp to form an undistorted final image. We extend this class of algorithms in three ways. First, we describe a new object-order rendering algorithm based on the factorization that is significantly faster than published algorithms without loss of image quality. The algorithm achieves its speed by exploiting coherence in the volume data and the intermediate image. The shear-warp factorization permits us to traverse both the volume and the intermediate image data structures in synchrony during rendering, using both types of coherence to reduce work. Our implementation running on an SGI Indigo workstation renders a 256^3 voxel medical data set in one second. Our second extension is a derivation of the factorization for perspective viewing transformations, and we show how our rendering algorithm can support this extension. Third, we introduce a data structure for encoding spatial coherence in unclassified volumes (i.e. scalar fields with no precomputed opacity). When combined with our shear-warp rendering algorithm this data structure allows us to classify and render a 256^3 voxel volume in three seconds. Our algorithms employ run-length encoding, min-max pyramids, and multidimensional summed area tables. The method extends readily to support mixed volumes and geometry.
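To give a feel for how run-length encoding can expose coherence in volume data, here is a small sketch that encodes one voxel scanline as runs of transparent and non-transparent voxels, so that a renderer could skip the transparent runs. The function names, the (flag, length) run format, and the opacity threshold are assumptions for this example and are not taken from the paper or from VolPack.

```python
def run_length_encode(opacities, threshold=0.0):
    """Encode a scanline of voxel opacities as (is_non_transparent, length) runs."""
    runs = []
    for opacity in opacities:
        flag = opacity > threshold
        if runs and runs[-1][0] == flag:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([flag, 1])    # start a new run
    return [(flag, length) for flag, length in runs]

def voxels_to_process(runs):
    """Count the voxels a renderer would still have to visit after skipping."""
    return sum(length for flag, length in runs if flag)

# Hypothetical scanline: mostly empty space with two small features.
scanline = [0.0, 0.0, 0.0, 0.4, 0.9, 0.0, 0.0, 0.2]
runs = run_length_encode(scanline)
work = voxels_to_process(runs)
```

In this toy scanline only 3 of the 8 voxels belong to non-transparent runs, which is the kind of saving the encoding is meant to expose.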
Frequency Domain Volume Rendering
Takashi Totsuka and Marc Levoy Proc. SIGGRAPH 1993 
The Fourier projection-slice theorem allows projections of volume data to be generated in O(n^2 log n) time for a volume of size n^3. The method operates by extracting and inverse Fourier transforming 2D slices from a 3D frequency domain representation of the volume. Unfortunately, these projections do not exhibit the occlusion that is characteristic of conventional volume renderings. We present a new frequency domain volume rendering algorithm that replaces much of the missing depth and shape cues by performing shading calculations in the frequency domain during slice extraction. In particular, we demonstrate frequency domain methods for computing linear or nonlinear depth cueing and directional diffuse reflection. The resulting images can be generated an order of magnitude faster than volume renderings and may be more useful for many applications.
Volume Rendering using the Fourier Projection-Slice Theorem
Marc Levoy Proc. Graphics Interface 1992 
The Fourier projection-slice theorem states that the inverse transform of a slice extracted from the frequency domain representation of a volume yields a projection of the volume in a direction perpendicular to the slice. This theorem allows the generation of attenuation-only renderings of volume data in O(n^2 log n) time for a volume of size n^3. In this paper, we show how more realistic renderings can be generated using a class of shading models whose terms are Fourier projections. Models are derived for rendering depth cueing by linear attenuation of variable energy emitters and for rendering directional shading by Lambertian reflection with hemispherical illumination. While the resulting images do not exhibit the occlusion that is characteristic of conventional volume rendering, they provide sufficient depth and shape cues to give a strong illusion that occlusion exists.
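The theorem can be checked numerically in one dimension lower. The sketch below, which is only an illustration and not code from the paper, sums a small 2D array along y to form a 1D projection and compares its discrete Fourier transform with the ky = 0 row of the array's 2D transform. The naive O(n^2) DFT, the helper names, and the sample values are assumptions chosen to keep the example dependency-free.

```python
import cmath

def dft(signal):
    """Naive 1D discrete Fourier transform, adequate for a tiny demo."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n)
                for x in range(n))
            for k in range(n)]

def dft2(image):
    """Naive 2D DFT: transform every row, then every column."""
    height, width = len(image), len(image[0])
    rows = [dft(row) for row in image]                     # rows[y][kx]
    cols = [dft([rows[y][kx] for y in range(height)])      # cols[kx][ky]
            for kx in range(width)]
    return [[cols[kx][ky] for kx in range(width)] for ky in range(height)]

# A small 2D "volume" of densities (hypothetical values).
image = [
    [0, 1, 2, 0],
    [3, 5, 1, 0],
    [0, 2, 4, 1],
    [1, 0, 0, 2],
]

# Projection along y: sum each column to get a 1D image.
projection = [sum(image[y][x] for y in range(4)) for x in range(4)]

# Slice of the 2D spectrum through the origin, perpendicular to the y axis.
central_slice = dft2(image)[0]

# The spectrum of the projection should match the central slice.
spectrum_of_projection = dft(projection)
max_error = max(abs(a - b)
                for a, b in zip(central_slice, spectrum_of_projection))
```

With a fast Fourier transform in place of the naive one, extracting and inverse transforming a slice is what gives the O(n^2 log n) cost per projection quoted above.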
A Hybrid Ray Tracer for Rendering Polygon and Volume Data
Marc Levoy IEEE Computer Graphics and Applications, Vol. 10, No. 2, March, 1990, pp. 33-40.
Volume rendering is a technique for visualizing sampled functions of three spatial dimensions by computing 2D projections of a colored semitransparent volume. This paper addresses the problem of extending volume rendering to handle polygonally defined objects. The solution proposed is a hybrid ray tracing algorithm. Rays are simultaneously cast through a set of polygons and a volume data array, samples of each are drawn at equally spaced intervals along the rays, and the resulting colors and opacities are composited together in depth-sorted order. To avoid aliasing of polygonal edges at modest computational expense, a form of selective supersampling is employed. To avoid errors in visibility at polygon-volume intersections, volume samples lying immediately in front of and behind polygons are given special treatment. The cost, image quality, and versatility of the algorithm are evaluated using data from 3D medical imaging applications.
Volume Rendering by Adaptive Refinement
Marc Levoy The Visual Computer, Vol. 6, No. 1, February, 1990, pp. 2-7.
Volume rendering is a technique for visualizing sampled scalar functions of three spatial dimensions by computing 2D projections of a colored semitransparent gel. This paper presents a volume rendering algorithm in which image quality is adaptively refined over time. An initial image is generated by casting a small number of rays into the data, less than one ray per pixel, and interpolating between the resulting colors. Subsequent images are generated by alternately casting more rays and interpolating. The usefulness of these rays is maximized by distributing them according to measures of local image complexity. Examples from two applications are given: molecular graphics and medical imaging.  
Efficient Ray Tracing of Volume Data
Marc Levoy ACM Transactions on Graphics, Vol. 9, No. 3, July, 1990, pp. 245-261.
Volume rendering is a family of techniques for visualizing sampled scalar or vector fields of three spatial dimensions without fitting geometric primitives to the data. A subset of these techniques generate images by computing 2D projections of a colored semitransparent volume, where the color and opacity at each point is derived from the data using local operators. Since all voxels participate in the generation of each image, rendering time grows linearly with the size of the dataset. This paper presents a front-to-back image-order volume rendering algorithm and discusses two techniques for improving its performance. The first technique employs a pyramid of binary volumes to encode spatial coherence present in the data, and the second technique uses an opacity threshold to adaptively terminate ray tracing. Although the actual time saved depends on the data, speedups of an order of magnitude have been observed for datasets of useful size and complexity. Examples from two applications are given: medical imaging and molecular graphics.
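The second technique, terminating a ray once accumulated opacity passes a threshold, can be sketched in a few lines. This is a minimal illustration under assumed conventions (scalar colors, samples already ordered front to back, a threshold of 0.95) and is not the paper's code.

```python
def composite_front_to_back(samples, opacity_threshold=0.95):
    """Composite (color, opacity) samples front to back along one ray.

    Stops as soon as the accumulated opacity reaches the threshold, since
    samples further along the ray can contribute very little.
    Returns (color, opacity, number_of_samples_used).
    """
    color = 0.0
    alpha = 0.0
    used = 0
    for sample_color, sample_alpha in samples:
        # Each sample is attenuated by everything already in front of it.
        color += (1.0 - alpha) * sample_alpha * sample_color
        alpha += (1.0 - alpha) * sample_alpha
        used += 1
        if alpha >= opacity_threshold:
            break
    return color, alpha, used

# Ten identical half-opaque white samples along a hypothetical ray.
ray = [(1.0, 0.5)] * 10
color, alpha, used = composite_front_to_back(ray)
```

For this ray the accumulated opacity passes 0.95 after five samples, so the remaining five are never visited.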
Volume Rendering in Radiation Treatment Planning
Marc Levoy, Henry Fuchs, Stephen M. Pizer, Julian Rosenman, Edward L. Chaney, George W. Sherouse, Victoria Interrante, and Jeffrey Kiel First Conference on Visualization in Biomedical Computing, IEEE, May, 1990 
Successful treatment planning in radiation therapy depends in part on understanding the spatial relationship between patient anatomy and the distribution of radiation dose. We present several visualizations based on volume rendering that offer potential solutions to this problem. The visualizations employ region boundary surfaces to display anatomy, polygonal meshes to display treatment beams, and isovalue contour surfaces to display dose. To improve perception of spatial relationships, we use metallic shading, surface and solid texturing, synthetic fog, shadows, and other artistic devices. Also outlined is a method based on 3D mip maps for efficiently generating perspective volume renderings and beam's-eye views. To evaluate the efficacy of these visualizations, we are building a radiotherapy planning system based on a Cray Y-MP and the Pixel-Planes 5 raster display engine. The system will allow interactive manipulation of beam geometry, dosimetry, shading, and viewing parameters, and will generate volume renderings of anatomy and dose in real time.
Display of Surfaces from Volume Data
Marc Levoy PhD Dissertation, Tech Report TR89-022, University of North Carolina at Chapel Hill, May, 1989.
Volume rendering is a technique for visualizing sampled scalar fields of three spatial dimensions without fitting geometric primitives to the data. A color and a partial transparency are computed for each data sample, and images are formed by blending together contributions made by samples projecting to the same pixel on the picture plane. Quantization and aliasing artifacts are reduced by avoiding thresholding during data classification and by carefully resampling the data during projection. This thesis presents an image-order volume rendering algorithm, demonstrates that it generates images of comparable quality to existing object-order algorithms, and offers several improvements. In particular, methods are presented for displaying isovalue contour surfaces and region boundary surfaces, for rendering mixtures of analytically defined geometry and sampled fields, and for adding shadows and textures. Three techniques for reducing rendering cost are also presented: hierarchical spatial enumeration, adaptive termination of ray tracing, and adaptive image sampling. Case studies from two applications are given: medical imaging and molecular graphics.
Display of Surfaces from Volume Data
Marc Levoy IEEE Computer Graphics and Applications, Vol. 8, No. 3, May, 1988 About the error in this paper. 
The application of volume rendering techniques to the display of surfaces from sampled scalar functions of three spatial dimensions is explored. Fitting of geometric primitives to the sampled data is not required. Images are formed by directly shading each sample and projecting it onto the picture plane. Surface shading calculations are performed at every voxel with local gradient vectors serving as surface normals. In a separate step, surface classification operators are applied to obtain a partial opacity for every voxel. Operators that detect isovalue contour surfaces and region boundary surfaces are presented. Independence of shading and classification calculations insures an undistorted visualization of 3D shape. Nonbinary classification operators insure that small or poorly defined features are not lost. The resulting colors and opacities are composited from back to front along viewing rays to form an image. The technique is simple and fast, yet displays surfaces exhibiting smooth silhouettes and few other aliasing artifacts. The use of selective blurring and supersampling to further improve image quality is also described. Examples from two applications are given: molecular graphics and medical imaging. 
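As an illustration of using local gradient vectors as surface normals, the sketch below estimates the gradient at an interior voxel and normalizes it. The central-difference estimator, the volume[z][y][x] indexing, and the function name are assumptions for this example; the abstract does not specify them.

```python
import math

def gradient_normal(volume, x, y, z):
    """Estimate a unit surface normal at an interior voxel.

    volume is indexed as volume[z][y][x]. The gradient is estimated with
    central differences, which is an assumption of this sketch.
    """
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
    length = math.sqrt(gx * gx + gy * gy + gz * gz)
    if length == 0.0:
        # Homogeneous region: no meaningful surface orientation.
        return (0.0, 0.0, 0.0)
    return (gx / length, gy / length, gz / length)

# Hypothetical 3x3x3 volume whose density increases along x only.
volume = [[[float(x) for x in range(3)] for _ in range(3)] for _ in range(3)]
normal = gradient_normal(volume, 1, 1, 1)
```

Here the density varies only along x, so the estimated normal points along the x axis.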
Point-based rendering
Streaming QSplat: A Viewer for Networked Visualization of Large, Dense Models
Szymon Rusinkiewicz and Marc Levoy Proc. 2001 Symposium on Interactive 3D Graphics 
Steady growth in the speeds of network links and graphics accelerator cards has brought increasing interest in streaming transmission of three-dimensional data sets. We demonstrate how streaming visualization can be made practical for data sets containing hundreds of millions of samples. Our system is based on QSplat, a multiresolution rendering system for dense polygon meshes that employs a bounding sphere hierarchy data structure and splat rendering. We show how to incorporate view-dependent progressive transmission into QSplat, by having the client request visible portions of the model in order from coarse to fine resolution. In addition, we investigate interaction techniques for improving the effectiveness of streaming data visualization. In particular, we explore color-coding streamed data by resolution, examine the order in which data should be transmitted in order to minimize visual distraction, and propose tools for giving the user fine control over download order.
QSplat: A Multiresolution Point Rendering System for Large Meshes
Szymon Rusinkiewicz and Marc Levoy Proc. SIGGRAPH 2000
Download our QSplat software.

Advances in 3D scanning technologies have enabled the practical creation of meshes with hundreds of millions of polygons. Traditional algorithms for display, simplification, and progressive transmission of meshes are impractical for data sets of this size. We describe a system for representing and progressively displaying these meshes that combines a multiresolution hierarchy based on bounding spheres with a rendering system based on points. A single data structure is used for view frustum culling, backface culling, level-of-detail selection, and rendering. The representation is compact and can be computed quickly, making it suitable for large data sets. Our implementation, written for use in a large-scale 3D digitization project, launches quickly, maintains a user-settable interactive frame rate regardless of object complexity or camera position, yields reasonable image quality during motion, and refines progressively when idle to a high final image quality. We have demonstrated the system on scanned models containing hundreds of millions of samples.
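The level-of-detail part of such a traversal can be sketched as a recursion over a bounding sphere tree: descend while a sphere still looks large from the viewpoint, and emit it as a splat once it is small enough or is a leaf. This is a simplified illustration, not the QSplat software. The SphereNode class, the radius-over-distance size estimate, and the threshold are assumptions, and view frustum and backface culling are omitted.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SphereNode:
    center: Tuple[float, float, float]
    radius: float
    children: List["SphereNode"] = field(default_factory=list)

def collect_splats(node, eye, max_projected_size, out=None):
    """Return the nodes that would be drawn as splats for this viewpoint."""
    if out is None:
        out = []
    distance = math.dist(node.center, eye)
    # Crude stand-in for projected screen size (an assumption of this sketch).
    projected_size = node.radius / max(distance, 1e-9)
    if not node.children or projected_size <= max_projected_size:
        out.append(node)              # small enough, or a leaf: draw a splat
    else:
        for child in node.children:   # still too coarse: refine
            collect_splats(child, eye, max_projected_size, out)
    return out

# Hypothetical two-level hierarchy.
root = SphereNode((0.0, 0.0, 0.0), 2.0, [
    SphereNode((-1.0, 0.0, 0.0), 1.0),
    SphereNode((1.0, 0.0, 0.0), 1.0),
])
far_view = collect_splats(root, (0.0, 0.0, 10.0), 0.5)
near_view = collect_splats(root, (0.0, 0.0, 2.0), 0.5)
```

From far away the root alone is drawn; from close up the traversal descends and draws both children. Raising or lowering the threshold is one way a viewer could trade image quality against frame rate.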
The Use of Points as a Display Primitive
Marc Levoy and Turner Whitted UNC-Chapel Hill Computer Science Technical Report #85-022, January, 1985 Here is a book chapter I wrote outlining the early history of point-based graphics, and discussing the pros and cons of using points as a display primitive. The chapter appears in revised form in Markus Gross's Point-Based Graphics, Morgan Kaufmann, 2007.
As the visual complexity of computer generated scenes continues to increase, the use of classical modeling primitives as display primitives becomes less appealing. Customization of display algorithms, the conflict between object order and image order rendering and the reduced usefulness of object coherence in the presence of extreme complexity are all contributing factors. This paper proposes to decouple the modeling geometry from the rendering process by introducing the notion of points as a universal meta-primitive. We first demonstrate that a discrete array of points arbitrarily displaced in space using a tabular array of perturbations can be rendered as a continuous three-dimensional surface. This solves the long-standing problem of producing correct silhouette edges for bump mapped textures. We then demonstrate that a wide class of geometrically defined objects, including both flat and curved surfaces, can be converted into points. The conversion can proceed in object order, facilitating the display of procedurally defined objects. The rendering algorithm is simple and requires no coherence in order to be efficient. It will also be shown that the points may be rendered in random order, leading to several interesting and unexpected applications of the technique.
Systems and architectures  

Protected Interactive 3D Graphics Via Remote Rendering
David Koller, Michael Turitzin, Marc Levoy, Marco Tarini, Giuseppe Croccia, Paolo Cignoni, Roberto Scopigno ACM Transactions on Graphics 23(3), Proc. SIGGRAPH 2004 A shortened version of this paper was the cover article in the June 2005 issue of Communications of the ACM (CACM). Download our ScanView software. 
Valuable 3D graphical models, such as high-resolution digital scans of cultural heritage objects, may require protection to prevent piracy or misuse, while still allowing for interactive display and manipulation by a widespread audience. We have investigated techniques for protecting 3D graphics content, and we have developed a remote rendering system suitable for sharing archives of 3D models while protecting the 3D geometry from unauthorized extraction. The system consists of a 3D viewer client that includes low-resolution versions of the 3D models, and a rendering server that renders and returns images of high-resolution models according to client requests. The server implements a number of defenses to guard against 3D reconstruction attacks, such as monitoring and limiting request streams, and slightly perturbing and distorting the rendered images. We consider several possible types of reconstruction attacks on such a rendering server, and we examine how these attacks can be defended against without excessively compromising the interactive experience for non-malicious users.
Polygon-Assisted JPEG and MPEG Compression of Synthetic Images
Marc Levoy Proc. SIGGRAPH 1995 
Recent advances in real-time image compression and decompression hardware make it possible for a high-performance graphics engine to operate as a rendering server in a networked environment. If the client is a low-end workstation or set-top box, then the rendering task can be split across the two devices. In this paper, we explore one strategy for doing this. For each frame, the server generates a high-quality rendering and a low-quality rendering, subtracts the two, and sends the difference in compressed form. The client generates a matching low-quality rendering, adds the decompressed difference image, and displays the composite. Within this paradigm, there is wide latitude to choose what constitutes a high-quality versus low-quality rendering. We have experimented with textured versus untextured surfaces, fine versus coarse tessellation of curved surfaces, Phong versus Gouraud interpolated shading, and antialiased versus non-antialiased edges. In all cases, our polygon-assisted compression looks subjectively better for a fixed network bandwidth than compressing and sending the high-quality rendering. We describe a software simulation that uses JPEG and MPEG-1 compression, and we show results for a variety of scenes.
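The division of labor between server and client can be sketched as follows. This is only an illustration of the subtract, compress, decompress, add round trip: it treats images as flat byte strings, stores the difference modulo 256, and uses the lossless zlib module as a stand-in for the JPEG and MPEG-1 coders named above, all of which are assumptions of this example.

```python
import zlib

def server_encode(high_quality, low_quality):
    """Subtract the low-quality rendering from the high-quality one, then compress."""
    difference = bytes((h - l) % 256
                       for h, l in zip(high_quality, low_quality))
    return zlib.compress(difference)

def client_decode(payload, low_quality):
    """Add the decompressed difference to the client's own low-quality rendering."""
    difference = zlib.decompress(payload)
    return bytes((l + d) % 256 for l, d in zip(low_quality, difference))

# Hypothetical 64-pixel greyscale renderings that differ only slightly.
low = bytes([10, 20, 30, 40] * 16)
high = bytes([12, 20, 29, 41] * 16)

payload = server_encode(high, low)
restored = client_decode(payload, low)
```

Because zlib is lossless, the client recovers the high-quality image exactly here; with a lossy coder the composite would only approximate it.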
Parallel Visualization Algorithms: Performance and Architectural Implications
Jaswinder Pal Singh, Anoop Gupta, and Marc Levoy IEEE Computer, Vol. 27, No. 7, July 1994 
Several recent algorithms have substantially sped up complex and time-consuming visualization tasks. In particular, novel algorithms for radiosity computation [1] and volume rendering [2][3] have demonstrated performance far superior to earlier methods. Despite these advances, visualization of complex scenes or data sets remains computationally expensive. Rendering a 256-by-256-by-256 voxel volume data set takes about 5 seconds per frame on a 100 MHz Silicon Graphics Indigo workstation using the ray-casting algorithm in [2], and about a second per frame using a new shear-warp algorithm [3]. These times are much larger than the 0.03 seconds per frame required for real-time rendering or the 0.1 seconds per frame required for interactive rendering. Realistic radiosity and ray tracing computations are much more time-consuming...
Volume Rendering on Scalable Shared-Memory MIMD Architectures
Jason Nieh and Marc Levoy Proc. 1992 Workshop on Volume Visualization 
Volume rendering is a useful visualization technique for understanding the large amounts of data generated in a variety of scientific disciplines. Routine use of this technique is currently limited by its computational expense. We have designed a parallel volume rendering algorithm for MIMD architectures based on ray tracing and a novel task queue image partitioning technique. The combination of ray tracing and MIMD architectures allows us to employ algorithmic optimizations such as hierarchical opacity enumeration, early ray termination, and adaptive image sampling. The use of task queue image partitioning makes these optimizations efficient in a parallel framework. We have implemented our algorithm on the Stanford DASH Multiprocessor, a scalable shared-memory MIMD machine. Its single address space and coherent caches provide programming ease and good performance for our algorithm. With only a few days of programming effort, we have obtained nearly linear speedups and near real-time frame update rates on a 48-processor machine. Since DASH is constructed from Silicon Graphics multiprocessors, our code runs on any Silicon Graphics workstation without modification.
User interfaces  
3D Painting on Scanned Surfaces
Maneesh Agrawala, Andrew Beers, and Marc Levoy Proc. 1995 Symposium on Interactive 3D Graphics 
We present an intuitive interface for painting on unparameterized three-dimensional polygon meshes using a 6D Polhemus space tracker as an input device. Given a physical object we first acquire its surface geometry using a Cyberware scanner. We then treat the sensor of the space tracker as a paintbrush. As we move the sensor over the surface of the physical object we color the corresponding locations on the scanned mesh. The physical object provides a natural force-feedback guide for painting on the mesh, making it intuitive and easy to accurately place color on the mesh.
Spreadsheets for Images
Marc Levoy Proc. SIGGRAPH 1994 
We describe a data visualization system based on spreadsheets. Cells in our spreadsheet contain graphical objects such as images, volumes, or movies. Cells may also contain widgets such as buttons, sliders, or curve editors. Objects are displayed in miniature inside each cell. Formulas for cells are written in a generalpurpose programming language (Tcl) augmented with operators for array manipulation, image processing, and rendering. Compared to flow chart visualization systems, spreadsheets are more expressive, more scalable, and easier to program. Compared to conventional numerical spreadsheets, spreadsheets for images pose several unique design problems: larger formulas, longer computation times, and more complicated intercell dependencies. In response to these problems, we have extended the spreadsheet paradigm in three ways: formulas can display their results anywhere in the spreadsheet, cells can be selectively disabled, and multiple cells can be edited at once. We discuss these extensions and their implications, and we also point out some unexpected uses for our spreadsheets: as a visual database browser, as a graphical user interface builder, as a smart clipboard for the desktop, and as a presentation tool.  
Gaze-Directed Volume Rendering
Marc Levoy and Ross Whitaker Proc. 1990 Symposium on Interactive 3D Graphics 
We direct our gaze at an object by rotating our eyes or head until the object's projection falls on the fovea, a small region of enhanced spatial acuity near the center of the retina. In this paper, we explore methods for incorporating gaze direction into rendering algorithms. This approach permits generation of images exhibiting continuously varying resolution, and allows these images to be displayed on conventional television monitors. Specifically, we describe a ray tracer for volume data in which the number of rays cast per unit area on the image plane and the number of samples drawn per unit length along each ray are functions of local retinal acuity. We also describe an implementation using 2D and 3D mip maps, an eye tracker, and the Pixel-Planes 5 massively parallel raster display system. Pending completion of Pixel-Planes 5 in the spring of 1990, we have written a simulator on a Stellar graphics supercomputer. Preliminary results indicate that while users are aware of the variable-resolution structure of the image, the high-resolution sweet spot follows their gaze well and promises to be useful in practice. 
Cartoon animation 

Merging and Transformation of Raster Images for Cartoon Animation
Bruce A. Wallace Proc. SIGGRAPH 1981
About the role of this paper in the history of digital compositing.

The task of assembling drawings and backgrounds together for each frame of an animated sequence has always been a tedious undertaking using conventional animation camera stands, and has contributed to the high cost of animation production. In addition, the physical limitations that these camera stands place on the manipulation of the individual artwork levels restrict the total image-making possibilities afforded by traditional cartoon animation. Documents containing all frame assembly information must also be maintained. This paper presents several computer methods for assisting in the production of cartoon animation, both to reduce expense and to improve the overall quality. Merging is the process of combining levels of artwork into a final composite frame using digital computer graphics. The term "level" refers to a single painted drawing (cel) or background. A method for the simulation of any hypothetical animation camera setup is introduced. A technique is presented for reducing the total number of merges by retaining merged groups consisting of individual levels which do not change over successive frames. Lastly, a sequence-editing system, which controls precise definition of an animated sequence, is described. Also discussed is the actual method for merging any two adjacent levels and several computational and storage optimizations to speed the process.  
Area Flooding Algorithms
Marc Levoy SIGGRAPH 1981 Two-Dimensional Computer Animation course notes. 
This paper describes the area flooder (equivalent to the paint bucket in Adobe Photoshop) used in the Hanna-Barbera Productions Computer-Assisted Animation System, which was in production from the mid-1980s until 1996. Descriptions are included in the paper of both a hard-edged flooder (stop when the color changes) and a soft-edge flooder (stop when the color gradient exceeds a given threshold). At the time this was the fastest area flooder known, at least to the small community of programmers who worked in this area. This speed was due partly to the algorithm itself (so it's still a fast algorithm), and partly due to having been coded directly in machine language. Here is a longer description of this project. The version linked here was optically scanned from the SIGGRAPH 1982 course notes. It is identical to the 1981 version but includes corrections made by hand on the manuscript in 1982.  
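To illustrate the two stopping rules mentioned above, here is a minimal sketch in Python. It is not the algorithm from the paper (which was written in machine language and is not reproduced on this page); it is a plain breadth-first fill over a grayscale grid. The function name `flood_fill`, its parameters, the use of 4-connectivity, and the choice to measure the gradient as the absolute difference between adjacent pixel values are all assumptions made for illustration. With `threshold=0` the fill stops wherever the value changes (hard-edged); with a positive threshold it stops only where the local difference exceeds the threshold (soft-edge).

```python
from collections import deque


def flood_fill(image, seed, fill_value, threshold=0):
    """Return a copy of `image` with the region around `seed` set to `fill_value`.

    `image` is a list of rows of numeric gray values (hypothetical format).
    The fill spreads from a pixel to a 4-connected neighbor only if their
    values differ by at most `threshold`, measured on the ORIGINAL image.
    """
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]  # leave the input untouched
    visited = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        result[r][c] = fill_value
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                # Stopping rule: do not cross a step larger than the threshold.
                if abs(image[nr][nc] - image[r][c]) <= threshold:
                    visited.add((nr, nc))
                    queue.append((nr, nc))
    return result


image = [
    [0, 0, 9],
    [0, 1, 9],
    [9, 9, 9],
]
hard = flood_fill(image, (0, 0), 5)               # stops at any change in value
soft = flood_fill(image, (0, 0), 5, threshold=1)  # tolerates steps of size 1
```

In this small example the hard-edged fill leaves the pixel with value 1 untouched, while the soft-edge fill with a threshold of 1 absorbs it; neither crosses the border of 9s.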
A Color Animation System Based on the Multiplane Technique
Marc Levoy Proc. SIGGRAPH 1977 
This paper describes an animation package currently under development at the Cornell Program of Computer Graphics. The basic algorithm employed is linear or nonlinear interpolation between successive pairs of key frames. These key frames are composed of artwork input by the animator on a graphic tablet and displayed on either a black and white vector scope or a color halftone CRT. The initial working environment is two-dimensional, and the individual images are combined using a multiplane cel animation technique to produce depth and motion illusions. Real-time film previewing, utilizing an on-the-fly interpolation algorithm, provides the artist with instant playback of animated sequences.  
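As a rough illustration of interpolation between a pair of key frames, the sketch below blends two point lists in Python. The data layout (each key frame as a list of (x, y) points in matching order), the function names `interpolate_keyframes` and `ease_in_out`, and the particular nonlinear timing curve are assumptions made for this example; the abstract does not specify how the Cornell system represented artwork or which nonlinear functions it used.

```python
def ease_in_out(t):
    """One possible nonlinear timing curve (smoothstep), chosen for illustration."""
    return t * t * (3 - 2 * t)


def interpolate_keyframes(key_a, key_b, t, ease=None):
    """Blend two key frames at parameter t in [0, 1].

    key_a and key_b are lists of (x, y) points with matching order and length.
    If `ease` is given, it remaps t before the blend, which makes the motion
    nonlinear in time; otherwise the blend is plain linear interpolation.
    """
    if len(key_a) != len(key_b):
        raise ValueError("key frames must contain the same number of points")
    if ease is not None:
        t = ease(t)
    return [
        (ax + (bx - ax) * t, ay + (by - ay) * t)
        for (ax, ay), (bx, by) in zip(key_a, key_b)
    ]


key_a = [(0, 0), (2, 2)]
key_b = [(4, 0), (2, 6)]
halfway = interpolate_keyframes(key_a, key_b, 0.5)
eased_quarter = interpolate_keyframes(key_a, key_b, 0.25, ease=ease_in_out)
```

Generating the in-between frames for a sequence then amounts to calling the function once per output frame with t stepping from 0 to 1.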