Image-Based Lighting

Introduction and Motivation

The purpose of this project was to examine image-based lighting, as explained in Paul Debevec's Siggraph 1998 paper entitled Rendering Synthetic Objects into Real Scenes. Image-based lighting is becoming increasingly important in the film industry as it provides a practical means of seamlessly integrating 3D objects into an existing 2D image.

Having worked in the video and film post-production industry for a couple of years, my primary goal was to explore the image-based lighting pipeline, from data acquisition to creating the final composited images.

My initial project proposal can be found here.

Task Breakdown

My work was roughly broken down into the following tasks (click on the links for more details and pictures):

  1. Data Acquisition: To do image-based lighting, I first needed a high dynamic range image of the area I intended to enhance with my 3D object. In my case, this was obtained by photographing a mirrored ball at multiple exposures.
  2. Computing the Radiance Map: Once I had photographs of the light probe, my next task was computing a radiance map from this data. Fortunately, Paul Debevec has a very well-written paper on this subject, Recovering High Dynamic Range Radiance Maps from Photographs, from which I implemented a tool. The tool takes as input a series of light probe images and creates a high dynamic range image and radiance map from them; it also acts as a high dynamic range image viewer. When computing the radiance map, the light probe is assumed to be small relative to the environment and camera. The data on the spherical light probe is saved into a 2D map indexed by latitude/longitude, with bilinear interpolation applied where appropriate. The tool also uses code from Numerical Recipes in C to do the singular value decomposition used in solving for the camera response curve (a sketch of how the exposures are combined appears after this list).
  3. Camera Calibration: The next step was to build a 3D representation of my scene. As I did not have access to any sophisticated modeling software, I stuck with a simple geometric plane in a perspective view. Starting from rough measurements of the camera position and angle, I wrote a simple OpenGL utility that allowed me to enter the various parameters and interactively adjust them until I had a reasonably good approximation of the perspective and object parameters. The output of this tool was a RIB file that could be rendered directly by lrt.
  4. Colour Calibration: For optimal compositing results, the colours of the local scene objects should match the original photograph as closely as possible. In his paper, Paul Debevec outlined a method whereby the colours can be computed by comparing the output of a rendered image with the original image. Again, I wrote a simple tool that takes as input a RIB file, an image of that RIB rendered by lrt, and the original image; its output is an updated RIB file. This process can be repeated until the colour matches the original image, although, depending on how far off the initial estimate was, it is rare for more than one or two iterations to be required (a sketch of the update step appears after this list).
  5. Rendering the 3D Scene: I wrote a couple of new lrt modules to create the renderings needed for the final compositing step. The first module, globalillum, is a new integrator that uses the radiance map computed in step 2 to light the scene. I assumed that the environment is large compared to the rendered scene, so only the ray direction is used to retrieve radiance from the radiance map (see the lookup sketch after this list); again, bilinear interpolation was applied where appropriate. Multiple importance sampling was required to reduce noise: samples are generated with stratified sampling (via LatinHypercube) and then weighted using a combination of the probability density function and the brightness of the light in that direction. Irradiance caching is used to estimate indirect lighting effects. The second module I wrote was a trivial integrator that produces only the alpha channel of a scene; this image is needed as a matte for the final compositing step.
  6. Image Compositing: The final step in this process is to composite the 3D image created by lrt onto the original photograph. To do this, I used Paul Debevec's differential rendering technique, which takes the difference between renderings of the local scene with and without the new 3D objects and adds that difference to the original photograph (a per-pixel sketch appears below). This technique results in a seamless integration of the 3D objects with the 2D scene without requiring an accurate model of the existing objects that interact with the newly added ones.
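
To make the steps above a little more concrete, here are a few small code sketches. They are illustrative reconstructions rather than excerpts from my actual tools: the names and interfaces are placeholders, and only the underlying techniques from Debevec's papers are the real thing.

The first sketch corresponds to step 2: once the response curve g has been recovered (the part that uses the SVD solve), each pixel of the radiance map is a weighted combination of its values across the differently exposed light probe photographs.

    // Sketch only: combine the values of one pixel across several exposures
    // into a single log-radiance value, following Debevec and Malik.  Assumes
    // the response curve g has already been recovered; gCurve, pixelValues
    // and shutter are illustrative names, not my tool's actual interface.
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Hat weighting function from the paper: mid-range pixel values are
    // trusted most; values near under- or over-exposure are downweighted.
    inline double weight(int z, int zMin = 0, int zMax = 255) {
        return (z <= (zMin + zMax) / 2) ? (z - zMin) : (zMax - z);
    }

    // gCurve[z] = g(z), the recovered response curve.  pixelValues[j] is this
    // pixel's value in exposure j; shutter[j] is that exposure's shutter time.
    // Returns ln E, the log radiance assigned to the pixel.
    double recoverLogRadiance(const std::vector<double>& gCurve,
                              const std::vector<int>& pixelValues,
                              const std::vector<double>& shutter) {
        double num = 0.0, den = 0.0;
        for (std::size_t j = 0; j < pixelValues.size(); ++j) {
            double w = weight(pixelValues[j]);
            num += w * (gCurve[pixelValues[j]] - std::log(shutter[j]));
            den += w;
        }
        return den > 0.0 ? num / den : 0.0;
    }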
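
The colour-calibration update from step 4 is essentially a per-channel ratio, under the assumption from Debevec's paper that the local scene materials are diffuse, so a rendered pixel value scales linearly with the surface albedo. Something along these lines, with placeholder types and names:

    // Sketch of one colour-calibration iteration.  The averages are taken
    // over the pixels the object covers in each image; RGB, photoAverage and
    // renderedAverage are placeholder names.
    struct RGB { double r, g, b; };

    RGB updateAlbedo(const RGB& currentAlbedo,     // albedo in the current RIB file
                     const RGB& photoAverage,      // original photo, object region
                     const RGB& renderedAverage) { // lrt render, same region
        // Scale each channel by how far the rendering is from the photograph;
        // re-rendering with the new albedo usually converges in one or two passes.
        RGB updated;
        updated.r = currentAlbedo.r * photoAverage.r / renderedAverage.r;
        updated.g = currentAlbedo.g * photoAverage.g / renderedAverage.g;
        updated.b = currentAlbedo.b * photoAverage.b / renderedAverage.b;
        return updated;
    }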
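
For step 5, because the environment is treated as distant, the globalillum integrator only needs a ray direction to fetch radiance. A minimal version of the latitude/longitude lookup with bilinear interpolation might look like this; the image layout and axis conventions here are assumptions, not necessarily the ones my tool uses:

    // Sketch of a direction-only radiance lookup: a (normalized) world-space
    // direction maps directly to a latitude/longitude texel.  Row-major RGB
    // layout and a z-up convention are assumed for illustration.
    #include <algorithm>
    #include <cmath>

    struct RadianceMap {
        int width, height;
        const float* rgb;  // width*height*3 floats, row-major, linear radiance

        // Bilinearly interpolate the four texels surrounding the direction.
        void lookup(float dx, float dy, float dz, float out[3]) const {
            const float PI = 3.14159265f;
            float theta = std::acos(dz);        // latitude,  [0, pi]
            float phi   = std::atan2(dy, dx);   // longitude, (-pi, pi]
            if (phi < 0.0f) phi += 2.0f * PI;
            float u = phi / (2.0f * PI) * (width - 1);
            float v = theta / PI * (height - 1);
            int u0 = (int)u, v0 = (int)v;
            int u1 = std::min(u0 + 1, width - 1);
            int v1 = std::min(v0 + 1, height - 1);
            float fu = u - u0, fv = v - v0;
            for (int c = 0; c < 3; ++c) {
                float top = (1 - fu) * rgb[3 * (v0 * width + u0) + c]
                          +       fu * rgb[3 * (v0 * width + u1) + c];
                float bot = (1 - fu) * rgb[3 * (v1 * width + u0) + c]
                          +       fu * rgb[3 * (v1 * width + u1) + c];
                out[c] = (1 - fv) * top + fv * bot;
            }
        }
    };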
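
And for step 6, differential rendering boils down to a simple per-pixel expression: wherever the pixel shows the local scene, add the difference between the two renderings to the photograph; wherever the alpha matte says a synthetic object covers the pixel, take the rendering directly. A sketch, assuming linear floating-point images of equal size:

    // Per-pixel differential-rendering composite (one channel shown; applied
    // to R, G and B in practice).  objectAlpha comes from the alpha integrator.
    float compositePixel(float background,      // original photograph
                         float withObjects,     // lrt render with the new objects
                         float withoutObjects,  // lrt render of the local scene only
                         float objectAlpha) {   // 1 where a synthetic object covers the pixel
        // Local scene: add the change the new objects caused (shadows,
        // reflections, colour bleeding) onto the photograph.
        float differential = background + (withObjects - withoutObjects);
        // Object pixels come straight from the rendering with objects.
        return objectAlpha * withObjects + (1.0f - objectAlpha) * differential;
    }

Because only the difference between the two renderings is applied to the photograph, errors in the estimated local scene geometry and materials largely cancel out, which is why an exact model of the existing objects is not needed.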

Problems and Challenges

Here is a brief summary of the various technical hurdles I encountered:

Results

Here are the pretty pictures. This first set shows some (real) pictures of various random objects on the kitchen counter in my studio.

Next are the objects replaced by 3D generated ones (click to enlarge).

And finally, here are a couple of cars rendered into some pictures I took of the Stanford Quad on an overcast morning. The car on the left is a Chrysler/Dodge Viper RT/10, and the car on the right is a Lamborghini Diablo.

The two scenes above show how my image-based lighting implementation behaves under two very different lighting conditions. The studio scene has four main light sources of different colours: two windows with sunlight shining in, and two overhead lights. This results in shadows falling in multiple directions and multiple specular highlights on the plastic sphere and dragon. The Quad scene was taken on an overcast day, so the shadows are much softer and more spread out. Both scenes would probably be reasonably difficult to reproduce without image-based lighting techniques.

The zoomed-in car scene, with the largest polygon count at 40k, took the longest to render -- approximately 90 minutes on a 1.79 GHz Athlon. The zoomed-out car scene took about 80 minutes, and the studio scenes took about 90 minutes. The artifacts on the spheres in the studio scene are caused by inaccuracies in the irradiance caching method; they can be reduced by increasing the number of irradiance cache samples at the cost of (much) longer render times. All scenes were rendered against a 1280x960 background plate at 16 samples per pixel, with 64 radiance map samples per pixel sample. The irradiance cache threshold was set to 0.2 for the car scenes and 0.05 for the studio scenes.

Conclusions and Stuff-I-Would-Have-Done-If-I-Had-More-Time

I have mainly concentrated on establishing a complete image-based lighting pipeline, from data acquisition to the final compositing. There are a variety of improvements that could be made at each step of the process, which I did not have time to explore. Some examples include:

Acknowledgements

I would like to thank the following people/groups for their contributions:

References

  1. Paul E. Debevec and Jitendra Malik. Recovering High Dynamic Range Radiance Maps from Photographs. Proceedings of SIGGRAPH 97.
  2. Paul E. Debevec. Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-Based Graphics with Global Illumination and High Dynamic Range Photography. Proceedings of SIGGRAPH 98.

Copyright © 2003 Eric Lee. Send comments to erl@stanford.edu.
Last updated on June 9, 2003