Assignment 3 Camera Simulation

Mattias Bergbom

Date submitted: ?? May 2006

Code emailed: ?? May 2006

Compound Lens Simulator

Description of implementation approach and comments

Just as suggested, I tried going about this assignment in a slightly more methodical fashion than the previous one. I use a Lens class which holds a std::vector of LensSurface structs, each of which describes all the properties required when tracing rays both from film to world and vice versa. This structure makes it very easy to iterate back and forth through the elements, and by tagging the aperture(s) I can add them to the vector as LensSurfaces as well and treat them transparently. For extra clarity this would probably have been a good place for some genuine OO, but for simplicity I opted for an if statement in the LensSurface::Refract method instead.

Tracing rays through the lens system now boils down to iterating through the LensSurface vector and calling Refract() to do Snell's law on each element, discarding the ray (by returning weight 0) if it goes out of bounds or reflects.
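The loop structure described above could look roughly like the sketch below. It is not the submitted code: the field names and the Trace method are made up for illustration, it traces in one direction only, and the exact Snell's law step is replaced with its paraxial (small-angle) form to keep the example short, so total internal reflection never occurs here.

```cpp
#include <cmath>
#include <vector>

// Hypothetical surface record; field names are illustrative, not the author's.
struct LensSurface {
    double curvatureRadius;  // signed radius of curvature (nonzero unless aperture)
    double thickness;        // axial distance to the next surface
    double iorAfter;         // refractive index of the medium behind the surface
    double apertureRadius;   // half the clear diameter of this element
    bool   isAperture;       // tagged stop: clips rays, never bends them

    // Paraxial ray state: height y above the axis and slope u.
    // Returns false if the ray misses the element and must be discarded.
    bool Refract(double& y, double& u, double iorBefore) const {
        if (std::fabs(y) > apertureRadius) return false;  // out of bounds
        if (!isAperture) {
            // Small-angle Snell's law: n' u' = n u - y (n' - n) / R
            u = (iorBefore * u - y * (iorAfter - iorBefore) / curvatureRadius)
                / iorAfter;
        }
        return true;
    }
};

struct Lens {
    std::vector<LensSurface> surfaces;  // ordered along the direction of travel

    // Returns the ray weight: 1 if the ray passes every element, 0 if discarded.
    double Trace(double& y, double& u) const {
        double ior = 1.0;  // assume the ray starts in air
        for (const LensSurface& s : surfaces) {
            if (!s.Refract(y, u, ior)) return 0.0;
            if (!s.isAperture) ior = s.iorAfter;
            y += u * s.thickness;  // transfer to the next surface
        }
        return 1.0;
    }
};
```

Because the aperture is just another entry in the vector with isAperture set, the loop needs no special case for it beyond the if inside Refract.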

Coming straight out of CS223b, right from the start I decided to use OpenCV's drawing facilities to visualize the rays going through the lens, indicating various events (such as aperture hits, reflections etc.) with different figures (see figure 1). This proved extremely useful during the rest of the assignment, since a lot of the more subtle bugs are way easier to detect visually than by stepping through code and checking numbers. By using preprocessor directives I was able to completely hide all the visualization code from the release build, to improve performance and avoid build errors on other machines.

Fig 1a: Visualization of ray tracing through a Gauss lens

Fig 1b: Tracing parallel incoming rays to determine the thick lens
approximation of a Gauss lens

One bug that I wasn't able to find by visual means was a faulty Snell's law implementation. It manifested itself as unwanted spherical aberration and a generally smeared-out appearance of the images. After advice from the TA I was finally able to pinpoint the problem and switch to Heckbert & Hanrahan's method instead.
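For reference, here is a minimal sketch of the vector form of Snell's law that is usually attributed to Heckbert. Whether it matches the code that was finally used is an assumption, and Vec3, Dot and RefractDirection are illustrative names.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

inline double Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Refracts the unit direction I at a surface with unit normal N, where N
// points against the incoming ray (Dot(I, N) < 0).
// eta = n_incident / n_transmitted.
// Returns false on total internal reflection; otherwise writes the unit
// transmitted direction to T.
bool RefractDirection(const Vec3& I, const Vec3& N, double eta, Vec3& T) {
    const double c1 = -Dot(I, N);                           // cos(theta_i)
    const double c2sq = 1.0 - eta * eta * (1.0 - c1 * c1);  // cos^2(theta_t)
    if (c2sq < 0.0) return false;                           // total internal reflection
    const double k = eta * c1 - std::sqrt(c2sq);
    T = { eta * I.x + k * N.x, eta * I.y + k * N.y, eta * I.z + k * N.z };
    return true;
}
```

A quick sanity check for an implementation like this is that the tangential component of T equals eta times the tangential component of I, which is Snell's law in vector form.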

Final Images Rendered with 512 samples per pixel

My Implementation

Reference

Telephoto

hw3telephoto_512

Double Gauss

hw3dgauss_512

Wide Angle

hw3wide_512

Fisheye

hw3fisheye_512

Final Images Rendered with 4 samples per pixel

My Implementation

Reference

Telephoto

hw3telephoto_4

Double Gauss

hw3dgauss_4

Wide Angle

hw3wide_4

Fisheye

hw3fisheye_4

Experiment with Exposure

Image with aperture full open

Image with half radius aperture

Observation and Explanation

Just as expected, stopping the lens down to half the aperture radius lets less light get in (or 'out'), so more exposure compensation is needed and more noise appears. In this case the exposure time is constant, forcing us to increase the sensitivity of the sensor (i.e. going to a higher ASA), which in the classical sense would demand grainier film, and in the modern sense would cause a noisier signal. Also, the depth of field increases as the aperture shrinks, since the peripheral rays that earlier caused wider circles of confusion are now prevented from entering/leaving the lens.
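As a rough worked example of how much light is lost, assume an ideal circular aperture of radius r and a film irradiance E proportional to the aperture area:

```latex
\[
\frac{E_{\text{half}}}{E_{\text{full}}}
  = \frac{\pi (r/2)^2}{\pi r^2}
  = \frac{1}{4}
\quad\Longrightarrow\quad
\log_2 4 = 2 \text{ stops}
\]
```

Under that assumption, halving the radius costs two stops, so the sensitivity would have to be raised by a factor of four to keep the same image brightness at a constant exposure time.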

Autofocus Simulation

Description of implementation approach and comments

(not complete yet)

Since derivatives of any order are way too sensitive to the Monte Carlo noise, I quite quickly opted for a DCT (Discrete Cosine Transform) based approach for the 'focusness' measure instead. Any frequency domain method is generally better for noisy data, since high frequencies can so easily be filtered out, and [Kristan 05] presents a very elegant method based on a simple 8x8 DCT and Bayes Spectral Entropy. The reasoning behind this approach is, in broad strokes, that the normalized spectrum of an unfocused image will exhibit strong low frequency modes, while a focused image will have its frequencies more uniformly distributed. This property can be captured by using a Bayes entropy function, which is essentially one minus a sum of squares divided by a square of sums:

$$M(f) = 1 - \frac{\sum_{u,v} F(u,v)^2}{\left(\sum_{u,v} |F(u,v)|\right)^2}$$

where F(u,v) is the frequency domain function, e.g. the DCT. Kristan also shows that using a regular 8x8 DCT and simply averaging the focus measure over all 8x8 blocks in the focus area is sufficient for robustness, meaning I could go ahead and use any of the abundance of implementations written for classic MPEG/JPEG image compression. I settled for a version of the Winograd DCT [Guidon 06]. Also, cutting off some unnecessary frequency content by limiting u+v <= t proved crucial to noise tolerance, and was a key reason that I could step down from the initial 256 samples per pixel to somewhere around 64-128 and still maintain robustness.
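The measure above can be sketched as follows. This is an illustration rather than the submitted code: it swaps in a naive O(N^4) 8x8 DCT-II for the Winograd DCT, the names Block8, Dct8x8, BlockFocusMeasure and FocusMeasure are hypothetical, and it keeps every coefficient with u+v <= t (including the DC term) exactly as the formula is written.

```cpp
#include <array>
#include <cmath>
#include <vector>

using Block8 = std::array<std::array<double, 8>, 8>;

// Naive orthonormal 8x8 DCT-II; a slow stand-in for a Winograd DCT.
Block8 Dct8x8(const Block8& in) {
    const double kPi = 3.14159265358979323846;
    Block8 out{};
    for (int u = 0; u < 8; ++u) {
        for (int v = 0; v < 8; ++v) {
            double sum = 0.0;
            for (int x = 0; x < 8; ++x)
                for (int y = 0; y < 8; ++y)
                    sum += in[x][y] * std::cos((2 * x + 1) * u * kPi / 16.0)
                                    * std::cos((2 * y + 1) * v * kPi / 16.0);
            const double cu = (u == 0) ? std::sqrt(0.5) : 1.0;
            const double cv = (v == 0) ? std::sqrt(0.5) : 1.0;
            out[u][v] = 0.25 * cu * cv * sum;
        }
    }
    return out;
}

// M = 1 - sum(F^2) / (sum|F|)^2, restricted to coefficients with u + v <= t.
double BlockFocusMeasure(const Block8& block, int t) {
    const Block8 F = Dct8x8(block);
    double sumSq = 0.0, sumAbs = 0.0;
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8 && u + v <= t; ++v) {
            sumSq += F[u][v] * F[u][v];
            sumAbs += std::fabs(F[u][v]);
        }
    if (sumAbs == 0.0) return 0.0;  // all-zero block: treat as unfocused
    return 1.0 - sumSq / (sumAbs * sumAbs);
}

// Averages the block measure over all complete 8x8 blocks of a row-major
// grayscale image covering the focus area.
double FocusMeasure(const std::vector<double>& img, int width, int height, int t) {
    double total = 0.0;
    int count = 0;
    for (int by = 0; by + 8 <= height; by += 8)
        for (int bx = 0; bx + 8 <= width; bx += 8) {
            Block8 block{};
            for (int y = 0; y < 8; ++y)
                for (int x = 0; x < 8; ++x)
                    block[y][x] = img[(by + y) * width + (bx + x)];
            total += BlockFocusMeasure(block, t);
            ++count;
        }
    return count > 0 ? total / count : 0.0;
}
```

A block whose energy sits in a single coefficient scores close to 0, while a block whose energy is spread evenly over k coefficients scores 1 - 1/k, which matches the intuition that a focused image has a more uniform spectrum.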

Fig 2: Comparison between DCT-based and derivative-based
approaches to measuring focus at high noise levels.

For search, I first did a very naive version. It finds the focal point by shooting parallel rays into the lens and solving for their point of intersection, and starting from there it does a number of linear searches with increasing granularity and shrinking search intervals, each centered around the maximum found in the previous search. Thanks to the performance gain of the DCT focus measure I still land somewhere around 60-100 sec. total with reasonable robustness, which accounts for less than 10% of the total rendering time, but I'm sure there are far more efficient approaches.

Fig 3: Having several possible focus depths within an AF zone
ruins the unimodality of the focus measure function.

Finding an alternative search algorithm proved way more difficult than I first expected. Although figure 2 suggests unimodality, an AF zone might cover several focal depths, giving rise to more than one peak. For example in figure 3, hw3.afdgauss_bg.pbrt causes two maxima: one at the desired depth (see below) and one at infinity, due to the background being partly included. This quickly ruled out my implementation of Golden Section search, which assumes a unimodal function. All in all, searching a non-unimodal, noisy function for the global maximum is a problem that, as far as I can tell, remains unsolved, at least without significant knowledge of the function's properties.
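For comparison, a textbook Golden Section search for a maximum looks roughly like the sketch below (GoldenSectionMax is an illustrative name, not the discarded implementation). Each step throws away one end of the interval based on only two probes, which is exactly why a second peak breaks it: the search may converge on either peak depending on where the probes land.

```cpp
#include <cmath>
#include <functional>

// Golden Section search for a maximum of f on [a, b].
// Only guaranteed to find the global maximum when f is unimodal on [a, b].
double GoldenSectionMax(const std::function<double(double)>& f,
                        double a, double b, double tol) {
    const double invPhi = (std::sqrt(5.0) - 1.0) / 2.0;  // about 0.618
    double c = b - invPhi * (b - a);
    double d = a + invPhi * (b - a);
    double fc = f(c), fd = f(d);
    while (b - a > tol) {
        if (fc > fd) {
            // Maximum lies in [a, d]; the old c becomes the new d.
            b = d; d = c; fd = fc;
            c = b - invPhi * (b - a);
            fc = f(c);
        } else {
            // Maximum lies in [c, b]; the old d becomes the new c.
            a = c; c = d; fc = fd;
            d = a + invPhi * (b - a);
            fd = f(d);
        }
    }
    return 0.5 * (a + b);
}
```

Each iteration needs only one new evaluation of f, which is what makes the method attractive when every evaluation means rendering an image.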

(diagrams to be added)

References:
M. Kristan, J. Pers, M. Perse, S. Kovacic. "Bayes Spectral Entropy-Based Measure of Camera Focus". Faculty of Electrical Engineering, University of Ljubljana. 2005.
Y. Guidon. "8-tap Winograd DCT optimised for FC0". <http://f-cpu.seul.org/whygee/dct_fc0/dct_fc0.html>. 2002.

Final Images Rendered with 512 samples per pixel

Adjusted film distance

My Implementation

Reference

Double Gauss 1

61.5 mm

hw3afdgauss_closeup

Double Gauss 2

39.6 mm

hw3afdgauss_bg

Telephoto

116.2 mm

hw3aftelephoto

MattiasBergbom/Assignment3 (last edited 2006-05-09 06:27:41 by MattiasBergbom)