Assignment 3 Camera Simulation

Mattias Bergbom

Date submitted: ?? May 2006

Code emailed: ?? May 2006

Compound Lens Simulator

Description of implementation approach and comments

Just as suggested, I approached this assignment in a slightly more methodical fashion than the previous one. I use a Lens class that holds a std::vector of LensSurface structs, each describing all the properties needed when tracing rays from film to world and vice versa. This structure makes it very easy to iterate back and forth through the elements, and by tagging the aperture(s) I can add them to the vector as LensSurfaces as well and treat them transparently. For extra clarity this would probably have been a good place for some genuine OO, but for simplicity I opted for an if statement in the LensSurface::Refract method instead.

Tracing rays through the lens system now boils down to iterating through the LensSurface vector and calling Refract() to apply Snell's law at each element, discarding the ray (by returning weight 0) if it goes out of bounds or reflects.
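The structure described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual assignment code; field and function names are assumptions, and the Refract sketch only does the planar intersection and aperture clipping (the real code intersects the spherical surface and bends the ray).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Ray { float x, y, z, dx, dy, dz; };

struct LensSurface {
    float radius;    // radius of curvature (0 marks a planar aperture stop)
    float zPos;      // position of the surface along the optical axis
    float halfAp;    // half-diameter of the clear opening
    float n;         // refractive index of the medium behind the surface
    bool  isStop;    // apertures live in the same vector, just tagged

    // Advance the ray to this surface and clip it against the aperture;
    // returns false when the ray falls outside the clear opening. The real
    // Refract also bends the ray and rejects total internal reflection.
    bool Refract(Ray& r) const {
        float t = (zPos - r.z) / r.dz;   // planar intersection (sketch only)
        float x = r.x + t * r.dx;
        float y = r.y + t * r.dy;
        if (std::sqrt(x * x + y * y) > halfAp) return false;
        r.x = x; r.y = y; r.z = zPos;
        return true;
    }
};

struct Lens {
    std::vector<LensSurface> surfaces;   // ordered from film to world

    // Tracing reduces to iterating the vector; weight 0 discards the ray.
    float TraceFilmToWorld(Ray& r) const {
        for (const LensSurface& s : surfaces)
            if (!s.Refract(r)) return 0.f;
        return 1.f;
    }
};
```

Iterating the same vector in reverse order gives the world-to-film trace, which is the main payoff of keeping all elements, apertures included, in one flat container.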

Coming straight out of CS223b, I decided right from the start to use OpenCV's drawing facilities to visualize the rays going through the lens, indicating various events (such as aperture hits, reflections, etc.) with different figures (see figure 1). This proved extremely useful during the rest of the assignment, since many of the more subtle bugs are far easier to detect visually than by stepping through code and checking numbers. By using preprocessor directives I was able to completely hide all the visualization code from the release build, improving performance and avoiding build errors on other machines.

Fig 1a: Visualization of ray tracing through a Gauss lens

Fig 1b: Tracing parallel incoming rays to determine the thick lens
approximation of a Gauss lens

One bug that I wasn't able to find by visual means was a faulty Snell's law implementation. It manifested itself as unwanted spherical aberration and a generally smeared-out appearance of the images. After advice from the TA I was finally able to pinpoint the problem and switch to Heckbert & Hanrahan's method instead.
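The vector form of refraction attributed to Heckbert can be sketched as below; this is the standard formulation, not a copy of the assignment code, and the Vec3 type is illustrative.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Heckbert-style vector refraction. I and N must be unit length, with N
// facing against the incident direction I; eta = n1 / n2. Returns false on
// total internal reflection, in which case the ray would be discarded
// (weight 0) as described above.
bool Refract(const Vec3& I, const Vec3& N, double eta, Vec3& T) {
    double c1  = -dot(I, N);
    double c2s = 1.0 - eta * eta * (1.0 - c1 * c1);
    if (c2s < 0.0) return false;             // total internal reflection
    double c2 = std::sqrt(c2s);
    T.x = eta * I.x + (eta * c1 - c2) * N.x;
    T.y = eta * I.y + (eta * c1 - c2) * N.y;
    T.z = eta * I.z + (eta * c1 - c2) * N.z;
    return true;
}
```

The appeal of this form is that it works directly on direction vectors, avoiding the angle bookkeeping where a sign or normalization slip produces exactly the kind of subtle smearing described above.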

Final Images Rendered with 512 samples per pixel

My Implementation

Reference

Telephoto

hw3telephoto_512

Double Gauss

hw3dgauss_512

Wide Angle

hw3wide_512

Fisheye

hw3fisheye_512

Experiment with Exposure

Image with aperture full open

Image with half radius aperture

Observation and Explanation

Just as expected, stopping the lens down to half the aperture radius lets less light in (or 'out'), decreasing the exposure and introducing more noise. Since the exposure time is held constant here, we are forced to increase the sensitivity of the sensor (i.e. go to a higher ASA), which in the classical sense would demand grainier film and in the modern sense yields a noisier signal. The depth of field also increases as the aperture shrinks, since the peripheral rays that earlier caused wider circles of confusion are now prevented from entering/leaving the lens.
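As a quick sanity check of the light loss: gathered light scales with aperture area, i.e. with the square of the radius, so halving the radius quarters the light, a loss of two full stops. A tiny helper (illustrative only, not part of the renderer) makes the arithmetic explicit:

```cpp
#include <cassert>
#include <cmath>

// Stops of light lost when the aperture radius is scaled by radiusScale.
// Light gathered is proportional to aperture area ~ radius^2, and one stop
// is a factor of two in light.
double stopsLost(double radiusScale) {
    return std::log2(1.0 / (radiusScale * radiusScale));
}
```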

Autofocus Simulation

Description of implementation approach and comments

(not complete yet)

Since derivatives of any order are far too sensitive to the Monte Carlo noise, I quickly opted for a DCT (Discrete Cosine Transform) based approach to the focus measure instead. Frequency-domain methods are generally better suited to noisy data, since high frequencies can easily be filtered out, and [Kristan 05] presents a very elegant method based on a simple 8x8 DCT and Bayes spectral entropy. The reasoning behind this approach is, in broad strokes, that the normalized spectrum of an unfocused image will exhibit strong low-frequency modes, while a focused image will have its frequencies more uniformly distributed. This property can be captured by a Bayes entropy function, which is essentially a sum of squares divided by a square of sums:

M(F) = 1 - SUM_{u,v} F(u,v)^2 / ( SUM_{u,v} |F(u,v)| )^2

where F(u,v) is the frequency-domain transform, e.g. the DCT. Kristan also shows that using a regular 8x8 DCT and simply averaging the focus measure over all 8x8 blocks in the focus area is sufficient for robustness, meaning I could use any of the abundant implementations intended for classic MPEG/JPEG image compression. I settled for a version of the Winograd DCT [Guidon 06]. Also, cutting off some unnecessary frequency content by limiting u + v <= t proved crucial to noise tolerance, and was a key reason I could step down from the initial 256 samples per pixel to somewhere around 64-128 while maintaining robustness.
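The measure can be sketched as below with a naive (non-Winograd) 8x8 DCT-II; this is my own illustrative reconstruction, and treating the sums as running over the AC coefficients only, with the u + v <= t cutoff applied, is an assumption about the exact summation range.

```cpp
#include <cassert>
#include <cmath>

// Naive 8x8 DCT-II (the Winograd version computes the same coefficients
// with far fewer multiplies).
void dct8x8(const double in[8][8], double out[8][8]) {
    const double pi = 3.14159265358979323846;
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v) {
            double s = 0.0;
            for (int x = 0; x < 8; ++x)
                for (int y = 0; y < 8; ++y)
                    s += in[x][y]
                       * std::cos((2 * x + 1) * u * pi / 16.0)
                       * std::cos((2 * y + 1) * v * pi / 16.0);
            double cu = (u == 0) ? std::sqrt(0.125) : 0.5;
            double cv = (v == 0) ? std::sqrt(0.125) : 0.5;
            out[u][v] = cu * cv * s;
        }
}

// M(F) = 1 - sum F^2 / (sum |F|)^2, over AC coefficients with u + v <= t.
// A single dominant mode gives M = 0; energy spread over n equal modes
// gives M = 1 - 1/n, so a more uniform (in-focus) spectrum scores higher.
double bayesEntropy(const double F[8][8], int t) {
    double sumSq = 0.0, sumAbs = 0.0;
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v) {
            if (u + v == 0 || u + v > t) continue;  // skip DC, cut high freqs
            sumSq  += F[u][v] * F[u][v];
            sumAbs += std::fabs(F[u][v]);
        }
    if (sumAbs == 0.0) return 0.0;   // flat block: no focus information
    return 1.0 - sumSq / (sumAbs * sumAbs);
}
```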

Fig 2: Comparison between DCT based and derivative-based approaches to measuring focus.

For the search, I first wrote a fairly naive version that finds the focal point by shooting parallel rays into the lens and solving for their point of intersection, and from that starting point performs a number of linear searches with increasing granularity and shrinking search intervals. Thanks to the performance gain from the DCT focus measure, I still land somewhere around 60-100 sec. total with reasonable robustness, which accounts for less than 10% of the total rendering time.
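The coarse-to-fine sweep can be sketched as follows; the function name and parameters are illustrative, and in the real pipeline focusMeasure would render the focus area and return the averaged DCT measure.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Repeated linear sweeps over a shrinking interval with finer steps,
// keeping the film distance with the best focus measure. 'guess' would be
// the focal point found from the parallel-ray intersection.
double AutofocusSearch(std::function<double(double)> focusMeasure,
                       double guess, double halfWidth,
                       int sweeps, int stepsPerSweep) {
    double best = guess;
    for (int s = 0; s < sweeps; ++s) {
        double bestScore = -std::numeric_limits<double>::infinity();
        double lo = best - halfWidth, hi = best + halfWidth;
        for (int i = 0; i <= stepsPerSweep; ++i) {
            double d = lo + (hi - lo) * i / stepsPerSweep;
            double m = focusMeasure(d);   // render + DCT measure in practice
            if (m > bestScore) { bestScore = m; best = d; }
        }
        halfWidth /= stepsPerSweep;       // shrink interval, raise granularity
    }
    return best;
}
```

Each sweep costs stepsPerSweep + 1 renders of the focus area, which is where the cheapness of the DCT measure (and the reduced sample counts it allows) pays off directly.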

References:
M. Kristan, J. Pers, M. Perse, S. Kovacic. "Bayes Spectral Entropy-Based Measure of Camera Focus". Faculty of Electrical Engineering, University of Ljubljana. 2005.
Y. Guidon. "8-tap Winograd DCT optimised for FC0". <http://f-cpu.seul.org/whygee/dct_fc0/dct_fc0.html>. 2002.

Final Images Rendered with 512 samples per pixel

Adjusted film distance

My Implementation

Reference

Double Gauss 1

61.5 mm

hw3afdgauss_closeup

Double Gauss 2

39.6 mm

hw3afdgauss_bg

Telephoto

116.2 mm

hw3aftelephoto

MattiasBergbom/Assignment3 (last edited 2006-05-09 06:27:41 by MattiasBergbom)