A report by Georg Petschnigg, June 2002
| Noisy ISO1600 Image | Noise Removed using FlashSharpen |
Taking images in low-light environments, where the scene illumination is part of the appeal of the shot, is tricky without a tripod. Imagine a candlelight dinner, a sculpture garden at night, a thoughtfully lit interior or a glowing holiday tree. Using a flash for these pictures ruins the ambient lighting, and tripods are cumbersome to use. As digital camera technology improves, cameras are able to take snapshots at higher ISO settings. Most of these so-called high-ISO images, however, exhibit significant noise. FlashSharpen is an algorithm that uses a flash image's noise characteristics to improve a noisy natural-light image. Given a camera that can take two images in rapid succession, one with flash and one without, this algorithm can be used to improve low-light photography.
Description
The motivation behind FlashSharpen was developed in my Computer Vision Project. The basic idea of FlashSharpen is to combine a flash image's high-frequency components with a non-flash image's low-frequency components.

Essentially, FlashSharpen uses the high-frequency components of the flash image to construct a plausible, noise-reduced high-frequency band for the non-flash image. The compositing operation can be expressed as follows:
noflash_new_highfrequency = noflash_highfrequency + (flash_highfrequency - noflash_highfrequency) * alpha;
or written as the OVER operator for frequency images:
noflash_new_highfrequency = noflash_highfrequency * (1- alpha) + flash_highfrequency * alpha;
The final image is constructed as follows:
noflash_noise_enhanced = noflash_new_highfrequency + noflash_lowfrequency
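The two ways of writing the compositing step are algebraically the same. The sketch below checks this on a few made-up values; the variable names `lerp_form` and `over_form` and all numbers are illustrative assumptions, not taken from the report.

```matlab
% Toy high-frequency values, invented purely for illustration.
nf_hf = [0.10 -0.20 0.05];   % non-flash high frequencies
f_hf  = [0.30 -0.10 0.00];   % flash high frequencies
alpha = [0.00  0.50 1.00];   % per-pixel blending weight

% First form: move from the non-flash value toward the flash value.
lerp_form = nf_hf + (f_hf - nf_hf) .* alpha;

% Second form: the OVER-style weighted sum.
over_form = nf_hf .* (1 - alpha) + f_hf .* alpha;

disp(lerp_form);                          % 0.10  -0.15  0.00
disp(max(abs(lerp_form - over_form)));    % zero up to rounding error
```

With alpha = 0 the non-flash value is kept unchanged, with alpha = 1 it is replaced by the flash value, and values in between blend the two.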
The main intellectual challenge is how to compute the blending operator alpha so that the result looks pleasant to the human eye. The difficulty is that shadows and specular highlights present in the flash image create unwanted edges, which translate into unwanted high frequencies. I spent almost the entire quarter attempting to solve this problem using Bayesian techniques and color segmentation, none of which yielded desirable results. Most of these efforts were misguided by a desire to "detect" artifacts such as shadows and speculars and to prevent the compositing in those regions. The current solution performs a frequency domain analysis and does not combine the frequency bands where they strongly disagree.
The algorithm works as follows:
1. Split the two images into their respective low-pass and high-pass components
% Low Frequencies
f_l = blur_using_gaussian(img_flash, SIGMA);
nf_l = blur_using_gaussian(img_noflash, SIGMA);
% High Frequencies
f_hf = img_flash - f_l;
nf_hf = img_noflash - nf_l;
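% Helper: normalized 1-D Gaussian kernel, truncated at +/- 3 sigma
function G = gauss(sig)
x = floor(-3*sig):ceil(3*sig);
G = exp(-0.5*x.^2/sig^2);
G = G/sum(G);   % weights sum to 1, so the blur preserves brightness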
The main challenge here is how to pick the filter width SIGMA for the Gaussian. The noisier an image, the greater SIGMA should be, because more of the non-flash image's high-frequency band has to be removed. I found that a value of 2 works for most images, and that larger values are needed as the image quality degrades. See the Very Noisy Image, SIGMA = 10, and SIGMA = 30. SIGMA sets the low-pass filter's cut-off frequency: the larger SIGMA is, the lower the cut-off.
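The listing above calls `blur_using_gaussian` without showing it. One way it might be written is sketched here, as a separable convolution with the same truncated kernel as `gauss`. The handling of color channels and the zero-padded borders implied by `'same'` are assumptions of this sketch, not details from the report.

```matlab
function out = blur_using_gaussian(img, sigma)
% Sketch only (save as blur_using_gaussian.m).
% Assumes img is a double array, either H-by-W or H-by-W-by-C.
    x = floor(-3*sigma):ceil(3*sigma);   % same support as gauss()
    g = exp(-0.5 * x.^2 / sigma^2);
    g = g / sum(g);                      % normalize so the weights sum to 1

    out = zeros(size(img));
    for c = 1:size(img, 3)
        % Separable filtering: g down the columns, then g along the rows.
        % 'same' keeps the input size and treats pixels outside the image
        % as zero, which darkens a band of about 3*sigma at the borders.
        out(:, :, c) = conv2(g, g, img(:, :, c), 'same');
    end
end
```

Calling `low = blur_using_gaussian(img, 2)` and then `high = img - low` gives a split for which `low + high` reproduces `img`. If the dark borders matter, replicating the edge pixels before filtering would be one alternative.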
2. Compute the cost function
Now I compute a cost function which prevents disagreeing frequency components from
being combined. My proposed measure is the absolute difference between each pixel's
high-frequency component and its counterpart in the other image.
%Cost function
cost = abs(f_hf - nf_hf);
The cost is then scaled to the range 0 to 1.
max_difference = max(cost (:)); % Find largest cost
cost = cost ./ max_difference; % Normalize to 0 - 1 range
cost = 1 - cost ; % Invert for compositing step
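To make the normalization concrete, here is the same sequence applied to four made-up pixel values. The numbers are illustrative assumptions only.

```matlab
% Invented high-frequency values for four pixels.
f_hf  = [0.30 -0.10  0.00  0.20];
nf_hf = [0.10 -0.20  0.05 -0.20];

cost = abs(f_hf - nf_hf);          % [0.20 0.10 0.05  0.40]
max_difference = max(cost(:));     % 0.40
cost = cost ./ max_difference;     % [0.50 0.25 0.125 1.00]
cost = 1 - cost;                   % [0.50 0.75 0.875 0.00]

disp(cost);
```

The pixel where the two images disagree most ends up with weight 0, so it keeps its non-flash value, while pixels that agree closely get weights near 1. Two things follow from dividing by the global maximum: identical inputs would give a maximum of 0 and need a guard against division by zero, and a single outlier pixel compresses the weights of all the others.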
Finally we have to weight the cost by how bad it would look to the human eye
if we were to blend a pixel at that particular location. Ideally this would be a
function that has some basis in the physiology of vision, but for the time being
I use a power function: the cost is raised to an exponent. This function weights
small differences more favorably than large ones. The exponent also determines
which image is favored overall. Because the cost lies between 0 and 1, an exponent
greater than one lowers it and favors the non-flash image, while an exponent less
than one raises it and favors the flash image. In my case I automatically
generate an exponent which attempts to bring the cost mean to 0.6. This cost
directly translates into alpha.
mean_cost = mean(cost (:));
exponent = log(0.6) / log(mean_cost);
cost = cost .^ exponent;
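A small check of what the automatic exponent does, using made-up cost values (an assumption for illustration). The formula maps the mean cost itself to 0.6; the mean of the weighted costs lands near that value but, in general, not exactly on it.

```matlab
% Invented inverted costs for four pixels.
cost = [0.50 0.75 0.875 0.00];

mean_cost = mean(cost(:));                % 0.53125
exponent  = log(0.6) / log(mean_cost);    % roughly 0.81
weighted  = cost .^ exponent;

disp(mean_cost ^ exponent);               % 0.6 by construction
disp(mean(weighted(:)));                  % near 0.6, not exactly 0.6
```

Costs of exactly 0 and 1 are left unchanged by the power, so strongly disagreeing pixels stay excluded whatever exponent is chosen.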
3. Final Composition using cost function
In the final step we combine the images using the OVER operator described
above.
nf_new_hf = nf_hf + (f_hf - nf_hf) .* cost;
fused_image = nf_l + nf_new_hf;
Since this operation is not energy preserving, out-of-bounds values have to be clamped to zero or one, respectively.
fused_image(fused_image < 0) = 0.0;
fused_image(fused_image > 1) = 1.0;
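For reference, the three steps can be strung together in a few lines. The sketch below runs them on synthetic grayscale data; the test scene, the noise levels and the inline `blur` helper are all assumptions made for illustration, and real inputs would be an aligned flash and non-flash photograph scaled to the range 0 to 1.

```matlab
SIGMA = 2;

% Synthetic stand-ins for the two photographs (invented data).
[xx, yy]    = meshgrid(linspace(0, 1, 64));
scene       = 0.5 + 0.4 * sin(8 * pi * xx) .* cos(6 * pi * yy);
img_flash   = min(max(scene + 0.01 * randn(64), 0), 1);        % bright, low noise
img_noflash = min(max(0.6 * scene + 0.10 * randn(64), 0), 1);  % dim, noisy

% Step 1: low and high frequencies (separable Gaussian blur).
x = floor(-3*SIGMA):ceil(3*SIGMA);
g = exp(-0.5 * x.^2 / SIGMA^2);
g = g / sum(g);
blur = @(im) conv2(g, g, im, 'same');   % hypothetical helper, zero-padded borders

f_l  = blur(img_flash);    f_hf  = img_flash   - f_l;
nf_l = blur(img_noflash);  nf_hf = img_noflash - nf_l;

% Step 2: inverted, normalized cost with the automatic exponent.
cost = abs(f_hf - nf_hf);
cost = 1 - cost ./ max(cost(:));
cost = cost .^ (log(0.6) / log(mean(cost(:))));

% Step 3: composite and clamp to [0, 1].
fused_image = nf_l + nf_hf + (f_hf - nf_hf) .* cost;
fused_image = min(max(fused_image, 0), 1);
```

The `min(max(...))` clamp is equivalent to the two assignments above. The sketch only exercises the data flow; it says nothing about how the result looks on real photographs.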
After spending so much time on this problem I am particularly pleased with the
results. See them here.
Download Matlab Code here (coming soon).
If the presented algorithm were implemented in a digital camera, the flash
could be used as an active sensor to denoise low ambient light images. With the
arrival of JPEG 2000, which stores images in a wavelet tree, the compositing
operation described above could be performed in real time, on compressed data,
with direct feedback to users. This would facilitate picking a blending exponent
(paired with a lookup table) via user interaction.
This algorithm relies on semi-decent-looking input images, but scales fully as
hardware and image quality improve.
Copyright © 2002 Georg Petschnigg