Assignment 7 - Wavelet Image Compression
Due Date: Thursday March 8th, 11:59PM
Questions? Check out the Assignment 7 FAQ and discussion page.
In this assignment you will write code to perform a simple Haar wavelet transform in order to compress an image.
[Images: Source | Transformed & Quantized | Reconstructed]
The Problem
Given an input image we want to transform it to make it more amenable to compression. There are 3 key steps to this process:
- Transformation - transform the data
- Quantization - the main lossy step: quantize the data, producing an approximation of the original values
- Coding - store the data efficiently
For this assignment we're going to use a simple wavelet transform to make the data more amenable to compression.
Since the quantization step is the main lossy step, we're going to add a user controlled parameter to control the quantization, which, in turn, controls the quality and compressed data size.
For this assignment we're not actually going to perform the coding. Instead, we'll calculate the entropy of the input and the quantized data, which is a close approximation to the encoded data size.
Download and build the starter code
1. Begin by downloading the Assignment 7 starter code and the latest version of the libST source here. libST has not changed for this assignment, but we are bundling it with the assignment zip file just as in the previous assignments.
2. Build the code. In this project, you will be modifying assignment7/wavelet.cpp. In the assignment7/ subdirectory we have placed a couple of sample image files. These files are losslessly compressed. You may use other files as well, but if they were compressed with a lossy scheme already, the results may not be what you would expect.
Review of the Haar Wavelet Transform
As discussed in lecture, wavelets are a tool for multi-resolution analysis. The Haar wavelet transform is the simplest wavelet and is straightforward to implement.
Each step of the wavelet transform takes an input signal of length n and generates a smoothed (lower resolution) version of length n/2 and a high frequency detail signal of length n/2. We can then continue to apply the same transform to the half size lower resolution signal to get an even lower resolution image and lower frequency details.
The Haar transform in particular is simple because it generates these smoothed and detail signals directly from pairs of samples. Given an input signal, each pair of terms input[2*i] and input[2*i+1] forms a single smoothed value smooth[i] and a single detail value detail[i], where i is in the range [0, length/2).
Given the input values A and B, the smoothed and detail values we output are:
S = (A + B) / 2
D = (A - B) / 2
Assuming the input values are in the range [0,1], this works well because it always produces average values in [0,1], so they still represent a valid, subsampled image. The differences are always in the range [-.5,.5] (consider the possible values of A and B to see why this is the case). To simplify quantization, we can just add .5 to all the detail values so that every value is in the range [0,1]:
S = (A + B) / 2
D = (A - B) / 2 + .5
The inverse transform, given S and D values, is given by
A = S + (D - .5)
B = S - (D - .5)
You can easily verify this by substitution.
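For example, substituting the forward formulas into the expression for A gives S + (D - .5) = (A + B)/2 + (A - B)/2 = A, and B follows the same way.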
The 2D version of the wavelet transform can be expressed as a series of 1D transforms on the rows and columns of the image. We'll use the non-standard decomposition, which is expressed, as pseudo-code, by the following:
for wavelet level = log_2(image_size) to 1 {
    for each row i { wavelet_transform_step(row(i)); }
    for each column j { wavelet_transform_step(column(j)); }
}
The inverse 2D transform is similar, but works backwards from the lowest to highest level and performs inverse wavelet transform steps.
The Code
The code breaks the wavelet transform into simple, easy to implement parts. You'll need to implement both the forward and inverse wavelet transformations and the quantization and reverse quantization steps. Note that these transforms are only defined for images whose dimensions are powers of 2.
There are only 3 displays for this assignment: the original image, the quantized image, and the reconstructed image. The entropy of the input and transformed, quantized image will be printed to the console when you adjust the compression quality parameter.
Forward Transform: 1D
Because you can perform a 2D wavelet transform as a series of 1D transform steps, we first want to implement a 1D wavelet transform step.
You need to implement the function
void forward_haar_1d(STColor* input, STColor* output, int length)
It should do one step of the forward transform specified above. It should compute the averages and differences and store them (averages in the first half, differences in the second) in the output array. The last parameter specifies the length of the transform, so you should get a total of length/2 averages and length/2 differences.
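To make the expected structure concrete, here is one possible sketch. It assumes STColor supports component-wise +, -, and scalar * and + operators; if libST's STColor does not, apply the same formulas to each color channel instead.

void forward_haar_1d(STColor* input, STColor* output, int length)
{
    int half = length / 2;
    for (int i = 0; i < half; ++i) {
        STColor a = input[2 * i];
        STColor b = input[2 * i + 1];
        output[i]        = (a + b) * 0.5f;          // smoothed value S, stored in the first half
        output[half + i] = (a - b) * 0.5f + 0.5f;   // detail value D, shifted into [0,1], stored in the second half
    }
}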
Forward Transform: 2D
The forward Haar transform for an image is performed by the function
void forward_haar_2d(STHDRImage& src, STHDRImage& dest, int transform_steps)
It should be performed using the non-standard decomposition - perform transform steps alternating between rows and columns, reducing the size of the transform by half each time until you are left with only a 1x1 subimage.
We've implemented these functions
void read_input_row(STHDRImage& src_img, int row, int length, STColor* tmp_input);
void read_input_col(STHDRImage& src_img, int col, int length, STColor* tmp_input);
void write_output_row(STHDRImage& dest_img, int row, int length, STColor* tmp_output);
void write_output_col(STHDRImage& dest_img, int col, int length, STColor* tmp_output);
to help you. They copy data to and from images and temporary arrays. Using these functions and the 1D transform function it should be relatively simple to break down the 2D Haar transform. There are global STColor* tmp_input and STColor* tmp_output arrays which are allocated for you that you can pass to these functions and the 1D transform function.
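As one possible sketch (not the only valid structure), the 2D pass could look like the following. It assumes a square, power-of-2 image, that GetWidth() is the width accessor (the actual libST name may differ), and that transform_steps is the number of levels to apply.

void forward_haar_2d(STHDRImage& src, STHDRImage& dest, int transform_steps)
{
    int length = src.GetWidth();   // assumes a square, power-of-2 image
    for (int step = 0; step < transform_steps && length > 1; ++step) {
        // Row pass: the first level reads from src, later levels re-read dest.
        STHDRImage& from = (step == 0) ? src : dest;
        for (int r = 0; r < length; ++r) {
            read_input_row(from, r, length, tmp_input);
            forward_haar_1d(tmp_input, tmp_output, length);
            write_output_row(dest, r, length, tmp_output);
        }
        // Column pass reads the row-transformed data back out of dest.
        for (int c = 0; c < length; ++c) {
            read_input_col(dest, c, length, tmp_input);
            forward_haar_1d(tmp_input, tmp_output, length);
            write_output_col(dest, c, length, tmp_output);
        }
        length /= 2;   // the next step operates only on the smoothed quadrant
    }
}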
Quantization
Given an STColor that we calculated in the previous steps, we need to quantize it so it can be stored efficiently. Because of the way we constructed the wavelet transform, all the terms are in the range [0,1], so the easiest way to quantize is simply to convert back to an STPixel. STPixels use unsigned chars for storage, so we are quantizing to 8 bits. However, we want some control over the quality. As discussed in class, we can devote fewer bits to higher subbands (higher frequencies) by dividing those terms by larger values so that they take on a smaller range of values. One example of a quantization factor you can use is:
output_i = p_i / (level^(qfactor^3))
where output_i is the output STColor, p_i is the input STColor, level is the level or subband of the wavelet term, and qfactor is a user-defined value which controls the quality. Notice that when qfactor = 0 this factor is always 1 and has no effect. level is 0 for the scaling function (the one average term that remains when the transform is complete) and increases with higher frequency details. Note that you often have to special-case the scaling function because it's defined to be at level 0, which breaks some formulas (like the one above).
You need to fill in the
void quantize(STHDRImage& src, STImage& dest)
function to convert the input floating-point image src into the quantized dest image. We've provided a helper function
int get_wavelet_level(int c, int r)
which, for a given column and row, gives the wavelet subband that pixel is in. These levels increase in value with higher frequency detail coefficients.
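As a starting point, here is a rough sketch of quantize() using the level-based factor above. It assumes qfactor is the global quality parameter in wavelet.cpp, that STColor can be divided by a float, that ToPixel() converts a [0,1] STColor into an 8-bit STPixel, and that GetWidth()/GetHeight()/GetPixel()/SetPixel() are the image accessors (check the real libST names); powf comes from <cmath>.

void quantize(STHDRImage& src, STImage& dest)
{
    for (int r = 0; r < src.GetHeight(); ++r) {
        for (int c = 0; c < src.GetWidth(); ++c) {
            STColor p = src.GetPixel(c, r);
            int level = get_wavelet_level(c, r);
            if (level > 0) {
                // Higher subbands are divided by larger factors, so they use fewer of the 256 codes.
                float factor = powf((float)level, powf(qfactor, 3.0f));
                p = p / factor;
            }
            // level == 0 is the scaling function: store it unscaled.
            dest.SetPixel(c, r, p.ToPixel());
        }
    }
}

Whatever scheme you pick, keep it invertible (up to the 8-bit rounding) so reverse quantization can undo it.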
Feel free to use any quantization scheme that works, but this is a good starting point. Other schemes may give more intuitive control over quality. This scheme is by no means the best, but is simple and easy to implement. Another example of a simple to implement quantization scheme is thresholding: use the qfactor parameter to control which terms are dropped to .5 (which is equivalent to 0 since we made the adjustment when calculating differences).
if (abs(p_i - .5) < qfactor)
    output_i = .5
else
    output_i = p_i
This behaves very differently from the other method described above. Also note that you don't necessarily want your function to behave linearly with qfactor; you may want to use powers of qfactor to get more intuitive control over quality. For example, in this thresholding scheme, to get more even changes in quality you probably want to compare against qfactor^3 or qfactor^4. Please change the initial value for qfactor and its increment/decrement size to make sense for the scheme you implement.
At this point you can check the resulting transformed and quantized image against the sample at the top of the page. The exact image you get will depend on your quantization scheme, but it should look something like the one given above: with a high quality factor, details will be visible in the higher subbands, but much of the image will be some shade of gray. As quality goes down, less detail should be visible. If you just want to test the wavelet transform, you can just use ToPixel() without any scaling for the quantization step and you should get a result very similar to the one at the top of this page.
Reverse Quantization
In this step we just reverse the quantization step we did above. We can simply convert back to an STColor and do the inverse of the previous function, so the two functions should look very similar.
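For example, if you used the level-based scheme above, the reverse step could look roughly like this (the function name, signature, and accessors here are hypothetical; match whatever is declared in wavelet.cpp):

void unquantize(STImage& src, STHDRImage& dest)
{
    for (int r = 0; r < src.GetHeight(); ++r) {
        for (int c = 0; c < src.GetWidth(); ++c) {
            STColor p(src.GetPixel(c, r));   // assumes an STPixel-to-STColor conversion exists
            int level = get_wavelet_level(c, r);
            if (level > 0) {
                float factor = powf((float)level, powf(qfactor, 3.0f));
                p = p * factor;              // undo the division performed in quantize()
            }
            dest.SetPixel(c, r, p);
        }
    }
}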
Inverse Transform: 1D
This is similar to the forward 1D transform but you are taking as input the averages and differences and outputting the reconstructed values.
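A matching sketch of the inverse step, under the same STColor operator assumptions as the forward sketch (the function name here is hypothetical; use the one declared in wavelet.cpp):

void inverse_haar_1d(STColor* input, STColor* output, int length)
{
    int half = length / 2;
    for (int i = 0; i < half; ++i) {
        STColor s = input[i];            // smoothed value S from the first half
        STColor d = input[half + i];     // detail value D (still offset by .5) from the second half
        output[2 * i]     = s + (d - 0.5f);   // A = S + (D - .5)
        output[2 * i + 1] = s - (d - 0.5f);   // B = S - (D - .5)
    }
}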
Inverse Transform: 2D
This is similar to the forward 2D transform, but reversed.
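In sketch form, assuming the same accessors as the forward 2D sketch (names hypothetical), the inverse pass copies the transformed data over, then works from the smallest block back up to the full image, undoing columns before rows at each level:

void inverse_haar_2d(STHDRImage& src, STHDRImage& dest, int transform_steps)
{
    int full = src.GetWidth();   // assumes a square, power-of-2 image
    // Copy src into dest so the untouched detail coefficients carry over.
    for (int r = 0; r < full; ++r) {
        read_input_row(src, r, full, tmp_input);
        write_output_row(dest, r, full, tmp_input);
    }
    if (transform_steps <= 0) return;
    int length = full >> (transform_steps - 1);   // size at which the last forward step ran
    for (int step = 0; step < transform_steps; ++step) {
        // Undo columns first, then rows (the reverse of the forward order).
        for (int c = 0; c < length; ++c) {
            read_input_col(dest, c, length, tmp_input);
            inverse_haar_1d(tmp_input, tmp_output, length);
            write_output_col(dest, c, length, tmp_output);
        }
        for (int r = 0; r < length; ++r) {
            read_input_row(dest, r, length, tmp_input);
            inverse_haar_1d(tmp_input, tmp_output, length);
            write_output_row(dest, r, length, tmp_output);
        }
        length *= 2;   // the next level covers a block twice as large
    }
}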
At this point, with the default quantization setting, you should get almost exactly the original image back.
Additional hints and tips
- The forward and inverse functions are so similar that you may want to copy the code from the forward to the inverse and then make the necessary changes.
- Because we have positive and negative values, the quantized values will be centered around 128. This will make your quantized image gray in most places. Note, however, that this doesn't affect compression since an entropy coder isn't affected by which symbols are most probable.
- Remember that the scaling function (i.e. the average of the entire image, which gets stored at (0,0)) needs to be handled differently during quantization.
- Note the data types of the inputs and outputs of the various steps and be careful about the range of values you output.
- It can sometimes be hard to find errors in your code because of the repeated application of the functions. If you are having trouble debugging your code, try forcing your loop to perform only one step of the 2D transform. Once the quantized image looks right after one step, work on getting multiple steps to work.
Submission Instructions
We would like the following items to be placed in a zip file called LastnameFirstnameAssgn7.zip as the handin for this assignment:
- Your modified copy of wavelet.cpp
- Answers to the following questions:
  - What kind of errors are caused by this lossy compression?
  - Are the low frequency detail basis images really low frequency? (Think about the graphs of the basis functions.)
  - Why is the Haar basis not the best choice for image compression?
  - What quantization method did you finally decide to use? Can you justify using the one you chose instead of another?
Please email this zip file to cs148-win0607-staff@lists.stanford.edu before the deadline. Please make the subject line of this email "CS148 Assignment 7 Handin".
Grading
- 1 point: Answered questions
- 2 points: Answered questions + forward transform (with simplest quantization so it can be verified visually)
- 3 points: Answered questions + forward transform + quantization
- 4 points: Answered questions + forward transform + quantization + inverse transform
We'll give 1 point of extra credit if you implement a higher order wavelet filter, for example Daubechies 4. The basic structure is the same. In fact, replacing the forward and inverse 1D transform functions is all you need to do, but larger filters require you to handle borders carefully.