Assignment 2 Lazy KD-Tree
Julie Tung
Date submitted: 26 Apr 2006
Code emailed: 26 Apr 2006
Description of implementation approach and comments
Storage strategies:
A lot of temporary storage is needed (to store edges and the prims0 and prims1 arrays) for the splitting that occurs during the rendering stage now, because it's building the kd-tree lazily. At first I just allocated the exact size that I needed for each of these data structures, used them in the split, and then freed them. However, since this was done for every time a lazy node was split into an interior node, this became extremely time-wasting. Instead, I created global temporary data structures, first initializing them to a particular size. Before each use, I check to make sure the size is sufficient; if so, I proceed, if not, then I reallocate the structures and delete the old ones. Using this scheme, the structures are reallocated very few times, and we just keep reusing the same space, although we're frequently holding more space than we need, it's worth it to avoid the malloc/free costs each time. Also, since the data structures are global, all the kd-trees can share this same temporary storage. This temporary storage optimization saved my rendering time about 20% on the killeroos-1 image.
For the KdAccelNode, I was able to keep the storage for the struct to 8 bytes by doing the following: 1. Since we can no longer position one child next to the parent node, instead position the two children next to each other, so that you only have to keep the index for one child, and the other's position is that + 1. 2. In the flags variable, which distinguishes between interior and leaf nodes, I marked lazy nodes with the same flag as the leaf (having 1's in the last 2 bits). Then to distinguish between leaf and lazy nodes, I used the least significant bit in the pointer that points to the MailboxPrims in the leaf case or the extra info in the lazy node case. Since pointers are byte-aligned, the last two bits don't matter. So every time that I access the variable as a pointer, I mask out those last two bits, and when I need to distinguish between the two node types, I check that last bit. For the lazy node, all the storage is contained in dynamically allocated storage pointed to by the info pointer, which contains the BBox (bounding box), number of badRefines, and the indices of the primitives contained in the lazy node.
My implementation also lazily refines the primitives by not fully refining in the beginning. When it needs to intersect a particular primitive, it refines it one step and creates a new kd-tree from that somewhat-refined list of primitives, so that nodes in the top-level kd-trees contain kd-trees as primitives themselves.
Final Images Rendered with my implementation of heightfield.cpp
killeroos-view1.pbrt (Killeroos visible)
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
19.8 |
0.0 |
0% |
total time (secs) |
30.9 |
28.0 |
90.6% |
Num of nodes made |
2.754M |
1359.9k |
49.4% |
Triangle ray intersections |
673.2k |
826.8k |
122.8% |
"killeroos-view2.pbrt (Killeroos invisible)"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
19.1 |
0.0 |
0% |
total time (secs) |
28.2 |
9.9 |
35.1% |
Num of nodes made |
2.754M |
47 |
.002% |
Triangle ray intersections |
758.4k |
776.1k |
102.3% |
"killeroos-view3.pbrt (close-up)"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
20.1 |
0.0 |
0% |
total time (secs) |
32.2 |
20.6 |
63.9% |
Num of nodes made |
2.754M |
373.2k |
13.5% |
Triangle ray intersections |
644.2k |
1.018M |
158.1% |
"plants-view1.pbrt"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
49.9 |
0.1 |
0.2% |
total time (secs) |
493.8 |
844.6 |
171% |
Num of nodes made |
14.838M |
4.044M |
27% |
Triangle ray intersections |
20.264M |
27.614M |
136.3% |
"plants-view2.pbrt"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
49.8 |
1.2 |
2.4% |
total time (secs) |
784.3 |
1472.1 |
187.7% |
Num of nodes made |
14.838M |
5.157M |
34.8% |
Triangle ray intersections |
25.894M |
31.217M |
120.5% |