Assignment 2: Lazy K-D Tree
Due: Thursday April 26th, 11:59PM
Please link to your final Assignment 2 writeup wiki page from Assignment2Writeups.
Description
In this assignment, you will improve pbrt's implementation of the k-d tree accelerator. As described in Chapter 4 of the textbook, before rendering, pbrt constructs a k-d tree containing all geometric primitives in the scene. This approach can be inefficient for several reasons. First, as you will observe in this assignment, building a complete k-d tree for a scene containing many geometric elements can be very costly. Depending on the camera's location and how objects in the scene occlude others from view, it is possible that many k-d tree nodes created in the initial build process may never be hit by any rays. In this case, both computation spent building these nodes and, more importantly, the memory used to store them is wasted. Your objective in this assignment is to modify pbrt's k-d tree accelerator to lazily build the acceleration structure as rays are shot through the scene.
In the images above show the same scene rendered from three different camera positions. The scene contains 76 killeroos and each killeroo is a subdivision surface shape with a base mesh consisting of 8316 triangles. Six of these killeroos (the baby killeroos) undergo one level of subdivision, so the resulting meshes contain 33262 triangles. In total, the killeroos constitute 775,692 triangles. Regardless of the camera position used, pbrt preprocesses the scene into a k-d tree containing 9.7 million nodes. pbrt's statistics for the scene are given below:
Geometry Total shapes created 781.9k Triangle Ray Intersections 656.7k:4838.6k (13.57%) Triangles created 781.7k Kd-Tree Accelerator Avg. number of primitives in leaf nodes 12.148M:4.886M (2.49x) Interior kd-tree nodes made 4.886M Leaf kd-tree nodes made 4.886M Maximum number of primitives in leaf node 395
This preprocess can expensive. On my laptop the k-d tree build takes about 30 seconds, while rendering the scene using the k-d tree requires approximately 25 seconds. Additionally, it consumes a large amount of memory. Even with pbrt's compact 8 byte representation of a tree node, the 9.7 million nodes consume 77MB of memory (plus an additional 48MB to store the references to primitives from tree leaf nodes). One solution is to build the tree on demand while tracing rays through the scene. Delayed computation or lazy evaluation is a common technique used in many computer science algorithms. The point of this assignment is to implement a lazy kd-tree to improve pbrt's performance when rendering complex scenes whose geometry extends well beyond the space traversed by the rays generated to render an image, such as in images at center and at right. Your implementation must be lazy in two key ways: (1) kd-tree nodes will be constructed dynamically as needed, and (2) scene objects will be refined into their primitive components only when required.
Step 1: Understanding pbrt's k-d tree
It is critical that you first obtain a detailed understanding of the current k-d tree implementation in pbrt. Read Section 4.4 of the textbook and make sure you can follow the code in kdtree.cpp, located in the accelerator directory of the code base. You should be able to answer the following questions about the current implementation (we do not expect you to hand in answers to these questions).
- Describe the memory layout of the k-d tree representation. pbrt takes takes great care to minimize the storage required by a single k-d tree node. Since k-d trees may have millions of nodes, optimizing the memory footprint of nodes is important for both performance and memory utilization. In particular, notice how pbrt saves space by cleverly positioning child nodes in relation to a parent node. This scheme may need to change in your lazy implementation.
- Describe the heuristics of choosing the splitting plane and the implementation of the cost function. You may choose to alter the cost function in this assignment.
- What input data is required to build a new k-d tree node? Your implementation will need to be able to access this data dynamically as your lazy k-d tree is generated during the rendering process.
Understand the use of the badrefines variable in kdtree.cpp.
Understand how ray traversal through the k-d tree structure works (in the method KdTreeAccel::Intersect). Your lazy imeplementation will need to modify this algorithm to generate new k-d tree nodes while traversing.
Notice that the current k-d tree implementation refines scene objects into their geometric primitives before building the complete k-d tree. What problems does this approach present for lazy evaluation? In particular, look at the implementation of Refine in the loop subdivision shape. What are the benefits of delaying the refine of such objects.
Step 2: Time pbrt's original k-d tree build
The rendering time reported by pbrt does not include the time spent building the acceleration structure. Use the ProgressReporter class to measure the amount of time it takes pbrt to create a k-d tree (see the CreateAccelerator function in kdtree.cpp. You'll need to wrap the creation of the KDtreeAccel like this:
ProgressReporter progress(1, "Building KDTree"); KdTreeAccel* accel = new KdTreeAccel(prims, isectCost, travCost, emptyBonus, maxPrims, maxDepth); progress.Update(); progress.Done();
Step 3: Implement a Lazy Kd-tree
Begin by downloading the Assignment 2 pbrt scene files located at http://graphics.stanford.edu/courses/cs348b-07/assignment2/assignment2.zip.
manykilleroos.pbrt is the scene file you will render in this assignment. This main scene file includes the files killeroo_subdiv0.pbrt, killeroo_subdiv1.pbrt, and killeroo_row.pbrt. At the top of manykilleroos.pbrt you will notice 3 different camera positions specified by different LookAt transforms which correspond to the images shown above.
Given what you have learned from studying the existing pbrt k-d tree implementation, modify kdtree.cpp so that the k-d tree accelerator that uses lazy evaluation to gain efficiency for complex scenes. Be sure to keep a copy of the original k-d tree accelerator implementation around for use in later assignments and for debugging ans test in this assignment. (Alternatively you may want to implement your lazy k-d tree accelerator as an entirely different class in pbrt. This is more work, as it requires modification of the pbrt project and Makefiles, and some surrounding pbrt code, but would allow you to select between your and the original k-d tree implementation by only changing the scene file, not recompiling the k-d tree accelerator module. Ask for help if you cannot figure out how to do this on your own, and wish to do so).
This assignment is extremely open ended, and there is more than one way to correctly implement the assignment. Your grade on this assignment will depend largely on your ability to describe the strategies you tried and your evaluation of them on the test scene. Here are some suggestions to get you started:
- Think about how much of the tree you want to build during the initial build process. You may want to begin with a single node or initially build the tree out to an certain depth. Note that in the limit, the entire tree is constructed in the preprocess, and you have returned to pbrt's original k-d tree implementation.
- You will need to have a mechanism to mark a node as a 'lazy' node, ie. a node that will need to be refined further when intersected by a ray. Hint: Lazy nodes will need a means to access all the data necessary to construct the remaining subtree.
- It is likely that your implementation will need to manage a number of temporary data structures (see previous comment). The performance of your implementation will depend on our choices related to memory management.
- Consider ways to be lazy about the refinement of objects into their geometric primitives (triangles in the case of this assignment). It may be more efficient to build a tree of conglomerate objects and then refine and build k-d trees for these objects later. As discussed in class, a k-d tree need not contain only base primitives, it might contain sub k-d trees as well.
Advanced Suggestions
- Pbrt currently assumes a constant intersection cost for all primitives when building a k-d tree. This is reasonable since it fully refines all primitives before insertion into the tree. The presence of unrefined conglomerate objects means that the intersection cost of each object in the k-d tree might vary widely. How might you account for this in your implementation?
- Tune your kd-tree layout for optimal efficiency (minimize storage, improve layout for cache performance, etc). The current pbrt implementation uses 8 bytes per k-d tree node.
- Experiment with different heuristics for making node splitting decisions. Do view dependent heuristics work well?
Step 4: Evaluation Part 1
The evaluation of the quality of your implementation is an important part of this assignment. Test your lazy k-d tree implementation by rendering all 3 views of the killeroo scene and compare your results with the regular pbrt k-d tree implementation. Write up a report that addresses the following:
- Precisely describe your lazy kd-tree implementation.
- What was your node representation?
- What did you compute in the preprocess? How many levels did you build at the start?
- How many levels did you build when dynamically generating the tree?
- When did you refine scene objects?
- Describe all heuristics you changed or tried?
- Fill out the following table for each of the three views:
|
pbrt KD Tree |
my Lazy KD Tree |
Ratio |
build time (secs) |
- |
- |
- |
total time (secs) |
- |
- |
- |
nodes made |
- |
- |
- |
Triangle ray intersections |
- |
- |
- |
- Is the render time (not counting k-d tree build) using your lazy k-d tree as fast as it was when using the regular pbrt implementation. If there are differences in performance, describe why you think this is the case.
- Describe how any heuristics or advanced optimizations you tried effected the above numbers.
Step: 5 Render on a BIG scene
Lastly, we'd like you increase subdivision on all 76 killeroos in the scene to be at least level 1. To do this, modify killeroo_row.pbrt so that the file includes copies of killeroo_subdiv1.pbrt instead of killeroo_subdiv0.pbrt. Now the scene, fully refined, contains 2.5 million triangles. Render the scene from the close up viewpoint using your lazy k-d tree implementation, and compare statistics with the original pbrt implementation 'if possible. Note that on systems with only about a gig of RAM a non-lazy k-d tree implementation will cause pbrt to swap. We hope your system can complete a rendering using pbrt's original k-d tree implementation. Note that if you lazily built a tree, but did not implement lazy refinement, your memory footprint will continue to be very large (why is this the case?). This test highlights the importance of lazy refinement and building. It makes rendering very large scenes possible when the entire scene is not in view. Include the results of this test in your writeup.
Step 6: Submission
Please create a wiki page called FirstnameLastname/Assignment2 for your writeup and link to this page from the Assignment2Writeups page.
- Prevent other users from seeing this page by adding the following line to the top of the page (this line is case and whitespace sensitive)
#acl YourWikiUsername:read,write,admin All:
Upload the 4 required image files (in EXR or png format, rendered at 400x400) as attachments to this wiki page. The four required images are the 3 views of the scene + the close up view using all killeroos at subdivision level 1 (described in step 5). Note that you can link to the images to display them on your wikipage using attachment:filename.
Upload your code (kdtree.cpp) as an attachment to the wiki page. If you modified additional files include all your code in a single zip file.
Grading
This assignment (as well as all future assignments in the class) will be graded on a 4 point scale:
- 1 point: code does not work correctly: either produces incorrect visual results or is not lazy
- 2 points: Functional implementation of lazy build but no lazy refinement
- 3 points: Implementation of lazy build and lazy refinement, code can produce all 4 images correctly and outperforms a non lazy build. Minimal writeup.
- 4 points: Completion of all requirements for 3 points + clear writeup addressing points discussed in steps 4 and 5. Optional: experimentation and documentation of new heuristics or advanced techniques.
Extra credit will be given for exceptional experimentation with heuristics or techniques to improve the performance or quality (for example: the number of nodes created) of your lazy k-d tree implementation.