Assignment 2: Lazy K-D Tree

Due: Thursday April 26th, 11:59PM

Please link to your final Assignment 2 writeup wiki page from Assignment2Writeups.

Answers to common questions are posted on Assignment2Discussion.


In this assignment, you will improve pbrt's implementation of the k-d tree accelerator. As described in Chapter 4 of the textbook, before rendering, pbrt constructs a k-d tree containing all geometric primitives in the scene. This approach can be inefficient for several reasons. First, as you will observe in this assignment, building a complete k-d tree for a scene containing many geometric elements can be very costly. Depending on the camera's location and how objects in the scene occlude others from view, it is possible that many k-d tree nodes created in the initial build process may never be hit by any rays. In this case, both computation spent building these nodes and, more importantly, the memory used to store them is wasted. Your objective in this assignment is to modify pbrt's k-d tree accelerator to lazily build the acceleration structure as rays are shot through the scene.

topview mediumrange closeup

The images above show the same scene rendered from three different camera positions. The scene contains 76 killeroos and each killeroo is a subdivision surface shape with a base mesh consisting of 8316 triangles. Six of these killeroos (the baby killeroos) undergo one level of subdivision, so the resulting meshes contain 33262 triangles. In total, the killeroos constitute 775,692 triangles. Regardless of the camera position used, pbrt preprocesses the scene into a k-d tree containing 9.7 million nodes. pbrt's statistics for the scene are given below:

    Total shapes created                                    781.9k
    Triangle Ray Intersections                              656.7k:4838.6k (13.57%)
    Triangles created                                       781.7k
Kd-Tree Accelerator
    Avg. number of primitives in leaf nodes                 12.148M:4.886M (2.49x)
    Interior kd-tree nodes made                             4.886M
    Leaf kd-tree nodes made                                 4.886M
    Maximum number of primitives in leaf node               395

This preprocess can expensive. On my laptop the k-d tree build takes about 30 seconds, while rendering the scene using the k-d tree requires approximately 25 seconds. Additionally, it consumes a large amount of memory. Even with pbrt's compact 8 byte representation of a tree node, the 9.7 million nodes consume 77MB of memory (plus an additional 48MB to store the references to primitives from tree leaf nodes). Imagine the amount of memory required if we were rendering a battle scene from Lord of the Rings!

One solution to this problem is to build the tree only when needed. In the case of ray tracing, we only build a tree if a ray enters the region of space occupied by the model, this case one of the killaroos. Delayed computation or lazy evaluation is a common technique used in many computer science algorithms. The point of this assignment is to implement a lazy kd-tree to improve pbrt's performance when rendering complex scenes whose geometry extends well beyond the space traversed by the rays generated to render an image, such as in images at center and at right. Your implementation must be lazy in two key ways: (1) kd-tree nodes will be constructed dynamically as needed, and (2) scene objects will be refined into their primitive components only when required.

Step 1: Understanding pbrt's k-d tree

It is critical that you first obtain a detailed understanding of pbrt's current k-d tree implementation. Read Section 4.4 of the textbook and make sure you can follow the code in kdtree.cpp, located in the accelerator directory of the code base. You should be able to answer the following questions about the current implementation (we do not expect you to hand in answers to these questions).

  • Describe the memory layout of the k-d tree representation. pbrt takes takes great care to minimize the storage required by a single k-d tree node. Since k-d trees may have millions of nodes, optimizing the memory footprint of nodes is important for both performance and memory utilization (higher memory utilization increases cache misses while tracing rays and therefore decreases performance). In particular, notice how pbrt saves space by cleverly positioning child nodes in relation to a parent node. This scheme may need to change in your lazy implementation.
  • Describe the heuristics of choosing the splitting plane and the implementation of the cost function. You may choose to alter the cost function in this assignment.
  • What input data is required to build a new k-d tree node? Your implementation will need to be able to access this data dynamically as your lazy k-d tree is generated during the rendering process.
  • Understand the use of the badrefines variable in kdtree.cpp.

  • Understand how ray traversal through the k-d tree structure works (in the method KdTreeAccel::Intersect). Your lazy implementation will need to modify this algorithm to generate new k-d tree nodes while traversing.

  • Notice that the current k-d tree implementation refines scene objects into base geometric primitives before building the complete k-d tree. What problems does this approach present for lazy evaluation? In particular, look at the implementation of Refine in the loop subdivision shape. What are the benefits of delaying the refine of such objects?

Step 2: Time pbrt's original k-d tree build

The rendering time reported by pbrt does not include the time spent building the acceleration structure. Use the ProgressReporter class to measure the amount of time it takes pbrt to create a k-d tree (see the CreateAccelerator function in kdtree.cpp. Modify the code to time the creation of the KDtreeAccel like this:

  ProgressReporter progress(1, "Building KDTree");

  KdTreeAccel* accel = new KdTreeAccel(prims, isectCost, travCost,
                                       emptyBonus, maxPrims, maxDepth);


Step 3: Implement a Lazy K-D Tree

Begin by downloading the Assignment 2 pbrt scene files located at

manykilleroos.pbrt is the scene you will render in this assignment. This main scene file includes the files killeroo_subdiv0.pbrt, killeroo_subdiv1.pbrt, and killeroo_row.pbrt. At the top of manykilleroos.pbrt you will notice 3 different camera positions specified by different LookAt transforms which correspond to the 3 camera position in the images shown above.

Modify kdtree.cpp so that the k-d tree accelerator uses lazy evaluation to gain efficiency for complex scenes. Be sure to keep a copy of the original k-d tree accelerator implementation around for use in later assignments and for debugging and test in this assignment. (Alternatively, you may want to implement your lazy k-d tree accelerator as a new class in pbrt. This is more work, as it requires modification of the pbrt project/Makefiles, and some of the surrounding pbrt code, but would allow you to select between your and the original k-d tree implementation by only changing the scene file, not recompiling the k-d tree accelerator module. Ask for help if you wish to do this, and cannot figure out how to do so on your own).

This assignment is extremely open ended, and there are many ways to correctly implement the assignment. In addition to the correctness of your code, your grade will depend largely on your ability to describe the strategies you tried and your evaluation of them on the test scene. Here are some suggestions to get you started:

  • Think about how much of the tree you want to build during the initial build process. You may want to begin with a single node or initially build the tree out to an certain depth. Note that in the limit, the entire tree is constructed in the preprocess, and you have returned to pbrt's original k-d tree implementation.
  • You will need to have a mechanism to mark a node as a 'lazy' node, ie. a node that will need to be refined further when intersected by a ray. Hint: Lazy nodes will need a means to access all the data necessary to construct the remaining subtree.
  • It is likely that your implementation will need to manage a number of temporary data structures (see previous comment). The performance of your implementation will depend on our choices related to memory management.
  • Consider ways to be lazy about the refinement of objects into their base geometric primitives (triangles in the case of this assignment). It may be more efficient to build a tree of conglomerate objects and then refine and build k-d trees for these objects later. As discussed in class, a k-d tree need not contain only base primitives, it might contain sub k-d trees as well.

Advanced Suggestions

  • Pbrt currently assumes a constant intersection cost for all primitives when building a k-d tree. This is reasonable since it fully refines all primitives before insertion into the tree. The presence of unrefined conglomerate objects means that the intersection cost of each object in the k-d tree might vary widely. How might you account for this in your implementation?
  • Tune your k-d tree layout for optimal efficiency (minimize storage, improve layout for cache performance, etc). The current pbrt implementation uses 8 bytes per k-d tree node.
  • Experiment with different heuristics for making node splitting decisions. Do view-dependent heuristics work better than the view-independent surface heuristic?

Debugging suggestions

  • We've included a file killeroo_stub.pbrt in the assignment zip file. Replacing killeroo_row.pbrt's references to killeroo_subdiv0.pbrt with killeroo_stub.pbrt will replace all the big killeroos in the scene with similarly sized spheres. This will significantly decrease the scene's k-d tree build and rendering times, which might be useful to you during debugging and test.

  • Also remember tricks such as decreasing the image size, decreasing sampling rates, and using cropwindow to speed up your rendering times.

Step 4: Evaluation Part 1

The evaluation of the quality of your implementation is an important part of this assignment. Test your lazy k-d tree implementation by rendering all 3 views of the killeroo scene and compare your results with the regular pbrt k-d tree implementation. Write up a report that addresses the following:

  • Precisely describe your lazy k-d tree implementation.
    • Describe your node representation? How many bytes are required to store a node?
    • What did you compute in the preprocess? How many levels did you build at the start?
    • How many levels do you build when dynamically generating new nodes in the tree?
    • When do you refine scene objects?
    • Describe all heuristics you changed or tried?
  • Fill out the following table for each of the three views:

pbrt K-D Tree

my Lazy K-D Tree

Ratio (lazy/non-lazy reference)

build time (secs)




total time (secs)




nodes made




Triangle ray intersections




  • Is the render time (not counting k-d tree build) using your lazy k-d tree as fast as it was when using the regular pbrt implementation? If there are differences in performance, describe why you think this is the case.
  • Describe how any heuristics or advanced optimizations you tried effected the above numbers.

Step: 5 Render on a BIG scene

Lastly, we'd like you increase subdivision on all 76 killeroos in the scene to be at least level 1 (the 6 baby killeroos are already subdivided to this level). To do this, modify killeroo_row.pbrt so that the file includes copies of killeroo_subdiv1.pbrt instead of killeroo_subdiv0.pbrt. Now the scene, fully refined, contains 2.5 million triangles. Render the scene from the close up viewpoint using your lazy k-d tree implementation, and compare statistics with the original pbrt implementation if possible. Note that on systems with only a gig of RAM, using the original non-lazy k-d tree implementation many cause pbrt to swap. We hope your system can complete a rendering using pbrt's original k-d tree implementation. Note that if you lazily built a tree, but did not implement lazy refinement, your memory footprint may still be very large (why is this the case?). This test highlights the importance of lazy refinement and building. It makes rendering very large scenes possible when the entire scene is not in view. Include the results of this test in your writeup.

Step 6: Submission

  • We've created wiki pages (FirstnameLastname/Assignment2) for all students in the class. Access to these pages is set up so that only you can view your page. Please compose your writeup on this page and link to it from the Assignment2Writeups page.

  • Upload the 4 required image files (in EXR or png format, rendered at 400x400) as attachments to this wiki page. The four required images are the 3 views of the scene + the close up view using all killeroos at subdivision level 1 (described in step 5). Note that you can link to the images to display them on your wikipage using attachment:filename.

  • Upload your code (kdtree.cpp) as an attachment to the wiki page. If you modified additional files include all your code in a single zip file.


This assignment (as well as all future assignments in the class) will be graded on a 4 point scale:

  • 1 point: Code does not work correctly: either produces incorrect visual results or is not lazy
  • 2 points: Functional implementation of lazy build but no lazy refinement
  • 3 points: Implementation of lazy build and lazy refinement, code can produce all 4 images correctly and outperforms a non-lazy build on the close up image (configuration 3). Writeup is minimal.
  • 4 points: Completion of all requirements for 3 points + clear writeup addressing points discussed in steps 4 and 5. Optional: experimentation and documentation of new heuristics or advanced techniques.

Extra credit will be given for exceptional experimentation with heuristics or techniques to improve the performance or quality (for example: the number of nodes created) of your lazy k-d tree implementation.