Differences between revisions 14 and 15
Deletions are marked like this. | Additions are marked like this. |
Line 31: | Line 31: |
== Run Machine == The tests were run on the machine fileserver.stanford.edu, which is a gentoo box. Intel 2.40 Gz, 1 Gig Ram, 3 Gig Swap. The swap is important because for the plants tests I ran over my memory using both the lazy and the real kdtree. |
|
Line 102: | Line 106: |
||build time (secs) || 36.3 || 5.4 || __% || ||total time (secs) || 72.7 || 60.8 || __% || ||Num of nodes made || 2.752M || 704.1k || __% || || Triangle ray intersections || 673.2k || 673.2k || __%|| |
||build time (secs) || 36.3 || 5.4 || 14.88% || ||total time (secs) || 72.7 || 60.8 || 83.63% || ||Num of nodes made || 2.752M || 704.1k || 25.59% || || Triangle ray intersections || 673.2k || 673.2k || 100%|| |
Line 113: | Line 117: |
||build time (secs) || 36.6 || 5.4 || __% || ||total time (secs) || 67.1 || 36.2|| __% || ||Num of nodes made || 2.752M || 68 || __% || || Triangle ray intersections || 758.4k || 758.4k || __%|| |
||build time (secs) || 36.6 || 5.4 || 14.75% || ||total time (secs) || 67.1 || 36.2|| 53.95% || ||Num of nodes made || 2.752M || 68 || 0.000000025% || || Triangle ray intersections || 758.4k || 758.4k || 100%|| |
Line 123: | Line 127: |
||build time (secs) || 36.7 ||5.4 || __% || ||total time (secs) || 76.7 ||54.7 || __% || ||Num of nodes made || 2.752M || 182.3k || __% || || Triangle ray intersections || 644.1k || 644.1k || __%|| |
||build time (secs) || 36.7 ||5.4 || 14.71% || ||total time (secs) || 76.7 ||54.7 || 71.31% || ||Num of nodes made || 2.752M || 182.3k || 6.62% || || Triangle ray intersections || 644.1k || 644.1k || 100%|| |
Line 133: | Line 137: |
||build time (secs) || 96.3 || __ || __% || ||total time (secs) || 1334.3 || __ || __% || ||Num of nodes made || 14.886M || __ || __% || || Triangle ray intersections || 20.292M || __ || __%|| |
||build time (secs) || 96.3 || 2.0 || 2.08% || ||total time (secs) || 1334.3 || 1320.6 || 98.97% || ||Num of nodes made || 14.886M || 8.682M || 58.32% || || Triangle ray intersections || 20.292M || 19.510M || 96.14%|| |
Line 143: | Line 147: |
||build time (secs) || 95.1 || __ || __% || ||total time (secs) || 2229.0 || __ || __% || ||Num of nodes made || 14.886M || __ || __% || || Triangle ray intersections || 25.917M || __ || __%|| |
||build time (secs) || 95.1 || 2.0 || 2.10% || ||total time (secs) || 2229.0 || 2001.5 || 89.79% || ||Num of nodes made || 14.886M || 10.874M || 73.02% || || Triangle ray intersections || 25.917M || 24.222M || 93.46%|| |
Assignment 2 Lazy KD-Tree
Paul Tarjan
Date submitted: 26 Apr 2006
Code emailed: 26 Apr 2006
Description of implementation approach and comments
I first fixed IntersectP to just call Intersect. Then I added the NULL test in Intersect to call Intersect on the subchild if isect is NULL. This made the code feel better and was easier to extend.
bool KdTreeAccel::IntersectP(const Ray &ray) const {
- return Intersect(ray, NULL);
}
I implemented this twice. The first time, I fullyRefined the primitives right at the start, trying to use some of my build cost so that I don't have to do it later. I didn't realize that some of the nodes might not have to be refinedm and I am wasting cycles (and memory) copying them between lazy nodes. I also added the bounding box, the depth, and a boat load of other data into the node. The numbers you see below are for this implementation.
I then reduced the size of the node class by 12 bytes (from 56 to 44) and noticed a preformance DECREASE. It was due to all the fancy things I was doing to store the data in unused bits and all the math that had to be done to undo the fancy things. I didn't submit this code.
Now, I did something that doesn't fully refine at the begining. It just refines as you go. If you just do this optomization to the non-lazy kdTree, you get from about 63 seconds to 56 seconds for the kangaroo-view1. This moves all the time from the build to the run too. I couldn't get this to work with my implementation of lazy nodes, but judging by the render speedup here, I don't know if it would have been worth it. The main thing that it would have granted would be the extra copies between lazy nodes of the large primitive list.
I later noticed (with the help of our lovely TA's) that you don't need the belowChild data (because it is always aboveChild+1). Also, I noticed that I could reuse a few more bytes without my funky bit shifts. After adding this, I saved 20 bytes, and the render time went down by about 10%. This was the second submission.
Count a bit off
One thing about the counts, is a lazy node becomes another node when it is made "unLazy". Should I count this as being 2 nodes created, or just one? I counted it as just 1 node to make my numbers look better
Const
I never could get the const thing to go. I made all my data mutable and it still didn't like me. I had to call build_tree somehow, but if I made that const, then I had to call initLeaf, which called the memory Area stuff. I had to use the const_cast hack. Sorry
Run Machine
The tests were run on the machine fileserver.stanford.edu, which is a gentoo box. Intel 2.40 Gz, 1 Gig Ram, 3 Gig Swap. The swap is important because for the plants tests I ran over my memory using both the lazy and the real kdtree.
Tons of Build Messages
I did as Kayvon suggested and wraped the construction of my object in building StatCounters. Then It printed:
Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (1.2s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.1s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.0s) Building: [++++++++++++++++++++++++++++++++++++++++++++++++++] (0.3s)
I just summed these to get build time. Is this right? How come I am getting so many accelerators made?
Final Images Rendered with my implementation of lz-kdtree.cpp
killeroos-view1.pbrt (Killeroos visible)
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
36.3 |
5.4 |
14.88% |
total time (secs) |
72.7 |
60.8 |
83.63% |
Num of nodes made |
2.752M |
704.1k |
25.59% |
Triangle ray intersections |
673.2k |
673.2k |
100% |
"killeroos-view2.pbrt (Killeroos invisible)"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
36.6 |
5.4 |
14.75% |
total time (secs) |
67.1 |
36.2 |
53.95% |
Num of nodes made |
2.752M |
68 |
0.000000025% |
Triangle ray intersections |
758.4k |
758.4k |
100% |
"killeroos-view3.pbrt (close-up)"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
36.7 |
5.4 |
14.71% |
total time (secs) |
76.7 |
54.7 |
71.31% |
Num of nodes made |
2.752M |
182.3k |
6.62% |
Triangle ray intersections |
644.1k |
644.1k |
100% |
"plants-view1.pbrt"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
96.3 |
2.0 |
2.08% |
total time (secs) |
1334.3 |
1320.6 |
98.97% |
Num of nodes made |
14.886M |
8.682M |
58.32% |
Triangle ray intersections |
20.292M |
19.510M |
96.14% |
"plants-view2.pbrt"
|
KD Tree |
Lazy KD Tree |
Ratio |
build time (secs) |
95.1 |
2.0 |
2.10% |
total time (secs) |
2229.0 |
2001.5 |
89.79% |
Num of nodes made |
14.886M |
10.874M |
73.02% |
Triangle ray intersections |
25.917M |
24.222M |
93.46% |