Papers
 
Micropolygon Rasterization Data-Parallel Rasterization of Micropolygons with Defocus and Motion Blur
Kayvon Fatahalian, Edward Luong, Solomon Boulos, Kurt Akeley, William R. Mark and Pat Hanrahan
In Proceedings of High Performance Graphics 2009

Modern rasterization implementations are tuned for large polygons and are inefficient for micropolygon workloads. In this work, we analyze three data-parallel rasterization algorithms designed for micropolygons under defocus (depth of field) and motion blur. For situations with small defocus blur or heavy motion blur, our new algorithm based on interleaved sampling outperforms a patented Pixar algorithm by roughly 7x. Despite the high efficiency of our algorithm, micropolygon rasterization is incredibly expensive, requiring nearly one teraflop for stationary micropolygons at 60 frames per second, assuming 10 million micropolygons. The inclusion of blurry effects increases this cost by an order of magnitude, suggesting the need for a fixed-function hardware implementation.
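As a rough back-of-envelope check, the figures quoted above imply a per-micropolygon budget. The sketch below only divides the numbers from the abstract; rounding "nearly one teraflop" up to exactly 1e12 operations per second is an assumption made for illustration.

```python
# Back-of-envelope budget implied by the abstract's figures.
TOTAL_FLOPS_PER_SECOND = 1e12          # "nearly one teraflop", rounded up (assumption)
FRAMES_PER_SECOND = 60
MICROPOLYGONS_PER_FRAME = 10_000_000

micropolygons_per_second = FRAMES_PER_SECOND * MICROPOLYGONS_PER_FRAME
flops_per_micropolygon = TOTAL_FLOPS_PER_SECOND / micropolygons_per_second

print(f"{micropolygons_per_second:.1e} micropolygons per second")
print(f"about {flops_per_micropolygon:.0f} flops per micropolygon")
```

So the stationary case works out to somewhat under 1,700 operations per micropolygon, before the order-of-magnitude increase that blur adds.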

 
GRAMPS graphs GRAMPS: A Programming Model for Graphics Pipelines
Jeremy Sugerman, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, and Pat Hanrahan
ACM Transactions on Graphics, Volume 28, Issue 1, January 2009

With GRAMPS, we explored describing parallel applications as a computation graph connected via queues. To demonstrate the applicability of such a programming model, we built a D3D pipeline, a packet-based ray tracer, and a combination of the two on top of our model. We then simulated our system on both "CPU-like" and "GPU-like" hardware configurations to measure the feasibility of such a model. Overall, the approach seems viable, though there are many features left out that could use some future work.
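To give a flavor of "stages connected via queues", here is a minimal sketch of the general idea. It is not the GRAMPS API: the `run_graph` scheduler, the stage functions, and the queue names are all hypothetical and chosen only for illustration.

```python
from collections import deque

def run_graph(stages, queues, source_items):
    """Run stages until every queue is drained.

    stages: list of (input_queue_name, output_queue_name_or_None, function);
    each function maps one input item to a list of output items.
    """
    queues[stages[0][0]].extend(source_items)
    results = []
    progress = True
    while progress:
        progress = False
        for in_name, out_name, fn in stages:
            if queues[in_name]:
                item = queues[in_name].popleft()
                target = queues[out_name] if out_name else results
                target.extend(fn(item))
                progress = True
    return results

def rasterize(prim_size):
    # One primitive expands into several fragments.
    return [(prim_size, i) for i in range(prim_size)]

def shade(fragment):
    # One fragment produces one shaded value.
    prim, i = fragment
    return [prim * 10 + i]

queues = {"primitives": deque(), "fragments": deque()}
stages = [("primitives", "fragments", rasterize),
          ("fragments", None, shade)]
results = run_graph(stages, queues, [2, 3])
print(results)  # [20, 21, 30, 31, 32]
```

The round-robin loop stands in for a real scheduler; the point is only that stages never call each other directly and communicate solely through the queues.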

 
SIMD Box Test Speedup Adaptive Ray Packet Reordering
Solomon Boulos, Ingo Wald and Carsten Benthin
Proceedings of IEEE Symposium on Interactive Ray Tracing 2008
Slides (PDF)

Modern high-performance ray tracers use large ray packets and SIMD instruction sets to decrease both the computational and bandwidth cost compared to a single ray implementation. Current global illumination renderers, however, are still based around single ray implementations and interfaces. The presumption is that while packets have been shown to work well for highly coherent rays, in the presence of less coherent secondary ray distributions the gains of both packet and SIMD techniques dwindle rapidly. With low enough coherence, performance can be reduced to being as slow as reasonable single ray code -- if not worse -- so the benefit of packets for a global illumination system is assumed to be next to none. With SIMD width expanding in future architectures, leaving SIMD units underutilized means a massive loss in performance compared to the maximum performance achievable. In this paper, we present a method for recovering packet and SIMD coherence for incoherent secondary ray distributions through demand-driven reordering of rays into more coherent packets. We demonstrate that the reordering overhead is outweighed by the increased coherence within a prototypical implementation in the Manta realtime ray tracer among a wide variety of ray distributions, including diffuse path tracing.

 
N-ary BVH Getting Rid of Packets: Efficient SIMD Single-Ray Traversal using Multi-branching BVHs
Ingo Wald, Carsten Benthin and Solomon Boulos
Proceedings of IEEE Symposium on Interactive Ray Tracing 2008

While contemporary approaches to SIMD ray tracing typically rely on traversing packets of coherent rays through a binary data structure, we instead evaluate the alternative of traversing individual rays through a bounding volume hierarchy with a branching factor of 16. Though obviously less efficient than high-performance packet techniques for primary rays, we demonstrate that for less coherent secondary ray distributions this approach is at least competitive with (and often faster than) typical packet traversal techniques.
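The core operation in this style of traversal is testing one ray against all children of a node at once. The sketch below is a scalar stand-in, assuming a standard slab test and a ray whose direction components are all nonzero; the function name is hypothetical, and the inner loop over children is what a SIMD implementation would do in a single step across lanes (four children here rather than 16, to keep the example short).

```python
def ray_hits_children(origin, inv_dir, child_boxes, t_max=float("inf")):
    """Return one hit flag per child box for a single ray (slab test)."""
    hits = []
    for box_min, box_max in child_boxes:   # one SIMD lane per child in practice
        t_enter, t_exit = 0.0, t_max
        for axis in range(3):
            t0 = (box_min[axis] - origin[axis]) * inv_dir[axis]
            t1 = (box_max[axis] - origin[axis]) * inv_dir[axis]
            t_enter = max(t_enter, min(t0, t1))
            t_exit = min(t_exit, max(t0, t1))
        hits.append(t_enter <= t_exit)
    return hits

origin = (0.0, 0.0, 0.0)
inv_dir = (1.0, 1.0, 1.0)                  # direction (1, 1, 1), inverted per axis
children = [
    ((1, 1, 1), (2, 2, 2)),                # straight ahead
    ((1, -2, 1), (2, -1, 2)),              # off to the side
    ((-3, -3, -3), (-1, -1, -1)),          # behind the origin
    ((-1, -1, -1), (1, 1, 1)),             # contains the origin
]
hit_mask = ray_hits_children(origin, inv_dir, children)
print(hit_mask)  # [True, False, False, True]
```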

 
View from shading point for normal (a) and prefiltered (b) BVH Raytracing Prefiltered Occlusion for Aggregate Geometry
Dylan Lacewell, Brent Burley, Solomon Boulos and Peter Shirley
Proceedings of IEEE Symposium on Interactive Ray Tracing 2008

We prefilter occlusion of aggregate geometry, e.g., foliage or hair, storing local occlusion as a directional opacity in each node of a bounding volume hierarchy (BVH). During intersection, we terminate rays early at BVH nodes based on ray differential, and composite the stored opacities. This makes intersection cost independent of geometric complexity for rays with large differentials, and simultaneously reduces the variance of occlusion estimates. These two algorithmic improvements result in significant performance gains for soft shadows and ambient occlusion. The prefiltered opacity data depends only on geometry, not lights, and can be computed in linear time based on assumptions about the statistics of aggregate geometry.
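The early-termination idea can be illustrated with a toy traversal. This is a heavily simplified sketch, not the method from the paper: it assumes a scalar (not directional) opacity per node, assumes the ray passes through every node it visits, and approximates the ray differential by a footprint that grows linearly with distance. All field names are hypothetical.

```python
def occlusion(node, footprint_per_unit_distance):
    """Opacity along a ray, stopping early once the ray is wider than the node."""
    footprint = footprint_per_unit_distance * node["distance"]
    if not node["children"] or node["size"] <= footprint:
        return node["opacity"]             # use the stored, prefiltered value
    transmittance = 1.0
    for child in node["children"]:
        # Composite the children: light must pass through each in turn.
        transmittance *= 1.0 - occlusion(child, footprint_per_unit_distance)
    return 1.0 - transmittance

leaf_a = {"distance": 10.0, "size": 1.0, "opacity": 0.5, "children": []}
leaf_b = {"distance": 10.0, "size": 1.0, "opacity": 0.5, "children": []}
root = {"distance": 10.0, "size": 2.0, "opacity": 0.7,   # 0.7 is a made-up prefiltered value
        "children": [leaf_a, leaf_b]}

narrow = occlusion(root, 0.01)   # footprint 0.1 < node size: descends to the leaves
wide = occlusion(root, 0.5)      # footprint 5.0 >= node size: stops at the root
print(narrow, wide)
```

The wide ray touches a single node regardless of how much geometry sits below it, which is the sense in which cost becomes independent of geometric complexity.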

 
oRGB Gamut Visualizations oRGB: A Practical Opponent Color Space for Computer Graphics
Margarita Bratkova, Solomon Boulos, and Peter Shirley
IEEE CG&A, Volume 29, Issue 1, 2009

We present a new color model, oRGB, that is based on opponent color theory. Like HSV, it is designed specifically for computer graphics. However, it is also designed to work well for computational applications such as color transfer, where HSV falters. Despite being geared towards computation, oRGB's natural axes facilitate HSV-style color selection and manipulation. oRGB also allows for new applications such as a quantitative cool-to-warm metric, intuitive color manipulations and variations, and simple gamut mapping. This new color model strikes a balance between simplicity and the computational qualities of color spaces such as CIE L*a*b*.

Note: The version here is a pre-print in the ACM SIGGRAPH format.
 
TRaX Core TRaX: A Multi-Threaded Architecture for Real-Time Ray Tracing
Josef Spjut, Daniel Kopta, Erik Brunvand, Solomon Boulos, and Spencer Kellis
Proceedings of the IEEE Symposium on Application Specific Processors, SASP 2008

During an architecture course on multithreaded special purpose architectures I worked on my first cycle accurate simulator based on some ideas we had been throwing around for an architecture for real-time ray tracing. After leaving Utah the project certainly lived on without me and resulted in this publication. The most interesting idea in this architecture is abandoning SIMD in favor of a more "dynamic VLIW" through a crossbar mechanism.
 
RTSL Velvet Sphere RTSL: a Ray Tracing Shading Language
Steven G. Parker, Solomon Boulos, James Bigler, and Austin Robinson
Proceedings of IEEE Symposium on Interactive Ray Tracing 2007

We present a new domain-specific programming language suitable for extending both interactive and non-interactive ray tracing systems. This language, called "ray tracing shading language" (RTSL), builds on the GLSL language that is a part of the OpenGL specification and familiar to GPU programmers. This language allows a programmer to implement new cameras, primitives, textures, lights, and materials that can be used in multiple rendering systems. RTSL presents a single-ray interface that is easy to program for novice programmers. Through an advanced compiler, packet-based SIMD-optimized code can be generated that is performance competitive with hand-optimized code.
 
Pulli Edit Interactive Editing and Modeling of Bidirectional Texture Functions
Jan Kautz, Solomon Boulos, and Fredo Durand
Proceedings of ACM SIGGRAPH 2007
Video (DivX 6) | Code

Bidirectional texture functions (BTFs) extend the notion of a texture to have both lighting and view variations. While a few groups have captured a handful of high quality measured BTFs, content creators had no way to get a red wool sweater from the blue one provided. In this paper, we describe the editing operations we implemented inside of BTFShop: an interactive BTF editing system.
 
Conference Scene (DRT) Packet-based Whitted and Distribution Ray Tracing
Solomon Boulos, David Edwards, J. Dylan Lacewell, Joe Kniss, Jan Kautz, Ingo Wald and Peter Shirley
Proceedings of Graphics Interface 2007
BibTeX | Presentation (PDF)

Much progress has been made toward interactive ray tracing, but most research has focused specifically on ray casting. A common approach is to use "packets" of rays to amortize cost across sets of rays. Whether "packets" can be used to reduce the cost of reflection and refraction rays is unclear. The issue is complicated since such rays do not share common origins and often have less directional coherence than viewing and shadow rays. Since the primary advantage of ray tracing over rasterization is the computation of global effects, such as accurate reflection and refraction, this lack of knowledge should be corrected. We are also interested in exploring whether distribution ray tracing, due to its stochastic properties, further erodes the effectiveness of techniques used to accelerate ray casting. This paper addresses the question of whether packet-based ray algorithms can be effectively used for more than visibility computation. We show that by choosing an appropriate data structure and a suitable packet assembly algorithm we can extend the idea of "packets" from ray casting to Whitted-style and distribution ray tracing, while maintaining efficiency.

Note: a previous version of this work is contained in a technical report (see below).
 
Fairy Forest Ray Tracing Deformable Scenes using Dynamic Bounding Volume Hierarchies
Ingo Wald, Solomon Boulos and Peter Shirley
ACM Transactions on Graphics, Volume 26, Issue 1, January 2007
BibTeX

Ray tracing systems have traditionally been limited to static walkthroughs. The reason for this is that the acceleration structures used to speed up ray-scene queries are usually computed as an expensive offline preprocess. This work demonstrates that a bounding volume hierarchy (BVH) can instead be updated per frame to handle animated scenes. BVH construction and traversal are also outlined, so that the BVH can be made competitive with previous best practices for interactive ray tracing.
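One common way to update a BVH per frame is to keep the tree topology fixed and refit the bounding boxes bottom-up after the geometry moves. The sketch below assumes that flavor of update; the node layout (a flat list with children stored after their parent) and the function names are hypothetical.

```python
def union(a, b):
    """Smallest axis-aligned box containing boxes a and b."""
    (amin, amax), (bmin, bmax) = a, b
    return (tuple(map(min, amin, bmin)), tuple(map(max, amax, bmax)))

def refit(nodes, prim_boxes):
    """Recompute every node's box bottom-up; the topology is left unchanged.

    nodes[i] is {"left": j, "right": k} for inner nodes or {"prim": p} for
    leaves, with children stored at higher indices than their parent.
    """
    for node in reversed(nodes):           # children are visited before parents
        if "prim" in node:
            node["box"] = prim_boxes[node["prim"]]
        else:
            node["box"] = union(nodes[node["left"]]["box"],
                                nodes[node["right"]]["box"])
    return nodes[0]["box"]

nodes = [{"left": 1, "right": 2}, {"prim": 0}, {"prim": 1}]

frame0 = [((0, 0, 0), (1, 1, 1)), ((2, 0, 0), (3, 1, 1))]
root0 = refit(nodes, frame0)

frame1 = [((0, 0, 0), (1, 1, 1)), ((2, 4, 0), (3, 5, 1))]  # second primitive moved up
root1 = refit(nodes, frame1)
print(root0, root1)
```

Refitting is linear in the number of nodes, which is what makes a per-frame update plausible compared with an offline rebuild.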
 
Hogum Image Synthesis using Adjoint Photons
R. Keith Morley, Solomon Boulos, Jared Johnson, Dave Edwards, Peter Shirley, Michael Ashikhmin and Simon Premoze
Proceedings of Graphics Interface 2006
BibTeX | Presentation

Traditional rendering systems have incorporated several layers of hacks based on simple scenes such as the Cornell Box. Techniques built for the Cornell Box, however, do not easily apply to more complicated scenes such as the figure to the left which is computed using the measured radiance of the Sun as well as Rayleigh scattering to get the blue sky. The smoky cloud is from simulation data and would traditionally slow down a renderer considerably. In our system, we present an image synthesis algorithm based on tracing photons from sensors to light sources and demonstrate that this is physically valid and easier to motivate than Kajiya's path tracing algorithm.
 
Transparent Boeing 777 An Application of Scalable Massive Model Interaction using Shared-Memory Systems
Abe Stephens, Solomon Boulos, James Bigler, Ingo Wald, and Steven G. Parker
Proceedings of the 7th Eurographics Symposium on Parallel Graphics and Visualization, May 2006

Using the Manta Interactive Ray Tracer, we developed a complete digital mockup demo for the Boeing 777 dataset. The dataset consists of 350 million color-coded polygons. To aid in visualization and interaction with the dataset we added transparent rendering, cutting planes, object hiding based on part serial numbers and ambient occlusion. Using a large shared-memory system, we were able to achieve interactive performance for this demo at both SIGGRAPH 2005 and Supercomputing 2005.
 
Anisotropic Spheres The Halfway Vector Disk for BRDF Modeling
Dave Edwards, Solomon Boulos, Jared Johnson, Peter Shirley, Michael Ashikhmin, Michael Stark and Chris Wyman
ACM Transactions on Graphics Volume 25 Issue 1, Jan 2006

This work sought to investigate energy conserving models of reflectance for computer graphics. We developed a new set of parameterizations based on a halfway vector disk, which allows us to ensure energy conservation. While our new model is not able to ensure reciprocity, we are able to fit measured data with very few coefficients (colors and exponents).

 
Lucy image Memory Sharing for Interactive Ray Tracing on Clusters
David E. DeMarle, Christiaan P. Gribble, Solomon Boulos and Steven G. Parker
Parallel Computing, Vol. 31, No. 2, pp. 221--242. 2005.

In this paper, we describe some techniques we used to reduce the number of cache misses in our distributed interactive ray tracer (dirt). Because the system is distributed over a cluster of workstations using a gigabit interconnect, the cost of a cache miss is nearly 600 milliseconds. By reordering geometric data, such as the Lucy model from Stanford, we were able to reduce cache misses by more than a factor of two when local memory available was low.

Technical Reports
 
Adaptively subdivided production scene Packet-based Ray Tracing of Catmull-Clark Subdivision Surfaces
Carsten Benthin, Solomon Boulos, Dylan Lacewell, and Ingo Wald
Updated version of Technical Report, SCI Institute, University of Utah, No UUSCI-2007-011, 2007
BibTeX

Efficient ray tracing of subdivision surfaces is an important problem in production rendering, and for interactive applications in the near future. The current hardware trends for both CPUs and GPUs suggest that compute power is outpacing bandwidth. Despite this, current approaches for ray tracing subdivision surfaces favor geometry caches or full pre-tessellation. We demonstrate that directly ray tracing subdivision surfaces using ray packets uses much less bandwidth, while still providing amortization benefits. Our proposed method performs competitively with pre-tessellation even on current hardware, outperforms a single-ray implementation by up to 16x and Pixar's PRMan 13.0 geometry caching by up to 23.1x.
 
Densely tessellated blade model SIMD Ray Stream Tracing
Ingo Wald, Christiaan Gribble, Solomon Boulos and Andrew Kensler
Technical Report, SCI Institute, University of Utah, No UUSCI-2007-012, 2007
BibTeX

After trying a lot of different techniques to sort rays before sending them through a BVH, we decided instead to do exact SIMD filtering at each node in a BVH. Naturally this leads to very high SIMD utilization.
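The filtering idea can be sketched as follows: at every node, the incoming stream of ray IDs is compacted to just the rays that hit that node, so everything passed further down is active. This is only a toy under loud assumptions: "rays" are positions on a line and "boxes" are 1D intervals, standing in for real ray/box tests, and all names are hypothetical.

```python
def hits_box(x, box):
    """1D stand-in for a ray/box test: is position x inside [lo, hi]?"""
    lo, hi = box
    return lo <= x <= hi

def traverse(node, ray_ids, rays, leaf_work):
    # Filter the incoming stream down to the rays that hit this node.
    active = [i for i in ray_ids if hits_box(rays[i], node["box"])]
    if not active:
        return
    if "name" in node:                     # leaf: record which rays reach it
        leaf_work.append((node["name"], active))
        return
    for child in node["children"]:
        traverse(child, active, rays, leaf_work)

bvh = {"box": (0.0, 10.0), "children": [
    {"box": (0.0, 5.0), "name": "A"},
    {"box": (5.0, 10.0), "name": "B"},
]}
rays = [1.0, 7.0, 12.0, 5.0]
leaf_work = []
traverse(bvh, range(len(rays)), rays, leaf_work)
print(leaf_work)  # [('A', [0, 3]), ('B', [1, 3])]
```

Because each list handed to a child contains only active rays, a SIMD implementation processing those lists in fixed-width chunks wastes lanes only in the final partial chunk.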
 
An ashtray rendered with distribution ray tracing Interactive Distribution Ray Tracing
Solomon Boulos, David Edwards, J. Dylan Lacewell, Joe Kniss, Jan Kautz, Ingo Wald and Peter Shirley
Technical Report, SCI Institute, University of Utah, No UUSCI-2006-022, 2006
BibTeX | Movie | SIGGRAPH 2006 Course Presentation (PDF)

Current interactive ray tracing systems have focused on highly coherent sets of rays such as primary rays or rays shot to a single point light. In this work we investigate interactive distribution ray tracing in the spirit of Cook's original paper from 1984. Using the bounding volume hierarchy (BVH) from our previous work, we maintain interactivity while including effects such as depth of field, area light sources, motion blur, glossy reflection, refraction and volume rendering. We also demonstrate methods for stable sampling to avoid frame-to-frame scintillation or crawl and demonstrate how to avoid aliasing artifacts caused by repeated tiling of static sample patterns.
 
IA Example Geometric and Arithmetic Culling Methods for Entire Ray Packets
Solomon Boulos, Ingo Wald and Peter Shirley
Technical Report, School of Computing, University of Utah, No UUCS-06-10, 2006

Recent interactive ray tracing performance has been mainly derived from the use of ray packets. Larger ray packets allow for significant amortization of both computations and memory accesses; however, the majority of primitives are still intersected by each ray in a packet.

This paper discusses several methods to cull entire ray packets against common primitives (box, triangle, and sphere) that allow an arbitrary number of rays to be tested by a single test. This provides cheap "all miss" or "all hit" tests and may substantially improve the performance of an interactive ray tracer. The paper surveys current methods, provides details on three particular approaches using interval arithmetic, bounding planes, and corner rays, describes how the respective bounding primitives can be easily and efficiently constructed, and points out the relation among the different fundamental concepts.
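As an illustration of the interval arithmetic flavor, here is a minimal sketch of a conservative "all miss" test for a packet against a box. It is not code from the report: it assumes the packet is summarized by per-axis intervals of ray origin and inverse direction, that no inverse-direction interval spans zero, and the function names are hypothetical.

```python
def interval_mul(a, b):
    """Product of two intervals given as (lo, hi) pairs."""
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))

def packet_misses_box(origin_iv, inv_dir_iv, box_min, box_max):
    """Conservative test: True means no ray in the packet can hit the box."""
    enter_lo, exit_hi = 0.0, float("inf")
    for axis in range(3):
        o_lo, o_hi = origin_iv[axis]
        # Intervals of (slab plane - origin) over all rays in the packet.
        d_min = (box_min[axis] - o_hi, box_min[axis] - o_lo)
        d_max = (box_max[axis] - o_hi, box_max[axis] - o_lo)
        t0 = interval_mul(d_min, inv_dir_iv[axis])
        t1 = interval_mul(d_max, inv_dir_iv[axis])
        # Lower bound on any ray's entry, upper bound on any ray's exit.
        enter_lo = max(enter_lo, min(t0[0], t1[0]))
        exit_hi = min(exit_hi, max(t0[1], t1[1]))
    return enter_lo > exit_hi

origin_iv = [(0.0, 0.0)] * 3                        # all rays start at the origin
inv_dir_iv = [(1.0, 1.0), (0.8, 1.25), (0.8, 1.25)]  # directions roughly along (1, 1, 1)

culled = packet_misses_box(origin_iv, inv_dir_iv, (1, -3, 1), (2, -2, 2))
kept = packet_misses_box(origin_iv, inv_dir_iv, (1, 1, 1), (2, 2, 2))
print(culled, kept)  # True False
```

A `False` result does not mean every ray hits; it only means the single test could not rule the box out, so the packet falls back to finer-grained testing.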

Course Notes
 
EG07Star State of the Art in Ray Tracing Animated Scenes
Ingo Wald, William R. Mark, Johannes Gunther, Solomon Boulos, Thiago Ize, Warren Hunt, Steven G Parker, and Peter Shirley
Eurographics 2007 State of the Art Reports

There had been a lot of work in ray tracing animated scenes in the past couple of years. In this report, those of us who had done a lot of work in the area got together to compare the different approaches. It provides a fairly good overview for someone just finding out about ray tracing animated scenes and presents some interesting comparisons that wouldn't be possible in a single regular format paper.
 
Ray-BVH Intersection Notes on Efficient Ray Tracing
Solomon Boulos
SIGGRAPH 2005 Course: Introduction to Real-Time Ray Tracing

This little set of notes was largely ignored but represented what I had learned about making a ray tracer fast circa summer 2005. A lot of stuff has changed due to ray packets, but for single ray code this is still a good piece of information. Consequently, Eric Haines found it lying around on a SIGGRAPH DVD and we talked about it in the Ray Tracing News.