
KAYVON FATAHALIAN
Assistant Professor of Computer Science
Stanford University

Gates Building 366
650-497-6043
I architect high-performance visual computing systems that enable
immersive and intelligent visual computing applications. My recent
research falls into two main themes:
The Visual Data Analysis Engine. The
next generation of visual computing applications will require
efficient analysis and mining of large repositories of visual data
(images, videos, RGBD). But scaling visual data analysis to operate on
collections the size of all public photos and videos on Facebook, all
security video cameras in a major city, or petabytes of images in an
astronomy sky survey presents supercomputing-scale storage and
computation challenges. Very few programmers have the capability to
operate efficiently at these scales, inhibiting the field's
exploration of advanced data-driven visual computing applications. To
meet this challenge, we are developing a distributed computing
platform -- combining ideas from high-performance image processing
languages, data analytics, and database functionality -- that
facilitates the development of applications that query, analyze and
mine image and video collections at scale.
The Graphics Engine Compiler. We are
developing new programming abstractions and compiler frameworks for
graphics engines of the future -- engines that will deliver rich
virtual worlds to platforms ranging from beefy GPUs powering VR
headsets to energy-limited operation on mobile devices. A key aspect
of this project is enabling rapid exploration of the complex space
of optimization decisions present in these systems. We're tackling
the problem using new shading languages that separate key
optimization decisions from high-level program specification and via
automatic shader optimization techniques that make approximations
(like automatic code simplification) to achieve desirable
performance-quality trade-offs. We seek to make it possible to
rapidly generate highly efficient graphics systems specialized to
application content or target machine architecture.
I'm always looking for great students
who wish to work on these topics or bring their own great
ideas.
I recently gave an Arch 2030 workshop keynote on how visual computing
applications will drive architectural innovation in the year 2030.
Here is the talk on YouTube, and an updated version of the slides is here.
TEACHING
- This quarter I am teaching CS248: Interactive Computer Graphics (Spring 2018)
- CS348V: Visual Computing Systems (Winter 2018)
Before moving to Stanford, I taught the following courses at CMU.
- 15-418/15-618: Parallel Computer Architecture and Programming (Spring 2012, 2013, 2014, 2015, 2016, 2017, and at Tsinghua in Summer 2017)
- 15-769: Visual Computing Systems (Fall 2013, Fall 2014, Fall 2016)
- 15-462/662: Computer Graphics (Fall 2015)
- 15-869: Graphics and Imaging Architectures (Fall 2011)
Here are a few tips on how to give clear research talks (or class project talks).
I created this talk, Do Grades
Matter, to challenge students at CMU to think bigger than just
striving to get good grades in a bunch of hard classes.
STUDENTS
- David Durst (Stanford Ph.D.)
- Yong He (CMU Ph.D.)
- Ravi Mullapudi (CMU Ph.D., co-advised with Deva Ramanan)
- Alex Poms (CMU Ph.D.)
- Evan Shimizu (CMU Ph.D.)
- Yanzhe Yang (CMU Ph.D., co-advised with Jessica Hodgins)
Former students:
- Minjae Lee (CMU CSD B.S., now Ph.D. at Stanford)
- Chenxi Liu (CSD M.S., now Ph.D. at UBC)
- Krishna Kumar Singh (CMU RI M.S., now Ph.D. at UC Davis)
- Will Crichton (CMU CSD B.S., now Ph.D. at Stanford)
- Karima Ma (CMU CSD M.S., now Ph.D. at UC Berkeley)
PUBLICATIONS
Scanner: Efficient Video Analysis at Scale
Alex Poms, William Crichton, Pat Hanrahan, Kayvon Fatahalian
SIGGRAPH 2018 (to appear)
Slang: Language Mechanisms for Building Extensible Real-time Shading Systems
Yong He, Kayvon Fatahalian, Tim Foley
SIGGRAPH 2018 (to appear)
HydraNets: Specialized Dynamic Architectures for Efficient Inference
Ravi Mullapudi, William R. Mark, Noam Shazeer, Kayvon Fatahalian
CVPR 2018 (to appear)
Shader Components: Modular and High Performance Shader Development
Yong He, Tim Foley, Teguh Hofstee, Haomin Long, Kayvon Fatahalian
SIGGRAPH 2017
Automatically Scheduling Halide Image Processing Pipelines
Ravi Teja Mullapudi, Andrew Adams, Dillon Sharlet, Jonathan Ragan-Kelley, Kayvon Fatahalian
SIGGRAPH 2016
A System for Rapid Exploration of Shader Optimization Choices
Yong He, Tim Foley, Kayvon Fatahalian
SIGGRAPH 2016
LED Street Light Research Project Part II: New Findings
Stephen Quick, Donald Carter, Kayvon Fatahalian, Cynthia Limauro
CMU Technical Report, Summer 2016
The Rise of Mobile Visual Computing Systems
Kayvon Fatahalian
IEEE Pervasive Computing, April/June 2016
Automatically Splitting a Two-Stage Lambda Calculus
Nicolas Feltman, Carlo Angiuli, Umut A. Acar, Kayvon Fatahalian
European Symposium on Programming (ESOP) 2016
KrishnaCam: Using a Longitudinal, Single-Person, Egocentric Dataset for Scene Understanding Tasks
Krishna Kumar Singh, Kayvon Fatahalian, Alexei Efros
WACV 2016
A System for Rapid, Automatic Shader Level-of-Detail
Yong He, Tim Foley, Natalya Tatarchuk, Kayvon Fatahalian
SIGGRAPH Asia 2015
Aggregate G-Buffer Anti-Aliasing
Cyril Crassin, Morgan McGuire, Kayvon Fatahalian, Aaron Lefohn
I3D 2015
(An updated and extended version of the paper appears in TVCG 2016.)
Extending the Graphics Pipeline with Adaptive, Multi-Rate Shading
Yong He, Yan Gu, Kayvon Fatahalian
SIGGRAPH 2014
Self-Refining Games using Player Analytics
Matt Stanton, Ben Humberston, Brandon Kase, James O'Brien, Kayvon Fatahalian, Adrien Treuille
SIGGRAPH 2014
Near-exhaustive Precomputation of Secondary Cloth Effects
Doyub Kim, Woojong Koh, Rahul Narain, Kayvon Fatahalian, Adrien Treuille, James O'Brien
SIGGRAPH 2013
Efficient BVH Construction via Approximate Agglomerative Clustering
Yan Gu, Yong He, Kayvon Fatahalian, Guy Blelloch
High Performance Graphics 2013
SRDH: Specializing BVH Construction and Traversal Order Using Representative Shadow Ray Sets
Nicolas Feltman, Minjae Lee, Kayvon Fatahalian
High Performance Graphics 2012
Evolving the Real-Time Graphics Pipeline for Micropolygon Rendering
Kayvon Fatahalian, Stanford University Ph.D. Dissertation, 2011
Reducing Shading on GPUs using Quad-Fragment Merging
Kayvon Fatahalian, Solomon Boulos, James Hegarty, Kurt Akeley, William R. Mark, Henry Moreton, Pat Hanrahan
SIGGRAPH 2010
Space-Time Hierarchical Occlusion Culling for Micropolygon Rendering with Motion Blur
Solomon Boulos, Edward Luong, Kayvon Fatahalian, Henry Moreton, Pat Hanrahan
High Performance Graphics 2010
Hardware Implementation of Micropolygon Rasterization with Motion and Defocus Blur
John S. Brunhaver, Kayvon Fatahalian, Pat Hanrahan
High Performance Graphics 2010
A Lazy Object-Space Shading Architecture With Decoupled Sampling
Christopher A. Burns, Kayvon Fatahalian, William R. Mark
High Performance Graphics 2010
DiagSplit: Parallel, Crack-Free, Adaptive Tessellation for Micropolygon Rendering
Matthew Fisher, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, William R. Mark, Pat Hanrahan
SIGGRAPH Asia 2009
Data-Parallel Rasterization of Micropolygons with Defocus and Motion Blur
Kayvon Fatahalian, Edward Luong, Solomon Boulos, Kurt Akeley, William R. Mark, Pat Hanrahan
High Performance Graphics 2009
GRAMPS: A Programming Model for Graphics Pipelines
Jeremy Sugerman, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, Pat Hanrahan
Transactions on Graphics (TOG) January 2009
A Closer Look at GPUs
Kayvon Fatahalian and Mike Houston
Communications of the ACM. Vol. 51, No. 10 (October 2008)
(also published as "GPUs: A Closer Look": ACM Queue. March/April. 2008)
A Portable Runtime Interface for Multi-level Memory Hierarchies
Mike Houston, Ji Young Park, Manman Ren, Timothy J. Knight, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan
PPOPP 2008
Compilation for Explicitly Managed Memory Hierarchies
Timothy J. Knight, Ji Young Park, Manman Ren, Mike Houston, Mattan Erez, Kayvon Fatahalian, Alex Aiken, William J. Dally, Pat Hanrahan
PPOPP 2007
Sequoia: Programming the Memory Hierarchy
Kayvon Fatahalian, Timothy J. Knight, Mike Houston, Mattan Erez, Daniel R Horn, Larkhoon Leem, Ji Young Park, Manman Ren, Alex Aiken, William J. Dally, Pat Hanrahan
Supercomputing 2006
Understanding the Efficiency of GPU Algorithms for Matrix-Matrix Multiplication
Kayvon Fatahalian, Jeremy Sugerman, Pat Hanrahan
Graphics Hardware 2004
Brook for GPUs: Stream Computing on Graphics Hardware
Ian Buck, Tim Foley, Daniel Horn, Jeremy Sugerman, Kayvon Fatahalian, Mike Houston, Pat Hanrahan
SIGGRAPH 2004
Precomputing Interactive Dynamic Deformable Scenes
Doug L. James and Kayvon Fatahalian
SIGGRAPH 2003
Real-Time Global Illumination of Deformable Objects
Undergraduate Senior Research Thesis (Carnegie Mellon University). 2003.
Advised by Doug James
PAST PROJECTS
Self-Refining Interactive Games (graphics with 100s of machines and a lot of latency)
How do we build platforms that take graphics applications from one
user on a single GPU to 10,000 machines and one million users in the
cloud? Even though computer graphics has always been at the vanguard
of parallel computing, there has been little success using modern
cloud-based computing resources to improve interactive experiences. In
this project we asked the question, how could we leverage the massive
storage and batch processing capabilities of the cloud to generate new
forms of interactive worlds -- and we took a "precompute everything"
approach to doing so. Since one cannot precompute everything about
a complex interactive world, the challenge is to determine what is
most important to precompute, so these parts can be presented
to the user with the highest-quality graphics. We find that by
recording statistics of users playing a game, we can build a model of
user behavior, and then concentrate large-scale, cloud-based
precomputation of graphics and physics around the states that users
are most likely to encounter. The result is a
self-refining game whose dynamics improve with play, ultimately
providing realistically rendered, rich fluid dynamics in real time on
a mobile device. For more detail, see our work applying these ideas
to cloth simulation and fluid simulation.
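As a rough illustration of concentrating precomputation around the states users are most likely to encounter, here is a minimal Python sketch. It is not the method from the papers: the function names, the trace format (a list of visited state labels per play session), and the fixed precomputation budget are all assumptions made for this example.

```python
from collections import Counter


def visit_probabilities(traces):
    """Estimate how often each state is visited from recorded play traces.

    `traces` is a hypothetical format: one list of state labels per session.
    """
    counts = Counter(state for trace in traces for state in trace)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def choose_states_to_precompute(traces, budget):
    """Pick the `budget` most frequently visited states for precomputation."""
    probs = visit_probabilities(traces)
    # Sort by descending probability; break ties by name for determinism.
    ranked = sorted(probs, key=lambda s: (-probs[s], s))
    return ranked[:budget]


# Three recorded sessions (made-up data for illustration only).
traces = [
    ["start", "bridge", "river"],
    ["start", "bridge", "cave"],
    ["start", "river"],
]
selected = choose_states_to_precompute(traces, budget=2)
print(selected)
```

As more sessions are recorded, rerunning the selection shifts the budget toward whatever states players actually reach, which is the sense in which such a system refines itself with play.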
A Real-Time Micropolygon Rendering Pipeline (evolving the GPU pipeline for tiny triangles)
GPUs will soon have the compute horsepower to render scenes
containing cinematic-quality surfaces in real time. Unfortunately,
if they render these subpixel polygons (micropolygons) using the
same techniques as they do for large triangles today, GPUs will
perform extremely inefficiently. Instead of trying to parallelize
Pixar's Reyes micropolygon rendering system, we're taking a hard
look at how the existing Direct3D 11 rendering pipeline, and GPU
hardware implementations, must evolve to render micropolygon
workloads efficiently in a high-throughput system. Changes to
software interfaces, algorithms, and HW design are fair game! Slides
describing what we've learned can be found in
this SIGGRAPH
course talk or in my
dissertation: Evolving
the Real-Time Graphics Pipeline for Micropolygon Rendering.
GRAMPS (a framework for heterogeneous parallel programming)
There are two ways to think about GRAMPS. Graphics folks should
think of GRAMPS as a system for building custom graphics pipelines.
We simply gave up on adding more and more configurable knobs to
existing pipelines like OpenGL/Direct3D and instead allow the
programmer to programmatically define a custom pipeline with an
arbitrary number of stages connected by queues.
To non-graphics folks, GRAMPS is a stream programming system that
embraces heterogeneity in underlying architecture and
anticipates streaming workloads that exhibit both regular and
irregular (dynamic) behavior. The GRAMPS runtime dynamically
schedules GRAMPS programs onto architectures containing a mixture of
compute-optimized cores, generic CPU cores, and fixed-function
processing units.
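To make the "stages connected by queues" idea concrete, here is a small Python sketch. It is not the GRAMPS API: the `Stage` class, `connect`, and `run` are hypothetical names, and the scheduler is a single-threaded stand-in for a runtime that would place stages on different kinds of cores. Each stage may emit zero, one, or many outputs per input, which models the irregular behavior mentioned above.

```python
from collections import deque


class Stage:
    """A pipeline stage: a kernel plus an input queue (hypothetical design)."""

    def __init__(self, name, kernel):
        self.name = name
        self.kernel = kernel      # function: item -> iterable of outputs
        self.inbox = deque()      # queue feeding this stage
        self.downstream = None    # next stage, or None for the last stage


def connect(stages):
    """Link stages into a linear pipeline."""
    for up, down in zip(stages, stages[1:]):
        up.downstream = down
    return stages


def run(stages, inputs):
    """Run until all queues drain, always servicing the deepest ready stage."""
    stages[0].inbox.extend(inputs)
    results = []
    while any(s.inbox for s in stages):
        # Preferring later stages keeps intermediate queues short.
        stage = next(s for s in reversed(stages) if s.inbox)
        item = stage.inbox.popleft()
        for out in stage.kernel(item):
            if stage.downstream is None:
                results.append(out)
            else:
                stage.downstream.inbox.append(out)
    return results


split = Stage("split", lambda n: range(n))                       # one in, many out
square = Stage("square", lambda x: [x * x])                      # one in, one out
keep_even = Stage("keep_even", lambda x: [x] if x % 2 == 0 else [])  # may drop
pipeline = connect([split, square, keep_even])
output = run(pipeline, [3, 4])
print(output)
```

The scheduling policy here (deepest stage first) is just one simple choice for bounding queue growth; a real runtime would weigh queue occupancy against which processing resources are free.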
The Sequoia Programming Language ("Programming the Memory Hierarchy")
Sequoia is a hierarchical stream programming language that arose
from the observation that expressing locality, not parallelism, is
the most important responsibility of parallel application
programmers in scientific/numerical domains. Sequoia presents a
parallel machine as an abstract hierarchy of memories and gives the
programmer explicit control over data locality and communication
through this hierarchy using first-class language constructs
(basically, Sequoia supports nested kernels and streams of streams).
Sequoia programs have run on a variety of exposed-communication
architectures such as clusters, the CELL processor, GPUs, and even
supercomputing clusters at Los Alamos. The best way to learn about
Sequoia is to read
our SC06
paper. You can also learn more at the
Sequoia project page.
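The flavor of decomposing work through a hierarchy of memories can be sketched in plain Python. This is not Sequoia syntax: `hierarchical_sum` and the `block_sizes` list (one block size per memory level, outermost first) are assumptions for illustration, and list slicing merely stands in for an explicit copy of a working set into a smaller, closer memory.

```python
def hierarchical_sum(data, block_sizes):
    """Sum `data` by recursively splitting it into blocks sized per level.

    `block_sizes` lists a hypothetical block size for each memory level,
    from the outermost level to the innermost.
    """
    if not block_sizes:
        # Leaf task: the working set is assumed to fit in the smallest memory.
        return sum(data)
    size, inner = block_sizes[0], block_sizes[1:]
    total = 0
    for start in range(0, len(data), size):
        # The slice models moving one block down to the next memory level.
        block = data[start:start + size]
        total += hierarchical_sum(block, inner)
    return total


result = hierarchical_sum(list(range(10)), [4, 2])
print(result)
```

The same task body runs at every level and only the block sizes change, which hints at how a program written this way could be retargeted to machines with different memory hierarchies.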
Brook/Merrimac (stream processing for scientific computing)
I helped out with
the BrookGPU
(abstracting the GPU as a stream processor for numerical computing)
and Merrimac Streaming
Supercomputer projects.
SUPPORT
My work is supported by the National Science Foundation, the Heinz Foundation, and by Intel, NVIDIA, Qualcomm, Google, Adobe, Facebook, Activision, and Apple.