The latest generation of graphics hardware supports fully programmable vertex and pixel/fragment operations, but programming this hardware at a low level is difficult and time consuming. To address this problem, we have developed a complete real-time procedural shading system that compiles a high-level shading language to programmable vertex and fragment hardware, as described in a separate publication. In this paper, we describe in detail the algorithms used by this system to generate and optimize fragment code for NVIDIA's register combiner architecture and show that our compiler generates efficient code. The register combiner architecture has some similarities to VLIW CPU architectures, so we compare our compilation algorithms to those described in the literature for VLIW CPU architectures. We also discuss some of the lessons we learned from building and using this compiler that may be useful to the designers of future programmable graphics hardware.