GPUBench Test: Instrissue

Instrissue measures the throughput (rate of issue) of many ARB fp1.0 math instructions by running long programs that contain only a single type of instruction. No texturing operations occur. This test is intended to allow the hardware to run at peak rates. Whenever possible, direct dependencies between consecutive instructions are removed by alternating between two independent sets of registers. For example, the test generates:
MAD R0, R0, R0, R0;
MAD R1, R1, R1, R1;
MAD R0, R0, R0, R0;
MAD R1, R1, R1, R1;
This is in contrast to the following sequence, in which every instruction reads the result of the one before it. That read-after-write dependency might prevent hardware capable of performing multiple MAD instructions per clock from running at its peak rate.
MAD R0, R0, R0, R0;
MAD R0, R0, R0, R0;
MAD R0, R0, R0, R0;
MAD R0, R0, R0, R0;
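The interleaving idea can be sketched in a few lines of Python. This is only an illustration, not GPUBench's actual generator: the function name `interleaved_body` and its parameters are hypothetical. MAD is emitted with one destination and three source operands, which is the form the ARB fp1.0 grammar expects.

```python
def interleaved_body(opcode, num_sources, length, num_chains=2):
    """Return `length` instructions that rotate over `num_chains` registers.

    With num_chains=2, consecutive instructions never touch the same
    register, so neither depends on the result of the one just before it.
    With num_chains=1, every instruction depends on its predecessor.
    """
    lines = []
    for i in range(length):
        reg = f"R{i % num_chains}"
        # One destination followed by `num_sources` source operands.
        operands = ", ".join([reg] * (1 + num_sources))
        lines.append(f"{opcode} {operands};")
    return lines


body = interleaved_body("MAD", num_sources=3, length=4)
print("\n".join(body))
```

Calling the same function with `num_chains=1` produces the fully dependent chain, which makes it easy to compare the two cases on a given card.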
Common Functionality
Instrissue does not sweep over a range of sizes, but the framebuffer size can be set with --size. Arithmetic and framebuffer formats with 1, 2, 3, or 4 components are selected with the --components option. Rasterization of fragments is controlled by the --render option, which specifies whether a screen-covering quad or a single large triangle is rasterized to generate fragments.
Test Specific Details
Although specifying the number of components per texel is common to many tests, in Instrissue this option also determines the write mask on the arithmetic instructions. Thus, setting the number of components to 1 tells the card to do scalar math even when issuing 4-wide vector instructions. Some cards are able to optimize in this situation to improve instruction throughput.
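As a rough illustration of how a component count could translate into a write mask, the sketch below appends an ARB-style mask to the destination register. The mapping of N components to the first N channels (`.x`, `.xy`, `.xyz`) is an assumption, as are the names `WRITE_MASKS` and `masked_instruction`. The exact text Instrissue emits may differ.

```python
# Assumed mapping: N components -> mask over the first N channels.
# An empty mask means all four channels are written.
WRITE_MASKS = {1: ".x", 2: ".xy", 3: ".xyz", 4: ""}


def masked_instruction(opcode, reg, num_sources, components):
    """Build one instruction whose destination carries a write mask."""
    if components not in WRITE_MASKS:
        raise ValueError("components must be 1, 2, 3, or 4")
    dest = reg + WRITE_MASKS[components]
    sources = ", ".join([reg] * num_sources)
    return f"{opcode} {dest}, {sources};"


# Scalar case: only the x channel of the result is kept.
print(masked_instruction("MAD", "R0", 3, components=1))
# Full vector case: no mask, all four channels are written.
print(masked_instruction("MAD", "R0", 3, components=4))
```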
The performance of a single type of instruction can be tested using --instruction, or many different instructions can be tested in a single program execution using --all (test all instructions) or --few (test a smaller set of commonly used instructions). --length sets the number of times a particular instruction is repeated in the shader program. For some complex ARB instructions, the driver expands the instruction into many architecture-specific ones. As a result, it takes trial and error to determine how many instructions an ARB program can contain and still be under the hardware's instruction limit after the shader has been processed by the driver. The --maxlength option gets around this problem by finding the largest shader (no longer than the length specified by the --length option) for a given instruction that fits within the resource limits of the hardware.
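The search that --maxlength performs is not described here, so the following is only one plausible way to do it: a binary search over the shader length. It assumes that if a shader of length n fits, every shorter shader fits too. The predicate `fits` is hypothetical and stands in for "compile the program and ask the driver whether it is within hardware limits". The expansion factor and native limit in the demo are invented numbers.

```python
def max_fitting_length(cap, fits):
    """Return the largest n in [1, cap] with fits(n) true, or 0 if none.

    Assumes fits is monotonic: once a length fails, all longer ones fail.
    """
    lo, hi = 0, cap  # the answer always lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if fits(mid):
            lo = mid        # mid fits; the answer is mid or larger
        else:
            hi = mid - 1    # mid does not fit; the answer is smaller
    return lo


# Simulated driver (invented numbers): each ARB instruction expands to
# 3 native instructions and the hardware accepts at most 1024 of them.
def simulated_fits(length):
    return length * 3 <= 1024


print(max_fitting_length(512, simulated_fits))
```

A linear scan downward from the cap would also work and makes no monotonicity assumption. The binary search simply needs fewer compile attempts.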
Example Usage
Measure throughput of all instructions by generating shaders with 16 instructions.
instrissue --size=512 --all --length=16
Measure throughput of MAD instructions using a shader with 32 MADs.
instrissue --size=512 --instruction=MAD --length=32
Measure throughput of the SIN instruction using the longest shader the hardware supports (capping length at 512).
instrissue --size=512 --instruction=SIN --maxlength --length=512
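When several instructions need to be measured with the same settings, the command lines can be assembled by a small script. The sketch below only builds argument lists and does not run anything. It uses the long option spellings from the usage listing, and the helper name `instrissue_argv` and the executable name passed as `exe` are assumptions that may need adjusting for a given install.

```python
import shlex

# Instruction names as printed in the tool's usage listing.
INSTRUCTIONS = [
    "ADD", "SUB", "MUL", "MAD", "EX2", "LG2", "POW", "FLR", "FRC",
    "RSQ", "RCP", "SIN", "COS", "SCS", "DP3", "DP4", "XPD",
]


def instrissue_argv(instruction, size=512, length=16, maxlength=False,
                    exe="instrissue"):
    """Build the argument list for one single-instruction run."""
    if instruction not in INSTRUCTIONS:
        raise ValueError(f"unknown instruction: {instruction}")
    argv = [exe, f"--size={size}", f"--instruction={instruction}",
            f"--length={length}"]
    if maxlength:
        argv.append("--maxlength")
    return argv


# Print one command per instruction, capping the shader length at 512.
for name in ("MAD", "SIN"):
    print(shlex.join(instrissue_argv(name, length=512, maxlength=True)))
```

Each list can be passed directly to `subprocess.run` to execute the test.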

Commandline Usage

Usage: gpubench\bin\instrissue.exe <options>
  -a, --all
            Test all instructions
  -c, --components=SIZE
            number of components for texture and
            render target. (4)
  -r, --render=STRING
            Specifies how to render the quad
            quad:     issues exact quad
            triangle: issues large triangle (default)
  -n, --nocomments
            No comments, just the facts.
  -v, --viewprogram
            Only output the program that is generated for the test.
  -i, --instruction=STRING
            Specifies which instruction to test.
            Can be: ADD SUB MUL MAD EX2 LG2 POW FLR FRC
                    RSQ RCP SIN COS SCS DP3 DP4 XPD
  -l, --length=SIZE
            number of instructions to put in fragment program.
  -s, --size=SIZE
            size of framebuffer.
  -f, --few
            Only run through a few important instructions.
  -m, --maxlength
            Find maximum number of instr allowed for valid shader.

GPUBench was developed at the Stanford University Graphics Lab.