GPUBench Test: Floatbandwidth

Description
Floatbandwidth measures input bandwidth from floating point textures. This is a powerful test that can reveal a lot about a GPU's memory subsystem and texture caches. The test generates shaders that fetch data using a variety of access patterns. Each shader performs a number of fetches, followed by a minimal number of math instructions that ensure every fetched value is used in generating the output value. This prevents driver optimizations from eliminating fetch instructions whose results would otherwise never be used.
Common Functionality
Many of the commandline options for this test are present in other GPUBench tests as well. Input textures are created with the same (SIZExSIZE) dimensions as the framebuffer, thus the test rasterizes SIZE*SIZE fragments. The --min and --max options set the range of sizes to be tested. Sizes advance in increments of --step or, if --exponential is specified, the texture/framebuffer size is doubled each time until the maximum size is reached. Supplying a range of values is useful for comparison as well as graph generation purposes.
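The size sweep described above can be sketched in a few lines of Python. This is only an illustration, not the test's own code: the function name sweep_sizes is hypothetical, and whether the tool clamps or skips a final size that overshoots --max is an assumption (here such a size is simply skipped).

```python
def sweep_sizes(min_size, max_size, step=1, exponential=False):
    """Return the texture/framebuffer sizes a run would visit.

    Assumption: a size that overshoots max_size is skipped, not clamped.
    """
    if min_size <= 0 or step <= 0 or max_size < min_size:
        raise ValueError("need 0 < min_size <= max_size and step > 0")
    sizes = []
    size = min_size
    while size <= max_size:
        sizes.append(size)
        # --exponential doubles the size; otherwise advance by --step.
        size = size * 2 if exponential else size + step
    return sizes


def fragments(size):
    """Number of fragments rasterized for a SIZExSIZE framebuffer."""
    return size * size
```

For instance, sweep_sizes(128, 1024, exponential=True) visits 128, 256, 512, and 1024, and the fragment count grows by a factor of four at each doubling.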
Textures with 1, 2, 3, or 4 components are permitted, as specified by the --components option. Rasterization of fragments can be controlled in two ways. --render specifies whether a screen-covering quad or a large triangle is rasterized to generate fragments. If --chunksize [NUM] is provided, multiple primitives are issued instead of a single one, each covering a NUMxNUM section of the screen. We hoped to emulate CPU-style blocking with this feature.
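As a rough picture of what chunked rendering covers, the sketch below enumerates NUMxNUM rectangles tiling a SIZExSIZE screen. The function name chunk_rects, the row-major issue order, and the clipping of edge chunks when SIZE is not a multiple of NUM are all assumptions made for illustration; the text above does not say how the test orders or clips its primitives.

```python
def chunk_rects(size, chunk):
    """List (x0, y0, x1, y1) rectangles covering a size x size screen.

    Assumptions: chunks are issued in row-major order and edge chunks
    are clipped to the screen when size is not a multiple of chunk.
    """
    if size <= 0 or chunk <= 0:
        raise ValueError("size and chunk must be positive")
    rects = []
    for y0 in range(0, size, chunk):
        for x0 in range(0, size, chunk):
            rects.append((x0, y0, min(x0 + chunk, size), min(y0 + chunk, size)))
    return rects
```

With a 512x512 screen and a chunk size of 128, this yields 16 rectangles that together cover every fragment exactly once.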
To reduce timing noise, specify that the test should be repeated a large number of times using --iters.
Test Specific Details
Floatbandwidth can measure both nondependent and dependent texturing (dependent texturing is enabled by --dependent). When dependent texturing is enabled, the shader first performs a fetch from an index texture to obtain a value that is used as the texture coordinate when accessing the input textures. When dependent texturing is disabled, lookups into the input textures are performed using an interpolated texture coordinate. --fetches specifies the number of unique input textures to access. When computing input bandwidth in dependent texturing mode, bytes read from the index texture are included in the bandwidth estimate. To obtain a more accurate value of the bandwidth from just the input textures, a larger number of fetches may therefore be needed.
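To see why more fetches dilute the index texture's contribution, the following sketch estimates what fraction of the bytes read per fragment comes from the index texture. It rests on assumptions not stated above: 32-bit float components, one index fetch per fragment, and an index texture with index_components components (4 by default here). The function name index_traffic_share is hypothetical.

```python
BYTES_PER_FLOAT = 4  # assumption: 32-bit floating point components


def index_traffic_share(fetches, components=4, index_components=4):
    """Fraction of per-fragment bytes that come from the index texture.

    Assumptions: one index fetch per fragment, and `fetches` reads of
    `components`-wide texels from the input textures.
    """
    if fetches < 1:
        raise ValueError("fetches must be at least 1")
    index_bytes = index_components * BYTES_PER_FLOAT
    input_bytes = fetches * components * BYTES_PER_FLOAT
    return index_bytes / (index_bytes + input_bytes)
```

Under these assumptions the index texture accounts for half of the traffic with a single fetch, and for one ninth of it with eight fetches.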
Use --access to specify the access pattern into the input textures. Valid patterns are single (each fragment accesses texel (0,0) from the input textures), seq (fragment (x,y) accesses texel (x,y) from each input texture), random (texture coordinates are randomly generated), and strided (texels of the input textures are skipped in a regular pattern). Random and strided access must be used with dependent texturing enabled.
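The pattern rules above can be summarized in a small sketch. The names check_access and texel_for are hypothetical, and the uniform random generator is an assumption, since the text does not say how the test produces its random coordinates. Strided access is left out here because it depends on the stride and sample counts.

```python
import random

PATTERNS = {"single", "seq", "random", "strided"}
DEPENDENT_ONLY = {"random", "strided"}


def check_access(access, dependent):
    """Reject pattern/mode combinations the description rules out."""
    if access not in PATTERNS:
        raise ValueError("unknown access pattern: " + access)
    if access in DEPENDENT_ONLY and not dependent:
        raise ValueError(access + " access requires dependent texturing")


def texel_for(access, x, y, size, rng=None):
    """Texel read by fragment (x, y) for the non-strided patterns."""
    if access == "single":
        return (0, 0)
    if access == "seq":
        return (x, y)
    if access == "random":
        rng = rng or random  # assumption: uniform coordinates
        return (rng.randrange(size), rng.randrange(size))
    raise ValueError("strided access also needs stride and sample counts")
```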
Just as a single program run can perform tests over a range of texture sizes, you can also perform tests over a range of strides in both the x and y dimensions. --minskipx and --maxskipx set this range in the x direction, and --minskipy and --maxskipy do the same in the y direction. If skip in the x direction is set to 5, and skip in the y direction is set to 2, then the following texels will be accessed:
(0,0)  (5,0)  (10,0) ...
(0,2)  (5,2)  (10,2) ...
(0,4)  (5,4)  (10,4) ...
However, since the input textures and the output framebuffer are the same size, the maximum number of unique texels accessed in each direction must be specified using --samplesx and --samplesy in order to prevent running off the end of the texture. In general, fragment (x,y) accesses texel:
( stridex*(x % samplesx), stridey*(y % samplesy) )
The stride and samples values must be selected so that accesses do not run off the end of the input textures. If they would, the test notifies the user and exits with an error.
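The addressing formula above translates directly into code, and a simple bounds check shows which stride and samples combinations are safe. The function names are hypothetical, and the exact condition the test uses to detect an overrun is an assumption; the check below only requires that the largest coordinate produced by the formula lies inside the texture.

```python
def strided_texel(x, y, stridex, stridey, samplesx, samplesy):
    """Texel read by fragment (x, y), following the formula above."""
    return (stridex * (x % samplesx), stridey * (y % samplesy))


def fits(size, stride, samples):
    """True if the largest coordinate, stride*(samples-1), is inside
    a texture of the given size.

    Assumption: the test may apply a stricter or different check.
    """
    if size <= 0 or samples < 1 or stride < 0:
        raise ValueError("need size > 0, samples >= 1, stride >= 0")
    return stride * (samples - 1) < size
```

With a 512x512 texture, a stride of 2 with 256 samples reaches at most coordinate 510, and a stride of 32 with 16 samples reaches at most 480, so both stay inside the texture. A stride of 4 with 256 samples would reach 1020 and run off the end.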
Example Usage
Measuring cache bandwidth of the system, using four 512x512 textures as input:
floatbandwidth -m 512 -x 512 -a single -f 4
Measuring streaming bandwidth of the system, in dependent texturing mode using two 512x512 textures as input:
floatbandwidth -m 512 -x 512 -a seq -d -f 2
Random access (GOPS bandwidth) from 8 textures:
floatbandwidth -m 512 -x 512 -a random -d -f 8
Accessing every other texel in the x and y directions:
floatbandwidth -m 512 -x 512 -a strided -d --minskipx 2 --maxskipx 2 --minskipy 2 --maxskipy 2 --samplesx 256 --samplesy 256
Measuring bandwidth using a range of strides in the x direction, always hitting the same texel in y:
floatbandwidth -m 512 -x 512 -a strided -d --minskipx 0 --maxskipx 32 --samplesx 16 --samplesy 1

Commandline Usage

Usage: gpubench\bin\floatbandwidth.exe <options>
  Options
  -m, --min=SIZE
             min quad size to test (default: 512)
  -x, --max=SIZE
             max quad size to test (512)
  -s, --step=SIZE
             step size from min to max (1)
  -c, --components=SIZE
            number of components for texture and
            render target. (4)
  -r, --render=STRING
            Specifies how to render the quad
            quad:     issues exact quad
            triangle: issues large triangle (default)
  -e, --exponential
            flag to turn on exponential stepping
  -k, --chunksize=SIZE
            Chunk the rendering to SIZExSIZE blocks
  -n, --nocomments
            No comments, just the facts.
  -i, --iters
            Number of times to repeat rendering in a test. (200)
  -a, --access=STRING
            Specifies data access pattern.
            single: same texel each time
            seq: streaming pattern (DEFAULT)
            random: random access (dependent texturing only)
            strided: strided access (dependent texturing only)
  -d, --dependent
            Dependent texturing mode.
  -j  --onlymaxfetch
            Do only number of fetches specified by -f.
  -f, --fetches=NUM
            Number of texture fetches to perform per shader. (1)
  -g, --minskipx=NUM
            Min skip in x direction. (0)
  -h, --maxskipx=NUM
            Max skip in x direction. (0)
  -o, --minskipy=NUM
            Min skip in y direction. (0)
  -p, --maxskipy=NUM
            Max skip in y direction. (0)
  -y, --samplesx=NUM
            Unique texture samples in x direction (per row of texture). (1)
  -z, --samplesy=NUM
            Unique texture samples in y direction (per col of texture). (1)
  -v  --viewfp
            View the fragment program being generated (don't run program).



GPUBench was developed at the Stanford University Graphics Lab.