GPUBench Test: Floatbandwidth
Description
Floatbandwidth measures input bandwidth from floating-point
textures. Its results can reveal a lot about a GPU's memory subsystem
and texture caches. The test generates shaders that fetch data using
a variety of access patterns. Each shader performs a number of
fetches, and then performs a minimal number of math instructions to
ensure that all fetched values are used in generating an output
value. This is done so that driver optimizations will not eliminate
fetch instructions whose results are never used.
Common Functionality
Many of the command-line options for this test are also present in
other GPUBench tests. Input textures are created with the same
(SIZExSIZE) dimensions as the framebuffer, so the test rasterizes
SIZE*SIZE fragments. The --min and --max options set the range of
sizes to be tested, in increments of --step or, if --exponential is
specified, by doubling the texture/framebuffer size each time until
the maximum size is reached. Supplying a range of values is useful
for comparison as well as for generating graphs.
Textures with 1, 2, 3, or 4 components are permitted, as specified by
the --components option. Rasterization of fragments can be controlled
in two ways. --render specifies whether a screen-covering quad or a
large triangle is rasterized to generate fragments. If
--chunksize [NUM] is provided, multiple primitives are issued instead
of a single primitive, each covering a NUMxNUM section of the screen.
We hoped to emulate CPU-style blocking with this feature.
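To illustrate chunked rendering, the sketch below lists the
rectangles that would tile a SIZExSIZE screen. The function name
chunk_rects is hypothetical, and both the row-major issue order and
the clipping of edge chunks are assumptions; the test itself may
handle sizes that are not a multiple of the chunk size differently.

```python
def chunk_rects(size, chunk):
    """Return (x0, y0, x1, y1) rectangles tiling a size x size screen.

    Hypothetical helper: chunks are listed row by row, and chunks at
    the right and bottom edges are clipped to the screen (assumed).
    """
    if size <= 0 or chunk <= 0:
        raise ValueError("size and chunk must be positive")
    rects = []
    for y0 in range(0, size, chunk):
        for x0 in range(0, size, chunk):
            x1 = min(x0 + chunk, size)
            y1 = min(y0 + chunk, size)
            rects.append((x0, y0, x1, y1))
    return rects
```

With a 512x512 screen and a chunk size of 128, this yields 16
primitives that together cover every fragment exactly once.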
To reduce timing noise, use --iters to specify that the test should
be repeated a large number of times.
Test Specific Details
Floatbandwidth can measure both nondependent and dependent texturing
(dependent texturing is enabled by --dependent). When dependent
texturing is enabled, the shader first performs a fetch from an index
texture to obtain a value that is used as the texture coordinate when
accessing the input textures. When dependent texturing is disabled,
lookups into the input textures are performed using an interpolated
texture coordinate. --fetches specifies the number of unique input
textures to access. When computing input bandwidth in dependent
texturing mode, bytes read from the index texture are included in the
bandwidth estimate. To obtain a more accurate value of the bandwidth
from just the input textures, a larger number of fetches may therefore
be needed, so that the index fetch makes up a smaller share of the
total.
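The effect of the index texture on the estimate can be shown with a
little arithmetic. The sketch below is illustrative only: the function
names are hypothetical, and it assumes 32-bit float components, one
index fetch per fragment, and a 4-component index texture, none of
which is stated by the test's documentation.

```python
BYTES_PER_FLOAT = 4  # assumes 32-bit float components


def bytes_per_fragment(fetches, components, dependent, index_components=4):
    """Return (input_bytes, index_bytes) read by one fragment.

    Hypothetical accounting: one index fetch per fragment in
    dependent mode, with an assumed index_components-wide texel.
    """
    data = fetches * components * BYTES_PER_FLOAT
    index = index_components * BYTES_PER_FLOAT if dependent else 0
    return data, index


def index_share(fetches, components, index_components=4):
    """Fraction of bytes in dependent mode that come from the index texture."""
    data, index = bytes_per_fragment(fetches, components, True, index_components)
    return index / (data + index)
```

Under these assumptions, the index texture accounts for half of the
bytes with one 4-component fetch, but only one ninth with eight
fetches, which is why a larger fetch count gives a figure closer to
the bandwidth of the input textures alone.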
Use --access to specify the access pattern into the input textures.
Valid patterns are single (each fragment accesses texel (0,0) from
the input textures), seq (fragment (x,y) accesses texel (x,y) from
each input texture), random (texture coordinates are randomly
generated), and strided (skipping texels of the input textures in a
regular pattern). Random and strided access must be used with
dependent texturing enabled.
Just as a single program run can perform tests over a range of texture
sizes, you can also perform tests over a range of strides in both the
x and y dimensions. --minskipx and --maxskipx set this range in the x
direction, and --minskipy and --maxskipy do the same in the y
direction. If skip in the x direction is set to 5, and skip in the
y direction is set to 2, then the following texels will be accessed:
(0,0) (5,0) (10,0) ...
(0,2) (5,2) (10,2) ...
(0,4) (5,4) (10,4) ...
However, since both the input textures and the output framebuffer are
the same size, in order to prevent running off the end of the texture,
the maximum number of unique texels accessed in each direction must be
specified using --samplesx and --samplesy.
In general, fragment (x,y) accesses texel:
( stridex*(x % samplesx), stridey*(y % samplesy) )
The stride and samples values must be selected so that accesses do
not run off the end of the input textures. If the chosen values would
run off the end, the test notifies the user and exits with an error.
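The mapping above, together with a bounds check of the kind the test
must perform, can be sketched as follows. The function names are
hypothetical, and the exact condition the test checks is an
assumption: here the largest texel index reached in each direction
must be smaller than the texture size.

```python
def strided_texel(x, y, stridex, stridey, samplesx, samplesy):
    """Texel accessed by fragment (x, y) under the strided pattern."""
    return (stridex * (x % samplesx), stridey * (y % samplesy))


def fits(size, stridex, stridey, samplesx, samplesy):
    """True if no access runs off a size x size texture.

    Assumed check: the largest index, stride * (samples - 1), must be
    less than size in both directions.
    """
    max_x = stridex * (samplesx - 1)
    max_y = stridey * (samplesy - 1)
    return max_x < size and max_y < size
```

With a skip of 5 in x and 2 in y, fragment (2,1) maps to texel (10,2),
matching the pattern listed earlier. On a 512x512 texture, a skip of 2
with 256 samples in each direction reaches texel 510 at most and fits,
whereas 257 samples would reach texel 512 and run off the end.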
Example Usage
Measuring cache bandwidth of the system, using four 512x512 textures as input:
floatbandwidth -m 512 -x 512 -a single -f 4
Measuring streaming bandwidth of the system, in dependent texturing
mode using two 512x512 textures as input:
floatbandwidth -m 512 -x 512 -a seq -d -f 2
Random access (GOPS bandwidth) from 8 textures:
floatbandwidth -m 512 -x 512 -a random -d -f 8
Accessing every other texel in the x and y directions:
floatbandwidth -m 512 -x 512 -a strided -d --minskipx 2 --maxskipx 2 --minskipy 2 --maxskipy 2 --samplesx 256 --samplesy 256
Measuring bandwidth using strides in the x direction, always hitting the same texel in y:
floatbandwidth -m 512 -x 512 -a strided -d --minskipx 0 --maxskipx 32 --samplesx 16 --samplesy 1
Commandline Usage
Usage: gpubench\bin\floatbandwidth.exe <options>
Options
-m, --min=SIZE
min quad size to test (default: 512)
-x, --max=SIZE
max quad size to test (512)
-s, --step=SIZE
step size from min to max (1)
-c, --components=SIZE
number of components for texture and
render target. (4)
-r, --render=STRING
Specifies how to render the quad
quad: issues exact quad
triangle: issues large triangle (default)
-e, --exponential
flag to turn on exponential stepping
-k, --chunksize=SIZE
Chunk the rendering to SIZExSIZE blocks
-n, --nocomments
No comments, just the facts.
-i, --iters
Number of times to repeat rendering in a test. (200)
-a, --access=STRING
Specifies data access pattern.
single: same texel each time
seq: streaming pattern (DEFAULT)
random: random access (dependent texturing only)
strided: strided access (dependent texturing only)
-d, --dependent
Dependent texturing mode.
-j --onlymaxfetch
Do only number of fetches specified by -f.
-f, --fetches=NUM
Number of texture fetches to perform per shader. (1)
-g, --minskipx=NUM
Min skip in x direction. (0)
-h, --maxskipx=NUM
Max skip in x direction. (0)
-o, --minskipy=NUM
Min skip in y direction. (0)
-p, --maxskipy=NUM
Max skip in y direction. (0)
-y, --samplesx=NUM
Unique texture samples in x direction (per row of texture). (1)
-z, --samplesy=NUM
Unique texture samples in y direction (per col of texture). (1)
-v --viewfp
View the fragment program being generated (don't run program).