GPUBench Test: Floatbandwidth
Floatbandwidth measures input bandwidth from floating
point textures. This is a powerful test that can reveal a lot about a
GPU's memory subsystem and texture caches. The test generates shaders
that fetch data using a variety of access patterns. The shaders
perform a number of fetches, and then perform a minimal number of math
instructions to ensure that all fetched values are used in generating
an output value. This is done so that driver optimizations will not
eliminate fetch instructions whose results are never used.
Many of the command-line options for this test are present in other
GPUBench tests as well. Input textures are created with the same
(SIZExSIZE) dimensions as the framebuffer, thus the test rasterizes
SIZE*SIZE fragments. The -m (minimum size) and -x (maximum size)
options set the range of sizes to be tested (in increments of
--step or, if
--exponential is specified,
texture/framebuffer size is doubled each time until reaching the
maximum size). Supplying a range of values is useful for comparison
as well as graph generation purposes.
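The size sweep described above can be illustrated with a minimal Python sketch. It is not taken from the GPUBench source: the function name sweep_sizes is hypothetical, and it assumes that the maximum size is included in the sweep when the stepping lands on it exactly.

```python
def sweep_sizes(min_size, max_size, step=1, exponential=False):
    """Return the texture/framebuffer sizes a single run would visit.

    Assumption: the sweep starts at min_size and never exceeds max_size.
    """
    if min_size <= 0 or step <= 0:
        raise ValueError("min_size and step must be positive")
    sizes = []
    size = min_size
    while size <= max_size:
        sizes.append(size)
        # Exponential stepping doubles the size; otherwise add the step.
        size = size * 2 if exponential else size + step
    return sizes


print(sweep_sizes(128, 512, step=128))         # linear stepping
print(sweep_sizes(64, 512, exponential=True))  # doubling
```

Exponential stepping covers a wide range of sizes with few test points, which is usually enough to see where throughput changes as the working set outgrows a cache.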
Textures with 1, 2, 3, or 4 components are permitted, as specified by the
--components option. Rasterization of fragments can be
controlled in two ways:
--render specifies whether a
screen-covering quad or a large triangle is rasterized to generate
fragments. If --chunksize [NUM] is provided, instead of
a single primitive being drawn, multiple primitives are issued, each
covering a NUMxNUM section of the screen. We hoped to emulate
CPU-style blocking with this feature.
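As a rough illustration of chunked rendering, the following Python sketch enumerates the NUMxNUM screen sections that would each be covered by one primitive. The function name chunk_rects is hypothetical, and clamping the edge chunks when SIZE is not a multiple of NUM is an assumption; the actual test may handle that case differently.

```python
def chunk_rects(size, chunk):
    """Return (x, y, width, height) for each chunk covering a size x size screen.

    Assumption: chunks are issued row by row, and chunks on the right and
    bottom edges are clamped to the screen.
    """
    if size <= 0 or chunk <= 0:
        raise ValueError("size and chunk must be positive")
    rects = []
    for y in range(0, size, chunk):
        for x in range(0, size, chunk):
            rects.append((x, y, min(chunk, size - x), min(chunk, size - y)))
    return rects


rects = chunk_rects(512, 128)
print(len(rects))  # number of primitives issued instead of a single one
```

Every fragment is still rasterized exactly once; only the order in which regions of the screen are visited changes.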
To reduce timing noise, specify that the test should be repeated a
large number of times using the repeat-count option (200 repetitions by default).
Test Specific Details
Floatbandwidth can measure both nondependent and dependent
texturing (dependent texturing is enabled with -d).
When dependent texturing is enabled, the shader first performs a fetch
from an index texture to obtain a value that is used as the texture
coordinate when accessing the input textures. When dependent
texturing is disabled, lookups into the input textures are performed
using an interpolated texture coordinate.
-f specifies the number of unique input textures to access. When
computing input bandwidth in dependent texturing mode, bytes read from
the index texture are included in the bandwidth estimate. To
obtain a more accurate value of the bandwidth from just the input
textures, a larger number of fetches may need to be used.
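To see why more fetches give a better estimate, the Python sketch below computes the fraction of bytes read per fragment that comes from the input textures. Everything numeric here is an assumption made for illustration: one index fetch per fragment, 4 bytes per floating point component, and a 16-byte index texel. The function name input_share is hypothetical.

```python
def input_share(fetches, components=4, bytes_per_component=4,
                index_texel_bytes=16):
    """Fraction of per-fragment bytes that come from the input textures.

    Assumptions: one index-texture fetch per fragment, 32-bit float
    components, and a 16-byte index texel.
    """
    if fetches < 1:
        raise ValueError("fetches must be at least 1")
    input_bytes = fetches * components * bytes_per_component
    return input_bytes / (input_bytes + index_texel_bytes)


for f in (1, 2, 4, 8):
    print(f, round(input_share(f), 3))
```

Under these assumptions a single fetch attributes only half of the measured bytes to the input textures, and the share approaches 1 as the number of fetches grows.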
Use -access to specify the access pattern into the input
textures. Valid patterns are
single (each fragment
accesses texel (0,0) from the input textures),
sequential (fragment (x,y) accesses texel (x,y) from each input texture),
random (texture coordinates are randomly generated), and
strided (skipping texels of the input textures in a
regular pattern). Random and strided access must be used with
dependent texturing enabled.
Just as a single program run can perform tests over a range of texture
sizes, you can also perform tests over a range of strides in both the
x and y dimensions.
-minskipx and -maxskipx set this range in the x direction, and
-minskipy and -maxskipy do the same in the
y direction. If skip in the x direction is set to 5, and skip in the
y direction is set to 2, then the following texels will be accessed:
(0,0) (5,0) (10,0) ...
(0,2) (5,2) (10,2) ...
(0,4) (5,4) (10,4) ...
However, since both the input textures and the output framebuffer are
the same size, in order to prevent running off the end of the texture,
the maximum number of unique pixels accessed in each direction must be
limited with -samplesx and -samplesy.
In general, fragment (x,y) accesses texel:
( stridex*(x % samplesx), stridey*(y % samplesy) )
The stride and samples values must be selected so that running off the end of the
input textures is avoided. If they are not, the test notifies the user
and exits with an error.
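The strided addressing rule can be written out as a short Python sketch. The texel formula is the one given in the text; the bounds check in fits is an assumption about what "running off the end" means (the largest texel coordinate touched must stay below the texture size) and is not copied from the test's source. Both function names are hypothetical.

```python
def strided_texel(x, y, stridex, stridey, samplesx, samplesy):
    """Texel accessed by fragment (x, y) under strided access."""
    return (stridex * (x % samplesx), stridey * (y % samplesy))


def fits(size, stride, samples):
    """Assumed validity check: the last unique sample lies inside the texture."""
    return stride * (samples - 1) < size


# Skip of 5 in x and 2 in y, as in the texel listing in the text.
print(strided_texel(1, 0, 5, 2, 100, 100))  # second texel of the first row
print(strided_texel(0, 1, 5, 2, 100, 100))  # first texel of the second row

# Every other texel of a 512x512 texture with 256 samples per direction.
print(fits(512, 2, 256))
```

With a stride of 32 and 16 samples the largest x coordinate is 480, which fits in a 512-wide texture; a 17th sample would land on coordinate 512 and fail the check.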
Measuring cache bandwidth of the system, using 4 512x512 textures as input:
floatbandwidth -m 512 -x 512 -a single -f 4
Measuring streaming bandwidth of the system, in dependent texturing
mode using 2 512x512 textures as input:
floatbandwidth -m 512 -x 512 -a sequential -d -f 2
Random access (GOPS bandwidth) from 8 textures.
floatbandwidth -m 512 -x 512 -a random -d -f 8
Access every other texel in the x and y directions
floatbandwidth -m 512 -x 512 -a strided -d -minskipx 2 -maxskipx 2 -minskipy 2 -maxskipy 2 -samplesx 256 -samplesy 256
Measuring bandwidth using strides in the x direction, always hitting same texel in y
floatbandwidth -m 512 -x 512 -a strided -d -minskipx 0 -maxskipx 32 -samplesx 16 -samplesy 1
Usage: gpubench\bin\floatbandwidth.exe <options>
min quad size to test (default: 512)
max quad size to test (512)
step size from min to max (1)
number of components for texture and
render target. (4)
Specifies how to render the quad
quad: issues exact quad
triangle: issues large triangle (default)
flag to turn on exponential stepping
Chunk the rendering to SIZExSIZE blocks
No comments, just the facts.
Number of times to repeat rendering in a test. (200)
Specifies data access pattern.
single: same texel each time
seq: streaming pattern (DEFAULT)
random: random access (dependent texturing only)
strided: strided access (dependent texturing only)
Dependent texturing mode.
Do only number of fetches specified by -f.
Number of texture fetches to perform per shader. (1)
Do only number of fetches specified by -f.
Min skip in x direction. (0)
Max skip in x direction. (0)
Min skip in y direction. (0)
Max skip in y direction. (0)
Unique texture samples in x direction (per row of texture). (1)
Unique texture samples in y direction (per col of texture). (1)
View the fragment program being generated (don't run program).