GPUBench Test: Fetchcosts
Fetchcosts is yet another bandwidth analysis test. It
generates shaders that can reveal the cost (in terms of number of MAD
instructions) of texture fetches when accessing data in a variety of
patterns. The goal of this test is to allow the user to specify an
access pattern that is similar to that of a hypothetical or real life
shader program, and to determine roughly how many instructions a
shader would need to perform in order to be compute limited. Thus,
given a texture access type, and a shader's ratio of texturing
instructions to arithmetic ones, it becomes possible to estimate the
performance of a shader.
--size specifies the size of the framebuffer and
input textures. At this time only 4-component textures and buffers
--render specifies whether a
screen covering quad or large triangle is rasterized to generate
To reduce timing noise, specify that the test should be repeated a
large number of times using
Test Specific Details
Fetchcosts shaders begin by performing a number of texture
fetches. It is easy to specify complex dependent texturing scenarios.
--dependentlevels is used to specify how many levels of
dependent texturing to use.
The --fetches option specifies
the number of fetches to perform at each dependent level, thus
(dependentlevels+1)*fetches texture lookups are done in
the shader. The value obtained from the first lookup on each
dependent level is used as the coordinate for lookups into textures at
the next level. Using
--access, the texture access pattern can be
single (single access, texel 0,0),
seq (sequential reading), or
random. The values stored
in textures are set so that these access patterns apply to dependent
texture fetches as well (e.g. when in random access mode, random values
are stored in the textures). However, random access will never
occur in the first level of texturing. Instead of accessing 'fetches'
unique textures in each level, 'fetches' accesses from the same
texture will be performed when
The texture accesses form the first part of the generated shader. The
texturing instructions are followed by a series of MAD instructions to
create a shader whose total length matches the specified instruction
count. The -m and -x options (minimum and maximum instruction count) set
the range of lengths of shaders to test in a particular test run.
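
The instruction budget described above can be worked through numerically. The following is a minimal sketch, not part of GPUBench: the function name shader_layout is hypothetical, and it assumes the shader contains only texture fetches and MAD instructions, so that the MAD count is the total length minus the fetch count.

```python
def shader_layout(dependent_levels, fetches, total_instructions):
    """Return (texture_fetches, mad_instructions) for one generated shader.

    Assumption: the shader holds nothing but texture fetches followed by
    MAD instructions, and the total length includes the fetches.
    """
    texture_fetches = (dependent_levels + 1) * fetches
    if total_instructions < texture_fetches:
        raise ValueError("instruction count is smaller than the number of fetches")
    return texture_fetches, total_instructions - texture_fetches


if __name__ == "__main__":
    # 3 dependent levels, 3 fetches per level, 30 instructions in total.
    print(shader_layout(3, 3, 30))
```

Under these assumptions, 3 dependent levels with 3 fetches per level and a 30-instruction shader give 12 texture fetches followed by 18 MAD instructions.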
When the number of MAD instructions grows long enough, the shader will
be compute limited, and running time will be a function of the number
of instructions. When the instruction count is short enough that the
shader is bandwidth limited, execution time remains largely dependent
on the type and number of texturing operations performed. As an example,
in the graph below, it is very clear when the shader passes the
threshold of being bandwidth limited to being compute limited. Each
of the 4 lines corresponds to a different number of texture accesses.
As expected, it takes more instructions to become compute limited in
the case where more textures are accessed at the beginning of the
shader.
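
One way to locate that threshold in measured data is to look for the first shader length whose running time rises noticeably above the flat, bandwidth-limited plateau. The sketch below is an illustration only: the name find_compute_limited_threshold, the 5% tolerance, and the sample timings (generated from a made-up model, not measured on any GPU) are all assumptions. It also assumes the shortest shader in the run is bandwidth limited.

```python
def find_compute_limited_threshold(samples, tolerance=0.05):
    """Find where running time starts to grow with instruction count.

    samples: list of (instruction_count, time) pairs, sorted by count.
    The time of the shortest shader is taken as the bandwidth-limited
    plateau. Returns the first instruction count whose time exceeds that
    plateau by more than `tolerance` (a fraction), or None if none does.
    """
    if not samples:
        return None
    plateau = samples[0][1]
    for count, time in samples:
        if time > plateau * (1.0 + tolerance):
            return count
    return None


if __name__ == "__main__":
    # Hypothetical timings: a fixed fetch cost of 20 units, and 1 unit per
    # instruction once the shader is compute limited.
    fake = [(n, max(20.0, float(n))) for n in range(5, 61, 5)]
    print(find_compute_limited_threshold(fake))
```

Running the same function on each line of a plot like the one described would give one threshold per texture-access count, which makes the "more fetches, later threshold" trend easy to tabulate.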
Generates a shader with 30 instructions, with 3 levels of dependent
texturing, 3 fetches per level using random access:
fetchcosts -s 512 -a random -f 3 -d 3 -m 30 -x 30
Runs tests on shaders ranging from length 5 to 60, performing
sequential access from 5 unique textures:
fetchcosts -s 512 -a seq -f 5 -d 0 -m 5 -x 60
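
When several configurations need to be compared, command lines like the two above can be generated programmatically. This sketch only assembles argument lists from the short flags that appear in those examples (-s, -a, -f, -d, -m, -x); the helper name fetchcosts_args and the default executable name are assumptions, and nothing is executed.

```python
def fetchcosts_args(size, access, fetches, dependent_levels,
                    min_instr, max_instr, executable="fetchcosts"):
    """Build an argument list using the flags shown in the examples."""
    if access not in ("single", "seq", "random"):
        raise ValueError("access must be 'single', 'seq' or 'random'")
    if min_instr > max_instr:
        raise ValueError("min_instr must not exceed max_instr")
    return [
        executable,
        "-s", str(size),
        "-a", access,
        "-f", str(fetches),
        "-d", str(dependent_levels),
        "-m", str(min_instr),
        "-x", str(max_instr),
    ]


if __name__ == "__main__":
    # Sweep the number of fetches while keeping everything else fixed.
    for fetches in (1, 2, 4, 8):
        print(" ".join(fetchcosts_args(512, "seq", fetches, 0, 5, 60)))
```

The returned list can be passed directly to Python's subprocess.run if the tool should actually be launched.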
Usage: gpubench\bin\fetchcosts.exe <options>
Specifies how to render the quad
quad: issues exact quad
triangle: issues large triangle (default)
No comments, just the facts.
Only output the program that is generated for the test.
size of framebuffer (512)
Specifies data access pattern.
single: same texel each time
seq: streaming pattern (DEFAULT)
random: random access (dependent texturing only)
Number of times to repeat rendering in a test. (100)
Perform each fetch from the same texture.
minimum number of instructions to test. This is a count
of the total number of instructions in the shader,
including texture fetches.
maximum number of instructions to test. Will test programs
from mininstr to maxinstr in length.
number of dependent texturing levels. The value obtained from
the first texture in a level will serve as the index for
Number of fetches to perform per dependent texturing level, or
the *total* number of fetches if the number of dependent levels