Direct3D 9 API Interceptor
This is a program that intercepts all Direct3D 9 (D3D9) commands issued by a running application. These commands are then dispatched to a separate program that interprets and responds to them. The commands can be forwarded to the operating system, discarded, or modified as desired.
The idea that enables this interception is that Direct3D (like most Windows APIs) is dynamically linked. Furthermore, Windows first searches for dynamic link libraries (DLLs) in the application's local directory. Therefore, to intercept calls to the D3D9 dynamic link library, d3d9.dll, all that is required is to make a custom version of d3d9.dll and put it in the application's local directory, which is exactly what this framework does. All intercepted calls are passed to a separate DLL, d3d9Callback.dll. This DLL decides what actions to take as a result of the command stream. It is also allowed to modify or discard the command stream before the interceptor sends it to the "real" version of d3d9.dll that was the intended target of these calls, which dispatches the commands to the graphics card.
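The forward, discard, or modify decision described above can be sketched in portable C++. This is only an illustration under stated assumptions: `DrawCommand`, `Verdict`, `RealLibrary`, and `Interceptor` are hypothetical names invented for this sketch, not the types used by the interceptor or by d3d9Callback.dll, and a plain function object stands in for the callback DLL.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one intercepted command; real D3D9 calls carry
// many more parameters.
struct DrawCommand {
    unsigned primitiveCount;
};

// What the callback asks the interceptor to do with a command.
enum class Verdict { Forward, Discard };

// Stand-in for d3d9Callback.dll: it may edit the command in place and it
// decides whether the command continues to the real library.
using Callback = std::function<Verdict(DrawCommand&)>;

// Stand-in for the "real" d3d9.dll: records what actually reached it.
struct RealLibrary {
    std::vector<DrawCommand> received;
    void Draw(const DrawCommand& c) { received.push_back(c); }
};

// The interceptor sits between the application and the real library.
class Interceptor {
public:
    Interceptor(RealLibrary& real, Callback cb)
        : real_(real), cb_(std::move(cb)) {
        assert(cb_ && "a callback is required");
    }

    // Called by the application in place of the real Draw.
    void Draw(DrawCommand c) {
        // The callback sees the command first and may have modified it.
        if (cb_(c) == Verdict::Forward) {
            real_.Draw(c);
        }
    }

private:
    RealLibrary& real_;
    Callback cb_;
};
```

A callback that returns `Verdict::Discard` for empty draws and caps `primitiveCount` on large ones would exercise all three behaviours: some commands never reach `RealLibrary`, some arrive changed, and the rest pass through untouched.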
D3D is a very object-based system. Almost all API calls are made by calling a member function of an existing object, which may in turn return a new object. To be able to intercept all API calls, the interceptor needs to create a wrapper object for every D3D object created by the OS version of d3d9.dll, which we will call a "base object". All objects that the host application has access to are wrapper objects, and every wrapper object has a pointer to the base object it is wrapping. Namespaces are used to differentiate between wrapper and base objects: wrapper objects are in the D3D9Wrapper namespace and base objects are in a separate base namespace. As an example, the host application calling CreateTexture looks something like this:
- The app has an object of type D3D9Wrapper::IDirect3DDevice9 and invokes its CreateTexture method
- The D3D9Wrapper::IDirect3DDevice9 is wrapping a base device, _BaseDevice, and calls _BaseDevice->CreateTexture(...), producing a new base texture
- A new D3D9Wrapper::IDirect3DTexture9 is created and wrapped around the resulting base texture
- The created texture is reported to d3d9Callback.dll
- The D3D9Wrapper::IDirect3DTexture9 object is returned to the app
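The CreateTexture steps above can be sketched with simplified stand-in types. This is a minimal illustration, not the interceptor's actual code: the `Base` and `Wrapper` namespaces, the `ReportFn` callback, and the two-argument `CreateTexture` are assumptions made for the sketch. The real D3D9 interfaces are COM objects with reference counting, and the real `CreateTexture` takes more parameters and returns an HRESULT.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Simplified stand-ins for objects owned by the real d3d9.dll.
// "Base" is a placeholder namespace name used only in this sketch.
namespace Base {
struct Texture {
    unsigned width, height;
};
struct Device {
    std::unique_ptr<Texture> CreateTexture(unsigned w, unsigned h) {
        return std::make_unique<Texture>(Texture{w, h});
    }
};
}  // namespace Base

namespace Wrapper {
// Wrapper texture: owns the base texture it wraps.
class Texture {
public:
    explicit Texture(std::unique_ptr<Base::Texture> base)
        : base_(std::move(base)) {
        assert(base_ != nullptr);
    }
    unsigned Width() const { return base_->width; }
    unsigned Height() const { return base_->height; }

private:
    std::unique_ptr<Base::Texture> base_;
};

// Hypothetical stand-in for reporting a new texture to the callback DLL.
using ReportFn = std::function<void(const Texture&)>;

// Wrapper device: the only device object the application ever sees.
class Device {
public:
    Device(Base::Device* base, ReportFn report)
        : base_(base), report_(std::move(report)) {
        assert(base_ != nullptr);
    }

    std::unique_ptr<Texture> CreateTexture(unsigned w, unsigned h) {
        // Forward the call to the wrapped base device.
        auto baseTexture = base_->CreateTexture(w, h);
        // Wrap the resulting base texture.
        auto wrapped = std::make_unique<Texture>(std::move(baseTexture));
        // Report the new texture to the callback, if one is installed.
        if (report_) report_(*wrapped);
        // Hand the wrapper, never the base object, back to the app.
        return wrapped;
    }

private:
    Base::Device* base_;
    ReportFn report_;
};
}  // namespace Wrapper
```

Because the application only ever holds `Wrapper` objects, every later call it makes on the texture also passes through interceptor code.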
To make it easier to interface with the application, the interceptor supports a simple console overlay for the application. This can display anything from simple things like the FPS to complicated things that describe what d3d9Callback.dll considers to be the current screen state. This can help rapidly track down any number of inconsistencies between the state of the host application and what the interceptor thinks the current state is. As an example, a screenshot taken when I am running my Starcraft 2 AI in debug mode is shown below. The console lets me rapidly see what the AI wants to do, what actions it thinks succeeded or failed, a list of actions it has taken recently, etc.
By itself the interceptor does nothing, but the ability to intercept the graphics stream enables a variety of applications. At a high level, most of my interceptor applications start by understanding the meaning of relevant render calls in the host application and inferring some general principles about each frame. For example, in Spider Solitaire, it learns what cards are currently displayed on the screen and where they are located. This can be a challenging task, but fortunately only important render calls need to be understood; the specifics of how weather effects are rendered in Starcraft 2 are unlikely to impact any decisions that an AI might make and can be safely ignored. After the interceptor application understands what's going on in each frame, there are a variety of ways that knowledge can be used. Below are some examples:
- Object Insertion — New objects can be inserted into the environment. For example, in World of Warcraft, a 3D character could be inserted that tells you the direction your character should go to complete their current quest, or where you need to stand during a certain phase in a boss fight. Enemy players might have an icon placed over their heads to help identify their class at a large distance. In Starcraft 2, a giant red ball of fire might be put on the ground where a nuclear missile is going to land.
- Object Modification — Existing objects in the environment can have their rendering modified. For example, in World of Warcraft certain graphical effects such as pools of acid on the ground can be changed to be solid neon green, making them much easier to avoid. In Starcraft 2, allied and enemy units could be fully tinted with their player colors to make allies and enemy units more noticeable. Cloaked units might be modified to be a solid color instead of just a background shimmer.
- Automation — By sending keyboard and mouse commands back to the host application, the host application can be partially or fully automated. This is pretty self-explanatory. In World of Warcraft it could automatically join and play battlegrounds, arbitrage the auction house, or heal allies in a raid. In Starcraft 2 it could automatically play a specific custom map, or multiplayer games. This approach to automation is rather unconventional; most automation programs ("bots") function by reading data directly from the application's virtual address space. Such an approach is much more efficient and often less work; however, it is sensitive to the application's layout in memory and is comparatively easy to patch against. A graphics-based approach is closer to a human user, easier to generalize between different applications, and less sensitive to changes to the game's code.
- Performance Tests — Graphics is almost invariably the performance bottleneck of high-tier video games, and low frame rates result in choppy and unpleasant play. Thus making video games run quickly is important to developers. Unfortunately, well designed graphics pipelines, especially those that compile to multiple platforms, often hide the underlying API from developers, which can sometimes make it difficult to see when inefficient operations such as very small triangles, vertex buffer locks, or redundant state changes are occurring. The interceptor, on the other hand, serves as a ground truth record of all graphics API calls (PIX for Windows, which comes with the DirectX SDK, can perform a similar function, but in my experience can be rather unstable). The interceptor can also be used to modify the pipeline and observe the results; for example, all textures could be clamped at 128x128 to test whether texture fetch on the GPU is a significant bottleneck.
- Content Capture and Playback — The interceptor has access to all the graphics objects used by the application. This makes it a simple matter to store all the textures, geometry, and shaders used by the application into separate model files for other purposes. Taking this one step further, all content for a frame could be dumped to a file to allow playback of an entire frame for offline analysis, without having to constantly replay the game to a certain situation. Again, this functionality is similar to that offered by PIX for Windows.
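As a small illustration of the texture-clamping experiment mentioned under Performance Tests, the size adjustment itself might look like the sketch below. `TextureSize` and `ClampTextureSize` are hypothetical names introduced here. A working version would apply this inside the intercepted texture-creation call and would also have to cope with mip levels and with the application uploading data sized for the original dimensions, none of which is shown.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the width/height arguments of a texture
// creation call.
struct TextureSize {
    unsigned width, height;
};

// Returns the size to forward to the real library: each dimension is
// limited to maxDim, and smaller textures are left unchanged.
inline TextureSize ClampTextureSize(TextureSize requested,
                                    unsigned maxDim = 128) {
    assert(maxDim > 0);
    return TextureSize{std::min(requested.width, maxDim),
                       std::min(requested.height, maxDim)};
}
```

Comparing frame times with and without the clamp then gives a rough indication of how much texture fetch contributes to the frame cost.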
Below are some very simple examples of these kinds of modifications for Starcraft 2. All of these use texture replacement, which, it should be kept in mind, is just one of countless possible modifications.
Player Color Tinting — When rendering units, texture index 0 is the unit's texture and texture index 1 is the unit's player color mask. By replacing texture 1 with a solid texture, we can tint the units with their color, which can help more easily point out which units are allied and which are enemies, especially in tightly packed fights:
Checkerboarding — Here texture 0 for units has been replaced with a green-white checkerboard texture.
Fire — Here texture 1 for all render events has been replaced with this fire texture.
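The three texture replacements above all come down to swapping the texture bound to a given stage before the call reaches the real device. The sketch below shows that idea with stand-in types. `Texture`, `BaseDevice`, and `TintingDevice` are hypothetical names for this example, and the sketch replaces every non-null stage 1 binding (closest to the Fire example), whereas tinting only units would additionally require recognizing which render calls draw units.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a texture object.
struct Texture {
    int id;
};

// Stand-in for the real device: remembers which texture is bound to each
// texture stage.
struct BaseDevice {
    std::map<unsigned, const Texture*> bound;
    void SetTexture(unsigned stage, const Texture* t) { bound[stage] = t; }
};

// Wrapper device that substitutes a replacement texture on one stage.
class TintingDevice {
public:
    TintingDevice(BaseDevice& base, const Texture* replacement)
        : base_(base), replacement_(replacement) {
        assert(replacement_ != nullptr);
    }

    // Called by the application in place of the real SetTexture.
    void SetTexture(unsigned stage, const Texture* t) {
        // Stage 1 carries the player color mask according to the text.
        // Null bindings (unbinding a stage) are passed through unchanged.
        const bool replace = (stage == 1 && t != nullptr);
        base_.SetTexture(stage, replace ? replacement_ : t);
    }

private:
    BaseDevice& base_;
    const Texture* replacement_;
};
```

Changing the stage test from 1 to 0 and supplying a checkerboard texture would give the Checkerboarding variant under the same assumptions.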
My Starcraft 2 AI is the main reason I wrote this interceptor and the most successful and polished project that uses the framework. Some other projects I have used it for include:
- Warcraft 3 AI — While I was waiting for Starcraft 2 to come out I wrote an AI for Warcraft 3 for practice and to see what kind of problems I would encounter (Warcraft 3 uses Direct3D 8, but the idea is exactly the same.) It only did one build with one race, but it was a good starting point for the Starcraft 2 AI and I carried many of the successful ideas into my design for Starcraft 2, such as the AI's threading model and the way mouse and keyboard input is handled.
- World of Warcraft Pathing — Trying to do pathing in a game like World of Warcraft from the underlying geometry is very complicated. Using the interceptor, I built up a model of all the local geometry the player has ever seen. Each frame, the currently observed geometry is matched against the internal map to both update the map and figure out where the player is. This map is then displayed in a separate 3D window, and you can click on a location in this environment and the bot will attempt to path the player to this location. Briefly, it works by laying a fine grid over the geometry, estimating which nodes can connect to which other nodes by looking at the local slope, then running A* and doing some basic simplifications on the result. Ultimately I wanted this to be able to path even in architecturally complex environments like cities and automatically build up a complete 3D map of the world, but unfortunately pathing is very challenging in the presence of buildings, lag, and enemies. While this application was useful to write and worked passably well, it often got stuck in loops — for example, it would take many tries to get up the Ironforge ramp without falling off.
- Content Capture — This program saves all graphics content in a game to a file. This data is then datamined as part of my graduate thesis to build a probability model over "real" scenes, which can ultimately be used to help artists model scenes faster (see this paper). It can also act just like PIX for Windows; when you press a button it captures all render calls for the next frame for later playback and analysis. Unfortunately the content capture framework can only capture models and textures that actually go through the graphics pipeline; it will miss objects that are not rendered because they are occlusion culled by the application or are too distant to be viewed. Below is an example of the system capturing Dalaran to a scene file, then rendering it in an offline scene viewer:
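Returning to the World of Warcraft pathing project in the list above, the grid, slope test, and A* search can be sketched as follows. This is a minimal version under several assumptions of mine: the geometry has already been reduced to a rectangular height grid, two 4-connected cells are linked when their height difference is at most `maxStep`, every step costs 1, and the path simplification step is omitted. `Cell` and `FindPath` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Cell = std::pair<int, int>;  // (row, col)

// A* over a rectangular height grid. Returns the cells from start to goal
// inclusive, or an empty vector if the goal cannot be reached.
std::vector<Cell> FindPath(const std::vector<std::vector<double>>& height,
                           Cell start, Cell goal, double maxStep) {
    const int rows = static_cast<int>(height.size());
    assert(rows > 0);
    const int cols = static_cast<int>(height[0].size());
    auto inside = [&](Cell c) {
        return c.first >= 0 && c.first < rows && c.second >= 0 && c.second < cols;
    };
    assert(inside(start) && inside(goal));
    auto index = [&](Cell c) { return c.first * cols + c.second; };
    // Manhattan distance: admissible for 4-connected moves of cost 1.
    auto heuristic = [&](Cell c) {
        return std::abs(c.first - goal.first) + std::abs(c.second - goal.second);
    };

    std::vector<int> cost(rows * cols, -1);    // steps from start, -1 = unseen
    std::vector<int> parent(rows * cols, -1);  // previous cell on best path
    using Entry = std::pair<int, int>;         // (cost + heuristic, cell index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    cost[index(start)] = 0;
    open.push({heuristic(start), index(start)});

    const int dr[] = {1, -1, 0, 0};
    const int dc[] = {0, 0, 1, -1};
    while (!open.empty()) {
        const int cur = open.top().second;
        open.pop();
        const Cell c{cur / cols, cur % cols};
        if (c == goal) break;
        for (int k = 0; k < 4; ++k) {
            const Cell n{c.first + dr[k], c.second + dc[k]};
            if (!inside(n)) continue;
            // Slope test: skip neighbours that are too steep to walk onto.
            const double dh = height[n.first][n.second] - height[c.first][c.second];
            if (std::fabs(dh) > maxStep) continue;
            const int newCost = cost[cur] + 1;
            if (cost[index(n)] != -1 && cost[index(n)] <= newCost) continue;
            cost[index(n)] = newCost;
            parent[index(n)] = cur;
            open.push({newCost + heuristic(n), index(n)});
        }
    }

    std::vector<Cell> path;
    if (cost[index(goal)] == -1) return path;  // goal never reached
    for (int i = index(goal); i != -1; i = parent[i]) {
        path.push_back({i / cols, i % cols});
    }
    std::reverse(path.begin(), path.end());
    return path;
}
```

On a grid with a tall ridge between start and goal, the returned path detours through whatever gap in the ridge passes the slope test, and the function returns an empty path when no such gap exists.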
This code is all based off my BaseCode. Specifically you will need the contents of Includes.zip on your include path, Libraries.zip on your library path, and DLLs.zip on your system path. As mentioned above, this code does nothing with the intercepted calls; it just passes them on to d3d9Callback.dll, which changes depending on the target application. See the Starcraft 2 AI for a project that produces d3d9Callback.dll. The wrapper uses a simple configuration file, WrapperParameters.txt, which needs to be in the same directory as the DLL.
D3D9Interceptor.zip (includes project file)
D3D9Interceptor Code Listing
Config.h, Web Version
d3d9Callback.h, Web Version
d3d9CallbackEmpty.h, Web Version
d3d9CallbackStructures.h, Web Version
d3d9Wrapper.cpp, Web Version
d3d9Wrapper.h, Web Version
Direct3D9Functions.h, Web Version
Direct3DBaseTexture9Functions.h, Web Version
Direct3DDevice9Functions.h, Web Version
Direct3DIndexBuffer9Functions.h, Web Version
Direct3DResource9Functions.h, Web Version
Direct3DSurface9Functions.h, Web Version
Direct3DSwapChain9Functions.h, Web Version
Direct3DTexture9Functions.h, Web Version
Direct3DVertexBuffer9Functions.h, Web Version
Engine.cpp, Web Version
Engine.h, Web Version
Globals.cpp, Web Version
Globals.h, Web Version
Main.h, Web Version
Overlay.cpp, Web Version
Overlay.h, Web Version
PointerSet.h, Web Version
Total lines of code: 4110