Navigation: Up, Table of Contents, Bibliography, Index, Title Page

Search Structures

Introduction

This chapter presents the CGAL range tree, segment tree, and KD-tree data structures. The range tree is theoretically superior to the KD-tree, but the latter often seems to perform better. However, the range tree as implemented in CGAL is more flexible than the KD-tree implementation, in that it enables to layer together range trees and segment trees in the same data structure.

First the range and segment trees are introduced in sections reference through reference. Then the KD-tree is presented in Sections reference through reference.

Range and Segment Trees

This section presents d-dimensional range and segment trees. A one-dimensional range tree is a binary search tree on one-dimensional point data. Here we call all one-dimensional data types having a strict ordering (like integer and double) point data. d-dimensional point data are d-tuples of one-dimensional point data.

A one-dimensional segment tree is a binary search tree as well, but with one-dimensional interval data as input data. One-dimensional interval data is a pair (i.e., 2-tuple) (a,b), where a and b are one-dimensional point data of the same type and a< b. The pair (a,b) represents a half open interval [a,b). Analogously, a d-dimensional interval is represented by a d-tuple of one-dimensional intervals.

The input data type for a d-dimensional tree is a container class consisting of a d-dimensional point data type, interval data type or a mixture of both, and optionally a value type, which can be used to store arbitrary data. E.g., the d-dimensional bounding box of a d-dimensional polygon may define the interval data of a d-dimensional segment tree and the polygon itself can be stored as its value. An input data item is an instance of an input data type.

The range and segment tree classes are fully generic in the sense that they can be used to define multilayer trees. A multilayer tree of dimension (number of layers) d is a simple tree in the d-th layer, whereas the k-th layer, 1 k d-1, of the tree defines a tree where each (inner) vertex contains a multilayer tree of dimension d-k+1. The k-1-dimensional tree which is nested in the k-dimensional tree (T) is called the sublayer tree (of T). For example, a d-dim tree can be a range tree on the first layer, constructed with respect to the first dimension of d-dimensional data items. On all the data items in each subtree, a (d-1)-dimensional tree is built, either a range or a segment tree, with respect to the second dimension of the data items. And so on. Figures here explain the meaning of a sublayer tree graphically.

After creation of the tree, further insertions or deletions of data items are disallowed. The tree class does neither depend on the type of data nor on the concrete physical representation of the data items. E.g., let a multilayer tree be a segment tree for which each vertex defines a range tree. We can choose the data items to consist of intervals of type double and the point data of type integer. As value type we can choose string.

For this generality we have to define what the tree of each dimension looks like and how the input data is organized. For dimension k, 1 k 4, CGAL provides ready-to-use range and segment trees that can store k-dimensional keys (intervals resp.). These classes are defined in Section reference (Section reference resp.). These classes are parameterized with a traits class that defines, among other things, the key and interval type of a tree (cf. Section reference). In Section reference we give the requirements tree traits classes must fulfill.

In case you want to create a combined tree structure, refer Section reference. The classes described in that section enable you to define each layer of a tree separately. These tree classes are templatized with the type of the data item, the query interval and a traits class which is used as an interface to the data. In Section reference we describe the ready-to-use traits CGAL provides and the requirements a traits class has to fulfill. By this, the advanced user can develop his own traits classes.

We now give a short definition of the version of the range tree and segment tree implemented here. The presentation closely follows [dBvKOS97].

Definition of a Range Tree

A one-dimensional range tree is a binary search tree on one-dimensional point data. The point data of the tree is stored in the leaves. Each inner vertex stores the highest entry of its left subtree. The version of a range tree implemented here is static, which means that after construction of the tree, no elements be inserted or deleted. A d-dimensional range tree is a binary leaf search tree according to the first dimension of the d-dimensional point data, where each vertex contains a (d-1)-dimensional search tree of the points in the subtree (sublayer tree) with respect to the second dimension. See [dBvKOS97] and [Sam90] for more detailed information.

A d-dimensional range tree can be used to determine all d-dimensional points that lie inside a given d-dimensional interval (window_query). The pictures below show a two-dimensional and a d-dimensional range tree.

A two-dimensional range tree A
    d-dimensional range tree
A two-dimensional range tree. The tree is a binary search tree on the first dimension. Each sublayer tree of a vertex v is a binary search tree on the second dimension. The data items in a sublayer tree of v are all data items of the subtree of v A d-dimensional range tree. For each layer of the tree, one sublayer tree is illustrated.

Definition of a Segment Tree

A segment tree is a static binary search tree for a given set of coordinates. The set of coordinates is defined by the endpoints of the input data intervals. Any two adjacent coordinates build an elementary interval. Every leaf corresponds to an elementary interval. Inner vertices correspond to the union of the subtree intervals of the vertex. Each vertex or leaf v contains a sublayer type (or a list, if it is one-dimensional) that will contain all intervals I, such that I contains the interval of vertex v but not the interval of the parent vertex of v.

A d-dimensional segment tree can be used to solve the following problems:

An example of a one-dimensional segment tree and an example of a two-dimensional segment tree is shown below.

A one-dimensional segment
tree A
    d-dimensional segment tree
A one-dimensional segment tree. The segments and the corresponding elementary intervals are shown below the tree. The arcs from the nodes point to their subsets. A two-dimensional segment tree. The first layer of the tree is built according to the elementary intervals of the first dimension. Each sublayer tree of a vertex v is a segment tree according to the second dimension of all data items of v.

One possible application of a two-dimensional segment tree is the following. Given a set of convex polygons in two-dimensional space (CGAL::Polygon_2), we want to determine all polygons that intersect a given rectangular query window. Therefore, we define a two-dimensional segment tree, where the two-dimensional interval of a data item corresponds to the bounding box of a polygon and the value type corresponds to the polygon itself. The segment tree is created with a sequence of all data items, and a window query is performed. The polygons of the resulting data items are finally tested independently for intersections.

k-dimensional Range Trees

CGAL provides four range tree classes for k {1,...,4}.

k-dimensional Segment Trees

Tree Traits Class Implementations

This section provides range tree and segment tree traits class implementations for the CGAL kernel.

Set-like Tree Traits Classes

The following traits classes are set-like, since no data is associated to the keys.

Set-like Tree Traits Class for 2D Kernel Points

Set-like Tree Traits Class for 3D Kernel Points

Map-like Range Tree Traits Classes

The following traits classes are map-like, since they allow to associate data to the keys.

Map-like Range Tree Traits Class for 2D Kernel Points

Map-like Range Tree Traits Class for 3D Kernel Points

Map-like Segment Tree Traits Classes

The following traits classes are map-like, since they allow to associate data to the keys.

Map-like Segment Tree Traits Class for 2D Kernel Points

Map-like Segment Tree Traits Class for 3D Kernel Points

Tree Traits Class Requirements

This section describes the requirements for range and segment tree traits classes.

Creating an arbitrary multilayer tree

Now let us have a closer look on how a multilayer tree is built. In case of creating a d-dimensional tree, we handle a sequence of arbitrary data items, where each item defines a d-dimensional interval, point or other object. The tree is constructed with an iterator over this structure. In the i-th layer, the tree is built with respect to the data slot that defines the i-th dimension. Therefore, we need to define which data slot corresponds to which dimension. In addition we want our tree to work with arbitrary data items. This requires an adaptor between the algorithm and the data item. This is resolved by the use of traits classes, implemented in form of a traits class using function objects. These classes provide access functions to a specified data slot of a data item. A d-dimensional tree is then defined separately for each layer by defining a traits class for each layer.

Tree Traits class and its Requirements

Tree class Range_tree_d and class Segment_tree_d are templatized with a parameter Traits. Traits builds the interface between the tree and the data items. We now describe the traits class for a Range_tree_d layer and for a Segment_tree_d layer. The traits classes are implemented as template classes. If you do not want to use these traits classes you can also define your own class which has at least to provide the functionality of the traits class described below.

tree_point_traits - Requirements

Definition

tree_point_traits is a template class that provides an interface to data items.

Types

tree_interval_traits

Definition

tree_interval_traits is a template class that provides an interface to data items. It is similar to tree_interval_traits, except that it provides access to two data slots of the same type of each container class (Data, Window) instead of providing access to one data slot of container class Data and two data slots of class Window.

Types

Range_tree and Segment_tree

Definition

The tree classes were first intended to have a template argument defining the type of the sublayer tree. This leads to nested template arguments, where the internal class and function identifier got longer than a compiler dependent limit. Even for dimension 2 this happened. Therefore we chose another design. Now a tree is created with a prototype of the sublayer tree. With this prototype the sublayer tree can be cloned. The design pattern corresponds to the Prototype design pattern in  [GHJV95].

In this sense, an instance of a three-dimensional range tree (segment tree) would have to be created as a range tree (segment tree) with creation variable Sublayer_type s, which is a prototype of a two-dimensional range tree (segment tree). Because a range tree or a segment tree is expecting a prototype for its creation, a recursion anchor which builds dimension ``zero'' is needed. Tree_anchor described in section reference fulfills all these requirements. All tree classes (range tree, segment tree, tree anchor) are derived from an abstract base class Tree_base.

Additionally a range tree (segment tree) is build in function make_tree using iterators. The iterator concept is realized in the Standard Template Library and in many other libraries, see [MS96]. As long as the GNU, SUN and SGI compiler do not support template member functions, we only support member functions parameterized with iterators working on STL list, STL vector and C-arrays.

The trees are templatized with three arguments: Data, Window and Traits. Type Data defines the input data type and type Window defines the query window type. The tree uses a well defined set of functions in order to access data. These functions have to be provided by class Traits. The requirements are described in Section reference.

The trees allow to perform window queries, enclosing queries, and inverse range queries on the keys. Clearly, an inverse range query makes only sense in the segment tree. In order to perform an inverse range query, a range query of width has to be performed. We prefered not to offer an extra function for this sort of query, since the inverse range query is a special case of the range query. Furthermore, offering an inverse range query in the segment tree class implies offering this function also in the range tree class and having an extra item in the traits class that accesses the inverse range query point.

Sublayer_type

A Sublayer_type of class Range_tree_d or Segment_tree_d has to fulfill the following requirements: First of all, the class has to be derived from the abstract base class Tree_base and therefore has to provide methods make_tree, window_query, enclosing_query and is_inside with the same parameter types as the instantiated class Range_tree_d or Segment_tree_d, respectively. Furthermore a method bool is_anchor() has to be provided. If the Sublayer_type class builds a recursion anchor for class Segment_tree_d, this function is expected to return true, false otherwise.

Such a recursion anchor class is provided by the following class.

KD-trees

The implementation of the KD-tree is independent of the implementation of the rest of CGAL and can be used separately from CGAL.

The KD-tree class is parameterized with an interface class, that defines the interface between the KD-tree and the geometric primitives used. CGAL provides ready-made interface class that is presented in Section reference. The formal requirements for a class to be a KD-tree interface class is described in Section reference.

For a given set S = { p1, ..., pn } on n points in I Rd, it is sometimes useful to be able to answer orthogonal range-searching on S; namely, given an axis parallel query box B in I Rd, one would like to ``quickly'' determine the subset of points of S lying inside B.

Several data structures were suggested for that problem. Foremost among those, at least theoreticly, is the range-tree data structure with O(n logd(n)) preprocessing time, O(nlogd-1(n)) space, and O(logd(n) + k) query time, where k is the output size of the query. A theoreticly inferior data structure is the KD-tree, which offers O(n logn) preprocessing time, O(n) space, and O(n1-1/d) query time. The KD-tree is a binary tree constructed as follows: we compute the point pi of S such that its first coordinate is the median value among p11, ..., pn1, where pik denote the k-th coordinate of the i-th point of S. Let S1 denote all the points of S with first coordinate smaller than pi's first coordinate, and let S2 = S S1. Let T1,T2 be the KD-trees constructed recursively for S1, S2, respectively. The KD-tree of S is simply the binary tree having T1 as its left subtree and T2 as its right subtree. We apply this algorithm recursively, splitting the sets in the i-level of the KD-tree using the median point in the k-th coordinate, where k=(i mod n) + 1. See Figure reference for an illustration.

The resulting data structure has linear size, O(nlogn) preprocessing time, and can answer a query in O(n1-1/d +k) time, where k is the size of the query output. See [dBvKOS97].

Figure:  The partition of a set of points induced by a KD-tree of the points

In this section we define the class Kdtree_d that implements the CGAL KD-tree.

Kdtree_d<Interface_traits>

Kdtree_d<Interface_traits>::Box

KD-tree Default Interface Class

CGAL contains a default implementation for the KD-tree interface class. In this section we describe how to use it.

KD-tree implemented using the default interface may be applied to any standard class of point provided by CGAL.

The default interface class Kdtree_interface<Point> is templated with a parameter Point, which is required to supply the following methods:

There are two other default interface classes Kdtree_interface_2d<Point> and Kdtree_interface_3d<Point> which should be used when using 2D and 3D points from the CGAL kernel. This is done since the points in the kernel do not support changing their coordinates through direct access.

KD-tree Interface Class Specification

In this section we present the formal requirements for a KD-treeinterface class, that can be used to instantiate a variable of type Kdtree_d<Interface_traits>.

The KD-tree class is parameterized with the interface class Interface_traits which defines the abstract interface between the KD-tree and the data (i.e., points). The following requirement catalog lists the primitives, i.e., types, member functions etc., that must be defined for a class that can be used to parameterize KD-trees. Ready-made implementation is available and described in Section reference.


Navigation: Up, Table of Contents, Bibliography, Index, Title Page
The GALIA project. Jan 18, 2000.