Zippered Polygon Meshes from Range Images
Greg Turk and Marc Levoy
Computer Science Department
Stanford University
Abstract
Range imaging offers an inexpensive and accurate means for digitizing the
shape of three-dimensional objects. Because most objects self occlude, no
single range image suffices to describe the entire object. We present a
method for combining a collection of range images into a single polygonal
mesh that completely describes an object to the extent that it is visible from
the outside.
The steps in our method are: 1) align the meshes with each other using a
modified iterated closest-point algorithm, 2) zipper together adjacent meshes
to form a continuous surface that correctly captures the topology of the
object, and 3) compute local weighted averages of surface positions on all
meshes to form a consensus surface geometry.
Our system differs from previous approaches in that it is incremental; scans
are acquired and combined one at a time. This approach allows us to acquire
and combine large numbers of scans with minimal storage overhead. Our
largest models contain up to 360,000 triangles. All the steps needed to
digitize an object that requires up to 10 range scans can be performed using
our system with five minutes of user interaction and a few hours of compute
time. We show two models created using our method with range data from a
commercial rangefinder that employs laser stripe technology.
CR Categories: I.3.5 [Computer Graphics]: Computational Geometry and
Object Modelling.
Additional Key Words: Surface reconstruction, surface fitting, polygon
mesh, range images, structured light range scanner.
1 Introduction
This paper presents a method of combining multiple views of an object,
captured by a range scanner, and assembling these views into one unbroken
polygonal surface. Applications for such a method include:
¥ Digitizing complex objects for animation and visual simulation.
¥ Digitizing the shape of a found object such as an archaeological artifact for
measurement and for dissemination to the scientific community.
¥ Digitizing human external anatomy for surgical planning, remote
consultation or the compilation of anatomical atlases.
¥ Digitizing the shape of a damaged machine part to help create a
replacement.
There is currently no procedure that will allow a user to easily capture a
digital description of a physical object. The dream tool would allow one to
set an industrial part or a clay figure onto a platform, press a button, and
have a complete digital description of that object returned in a few minutes.
The reality is that much digitization is done by a user painstakingly touching
a 3D sensing probe to hundreds or thousands of positions on the object, then
manually specifying the connectivity of these points. Fortunately range
scanners offer promise in replacing this tedious operation.
A range scanner is any device that senses 3D positions on an objectÕs surface
and returns an array of distance values. A range image is an m´n grid of
distances (range points) that describe a surface either in Cartesian
coordinates (a height field) or cylindrical coordinates, with two of the
coordinates being implicitly defined by the indices of the grid. Quite a
number of measurement techniques can be used to create a range image,
including structured light, time-of-flight lasers, radar, sonar, and several
methods from the computer vision literature such as depth from stereo,
shading, texture, motion and focus. The range images used to create the
models in this paper were captured using structured light (described later),
but our techniques can be used with any range images where the
uncertainties of the distance values are smaller than the spacing between the
samples.
Range scanners seem like a natural solution to the problem of capturing a
digital description of physical objects. Unfortunately, few objects are simple
enough that they can be fully described by a single range image. For
instance, a coffee cup handle will obscure a portion of the cupÕs surface even
using a cylindrical scan. To capture the full geometry of a moderately
complicated object (e.g. a clay model of a cat) may require as many as a
dozen range images.
There are two main issues in creating a single model from multiple range
images: registration and integration. Registration refers to computing a rigid
transformation that brings the points of one range image into alignment with
the portions of a surface that is shares with another range image. Integration
is the process of creating a single surface representation from the sample
points from two or more range images.
Our approach to registration uses an iterative process to minimize the
distance between two triangle meshes that were created from the range
images. We accelerate registration by performing the matching on a
hierarchy of increasingly more detailed meshes. This method allows an
object to be scanned from any orientation without the need for a six-degree-
of-freedom motion device.
We separate the task of integration into two steps: 1) creating a mesh that
reflects the topology of the object, and 2) refining the vertex positions of the
mesh by averaging the geometric detail that is present in all scans. We
capture the topology of an object by merging pairs of triangle meshes that
are each created from a single range image. Merging begins by converting
two meshes that may have considerable overlap into a pair of meshes that
just barely overlap along portions of their boundaries. This is done by
simultaneously eating back the boundaries of each mesh that lie directly on
top of the other mesh. Next, the meshes are zippered together: the triangles
of one mesh are clipped to the boundary of the other mesh and the vertices
on the boundary are shared. Once all the meshes have been combined, we
allow all of the scans to contribute to the surface detail by finding the
consensus geometry. The final position of a vertex is found by taking an
average of nearby positions from each of the original range images. The
order in which we perform zippering and consensus geometry is important.
We deliberately postpone the refinement of surface geometry until after the
overall shape of the object has been determined. This eliminates
discontinuities that may be introduced during zippering.
The remainder of this paper is organized as follows. Section 2 describes
previous work on combining range images. Section 3 covers the basic
principles of a structured light range scanner. Section 4 presents the
automatic registration process. Section 5 describes zippering meshes into
one continuous surface. Section 6 describes how surface detail is captured
through consensus geometry. Section 7 shows examples of digitized models
and compares our approach to other methods of combining range data.
Section 8 concludes this paper by discussing future work.
2 Previous Work
There is a great deal of published work on registration and integration of
depth information, particularly in the vision literature. Our literature review
only covers work on registration or integration of dense range data captured
by an active range scanner, and where the product of the integration is a
polygon mesh.
2.1 Registration
Two themes dominate work in range image registration: matching of
ÒcreatedÓ features in the images to be matched, and minimization of
distances between all points on the surface represented by the two images.
In the first category, Wada and co-authors performed six degree of freedom
registration by matching distinctive facets from the convex hulls of range
images [Wada 93]. They computed a rotation matrix from corresponding
facets using a least squares fit of the normal vectors of the facets.
In the second category, Champleboux and co-workers used a data structure
called an octree-spline that is a sampled representation of distances to an
objectÕs surface [Champleboux 92]. This gave them a rapid way to
determine distances from a surface (and the distance gradient) with a low
overhead in storage. Chen and Medioni establish a correspondence between
points on one surface and nearby tangent planes on the other surface [Chen
92]. They find a rigid motion that minimizes the point-to-tangent collection
directly and then iterate. Besl and McKay use an approach they call the
iterated closest-point algorithm [Besl 92]. This method finds the nearest
positions on one surface to a collection of points on the other surface and
then transforms one surface so as to minimize the collective distance. They
iterate this procedure until convergence.
Our registration method falls into the general category of direct distance
minimization algorithms, and is an adaptation of [Besl 92]. It differs in that
we do not require that one surface be a strict subset of the other. It is
described in Section 4.
2.2 Integration
Integration of multiple range scans can be classified into structured and
unstructured methods. Unstructured integration presumes that one has a
procedure that creates a polygonal surface from an arbitrary collection of
points in 3-space. Integration in this case is performed by collecting
together all the range points from multiple scans and presenting them to the
polygonal reconstruction procedure. The Delaunay triangulation of a set of
points in 3-space has been proposed as the basis of one such reconstruction
method [Boissonnat 84]. Another candidate for surface reconstruction is a
generalization of the convex hull of a point set known as the alpha shape
[Edelsbrunner 92]. Hoppe and co-authors use graph traversal techniques to
help construct a signed distance function from a collection of unorganized
points [Hoppe 92]. An isosurface extraction technique produces a polygon
mesh from this distance function.
Structured integration methods make use of information about how each
point was obtained, such as using error bounds on a pointÕs position or
adjacency information between points within one range image. Soucy and
Laurendeau use a structured integration technique to combine multiple range
images [Soucy 92] that is similar in several respects to our algorithm. Given
n range images of an object, they first partition the points into a number of
sets that are called common surface sets. The range points in one set are
then used to create a grid of triangles whose positions are guided by a
weighted average of the points in the set. Subsets of these grids are stitched
together by a constrained Delaunay triangulation in one of n projections
onto a plane. We compare our method to SoucyÕs in Section 7.
3 Structured Light Range Scanners
In this section we describe the operating principles of range scanners based
on structured light. We do this because it highlights issues common to many
range scanners and also because the range images used in this article were
created by such a scanner.
3.1 Triangulation
Structured light scanners operate on the principle of triangulation (see Figure
1, left). One portion of the scanner projects a specific pattern of light onto
the object being scanned. This pattern of light is observed by the sensor of
the scanner along a viewing direction that is off-axis from the source of
light. The position of the illuminated part of the object is determined by
finding the intersection of the lightÕs projected direction and the viewing
direction of the sensor. Positions can be accumulated across the length of
the object while the object is moved across the path of the projected light.
Some of the patterns that have been used in such scanners include a spot, a
circle, a line, and several lines at once. Typically the sensor is a CCD array
or a lateral effect photodiode.
The scanner used for the examples in this paper is a Cyberware Model 3030
MS. It projects a vertical sheet of He-Ne laser light onto the surface of an
object. The laser sheet is created by spreading a laser beam using a
cylindrical lens into a sheet roughly 2 mm wide and 30 cm high. The sensor
of the Cyberware scanner is a 768 ´ 486 pixel CCD array. A typical CCD
image shows a ribbon of laser light running from the top to the bottom (see
Figure 2). A range point is created by looking across a scanline for the peak
intensity of this ribbon. A range pointÕs distance from the scanner (the
ÒdepthÓ) is given by the horizontal position of this peak and the vertical
position of the range point is given by the number of the scanline. Finding
the peaks for each scanline in one frame gives an entire column of range
points, and combining the columns from multiple frames as the object is
moved through the laser sheet gives the full range image.
3.2 Sources of Error
Any approach to combining range scans should attempt to take into account
the possible sources of error inherent in a given scanner. Two sources of
error are particularly relevant to integration. One is a result of light falling
on the object at a grazing angle. When the projected light falls on a portion
of the object that is nearly parallel to the lightÕs path, the sensor sees a dim
and stretched-out version of the pattern. Finding the center of the laser sheet
when it grazes the object becomes difficult, and this adds uncertainty to the
position of the range points. The degree of uncertainty at a given range
point can be quantified, and we make use of such information at several
stages in our approach to combining range images.
A second source of inaccuracy occurs when only a portion of the laser sheet
hits an object, such as when the laser sheet falls off the edge of a book that is
perpendicular to the laser sheet (see Figure 1, right). This results in a false
position because the peak-detection and triangulation method assumes that
the entire width of the sheet is visible. Such an assumption results in edges
of objects that are both curled and extended beyond their correct position.
This false extension of a surface at edges is an issue that needs to be
specifically addressed when combining range images.
3.3 Creating Triangle Meshes from Range Images
We use a mesh of triangles to represent the range image data at all stages of
our integration method. Each sample point in the m´n range image is a
potential vertex in the triangle mesh. We take special care to avoid
inadvertently joining portions of the surface together that are separated by
depth discontinuities (see Figure 3).
To build a mesh, we create zero, one or two triangles from four points of a
range image that are in adjacent rows and columns. We find the shortest of
the two diagonals between the points and use this to identify the two triplets
of points that may become triangles. Each of these point triples is made into
a triangle if the edge lengths fall below a distance threshold. Let s be the
maximum distance between adjacent range points when we flatten the range
image, that is, when we donÕt include the depth information (see Figure 3).
We take the distance threshold be a small multiple of this sampling distance,
typically 4s. Although having such a distance threshold may prevent joining
some range points that should in fact be connected, we can rely on other
range images (those with better views of the location in question) to give the
correct adjacency information.
This willingness to discard questionable data is representative of a deliberate
overall strategy: to acquire and process large amounts of data rather than
draw hypotheses (possibly erroneous) from sparse data. This strategy
appears in several places in our algorithm.
4 Registration of Range Images
Once a triangle mesh is created for each range image, we turn to the task of
bringing corresponding portions of different range images into alignment
with one another. If all range images are captured using a six-degree of
freedom precision motion device then the information needed to register
them is available from the motion control software. This is the case when
the object or scanner is mounted on a robot arm or the motion platform of a
precision milling machine. Inexpensive motion platforms are often limited
to one or two degrees of freedom, typically translation in a single direction
or rotation about an axis. One of our goals is to create an inexpensive
system. Consequently, we employ a registration method that does not
depend on measured position and orientation. With our scanner, which
offers translation and rotation around one axis, we typically take one
cylindrical and four translational scans by moving the object with the motion
device. To capture the top or the underside of the object, we pick it up by
hand and place it on its side. Now the orientation of subsequent scans
cannot be matched with those taken earlier, and using a registration method
becomes mandatory.
4.1 Iterated Closest-Point Algorithm
This section describes a modified iterated closest-point (ICP) algorithm for
quickly registering a pair of meshes created from range images. This
method allows a user to crudely align one range image with another on-
screen and then invoke an algorithm that snaps the position of one range
image into accurate alignment with the other.
The iterated closest-point of [Besl 92] cannot be used to register range
images because it requires that every point on one surface have a
corresponding point on the other surface. Since our scans are overlapping,
we seldom produce data that satisfies this requirement. Thus we have
developed our own variant of this algorithm. Its steps are:
1) Find the nearest position on mesh A to each vertex of mesh B.
2) Discard pairs of points that are too far apart.
3) Eliminate pairs in which either points is on a mesh boundary.
4) Find the rigid transformation that minimizes a weighted least-squared
distance between the pairs of points.
5) Iterate until convergence.
6) Perform ICP on a more detailed mesh in the hierarchy.
In step 1, it is important to note that we are looking for the 3-space position
Ai on the surface of mesh A that is closest to a given vertex Bi of mesh B
(see Figure 4). The nearest point Ai may be a vertex of A, may be a point
within a triangle, or may lie on a triangleÕs edge. Allowing these points Ai
to be anywhere on a C0 continuous surface means that the registration
between surfaces can have greater accuracy than the spacing s between
range points.
4.2 Constraints on ICP
Our ICP algorithm differs from BeslÕs in several ways. First, we have added
a distance threshold to the basic iterated closest-point method to avoid
matching any vertex Bi of one mesh to a remote part of another mesh that is
likely to not correspond to Bi. Such a vertex Bi from mesh B might be from
a portion of the scanned object that was not captured in the mesh A, and thus
no pairing should be made to any point on A. We have found that excellent
registration will result when this distance threshold is set to twice the
spacing s between range points. Limiting the distance between pairs of
corresponding points allows us to perform step 2 (eliminating remote pairs)
during the nearest points search in step 1.
The nearest points search can be accelerated considerably by placing the
mesh vertices in a uniform subdivision of space based on the distance
threshold. Because the triangle size is limited in the mesh creation step, we
can search over all triangles within a fixed distance and guarantee that we
miss no nearby portion of any triangle. Because we will use this constrained
nearest-point search again later, it is worth giving a name to this query. Let
nearest_on_mesh(P,d,M) be a routine that returns the nearest position on a
mesh M to a given point P, or that returns nothing if there is no such point
within the distance d.
Second, we have added the restriction that we never allow boundary points
to be part of a match between surfaces. Boundary points are those points
that lie on the edge of a triangle and where that edge is not shared by another
triangle. Figure 4 illustrates how such matches can drag a mesh in a
contrary direction to the majority of the point correspondences.
4.3 Best Rigid Motion
The heart of the iterated closest-point approach is in finding a rigid
transformation that minimizes the least-squared distance between the point
pairs. Berthold Horn describes a closed-form solution to this problem [Horn
87] that is linear in time with respect to the number of point pairs. HornÕs
method finds the translation vector T and the rotation R such that:
is minimized, where Ai and Bi are given pairs of positions in 3-space and Bc
is the centroid of the Bi. Horn showed that T is just the difference between
the centroid of the points Ai and the centroid of the points Bi. R is found by
constructing a cross-covariance matrix between centroid-adjusted pairs of
points. The final rotation is given by a unit quaternion that is the
eigenvector corresponding to the largest eigenvalue of a matrix constructed
from the elements of this cross-covariance matrix. Details can be found in
both [Horn 87] and [Besl 92].
As we discussed earlier, not all range points have the same error bounds on
their position. We can take advantage of an optional weighting term in
HornÕs minimization to incorporate the positional uncertainties into the
registration process. Let a value in the range from 0 to 1 called confidence
be a measure of how certain we are of a given range pointÕs position. For
the case of structured light scanners, we take the confidence of a point P on a
mesh to be the dot product of the mesh normal N at P and the vector L that
points from P to the light source of the scanner. (We take the normal at P to
be the average of the normals of the triangles that meet at P.) Additionally,
we lower the confidence of vertices near the mesh boundaries to take into
account possible error due to false edge extension and curl. We take the
confidence of a pair of corresponding points Ai and Bi from two meshes to
be the product of their confidences, and we will use wi to represent this
value. The problem is now to find a weighted least-squares minimum:
The weighted minimization problem is solved in much the same way as
before. The translation factor T is just the difference between the weighted
centroids of the corresponding points. The solution for R is described by
Horn.
4.4 Alignment in Practice
The above registration method can be made faster by matching increasingly
more detailed meshes from a hierarchy. We typically use a mesh hierarchy
in which each mesh uses one-forth the number of range points that are used
in the next higher level. The less-detailed meshes in this hierarchy are
constructed by sub-sampling the range images. Registration begins by
running constrained ICP on the lowest-level mesh and then using the
resulting transformation as the initial position for the next level up in the
hierarchy. The matching distance threshold d is halved with each move up
the hierarchy.
Besl and McKay describe how to use linear and quadratic extrapolation of
the registration parameters to accelerate the alignment process. We use this
technique for our alignment at each level in the hierarchy, and find it works
well in practice. Details of this method can be found in their paper.
The constrained ICP algorithm registers only two meshes at a time, and there
is no obvious extension that will register three or more meshes
simultaneously. This is the case with all the registration algorithms we
know. If we have meshes A, B, C and D, should we register A with B, then
B with C and finally C with D, perhaps compounding registration errors?
We can minimize this problem by registering all meshes to a single mesh
that is created from a cylindrical range image. In this way the cylindrical
range image acts as a common anchor for all of the other meshes. Note that
if a cylindrical scan covers an object from top to bottom, it captures all the
surfaces that lie on the convex hull of the object. This means that, for almost
all objects, there will be some common portions between the cylindrical scan
and all linear scans, although the degree of this overlap depends on the
extent of the concavities of the object. We used such a cylindrical scan for
alignment when constructing the models shown in this paper.
5 Integration: Mesh Zippering
The central step in combining range images is the integration of multiple
views into a single model. The goal of integration is to arrive at a
description of the overall topology of the object being scanned. In this
section we examine how two triangle meshes can be combined into a single
surface. The full topology of a surface is realized by zippering new range
scans one by one into the final triangle mesh.
Zippering two triangle meshes consists of three steps, each of which we will
consider in detail below:
1) Remove overlapping portions of the meshes.
2) Clip one mesh against another.
3) Remove the small triangles introduced during clipping.
5.1 Removing Redundant Surfaces
Before attempting to join a pair of meshes, we eat away at the boundaries of
both meshes until they just meet. We remove those triangles in each mesh
that are in some sense Òredundant,Ó in that the other mesh includes an
unbroken surface at that same position in space. Although this step removes
triangles from the meshes, we are not discarding data since all range points
eventually will be used to find the consensus geometry (Section 6). Given
two triangle meshes A and B, here is the process that removes their
redundant portions:
Repeat until both meshes remain unchanged:
Remove redundant triangles on the boundary of mesh A
Remove redundant triangles on the boundary of mesh B
Before we can remove a given triangle T from mesh A, we need to
determine whether the triangle is redundant. We accomplish this by
querying mesh B using the nearest_on_mesh() routine that was introduced
earlier. In particular, we ask for the nearest positions on mesh B to the
vertices V1, V2 and V3 of T. We will declare T to be redundant if the three
queries return positions on B that are within a tolerance distance d and if
none of these positions are on the boundary of B. Figure 7 shows two
overlapping surfaces before and after removing their redundant triangles. In
some cases this particular decision procedure for removing triangles will
leave tiny gaps where the meshes meet. The resulting holes are no larger
than the maximum triangle size and we currently fill them in an automatic
post-processing step to zippering. Using the fast triangle redundancy check
was an implementation decision for the sake of efficiency, not a necessary
characteristic of our zippering approach, and it could easily be replaced by a
more cautious redundancy check that leaves no gaps. We have not found
this necessary in practice.
If we have a measure of confidence of the vertex positions (as we do for
structured light scanners), then the above method can be altered to preserve
the more confident vertices. When checking to see if the vertices V1, V2
and V3 of T lie within the distance tolerance of mesh B, we also determine
whether at least two of these vertices have a lower confidence measure than
the nearby points on B. If this is the case, we allow the triangle to be
removed. When no more triangles can be removed from the boundaries of
either mesh, we drop this confidence value restriction and continue the
process until no more changes can be made. This procedure results in a pair
of meshes that meet along boundaries of nearly equal confidences.
5.2 Mesh Clipping
We now describe how triangle clipping can be used to smoothly join two
meshes that slightly overlap. The left portion of Figure 5 shows two
overlapping meshes and the right portion shows the result of clipping. Let
us examine the clipping process in greater detail, and for the time being
make the assumption that we are operating on two meshes that lie in a
common plane.
To clip mesh A against the boundary of mesh B we first need to add new
vertices to the boundary of B. Specifically, we place a new vertex wherever
an edge of a triangle from mesh A intersects the boundary of mesh B. Let Q
be the set of all such new vertices. Together, the new vertices in Q and the
old boundary vertices of mesh B will form a common boundary that the
triangles from both meshes will share. Once this new boundary is formed
we need to incorporate the vertices Q into the triangles that share this
boundary. Triangles from mesh B need only to be split once for each new
vertex to be incorporated (shown in Figure 5, right). Then we need to divide
each border triangle from A into two parts, one part that lies inside the
boundary of B that should be discarded and the other part that lies outside of
this boundary and should be retained (See Figure 5, middle). The vertices of
the retained portions of the triangle are passed to a constrained triangulation
routine that returns a set of triangles that incorporates all the necessary
vertices (Figure 5, right).
The only modification needed to extend this clipping step to 3-space is to
determine precisely how to find the points of intersection Q. In 3-space the
edges of mesh A might very well pass above or below the boundary of B
instead of exactly intersecting the boundary. To correct for this we
ÒthickenÓ the boundary of mesh B. In essence we create a wall that runs
around the boundary of B and that is roughly perpendicular to B at any given
location along the boundary. The portion of the wall at any given edge E is
a collection of four triangles, as shown in Figure 6. To find the intersection
points with the edges of A, we only need to note where these edges pass
through the wall of triangles. We then move this intersection point down to
the nearest position on the edge E to which the intersected portion of the
wall belongs. The rest of the clipping can proceed as described above.
5.3 Removing Small Triangles
The clipping process can introduce arbitrarily small or thin triangles into a
mesh. For many applications this does matter, but in situations where such
triangles are undesirable they can easily be removed. We use vertex deletion
to remove small triangles: if any of a triangleÕs altitudes fall below a user-
specified threshold we delete one of the triangleÕs vertices and all the
triangles that shared this vertex. We then use constrained triangulation to fill
the hole that is left by deleting these triangles (see [Bern 92]). We
preferentially delete vertices that were introduced as new vertices during the
clipping process. If all of a triangleÕs vertices are original range points then
the vertex opposite the longest side is deleted.
5.4 False Edge Extension
As described in Section 3.2, range points from a structured light scanner that
are near an objectÕs silhouette are extended and curled away from the true
geometry. These extended edges typically occur at corners. If there is at
least one scan that spans both sides of the corner, then our method will
correctly reconstruct the surface at the corner. Since we lower the
confidence of a surface near the mesh boundaries, triangles at the false edge
extensions will be eliminated during redundant surface removal because
there are nearby triangles with higher confidence in the scan that spans the
corner. For correct integration at a corner, it is the userÕs responsibility to
provide a scan that spans both sides of the corner. Figure 7 illustrates
correct integration at a corner in the presence of false edge extension.
Unfortunately, no disambiguating scan can be found when an object is
highly curved such as a thin cylinder.
Although the problem of false edge extension is discussed in the structured
light literature [Businski 92], we know of no paper on surface integration
from such range images that addresses or even mentions this issue. We are
also unaware of any other integration methods that will correctly determine
the geometry of a surface at locations where there are false extensions. Our
group has developed a method of reducing false edge extensions when
creating the range images (to appear in a forthcoming paper) and we are
exploring algorithms that will lessen the effect of such errors during
integration. It is our hope that by emphasizing this issue we will encourage
others to address this topic in future research on range image integration.
6 Consensus Geometry
When we have zippered the meshes of all the range images together, the
resulting triangle mesh captures the topology of the scanned object. This
mesh may be sufficient for some applications. If surface detail is important,
however, we need to fine-tune the geometry of the mesh.
The final model of an object should incorporate all the information available
about surface detail from each range image of the object. Some of this
information may have been discarded when we removed redundant triangles
during mesh zippering. We re-introduce the information about surface detail
by moving each vertex of our zippered mesh to a consensus position given
by a weighted average of positions from the original range images. Vertices
are moved only in the direction of the surface normal so that features are not
blurred by lateral motion. This is in contrast to unstructured techniques
which tend to blur small features isotropically. Our preference for averaging
only in the direction of the surface normal is based on the observation that
most points in range scans are generally accurately placed with respect to
other points in the same scan, but may differ between scans due to alignment
errors such as uncorrected optical distortion in the camera. Let M1, M2,...,
Mn refer to the original triangle meshes created from the range images.
Then the three steps for finding the consensus surface are:
1) Find a local approximation to the surface normal.
2) Intersect a line oriented along this normal with each original range image.
3) Form a weighted average of the points of intersection.
We approximate the surface normal N at a given vertex V by taking an
average over all vertex normals from the vertices in all the meshes Mi that
fall within a small sphere centered at V. We then intersect each of the
meshes Mi with the line passing through V along the direction N. Let P be
the set of all intersections that are near V. We take the consensus position of
V to be the average of all the points in P. If we have a measure of
confidence for positions on a mesh we use this to weight the average.
7 Results and Discussion
The dinosaur model shown in Figure 8 was created from 14 range images
and contains more than 360,000 triangles. Our integration method correctly
joined together the meshes at all locations except on the head where some
holes due to false edge extensions were filled manually. Such holes should
not occur once we eliminate the false extensions in the range images. The
dinosaur model was assembled from a larger quantity of range data
(measured either in number of scans or number of range points) than any
published model known to us. Naturally, we plan to explore the use of
automatic simplification methods with our models [Schroeder 92] [Turk 92]
[Hoppe 93]. Figure 9 shows a model of a phone that was created from ten
range images and contains over 160,000 triangles. The mesh on the right
demonstrates that the consensus geometry both reduces noise from the range
images without blurring the modelÕs features and also that it eliminates
discontinuities at zippered regions.
A key factor that distinguishes our approach from those using unstructured
integration ([Hoppe 92] and others) is that our method attempts to retain as
much of the triangle connectivity as is possible from the meshes created
from the original range images. Our integration process concentrates on a
one-dimensional portion of the mesh (the boundary) instead of across an
entire two-dimensional surface, and this makes for rapid integration.
Our algorithm shares several characteristics with the approach of Soucy and
Laurendeau, which is also a structured integration method [Soucy 92]. The
most important difference is the order in which the two methods perform
integration and geometry averaging. SoucyÕs method first creates the final
vertex positions by averaging between range images and then stitches
together the common surface sets. By determining geometry before
connectivity, their approach may be sensitive to artifacts of the stitching
process. This is particularly undesirable because their method can create
seams between as many as 2n common surface sets from n range images.
Such artifacts are minimized in our approach by performing geometry
averaging after zippering.
In summary, we use zippering of triangle meshes followed by refinement of
surface geometry to build detailed models from range scans. We expect that
in the near future range image technology will replace manual digitization of
models in several application areas.
8 Future Work
There are several open problems related to integration of multiple range
images. One issue is how an algorithm might automatically determine the
next best view to capture more of an objectÕs surface. Another important
issue is merging reflectance information (including color) with the geometry
of an object. Maybe the biggest outstanding issue is how to create higher-
order surface descriptions such as Bezier patches or NURBS from range
data, perhaps guided by a polygon model.
Acknowledgments
We thank David Addleman, George Dabrowski and all the other people at
Cyberware for the use of a scanner and for educating us about the issues
involved in the technology. We thank all the members of our scanner group
for numerous helpful discussions. In particular, Brian Curless provided
some key insights for interpreting the range data and also wrote code to help
this work. Thanks to Phil Lacroute for help with the color figures. This
work was supported by an IBM Faculty Development Award, The Powell
Foundation, and the National Science Foundation under contract CCR-
9157767.
References
[Bern 92] Bern, Marshall and David Eppstein, ÒMesh Generation and
Optimal Triangulation,Ó Technical Report P92-00047, Xerox Palo Alto
Research Center, March 1992.
[Besl 92] Besl, Paul J. and Neil D. McKay, ÒA Method of Registration of 3-
D Shapes,Ó IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 14, No. 2 (February 1992), pp. 239-256.
[Boissonnat 84] Boissonnat, Jean-Daniel, ÒGeometric Structures for Three-
Dimensional Shape Representation,Ó ACM Transactions on Graphics, Vol.
3, No. 4 (October 1984), pp. 266-286.
[Businski 92] Businski, M., A. Levine and W. H. Stevenson, ÒPerformance
Characteristics of Range Sensors Utilizing Optical Triangulation,Ó IEEE
National Aerospace and Electronics Conference, Vol. 3 (1992), pp. 1230-
1236.
[Champleboux 92] Champleboux, Guillaume, Stephane Lavallee, Richard
Szeliski and Lionel Brunie, ÒFrom Accurate Range Imaging Sensor
Calibration to Accurate Model-Based 3-D Object Localization,Ó Proceedings
of the IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, Champaign, Illinois, June 15-20, 1992, pp. 83-89.
[Chen 92] Chen, Yang and Gerard Medioni, ÒObject Modelling by
Registration of Multiple Range Images,Ó Image and Vision Computing, Vol.
10, No. 3 (April 1992), pp. 145-155.
[Edelsbrunner 92] Edelsbrunner, Herbert and Ernst P. MŸcke, ÒThree-
dimensional Alpha Shapes,Ó Proceedings of the 1992 Workshop on Volume
Visualization, Boston, October 19-20, 1992, pp. 75-82.
[Hoppe 92] Hoppe, Hugues, Tony DeRose, Tom Duchamp, John McDonald
and Werner Stuetzle, ÒSurface Reconstruction from Unorganized Points,Ó
Computer Graphics, Vol. 26, No. 2 (SIGGRAPH Õ92), pp. 71-78.
[Hoppe 93] Hoppe, Hugues, Tony DeRose, Tom Duchamp, John McDonald
and Werner Stuetzle, ÒMesh Optimization,Ó Computer Graphics
Proceedings, Annual Conference Series (SIGGRAPH Õ93), pp. 19-26.
[Horn 87] Horn, Berthold K. P., ÒClosed-Form Solution of Absolute
Orientation Using Unit Quaternions,Ó Journal of the Optical Society of
America. A, Vol. 4, No. 4 (April 1987), pp. 629-642.
[Schroeder 92] Schroeder, William J., Jonathan A. Zarge and William E.
Lorensen, ÒDecimation of Triangle Meshes,Ó Computer Graphics, Vol. 26,
No. 2 (SIGGRAPH Õ92), pp. 65-70.
[Soucy 92] Soucy, Marc and Denis Laurendeau, ÒMulti-Resolution Surface
Modeling from Multiple Range Views,Ó Proceedings of the IEEE Computer
Society Conference on Computer Vision and Pattern Recognition,
Champaign, Illinois, June 15-20, 1992, pp. 348-353.
[Turk 92] Turk, Greg, ÒRe-Tiling Polygonal Surfaces,Ó Computer Graphics,
Vol. 26, No. 2 (SIGGRAPH Õ92), pp. 55-64.
[Wada 93] Wada, Nobuhiko, Hiroshi Toriyama, Hiromi T. Tanaka and
Fumio Kishino, ÒReconstruction of an Object Shape from Multiple
Incomplete Range Data Sets Using Convex Hulls,Ó Computer Graphics
International Õ93, Lausanne, Switzerland, June 21-25, 1993, pp. 193-203.