This page describes our current implementation of camera calibration.
Design Philosophy
Our goal is to keep calibration accurate, efficient and simple enough to perform as often
as required by our applications. We would like to be able to do geometric calibration
for every data set we acquire, with minimal assumptions on camera placement and capabilities.
Hence, we abide by the following principles:
We found it was easy to meet these criteria with a straightforward extension of
[1] to multiple cameras. By having our cameras view a planar calibration grid of known
geometry, we obtain several 3D-2D correspondences. Given these, we can obtain the optimal
camera parameters by solving a nonlinear least squares minimization problem.
Feature Detection and Correspondences
Our calibration grid consists of several squares, whose corners are the feature points we
locate in the images. The corners are located in three stages:
We use Horatio, a vision library from the University of Surrey, for feature detection. The detected
corners are grouped into quadrilaterals and clustered into rows and columns. If too few
quadrilaterals are found, or there are too many spurious ones, or the detected ones do not
match the known geometry of the grid, we abandon the image rather than risk an incorrect
match. In experiments with thousands of images (several light fields), the feature detector
produced no incorrect correspondences and located about 90% of the visible squares in each
image on average. It can handle partial occlusions of the grid. An example of the feature
detection is shown below.
Figure panels: edge detection, line fitting, final squares.
Nonlinear Optimization (Bundle Adjustment)
Having obtained a set of 3D-2D point correspondences, we seek the model parameters that
best fit the observed data. This is naturally formulated as a nonlinear minimization problem over
the calibration parameters and the motion of the planar calibration grid, with the pixel
reprojection error as the quantity to be minimized.
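One way to write this objective down is sketched below. The symbols are introduced here for illustration and are not taken from [4]: N is the number of cameras, S the number of snapshots, \theta_i the parameters of camera i, (R_s, t_s) the pose of the grid in snapshot s, X_p the known 3D position of grid point p in the grid's own frame, x_{isp} its detected image position, V_{is} the set of grid points detected by camera i in snapshot s, and \pi the camera projection function. The first grid pose is held fixed so that it defines the global coordinate system.

```latex
\min_{\{\theta_i\},\,\{R_s, t_s\}} \;
\sum_{i=1}^{N} \sum_{s=1}^{S} \sum_{p \in V_{is}}
\left\| x_{isp} - \pi\!\left(\theta_i,\; R_s X_p + t_s\right) \right\|^2,
\qquad (R_1, t_1) = (I, 0)
```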
For an initial
guess, we compute the calibration parameters of each camera separately using implementations of
Zhang's algorithm [2] [3]. Complete details of this stage are
provided in [4]; here we merely sketch why exploiting sparsity is necessary
in the optimization.
The nonlinear minimization is a fairly large-scale problem. Suppose we have N cameras taking S synchronized snapshots. Each camera is modeled by 12 parameters; in addition, there are S-1 rigid motions of the calibration grid. (The position of the grid in the first snapshot defines the global coordinate system.) This leads to a problem with 12N+6S-6 dimensions, the number of parameters we are solving for. If we extract P point correspondences from each image on average, we have a total of D=P*N*S observations. Typically, we would have about N=100 cameras, S=15 snapshots, and about P=120 point correspondences per image. This is a 1284-dimensional search with a 180,000 x 1284 Jacobian. Stored densely in double precision, this Jacobian would require about 1.7 gigabytes, and would take impractically long to compute.
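The arithmetic above can be checked with a short sketch. The function name and its keyword arguments are illustrative, not part of our implementation; the counts (12 parameters per camera, 6 per grid pose, 8 bytes per double) come from the text.

```python
def problem_size(num_cameras, num_snapshots, points_per_image,
                 params_per_camera=12, params_per_pose=6, bytes_per_entry=8):
    """Return (unknowns, observations, dense Jacobian size in bytes)."""
    # The first grid pose defines the global frame, so only S-1 poses are unknown.
    unknowns = (params_per_camera * num_cameras
                + params_per_pose * (num_snapshots - 1))
    # Following the text, one observation (Jacobian row) per point correspondence.
    observations = points_per_image * num_cameras * num_snapshots
    dense_bytes = observations * unknowns * bytes_per_entry
    return unknowns, observations, dense_bytes


unknowns, observations, dense_bytes = problem_size(100, 15, 120)
print(unknowns, observations, round(dense_bytes / 2**30, 2))
```

With N=100, S=15 and P=120 this gives 1284 unknowns and 180,000 observations, and the dense size in units of 2^30 bytes matches the figure of about 1.7 gigabytes.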
Fortunately, it is easy to show that the Jacobian is very sparse. Each row of the Jacobian corresponds to one ray through a point on the calibration grid and the center of one camera in the array. The only parameters this ray depends on are the 12 parameters of the camera and the 6 representing the pose of the calibration grid. This means that each row of the Jacobian can have at most 18 nonzero entries. Consequently, the effective width is constant, independent of the number of cameras or snapshots. Exploiting this sparsity is necessary for a feasible implementation of the optimization.
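As an illustration of this sparsity pattern, the sketch below lists which columns can be nonzero in a given row. The column layout (all camera blocks first, followed by one pose block for each snapshot after the first) is an assumption made for this example and need not match the ordering used in [4]; the function names are hypothetical as well.

```python
def nonzero_columns(camera, snapshot, num_cameras,
                    params_per_camera=12, params_per_pose=6):
    """Columns that can be nonzero in a Jacobian row for (camera, snapshot).

    Indices are 0-based. Assumed layout: camera blocks first, then one
    pose block per snapshot after the first.
    """
    start = camera * params_per_camera
    cols = list(range(start, start + params_per_camera))
    if snapshot > 0:  # snapshot 0 fixes the global frame: no pose unknowns
        pose_start = (num_cameras * params_per_camera
                      + (snapshot - 1) * params_per_pose)
        cols.extend(range(pose_start, pose_start + params_per_pose))
    return cols


def sparse_value_bytes(num_rows, max_nonzeros_per_row=18, bytes_per_entry=8):
    """Upper bound on storage for the nonzero values alone (no index arrays)."""
    return num_rows * max_nonzeros_per_row * bytes_per_entry


print(len(nonzero_columns(3, 5, num_cameras=100)))
print(sparse_value_bytes(180000))
```

For the typical problem size quoted earlier, the nonzero values occupy at most about 26 million bytes, compared with roughly 1.8 billion for the dense matrix.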
Histogram of reprojection errors
Results
We calibrated an array of 85 cameras from 8 snapshots of the calibration grid. This involved
a 1062-dimensional search, given a total of 86824 3D-2D pairs. Below we show the error statistics,
and visualizations of computed camera geometry.
RMS error: 0.3199 pixels
Mean error: 0.2647 pixels
Median error: 0.2342 pixels
Standard Deviation: 0.1796 pixels
Avg. Planarity Deviation: 0.9636 cm
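Statistics of this kind can be computed from the per-point reprojection errors as in the sketch below. It is not the code used to produce the numbers above; the function name is illustrative, and it assumes the population form of the standard deviation. The last lines check that the reported values are mutually consistent, using the identity rms^2 = mean^2 + std^2.

```python
import math
import statistics


def error_summary(errors):
    """Summary statistics for a list of per-point reprojection errors (pixels)."""
    return {
        "rms": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "mean": statistics.fmean(errors),
        "median": statistics.median(errors),
        "std": statistics.pstdev(errors),  # population standard deviation
    }


# Consistency check on the reported figures: rms^2 = mean^2 + std^2.
reported_mean, reported_std = 0.2647, 0.1796
implied_rms = math.hypot(reported_mean, reported_std)
print(round(implied_rms, 4))
```

The implied RMS agrees with the reported 0.3199 pixels to four decimal places. With tens of thousands of points, the difference between the population and sample standard deviation is negligible.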
Camera centers, as computed by calibration, projected onto a photo of the array
itself. The projection was approximated by a homography computed from manually specified
control points on the photograph, to which the camera centers were mapped.
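For reference, applying a homography to a 2D point takes only a few lines. This is a generic sketch: the function name and the matrix in the usage line are hypothetical, and estimating the homography from the control points is not shown.

```python
def apply_homography(H, x, y):
    """Map the 2D point (x, y) through a 3x3 homography H (nested lists)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return u / w, v / w


# Hypothetical matrix, for illustration only.
H_example = [[2.0, 0.0, 1.0], [0.0, 2.0, -1.0], [0.0, 0.0, 2.0]]
print(apply_homography(H_example, 2.0, 3.0))
```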
Control points (clicked manually)
Superimposed control points and camera projections
Cumulative distribution of reprojection errors
2D plot of reprojection errors, MATLAB .fig file
3D plot of camera centers, MATLAB .fig file
TO DOs
References
Zhengyou Zhang, MSR Tech. Report MSR-TR-98-71
Jean-Yves Bouguet.
Vaibhav Vaish (CS 205 Winter 2002-3 class project)
Vaibhav Vaish
Last update: May 7th, 2003.