This page describes our current implementation of camera calibration.
Our goal is to keep calibration accurate, efficient and simple enough to perform as often
as required by our applications. We would like to be able to do geometric calibration
for every data set we acquire, with minimal assumptions on camera placement and capabilities.
Hence, we abide by the following principles:
Feature Detection and Correspondences
Our calibration grid consists of several squares, whose corners are the feature points we
locate in the images. The corners are located in three stages:
Nonlinear Optimization (Bundle Adjustment)
Having obtained a set of correspondences between 3D points on the grid and their 2D image
locations, we seek the model parameters that best fit the observed data. This is easily
formulated as a nonlinear minimization problem over the calibration parameters (and the motion
of the planar calibration grid), with the pixel reprojection error as the objective.
For an initial guess, we compute the calibration parameters of each camera separately using
implementations of Zhang's algorithm (Zhang, MSR Tech. Report MSR-TR-98-71). Complete details of this stage are
provided in , here we merely sketch the necessity of exploiting sparsity
in the optimization.
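As a rough illustration of the pixel reprojection error being minimized, the sketch below projects grid points through an idealized pinhole camera and measures the RMS distance to the observed pixels. The camera model here (a single focal length, a principal point, no lens distortion) and all function names are assumptions made for the example; it is not the 12-parameter model used in our implementation.

```python
import math

def project(point, rotation, translation, focal, center):
    """Project a 3D point to pixel coordinates with an ideal pinhole camera."""
    # Transform into the camera frame: X_cam = R * X + t.
    cam = [sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
           for r in range(3)]
    # Perspective division, then scale by the focal length and shift to the
    # principal point.
    u = focal * cam[0] / cam[2] + center[0]
    v = focal * cam[1] / cam[2] + center[1]
    return (u, v)

def rms_reprojection_error(points, pixels, rotation, translation, focal, center):
    """RMS distance in pixels between observed and predicted image points."""
    total = 0.0
    for point, (u_obs, v_obs) in zip(points, pixels):
        u, v = project(point, rotation, translation, focal, center)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(points))

# Toy data (hypothetical): four corners of a planar grid at z = 0, seen by a
# camera with identity rotation placed 2 units away along the optical axis.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
translation = (0.0, 0.0, 2.0)
grid = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0)]
observed = [project(p, identity, translation, 800.0, (320.0, 240.0)) for p in grid]

# Observations generated by the same model reproject exactly.
error = rms_reprojection_error(grid, observed, identity, translation,
                               800.0, (320.0, 240.0))

# Shifting every observation by one pixel in u gives an RMS error of one pixel.
shifted = [(u + 1.0, v) for (u, v) in observed]
shifted_error = rms_reprojection_error(grid, shifted, identity, translation,
                                       800.0, (320.0, 240.0))
```

The optimization adjusts the camera parameters and grid poses so that this quantity, summed over all cameras and snapshots, is as small as possible.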
The nonlinear minimization is a fairly large-scale problem. Suppose we have N cameras taking S synchronized snapshots. Each camera is modeled by 12 parameters; in addition, there are S-1 rigid motions of the calibration grid, each described by 6 parameters. (The position of the grid in the first snapshot is used to define the global coordinate system.) This leads to a (12N+6S-6)-dimensional problem: this is the number of parameters we are solving for. Suppose that, on average, we extract P point correspondences from each image; then we have a total of D=P*N*S observations. Typically, we would have about N=100 cameras, S=15 snapshots, and about P=120 point correspondences per image. This is a 1284-dimensional search, with a 180,000 x 1284 Jacobian. Stored densely in double precision, this Jacobian would require about 1.7 gigabytes, and would take impractically long to compute.
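The arithmetic above can be reproduced with a few lines of Python. This is only a bookkeeping sketch; the function name and its arguments are chosen for the example, and it follows the text in counting one Jacobian row per observation.

```python
def problem_size(num_cameras, num_snapshots, points_per_image,
                 params_per_camera=12, params_per_pose=6):
    """Return (unknowns, observations, dense Jacobian size in bytes)."""
    # The first grid pose defines the global frame, so only S - 1 poses
    # are unknown.
    unknowns = (params_per_camera * num_cameras
                + params_per_pose * (num_snapshots - 1))
    observations = points_per_image * num_cameras * num_snapshots
    dense_bytes = observations * unknowns * 8  # 8 bytes per double
    return unknowns, observations, dense_bytes

unknowns, observations, dense_bytes = problem_size(100, 15, 120)
# 1284 unknowns, 180000 observations; the dense size is printed in units
# of 2**30 bytes.
print(unknowns, observations, round(dense_bytes / 2**30, 2))
```

The same function gives 1062 unknowns for 85 cameras and 8 snapshots.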
Fortunately, it is easy to show that the Jacobian is very sparse. Each row of the Jacobian corresponds to one ray through a point on the calibration grid and the center of one camera in the array. The only parameters this ray depends on are the 12 parameters of the camera and the 6 representing the pose of the calibration grid. This means that each row of the Jacobian can have at most 18 nonzero entries. Consequently, the effective width is a constant, independent of the number of cameras or snapshots. Exploiting this sparsity is necessary for a feasible implementation of the optimization.
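To make the sparsity pattern concrete, the sketch below lists which Jacobian columns can be nonzero for a given camera and snapshot, and compares the resulting storage with the dense case. The column layout (all camera blocks first, then the grid poses) is an assumption made for illustration and may differ from the ordering used in our code.

```python
def jacobian_row_columns(camera, snapshot, num_cameras,
                         params_per_camera=12, params_per_pose=6):
    """Columns that can be nonzero in a row for `camera` and `snapshot`.

    Assumed layout: camera blocks first, followed by the poses of
    snapshots 1..S-1.
    """
    start = camera * params_per_camera
    columns = list(range(start, start + params_per_camera))
    if snapshot > 0:
        # Snapshot 0 fixes the global frame and has no pose unknowns.
        pose_start = (num_cameras * params_per_camera
                      + (snapshot - 1) * params_per_pose)
        columns += list(range(pose_start, pose_start + params_per_pose))
    return columns

N, S, P = 100, 15, 120
rows = P * N * S

# Largest number of nonzeros any row can have (12 camera + 6 pose parameters).
max_nonzeros = max(len(jacobian_row_columns(c, s, N))
                   for c in range(N) for s in range(S))

# Upper bound on storage for the nonzero values alone, in bytes
# (index overhead of a sparse format is ignored here).
sparse_bytes = rows * max_nonzeros * 8
```

For the typical problem described above this bound is about 26 megabytes of values, compared with roughly 1.7 gigabytes for the dense matrix.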
Histogram of reprojection errors
We calibrated an array of 85 cameras from 8 snapshots of the calibration grid. This involved
a 1062-dimensional search, given a total of 86824 3D-2D pairs. Below we show the error statistics,
and visualizations of computed camera geometry.
RMS error: 0.3199 pixels
Mean error: 0.2647 pixels
Median error: 0.2342 pixels
Standard deviation: 0.1796 pixels
Avg. planarity deviation: 0.9636 cm
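For reference, statistics of this kind can be computed from a list of per-point reprojection errors as sketched below. The function name and the sample values are hypothetical, and the use of the population standard deviation is an assumption (with 86824 points the choice makes no practical difference). The sketch also checks that the reported figures are mutually consistent, since the squared RMS equals the squared mean plus the variance.

```python
import math
import statistics

def error_summary(errors):
    """Summary statistics for a list of per-point reprojection errors."""
    return {
        "rms": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "mean": statistics.mean(errors),
        "median": statistics.median(errors),
        "std": statistics.pstdev(errors),  # population standard deviation
    }

# Hypothetical sample values, in pixels.
summary = error_summary([0.1, 0.2, 0.3, 0.4])

# Consistency check on the reported figures: rms^2 = mean^2 + std^2.
implied_rms = math.sqrt(0.2647 ** 2 + 0.1796 ** 2)
print(round(implied_rms, 4))  # 0.3199, matching the reported RMS error
```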
Camera centers, as computed by calibration, projected onto a photo of the array
itself. The projection was approximated by a homography, computed from control points
specified manually on the photograph as the targets for the camera centers.
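A homography of this kind can be estimated from point correspondences. The sketch below is a minimal version that assumes exactly four control points and fixes the last entry of the homography to 1; the number of control points actually used for the figure is not stated here, and the coordinates in the example are made up.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_homography(src, dst):
    """Homography (last entry fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_homography(h, point):
    """Map a 2D point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Hypothetical example: camera centers expressed in the plane of the array
# (src) and the pixels clicked on the photograph (dst).
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [(10.0, 20.0), (110.0, 25.0), (120.0, 130.0), (5.0, 125.0)]
H = fit_homography(src, dst)
mapped = [apply_homography(H, p) for p in src]
```

With more than four control points, the same equations would be solved in a least-squares sense instead of exactly.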
Control points (clicked manually)
Superimposed control points and camera projections
Cumulative distribution of reprojection errors
2D plot of reprojection errors, MATLAB .fig file
3D plot of camera centers, MATLAB .fig file
Zhengyou Zhang, MSR Tech. Report MSR-TR-98-71
Vaibhav Vaish (CS 205 Winter 2002-3 class project)
Last update: May 7th, 2003.