Robust Multi-camera Calibration

CS 205 Project Proposal

Abstract:

Camera calibration is the determination of the relationship between the 3D position of a point in the world and the 2D pixel coordinates of its image in the camera. In this project, we explore extending an algorithm for calibrating a single camera to calibrating an array of 128 cameras. Our primary goal is to implement a global, nonlinear optimization procedure that computes "optimal" values of all camera parameters. We hope to achieve greater accuracy and stability this way than by using existing software to calibrate each camera separately.



Contents

  Introduction
  Camera Calibration
  The Current System
  Goals
  References

Introduction:

The Stanford Graphics Lab is building an array of 128 cameras for high-performance imaging applications [1]. Many of the applications we are targeting require very precise calibration of the cameras. Currently, we use third-party implementations [2][3] of Zhang's algorithm [4] to calibrate each camera independently. However, to our knowledge nobody has tried to calibrate this many cameras together, and we would like to study whether we can do better than simply calibrating each camera individually. In the subsequent sections we give a brief overview of camera calibration and the drawbacks of our existing approach. Thereafter, we map out the goals we wish to achieve and how techniques from CS 205 could be useful to us. The class project would involve implementing our ideas, and analyzing the theoretical and experimental improvements.


Camera Calibration:

We model a camera as a pinhole device that projects the 3D world onto a 2D image plane. The parameters we need to calibrate are the position of the pinhole (3 parameters), the orientation of the image plane (3 degrees of rotation), and four internal parameters that define the geometry of the pinhole with respect to the image plane: the focal length of the camera and the offset of the optical axis from the center of the image. Since a real camera with a lens is not exactly a pinhole device, we should also model the "distortion" caused by the lens, to varying degrees of precision.
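
As a concrete illustration of this model, the following sketch projects a 3D world point to pixel coordinates using the parameters described above. The function and parameter names (project, rvec, tvec, fx, fy, cx, cy, k1, k2) are our own illustrative choices, not identifiers from any calibration package, and the two radial distortion coefficients are just one possible way to model lens distortion.

  # Illustrative sketch of the pinhole model above (names are assumptions).
  import numpy as np
  from scipy.spatial.transform import Rotation

  def project(point_3d, rvec, tvec, fx, fy, cx, cy, k1=0.0, k2=0.0):
      """Project a 3D world point to 2D pixel coordinates."""
      # Rigid motion into the camera frame: 3 rotation + 3 position parameters.
      p_cam = Rotation.from_rotvec(rvec).apply(point_3d) + tvec
      # Perspective projection onto the normalized image plane.
      x, y = p_cam[0] / p_cam[2], p_cam[1] / p_cam[2]
      # Optional radial lens distortion, modeled here with two coefficients.
      r2 = x * x + y * y
      factor = 1.0 + k1 * r2 + k2 * r2 * r2
      x, y = x * factor, y * factor
      # The four internal parameters: focal lengths and principal point offset.
      return np.array([fx * x + cx, fy * y + cy])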

All experimental techniques for calibrating a camera begin with photographing an object of known geometry with the camera to be calibrated. The idea is to obtain a number of correspondences between points in the world whose positions (i.e., 3D coordinates) are known and the pixel coordinates of their images in the camera. Each pair of world coordinates and pixel coordinates gives us one equation in terms of the camera parameters. The system of equations is then solved for the camera parameters that give a "best fit" solution, by nonlinear minimization of an error function.
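
The sketch below, which reuses the project() function from the previous sketch, shows one way such a best-fit step could be set up with a general-purpose nonlinear least-squares solver; the packing of the unknowns into a single parameter vector is an assumption made for illustration, not the layout used by the software we cite.

  # Sketch of the "best fit" step for one camera (parameter packing assumed).
  import numpy as np
  from scipy.optimize import least_squares

  def residuals(params, world_pts, pixel_pts):
      # Unpack: 3 rotation, 3 position, 4 internal parameters.
      rvec, tvec = params[0:3], params[3:6]
      fx, fy, cx, cy = params[6:10]
      pred = np.array([project(p, rvec, tvec, fx, fy, cx, cy) for p in world_pts])
      # One 2D reprojection error per world point / pixel correspondence.
      return (pred - pixel_pts).ravel()

  # result = least_squares(residuals, initial_guess, args=(world_pts, pixel_pts))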


The Current System:

We use a plane with a patterned grid of known geometry to calibrate the cameras. The calibration procedure involves placing the grid before the cameras in several different orientations, and all cameras take a picture of the grid in each orientation. The more orientations in which the grid is photographed, the more equations we get for the camera parameters. Each equation describes the relation between a feature point on the calibration grid and the pixel coordinates of its image in the camera.

Camera Calibration Overview


Different orientations of the calibration grid seen by the camera. Each position of the grid gives us constraints on the camera parameters.


Square corners on the grid, located by a feature detector, are marked with a cross (+). The computed calibration parameters are used to reproject the grid geometry back on the image. Reprojected corners are marked with a circle (o).


A 3D view of the computed positions of the grid with respect to the camera.


Calibration error: the difference in pixel coordinates between the position of grid corners observed in the image and that predicted by the computed calibration parameters.

The calibration software [2, 3] we use reads the images of the calibration grid acquired by a single camera and estimates optimal values of the camera parameters by minimizing a nonlinear error function. We use it to calibrate each camera in our array independently of the others. This has some obvious drawbacks:

  1. We do not enforce the multi-camera constraints available to us, namely that all cameras see the same calibration grid going through the same set of orientations. By adding equations that incorporate these constraints (see the sketch after this list), we think we could achieve greater accuracy.
  2. Improper model selection, or a lack of sufficient images, can drive the optimizer to absurdly erroneous values for the camera parameters. Often this occurs because the software is trying to model the camera lens distortion with too many parameters. This leads to a degeneracy in the solution space, where instead of a unique solution there is a space of infinitely many solutions.
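
To make the first point concrete, the sketch below (again reusing project() from above) packs all cameras and all grid orientations into one global unknown vector, so that each grid pose is shared by every camera that observes it. The counts, parameter layout, and the observations data structure are illustrative assumptions, not a description of our final design.

  # Sketch of a joint optimization over all cameras and shared grid poses.
  # Counts and parameter packing below are assumptions for illustration.
  import numpy as np
  from scipy.optimize import least_squares
  from scipy.spatial.transform import Rotation

  N_CAMS, N_POSES = 128, 20      # 128 cameras; assumed 20 grid orientations
  CAM_DOF, POSE_DOF = 10, 6      # 6 extrinsic + 4 internal; 6 per grid pose

  def global_residuals(params, grid_pts, observations):
      """observations maps (camera, pose) -> observed pixel coords of corners."""
      cams = params[:N_CAMS * CAM_DOF].reshape(N_CAMS, CAM_DOF)
      poses = params[N_CAMS * CAM_DOF:].reshape(N_POSES, POSE_DOF)
      errs = []
      for (c, g), pixels in observations.items():
          rvec, tvec = cams[c, :3], cams[c, 3:6]
          fx, fy, cx, cy = cams[c, 6:10]
          # The grid pose g is shared by every camera that photographed it.
          world = Rotation.from_rotvec(poses[g, :3]).apply(grid_pts) + poses[g, 3:6]
          pred = np.array([project(p, rvec, tvec, fx, fy, cx, cy) for p in world])
          errs.append((pred - pixels).ravel())
      return np.concatenate(errs)

  # result = least_squares(global_residuals, x0, args=(grid_pts, observations))

Because each grid pose appears only once in the unknown vector, every camera that sees a given orientation contributes equations about the same pose; this is exactly the cross-camera coupling that is missing when each camera is calibrated on its own.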

We would like to implement a calibration system that tries to resolve these issues.


Goals:

The aim is to implement a nonlinear optimization procedure that searches for optimal parameters for all cameras simultaneously. We would like to experiment with this to:
  1. determine whether calibrating all cameras simultaneously achieves greater consistency or accuracy.
  2. automatically determine the desirable number of model parameters to use, to achieve stability without compromising accuracy.
  3. study the conditioning of the system in order to give useful feedback to the user, such as whether a particular camera should be shown more images of the grid (a rough sketch of one possible check follows).
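
One simple way to probe the third goal is to look at the singular values of the Jacobian that the least-squares solver returns at its solution. The sketch below is only a starting point: the threshold is an arbitrary placeholder, and a careful analysis would likely need parameter scaling and a per-camera breakdown of the weakly constrained directions.

  # Sketch: flag ill-conditioning from the Jacobian returned by least_squares.
  import numpy as np

  def report_conditioning(jacobian, threshold=1e8):
      sigma = np.linalg.svd(jacobian, compute_uv=False)
      cond = sigma[0] / sigma[-1]   # ratio of largest to smallest singular value
      if cond > threshold:          # threshold is an arbitrary placeholder
          print(f"Ill-conditioned ({cond:.2e}): consider more grid orientations "
                "or fewer distortion parameters.")
      return cond

  # cond = report_conditioning(result.jac)   # result from least_squares above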


References: