Applet: Andrew Adams, Nora Willett
Text: Marc Levoy
The lens of a photographic camera is typically a complex assembly of multiple lenses, sometimes more than a dozen. Nevertheless, for the purpose of studying how the light leaving points in the scene is imaged (focused) to points inside the camera, the laws of geometrical optics allow us to replace this assembly with a single lens of appropriate shape. In this applet, we use this thin lens approximation to explore some of the relationships between the scene and its image.
The focusing of light by a glass lens composed of opposing spherical surfaces was first worked out by Johannes Kepler in about 1604. His analysis, based on careful observation of how thin beams of light bent as they entered and left the glass, was entirely empirical. A proper mathematical treatment of this bending required Snell's law of refraction at the interface between two different materials (air and glass in this case). This law wasn't discovered until 1621. With Snell's law in hand, Carl Friedrich Gauss worked out a clever geometric procedure for tracing these rays. This procedure, now called Gaussian ray tracing or Gauss's ray diagram, is implemented in the applet above.
At the center of the applet is a thin lens. It is conventional to place the scene to the left of the lens in a ray diagram and the image to the right. Formally, we call these regions object space and image space, respectively. To see Gauss's procedure in action, drag the large gray dot in object space to a position along the topmost line in the blue grid. A set of three light paths should appear, drawn in red. One travels rightwards parallel to the optical axis (the black line perpendicular to the lens surfaces at the center of the lens), then bends downward at the lens, passing through a point on the optical axis marked with a black tick. (Actually, it bends twice, once as it enters the glass, and once as it leaves the glass. In the thin lens approximation, it is customary to draw these two bends as a single bend in the middle of the lens.)
One of the implications of Snell's law is that all rays traveling parallel to the optical axis in object space will be bent by a spherical lens such that they pass through this tick mark. The distance of this mark from the center of the lens is called the focal length of the lens. (Strictly speaking, this rule is true only if the slope of the ray after bending by the lens is not too steep. This condition is called the paraxial approximation.) A second path leaving the large gray dot passes straight through the lens without bending. (Actually, like the first ray we considered, this ray bends twice, but the net effect is that the ray continues at the same slope, merely offset slightly in position. In the thin lens approximation, one ignores this offset.) Another implication of Snell's law is that the bending of light at a glass-air interface does not depend on the direction the light is traveling. This means that we can apply the geometry of the first ray to trace a third ray, which leaves the gray dot, passes through a tick mark one focal length to the left of the lens, then bends at the lens to become parallel to the optical axis on the right side of the lens. These three rays meet at a single point. In fact, it can be shown (yes, through Snell's law) that all rays leaving the gray dot, regardless of direction, will meet in image space at this same point (assuming they manage to hit the lens at all). This point is the (focused) image of the gray dot.
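If you prefer code to pictures, here is a small Python sketch of the same construction (not the applet's own code; the function name and the millimeter values are just illustrative). It intersects the parallel ray with the undeviated ray through the lens center to locate the image point.

    def image_of_point(s_o, y_o, f):
        """Locate the image of an object point by intersecting two principal rays.

        s_o : distance from the object point to the lens (s_o > f assumed)
        y_o : height of the object point above the optical axis
        f   : focal length of the thin lens
        Returns (s_i, y_i): distance of the image behind the lens and its height.
        """
        # Ray 1 leaves the object parallel to the axis, crosses the lens at height
        # y_o, then bends to pass through the rear focal point: y = y_o - (y_o / f) * x
        # Ray 2 passes through the center of the lens undeviated: y = -(y_o / s_o) * x
        # Setting the two equal and solving for x gives the image distance.
        s_i = 1.0 / (1.0 / f - 1.0 / s_o)
        y_i = -(y_o / s_o) * s_i
        return s_i, y_i

    # Example: a point 300 mm in front of a 50 mm lens and 10 mm above the axis
    # images 60 mm behind the lens, 2 mm below the axis (inverted, as in the diagram).
    print(image_of_point(s_o=300.0, y_o=10.0, f=50.0))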
Drag the gray dot to each position on the blue grid. Note that for each position, the rays leaving it are brought to a focus by the lens at some point in image space (on the right). This means that the lens is not just bringing a plane in object space into focus at a plane in image space; it is bringing every point in object space into focus at some point in image space. In other words, the lens is making a focused copy of object space in image space. Why don't we see everything in focus when we take a picture using a camera? Because the camera's film or sensor chip occupies only one plane in image space. Thus, it is taking a 2D "slice" of this 3D image. Within that slice, only points in object space that lie on one plane will be captured in focus. Other points in object space are indeed brought to a focus inside the camera, but at positions in front of or behind the sensor. We can "find" those focused positions if we move the sensor or, nearly equivalently, the lens.
A second thing to notice is that points in object space (on the left) that are further from the lens are focused to positions in image space (on the right) that are closer to the lens. Thus, if you have only a 2D sensor to capture image space, as practical cameras do, then you should move the sensor closer to the lens to capture points that are further away from the lens in object space, and vice versa. This relationship between object distance and image distance is captured in the Gaussian lens formula, which we consider in another applet.
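For the curious, here is a short numeric sketch of this inverse relationship, using the Gaussian lens formula (1/s_o + 1/s_i = 1/f) a little ahead of that applet. The focal length, sensor position, and object distances below are made-up illustrative values.

    def image_distance(s_o, f):
        """Gaussian lens formula 1/s_o + 1/s_i = 1/f, solved for the image distance s_i."""
        return 1.0 / (1.0 / f - 1.0 / s_o)

    f = 50.0        # focal length, mm
    sensor = 52.6   # lens-to-sensor distance, mm

    for s_o in (500.0, 1000.0, 2000.0, 4000.0):
        s_i = image_distance(s_o, f)
        if abs(s_i - sensor) < 0.05:
            note = "in focus on the sensor"
        elif s_i > sensor:
            note = "comes to a focus behind the sensor"
        else:
            note = "comes to a focus in front of the sensor"
        print(f"object at {s_o:6.0f} mm -> image at {s_i:5.2f} mm  ({note})")

With these numbers, only the object 1000 mm away lands in focus on the sensor; closer objects focus behind it and farther objects focus in front of it, which is the inverse relationship described above.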
A third thing to notice, and perhaps the most interesting, is that points in object space (on the left) whose distance from the lens varies but whose distance from the optical axis is constant (i.e. points along a horizontal line in the blue grid) are focused to positions in image space whose distance from the optical axis varies, forming one of the slanted lines in the blue grid at right. (Be careful not to interpret the blue grid at right as a perspective view of a plane coming out of your computer screen. It looks like one, for a good reason that we'll discuss in a moment, but it's not. Everything depicted in this applet is happening on a single plane parallel to the surface of your screen.) The conclusion one draws from these grids is that objects of fixed size but at different depths in the scene produce images of different sizes in the camera. On the one hand, this fact hardly seems surprising. We are merely saying that objects that are far away from the lens will be recorded as smaller in a photograph taken by a camera. Everybody knows this. What's more surprising is that if an object appears slightly out of focus in a photograph, moving the sensor forward or backward to bring it into focus will also change its size in the photograph!
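Here is a rough Python sketch of the first, unsurprising part of this observation: hold the object height fixed, vary its distance, and compute the height of its focused image using the chief ray through the lens center. All numbers are illustrative.

    def image_distance(s_o, f):
        """Gaussian lens formula, solved for the image distance."""
        return 1.0 / (1.0 / f - 1.0 / s_o)

    f   = 50.0    # focal length, mm
    y_o = 100.0   # object height above the axis, mm (held fixed)

    for s_o in (500.0, 1000.0, 2000.0, 4000.0):
        s_i = image_distance(s_o, f)
        y_i = -(y_o / s_o) * s_i   # chief ray through the lens center gives the image height
        print(f"object at {s_o:6.0f} mm -> focused image at {s_i:5.2f} mm, height {y_i:7.2f} mm")

The image height shrinks as the object recedes, so a horizontal line of object points maps to one of the slanted lines in the right-hand grid.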
Don't believe us? Try it for yourself by manually focusing an SLR while you study the position of an out-of-focus feature in the scene. As you refocus the camera, features will move towards or away from the center of the field of view. The effect is strongest if the lens is focused close to the camera, or is a macro lens. Returning to the applet, look at the spray of rays leaving the focused image of the gray dot (on the right side of the lens). If the sensor is not located at this focused position, this spray shows the position and shape of the blur (a.k.a. the bokeh) of our out-of-focus view of the gray dot. Note that this blur moves further from the optical axis as it travels rightward. (An image-space telecentric lens does not suffer from this effect, but photographic cameras are not telecentric. A microscope is object-space telecentric, which has a slightly different effect, but that's another story.)
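As a rough numeric sketch of both effects, assuming a thin lens, a circular aperture, and made-up numbers: the blur diameter follows from similar triangles on the cone of rays converging to the focused image, and the blur center follows the undeviated chief ray through the lens center, so it drifts away from the axis as the sensor moves rightward.

    def image_distance(s_o, f):
        """Gaussian lens formula, solved for the image distance."""
        return 1.0 / (1.0 / f - 1.0 / s_o)

    f   = 50.0     # focal length, mm
    A   = 25.0     # aperture diameter, mm (an f/2 lens)
    y_o = 100.0    # object height above the axis, mm
    s_o = 1000.0   # object distance, mm

    s_i = image_distance(s_o, f)               # where this point is in sharp focus
    for sensor in (s_i - 2.0, s_i, s_i + 2.0):
        center = -(y_o / s_o) * sensor         # chief ray through the lens center is straight
        blur   = A * abs(sensor - s_i) / s_i   # similar triangles on the converging cone of rays
        print(f"sensor at {sensor:5.2f} mm -> blur centered at {center:6.2f} mm, diameter {blur:4.2f} mm")

Note that the blur's center moves away from the axis as the sensor moves rightward, which is exactly why refocusing shifts (and rescales) an out-of-focus feature in the photograph.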
As we have said, a lens copies the scene from 3D object space into 3D image space, performing a certain transformation as it does so. Readers with a background in computer graphics will look at the shape of the blue grid on the right, and they'll see a frustum, a truncated pyramid. Indeed, a cube centered around the optical axis in object space is transformed by a lens into a frustum shape in image space, where the apex of the corresponding (untruncated) pyramid lies on the optical axis one focal length from the lens. In other words, lenses perform a 3D perspective transformation. In computer graphics this transformation is performed using a certain 4x4 matrix. Using simple algebra, one can show that this matrix is identical to the ray transfer matrices found in advanced optics textbooks.
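To make that connection concrete, here is one way to write the thin-lens mapping as a 4x4 homogeneous matrix in Python. The particular matrix and sign conventions below are our own illustrative choices, not a quotation from any graphics or optics text: z is measured in front of the lens, and the result gives the inverted image coordinates and the image distance z' behind the lens.

    import numpy as np

    f = 50.0  # focal length, mm (illustrative)

    # Applying M to (x, y, z, 1) and dividing by the last homogeneous coordinate
    # (z - f) gives x' = -f x / (z - f), y' = -f y / (z - f), z' = f z / (z - f),
    # i.e. the lateral magnification and the Gaussian lens formula in one matrix.
    M = np.array([[ -f, 0.0, 0.0, 0.0],
                  [0.0,  -f, 0.0, 0.0],
                  [0.0, 0.0,   f, 0.0],
                  [0.0, 0.0, 1.0,  -f]])

    def lens_map(x, y, z):
        """Map an object-space point to its image-space point via the projective matrix."""
        hx, hy, hz, hw = M @ np.array([x, y, z, 1.0])
        return hx / hw, hy / hw, hz / hw

    # The same object point as in the earlier sketch: 10 mm off-axis, 300 mm in front
    # of a 50 mm lens.
    print(lens_map(10.0, 0.0, 300.0))   # -> (-2.0, 0.0, 60.0), matching the ray construction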