A white paper submitted to the
National Science Foundation's Digital Libraries Initiative
Marc Levoy and Hector Garcia-Molina
Computer Science Department
Stanford University
December 4, 1999
(revised March 27, 2000)
Recent improvements in laser rangefinder technology, together with algorithms for combining multiple range and color images, allow us to reliably and accurately digitize the external shape and surface characteristics of many physical objects. Examples include machine parts, design models, toys, and artistic and cultural artifacts. Although the technologies required to create digital archives of 2D artifacts have matured substantially in the last ten years [Lesk97], the jump from two to three dimensions poses new problems. These are problems of both scale and substance, and they touch on every aspect of digital archiving: storage, indexing, searching, distribution, viewing, and piracy protection. To prevent our libraries and museums from being blindsided by technological change, we should begin addressing these problems now. Furthermore, to insure that the solutions we adopt are useful and scalable, we should incorporate into our research programs the construction of several non-toy 3D digital archives.
In this white paper, we enumerate and discuss the challenges of creating
digital archives of 3D artworks. We also propose solutions to some of these
problems. In the "sidebars" (sections with italicized titles), we illustrate
these problems with examples taken from the Digital Michelangelo Project, and
we identify particular solutions that we at Stanford would like to try
implementing, using the 3D data acquired during this project as a testbed.
3D scanning produces large datasets. During our year in Italy we acquired 250 gigabytes of raw data. Over the next year or so, we will clean up this data, edit it, and assemble it to create 3D geometric models with color. We estimate that these models will occupy another 250 gigabytes. Since the technology for assembling scans into models is immature, we consider it important that scholars continue to have access to the raw data. Thus, the total size of our archive will be about 500 gigabytes. Although this database is currently the largest of its type in the world, we believe that with continued development and commercialization of 3D digitization technologies, archives of this size will soon become commonplace, and will even grow in size.
The enormous size of 3D archives makes it unlikely that many libraries will be sufficiently wealthy to maintain local copies of them. Certainly, online storage is out of the question. At the current price of $6 per gigabyte for a low-end hard drive, it would cost $3,000 to keep our 500-gigabyte database "spinning". No library can spend this much archiving a single work. Offline storage media, like 8-gigabyte DVD disks, offers an attractive alternative. However, our database would occupy over 60 DVD disks, which if nothing else represents a significant investment in shelf space.
We therefore believe that 3D archives must be stored centrally and distributed on demand to libraries. The centralized server might be at the museum that owns the work, at a centralized library (like the U.S. Library of Congress), or elsewhere. Although centralizing the storage of 3D works eliminates the cost of redundant copies, it does not eliminate the cost of the primary archive, which is high. This solution also forces a long-term relationship between the serving institution and the client library, one that has monetary implications. Moreover, if the serving institution closes, then the library has failed in its role of providing a safe, permanent archive of valuable works for posterity.
Another problem related to storage of 3D works is insuring the longevity of the archive. Unlike images, which can be printed on paper at high resolution to insure their survival in the event a digital archive becomes unreadable or undecodable, 3D models have no natural printed representation. Of course, rapid prototyping technologies (such as stereolithography) could be used to make physical replicas [Curless, 1996] of small objects, but the cost of making such replicas is high, the replicas do not capture color or surface finish, and replicas cannot be made of large objects except at a reduced scale and quality. Thus, it is important to preserve the digital models themselves. Due to the size and complexity of the 3D models, techniques that are being developed for archiving digital information will have to be extended [Cooper, 1999; Garrett 1996].
One approach that has been suggested is to save not just the bytes, but also the software needed to interpret the bytes, with the idea that this software can be recompiled on future machines [Crespo, 1999]. In the case of graphical data, this implies archiving not only the source code for file readers and interactive viewers, but also for a layered hierarchy of high-level and low-level graphics packages. In the case of interactive 3D graphics, these packages often interact not only with the operating system, but also with graphics acceleration hardware and with display and input devices. These interactions greatly complicate this "emulation" approach.
One more problem worth mentioning is that because 3D scanning is a young technology, there are methodologies and parameters associated with a 3D digitization effort that should be recorded and made available to users of the data. There is no clear concensus on how such "metadata" should be represented or stored [Weibel, 1995]. The current "technology" - README files - is inadequate; better methods will need to be developed.
If 3D works are centrally archived, and to some extent even if they are not, techniques must be developed for indexing the works within each client library. Since artwork is inherently graphical, and modern computers make graphics easy, it makes sense to create graphical catalogs for digitized art. For 2D works, local indexing can be accomplished by catalogs containing thumbnail images. For 3D works, these thumbnails should be interactive, allowing the library user to examine the work from all angles. However, they should also be lightweight enough for browsing on a low-cost PC, even a laptop or handheld computer. Unfortunately, computers capable of rotating 3D models in real-time are expensive, and even low-detail 3D models of complex objects require a lot of memory, both in the computer and on disk.
One possible solution to this quandary is to use techniques from the relatively new field of image-based rendering [IBR]. Image-based rendering generates views of a 3D object from a set of previously computed (or photographed) images. Since only pixels are being manipulated, not 3D models, these techniques are efficient, and their computational cost is independent of object complexity.
Although image-based rendering techniques are still in their infancy, they have
already found application in the commercial sector. For example, in an effort
to make online merchandise catalogues more compelling and useful than their
printed counterparts, many retailers are experimenting with texture-mapped 3D
models (such as Meta Creations's MetaStream format) or animation flipbooks
(such as Apple's QuickTime VR, a primitive form of image-based rendering [Chen,
1995]). We expect explosive growth in this area in the coming years.
Cataloguing Michelangelo's statues using light fields
We propose exploring the use of light fields for cataloguing archives of 3D artworks. Such catalogues would be authored by the museum or central server, stored on each library's server (although downloading on demand is also possible), and viewed on an inexpensive PC or laptop. To test our idea, we propose to build a prototype catalog for the statues of Michelangelo where the entries are interactive light fields built from renderings of our computer models of the statues. |
Although some museums have joined licensing cooperatives [MDLC], they are as a whole unsure what economic model to use to license and distribute their artistic patrimony. The reasons for this uncertainty are multifold: the cost of digitizing artifacts is high, the cost of maintaining and distributing a digital archive is high, and the economic value of digital representations of two-dimensional works is uncertain. One reason for this uncertainty is that photographic replicas of important paintings and drawings abound, many of them unlicensed, so the liklihood of realizing substantial royalties from digital replicas of these artworks is low.
The situation for three-dimensional artworks is different, however. Until a few years ago no famous sculptures had been scanned, and of those that have been scanned, only a handful have been used to manufacture replicas for sale. Although replicas of famous statues (such as Michelangelo's David) abound, these are based on handmade clay models and are of poor quality. Thus, the potential economic value of digital representations of 3D artifacts is high. Indeed, the U.S. market for replicas of statuary and other 3D artifacts, much of it conducted through mail-order catalogs, exceeds a billion dollars annually.
Providing physical protection for valuable intellectual property is a hard problem composed of many parts. One aspect is secure archiving. If due to their large size 3D archives are kept in only a few central locations, rather than in libraries worldwide, then maintaining security for the data is simplified. However, centralized archiving does not entirely solve the security problem; it only shifts it. In order to be useful, an archive must be downloadable and/or viewable. To protect data during these operations, researchers have suggested some combination of 3D digital watermarking, piracy detection, and piracy prevention. Let us consider each in turn.
Michelangelo's David: a driving application for piracy preventionPiracy is a real and constant danger in the Digital Michelangelo Project. The David is arguably the most famous statue in the world. Our model, accurate to 0.25mm, is not likely to be superceded for many years, if ever. In theory, we are permitted to release this model to scholars for non-commercial use. However, it is obvious to us that if our computer model were widely distributed, it would soon be pirated. Before long, we would be able to buy a bootlegged DVD copy for $25 or a simulated marble replica for $250. The replica would even bear a tag saying it was the true David, made from the Stanford data. From there, it would find its way into TV commercials, video games, etc. Aside from landing one of us (Levoy) in an Italian jail, this would poison the well for future digitization projects. Museums worldwide will undoubtedly take note of how we handle intellectual property protection for our models of Michelangelo's statues. Distributing them to scholars, while at the same time protecting them against piracy, is a very hard problem. |
One of the main reasons for digitizing an archive is to make it searchable using a computer, and 3D archives are no exception. Just as you would search a text archive for a given word or phrase, or search a 2D archive for a given image, you might search a 3D archive for a given shape. There are many applications for such a capability. One, already mentioned, is to search for pirated copies of a copyrighted model. On a more positive note, a doctor might want to search a 3D anatomical atlas for bone anomalies having a certain shape, a classical scholar might want to search the world's museums for Greek vases having a certain profile, ichthyologist might want to search a 3D field guide for fish whose shape matches a given sonar signature. Although search techniques for images is a rapidly growing field, there has been relatively little research on 3D shape matching. Note that in this discussion, we are talking about searches driven by given target images (or shapes), not by keywords.
Shape searching is similar to image searching in many respects. In both cases, candidate images (or shapes) can be translated, rotated, scaled or otherwise transformed relative to the target image (or shape), and in both cases, candidates can be present in conjunction with other objects, requiring segmentation. In some ways, shape searching is easier than image searching: there is no need to worry about perspective viewpoint, occlusions (shapes don't "block" other shapes the way superimposed images do), or differences in tone or color. In other ways, shape searching is harder: the same shape can be represented by a variety of geometric primitives (polygons, B-splines, voxels, etc.), there are a variety of file formats and no widely accepted standards, and the cost of each comparison is high.
One obvious approach to searching for shapes is to borrow techniques from the
image processing literature, such as the notion of computing and comparing
vectors of salient shape descriptors (like moments, topological genus (number
of holes), etc.). For the slightly different problem of testing for matches
between two shapes, one might use surface-to-surface geometric alignment
algorithms [Pulli, 1999], which one of us (Levoy) used when aligning 3D scans
of the statues of Michelangelo.
The Forma Urbis Romae: a driving application for shape matching
As a driving application for our investigation of shape searching and matching, we propose using our ongoing attempt to piece together the jigsaw puzzle of the Forma Urbis Romae. During the same year that one of us (Levoy) scanned the statues of Michelangelo, a project was mounted to digitize all 1,163 fragments of the Forma Urbis Romae, a giant map of ancient Rome carved onto marble slabs circa 200 A.D. This map is probably the single most important document on ancient Roman topography, and piecing its fragments together has been one of the great unsolved problems of classical archaeology [Carettoni, 1960]. The fragments of the Forma Urbis present many clues to the would-be puzzle solver. We plan search not only for matches between the 3D shapes of the fragments, but also between the 2.5D patterns of incisions on their top surfaces, which depict the streets and buildings of the ancient city. We also plan to search for matches between the incisions on the Forma Urbis fragments and all known 2D plans of modern excavations in the city. To support such a wide variety of searches, especially hybrid searches involving multiple datatypes, as well as searches that go beyond the data we acquired while in Italy, we will need powerful and flexible search engines. |
The 3D scanning industry, although currently immature, is doubling in size every year. Within the library and museum communities, the jump from 2D digitization to 3D digitization is already beginning. This jump poses serious research challenges, most of which remain unexplored. In this paper, we have attempted to enumerate these challenges, driven to some extent by the very real roadblocks we have encountered in the Digital Michelangelo Project.
We expect great synergy between this new effort and the ongoing Stanford Digital Library Project. For instance, the ongoing project is already exploring archival repositories for digital libraries, and a repository of 3D models would provide a new application. This new application would force us to design for much larger repositories, and to develop new techniques specifically for preserving 3D models. Similarly, the ongoing project has explored copy detection schemes for text and images, which might be extended to 3D models. The ongoing project has developed mechanisms for digital library interoperation (e.g., the InfoBus). Again, the rich 3D models of the Digital Michaelangelo Project will provide new interoperability challenges. For example, can this new type of information be served through the library interfaces that have already been developed, or do these interfaces need to be extended in new ways?
Beraldin, J.-A., Blais, F., Cournoyer, L., Rioux, M., El-Hakim, S.F., Rodell, R., Bernier, F., Harrison, N., Digital 3D imaging system for rapid response on remote sites, Proc. Second International Conference on 3-D Digital Imaging and Modeling, IEEE Computer Society Press, 1999, pp. 34-43.
Carettoni, G., Colini, A., Cozza, L., Gatti, G., La Pianta Marmorea di Roma Antica (The Marble Map of Ancient Rome), Commune di Rome, 1960.
Chen, S.E., QuickTime VR - An Image-Based Approach to Virtual Environment Navigation, Proc. SIGGRAPH '95 (Los Angeles, CA, August 6-11, 1995). In Computer Graphics Proceedings, Annual Conference Series, 1995, ACM SIGGRAPH, pp. 29-38.
Cooper, B., Crespo, A., Garcia-Molina, H., Implementing a Reliable Digital Object Archive, available at http://www-db.stanford.edu/pub/papers/arpaper.ps.
Crespo, A., Garcia-Molina, H., Modeling Archival Repositories for Digital Libraries, Stanford Computer Systems Laboratory Technical Report, 1999.
Curless, B., Levoy, M., A Volumetric Method for Building Complex Models from Range Images, Proc. SIGGRAPH '96 (New Orleans, LA, August 5-9, 1996). In Computer Graphics Proceedings, Annual Conference Series, 1996, ACM SIGGRAPH, pp. 303-312.
The Digital Michelangelo Project [DigMichProj], www.graphics.stanford.edu/projects/mich.
Garrett, J., Waters, D., Preserving Digital Information: Report of the Task Force on Archiving of Digital Information, May, 1996, accessible at http://www.rlg.org/ArchTF/.
Levoy, M., Hanrahan, P., Light Field Rendering, Proc. SIGGRAPH '96 (New Orleans, LA, August 5-9, 1996). In Computer Graphics Proceedings, Annual Conference Series, 1996, ACM SIGGRAPH, pp. 31-42.
Workshop on Image-Based Modeling and Rendering [IBR], March 23-25, 1998, Stanford University, at http://graphics.stanford.edu/workshops/ibr98/.
Lesk, M., Practical Digital Libraries: Books, Bytes, and Bucks, Morgan Kaufman, 1997.
Museum Digital Library Collection (MDLC), www.museumlicensing.org.
Ohbuchi, R., Masuda, H., Aono, M., Watermarking three-dimensional polygon models through geometric and topological modifications, IEEE Journal on Selected Areas in Communications, Vol. 16, No. 4, May, 1998, pp. 551-559.
Praun, E., Hoppe, H., Finkelstein, A., Robust Mesh Watermarking, Proc. SIGGRAPH '99 (Los Angeles, August 8-13, 1999). In Computer Graphics Proceedings, Annual Conference Series, 1999, ACM SIGGRAPH, pp. 49-56.
Pulli, K., Multiview registration for large datasets, Second International Conference on 3D Digital Imaging and Modeling, 1999.
Rushmeier, H., Bernardini, F., Mittleman, J., Taubin, G., Acquiring input for rendering at appropriate levels of detail: digitizing a Pietà, Proc. 9th Eurographics Rendering Workshop, Springer-Verlag, 1998, pp. 81-92.
Weibel, S., Metadata: the foundations of resource description, D-Lib Magazine, July, 1995.