Historical Notes on Texture Synthesis

Many people have commented on the similarity between Wei&Levoy's SIGGRAPH 2000 paper [1] and Efros&Leung's ICCV 99 paper [2]. To some people it has even appeared that [1] is directly derived from [2]. This, unfortunately, is not true. I would like to explain why by describing the historical process of how I came up with the ideas in [1]. Hopefully this helps clarify some of the confusion about citations for texture synthesis. It also makes for an interesting history of where my original ideas came from.

My texture synthesis research started in 1998 as a term project in David Heeger's class "Psych 267: Vision and Image Processing". David has a very interesting paper on texture synthesis [9], and that prompted my interest in the topic. Around that time there were two additional state-of-the-art texture synthesis papers: one by Jeremy De Bonet [7] and another by Eero Simoncelli [8]. My initial attempt was to use the ideas about joint relationships in [8] to improve the image quality in [7], which has boundary artifacts because the relationship between pixels at the same pyramid level is not considered. This, unfortunately, was considered an incremental idea and resulted in my first ever paper submission, and rejection, at SIGGRAPH 99. (Another reason for the rejection was that Marc was in Italy scanning statues and dealing with museum guards and could not comment on my writing, which was horrible at that time.)

Despite the rejection, I did get some good comments from the reviews and I decided to continue the work. I did a small literature survey and found some very good papers by Kris Popat [4, 5]. I was surprised by the image quality in [4, 5], which seemed better than that of De Bonet's later publication in SIGGRAPH 97 [7]. The only tiny problem with [4, 5] is that the method can be pretty slow. (I was actually lucky enough to meet Kris in person at Xerox PARC around early 1999 to discuss this issue.) However, the idea of clustering in [4, 5] looked awfully similar to tree-structured vector quantization (TSVQ) to me, and I thought it might be a good way to accelerate the search. (I was again lucky enough to be able to discuss this with Robert Gray, the founder/co-founder of TSVQ, in my department.) My experiments with TSVQ were pretty promising, and I published a paper on this idea in SIBGRAPI 1999 [3]. This was before Efros&Leung [2] appeared and, contrary to appearances, I actually had the TSVQ algorithm before seeing [2]!

One problem with [3] is that the result images can be somewhat noisy, since TSVQ only searches one branch of the entire tree. To improve image quality, the next natural step is to search more of the tree; in the extreme case, you search the entire tree, and this reduces to exhaustively searching all texture neighborhoods. (This was partly suggested by Pat Hanrahan in our only one-on-one meeting during my Ph.D. career.) This is the basic idea of the SIGGRAPH 2000 paper [1], and the idea of exhaustive searching looks very similar to Efros&Leung [2]. The differences between our exhaustive/unaccelerated algorithm and [2] are: (1) we use a fixed neighborhood as in [4, 5] rather than the variable-sized neighborhood in [2], and (2) we use a deterministic search procedure while [2] uses probabilistic sampling. These two differences appear minor, but they have important implications. A fixed neighborhood enables all kinds of processing, such as clustering in [4, 5] and TSVQ in [1], and the deterministic nature of our algorithm is essential for guaranteeing frame-to-frame coherence when texture synthesis is applied to rendering animations.
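
For readers who want something concrete, the exhaustive, unaccelerated search described above can be sketched roughly as follows. This is only an illustrative sketch and not the code from [1]: it assumes a small grayscale image stored as nested lists of integers, a fixed L-shaped (causal) neighborhood, toroidal wrap-around at the borders of both the sample and the output, and a noise-initialized output filled in raster-scan order. The function names (causal_offsets, neighborhood, synthesize) and the parameters half and seed are made up for this example, and the TSVQ acceleration is left out entirely.

```python
import random


def causal_offsets(half):
    """Offsets (dy, dx) of a fixed-size, L-shaped causal neighborhood.

    Only pixels above the current one, or to its left on the same row,
    are included, so every neighbor is already synthesized in scan order.
    """
    offsets = []
    for dy in range(-half, 1):
        for dx in range(-half, half + 1):
            if dy == 0 and dx >= 0:
                break  # stop at the current pixel on the current row
            offsets.append((dy, dx))
    return offsets


def neighborhood(img, y, x, offsets):
    """Collect neighbor values around (y, x), wrapping toroidally (assumption)."""
    h, w = len(img), len(img[0])
    return [img[(y + dy) % h][(x + dx) % w] for dy, dx in offsets]


def synthesize(sample, out_h, out_w, half=2, seed=0):
    """Exhaustive, deterministic neighborhood search over the whole sample."""
    rng = random.Random(seed)  # fixed seed so the noise start is repeatable
    offsets = causal_offsets(half)
    sh, sw = len(sample), len(sample[0])

    # Because the neighborhood has a fixed shape, every sample neighborhood
    # can be gathered once up front as a plain vector.
    candidates = [
        (neighborhood(sample, y, x, offsets), sample[y][x])
        for y in range(sh)
        for x in range(sw)
    ]

    # Start from noise drawn from the sample's own pixel values.
    flat = [v for row in sample for v in row]
    out = [[rng.choice(flat) for _ in range(out_w)] for _ in range(out_h)]

    for y in range(out_h):
        for x in range(out_w):
            target = neighborhood(out, y, x, offsets)
            best_val, best_dist = None, float("inf")
            for cand, val in candidates:
                dist = sum((a - b) ** 2 for a, b in zip(cand, target))
                if dist < best_dist:  # strict '<': the first best match wins
                    best_dist, best_val = dist, val
            out[y][x] = best_val  # always take the single best match
    return out


if __name__ == "__main__":
    checker = [[((x + y) % 2) * 255 for x in range(8)] for y in range(8)]
    for row in synthesize(checker, 6, 6, half=1):
        print(row)
```

The two points from the text show up directly in the sketch: the precomputed candidates list is only possible because every neighborhood has the same fixed shape, which is also what would let a clustering or TSVQ structure replace the inner loop, and taking the single best match instead of sampling among good matches means the same input and the same starting noise always give the same output.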

I found [2] while collecting citations in preparation for [1], and, honestly, I was shocked by the similarity between [2] and [3] when I first saw it. I was also impressed by the beauty and simplicity of [2]; it is wonderfully free of unnecessarily complex math, in sharp contrast with other Markov-Random-Field texture synthesis papers [6]. The applications of temporal texture and texture replacement in [1] were also inspired by [2].

Therefore, it is technically incorrect to say that [1] is derived from [2]. [1] is in fact an extension of [4, 5] and should be considered concurrent work with [2], with [3] as the hard evidence. This trend of concurrency actually continued at SIGGRAPH 2001, with the appearance of two almost identical papers [10, 11] in the texturing session.

References

[1] "Fast Texture Synthesis using Tree-structured Vector Quantization", by Li-Yi Wei and Marc Levoy. In Proceedings of SIGGRAPH 2000.

[2] "Texture Synthesis by Non-parametric Sampling", by Alexei A. Efros and Thomas K. Leung. In Proceedings of ICCV 99.

[3] "Deterministic Texture Analysis and Synthesis using Tree Structure Vector Quantization", by Li-Yi Wei. In Proceedings of SIBGRAPI 1999.

[4] "Cluster-based probability model and its application to image and texture processing", by Kris Popat and Rosalind W. Picard. IEEE Transactions on Image Processing, February 1997.

[5] "Novel cluster-based probability model for texture synthesis, classification, and compression", by Kris Popat and Rosalind W. Picard. Proc. SPIE Visual Communications '93, Cambridge, Massachusetts, 1993.

[6] Texture Synthesis based on Markov Random Fields. See the citations in [1] for more details.

[7] "Multiresolution Sampling Procedure for Analysis and Synthesis of Texture Images", by Jeremy De Bonet. In Proceedings of SIGGRAPH 1997.

[8] "Texture Characterization via Joint Statistics of Wavelet Coefficient Magnitudes", by Eero P. Simoncelli and Javier Portilla. In Proc. 5th Int'l Conference on Image Processing, 1998.

[9] "Pyramid Based Texture Analysis/Synthesis", by David Heeger and James Bergen. In Proceedings of SIGGRAPH 95.

[10] "Texture Synthesis over Arbitrary Manifold Surfaces", by Li-Yi Wei and Marc Levoy. In Proceedings of SIGGRAPH 2001.

[11] "Texture Synthesis on Surfaces", by Greg Turk. In Proceedings of SIGGRAPH 2001.


liyiwei@graphics.stanford.edu