Conditionally accepted to SIGGRAPH Asia 2015

Abstract

We present a novel method to generate 3D scenes that allow the same activities as real environments captured through noisy and incomplete 3D scans. As robust object detection and instance retrieval from low-quality depth data is challenging, our algorithm aims to model semantically correct rather than geometrically accurate object arrangements. Our core contribution is a new scene synthesis technique which, conditioned on a coarse geometric scene representation, models functionally similar scenes using prior knowledge learned from a scene database. The key insight underlying our scene synthesis approach is that many real-world environments are structured to facilitate specific human activities, such as sleeping or eating. We represent scene functionalities through virtual agents that associate object arrangements with the activities for which they are typically used. When modeling a scene, we first identify the activities supported by a scanned environment. We then determine semantically plausible arrangements of virtual objects retrieved from a shape database constrained by the observed scene geometry. For a given 3D scan, our algorithm produces a variety of synthesized scenes which support the activities of the captured real environments. In a perceptual evaluation study, we demonstrate that our results are judged to be visually appealing and functionally comparable to manually designed scenes.





3D Model Database: Meshes(6.9G) | Textures(284M)

The meshes file (wss.models.zip) is a zip archive of 12490 (MODEL_ID.obj.gz, MODEL_ID.mtl) pairs, plus a rooms.zip; each *.obj.gz file is a gzip-compressed *.obj file. The textures file (wss.texture.zip) is a zip archive of 22985 *.jpg images.
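As a rough illustration of working with this layout, the following Python sketch reads one MODEL_ID.obj.gz entry directly from the archive and collects its vertices and faces. The helper names read_obj_gz and load_model are hypothetical and not part of our released code. The internal directory layout of the archive is not described here, so the sketch assumes only that an entry's base name is MODEL_ID.obj.gz and searches for it.

```python
import gzip
import zipfile


def read_obj_gz(raw):
    """Decompress gzipped OBJ bytes and return (vertices, faces)."""
    vertices, faces = [], []
    text = gzip.decompress(raw).decode("utf-8", errors="replace")
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v" and len(parts) >= 4:
            # Geometric vertex: "v x y z"
            vertices.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # Face entries look like "i", "i/t", "i//n" or "i/t/n";
            # keep only the (1-based) vertex index.
            faces.append(tuple(int(p.split("/")[0]) for p in parts[1:]))
    return vertices, faces


def load_model(archive, model_id):
    """Find MODEL_ID.obj.gz in an open zipfile.ZipFile and parse it.

    Assumption: the entry's base name is exactly MODEL_ID.obj.gz; any
    directory prefix inside the archive is ignored.
    """
    wanted = f"{model_id}.obj.gz"
    for name in archive.namelist():
        if name.rsplit("/", 1)[-1] == wanted:
            return read_obj_gz(archive.read(name))
    raise KeyError(f"{wanted} not found in archive")
```

With the archive downloaded, usage would be along the lines of `with zipfile.ZipFile("wss.models.zip") as z: vertices, faces = load_model(z, some_model_id)`, where some_model_id is any MODEL_ID in the database.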


3D Model Annotations from Previous Work: Categories(574K) | Scales(2.3M)

The model categories file is a tab-separated text file where each line has the fields MODEL_ID and category. The model scales file is a comma-separated text file with a header, where the "fullId" column corresponds to MODEL_ID and the "unit" column gives the virtual unit scale of the model in meters.
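The two annotation files can be read with a standard CSV parser. The sketch below is a minimal example under the format described above; the function names read_categories and read_scales are hypothetical. It assumes the categories file has no header line and that "fullId" values can be used as keys unchanged.

```python
import csv


def read_categories(lines):
    """Map MODEL_ID -> category from tab-separated lines (no header assumed)."""
    categories = {}
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 2:
            categories[row[0]] = row[1]
    return categories


def read_scales(lines):
    """Map fullId -> unit scale in meters from comma-separated lines with a header."""
    scales = {}
    for row in csv.DictReader(lines):
        scales[row["fullId"]] = float(row["unit"])
    return scales
```

Both functions accept any iterable of text lines, so an open file can be passed directly, for example `with open(path, newline="") as f: scales = read_scales(f)`. Multiplying a model's vertex coordinates by its "unit" value then converts them to meters.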


3D Scene Databases: from Previous Work plus Scenes(810K)

The scenes file is a JSON file.


Interaction Maps Defined over 3D Model Database: Interaction Maps(810K)

The interaction maps file is a zip archive of interaction maps (in *.arv format) defined over 394 models. The *.arv format is a binary dump of the annotations. Please refer to our code for parsing them.


3D Scans and Results

Here are the input scans (in *.ply format) and results (in *.??? format)

[Computer Desk: Scan|Results]

[Single Dining Table: Scan|Results]

[Double Dining Table: Scan|Results]

[Living Room: Scan|Results]

[Studio Apartment: Scan|Results]

[Student Office: Scan|Results]

[Bedroom: Scan|Results]


Acknowledgements

We thank Angela Dai for helping with scanning and the video voice-over. We also gratefully acknowledge NVIDIA Corporation for hardware donations. This research is co-funded by the Max Planck Center for Visual Computing and Communications and NSFC grant 61202221.