Complete scene reconstruction from single-view RGBD is a challenging task, requiring estimation of scene regions occluded from the captured depth surface. We propose that scene-centric analysis of human motion within an indoor scene can reveal fully occluded objects and provide functional cues that enhance scene understanding tasks. Captured skeletal joint positions of humans, utilised as naturally exploring active sensors, are projected into a human-scene motion representation. Inherent body occupancy is leveraged to carve a volumetric scene occupancy map initialised from captured depth, revealing a more complete voxel representation of the scene. To obtain a structured box-model representation of the scene, we introduce unique terms to an object detection optimisation that, whilst derived from the same depth data, overcome depth occlusions. The method is evaluated on challenging indoor scenes containing multiple occluding objects such as tables and chairs. Evaluation shows that human-centric scene analysis can effectively enhance state-of-the-art scene understanding approaches, resulting in a more complete representation than single-view depth alone.
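The core carving step described above can be sketched in a few lines: voxels initialised as occupied or unknown from single-view depth are marked free wherever a tracked human body passes, since space a person occupies cannot contain static scene geometry. The following is a minimal illustration, not the paper's implementation; the grid layout, the body radius, and the function signature are all assumptions made for the sketch.

```python
import numpy as np

def carve_occupancy(depth_voxels, joint_tracks, voxel_size=0.05, body_radius=0.15):
    """Carve a voxel occupancy map using tracked skeletal joint positions.

    depth_voxels : 3D array initialised from single-view depth
                   (1 = occupied/unknown, 0 = free)
    joint_tracks : iterable of (J, 3) arrays of joint positions in metres,
                   one per captured frame
    Illustrative sketch only; parameters and layout are assumptions.
    """
    occ = depth_voxels.copy()
    nx, ny, nz = occ.shape
    r = int(np.ceil(body_radius / voxel_size))  # carving radius in voxels
    for frame in joint_tracks:
        for jx, jy, jz in frame:
            # voxel index containing the joint
            ix = int(jx / voxel_size)
            iy = int(jy / voxel_size)
            iz = int(jz / voxel_size)
            # clear a cube of voxels around the joint, approximating
            # the body volume occupying that space
            occ[max(0, ix - r):min(nx, ix + r + 1),
                max(0, iy - r):min(ny, iy + r + 1),
                max(0, iz - r):min(nz, iz + r + 1)] = 0
    return occ
```

Applied over a full motion sequence, this turns the person into a passive probe of occluded space: voxels behind a table that the depth camera never observed are revealed as free once someone walks or reaches through them.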


Towards Complete Scene Reconstruction from Single-View Depth and Human Motion
Sam Fowler, Hansung Kim and Adrian Hilton
BMVC 2017




@inproceedings{fowler2017towards,
    title     = {Towards Complete Scene Reconstruction from Single-View Depth and Human Motion},
    author    = {Fowler, S. and Kim, H. and Hilton, A.},
    booktitle = {BMVC},
    year      = {2017}
}


This work was supported by the EPSRC Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at Home (EP/L000539/1) and the BBC as part of the BBC Audio Research Partnership. We thank Teo DeCampos for his assistance in capturing data and continued support.

Related Work by Sam Fowler

Human-Centric Scene Understanding from Single View 360 Video, 3DV 2018
Affordance Surface Segmentation from Video of Human Activity in Indoor Scenes