Simultaneous semantically coherent object-based long-term 4D scene flow estimation, co-segmentation and reconstruction is proposed, exploiting the coherence in semantic class labels both spatially, between views at a single time instant, and temporally, between widely spaced time instants of dynamic objects with similar shape and appearance. In this paper we propose a framework for spatially and temporally coherent semantic 4D scene flow of general dynamic scenes from multiple-view videos captured with a network of static or moving cameras. Semantic coherence results in improved 4D scene flow estimation, segmentation and reconstruction for complex dynamic scenes. Semantic tracklets are introduced to robustly initialize the scene flow in the joint estimation and to enforce temporal coherence in 4D flow, semantic labelling and reconstruction between widely spaced instances of dynamic objects. Tracklets of dynamic objects enable unsupervised learning of long-term flow, appearance and shape priors that are exploited in semantically coherent 4D scene flow estimation, co-segmentation and reconstruction. Comprehensive performance evaluation against state-of-the-art techniques on challenging indoor and outdoor sequences with hand-held moving cameras shows improved accuracy in 4D scene flow, segmentation, temporally coherent semantic labelling, and reconstruction of dynamic scenes.
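As a loose illustration of what a scene flow field represents, the sketch below estimates per-point 3D displacement between two reconstructions of a moving object using a simple nearest-neighbour correspondence. This is only a minimal toy, not the paper's joint semantic estimation; the function name and setup are hypothetical.

```python
import numpy as np

def scene_flow_nearest_neighbour(points_t, points_t1):
    """Crude scene flow: for each 3D point at time t, the displacement
    to its nearest neighbour in the reconstruction at time t+1."""
    # Pairwise squared distances between the two point clouds (N x M)
    d2 = ((points_t[:, None, :] - points_t1[None, :, :]) ** 2).sum(-1)
    nn = d2.argmin(axis=1)           # index of closest point at t+1
    return points_t1[nn] - points_t  # per-point 3D flow vectors

# Toy example: a 3x3x3 grid of points rigidly translated between frames
grid = np.arange(3, dtype=float)
cloud_t = np.stack(np.meshgrid(grid, grid, grid), -1).reshape(-1, 3)
offset = np.array([0.1, 0.0, -0.05])
cloud_t1 = cloud_t + offset

flow = scene_flow_nearest_neighbour(cloud_t, cloud_t1)
print(np.allclose(flow, offset))  # pure translation is recovered exactly
```

For a pure translation much smaller than the point spacing, nearest-neighbour matching recovers the flow exactly; real dynamic scenes need the spatial, temporal and semantic priors the paper introduces to resolve ambiguous correspondences.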


Semantically Coherent 4D Scene Flow of Dynamic Scenes
Armin Mustafa and Adrian Hilton
IJCV 2019


Data used in this work can be found in the CVSSP 3D Data Repository.


	@article{Mustafa2019,
			author = {Mustafa, A. and Kim, H. and Hilton, A.},
			title = {Semantically Coherent 4D Scene Flow of Dynamic Scenes},
			journal = {International Journal of Computer Vision},
			year = {2019},
			issn = {1573-1405}
	}


This research was supported by the Royal Academy of Engineering Research Fellowship RF-201718-17177 and the EPSRC Platform Grant on Audio-Visual Media Research EP/P022529.