Multi-view video acquisition is widely used for reconstruction and free-viewpoint rendering of dynamic scenes by directly resampling from the captured images. This paper addresses the problem of optimally resampling and representing multi-view video to obtain a compact representation without loss of the view-dependent dynamic surface appearance. Spatio-temporal optimisation of the multi-view resampling is introduced to extract a coherent multi-layer texture map video. This resampling is combined with a surface-based optical flow alignment between views to correct for errors in geometric reconstruction and camera calibration, which result in blurring and ghosting artefacts. The multi-view alignment and optimised resampling result in a compact representation with minimal loss of information, allowing high-quality free-viewpoint rendering. Evaluation is performed on multi-view datasets for dynamic sequences of cloth, faces and people. The representation achieves >90% compression without significant loss of visual quality.


Optimal Representation of Multi-View Video
Marco Volino, Dan Casas, John Collomosse and Adrian Hilton


Data used in this work can be found in the CVSSP Data Repository.


This research was supported by the EU-FP7 project RE@CT, BBC/EPSRC iCase Studentship and EPSRC Visual Media Platform Grant EP/F02827X. The authors would also like to thank Martin Klaudiny for providing the face and cloth datasets used in the evaluation.