We present an approach to multi-person 3D pose estimation and tracking from multi-view video. Following independent 2D pose detection in each view, we: (1) correct errors in the output of the pose detector; (2) apply a fast greedy algorithm for associating 2D pose detections between camera views; and (3) use the associated poses to generate and track 3D skeletons. Previous methods for estimating skeletons of multiple people suffer from long processing times or rely on appearance cues, reducing their applicability to sports. Our approach to associating poses between views works by seeking the best correspondences first in a greedy fashion, while reasoning about the cyclic nature of correspondences to constrain the search. The associated poses can be used to generate 3D skeletons, which we produce via robust triangulation. Our method can track 3D skeletons in the presence of missing detections, substantial occlusions, and large calibration error. We believe ours is the first method for full-body 3D pose estimation and tracking of multiple players in highly dynamic sports scenes. The proposed method achieves a significant improvement in speed over state-of-the-art methods.
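The greedy, best-first association idea described above can be illustrated with a small sketch. This is not the paper's algorithm: the function name `greedy_associate`, the `cost` callback, and the `max_cost` threshold are all hypothetical, and the paper's cyclic-correspondence reasoning is replaced here by a simpler stand-in constraint, namely that a group may hold at most one detection per camera view.

```python
from itertools import combinations


def greedy_associate(detections, cost, max_cost):
    """Group 2D detections across views, linking the cheapest pairs first.

    detections: list of (view_id, detection_id) tuples.
    cost: hypothetical callback returning a non-negative matching cost
          for two detections (for example, an epipolar distance).
    max_cost: pairs that cost more than this are never linked.
    """
    # Every detection starts in its own group.
    group_of = {d: i for i, d in enumerate(detections)}
    members = {i: [d] for i, d in enumerate(detections)}

    # Candidate pairs must come from two different views.
    pairs = [
        (cost(a, b), a, b)
        for a, b in combinations(detections, 2)
        if a[0] != b[0]
    ]
    pairs = [p for p in pairs if p[0] <= max_cost]
    pairs.sort(key=lambda p: p[0])  # best correspondences first

    for _, a, b in pairs:
        ga, gb = group_of[a], group_of[b]
        if ga == gb:
            continue  # already associated through earlier links
        views_a = {d[0] for d in members[ga]}
        views_b = {d[0] for d in members[gb]}
        # Stand-in consistency check (an assumption of this sketch):
        # a merged group may not contain two detections from one view.
        if views_a & views_b:
            continue
        for d in members[gb]:
            group_of[d] = ga
        members[ga].extend(members.pop(gb))

    return sorted(sorted(g) for g in members.values())
```

Each returned group is a list of `(view_id, detection_id)` pairs assumed to belong to one person; groups seen in two or more views could then be passed on to a triangulation step.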


Multi-person 3D Pose Estimation and Tracking in Sports
Lewis Bridgeman, Marco Volino, Jean-Yves Guillemaut and Adrian Hilton
5th International Workshop on Computer Vision in Sports (CVsports) at CVPR 2019


@InProceedings{Bridgeman_2019_CVPR_Workshops,
	author = {Bridgeman, Lewis and Volino, Marco and Guillemaut, Jean-Yves and Hilton, Adrian},
	title = {Multi-Person 3D Pose Estimation and Tracking in Sports},
	booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
	month = {June},
	year = {2019}
}


This research was supported by EPSRC Grant (EP/N50977/1).