This dataset accompanies the following workshop publication:
"Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data"
Moira Shooter, Charles Malleson, Adrian Hilton
IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), CV4Animals, 17th of June 2024
3DDogs-Lab is a multi-modal dataset captured indoors, featuring various dog breeds trotting on a walkway.
It includes data from optical marker-based mocap systems, RGBD cameras, IMUs, and a pressure mat. While
providing high-quality motion data, the presence of optical markers and limited background diversity make the
captured video less representative of real-world conditions. To address this, we created 3DDogs-Wild, a
naturalised version of the dataset where the optical markers are in-painted and the subjects are placed in
diverse environments, enhancing its utility for training RGB image-based pose detectors.
The capture included 64 dogs, each performing three trials of walking and three of trotting. Due to technical issues with the capture hardware, valid recordings could not be obtained for all participants, which reduced the number of usable subjects and trials. The final dataset contains 37 subjects and 143 valid recordings.
All the data can be downloaded by clicking the following link: 3DDogs2024_full.tar.gz
You can also download the 3DDogs-Lab and
3DDogs-Wild separately, using the following
links: 3DDogsLab2024.tar.gz,
3DDogsWild2024.tar.gz
The structure of the content is described below:
3DDogs-Lab
Raw RGBD data containing video sequences of dogs.
The subfolders are formatted in the following way:
d<subjectID>_<trialNumber><direction>/Images/<camID>/<frameNumber>.png
<subjectID>: The ID of the dog.
<trialNumber>: The trial number during data capture.
<camID>: The ID of the camera that captured the data.
<frameNumber>: The frame number.
Text files containing the 3D ground truth from the optical marker system. The 3D ground truth is given in both global and camera coordinate systems.
3DDogs-Wild
JSON files containing the frames in which the dogs are visible in the camera frame, along with the bounding box coordinates. These are used for trimming the raw video sequences.
In-the-wild RGB data containing trimmed video sequences of dogs. Two subfolders with different backgrounds.
The alpha mattes used to composite the final in-the-wild dataset.
Text files listing the subjects used for training, validation and testing.
scripts
The license is provided in license.txt.
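For working with the raw RGBD sequences, the folder naming convention above can be parsed programmatically. The following is a minimal sketch, assuming subject and trial identifiers are integers, the direction is a short alphabetic suffix, and frame numbers are zero-padded; these details are assumptions, not part of the official specification, and `parse_frame_path` is a hypothetical helper:

```python
import re

# Hypothetical helper: parse a frame path following the convention
# d<subjectID>_<trialNumber><direction>/Images/<camID>/<frameNumber>.png
# Assumed format (not specified by the dataset): integer subjectID and
# trialNumber, alphabetic direction suffix, zero-padded frameNumber.
PATTERN = re.compile(
    r"d(?P<subject_id>\d+)_(?P<trial_number>\d+)(?P<direction>[A-Za-z]*)"
    r"/Images/(?P<cam_id>\w+)/(?P<frame_number>\d+)\.png$"
)

def parse_frame_path(path: str) -> dict:
    """Return the metadata fields encoded in a frame path, or raise ValueError."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError(f"Path does not match the expected layout: {path}")
    fields = match.groupdict()
    # Convert numeric fields from strings to integers.
    fields["subject_id"] = int(fields["subject_id"])
    fields["trial_number"] = int(fields["trial_number"])
    fields["frame_number"] = int(fields["frame_number"])
    return fields
```

A helper like this makes it straightforward to group frames by subject, trial, or camera when iterating over the extracted archive.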
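The alpha mattes in 3DDogs-Wild allow the segmented dogs to be composited onto new backgrounds. The sketch below shows straight alpha compositing with NumPy; the image shapes and uint8 conventions are assumptions for illustration, not the dataset's actual pipeline:

```python
import numpy as np

def composite(foreground: np.ndarray, background: np.ndarray,
              alpha: np.ndarray) -> np.ndarray:
    """Blend foreground over background using a single-channel alpha matte.

    Assumed conventions (not specified by the dataset): foreground and
    background are HxWx3 uint8 images, alpha is an HxW uint8 matte where
    255 means fully foreground.
    """
    # Normalise the matte to [0, 1] and add a channel axis for broadcasting.
    a = alpha.astype(np.float32)[..., None] / 255.0
    out = a * foreground.astype(np.float32) + (1.0 - a) * background.astype(np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Per-pixel linear blending like this is the standard way to place a matted subject into a diverse environment, which is how the in-the-wild variant was assembled.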
@inproceedings{Shooter_2024_CV4Animals,
author = {Shooter, Moira and Malleson, Charles and Hilton, Adrian},
title = {Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), CV4Animals},
month = {June},
year = {2024},
}
This work was partially supported by Mars Petcare and the Leverhulme Trust Early Career Fellowship scheme. The authors would like to thank Alasdair Cook and Constanza Gómez Álvarez for accommodating the RGBD capture system as part of a larger study. The authors would also like to thank the owners of dogs included in the study, as well as Nicholas Gladman and Samantha Clifton for their assistance in dog recruitment and data collection.
For any questions about this dataset, please contact Moira Shooter.