Dataset Overview

The NAVVS dataset contains a variety of short volumetric sounding actions. It provides a valuable resource for multimodal research and testing under realistic conditions. The dataset includes ten different actions designed for both semantic and acoustic diversity. For each action, four 2-second takes are available, giving a total of forty audio-visual clips.

The scenes were captured at the Centre for Vision, Speech & Signal Processing (CVSSP) of the University of Surrey (UK) with the aid of multiple cameras and multiple microphones. For more information, please refer to the paper (see below) or contact the authors.

In addition to the volumetric textured instances and the stereo audio mix of the final clips, further data is provided: the separate audio channels from the individual microphones, raw images from the 16 UHD cameras, binary masks, camera calibration data, coarse visual hull reconstruction, and volumetric stereo refinement.



Paper


NAVVS IEEEVR21 paper

NAVVS IEEEVR21 poster




License

The datasets are free for research use only.

This agreement must be confirmed by a senior representative of your organisation. To access and use this data you agree to the following conditions:

The copyright of the NAVVS dataset is owned by the Centre for Vision, Speech and Signal Processing, University of Surrey, UK. The data should not be redistributed. Permission is hereby granted to use the NAVVS dataset for academic purposes only, provided that it is referenced in publications related to its use as follows:

H. Stenzel, D. Berghi, M. Volino and P.J.B. Jackson, "Naturalistic audio-visual volumetric sequences dataset of sounding actions for six degree-of-freedom interaction," 2021 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), 2021, pp. 637-638, doi: 10.1109/VRW52623.2021.00201.

    @inproceedings{Stenzel:IEEEVR:2021,
        AUTHOR = "Stenzel, Hanne and Berghi, Davide and Volino, Marco and Jackson, Philip J.B.",
        TITLE = "Naturalistic audio-visual volumetric sequences dataset of sounding actions
            for six degree-of-freedom interaction",
        BOOKTITLE = "2021 IEEE Conference on Virtual Reality and 3D User Interfaces
            Abstracts and Workshops (VRW)",
        YEAR = "2021",
        PAGES = "637--638",
        DOI = "10.1109/VRW52623.2021.00201"
    }
	    
To request access to the NAVVS dataset, or for other queries, please contact:

Acknowledgments

Thanks to actors Hannah Finnimore and Kajsa Sunstrom, technician Phil Foster, and recording assistant Tom Mungall. This work was supported by InnovateUK (105168) ‘Polymersive: Immersive video production tools for studio and live events’.