Sobieraj, Iwona, Rencker, Lucas and Plumbley, Mark D (2018) Orthogonality-regularized masked NMF for learning on weakly labeled audio data. In: IEEE ICASSP 2018, 15-20 April 2018, Calgary, Alberta, Canada.

Abstract Non-negative Matrix Factorization (NMF) is a well-established tool for audio analysis. However, it is not well suited to learning from weakly labeled data, i.e. data where the exact timestamp of the sound of interest is not known. In this paper we propose a novel extension to NMF that allows it to extract meaningful representations from weakly labeled audio data. Recently, a constraint on the activation matrix was proposed to adapt NMF to learning from weak labels. To further improve the method, we propose adding an orthogonality regularizer on the dictionary to the NMF cost function. In this way we obtain appropriate dictionaries for the sounds of interest and for background sounds from weakly labeled data. We demonstrate that the proposed Orthogonality-Regularized Masked NMF (ORM-NMF) can be used for Audio Event Detection of rare events, and we evaluate the method on the development data from Task 2 of the DCASE 2017 Challenge.
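
The abstract names two ingredients, a mask on the activation matrix and an orthogonality regularizer on the dictionary, without stating the cost function or the update rules. The following is only a rough sketch of how the two could be combined, not the authors' implementation. It assumes a Euclidean cost, standard multiplicative updates, a binary mask that zeroes the event atoms in frames from recordings labeled as not containing the event, and a penalty of the form (lam / 2) * ||W_e^T W_b||_F^2 between the event atoms W_e and the background atoms W_b. The function name `orm_nmf_sketch` and all of its parameters are hypothetical.

```python
import numpy as np

def orm_nmf_sketch(V, frame_has_event, n_event=4, n_background=4,
                   lam=1.0, n_iter=200, seed=0):
    """Illustrative masked NMF with an orthogonality penalty (assumed form).

    V               : non-negative spectrogram, shape (n_freq, n_frames)
    frame_has_event : boolean array, shape (n_frames,); True for frames taken
                      from recordings weakly labeled as containing the event
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_atoms = n_event + n_background
    eps = 1e-9

    W = rng.random((n_freq, n_atoms)) + eps
    H = rng.random((n_atoms, n_frames)) + eps

    # Mask: event atoms (first n_event rows of H) may not be active in
    # frames whose recording is labeled as event-free.
    M = np.ones((n_atoms, n_frames))
    M[:n_event, ~frame_has_event] = 0.0
    H *= M

    for _ in range(n_iter):
        # Multiplicative update for the activations (Euclidean cost).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        H *= M  # zeros are preserved anyway; reapplied for clarity

        # Gradient of the assumed penalty 0.5 * ||W_e^T W_b||_F^2.
        We, Wb = W[:, :n_event], W[:, n_event:]
        G = np.hstack([Wb @ (Wb.T @ We), We @ (We.T @ Wb)])

        # Multiplicative update for the dictionary; the penalty gradient is
        # non-negative, so it goes into the denominator.
        W *= (V @ H.T) / (W @ (H @ H.T) + lam * G + eps)

        # Normalize atoms so the penalty cannot be reduced by merely
        # shrinking W; rescale H so that W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]

    return W, H, M
```

Under these assumptions, the mask forces event-free recordings to be explained by the background atoms alone, while the penalty discourages the event atoms from duplicating background spectra. The weight `lam` and the atom counts are free parameters in this sketch and would need tuning on real data.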
